About This Playground

This testing harness uses Transformers.js to run language models directly in your browser, executing via WebAssembly or WebGPU depending on hardware support. All computation happens locally - no data is sent to any server.
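For context, a minimal sketch of what such in-browser generation looks like. The `pipeline()` entry point and the `@huggingface/transformers` package name come from the Transformers.js documentation, but the model id and option values here are illustrative examples, not this playground's actual code.

```javascript
// Minimal sketch of local text generation with Transformers.js.
// The model weights are fetched once, cached by the browser, and all
// inference then runs locally - no prompt data leaves the machine.

async function generateLocally(prompt) {
  // Dynamic import so the library is only pulled in when needed.
  const { pipeline } = await import('@huggingface/transformers');

  // Illustrative model id; any compatible text-generation model works.
  const generator = await pipeline('text-generation', 'Xenova/distilgpt2');

  const [result] = await generator(prompt, {
    max_new_tokens: 30, // see Generation Parameters below
    temperature: 0.7,
    top_k: 50,
    top_p: 0.9,
  });
  return result.generated_text;
}
```

On first call the model download dominates latency; later calls reuse the browser cache, as noted in Performance Tips below.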

Generation Parameters

  • Max New Tokens: Maximum number of tokens to generate
  • Temperature: Controls randomness (higher = more creative, lower = more focused)
  • Top-K: Limits sampling to the K most likely next tokens
  • Top-P: Nucleus sampling - samples from smallest set of tokens with cumulative probability ≥ P
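The Top-K and Top-P filters above can be sketched in plain JavaScript. This is an illustrative stand-alone implementation, not the library's internal code; `filterTopKTopP` and its inputs are hypothetical names, and `probs` is assumed to be an already-normalized probability distribution over token ids.

```javascript
// Sketch: narrow a token distribution with Top-K, then Top-P (nucleus)
// filtering, and renormalize the survivors before sampling.

function filterTopKTopP(probs, topK, topP) {
  // Rank token ids by descending probability, keep the K most likely.
  const ranked = probs
    .map((p, id) => ({ id, p }))
    .sort((a, b) => b.p - a.p)
    .slice(0, topK);

  // Top-P: keep the smallest prefix whose cumulative probability >= P.
  const kept = [];
  let cum = 0;
  for (const t of ranked) {
    kept.push(t);
    cum += t.p;
    if (cum >= topP) break;
  }

  // Renormalize so the surviving probabilities sum to 1.
  const total = kept.reduce((s, t) => s + t.p, 0);
  return kept.map((t) => ({ id: t.id, p: t.p / total }));
}
```

With `probs = [0.5, 0.3, 0.1, 0.1]`, `topK = 3`, `topP = 0.75`, the filter keeps only tokens 0 and 1: Top-K drops token 3, and the nucleus stops once the cumulative mass 0.5 + 0.3 crosses 0.75.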

Chrome Built-in AI Setup

If Chrome AI shows as "Unavailable", follow these steps:

  1. Use Chrome 127+ (Canary, Dev, or Beta recommended)
  2. Enable flag: chrome://flags/#optimization-guide-on-device-model (select "Enabled BypassPerfRequirement")
  3. Enable flag: chrome://flags/#prompt-api-for-gemini-nano
  4. Restart Chrome
  5. Visit chrome://components and click "Check for update" on "Optimization Guide On Device Model"
  6. Wait for the model to download (~1.7GB), then reload this page

Platform support: Windows 10/11, macOS 13+, Linux, ChromeOS (Chromebook Plus). Not available on mobile yet.

Debugging: Open DevTools Console (F12) to see detailed availability status and setup instructions.
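A feature-detection sketch like the following is one way to produce that availability status. The global `LanguageModel` object and its `availability()` method reflect recent drafts of Chrome's Prompt API, which is still changing, so treat these names as assumptions; outside Chrome (or with the flags off) the global is simply absent.

```javascript
// Sketch: detect whether Chrome's built-in AI (Gemini Nano) is usable.
// Returns a status string suitable for logging to the DevTools console.

async function chromeAiStatus() {
  // In non-Chrome environments the Prompt API global does not exist.
  if (typeof LanguageModel === 'undefined') {
    return 'unavailable'; // wrong browser, old version, or flags not set
  }
  // In recent Chrome builds availability() resolves to a string such as
  // 'available', 'downloadable', 'downloading', or 'unavailable'.
  return await LanguageModel.availability();
}
```

A status of 'downloadable' or 'downloading' typically means the flags are set but the ~1.7GB model component has not finished installing (step 5 above).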

Performance Tips

  • Chrome Built-in AI is fastest and requires no extra download once Gemini Nano is installed (see setup above)
  • First load of other models may take a while as they're downloaded and cached
  • Subsequent loads will be much faster thanks to browser caching
  • For quick testing, start with DistilGPT-2 (smallest download)
  • TinyLlama offers the best balance of quality and size for chat applications

Memory Management

  • Automatic cleanup: When switching models, the previous model is automatically disposed from memory
  • Memory usage: Loaded models stay in RAM (330 MB to 2.2 GB depending on the model)
  • Refresh to clear: Reload the page to completely free all memory
  • Note: Downloaded model files are cached by your browser for faster subsequent loads
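The switch-and-dispose behavior above can be sketched as a small manager. This is an illustrative pattern, not the playground's actual code: `ModelManager`, `loadModel`, and the `dispose()` method on the loaded model are assumed names (Transformers.js models do expose a dispose-style cleanup, but the exact shape here is a stand-in).

```javascript
// Sketch: hold at most one loaded model; free the previous one from
// memory before loading its replacement.

class ModelManager {
  constructor(loadModel) {
    this.loadModel = loadModel; // async factory: name -> model
    this.current = null;
  }

  async switchTo(name) {
    // Dispose the previous model so its weights leave RAM before the
    // next model is loaded (avoids holding two models at once).
    if (this.current && typeof this.current.dispose === 'function') {
      await this.current.dispose();
    }
    this.current = await this.loadModel(name);
    return this.current;
  }
}
```

Note this only frees in-memory weights; the downloaded files stay in the browser cache, which is why reloading a previously used model is fast.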