AI Token Counter

Count tokens for GPT-4o & Gemini models

Instantly count tokens for GPT-4o and Gemini models. See context window usage and estimated API cost in real time — no sign-up needed.


What is an AI Token?

A token is the basic unit that large language models (LLMs) process. Tokens are not the same as words — they can be parts of words, whole words, punctuation, or even spaces. On average, 1 token ≈ 4 characters or ¾ of a word in English, but this varies significantly by language and content type.
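The rule of thumb above can be turned into a quick estimator. This is a heuristic sketch, not a real tokenizer; the function name and the averaging of the two ratios are illustrative:

```python
def estimate_tokens(text: str) -> int:
    """Rough English token estimate: average of the two common
    rules of thumb (1 token ~ 4 characters, 1 word ~ 4/3 tokens)."""
    by_chars = len(text) / 4
    by_words = len(text.split()) * 4 / 3
    return round((by_chars + by_words) / 2)

print(estimate_tokens("Tokens are not the same as words."))  # roughly 9
```

Expect the estimate to drift for code, non-English text, or heavy punctuation; only the model's actual tokenizer gives exact counts.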

Every AI API charges by the token and enforces a context window limit — the maximum number of tokens the model can process in a single request (including your prompt and its response). Knowing your token count before sending a request helps you avoid unexpected costs and context overflow errors.

This tool uses the o200k_base tokenizer (used by GPT-4o and GPT-4o mini) for exact counts on OpenAI models, and a close approximation for Gemini models. Click Get token count when a Gemini model is selected to get precise numbers via Google's API.

How to Use This Tool

  1. Paste your prompt, system message, or any text into the input area.
  2. Select the model you want to count tokens for from the dropdown.
  3. For GPT-4o models, token count, context window usage, and cost update instantly as you type.
  4. For Gemini models, click 'Get token count' to fetch exact numbers via Google's API.
  5. Switch models at any time — select a new model and the result updates immediately.

Common Use Cases

Staying Within Context Limits

Check if your prompt plus conversation history fits within the model's context window before hitting a 400 error.

Estimating API Costs

Calculate input token costs across models before choosing which to use for a high-volume task.

Comparing Models

Switch between GPT-4o and Gemini to see how token counts differ for the same text.

Optimizing System Prompts

Reduce token count in your system prompt to leave more room for conversation without cutting context.

RAG Chunk Sizing

Measure how many tokens your retrieval chunks use to set the right chunk size for your vector store.
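A minimal chunking sketch using the ~4 characters/token heuristic (a production pipeline would count with the model's actual tokenizer; function and parameter names here are illustrative):

```python
def chunk_by_tokens(text: str, max_tokens: int = 512,
                    chars_per_token: int = 4) -> list[str]:
    """Split text into chunks of roughly max_tokens each, using the
    ~4 chars/token heuristic. Splits on whitespace, never mid-word."""
    max_chars = max_tokens * chars_per_token
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)   # current chunk is full; start a new one
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

docs = chunk_by_tokens("lorem ipsum " * 2000, max_tokens=256)
print(len(docs), "chunks of ~", len(docs[0]) // 4, "tokens each")
```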

Fine-tuning Dataset Prep

Estimate token counts across your training examples to plan fine-tuning costs and stay within limits.
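As a sketch, the same ~4 characters/token heuristic can be summed over a dataset; the examples, epoch count, and training price below are hypothetical placeholders, not real rates:

```python
# Estimate total training tokens for a fine-tuning dataset.
examples = [
    {"prompt": "Summarize: ...", "completion": "A short summary."},
    {"prompt": "Translate to French: hello", "completion": "bonjour"},
]

def approx_tokens(s: str) -> int:
    # ~4 characters per token; at least 1 token for any non-trivial string
    return max(1, len(s) // 4)

total = sum(approx_tokens(ex["prompt"]) + approx_tokens(ex["completion"])
            for ex in examples)
epochs = 3
price_per_million = 25.00  # hypothetical training price, USD per 1M tokens
cost = total * epochs * price_per_million / 1e6
print(f"~{total} tokens/epoch, ~${cost:.6f} for {epochs} epochs")
```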

Frequently Asked Questions

What's the difference between input and output tokens?

Input tokens are everything you send to the model: your prompt, any system instructions, conversation history, and context. Output tokens are the tokens the model generates in its response. Most APIs charge differently for input vs output — output tokens typically cost more per token. For example, GPT-4o charges $2.50 per 1M input tokens but $10.00 per 1M output tokens, so a long response can be expensive.
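Plugging in the GPT-4o prices quoted above, the asymmetry is easy to see in a few lines (the token counts are illustrative):

```python
INPUT_PRICE = 2.50    # USD per 1M input tokens (GPT-4o)
OUTPUT_PRICE = 10.00  # USD per 1M output tokens (GPT-4o)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * INPUT_PRICE + output_tokens / 1e6 * OUTPUT_PRICE

# A 2,000-token prompt with a 1,000-token response: the output half,
# though shorter, accounts for two thirds of the bill.
print(f"${request_cost(2_000, 1_000):.4f}")  # $0.0050 in + $0.0100 out = $0.0150
```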

How many tokens is 1000 words?

Roughly 1,300–1,500 tokens for typical English prose. Code typically produces more tokens per word because of special characters, punctuation, and identifiers. Non-English languages (especially CJK scripts) can use 2–3x more tokens per character than English.


What happens if I exceed the context window?

The API will return an error (typically a 400 with a message about context length). Some SDKs will silently truncate older messages instead. Always check your total token count — prompt + conversation history + expected response — against the model's context limit.
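That check can be done before sending the request. A minimal pre-flight sketch against GPT-4o's 128K context window (the token counts are illustrative):

```python
CONTEXT_WINDOW = 128_000  # GPT-4o context window, in tokens

def fits(prompt_tokens: int, history_tokens: int,
         max_output_tokens: int) -> bool:
    """True if prompt + history + reserved response room fit the window."""
    return prompt_tokens + history_tokens + max_output_tokens <= CONTEXT_WINDOW

print(fits(4_000, 120_000, 2_000))  # 126,000 total: fits
print(fits(4_000, 123_000, 2_000))  # 129,000 total: would overflow
```

Reserving room for `max_output_tokens` matters: a prompt that fits exactly leaves the model no space to respond.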

Does this tool send my text to a server?

For GPT models, no — counting runs entirely in your browser. For Gemini exact counts, your text is sent to our server which calls Google's token counting API. Your text is not stored or logged.

Which tokenizer does GPT-4o use?

GPT-4o and GPT-4o mini use the o200k_base encoding, which has a larger vocabulary than the older cl100k_base used by GPT-4 and GPT-3.5. This tool uses o200k_base for exact counts on OpenAI models.

Want to learn more? Read our guide: What Are AI Tokens? How GPT-4o, Gemini, and Claude Count Them