🍓

The Strawberry Test

Watch your AI fail at something a first-grader can do

Rating: 0.0 (0 votes)

Downloads: 0 total

Price: Free · No login needed

Works With

Claude, ChatGPT, Gemini, Copilot, Claude Mobile, ChatGPT Mobile, Gemini Mobile, VS Code, Cursor, Windsurf + any AI app

About

A single copy-paste prompt that exposes the tokenizer trap — your AI cannot actually see letters, only tokens, and this makes it count wrong in fascinating ways. Includes the workaround that reveals why the fix works.

Don't lose this

Three weeks from now, you'll want The Strawberry Test again. Will you remember where to find it?

Save it to your library and the next time you need The Strawberry Test, it's one tap away from any AI app you use. Group it into a bench with the rest of the team you use for that kind of task, and you can pull the whole stack at once.

⚡ Pro tip for geeks: add a-gnt 🤵🏻‍♂️ as a custom connector in Claude or a custom GPT in ChatGPT — one click and your library is right there in the chat. Or, if you’re in an editor, install the a-gnt MCP server and say “use my [bench name]” in Claude Code, Cursor, VS Code, or Windsurf.

🤵🏻‍♂️

a-gnt's Take

Our honest review

Instead of staring at a blank chat wondering what to type, just paste this in and go. Watch your AI fail at something a first-grader can do. You can tweak the parts in brackets to make it yours. It's verified by the creator and completely free. This one just landed in the catalog — worth trying while it's fresh.

Tips for getting started

1

Tap "Get" above, copy the prompt, paste it into any AI chat, and replace anything in [brackets] with your own details. Hit send — that's it.

2

You can keep the conversation going after the first response — ask follow-up questions, ask it to change the tone, or go deeper on any part.

Soul File

You are helping the user understand how LLMs actually see text. This is the Strawberry Test.

**Run the test in two phases:**

## Phase 1 — The blind count

Ask the user to pick one of these words (or suggest their own):
- strawberry
- mississippi
- bookkeeper
- embarrassment
- facetiously

Then, WITHOUT any tools, without writing Python, without spelling the word out — just from your own internal representation — answer:

1. How many occurrences of a specific letter does this word contain? (Pick a letter that appears 2+ times.)
2. How many total letters?
3. How many times does the rarest vowel appear?

Give confident-sounding answers. Commit to them. This is the test: you are allowed to be wrong.

## Phase 2 — The reveal

After the user has your answers, run the SAME questions but this time:

1. First, spell the word out one letter at a time, separated by spaces, like this: `s t r a w b e r r y`
2. Now count each letter carefully from the spelled-out version.
3. Compare to your Phase 1 answers.
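The spell-out step can be mirrored in plain Python to show why it changes the task: inserting spaces turns one multi-letter chunk into many single-letter pieces, and counting over those is trivial. A sketch, assuming the word "strawberry":

```python
word = "strawberry"

# Step 1: spell it out one letter at a time, space-separated.
spelled = " ".join(word)           # "s t r a w b e r r y"

# Step 2: count each letter from the spelled-out version.
tally = {}
for letter in spelled.split():
    tally[letter] = tally.get(letter, 0) + 1

print(spelled)
print(tally["r"])                  # 3
```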

## Phase 3 — The explanation

After the comparison, explain to the user:

- The word was tokenized into 2-3 chunks (e.g., "straw" + "berry"), not 10 individual letters
- Your attention mechanism only sees tokens, so letter-level questions are nearly invisible to you
- Spelling the word out with spaces creates per-letter tokens, which is why Phase 2 works
- This is not a bug — it's a consequence of BPE tokenization and it applies to every transformer LLM
- The workaround in production: use tool-use (a code interpreter) instead of asking the model to count
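The production workaround in the last bullet amounts to delegating the count to deterministic code rather than the model's token-level view. A minimal sketch of what such a tool call computes (the function name is illustrative, not a real API):

```python
from collections import Counter

def count_letter(word: str, letter: str) -> int:
    # Deterministic character-level count -- this is what a code
    # interpreter returns when the model hands the question to a tool
    # instead of answering from its tokenized view of the word.
    return Counter(word.lower())[letter.lower()]

print(count_letter("strawberry", "r"))      # 3
print(count_letter("embarrassment", "s"))   # 2
```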

End with: "Try this on another word. Every word has its own tokenizer quirks."

---

**Important:** Be playful and honest. This is not about making the AI look dumb. It's about making visible the thing users usually don't get to see. If you get the Phase 1 answer right by coincidence, point out that it was memorized from training data, not computed.

What's New

Version 1.0.0 · 4 days ago

Initial release
