Exploring Anomalous Tokens in DeepSeek-V3 and r1
Anomalous tokens, often referred to as “glitch” or “unspeakable” tokens, elicit behavior from language models that deviates from the norm. Such tokens were notably documented in earlier models like GPT-2 and GPT-3, but no comprehensive search for them in DeepSeek-V3 had been conducted until now.
Process Overview
The investigation began by extracting the vocabulary from DeepSeek-V3’s tokenizer, then automatically testing each token for irregularities. Notably, because r1 is built on top of V3, it inherits V3’s tokenizer and therefore any anomalous tokens. The distilled variants, by contrast, share tokenizers with their respective base models, so they were not the focus of this study.
DeepSeek’s tokenizer is distinctive in that a substantial portion of its training data is Chinese, which produces complex splits of multi-byte characters across tokens. This necessitated filtering out tokens containing nonstandard characters, which significantly reduced the vocabulary under test.
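One plausible way to do that filtering (a sketch only — the exact criteria aren’t specified here, and the `vocab` structure is an assumption) is to keep just the tokens whose surface form is printable ASCII:

```python
def filter_vocab(vocab: dict) -> dict:
    """Keep only tokens whose surface form is printable ASCII.

    `vocab` maps token strings to ids (a hypothetical structure, e.g. as
    loaded from a tokenizer's vocabulary file). Byte-level BPE vocabularies
    contain many tokens that are partial UTF-8 sequences or non-Latin
    script; dropping them shrinks the candidate set considerably before
    any per-token testing is run.
    """
    return {tok: i for tok, i in vocab.items()
            if tok.isascii() and tok.isprintable()}
```

In practice, byte-level tokenizers store tokens with marker characters (e.g. a placeholder for a leading space), so a real filter would first map those back to raw bytes; this sketch ignores that detail.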
Testing Anomalous Tokens
The approach was to run each token through DeepSeek’s chat API and watch for unexpected behavior: a token was flagged as anomalous when the model could not repeat the string exactly as prompted. Flagged tokens were then manually filtered to remove uninteresting cases and clustered by how they first appeared.
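A minimal sketch of that repeat test, assuming an OpenAI-compatible client wrapped behind an `ask_model` callable (the prompt wording and harness structure are illustrative assumptions, not the exact setup used):

```python
def find_anomalous(tokens, ask_model):
    """Run the repeat test over candidate tokens.

    `ask_model` is any callable that sends a prompt to the chat API and
    returns the model's text reply (e.g. a thin wrapper around an
    OpenAI-compatible client pointed at DeepSeek's endpoint). A token is
    flagged when the reply does not contain it verbatim.
    """
    anomalous = []
    for tok in tokens:
        reply = ask_model(f'Please repeat the string: "{tok}"')
        if tok not in reply:  # model could not reproduce the token
            anomalous.append(tok)
    return anomalous
```

The verbatim-substring check is deliberately loose: it tolerates extra chatter around the echoed string while still catching tokens the model garbles or replaces entirely.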
Fragment Tokens
Many tokens proved unspeakable in isolation because they normally occur only as fragments of larger strings. Such tokens are unsurprising in large vocabularies, but their behavior is still worth examining. Examples (token → model output):
- MERCHANTABILITY → MERCHANTABILITY
- ellationToken → Token, CancellationToken
- Nonetheless → theless, nonetheless
- ADVERTISEMENT → ADVERTISEMENT
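The fragment pattern above can be checked mechanically: a fragment token is one that appears as a proper substring of longer, common strings rather than standing alone. A small sketch (the identifier list is an illustrative assumption):

```python
def fragment_contexts(token: str, corpus: list) -> list:
    """Return the corpus entries that contain `token` as a proper
    substring -- the larger strings a fragment token normally lives in."""
    return [w for w in corpus if token in w and w != token]
```

For instance, running `fragment_contexts("ellationToken", ...)` over a list of common code identifiers surfaces `CancellationToken`-style names as the fragment’s usual habitat.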
Examples of Anomalous Tokens
‘Nameeee’: When prompted with ‘Nameeee’, outputs ranged from Unicode symbols to acronyms and emoji. With surrounding context, the model sometimes resolved it into a coherent word, though often an arbitrary one.
‘EDMFunc’: This token exhibited behaviors similar to ‘Nameeee’, with a preference for words starting with “H” and Japanese names. ‘FullEDMFunc’ often retained the ‘Full’ component while altering ‘EDMFunc’.
Other English Tokens
- ‘everydaycalculation’: Associated with math utilities like ‘percentcalc’ and ‘VisualFractions’.
- ‘numbersaplenty’: Shared images with ‘everydaycalculation’, often linked to thousands-related concepts.
- ‘SetSavedPoint’: Frequently associated with Unity-related terms.
- ‘CategoryTreeLabel’: Could become ‘Categorize’ or non-English words.
Non-English Tokens
The non-English tokens predominantly appeared to be Cebuano or other regional Filipino languages. Their behavior ranged from correct translations to seemingly random words.
Special Tokens
Special tokens like ‘<|begin▁of▁thinking|>’ play roles in formatting DeepSeek’s responses. Notably, the ‘<|end▁of▁thinking|>’ token triggers interesting failure modes.
Conclusion
This initial exploration reveals a complex landscape of anomalous tokens within DeepSeek-V3 and r1. Future investigations may delve into the embedding space or explore neglected tokens, such as the Chinese characters omitted in this study.