Token Means 'Ciyuan'? Casually Translating Token as 'Ciyuan' Could Be Terribly Wrong!
Points out that after authorities standardized 'Token' as 'Ciyuan' in the AI field, translation errors arise in different contexts such as security tokens, blockchain tokens, and game tokens, providing terminology guidance for legal professionals.
Recently, a news screenshot sparked considerable discussion in the tech community:

“If the token lacks encryption or signature protection, attackers can directly modify the token’s permission fields… forge administrator identity to bypass system authentication…”
Wait — can tokens in large language models (LLMs) be encrypted and signed? And they have permission fields?
Anyone familiar with IT and the internet can see at a glance what this news story really meant to say.
It’s about “security tokens” in cybersecurity.
That’s right — “令牌” (token/token) is also called Token in English.
Recently, authorities standardized the AI domain translation of “Token” as “词元” (Ciyuan), intended to standardize academic and industry terminology.
But when applied indiscriminately through a “Ctrl+H” find-and-replace across all contexts, it inevitably leads to absurd results.
Today, let’s explore: how many meanings does “Token” actually have?
And when drafting contracts, reviewing technical documents, or software copyright materials, how should it be accurately translated?
*This article represents the author’s personal views only and does not constitute legal advice.
1. Token = Ciyuan (Word Token)
First, the translation “词元” is indeed one valid translation for “Token.”
Recently, the China National Committee for Terminology in Science and Technology officially announced that “词元” is the recommended standard Chinese name for “Token” in the AI field.
The official reasoning is well-founded:
“Ciyuan” (词元, token) is a basic symbolic unit with certain semantic meaning used for information storage, processing, and exchange in smart devices in the AI era, particularly serving as the smallest unit for model processing and information exchange in large models.
Although “Ci” (word) reflects its roots in natural language processing, as large models move toward multimodality (images, speech, video, etc.), discrete units such as image patches and speech segments are also called “token.” At this point, the “Ci” in “Ciyuan” transcends the human linguistic sense of “word,” embodying the analogical thinking common in terminology naming — treating non-text modal discrete basic units as “words in a broad sense.”
Although there is considerable public debate about classifying “images,” “audio,” and “video” all under “Ci,” in the AI (large model) domain, translating “Token” as “词元” is indeed the most authoritative officially endorsed translation.
Even Google’s technical documentation already uses this translation:

2. Ciyuan is a Subset of Token
But it’s important to understand: Token is by no means limited to AI large models.
Especially for legal and compliance professionals reviewing technical contracts and materials involving computer systems, cybersecurity, blockchain, or gaming — do not assume others have mistranslated and blindly replace every instance of “Token” with “词元.”
In different IT sub-domains, Token represents fundamentally different core concepts.
Currently, mainstream “Token” includes at least the following:
Access Credential / Security Token
Correct translation: 令牌 (token/token), 凭证 (credential)
This is exactly what the CCTV news screenshot at the beginning of this article was actually trying to convey.
In cybersecurity and software development, a Token is an encrypted string — equivalent to a user’s “digital ID card” or “electronic key” (such as what we commonly call API Token/API Secret and JWT).
When you register for an app or log into a service, the server issues you a Token. Subsequently, when you need to call an interface or use sensitive features, you attach this Token, and the server knows you are a legitimate user.
That’s precisely why the news stressed that it needs proper protection, encryption, and has permission fields.
Yes, LLM tokens really cannot have permission fields.
Session Token
Correct translation: 会话令牌 (session token), 会话标识 (session identifier)
Similar to the access credential above, mainly used on the web to record the user’s “continuous activity” state on a website.
It is typically stored simultaneously on the server and in the user’s browser cookies. When the user sends various requests, this Token is sent along with the request to the server.
The server locates the corresponding session object from storage based on the received session Token and verifies the user’s identity.
Therefore, if a hacker steals this Token, they can directly impersonate your identity and log into the website without a password.
Blockchain / Web3 Token
Correct translation: 代币 (token/coin), 通证 (token/certificate)
This is a concept that has exploded in recent years and is the most sensitive Token concept in game compliance (especially chain games and overseas games).
In blockchain, a Token is a cryptographic digital rights certificate built on top of an existing blockchain (such as Ethereum).
Depending on the specific form, there are some distinctions:
Fungible Token (FT): Such as ERC-20 Tokens on Ethereum, usually translated as “代币” (token/coin).
Non-Fungible Token (NFT): Usually translated as “非同质化通证” (non-fungible token) or simply NFT.
If you translated these as “(非)同质化词元” in overseas contracts, it would be not only extremely awkward but could also lead to serious deviation in legal characterization.
Hardware Token
Correct translation: 硬件令牌 (hardware token), 动态口令盘 (dynamic password device)
The name alone might not ring a bell, but take a look at these images:


Many who played online games (like World of Warcraft) in the early days or used corporate online banking will recognize this — it’s a physical hardware device resembling a USB drive or small calculator (such as NetEase’s “Jiangjunling” authenticator, bank U-shields) that displays a timed numeric password.
These are also a type of Token.
Game Token
Correct translation: 游戏币 (game currency), 代币 (token/coin)
Getting to the most everyday scenario — whether it’s the coins you exchange at the front desk to play claw machines or arcade games, or the “chips” and “payment vouchers” representing resources and money in board games.
Regardless of form or material, in English, they are called Tokens.


3. Final Thoughts
Language is alive, and technology is constantly evolving.
The official standardization of “词元” as the translation for Token plays a significant role in regulating China’s basic AI academic terminology.
When writing purely AI large-model-related patents, academic papers, or technical contracts, we should actively adopt the standard term “词元.”
In fact, I took the liberty of looking up the original security article cited by CCTV News. Interestingly, the original author was actually an insider:

It’s clear that within the article itself, there was a clear distinction between “identity credentials,” “AI scenario classes,” and “equity certificate classes.”
However, forcibly lumping together these conceptually distinct concepts — each with its own long-established official translations in their respective sub-domains — into a single catch-all translation newly coined for AI (“词元”) for the sake of riding the trend, is highly inappropriate.
As the article itself urged at the end, when facing emerging concepts, we truly must “maintain rational understanding and make scientific distinctions,” and avoid blind “global find-and-replace”:

Moreover, at the genuine national standard level, clear distinctions have long been established.
In the latest national standard Data — Basic Terminology (Draft for Comment), different scenarios for Token have clearly differentiated official translations and definitions:
For AI large models: “词元” (Ciyuan):

For system security verification: “令牌” (token):

So, as rigorous legal professionals, we must be professional.
Faced with complex and diverse IT terminology, we need the professional competence to “judge based on context,” without being blindly led astray by internet buzzwords:
See large models, NLP, multimodality — know it’s “词元” (Ciyuan).
See system login, permission verification, API interfaces — know it’s “令牌” (token).
See blockchain, Web3, digital assets — know it’s “代币/通证” (token/coin).
Next time you see “prevent hackers from stealing 词元” in a system security document or technical contract, remember to forward this article to the translator!
(Maybe using AI translation would avoid this problem)
(Actually no, LLM tokens really can be stolen — if you hook a trojan framework leading to API theft, it’s possible)