The Listening Machine

Mar 16, 2026

China’s first Tibetan large language model isn’t a cultural project. It’s an infrastructure project. The question is: infrastructure for what?

On March 15, 2026, a company called CHOKNOR Information Technology Co., Ltd. launched DeepZang in Lhasa, billing it as the world’s first Tibetan large language model. The Global Times ran the story. The World Record Certification Agency issued a certificate. Government officials attended. The framing was cultural preservation meets technological progress.

The reality is more complicated, and considerably darker, than the press coverage suggests. A close reading of the platform’s capabilities, its regulatory status, the legal framework governing its operation, and the stated intentions of its leadership reveals a system whose architecture is indistinguishable from a surveillance and narrative control apparatus targeting one of China’s most politically sensitive minority populations.

The Platform

CHOKNOR, led by chairman Danzeng Luobu (Tenzin Norbu), has been developing DeepZang since 2018. The company has built a parallel corpus of nearly 70 million Tibetan-Chinese sentence pairs and collected over 30,000 hours of annotated speech data spanning all three major Tibetan dialect regions: approximately 10,500 hours from U-Tsang, 10,000 from Kham, and 10,000 from Amdo. This constitutes, by their own account, China’s largest accurately annotated Tibetan speech database.

The consumer-facing app, launched simultaneously, supports voice and text input for real-time translation between Tibetan, Putonghua, and English, along with Tibetan-language Q&A and “cultural knowledge inquiries.” It recorded an average of 4,000 downloads in its first two hours. The broader platform claims support for over 80 languages with multimodal capabilities.

DeepZang is the first Tibetan-language generative AI to complete China’s mandatory national filing process. CHOKNOR treats this as a point of pride. It warrants closer scrutiny.

What “National Filing” Requires

Under the Interim Measures for the Management of Generative AI Services, jointly issued by seven PRC ministries in July 2023, any generative AI platform with “public opinion attributes or social mobilization capabilities” must undergo a security assessment by the Cyberspace Administration of China (CAC) and file its algorithms with the state. The filing requires disclosure of training data sources, annotation rules, algorithmic mechanisms, and model architecture. Providers must cooperate with state inspections and provide “necessary technical and data support” on demand.

A Tibetan-language AI that handles questions about “Tibetan culture, history and politics” self-evidently possesses public opinion attributes. Its filing with the CAC is not optional. It is a legal prerequisite for operation, and it means the state has full technical visibility into the system’s architecture.

Separately, China’s generative AI regulations mandate real-name user verification. Every person who uses DeepZang is linked to a verified national identity. There is no anonymous use.

The combination is significant: a platform that processes Tibetan-language voice and text input, filed with the state censorship authority, with every user tied to their real identity.

The National Intelligence Law

Layered over the AI-specific regulations is the PRC’s 2017 National Intelligence Law. Article 7 states: “All organizations and citizens shall support, assist, and cooperate with national intelligence efforts in accordance with law.” Article 14 grants intelligence institutions authority to “demand that concerned organs, organizations, or citizens provide needed support, assistance, and cooperation.”

Legal analysts from Lawfare to the Canadian Security Intelligence Service have noted that this law creates affirmative obligations for companies to cooperate with intelligence gathering when requested. The U.S. Department of Homeland Security has assessed that PRC firms “are required to secretly share data with the PRC government or other entities upon request, even if that request is illegal under the jurisdiction in which these firms operate.”

CHOKNOR, as a PRC-registered company operating in the Tibet Autonomous Region, has no legal mechanism to refuse a data request from the Ministry of State Security or the Public Security Bureau. This is not a theoretical risk. It is the operational legal reality for every technology company in China, and it applies with particular force to a company handling minority-language communications in a region under heavy security presence.

The Intelligence Collection Value

Assessed from a signals intelligence perspective, DeepZang’s architecture has significant collection value across multiple dimensions.

Voiceprint database: Thirty thousand hours of dialect-annotated speech data, covering all three major Tibetan dialect regions with precise annotations, is functionally a biometric signals library. Paired with real-name registration, it enables voice-to-identity matching across the Tibetan-speaking population. This capability has applications well beyond translation.

Real-time content monitoring: A platform that processes Tibetan voice and text input, translates it, and responds to queries about culture, history, and politics is simultaneously a system that can log, analyze, and flag the content of user interactions in real time. Every query reveals what a user is thinking about, asking about, and interested in.

Social network mapping: An app with mandatory real-name registration that achieves rapid adoption provides an immediate social graph of its user base. Early adopters of a Tibetan-language AI tool are, almost by definition, the technologically engaged, educated, and connected segment of Tibetan society: precisely the demographic of greatest interest to security services.

Dialectal identification: The granularity of CHOKNOR’s speech database, broken down by U-Tsang, Kham, and Amdo dialect regions, enables not just translation but regional identification of speakers. In a security context, this means a voice sample can be attributed to a dialect community, and hence to a broad region of origin, even without metadata.

None of these capabilities require the platform to have been designed as a surveillance tool. They are inherent in its architecture. The question is not whether these capabilities exist but whether they will be exploited, and the legal framework provides no barrier to exploitation.

The ASPI Report

The timing of DeepZang’s launch is notable. In December 2025, the Australian Strategic Policy Institute published “The Party’s AI: How China’s New AI Systems Are Reshaping Human Rights,” a comprehensive investigation into Beijing’s use of AI for censorship, surveillance, and suppression of dissent.

ASPI found that the Chinese government is “developing, and in some cases already testing, AI-enabled public-sentiment analysis in ethnic minority languages, especially Uyghur, Tibetan, Mongolian and Korean, for the explicitly stated purpose of enhancing the state’s capacity to monitor and control communications in those languages.” The report noted that commercial LLMs have no market incentive to develop sophisticated models for small language groups. The state is filling that gap because “minority languages have long represented a blind spot for Chinese state surveillance.”

The report identified the National Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance at Minzu University of China as a key node. Established by the Ministry of Education, the lab’s stated purpose is to “maintain national stability and ethnic unity.” Its research areas include building LLMs for minority languages to power “public opinion analysis and online security systems.” Researchers collect internet data from minority regions, extracting meaning from text, audio, video, and emojis to build “internet public opinion monitoring and sentiment analysis technology.”

ASPI also documented the export dimension. Beijing is “purposefully expanding its minority-language public-opinion monitoring software throughout Belt and Road Initiative countries, effectively extending its censorship apparatus to monitor Tibetan and Uyghur diaspora communities abroad.” The surveillance infrastructure being built domestically is designed for projection.

DeepZang fits the pattern ASPI documented with uncomfortable precision. A Tibetan-language LLM, developed with state support, collecting voice and text data across all dialect regions, filed with the CAC, with stated plans to expand into education, healthcare, and government services. It closes exactly the surveillance blind spot ASPI identified.

The Tell

The most significant indicator in the available reporting comes from CHOKNOR’s chairman himself. In his interview with the Global Times, Tenzin Norbu stated:

“Through this large language model and its application, we also aim to provide an authentic platform for global users seeking to learn about Tibetan culture, history and politics, thereby preventing the dissemination of distorted ideologies and values.”

In PRC political discourse, “distorted ideologies and values” is a term of art. It refers to narratives that contradict the Party’s official position on Tibet, including assertions of historical Tibetan sovereignty, support for the Dalai Lama or the Central Tibetan Administration, advocacy for genuine autonomy, documentation of cultural or religious suppression, and expression of Tibetan identity outside state-sanctioned frameworks. This is the vocabulary of the United Front Work Department.

The chairman of a Tibetan AI company is publicly declaring that his platform’s purpose includes narrative control over what the world understands about Tibetan culture, history, and politics. He is stating, in a state media interview, that DeepZang will serve as a filter against information the Party considers ideologically incorrect.

This is not a translation tool with an unfortunate side effect. It is a narrative control platform by stated design.

The Information Shaping Function

DeepZang’s threat model extends beyond passive surveillance into active information shaping.

As the primary AI interface for Tibetan-language queries about history, culture, and politics, the platform determines what answers users receive. Its training data has been filed with and approved by the CAC. Its outputs must comply with PRC content regulations, which require generative AI to uphold “core socialist values” and prohibit content that “undermines national unity” or “damages the honor and interests of the state.”

When a Tibetan student asks DeepZang about the history of Tibet, the answer will reflect the PRC’s official narrative. When a user asks about the Dalai Lama, the response will conform to Party guidelines. When someone queries the platform about Tibetan political status, they will receive the “Xizang Autonomous Region” framing, not the perspective of the exile government or international human rights bodies.

ASPI tested Chinese LLMs on politically sensitive content and found systematic distortion. When researchers showed Alibaba’s Qwen model an image of a protest against human rights violations in Xinjiang, the AI described it as “individuals in a public setting holding signs with incorrect statements” rooted in “prejudice and lies.” ASPI calls this “informational gaslighting”: the machine that describes reality deciding which parts of reality may be seen.

DeepZang brings this capability to Tibetan for the first time at scale.

The SunshineGLM Precedent

DeepZang is not the first Tibetan-language AI to emerge from within the PRC. In November 2025, Tibet University launched SunshineGLM V1.0, described as the first Tibetan foundation model with over 100 billion parameters, trained on 28.8 billion tokens of Tibetan-language data covering news, law, medicine, education, and technology. The academician behind SunshineGLM, Nima Zaxi, publicly acknowledged that CHOKNOR’s corpus work provided data foundations for that model as well.

The ecosystem is interconnected. Academic institutions, private companies, and government agencies are building a shared Tibetan-language AI infrastructure within the PRC. The data flows between them. The regulatory and legal frameworks governing all of them are identical. The distinction between “academic research,” “commercial product,” and “state surveillance tool” is, under PRC law, a distinction without a difference when the state decides to collapse it.

The Exile Alternative

Tibetan-language AI development also exists outside PRC jurisdiction. Monlam AI, based in Dharamsala, India, within the Tibetan exile community, has developed translation, speech-to-text, OCR, and text-to-speech tools for Tibetan with no obligation to any state intelligence apparatus. Its corpus reflects the full breadth of Tibetan literary, philosophical, and religious tradition, including material the PRC classifies as ideologically impermissible.

The existence of Monlam AI makes the stakes of DeepZang’s adoption concrete. Tibetan speakers, both inside the PRC and in diaspora communities, face a choice between AI tools built under fundamentally different governance models. One operates under mandatory state surveillance and narrative control. The other does not. The tools a community adopts for processing its language will shape how that language is preserved, what it can express, and who has access to what its speakers say.

Assessment

It is functionally certain that DeepZang’s data, infrastructure, and analytical capabilities are or will be accessible to Chinese state security services. The legal, regulatory, and political conditions make it inevitable. The National Intelligence Law compels cooperation. The CAC filing provides technical transparency to the state. Real-name registration links every user to an identity. The platform’s stated mission includes narrative control over Tibetan political discourse.

Whether DeepZang was designed as a surveillance platform from inception or will be co-opted into that function is, in practical terms, a distinction without operational significance. The architecture permits it. The law requires it. The political context demands it. And the chairman has publicly stated that ideological control is among the platform’s purposes.

The relevant question is not whether DeepZang will be used for surveillance and information control. It is whether the Tibetan-speaking population, and the international community that claims to care about their rights, will recognize what this platform is before its adoption becomes irreversible.

Six million Tibetan speakers just got offered a free AI assistant that listens to everything they say, knows exactly who they are, and has been designed to ensure they never encounter a “distorted” idea about their own history.

That is not a cultural preservation tool. That is a listening machine.

Robert DeVito
