Chinese Military Collecting Voice Data Samples Of Indians From Sensitive Border Areas For Mass Surveillance

According to a report by New Kite Data Labs, the Chinese military is collecting voice data samples of Indians from sensitive border areas for mass surveillance through an Indian middleman.

According to a US-based think tank, voice samples from “military sensitive regions of India,” such as Jammu & Kashmir and Punjab, are being gathered by a Beijing-based AI company through an Indian middleman and sold to Chinese government organisations for “use and analysis.”

A report by New Kite Data Labs, which studies how China uses and exploits data, says the Beijing-based company, Speechocean, is believed to have close ties to Chinese security agencies and the People’s Liberation Army.

It is believed that China may use the information gathered by this company for “automated extra-territorial mass surveillance.”

Academic and New Kite Labs founder Christopher Balding said that Speechocean (SO), a business process outsourcing (BPO) company, recruits people from specified regions of India, particularly militarised areas, to record their voices.

“These people are paid small amounts of money to record phrases, words, sentences in their language and accent. These recordings are collected using the Speechocean app, which can be downloaded onto your phone. So, people from Kashmir, Punjab were identified and were paid money to record their voice samples, without really divulging the purpose. These samples were then sold to China,” Balding, who led the investigation, alleged.

In response to concerns raised by the transmission of speech data from India to China, Balding asserted that Speechocean is “known” to supply the Chinese military.

“Speechocean’s attempts to obfuscate their activity on behalf of the Chinese security agencies raise legitimate security questions and imply this data is used to train technological tools engaged in mass surveillance outside of China,” Balding said.

Speechocean describes itself as a provider of artificial intelligence data resources on its website, emphasising its commitment to providing “engineering data products and services to enterprises and scientific research institutions in the whole industry chain of AI.”

New Kite Labs has informed the Indian security establishment of its findings, according to Balding.

According to a source in an Indian security agency, the information is being investigated.

‘Absolute proof SO worked in Kashmir, Punjab’

According to the study, New Kite Labs researchers identified Speechocean, a Shanghai Stock Exchange-listed data provider that creates datasets for algorithmic model training and development, after they found a database in China containing a large number of Indian IP addresses.

Balding claims that SO uses a local middleman to gather voice data from India, particularly from sensitive areas.

“SO has worked in Punjab and Kashmir and we have absolute proof of that at every level,” Balding said.

“We obtained log files sent from Indian IP addresses in Punjab and Kashmir to Speechocean databases in China of voice file transfers. We traced this back to a recruitment effort where individuals recited scripts in Indian languages using the Speechocean app,” he added.

Due to the company’s alleged connections with the PLA and other Chinese security agencies, Balding described this as “worrisome.”

“The company is known to sell to the PLA’s cyber warfare division. There is a document where Speechocean was bidding to sell Vietnamese-language data to the PLA’s cyber warfare division. Selling this data is Speechocean’s primary business model. They gather data and sell it,” he claimed.

Asked about the nature of the voice samples collected in India, Balding said it is “unclear.”

“Since we do not have access to raw files, we are not clear about the nature of voice samples being sent to China from India,” he said.

Links to Chinese military, security agencies

He Lin, the current chairperson of Speechocean, started the company in 2005, according to the organization’s website.

According to the report, as of September 2021 He Lin was married to Cai Huizhi, the founder and chairman of Beijing Zhongke Haixun Digital Technology, a publicly traded Chinese defence company that supplies the Chinese military with critical submarine-related technology.

“The company website includes a video of President Xi Jinping touring a military installation equipped with its technology. Their relationship highlights the extent of their access and integration within the Chinese state security apparatus,” the report said.

The report claimed that SO is “deeply embedded in state security apparatus” because China’s National Computer Network and Information Center is a founding non-founder stakeholder, investor, and customer of the business.

The report continues, “this government entity is responsible for internet security and censorship in China and is an investor in SO through an investment fund and holding company.”

“Their mission is to localise technological development and advancement to make China a global tech leader and assist in the promotion and defence of national security in information management,” it adds.

The think tank further asserted that it has records showing that the corporation “collaborates with security intelligence agencies with foreign targets” in addition to maintaining public security within China.

“We have identified public tenders on projects relating to Vietnamese speech classification for the People’s Liberation Army Strategic Support Force (SSF), better known as the cyber warfare division. Other public tender documents relate to classification projects relating to machine translation projects in English and for Chinese minority languages in northwestern China,” the report says.

Data traced back to Beijing, Hong Kong

The data gathered by SO, which included voice samples of words, phrases, or conversations with particular “accents and nationality,” was linked to three main IP addresses in Beijing, Hong Kong, and Germany, according to the New Kite Labs data report.

“The data was tracked to Aliyun Computing in Beijing, Alicloud in Hong Kong and servers in Frankfurt, Germany registered to Alibaba Singapore,” the report says.

In addition to having the capacity to collect and store data, the report stated that China has invested “major resources to create technological capabilities through software to automate behavior oversight technologies usually assisted through AI (Artificial Intelligence) / ML (Machine Learning) applications.”
