On the one hand, artificial intelligence strengthens data security. Machine learning and related techniques can automate data identification, data protection, and security traceability, improving the overall protection of data.
On the other hand, artificial intelligence also creates data security risks. Its large-scale use amplifies traditional problems such as the over-collection of data, and has even given rise to new classes of attack such as "data poisoning".
The safety of new technologies is once again a topic of heated discussion. At the World Artificial Intelligence Conference held on July 9, He Xiaolong, deputy director of the National Industrial Information Security Development Research Center, highlighted these often-overlooked data security aspects of artificial intelligence in a keynote speech on "Artificial Intelligence Data Security and Regulatory Mechanism Research".
He Xiaolong pointed out that artificial intelligence exacerbates traditional data security problems, mainly in four ways:
The first is over-collection. Face recognition systems, smart audio devices, and mobile apps are now ubiquitous, and they collect large volumes of strongly personal data such as facial images, voiceprints and other biometric information, as well as behavior trajectories. Once leaked, this data directly threatens personal privacy.
The second is data theft. Using technologies such as image recognition and optical character recognition (OCR), attackers can automatically crack picture and character verification codes to gain access to system data.
For example, in October 2018, Apple's App Store suffered a large-scale "password-free payment" theft incident: criminals used image recognition to build a "coding platform" that automatically solved image verification codes with an accuracy of 95% or higher.
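The "coding platform" attack above can be sketched, in greatly simplified form, as template matching against known glyphs. The 3x3 bitmaps below are invented purely for illustration; real platforms use trained OCR models on full-size images.

```python
# Toy sketch of automated verification-code recognition via template matching.
# The glyph bitmaps are hypothetical 3x3 "fonts", not real captcha data.

GLYPHS = {
    "1": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "7": (1, 1, 1,
          0, 0, 1,
          0, 1, 0),
}

def recognize(bitmap):
    """Return the glyph whose template overlaps the input pixels the most."""
    def overlap(template):
        return sum(a == b for a, b in zip(template, bitmap))
    return max(GLYPHS, key=lambda ch: overlap(GLYPHS[ch]))

# A "1" with one flipped (noisy) pixel is still recognized correctly.
noisy_one = (0, 1, 0,
             0, 1, 0,
             1, 1, 0)
print(recognize(noisy_one))  # "1"
```

Even this crude matcher tolerates noise, which is why simple image captchas provide little protection against automated recognition.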
The third is reverse restoration. Using techniques such as data correlation and algorithmic deduction, attackers can reconstruct core algorithms and training data through a model's public access interface, leaking personal information and trade secrets.
According to related reports, current mainstream natural language models can leak portions of their training data.
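One way such leakage is probed is a membership-inference test: query the model and treat unusually high confidence as a sign that the input appeared in the training set. Below is a minimal sketch using a toy "model" that simply memorizes its data; the function names, sample values, and threshold are all hypothetical.

```python
# Minimal sketch of a membership-inference probe against an overfit model.
# Real attacks query a deployed model's confidence scores; this toy model
# memorizes its training set outright, the extreme case of overfitting.

def train(samples):
    """Toy 'model': memorizes its training set."""
    return set(samples)

def confidence(model, sample):
    # A memorizing model is maximally confident on data it has seen.
    return 1.0 if sample in model else 0.1

def is_training_member(model, sample, threshold=0.9):
    """Attacker's inference: high confidence suggests the sample was trained on."""
    return confidence(model, sample) >= threshold

model = train(["alice@example.com", "bob@example.com"])
print(is_training_member(model, "alice@example.com"))  # True: likely in training set
print(is_training_member(model, "carol@example.com"))  # False
```

The principle carries over to real models: the more a model memorizes rather than generalizes, the more its outputs reveal about the data it was trained on.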
The fourth is open-source framework risk. Current artificial intelligence applications are largely built on open-source frameworks such as TensorFlow, which lack a dedicated security review mechanism and may therefore contain serious vulnerabilities.
"For example, in a security-testing project on mainstream open-source frameworks, we found 24 security issues in a short period of time, including 2 critical and 8 high-risk vulnerabilities," He Xiaolong said.
AI also brings new data security issues. He Xiaolong noted that artificial intelligence algorithms depend heavily on data, which can introduce new challenges such as "data poisoning".
What is "data poisoning"? Disguised data or malicious samples are inserted into the training data, corrupting its integrity and causing the trained model to produce erroneous results.
"For example, Microsoft's chatbot Tay was shut down for publishing discriminatory and offensive remarks, mainly because inappropriate data had been maliciously injected into its conversation data set," He Xiaolong said.
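The poisoning mechanism described above can be sketched on a toy classifier: injecting mislabeled samples shifts what the model learns, so a malicious input is later accepted as benign. The data points and the nearest-centroid model below are invented for illustration only.

```python
# Sketch of label-flip "data poisoning" against a 1-D nearest-centroid
# classifier. Feature values and labels are fabricated for illustration.

def train_centroids(data):
    """Return the mean feature value for each label."""
    sums, counts = {}, {}
    for x, label in data:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {lbl: sums[lbl] / counts[lbl] for lbl in sums}

def predict(centroids, x):
    """Classify x by the nearest label centroid."""
    return min(centroids, key=lambda lbl: abs(centroids[lbl] - x))

clean = [(0.0, "ham"), (1.0, "ham"), (9.0, "spam"), (10.0, "spam")]
model = train_centroids(clean)
print(predict(model, 7.8))   # "spam": correctly flagged

# Attacker injects spam-like samples mislabeled as "ham",
# dragging the "ham" centroid toward the spam region.
poisoned = clean + [(9.0, "ham"), (10.0, "ham"), (11.0, "ham")]
model = train_centroids(poisoned)
print(predict(model, 7.8))   # "ham": the same input now slips through
```

The attack never touches the model code; corrupting the training set alone is enough to change the model's behavior, which is what makes poisoning hard to detect.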
Second is the problem of sample bias. When the underlying data set lacks diversity and representativeness, an artificial intelligence algorithm can embed particular social value tendencies or biases and output unfair results. He Xiaolong cited a U.S. NIST study showing that nearly 200 face recognition algorithms perform markedly worse on non-white faces, with false match rates differing by as much as a factor of 100.
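The disparity cited above can be made concrete by computing a per-group false match rate (FMR), the kind of metric NIST reports. The evaluation counts below are fabricated purely to illustrate a 100x gap; they are not NIST's data.

```python
# Sketch: measuring per-group false match rates of a face-matching system.
# Each trial is (is_same_person, predicted_match); counts are invented.

def false_match_rate(results):
    """Fraction of impostor trials (different people) wrongly matched."""
    impostor_preds = [pred for same, pred in results if not same]
    if not impostor_preds:
        return 0.0
    return sum(impostor_preds) / len(impostor_preds)

# Hypothetical evaluation logs, split by demographic group.
group_a = [(False, False)] * 999 + [(False, True)] * 1    # 1 false match in 1000
group_b = [(False, False)] * 900 + [(False, True)] * 100  # 100 false matches in 1000

fmr_a = false_match_rate(group_a)
fmr_b = false_match_rate(group_b)
print(f"group A FMR: {fmr_a:.3f}")      # 0.001
print(f"group B FMR: {fmr_b:.3f}")      # 0.100
print(f"disparity: {fmr_b / fmr_a:.0f}x")  # 100x
```

Disaggregating error rates by group in this way is how such bias is detected in practice; an aggregate accuracy number would hide the gap entirely.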
There is also the problem of adversarial samples. An adversarial sample attack feeds deliberately perturbed input data to an artificial intelligence model, exploiting the model's feedback mechanism so that a normally functioning model outputs incorrect results, thereby corrupting its decisions.
In January of this year, Ruilai Smart Company demonstrated that the face recognition unlocking systems of 19 common Android phones could be defeated by wearing glasses printed with adversarial-sample patterns.
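A minimal sketch of how such a perturbation works, using the classic fast-gradient-sign idea on a toy linear scorer. The weights and inputs are invented; real attacks perturb image pixels fed to deep networks, but the principle of stepping each input against the gradient is the same.

```python
# Sketch of an FGSM-style adversarial perturbation against a linear scorer.
# For a linear model, the gradient of the score w.r.t. the input is just
# the weight vector, so the attack reduces to stepping against sign(W).

W = [2.0, -3.0, 1.0]  # hypothetical model weights

def score(x):
    """Linear 'confidence' that x belongs to the positive class."""
    return sum(w * xi for w, xi in zip(W, x))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, eps=1.0):
    """Perturb each feature by eps in the direction that lowers the score."""
    return [xi - eps * sign(w) for w, xi in zip(W, x)]

x = [1.0, -0.5, 0.5]
print(score(x))        # 4.0: confidently positive
x_adv = fgsm(x)
print(score(x_adv))    # -2.0: same-looking input now classified negative
```

Each feature moved by at most 1.0, yet the classification flipped; on images, the analogous per-pixel changes can be imperceptible to a human while still fooling the model.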
Finally, there is the problem of deep forgery: machine learning models merge and superimpose images or videos to generate fake files, enabling face swapping and voice cloning.
Recommendation: Carry out research on cutting-edge trusted AI algorithms
In response to the new threats posed by AI, major countries around the world attach great importance to the security regulation of artificial intelligence data.
At present, the United States has launched an active strategic plan to promote the integration and implementation of data security management requirements. Through scenario-specific and state-level legislation, it regulates data security in artificial intelligence fields such as face recognition, autonomous driving, and deep forgery; in the face recognition scenario, for example, California and Washington State have passed local legislation to protect such data.
The EU also attaches great importance to data security governance. Building on the GDPR (General Data Protection Regulation), it released the world's first "Artificial Intelligence Act" proposal in April this year, which stipulates that high-risk artificial intelligence technologies must clarify the relevant data security protection requirements before being put on the market.
In addition, relevant departments have issued detailed management regulations and frameworks for important areas of artificial intelligence, with targeted data management requirements.
In terms of countermeasures, He Xiaolong made four suggestions. First, establish and improve the regulatory framework for artificial intelligence data security: strengthen the legal system by studying and issuing implementation rules supporting existing laws, and accelerate legislative research in key areas such as AI-based identification, autonomous driving, and service robots.
Second, accelerate the construction of a risk-classification management system for AI security based on application scenario, scope of influence, and potential harm, and strengthen supervision of high-risk areas.
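As one illustration of what scenario-based risk classification could look like, here is a hypothetical tiering function; the tier names, scenario list, and rules are invented for this sketch and are not part of any official system.

```python
# Hypothetical sketch of scenario-based AI risk tiering along the three
# dimensions mentioned above: scenario, scope of influence, potential harm.

HIGH_RISK_SCENARIOS = {"face_recognition", "autonomous_driving", "medical_diagnosis"}

def risk_tier(scenario: str, scope: str, harm: str) -> str:
    """Map an AI application's attributes to an invented risk tier."""
    if scenario in HIGH_RISK_SCENARIOS or harm == "physical":
        return "high"      # strictest supervision
    if scope == "public" or harm == "financial":
        return "medium"
    return "low"

print(risk_tier("face_recognition", "public", "privacy"))  # "high"
print(risk_tier("chatbot", "internal", "none"))            # "low"
```

The point of such a scheme, as in the EU proposal mentioned earlier, is that regulatory obligations scale with the tier rather than applying uniformly to all AI systems.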
Third, promote research on cutting-edge trusted artificial intelligence algorithms, and formulate industry norms and technical standards for AI applications in different scenarios.
Fourth, raise the security awareness of all stakeholders, strengthen international exchange and cooperation across technology, ecosystems, and industry, further improve the global AI governance system, and promote the healthy development of artificial intelligence technology.