Many technologies can deceive artificial intelligence, and many artificial intelligence technologies are used to deceive people. In the era of artificial intelligence (AI), security issues cannot be ignored.
In recent years, artificial intelligence technology has achieved notable success in many fields: image classification, target tracking in video surveillance, autonomous driving, face recognition, the game of Go, and more. So is artificial intelligence technology actually safe? In fact, current artificial intelligence technology still has many problems.
Artificial intelligence is not safe
Many techniques can deceive artificial intelligence, for example by adding adversarial perturbations to an image. An adversarial perturbation is produced by an algorithm that exploits the weaknesses of a discriminative model: it carefully constructs samples that differ only minimally from normal samples yet cause the model to misclassify them. As shown in Figure 1, an image of a pistol, once such a perturbation is added, is no longer recognized as a gun. Hanging a sign with a specific pattern in front of a person can make that person "invisible" to a video surveillance system (see Figure 2). In an autonomous driving scenario, adding a small perturbation to a speed limit sign can mislead the driving system into reading it as "Stop" (see Figure 3), which is obviously a serious traffic safety hazard. On the other hand, some artificial intelligence techniques are being abused to deceive people, for example by generating fake content: face-swapped videos, fake news, synthetic faces, fake social media accounts, and so on.
Figure 1. Recognized as a normal image by the riot detection system
Figure 2. Becoming invisible under intelligent surveillance
Figure 3. Misleading an autonomous driving system
These security risks are not limited to images and video; they exist in speech recognition as well. For example, adding a tiny perturbation to an audio clip can cause a speech recognition system to transcribe it incorrectly. Similarly, in text classification, changing a single letter can be enough to make the content misclassified.
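The text case can be illustrated with a deliberately simple sketch. The keyword filter and the message below are hypothetical stand-ins, not any real system, but they show the failure mode: a single look-alike character substitution slips past a classifier that matches on surface form.

```python
# Hypothetical keyword-based spam filter (not a real system), used only to
# show how a single look-alike character substitution evades surface matching.

SPAM_KEYWORDS = {"free", "winner", "prize"}

def is_spam(text: str) -> bool:
    """Flag a message as spam if any token is a known spam keyword."""
    return any(tok in SPAM_KEYWORDS for tok in text.lower().split())

def one_char_attack(text: str) -> str:
    """Swap one letter of the first flagged keyword for a look-alike digit."""
    swaps = {"e": "3", "i": "1", "o": "0"}
    tokens = text.split()
    for idx, tok in enumerate(tokens):
        if tok.lower() in SPAM_KEYWORDS:
            for ch, sub in swaps.items():
                if ch in tok.lower():
                    tokens[idx] = tok.replace(ch, sub, 1)
                    return " ".join(tokens)
    return text

msg = "Claim your free gift now"
print(is_spam(msg), is_spam(one_char_attack(msg)))  # True False
```

Learned text models fail the same way; the attack just searches for the character edit the model is most sensitive to instead of using a fixed substitution table.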
Beyond adversarial attacks, there are also backdoor attacks. A backdoor attack inserts a backdoor into the training data of a recognition system, making the system sensitive to a specific trigger signal and inducing it to produce wrong behavior specified by the attacker. For example, during training we can insert a backdoor pattern into certain samples of one class, such as adding a specific pair of glasses to images of a person, and use training techniques to make the model associate those glasses with a particular output (such as a specific celebrity). After training, the model still recognizes that person correctly; but if another person's picture is given the same specific glasses, that person will be recognized as the target. The backdoor left in the model during training is thus a real safety hazard.
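The poisoning step can be sketched as follows. The trigger (a bright corner patch), the 10% poisoning rate, and the random arrays standing in for an image dataset are all illustrative choices, not the glasses example or any real training set.

```python
import numpy as np

# Illustrative sketch of backdoor data poisoning: a fraction of training
# samples gets a trigger patch stamped on and the label rewritten to the
# attacker's target class. A model trained on this data behaves normally
# on clean inputs but maps triggered inputs to the target class.

def add_trigger(img, value=1.0, size=3):
    """Stamp a bright square trigger into the bottom-right corner."""
    out = img.copy()
    out[-size:, -size:] = value
    return out

def poison_dataset(images, labels, target_class, rate=0.1, seed=0):
    """Poison a fraction `rate` of the samples with trigger + target label."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_class
    return images, labels, idx

X = np.random.default_rng(1).random((100, 8, 8))   # 100 toy 8x8 "images"
y = np.random.default_rng(2).integers(0, 10, 100)  # labels in 0..9
Xp, yp, idx = poison_dataset(X, y, target_class=7)
print(len(idx), yp[idx])  # 10 poisoned samples, all relabeled to the target
```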
Beyond adversarial samples and backdoors, the abuse of AI technology can create new security risks, for example fabricated content. Not all of it is generated by artificial intelligence; some is made by people. Earlier, the "Shenzhen Special Zone News" reported that "the most beautiful girl in Shenzhen" fed disabled beggars and moved passers-by, and the story was picked up by People's Daily Online and Xinhua News Agency. Later, people dug deeper and found that the story had been staged. There are many such examples on social networks, and much so-called news is in fact untrue. On the one hand, artificial intelligence can play an important role in detecting whether news is true or false; on the other hand, it can also be used to generate false content, for instance using generative algorithms to produce a face that does not exist at all.
Using artificial intelligence to generate fake videos, and in particular face-swapping to fabricate a video of a specific person, may threaten social stability and even national security. Imitating a leader's speech, for example, may deceive the public. It is therefore urgent to explore whether generation technology requires identification mechanisms or corresponding management standards. Another example is generating fake faces and building fake social accounts around them, letting them form associations with many real people and even hold automated conversations; such an account looks like a real person's but is entirely fabricated. How to govern such situations requires further exploration and research.
Technical analysis of AI security risks
To defend against these hidden dangers in AI, we must first understand the technology that produces them. Take adversarial sample generation as an example. It falls into two main categories: generation in white-box scenarios and generation in black-box scenarios. In a white-box scenario, the model's parameters are fully known and accessible, which makes attacks relatively easy: one only needs to evaluate how input changes in each direction affect the model's output, find the direction of highest sensitivity, and apply a small perturbation in that direction to complete the attack. Attacks in black-box scenarios are harder, and most real-world situations are black-box: we can still query the model remotely, feeding in samples and receiving the detection results, but we cannot access the parameters inside the model.
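In the white-box case, the "direction of highest sensitivity" is simply the gradient of the loss with respect to the input. A minimal sketch is the fast gradient sign method (FGSM) on a toy logistic-regression model; the model and data below are stand-ins, not from the article.

```python
import numpy as np

# White-box FGSM sketch: with the weights in hand, compute the exact
# gradient of the loss w.r.t. the input and take one signed step along it.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    return sigmoid(w @ x + b)

def fgsm(w, b, x, y, eps):
    """One FGSM step: move x in the sign of the input gradient of the loss."""
    p = predict(w, b, x)
    grad_x = (p - y) * w      # exact d(logistic loss)/dx for this model
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.0
x = rng.normal(size=5)
y = 1.0 if predict(w, b, x) > 0.5 else 0.0   # the model's current label
x_adv = fgsm(w, b, x, y, eps=1.0)
print(predict(w, b, x), predict(w, b, x_adv))  # prediction pushed away from y
```

For this linear model the signed step provably moves the prediction away from the label; for deep networks the same one-step recipe works surprisingly often, which is what makes white-box attacks cheap.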
Black-box attacks at this stage fall roughly into three categories. The first is transfer-based attacks: the attacker uses the input-output behavior of the target model to train a substitute model that mimics its decision boundary, generates adversarial samples against the substitute with white-box methods, and relies on the transferability of those samples to attack the target. The second is gradient-estimation attacks: the attacker estimates gradient information using finite differences or natural evolution strategies, and combines the estimates with white-box attack methods to generate adversarial samples. In a natural evolution strategy, the attacker samples multiple random unit vectors as search directions and maximizes the expected value of the adversarial objective along them. The third is decision-boundary attacks: the decision boundary is located through a heuristic search strategy, and the attack then walks along the boundary, continually searching for adversarial samples ever closer to the original sample.
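The gradient-estimation idea can be sketched with the natural-evolution-strategies (NES) estimator. The quadratic objective below is a stand-in for a model's loss, since against a real black box only the queried values f(x) would be available.

```python
import numpy as np

# Black-box NES gradient estimation: probe random directions u, query the
# scalar output on both sides of x, and average the antithetic differences.

def nes_gradient(f, x, sigma=0.001, n_samples=4000, seed=0):
    """Antithetic NES gradient estimate using 2 * n_samples queries of f."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.normal(size=x.shape)                 # random search direction
        grad += (f(x + sigma * u) - f(x - sigma * u)) * u
    return grad / (2 * sigma * n_samples)

f = lambda x: float(np.sum(x ** 2))   # true gradient at x is 2x
x = np.array([1.0, -2.0, 3.0])
g = nes_gradient(f, x)
print(g)  # close to [2, -4, 6]
```

The query budget is the whole game here: every probe is two model queries, which is why the black-box attacks discussed later count success in thousands of queries.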
Where there is attack, there is defense. There are currently three main approaches to handling adversarial samples. The first is to train a binary classifier to detect whether a sample has been perturbed, but its generality is poor: such a classifier usually works only against a specific attack algorithm, and in practice one does not know which algorithm an attacker will use. The second is to train a denoiser: since adversarial perturbation essentially adds noise to a sample, removing that noise can restore the sample and achieve defense. The third is adversarial training, which improves the robustness of the model itself: by adding adversarial samples to training, the model becomes more robust against them and recognizes them more reliably, though the training cost is high. On the whole, none of these methods is ideal, and a general-purpose, efficient defense against adversarial samples is urgently needed.
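The third defense, adversarial training, can be sketched as follows. This is one common recipe (a single FGSM step against the current model inside the training loop), not the only one, and the data is synthetic 2-D toy data.

```python
import numpy as np

# Adversarial training sketch: at every step, perturb the batch with FGSM
# against the current model, then take the gradient step on the perturbed
# batch, so the learned boundary keeps a margin against such attacks.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adv_train(X, y, eps=0.2, lr=0.5, steps=300):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        X_adv = X + eps * np.sign((p - y)[:, None] * w)  # FGSM on the batch
        p_adv = sigmoid(X_adv @ w + b)
        w -= lr * X_adv.T @ (p_adv - y) / len(y)         # logistic-loss grads
        b -= lr * np.mean(p_adv - y)
    return w, b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
w, b = adv_train(X, y)
acc = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(acc)  # clean accuracy after adversarial training
```

The inner attack roughly doubles the cost of each training step even in this toy setting; with stronger multi-step attacks inside the loop, the overhead grows accordingly, which is the "high training complexity" noted above.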
For generating face-swapped videos, the current mainstream technique reconstructs face images with autoencoders. In the training stage, all face images share the same encoder, whose goal is to learn to capture the key features of a face. For reconstruction, each person has a separate decoder, which learns the characteristics unique to that person's face. The trained encoder and decoders can then be combined to generate fake faces.
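The shared-encoder, per-identity-decoder wiring can be shown structurally. The "layers" below are untrained random linear maps, so this illustrates only the architecture and the swap operation, not realistic output.

```python
import numpy as np

# Structural sketch of the face-swap autoencoder: one shared encoder, one
# decoder per identity. Swapping = encode A's face, decode with B's decoder.

class FaceSwapAE:
    def __init__(self, dim_img=64, dim_code=16, seed=0):
        rng = np.random.default_rng(seed)
        self.enc = rng.normal(size=(dim_code, dim_img)) / np.sqrt(dim_img)
        self.decoders = {}                   # one decoder per identity

    def add_identity(self, name, seed):
        rng = np.random.default_rng(seed)
        dim_code, dim_img = self.enc.shape
        self.decoders[name] = rng.normal(size=(dim_img, dim_code)) / np.sqrt(dim_code)

    def encode(self, face):
        return self.enc @ face               # shared across all identities

    def reconstruct(self, face, identity):
        return self.decoders[identity] @ self.encode(face)

model = FaceSwapAE()
model.add_identity("A", seed=1)
model.add_identity("B", seed=2)
face_a = np.random.default_rng(3).random(64)
fake = model.reconstruct(face_a, "B")   # A's features rendered by B's decoder
print(fake.shape)  # (64,)
```

Because the encoder is shared, the code it produces is identity-agnostic; routing it through B's decoder is what renders A's expression and pose with B's facial characteristics.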
For identifying face-swapped videos, the current mainstream technique relies on visual artifacts, on the assumption that face-swapped videos are not fully realistic. Features can therefore be extracted from blink frequency, head pose estimation, illumination estimation, geometric consistency, and so on, and used to judge whether a face image or video is authentic.
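A minimal sketch of this feature-based idea, using a single cue: blink rate. The thresholds and the synthetic per-frame eye-openness signals are hypothetical; real detectors learn from many cues (head pose, lighting, geometry) rather than hand-set ranges.

```python
# Illustrative blemish-based check: flag clips whose blink rate falls
# outside a plausible human range. Thresholds here are assumptions.

def blink_rate(eye_open, fps=30):
    """Blinks per minute, counting open -> closed transitions per frame."""
    blinks = sum(1 for prev, cur in zip(eye_open, eye_open[1:])
                 if prev == 1 and cur == 0)
    minutes = len(eye_open) / fps / 60
    return blinks / minutes

def looks_fake(eye_open, fps=30, lo=4, hi=40):
    """Humans blink roughly 10-20 times/min; flag rates far outside that."""
    return not (lo <= blink_rate(eye_open, fps) <= hi)

real = ([1] * 85 + [0] * 5) * 10   # blinks regularly: 10 blinks in 30 s
fake = [1] * 900                   # never blinks over the whole clip
print(looks_fake(real), looks_fake(fake))  # False True
```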
Research progress in adversarial attack and defense
At present, we have increased investment in artificial intelligence security technology and carried out research on several AI security problems.
The first work is a black-box adversarial attack on video recognition models. Here we exploit the transferability of adversarial perturbations: perturbations obtained from an image-pretrained model serve as the initial perturbations for the video frames, and natural evolution strategies are then used to correct them. Once we obtain gradient information corrected for the video domain, we update the input video with projected gradient descent. This method can attack mainstream video recognition models in black-box settings, and it was the first published work on black-box attacks against video models. In our results, a targeted attack needs 30,000 to 80,000 queries to reach a 93% attack success rate, while an untargeted attack needs only a few hundred queries against mainstream models. A targeted attack must not only make the model misclassify but also control what the output is, for example recognizing a photo of person A as person B; an untargeted attack only requires the prediction to be wrong, for example any output other than A for a photo of A.
The second work is a sparse spatio-temporal adversarial attack on video. Because video data is high-dimensional, attack algorithms tend to be expensive. We therefore propose a video attack method based on spatio-temporal sparsity: adversarial perturbations are generated only in specific regions of specific frames, shrinking the search space and improving attack efficiency. To achieve this sparsity, we measure the importance of each frame with heuristic rules and select a subset of video frames to perturb; spatially, we restrict the perturbation to designated regions of those frames, for example the moving foreground such as a person. In this way an efficient black-box video attack can be realized.
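The frame- and region-selection step can be sketched as follows. The importance heuristic here (mean inter-frame motion) and the fixed foreground mask are simplifications of the heuristic rules described above.

```python
import numpy as np

# Spatio-temporal sparsity sketch: keep only the k most "important" frames
# and perturb only a masked region inside them, shrinking the search space.

def select_frames(video, k):
    """Pick the k frames with the largest mean change from their predecessor."""
    diffs = np.abs(np.diff(video, axis=0)).mean(axis=(1, 2))
    scores = np.concatenate([[0.0], diffs])   # frame 0 has no predecessor
    return np.sort(np.argsort(scores)[-k:])

def sparse_perturbation(video, frames, mask, eps, seed=0):
    """Add +/- eps sign noise only inside `mask` on the selected frames."""
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(video)
    for f in frames:
        delta[f] = mask * eps * np.sign(rng.normal(size=video.shape[1:]))
    return video + delta

video = np.zeros((16, 8, 8))
video[5] = 1.0                                  # motion burst around frame 5
frames = select_frames(video, k=2)
mask = np.zeros((8, 8)); mask[2:6, 2:6] = 1.0   # assumed foreground region
adv = sparse_perturbation(video, frames, mask, eps=0.05)
print(frames, np.count_nonzero(adv - video))    # only 2 frames x 16 pixels touched
```

In a real attack the sign noise would be replaced by perturbations optimized through black-box queries; the point of the sketch is that the optimization variable shrinks from the whole clip to a few masked frame regions.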
The third work is a backdoor attack on video recognition models. Previous backdoor research focused on the image field and generated backdoors as a fixed checkerboard pattern, an approach whose success rate on video is very low. We therefore propose a backdoor attack method designed for video data. We first generate backdoors on the video, placing the backdoor pattern in an inconspicuous corner of the frames; at the same time we apply adversarial perturbations to the rest of the original content, so that the model being trained relies more heavily on the backdoor. This produces the poisoned data, and replacing the corresponding samples in the original dataset with it realizes the backdoor attack. On public datasets this work achieves relatively good results: the average attack success rate across many categories is around 80%, far higher than existing backdoor attack methods based on image data.
Technology is essential to artificial intelligence governance
In the future, technology will play an important role in detecting artificial intelligence security problems and implementing the corresponding rules. To secure models, developing the theory of adversarial attack and defense will let us design more robust intelligent models, ensure the safe operation of intelligent systems in complex environments, and build capabilities for AI security assessment and control. For privacy protection, developing federated learning and differential privacy theory and technology will regulate how intelligent systems analyze and use data and protect the privacy of data owners. For the interpretability of intelligent decision-making, developing machine learning interpretability theory and technology will make algorithmic decision processes more comprehensible to humans and support a transparent supervision mechanism whose decisions can be reviewed, traced, and reproduced. For decision fairness, statistical theory and techniques can eliminate discriminatory bias in algorithms and data and build unbiased artificial intelligence systems. Finally, to ensure that artificial intelligence technology is not abused, developing big data computing, pattern recognition, and related theory and technology can help prevent, detect, and monitor the misuse of intelligent technology, creating an artificial intelligence application ecosystem beneficial to human well-being.