Machine learning is an interdisciplinary field that draws on statistics, system identification, approximation theory, neural networks, optimization theory, computer science, brain science, and many other areas. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to improve their own performance. It is the core of artificial intelligence technology. Data-based machine learning is one of the important methods in modern intelligent technology: it starts from observed data (samples), looks for rules in them, and uses these rules to predict future data or data that cannot be observed. Machine learning can be classified in different ways according to learning mode, learning method, and algorithm.
(1) According to learning modes, machine learning is classified into supervised learning, unsupervised learning and reinforcement learning, etc.
Supervised learning uses a limited set of labeled training data and a given learning strategy or method to build a model that labels (classifies) or maps new data or instances. The most typical supervised learning tasks are regression and classification. Supervised learning requires that the labels of the training samples be known. The more accurate the labels and the more representative the samples, the more accurate the learned model. Supervised learning has been widely used in natural language processing, information retrieval, text mining, handwriting recognition, spam detection, and other fields.
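As a minimal sketch of the regression case, the following fits a straight line to labeled samples by ordinary least squares and then predicts the label of an unseen input. The training values and the name `fit_line` are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled training set: inputs and their known target values.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(train_x, train_y)
prediction = slope * 5.0 + intercept  # label predicted for an unseen input
print(slope, intercept, prediction)
```

The labels here follow y = 2x + 1 exactly, so the fitted line recovers that rule; with noisy labels the fit would only approximate it.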
Unsupervised learning uses a limited set of unlabeled data to describe the structures or rules hidden in that data. The most typical unsupervised learning algorithms include single-class density estimation, single-class dimensionality reduction, and clustering. Unsupervised learning does not require labeled training samples or manual data annotation, which facilitates data compression and storage, reduces computation, improves algorithm speed, and avoids classification errors caused by shifts in positive and negative samples. It is mainly used in economic forecasting, anomaly detection, data mining, image processing, pattern recognition, and other fields, with applications such as large-scale computer cluster organization, social network analysis, market segmentation, and astronomical data analysis.
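Clustering, one of the algorithms named above, can be sketched with k-means on one-dimensional data. No labels are supplied; the points and the starting centroids below are assumptions chosen so that the result is easy to follow.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points with the standard k-means (Lloyd) iteration."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled measurements that form two visible groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, groups = kmeans_1d(data, centroids=[0.0, 6.0])
print(centers, groups)
```

The algorithm discovers the two groups from the data alone, which is the sense in which unsupervised learning describes hidden structure.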
Reinforcement learning is the process by which an intelligent system learns a mapping from environment states to behaviors so as to maximize the value of a reinforcement signal (reward) function. Because the external environment provides little information, a reinforcement learning system must learn from its own experience. The goal is to learn the mapping from environment state to behavior such that the behavior selected by the agent obtains the maximum reward from the environment, which means that, in a certain sense, the environment gives the learning system its best evaluation. Reinforcement learning has been used successfully in robot control, autonomous driving, chess playing, industrial control, and other fields.
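The state-to-behavior mapping described above can be illustrated with tabular Q-learning, one common reinforcement learning algorithm that the text does not name specifically. The five-state chain environment, the reward of 1.0 at the goal, and all parameter values are assumptions made for this sketch.

```python
import random

N_STATES = 5  # states 0..4; state 4 is the goal and ends the episode
ACTIONS = ["left", "right"]


def step(state, action):
    """Hypothetical chain environment: reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # purely random exploration
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = q_learning()
# The learned mapping from state to behavior: pick the highest-valued action.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told which action is correct; it only observes rewards, and the greedy policy read off the table is the learned mapping from state to behavior.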
(2) According to the learning methods, machine learning can be divided into traditional machine learning and deep learning.
Traditional machine learning
Traditional machine learning starts from a set of observed (training) samples and tries to discover laws that cannot be obtained through first-principles analysis, so as to predict the behavior or trend of future data accurately. Related algorithms include logistic regression, hidden Markov models, support vector machines, the K-nearest neighbor method, three-layer artificial neural networks, the AdaBoost algorithm, Bayesian methods, and decision trees. Traditional machine learning balances the validity of the learning results against the interpretability of the learned model, and it provides a framework for solving learning problems with finite samples. It is mainly used for pattern classification, regression analysis, probability density estimation, and similar tasks when samples are limited. Statistics is one of the important theoretical foundations of traditional machine learning methods, which have been widely used in many areas of computing such as natural language processing, speech recognition, image recognition, information retrieval, and bioinformatics.
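Of the algorithms listed above, the K-nearest neighbor method is among the easiest to sketch. The version below classifies a query point by majority vote among its k closest training samples; the sample data and the function name `knn_predict` are assumptions for illustration.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled samples in a 2-D feature space.
samples = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]
print(knn_predict(samples, (1.1, 1.0)))
print(knn_predict(samples, (4.9, 5.0)))
```

The prediction can be explained by pointing at the neighbors that voted for it, which illustrates the interpretability that the paragraph attributes to traditional methods.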
Deep learning is a learning method that builds models with deep structures. Typical deep learning algorithms include deep belief networks, convolutional neural networks, restricted Boltzmann machines, and recurrent neural networks. Deep learning models are also called deep neural networks (neural networks with more than three layers). As an emerging field of machine learning research, deep learning was proposed by Hinton et al. in 2006. It is derived from multi-layer neural networks and provides a way to combine feature representation and learning into a single process. Deep learning is characterized by giving up interpretability and pursuing only the effectiveness of learning. After years of exploration and research, many deep neural network models have been produced, among which convolutional neural networks and recurrent neural networks are the two typical ones. Convolutional neural networks are often applied to spatially distributed data. Recurrent neural networks introduce memory and feedback into the network and are often applied to temporally distributed data.
A deep learning framework is the underlying platform on which deep learning work is carried out. It generally includes mainstream neural network models and algorithms, provides a stable deep learning API, and supports distributed training across servers and on GPUs and TPUs. Some frameworks can also be ported to a variety of platforms, including mobile devices and cloud platforms. This brings unprecedented speed and practicality to deep learning algorithms. At present, mainstream open-source frameworks include TensorFlow, Caffe/Caffe2, CNTK, MXNet, PaddlePaddle, Torch/PyTorch, and Theano.
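To suggest why convolutional networks suit spatially distributed data, the sketch below implements the core operation of a convolutional layer in plain Python, without any of the frameworks listed above. The image, the hand-picked kernel, and the name `conv2d_valid` are assumptions; a trained network would learn its kernels from data.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    As in most deep learning libraries, the kernel is not flipped, so this is
    technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A hand-picked vertical-edge kernel.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The same small kernel is reused at every position, so the layer responds to a local spatial pattern (here, a dark-to-bright edge) wherever it occurs in the image.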
(3) In addition, the common algorithms of machine learning also include transfer learning, active learning and evolutionary learning.
Transfer learning
Transfer learning refers to learning with the help of relationships obtained from data in another domain when not enough data is available for model training in the domain of interest. Transfer learning can pass the parameters of an already trained model to a new model to guide its training, so that the new model learns the underlying rules more effectively and needs less data. At present, transfer learning is mainly used in small-scale applications with a limited number of variables, such as sensor-network-based positioning, text classification, and image classification. In the future, transfer learning is expected to be widely used for more challenging problems, such as video classification, social network analysis, and logical reasoning.
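The idea of passing trained parameters to a new model can be sketched with a deliberately tiny linear model. In this sketch, which rests entirely on assumed toy data, a model is first trained on a source domain and its parameters are then used as the starting point for a short training run on a related target domain that has only three samples.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(w, b, xs, ys, steps, lr=0.05):
    """Gradient descent on mean squared error for the model y = w * x + b."""
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Hypothetical source domain with more data: y = 2x + 1.
src_x = [0.0, 1.0, 2.0, 3.0, 4.0]
src_y = [2 * x + 1 for x in src_x]
# Hypothetical target domain with few samples and a related rule: y = 2x + 1.5.
tgt_x = [0.0, 1.0, 2.0]
tgt_y = [2 * x + 1.5 for x in tgt_x]

w_src, b_src = train(0.0, 0.0, src_x, src_y, steps=500)

# Same small training budget on the target, with two different starting points.
w_scratch, b_scratch = train(0.0, 0.0, tgt_x, tgt_y, steps=5)
w_transfer, b_transfer = train(w_src, b_src, tgt_x, tgt_y, steps=5)

loss_scratch = mse(w_scratch, b_scratch, tgt_x, tgt_y)
loss_transfer = mse(w_transfer, b_transfer, tgt_x, tgt_y)
print(loss_scratch, loss_transfer)
```

With the same five training steps, the model initialized from the source parameters ends with a lower error on the target data than the model trained from zero, because the two domains share most of their underlying rule.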
Active learning uses a query algorithm to select the most useful unlabeled samples, hands them to experts for labeling, and then uses the newly labeled samples to train the classification model and improve its accuracy. In this way, active learning acquires knowledge selectively and can obtain a high-performance model with fewer training samples. The most commonly used strategies select informative samples according to uncertainty criteria and diversity criteria.
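The uncertainty criterion mentioned above can be sketched for a binary classifier: the samples whose predicted probability is closest to 0.5 are the ones the model is least sure about, so they are sent to the expert first. The probabilities and the name `most_uncertain` are assumptions for illustration.

```python
def most_uncertain(probabilities, batch_size=2):
    """Return indices of the samples whose predicted probability is closest to 0.5.

    `probabilities[i]` is the current model's estimate that unlabeled sample i
    belongs to the positive class.
    """
    ranked = sorted(range(len(probabilities)), key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical model outputs for five unlabeled samples.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.80]
to_label = most_uncertain(pool_probs)
print(to_label)
```

Samples 1 and 3 are selected because their probabilities (0.52 and 0.45) lie nearest the decision boundary; confidently classified samples such as 0.95 are left unlabeled.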
Evolutionary learning places very few requirements on the nature of the optimization problem: it only needs to be able to evaluate the quality of a candidate solution. It is suited to complex optimization problems and can be applied directly to multi-objective optimization. Evolutionary algorithms include particle swarm optimization, multi-objective evolutionary algorithms, and others. Current research on evolutionary learning focuses mainly on clustering evolutionary data, classifying evolutionary data more effectively, and providing adaptive mechanisms to determine the impact of evolutionary mechanisms.
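The point that only solution quality needs to be evaluated can be sketched with a (1+1) evolution strategy, a very simple evolutionary algorithm chosen here for brevity rather than one of the algorithms named above. The objective function, starting point, and parameter values are assumptions.

```python
import random


def sphere(x):
    """Objective to minimize; the algorithm only ever calls this function."""
    return sum(v * v for v in x)


def one_plus_one_es(fitness, start, sigma=0.5, generations=200, seed=0):
    rng = random.Random(seed)
    parent, parent_fit = list(start), fitness(start)
    for _ in range(generations):
        # Mutation: perturb every coordinate with Gaussian noise.
        child = [v + rng.gauss(0.0, sigma) for v in parent]
        child_fit = fitness(child)
        # Selection: keep the child only if it is at least as good.
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit


best, best_fit = one_plus_one_es(sphere, [5.0, 5.0])
print(best_fit)
```

No gradient or other structural knowledge of `sphere` is used, so any function that scores a candidate solution could be substituted for it.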
2. Knowledge graph
A knowledge graph is essentially a structured semantic knowledge base, stored as a graph data structure composed of nodes and edges. It describes concepts in the physical world and the relationships between them in symbolic form. Its basic units are “entity-relationship-entity” triples, together with entities and their associated “attribute-value” pairs. Different entities are connected to one another through relationships, forming a networked knowledge structure. In a knowledge graph, each node represents a real-world “entity” and each edge represents a “relationship” between entities. In plain terms, a knowledge graph is a network of relationships that connects all kinds of information and makes it possible to analyze problems from a “relational” perspective.
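The triple structure described above can be sketched as a small in-memory graph. The class name `KnowledgeGraph`, its methods, and the facts stored in it are assumptions made purely for illustration; production systems typically use dedicated graph stores.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Stores (head entity, relation, tail entity) triples as a directed graph."""

    def __init__(self):
        self.edges = defaultdict(list)  # head -> [(relation, tail), ...]

    def add(self, head, relation, tail):
        self.edges[head].append((relation, tail))

    def tails(self, head, relation):
        """Entities reachable from `head` over one edge labeled `relation`."""
        return [t for r, t in self.edges.get(head, []) if r == relation]


# Hypothetical facts, used only for illustration.
kg = KnowledgeGraph()
kg.add("Alice", "works_for", "Acme")
kg.add("Bob", "works_for", "Acme")
kg.add("Acme", "located_in", "Berlin")

# Two-hop relational query: in which city is Alice's employer located?
cities = [c for org in kg.tails("Alice", "works_for") for c in kg.tails(org, "located_in")]
print(cities)
```

Following edges from entity to entity in this way is what analyzing a problem from a relational perspective amounts to in practice.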
Knowledge graphs can be used in public security applications such as anti-fraud, inconsistency verification, and group fraud detection, which require data mining methods such as anomaly analysis, static analysis, and dynamic analysis. Knowledge graphs also have great advantages in search engines, visual display, and precision marketing, and they have become a popular tool in industry. However, their development still faces major challenges, such as data noise: the data itself may contain errors or redundancy. As knowledge graph applications develop, breakthroughs are still needed in a series of key technologies.
3. Natural language processing
Natural language processing is an important direction in computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language. It covers many areas, including machine translation, machine reading comprehension, and question answering systems.
(1) Machine translation
Machine translation is the use of computer technology to translate from one natural language into another. Statistical machine translation broke through the limitations of earlier rule-based and example-based methods and greatly improved translation performance. Machine translation based on deep neural networks has been applied successfully in some scenarios, such as everyday spoken language, and has shown great potential. As context representation and knowledge-based logical reasoning improve and natural language knowledge graphs expand, machine translation is expected to make further progress in multi-turn dialogue translation and discourse-level translation.
Statistical machine translation (SMT), which consists of a training phase and a decoding phase, is among the best-performing approaches to unrestricted-domain machine translation. The goal of the training phase is to obtain the model parameters; the goal of the decoding phase is to use the estimated parameters and the given optimization objective to find the best translation of the sentence to be translated. Statistical machine translation mainly involves corpus preprocessing, word alignment, phrase extraction, phrase probability calculation, and maximum-entropy reordering. The end-to-end translation method based on neural networks does not require hand-designed feature models for bilingual sentences. Instead, it feeds the word sequence of the source-language sentence directly into a neural network model, which outputs the translation in the target language. In end-to-end machine translation systems, recurrent neural networks or convolutional neural networks are usually used to model sentence representations, and semantic information is extracted from massive training data. Compared with phrase-based statistical translation, the results are more fluent and natural, and better results have been achieved in practice.
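The phrase probability calculation step of the statistical pipeline can be sketched as a relative-frequency estimate, one common way of scoring phrase pairs. The phrase pairs below are invented for illustration and stand in for pairs extracted from a word-aligned corpus.

```python
from collections import Counter


def phrase_probabilities(phrase_pairs):
    """Estimate p(target | source) as count(source, target) / count(source)."""
    pair_counts = Counter(phrase_pairs)
    source_counts = Counter(src for src, _ in phrase_pairs)
    return {
        (src, tgt): count / source_counts[src]
        for (src, tgt), count in pair_counts.items()
    }


# Hypothetical phrase pairs, as if extracted from a word-aligned corpus.
extracted = [
    ("guten tag", "good day"),
    ("guten tag", "good day"),
    ("guten tag", "hello"),
    ("danke", "thank you"),
]
table = phrase_probabilities(extracted)
print(table[("guten tag", "good day")])
```

These probabilities are the kind of model parameter obtained in the training phase; the decoding phase would then search for the translation that scores best under them and the other model components.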
(2) Semantic understanding
Semantic understanding is the use of computer technology to understand a text and answer questions about it. It places particular emphasis on understanding context and on the precision of the answer. Since the release of the MCTest data set, semantic understanding has attracted more attention and developed rapidly, and related data sets and neural network models have appeared one after another. Semantic understanding technology is expected to play an important role in intelligent customer service, automatic product question answering, and related fields, and to further improve the accuracy of question answering and dialogue systems.
In terms of data acquisition, semantic understanding can expand its data resources effectively by constructing data automatically, for example by generating fill-in-the-blank questions automatically. To solve fill-in-the-blank questions, several deep learning methods have been proposed, such as attention-based neural networks. At present, the mainstream approach uses neural networks to model the passage and the question, predicts the start and end positions of the answer, and extracts the corresponding passage span. When answers must be generalized beyond extracted spans, the task becomes considerably harder, and current semantic understanding technology still has much room for improvement.
(3) Question answering system
Question answering systems are divided into open-domain dialogue systems and domain-specific question answering systems. Question answering technology allows computers to communicate with people in natural language, much as people do with one another. People submit questions expressed in natural language, and the system returns highly relevant answers. Although many question answering products already exist, most are used in information service systems and smartphone assistants, and the robustness of question answering systems still poses problems and challenges.
Natural language processing faces four major challenges. First, there is uncertainty at the lexical, syntactic, semantic, pragmatic, and phonetic levels. Second, new vocabulary, terminology, semantics, and grammar make unknown language phenomena unpredictable. Third, insufficient data resources make it difficult to cover complex language phenomena. Fourth, the fuzziness and complicated correlations of semantic knowledge are difficult to describe with simple mathematical models, so semantic computation requires nonlinear computation with a huge number of parameters.