This article reviews the development history of artificial intelligence and points out the current dilemmas of AI basic research as represented by deep learning, including the interpretability of neural network models, the design of network architectures, and small-sample learning. In view of future development trends, it argues that building a collaborative learning system based on statistical-physics thinking may be one route toward artificial general intelligence.
From a scientific point of view, according to the Big Bang theory, the universe as we currently understand it was formed by a great explosion about 13.7 billion years ago. The explosion scattered matter, the space of the universe kept expanding, the density of matter evolved from dense to sparse, and the temperature dropped accordingly. About 300,000 years after the Big Bang, neutral atoms formed. The main component of the universe was gaseous matter, which gradually condensed under self-gravitation into dense gas clouds, out of which all the galaxies, stars, planets, and even life later evolved, becoming the universe we see today.
During the evolution of the universe, the solar system and the Earth formed about 4.6 billion years ago, and about 4 billion years ago life appeared on Earth. Under the law of natural selection and survival of the fittest, living things on Earth gradually evolved from lower to higher forms and from simple to complex over a long period, producing today's multi-species biosphere. In this long history of biological evolution, humans and apes began to diverge about 4.5 million years ago; Ramapithecus later evolved into Australopithecus about 2 million years ago, which further developed into late Homo sapiens (modern-type humans). From about 40,000 to 50,000 years ago, human evolution accelerated markedly, culminating in the emergence of modern humans, who became the most intelligent of creatures, possessing a highly capable brain.
The English word "intelligence" is usually rendered in Chinese as 智能 (zhineng) and sometimes as 智力 (zhili). The meanings of intelligence and wisdom are close but not identical: wisdom can be considered a higher-level concept than intelligence, and zhineng is a general term covering both wisdom (zhi) and ability (neng). Ancient Chinese thinkers generally regarded wisdom and ability as two relatively independent concepts: "wisdom" refers to certain psychological characteristics of cognitive activities, while "ability" refers to certain psychological characteristics of practical activities. Natural intelligence includes human intelligence and the intelligence of other living things; biological intelligence is thus a natural capacity that enables organisms to explore, develop, adapt, and survive in certain environments. Some scholars hold that animals that have perception, memory, and self-awareness and that can communicate are intelligent creatures possessing intelligence. Other scholars define intelligence as wisdom plus ability: the process from sensation and memory to thinking is called "wisdom", and the process by which wisdom produces and expresses behavior and language is called "ability". According to developmental psychologist Howard Gardner's theory of multiple intelligences, human intelligence can be divided into seven categories: Verbal/Linguistic, Logical/Mathematical, Visual/Spatial, Bodily/Kinesthetic, Musical/Rhythmic, Interpersonal/Social, and Intrapersonal/Introspective. In the process of co-evolving with nature, humans initially invented production tools in order to raise productivity and survive. As we know, the content and form of production tools keep evolving and changing with the development of the economy, science, and technology.
With the development of science and technology, human civilization keeps advancing. Today science and technology are the primary productive forces and drive social development; technology has changed how people study, work, and live. From initially fearing nature to attempting to conquer it, human beings have come to realize that they must treat nature well and live in harmony with it. Engels said in his "Speech at the Grave of Karl Marx" (March 17, 1883) that people must first eat, drink, have shelter, and be clothed before they can engage in politics, science, art, religion, and so on. That is, the first premise of all human existence, and therefore of all history, is life itself. To produce the materials that meet the needs of human life, humans have engaged in heavy physical labor since the day they evolved, and the desire to be liberated from manual labor was the initial driving force for the development of science and technology. The history of social evolution shows that the technological revolutions, especially the three modern industrial revolutions, directly changed the mode of production in order to liberate human beings. To be freed from heavy manual labor and achieve highly automated production, the key is to develop artificial intelligence technology and realize unmanned production; only when production is unmanned can human beings be truly liberated.
The term Artificial Intelligence (AI) was proposed by the computer scientist John McCarthy at a conference held at Dartmouth College in 1956. This was later regarded as the official birth of artificial intelligence, and 1956 as its first year. After the Dartmouth meeting formally established the term, researchers began to study AI seriously and systematically from an academic perspective. Soon afterward the first generation of AI scholars and techniques emerged, and artificial intelligence embarked on a path of rapid development.
One of the main goals of early artificial intelligence research was to enable machines to perform complex tasks that usually require human intelligence. But different eras and different people have understood this "complex work" differently. There is currently no precise, generally accepted definition of artificial intelligence; in anthropomorphic terms, the goal is for artificial intelligence to share in and assist with human work. As a discipline, artificial intelligence is a new technological science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. Professor Nilsson of the artificial intelligence research center at Stanford University held that "artificial intelligence is the science of knowledge: how to represent knowledge and how to obtain and use it." Professor Winston of the Massachusetts Institute of Technology held that "artificial intelligence is the study of how to make computers do intelligent work that in the past only humans could do." In short, artificial intelligence studies the laws of human intelligent activity, constructs artificial systems with a certain degree of intelligence, and investigates how to make computers perform tasks that previously required human intelligence, that is, how to use computer hardware and software to simulate the basic theories, methods, and techniques of human intelligent behavior.
In the history of artificial intelligence, scholars from different disciplines and backgrounds have formed their own understandings of it and put forward different viewpoints, giving rise to different academic schools. Three schools have had the greatest impact on AI research: symbolism, connectionism, and behaviorism:
(1) Symbolism, also known as logicism, the psychological school, or the computer school, whose principles are mainly the physical symbol system hypothesis (i.e., the symbolic operating system) and the principle of bounded rationality.
(2) Connectionism, also known as the bionic school or the physiological school, whose main principles are neural networks and the connection mechanisms and learning algorithms among them.
(3) Behaviorism, also known as evolutionism or the cybernetics school, whose principles are cybernetics and perception-action control systems.
The connectionist school, one of the three, holds that the realization of artificial intelligence should proceed from bionics, especially the study of models of the human brain. Its representative early achievement was the brain model created by the physiologist McCulloch and the mathematical logician Pitts in 1943, the MP model, which opened a new way of imitating the structure and function of the human brain with electronic devices. Starting from neurons and proceeding to neural network models and brain models, this approach became a major mainstream of artificial intelligence research. In the 1960s and 1970s, the representative result of the connectionist school was the perceptron proposed by Rosenblatt, which sparked an upsurge of research on brain models. However, Minsky and Papert pointed out that the single-layer perceptron cannot realize even simple XOR (exclusive-or) logic and fails to recognize the simplest patterns, which sent neural network research into a low ebb from the late 1970s to the early 1980s. In 1982 and 1984, Hopfield published two important papers proposing to simulate neural networks with hardware, and connectionism re-emerged [4,5]. In 1974, Paul Werbos had already proposed the back-propagation (BP) algorithm for neural networks in his doctoral thesis, the first major turning point in the development of neural networks. But the rapid rise to fame of BP owed much to the 1986 Nature paper by Rumelhart and colleagues and the related chapter "Learning Internal Representations by Error Propagation" in the book Parallel Distributed Processing. (BP is a gradient-descent algorithm. Its principle is simple: it uses the chain rule to compute the gradient of the error function, thereby solving the weight-optimization problem of a multilayer neural network.) From that year on, the momentum of connectionism grew vigorous.
An upsurge in neural network research swept the world, from models to algorithms and from theoretical analysis to engineering realization, laying the foundation for neural network computers to enter the market.
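The two historical threads above, the single-layer perceptron's failure on XOR and BP's chain-rule gradient descent, can be illustrated together in a short sketch. This is an illustrative toy of our own, not any specific historical implementation: a small two-layer network trained with hand-coded back-propagation learns the XOR function that defeats a single-layer perceptron.

```python
import numpy as np

# XOR: the pattern Minsky and Papert showed a single-layer perceptron
# cannot represent. A network with one hidden layer, trained by BP
# (chain-rule gradients + gradient descent), learns it easily.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(20000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: the chain rule gives the error gradient layer by layer
    d_out = (out - y) * out * (1 - out)      # d(squared error)/d(pre-activation)
    d_h = (d_out @ W2.T) * (1 - h ** 2)      # propagated through tanh
    # gradient-descent weight updates
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

print(np.round(out.ravel(), 2))  # after training, outputs approach 0, 1, 1, 0
```

Thresholding the trained outputs at 0.5 reproduces the XOR truth table, which the perceptron's single linear boundary cannot do.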
On June 21-24, 1987, the first international neural networks conference (the 1987 IEEE First Annual International Conference on Neural Networks) was held in San Diego, USA. The slogan "Artificial intelligence is dead, long live neural networks" was even put forward at the conference, which shows how popular neural networks had become.
The slogan at that time was "Join hands to explore intelligence, form alliances to tackle major problems", and researchers were full of hope that neural network research would reach the world level. In 1992, the IEEE Neural Network Council and the International Neural Network Society jointly organized the International Joint Conference on Neural Networks (IJCNN).
Thanks to Werbos's BP algorithm and the Nature paper of Rumelhart, Hinton, and colleagues, neural network research revived, and an international academic organization independent of the Association for the Advancement of Artificial Intelligence (AAAI), the International Neural Network Society (INNS), was established. The IEEE Neural Network Council was later (in 2001) renamed the IEEE Neural Networks Society.
But the greater the hope, the greater the disappointment. Owing to the limitations of the theoretical models, biological prototypes, and technical conditions of the time, and with the failure of Japan's fifth-generation computer project, artificial intelligence research centered on neural networks entered its second low ebb. The IEEE Neural Networks Society was finally renamed today's IEEE Computational Intelligence Society in 2003.
Computational Intelligence (CI) is a new stage in the development of artificial intelligence: a collective term for a class of methods for solving complex problems inspired by the wisdom of nature and of humans. According to Wikipedia, although artificial intelligence and computational intelligence pursue similar long-term goals, namely general intelligence (AGI: the intelligence of a machine that could perform any intellectual task a human can), there are still obvious differences between traditional artificial intelligence and computational intelligence. By Bezdek's definition (1994), computational intelligence is a subset of artificial intelligence. Artificial intelligence is sometimes also called machine intelligence, which includes two types, artificial machine intelligence based on hard-computing techniques and machine intelligence based on soft-computing methods, both of which can adapt to a variety of situations.
It is generally believed that computational intelligence grew out of the relatively mature development of its three main branches, artificial neural networks, evolutionary computation, and fuzzy systems, and that the new scientific methods formed by their organic integration represent a new stage in the development of intelligence theory and technology. Computational intelligence extends traditional computational models and intelligence theories, including learning theory and probabilistic methods. Complex systems in engineering that cannot be accurately described by mathematical models can also be modeled and solved by computational-intelligence algorithms.
We believe that the three schools of artificial intelligence research, symbolism, connectionism, and behaviorism, are all embodied to some extent in computational intelligence. For example, fuzzy logic systems are based on many-valued logic: imitating the human brain's judgments about uncertain concepts and its reasoning, they use fuzzy sets and fuzzy rules to study fuzzy thinking, linguistic forms, and their laws; this represents an extension and development of the symbolist school. Evolutionary Computation (EC) draws its inspiration from the laws of nature (the biological world) and designs algorithms accordingly to solve problems; its goal is to simulate the process of natural evolution, and its main concept is "natural selection, survival of the fittest", so swarm intelligence is also classified under evolutionary computation. Swarm intelligence itself comes from observing insect colonies in nature: the macroscopic intelligent behavior that social creatures exhibit through cooperation is called swarm intelligence. The behaviorist school holds that artificial intelligence derives from cybernetics; early work in cybernetics focused on simulating intelligent human behavior in control processes, such as the study of self-optimizing, self-adaptive, self-stabilizing, self-organizing, and self-learning control systems, which is why behaviorism is also called evolutionism. At present, the deep neural network (deep learning), the most vigorous and popular direction, is an extension of the connectionist school.
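The "survival of the fittest" idea behind evolutionary computation can be sketched in a few lines. The following is a toy example of our own (not any specific published algorithm): a genetic algorithm with tournament selection, one-point crossover, and point mutation evolving bit strings toward the all-ones optimum, the classic OneMax problem.

```python
import random

# OneMax: fitness = number of 1 bits; the optimum is the all-ones string.
# "Survival of the fittest" is implemented by tournament selection.
random.seed(1)
N_BITS, POP, GENS = 20, 30, 60

def fitness(ind):
    return sum(ind)            # count of 1 bits

def tournament(pop):
    a, b = random.sample(pop, 2)
    return max(a, b, key=fitness)   # the fitter of two random individuals

pop = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP)]
for _ in range(GENS):
    nxt = []
    while len(nxt) < POP:
        p1, p2 = tournament(pop), tournament(pop)
        cut = random.randrange(1, N_BITS)              # one-point crossover
        child = p1[:cut] + p2[cut:]
        # point mutation: flip each bit with probability 1/N_BITS
        child = [b ^ (random.random() < 1.0 / N_BITS) for b in child]
        nxt.append(child)
    pop = nxt

best = max(pop, key=fitness)
print(fitness(best))  # close to the optimum of 20
```

Selection pressure from the tournaments steadily concentrates 1 bits in the population, so within a few dozen generations the best individual is at or near the optimum.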
In 2006, Hinton et al. published a paper in Science (Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504-507) that put forward two points: (1) a multilayer artificial neural network (MLP) has strong feature-representation ability, and the features learned by a deep network model give a more essential representation of the original data, which greatly helps classification and visualization; (2) the difficulty of training a deep neural network to an optimum can be overcome by layer-by-layer pretraining followed by fine-tuning. Hinton et al. proposed the Deep Belief Net (DBN) [Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets. Neural Computation, 2006, 18(7): 1527-1554], which is composed of a stack of restricted Boltzmann machines and whose network structure is essentially the same as an MLP, together with an unsupervised greedy layer-by-layer training algorithm that improves on the classic BP algorithm. This broke through the bottleneck in the early development of multilayer neural networks, and applications made breakthrough progress. Enthusiasm for neural networks was reignited, and the academic community turned to in-depth research on deep learning. In particular, in 2012 Hinton and his student Alex Krizhevsky designed AlexNet, based on the Convolutional Neural Network (CNN) and the powerful parallel computing of GPUs; in the ImageNet competition, which represented the forefront of computer image recognition, it achieved a top-5 test error rate of 15.3%, far below the runner-up's 26.2%, and won the championship. From that year on, ever more and ever deeper neural network architectures were proposed.
In 2015, the representative scholars of deep learning, LeCun, Bengio, and Hinton, jointly published a review of deep learning in Nature (Yann LeCun, Yoshua Bengio & Geoffrey Hinton. Deep learning. Nature, 521, 436-444, 28 May 2015). Neural networks had returned in force under the name of deep learning. Thanks to the explosive growth of data in recent years, the substantial increase in computing power, and the maturation of deep learning algorithms, we have entered the third wave of development since the concept of artificial intelligence emerged.
In 2016 there were two events that flooded the public's screens. One was the world-famous man-machine Go match between Google DeepMind's AlphaGo and Lee Sedol; AlphaGo's victory over Lee Sedol led the public to begin to know and understand artificial intelligence. (The other screen-flooding event was the first human detection of gravitational waves. Guo Ping introduced the combination of artificial intelligence and gravitational-wave research in a ScienceNet blog post [Guo Ping, "Artificial Intelligence in Gravitational Wave Data Analysis Technology", ScienceNet blog, http://blog.sciencenet.cn/blog-103425-958317.html].)
With AlphaGo's breakthrough in Go in 2016, artificial intelligence received widespread attention and became familiar to everyone. For the general public, what may have come into view earlier were AI science-fiction films. One is A.I. Artificial Intelligence, a futuristic science-fiction film directed by Steven Spielberg and released by Warner Bros. Pictures in 2001. Later, another film, Ex Machina, released in 2015, introduced the public to the famous "Turing test" proposed by Turing to judge whether a machine possesses human intelligence.
Artificial intelligence has developed into a very broad field composed of different disciplines; it can be said to touch almost all branches of the natural and social sciences, and its scope has far exceeded that of computer science. The mainstream of artificial intelligence research today is led by deep learning. Based on big data and powerful computing, deep learning algorithms have made breakthroughs in a series of fields such as computer vision, speech recognition, and natural language processing.
Yet deep learning is one research direction within machine learning, and machine learning is one branch of artificial intelligence. Michael Jordan, an expert in computer science and statistical learning, holds that most of what is called "AI" today, especially in the public domain, is really "machine learning" (ML). Over the past few decades, ML has been a field of algorithms that combines ideas from statistics, computer science, and many other disciplines to design algorithms that process data, make predictions, and support decisions. He has said that it would be more appropriate to call it IA, augmented intelligence. He has also pointed out that the classic AI problems will still deserve attention in the future.
However, experts and scholars of different backgrounds hold different opinions. Because the early connectionist school grew out of the brain's interconnected networks, experts in neuroscience and cognitive science, including some computer scientists, believe brain-inspired computing is important for the further development of artificial intelligence. On January 28, 2018, Professor Tomaso Poggio of the MIT Computer Science & Artificial Intelligence Laboratory said in his speech at MIT Technology Review's EmTech China Global Emerging Technology Summit: "Deep learning is a bit like the alchemy of our time, but it needs to be transformed from alchemy into real chemistry. First consider machine learning algorithms: the first is deep learning, the second is reinforcement learning. Both come from cognitive science and neuroscience." "Deep learning can help us solve 10% of the problem. What about the remaining 90%? My answer is: we probably also need research from neuroscience and cognitive science; we need to understand the human mind and brain better." The Center for Brains, Minds and Machines (CBMM) seeks to solve this problem through three paths: 1. computer science + machine learning; 2. neuroscience; 3. cognitive science, with computer science + machine learning ranked first.
Academician Chen Lin, a cognitive scientist in China, believes that the core basic scientific problem of the new generation of artificial intelligence is the relationship between cognition and computation (Chinese Academic Conference on Cognitive Computing and Hybrid Intelligence (CCHI 2018); Chen Lin. The core basic scientific issue of the new generation of artificial intelligence: the relationship between cognition and computing. Bulletin of the Chinese Academy of Sciences, 2018, 33(10): 1104-1106). "If deep learning was inspired by the hierarchical structure of the nervous system, then whole-brain imaging that starts from specific cells and spans brain regions will provide deeper and richer inspiration for new-generation intelligent architectures." "Basic research on human intelligence should emphasize the system, the whole, and behavior; it should focus chiefly on humans, supplemented by animals, and mainly on the macro level, combined with the micro." "Basic research on intelligence should particularly support experimental research in cognitive science, and attention should be paid to interdisciplinary research grounded in cognitive-science experiments."
The computer scientist Academician Li Guojie recently said at a forum: "Cognitive science is essentially an experimental science. The basic unit of cognition is not a computational symbol, not a bit, but a whole 'chunk'. We should hold sufficient awe for the extremely sophisticated human brain, formed by billions of years of evolution. Cracking the mystery of the human brain may take a hundred years or longer; it is not something our generation can solve. Turing believed that 'whether a machine is intelligent' is not a scientific question, because 'intelligence' is not clearly defined. Since the birth of computer science, artificial intelligence and computer science have been essentially one science. So far there has been no problem with artificial intelligence relying on computing technology. In recent years, the development of artificial intelligence has mainly benefited from extremely rich data resources and rapidly increasing computing power; there has been no substantial breakthrough in artificial intelligence technology itself. Therefore it can be said that the renaissance of artificial intelligence is mainly a victory of computing technology, a victory of Moore's Law!"
In the book Deep Learning, co-authored by three experts in the field, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, and translated into Chinese by Zhang Zhihua and others, the view on the relationship between deep learning and brain science or neuroscience is: "The role of neuroscience in deep learning research has been weakened today, mainly because we simply do not have enough information about the brain to use it as a guide. To understand deeply the algorithms the brain actually uses, we would need to monitor the activity of (at least) thousands of connected neurons simultaneously. We cannot do this, so we are far from understanding even the simplest and best-studied parts of the brain." Professor Zhang Zhihua of Peking University commented [Note 1]: "It is worth noting that some experts in China are keen to advocate the intersection of artificial intelligence with brain science or cognitive science, and to push the country to invest large resources in so-called 'brain-inspired intelligence' and related fields. Regardless of whether China has scholars proficient in both artificial intelligence and brain or cognitive science, we should at least take a pragmatic and rational attitude toward interdisciplinary fields. Only then is it possible for us to accomplish something in this wave of AI development rather than becoming onlookers once again." He also pointed out that "mathematical models, computational methods, and application drivers are the means by which we can study artificial intelligence."
As understanding deepens, the development of artificial intelligence needs to be rethought. Recently Takashi Jozaki, a data scientist on Google's business insight team, argued that the significance of the Nature paper published by Hinton and colleagues in 1986 lies not only in proposing back-propagation but also in marking "a major turning point at which neural networks separated from psychology and physiology and moved into the field of machine learning." Hinton et al. also noted in their 2006 paper: "Such a learning method does not seem to be a plausible model of learning in the brain. However, applying it to a variety of tasks has shown that gradient descent in weight space can construct very interesting internal representations. This suggests it is worth looking for more physiologically plausible ways of performing gradient descent in neural networks." We can see that although neural network research originated in the modeling of biological brains, changes in neuroscientific views caused it gradually to diverge from brain models. To distinguish them from the neural networks of biology and neuroscience, they are also called artificial neural networks (ANNs). An ANN, or connectionist system, is a computing system inspired by the biological neural networks that constitute the brain. The original purpose of the ANN was to solve problems the way the human brain does, but over time attention shifted to performing specific tasks, leading to a departure from neuroscience. Nevertheless, brain-inspired neural networks (and their integrated applications) still hold great application potential. Even though imitating the human brain is no longer the goal, neural networks continue to use the word "neural"; they have slowly moved away from their connectionist origins and become the acknowledged king of machine learning.
Discussing current trends and strategies in brain science, the neurobiologist Academician Yang Xiongli believes artificial intelligence can be achieved in two ways. The first has nothing to do with the working principles of the brain: it ignores the brain's mechanisms and designs purely from the standpoint of computing science. The second takes inspiration from the working principles of the brain, using the brain's way of processing information to advance artificial intelligence research, that is, brain-inspired artificial intelligence. These are two different paths, but they may well reach the same destination by different routes; as long as artificial intelligence can be realized, both are worth encouraging. At present research of the former kind is rather more popular; the latter is more difficult, but its significance is more profound.
From the viewpoints listed above we can see that in today's artificial intelligence, many schools of thought each have their own say. As the saying goes, where one sits determines how one thinks: people of different disciplinary backgrounds naturally interpret artificial intelligence differently. We believe that, so long as it is not a matter of fundamental right and wrong, all kinds of viewpoints may be expressed in academic research, though unscientific fallacies should be firmly opposed. The ancients said that time is like a river whose great waves wash away the sand. Under the wave of artificial intelligence development, what remains after the washing is gold: the science and technology that can promote the healthy development of human society. This, too, accords with the natural law of survival of the fittest.
Although the current hot topic of artificial intelligence research is deep learning (machine learning) separated from psychology and physiology, AI is not just deep learning; its research splits into many academic factions. Carlos E. Perez wrote an article on Medium.com dividing artificial intelligence research into 17 "tribes", giving each tribe a name and designing a logo for it. Perez further divides deep learning into several sub-camps, including the Canadian Conspirators, the Swiss Posse, the British AlphaGoists, and the Predictive Learners. From this it can be seen that deep learning itself integrates a variety of research approaches.
As time passes and research deepens, deep learning has hit bottlenecks and artificial intelligence theory has stagnated. Gary Marcus, professor of psychology at New York University, poured cold water on overheated deep learning, enumerating its problems in the following aspects:
1) Deep learning requires a lot of data; where available data are limited, it is often not the best choice. 2) The knowledge it learns is shallow and hard to transfer. 3) It has difficulty handling hierarchical structure. 4) It cannot help with open-ended reasoning. 5) It is still not transparent enough. 6) It is far from being closely integrated with prior knowledge. 7) It cannot distinguish causation from correlation. 8) Its implicit assumption of a stable environment can be problematic. 9) Its current results are only approximations and cannot be fully trusted. 10) It is still difficult to engineer reliably.
Having pointed out these problems, Gary Marcus also affirmed the progress made so far: “It is true that deep learning has excellently solved many difficult problems in fields such as computer vision, reinforcement learning, and NLP, but while we embrace deep learning enthusiastically, we should also realize that it cannot solve all problems. Its superior feature-extraction and nonlinear abstraction capabilities are far from sufficient to form a general-purpose intelligent infrastructure.” At the same time, he hopes that techniques and methods from many perspectives can advance side by side and work together to build the artificial intelligence of human ideals.
The Book of Why: The New Science of Cause and Effect, a new book by Judea Pearl, professor of computer science and Turing Award winner, has triggered discussion about the future of artificial intelligence and about whether deep learning can lead to anything close to general human intelligence. Pearl has elaborated on the views in his book and on the current state of artificial intelligence, arguing that its present inability to perform causal reasoning is a serious flaw. He believes that “deep learning is a very versatile and powerful curve-fitting technique. It can identify previously hidden patterns, infer trends, and predict the results of various problems. One risk of fitting a given data set is overfitting, that is, the algorithm stops being able to recognize normal fluctuations in the data and ends up being confused by noise.” “Unless algorithms and the machines they control can reason about cause and effect, or at least conceptualize the difference, their utility and versatility will never come close to that of humans.”
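Pearl’s remark about curve fitting and overfitting can be illustrated with a minimal sketch (the toy data and polynomial degrees are our own choices, not Pearl’s): a high-capacity fit threads every noisy training point yet captures the underlying law worse than a modest one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a simple underlying law (y = sin x) observed with noise.
x_train = np.linspace(0, 3, 10)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(10)
x_test = np.linspace(0, 3, 100)   # same range, noise-free ground truth
y_test = np.sin(x_test)

def fit_and_eval(degree):
    """Least-squares polynomial fit; returns (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

low = fit_and_eval(3)   # modest capacity: small residual on both sets
high = fit_and_eval(9)  # enough capacity to thread every noisy point
```

With only 10 training points, the degree-9 polynomial drives the training error toward zero by fitting the noise itself, which is exactly the failure mode Pearl describes.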
On August 11, 2018, at the World Science and Technology Innovation Forum co-sponsored by Houyi Holdings and Caijing Magazine under the theme “Sharing Global Wisdom, Leading Future Technology”, Thomas J. Sargent, winner of the 2011 Nobel Prize in Economics, said that “artificial intelligence is actually statistics. It just uses very gorgeous rhetoric; it is in fact statistics. Many of the formulas are very old, but all of this intelligence uses statistics to solve problems.” He also pointed out: “In many applied sciences, such as engineering, physics, and economics, we build models to simulate how the world operates. Our purpose is to explain the phenomena we observe in the world, and our key tool is to take a model, put it in a computer, and simulate. We take the simulated data and use mathematical methods to fine-tune the parameters, hoping to come as close to reality as possible. In this process, we play the role of God.”
Regardless of whether Sargent’s view is correct, and of whether it can be accepted by mainstream artificial intelligence researchers, we should recognize the limitations of the artificial intelligence represented by deep learning and consider how it should develop in the future.
Recently (January 25, 2019), Karen Hao, an AI reporter at MIT Technology Review, analyzed the evolution of deep learning research using arXiv, one of the largest open repositories of scientific papers. Hao downloaded the abstracts of the 16,625 papers in the “Artificial Intelligence” section available as of November 18, 2018, and tracked the vocabulary mentioned over the years in order to understand what stage deep learning has reached and to gain insight into the next direction of AI. The analysis, covering 25 years of AI research, suggests that the era of deep learning is coming to an end.
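The counting at the heart of such a study, tracking how often chosen terms appear in abstracts year by year, can be sketched as follows; the paper data below is a hypothetical stand-in, not Hao’s actual arXiv corpus.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in data: (year, abstract) pairs. The real study
# used the abstracts of 16,625 arXiv "Artificial Intelligence" papers.
papers = [
    (2014, "we train a deep neural network for image recognition"),
    (2016, "deep reinforcement learning for game playing"),
    (2018, "generative adversarial networks with reinforcement learning"),
    (2018, "a survey of deep learning and its limits"),
]

def term_frequency_by_year(papers, terms):
    """Count mentions of each tracked term in each year's abstracts."""
    counts = defaultdict(Counter)
    for year, abstract in papers:
        text = abstract.lower()
        for term in terms:
            counts[year][term] += text.count(term)
    return counts

trend = term_frequency_by_year(papers, ["deep", "reinforcement"])
```

Plotting such per-year counts over a real corpus is what reveals the rise and plateau of a term like “deep”.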
Pedro Domingos, a professor of computer science at the University of Washington and author of The Master Algorithm, believes that the sudden rise and fall of different techniques has long been a characteristic of artificial intelligence research, with fierce competition between different views in every decade. Then, every once in a while, a new technique emerges and researchers flock to study it. “In other words, every decade is essentially the reign of one technique: neural networks ruled the 1950s and 1960s, symbolic methods conquered the 1970s, knowledge-based systems peaked in the 1980s, Bayesian networks led the 1990s, support vector machines broke out in the 2000s, and in the 2010s we returned to neural networks.” “The 2020s will be no exception,” which means that the era of deep learning may soon be over. As for what comes next, however, two completely different possibilities lie before us: will an old technique regain favor, or will the AI field usher in a new paradigm? Pedro Domingos did not give an answer, but from China’s development plan for a new generation of artificial intelligence we can infer that multi-disciplinary, multi-directional integration is the development trend of artificial intelligence for the next 10 years.
Regarding the future development of artificial intelligence, Liu Tieyan of the Machine Learning Group at Microsoft Research Asia and others believe that the research hotspots of machine learning in the next ten years include explainable machine learning; lightweight machine learning and edge computing; quantum machine learning; the description of seemingly complex natural phenomena by simple, elegant mathematical laws, such as partial differential equations; and social machine learning.
Most experts and scholars believe that the development of a new generation of artificial intelligence should draw on the mechanisms of cognitive neuroscience and use the mathematical tools of machine learning to build a basic theoretical system for artificial intelligence. Machine learning takes probability and statistics as its mathematical tools, so it is not surprising that some scholars want to work within the framework of probability and statistics (for example, the information bottleneck). Physics, the leading discipline of the natural sciences, studies the most general laws of the motion of matter and the basic structure of matter; adopting the framework of physical methods may be one way to move toward a unified theory. However, the no-free-lunch theorem of David H. Wolpert and William G. Macready shows that no single algorithm can solve all machine learning applications, so specific methods must be developed for specific problems. For example, in order to overcome the shortcomings of convolutional neural networks, Professor Hinton, one of the pioneers of deep learning, recently proposed the capsule network. A capsule network is composed of capsules rather than neurons; a capsule is a small group of neurons, equivalent to a functional module, which in image processing can learn to detect a specific object (pattern) in a certain region of an image (Sabour, Sara; Frosst, Nicholas; Hinton, Geoffrey E. (2017-10-26). “Dynamic Routing Between Capsules”).
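One concrete ingredient of the cited paper is the “squash” nonlinearity, which rescales a capsule’s output vector so that its length falls in [0, 1) and can be read as the probability that the entity the capsule detects is present, while its direction encodes the entity’s pose. A minimal NumPy sketch (the toy input vectors are our own):

```python
import numpy as np

def squash(s, eps=1e-8):
    """Capsule squash from Sabour et al. (2017):
    v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    Preserves the vector's direction, compresses its length into [0, 1)."""
    sq_norm = np.sum(s ** 2)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

long_vec = squash(np.array([10.0, 0.0]))   # length close to 1: "present"
short_vec = squash(np.array([0.1, 0.0]))   # length close to 0: "absent"
```

The full model routes these vector outputs between capsule layers by dynamic agreement, which is where it departs most from ordinary convolutional networks.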
Yann LeCun, one of the “troika” of deep learning, gave the speech “Learning World Models: the Next Step towards AI” at the opening ceremony of IJCAI-2018. LeCun stated that the future of the artificial intelligence revolution will be neither supervised learning nor pure reinforcement learning, but world models with common-sense reasoning and predictive capability. Intuitively, a world model is a model that has general background knowledge about how the world works, can predict the consequences of actions, and is capable of long-term planning and reasoning. LeCun summarized three learning paradigms, namely reinforcement learning, supervised learning, and self-supervised learning, and argued that self-supervised learning (previously called predictive learning) is a promising research direction for realizing world models. At the end of the speech, he noted how technology and science drive and promote each other, as with the telescope and optics, the steam engine and thermodynamics, and the computer and computer science, and raised several questions: 1) What is the equivalent of a “thermodynamics” of intelligence? 2) Are there underlying principles behind artificial and natural intelligence? 3) Are there simple guiding principles behind learning? 4) Is the brain merely a large collection of “hacks” produced by evolution?
On November 7, 2018, Yoshua Bengio was invited to Beijing to participate in the 20th International Symposium on “Computing in the 21st Century”; at the meeting and afterwards at Tsinghua University, he gave a speech entitled “Challenges for Deep Learning towards Human-Level AI”. Based on his 2017 arXiv paper “The consciousness prior”, Bengio reiterated the concept of disentanglement that he and Yann LeCun put forward years ago: learning high-dimensional representations that describe the entire world (the unconscious state), low-dimensional features used for reasoning (the conscious state), and attention mechanisms that map from the high-dimensional to the low-dimensional space. This is precisely the challenge deep learning faces on the way to human-level AI. Human cognitive tasks can be divided into System 1 and System 2: System 1 handles fast perception, while System 2’s tasks are the opposite, slow, deliberate, conscious behavior, that is, algorithms. Bengio believes that research on consciousness is gradually entering the mainstream, and calls “consciousness” a prior, because consciousness is a constraint, a regularization term, a hypothesis that lets us make a great many predictions with very few variables. But “specifically, our learning theory is still lacking here. Current learning theory assumes that the test distribution is the same as the training distribution, but this assumption does not hold. A system that works on the training set may perform poorly in the real world, because the test distribution differs from the training distribution. So I think we should create a new learning theory, one not based on the hard assumption that the test distribution is identical to the training distribution.
We can borrow the physicist’s approach and assume that the underlying causal mechanisms of the training distribution and the test distribution are the same; even if the initial conditions of a dynamical system differ, the underlying physical mechanism does not change. So how should we proceed? In fact, building a full world model is daunting; we do not have enough computing power to model the real world, so I think a more reasonable way is to use machine learning. Machine learning research is not the study of what knowledge an AI should have, but the study of devising excellent learning algorithms; a good machine learning algorithm should work well under any distribution.”
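Bengio’s observation that a model tuned to the training distribution can fail when only the input distribution shifts, while the underlying mechanism stays fixed, can be reproduced in a toy sketch (the mechanism y = x² and the linear model are our own illustration, not Bengio’s example):

```python
import numpy as np

rng = np.random.default_rng(1)

# The underlying mechanism y = x**2 never changes; only the input
# distribution does -- the kind of shift Bengio describes.
mechanism = lambda x: x ** 2
x_train = rng.uniform(0, 1, 200)   # training inputs
x_test = rng.uniform(2, 3, 200)    # shifted test inputs

# A straight line fitted by least squares on the training range.
a, b = np.polyfit(x_train, mechanism(x_train), 1)
train_mse = np.mean((a * x_train + b - mechanism(x_train)) ** 2)
test_mse = np.mean((a * x_test + b - mechanism(x_test)) ** 2)
```

The fit is excellent on the training range and badly wrong on the shifted range; a learner that recovered the invariant mechanism itself, rather than a local curve, would transfer.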
Recently, M. Mitchell Waldrop published a feature article in the Proceedings of the National Academy of Sciences entitled “News Feature: What are the limits of deep learning?” (PNAS, 2019-01-22, DOI: 10.1073/pnas.1821594116). In it, Waldrop briefly recounts the history of deep learning, arguing that the explosion of computing power is what enabled today’s flourishing of artificial intelligence. However, given the many limitations of deep learning, including vulnerability to adversarial attacks, low learning efficiency, unstable applications, and lack of common sense and interpretability, the question raised by Hinton still stands: “What is deep learning missing?” From the perspective of computability, more and more people in artificial intelligence research believe that remedying the shortcomings of deep learning will require some fundamentally new ideas. Waldrop therefore lists several lines of work he considers to embody new ideas, one of which is the Generative Query Network (GQN) of the DeepMind team.
The GQN model consists of two different networks: a representation network and a generation network. The representation network takes the agent’s observations as input and produces a representation (a vector) that describes the underlying scene; the generation network then predicts (imagines) the scene from a previously unobserved viewpoint. GQN builds on a large body of recent research in multi-view geometry, generative modeling, unsupervised learning, and predictive learning, and demonstrates a new way of learning compact, intuitive representations of physical scenes. In essence, GQN does not train one large network but makes two networks work together.
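This division of labor can be caricatured in a few lines (the dimensions, random weights, and sum aggregator are our simplifications; the real model uses deep convolutional and recurrent networks trained end to end):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions -- not the DeepMind architecture itself.
OBS_DIM, VIEW_DIM, REP_DIM = 16, 3, 8

# "Representation network": encode each (observation, viewpoint) pair
# and sum the encodings into one permutation-invariant scene vector.
W_enc = rng.standard_normal((OBS_DIM + VIEW_DIM, REP_DIM)) * 0.1

def represent(observations, viewpoints):
    pairs = np.concatenate([observations, viewpoints], axis=1)
    return np.tanh(pairs @ W_enc).sum(axis=0)   # order-independent

# "Generation network": predict the view from a *new* viewpoint,
# conditioned only on the scene representation and the query viewpoint.
W_gen = rng.standard_normal((REP_DIM + VIEW_DIM, OBS_DIM)) * 0.1

def generate(scene_rep, query_viewpoint):
    h = np.concatenate([scene_rep, query_viewpoint])
    return np.tanh(h @ W_gen)

obs = rng.standard_normal((3, OBS_DIM))     # 3 observed views of a scene
views = rng.standard_normal((3, VIEW_DIM))
r = represent(obs, views)
predicted = generate(r, rng.standard_normal(VIEW_DIM))
```

The point of the sketch is the interface: the generator never sees the raw observations, only the compact scene representation, which is what forces that representation to capture the scene.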
Waldrop’s final conclusion is that deep learning alone is not the way to achieve AI, and he believes that graph networks may lead the future development of the field. A graph network is a neural network that takes a graph as input (rather than raw pixels or a one-dimensional waveform) and then learns to reason about and predict how objects and their relationships evolve over time. The graph-network approach has been shown to achieve rapid learning and human-level capability in a range of applications, including complex video games. In addition, graph networks can make a system less susceptible to adversarial attacks, for a simple reason: a system that represents things as objects rather than as pixel patterns is not easily misled by a little noise or an irrelevant sticker.
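The object-centric computation described above can be sketched as one step of message passing over a small graph (the graph and weights here are arbitrary placeholders, not a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4
nodes = rng.standard_normal((5, DIM))              # 5 object states
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # who talks to whom

W_msg = rng.standard_normal((DIM, DIM)) * 0.1
W_upd = rng.standard_normal((2 * DIM, DIM)) * 0.1

def message_passing_step(nodes):
    # 1) every edge (s, t) carries a message from node s to node t
    incoming = np.zeros_like(nodes)
    for s, t in edges:
        incoming[t] += np.tanh(nodes[s] @ W_msg)
    # 2) every node updates from (its own state, aggregated messages)
    return np.tanh(np.concatenate([nodes, incoming], axis=1) @ W_upd)

nodes = message_passing_step(nodes)
```

Because the state lives on objects and relations rather than on pixels, perturbing a few input pixels has no direct handle on the computation, which is the robustness argument Waldrop relays.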
In summary, from these existing theories and methods we can see that although each has problems of one kind or another, they are all moving in promising directions. At present some of these theories and methods contradict one another, while others can be used in combination. As mentioned above, can current artificial intelligence research form a new unified theory whose goal is to build an achievable world model? What, then, should this unified theory be? Some scholars believe that to better describe neural networks and the nervous system we need a new mathematical language and framework, which amounts to asking new questions. Where is this new framework? At present there is no unified thinking or consensus in academia, but we have already seen theorists use statistical physics to study the complexity of neural networks. Statistical mechanics (thermodynamics and statistical physics), which began at the end of the nineteenth century, has developed into a relatively mature discipline, and there are many similarities between its problems and those of theoretical neuroscience: both disciplines study how complex macroscopic behavior arises from microscopic structures and properties. Indeed, one of the 17 schools of artificial intelligence research is the complexity theorists. According to Carlos E. Perez, “Complexity Theorists: people of this school use energy-based models from physics, complexity theory, chaos theory, and statistical mechanics. Swarm AI can be said to belong to this school. If any team says they can find a good explanation of why deep learning works, then they are probably of this school.” Before a unified world model and research program emerges, we can experiment in multiple directions, and research grounded in complexity science is one route worth exploring.
So how can the limitations of deep learning be overcome? We believe that the future lies neither in the revival of old techniques alone nor in the arrival of an entirely new paradigm; the most likely route is to develop a new paradigm on the basis of the old techniques. History evolves in a spiral, and our level of understanding improves along with it. Newton said that he stood on the shoulders of giants; any discipline with a history builds its mansion on the foundations laid by earlier studies, not as a castle in the air.
As mentioned earlier, we believe that computational intelligence is a new stage in the development of artificial intelligence: intelligence inspired by nature. The ideas of computational intelligence come from the phenomena and laws of physics, chemistry, mathematics, biology, psychology, physiology, neuroscience, and computer science. The three schools of artificial intelligence have formed an organic whole; a system integrating multiple disciplines and technologies can achieve complementary advantages, will be more effective than any single discipline or technique, and can achieve greater results. We therefore propose that, in developing a new generation of artificial intelligence based on the mechanisms of cognitive neuroscience and the mathematical tools of machine learning, we should develop a collaborative learning system grounded in computational intelligence.
The core of physics is to discover and explain physical phenomena, the structure of matter, interactions, and the laws of motion; the core of artificial intelligence is to create intelligence. To develop the basic theory of artificial intelligence, researchers should be inclusive of all disciplines, setting disputes aside and developing together.
Academician Zhang Bo of Tsinghua University has said that neural networks are still evolving, and the key is how to choose the right framework and training: we must put perception and cognition in the same space, and this is not a simple matter of probability and statistics. The road toward AI that we have traveled so far is not long; we are still near the starting point. But artificial intelligence is always on the road, everyone must be mentally prepared for that, and this is precisely its charm.
Indeed, artificial intelligence is always on the road, meaning that a long evolution is needed before we can approach AGI. The AGI referred to here follows the definition: “Artificial general intelligence is artificial intelligence that possesses intelligence equal to or surpassing that of humans and can express all the intelligent behaviors of a normal human.” Our long-term goal, or dream, is a collaborative learning system that finally realizes general artificial intelligence through long-term co-evolution. “We are all trying to run; we are all dreamers.” AGI is the future of that dream, not the present; it is a goal that can be reached only after a long period of evolution. How long? Perhaps as long as the voyage of the wandering Earth, ten thousand years. But “ten thousand years is too long; seize the day, seize the hour”, so we must work hard to accelerate the process of evolution. It is still too early, however, to consider how to implement AGI in the near term. As Bengio said, building a world model is daunting, and we do not have enough computing power to model the real world. To prevent unrealistic fantasies about artificial intelligence from bringing on another AI winter, it is necessary to set goals that can be achieved in the near term. Perhaps the transition from deep learning to collaborative learning is one possible direction of evolution.