Recently, a friend who knows little about technology told me he wanted to buy a Tesla because of its “stand still and let the car find you” Autopilot feature. I was struck by the realization that autonomous driving, while still a matter of debate among practitioners, could become a key factor in ordinary people’s car-buying decisions within a few years. If so, ordinary buyers need to understand the key technologies of autonomous driving ahead of time, though not in as much detail as practitioners. This answer is an easy-to-understand introduction for the general public.
What technology is required for autonomous driving?
To answer this question, you don’t need to open a book. Just think back to how we usually drive:
Environment perception by the eyes: Where is the lane? Is the light red or green? Whoa, where did that electric scooter come from? Oops, a little too close to that van! All of these tasks are handled by the eye, an ultra-high-performance bionic camera with an ultra-wide angle, fast focusing, stepless aperture adjustment, real-time binocular ranging, and even self-repair after damage.
More importantly, this bionic camera comes with a powerful (artificial) intelligence processor that automatically handles image processing (filtering out the shadows of capillaries, filling in blind-spot pixels by interpolation, and so on), object recognition (traffic lights, lanes), trajectory prediction (that electric scooter is about to dart out), and other functions, and then reports the results to the higher-level “consciousness”.
Behavioral decision-making by the brain: using the information from environment perception to choose a control strategy for the vehicle. No car ahead? Speed up. A car just cut in? Hit the brakes. Route planning, such as “should I take the expressway today?”, also belongs to decision-making in the broad sense.
Vehicle control by the hands and feet: after receiving decisions from the brain, the driver’s nerves and limbs act on the accelerator and brake pedals and the steering wheel, the two main interfaces between human and vehicle, and together with the vehicle’s own systems they carry out vehicle control.
What is autonomous driving? It is replacing all or part of these functions, which were originally performed by humans.
What technologies are needed for autonomous driving? Naturally: environment perception, behavioral decision-making (in the broad sense), and vehicle control.
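The three-stage split described above (perceive, decide, control) can be sketched as a toy loop. This is only an illustration, not how any real vehicle is programmed: the `Perception` and `Command` types, the 30 m safe gap, and the throttle and brake values are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Perception:
    """Hypothetical output of the perception stage (the 'eyes')."""
    obstacle_distance_m: float  # distance to the nearest obstacle ahead
    light_is_red: bool          # state of the traffic light ahead


@dataclass
class Command:
    """Hypothetical actuator command (the 'hands and feet')."""
    throttle: float  # 0.0 .. 1.0
    brake: float     # 0.0 .. 1.0


def decide(perception: Perception, safe_gap_m: float = 30.0) -> str:
    """Decision stage (the 'brain'): pick a strategy from what was perceived."""
    if perception.light_is_red or perception.obstacle_distance_m < safe_gap_m:
        return "brake"
    return "accelerate"


def control(decision: str) -> Command:
    """Control stage: turn the decision into pedal commands."""
    if decision == "brake":
        return Command(throttle=0.0, brake=1.0)
    return Command(throttle=0.5, brake=0.0)


def drive_step(perception: Perception) -> Command:
    """One pass through the pipeline: perception -> decision -> control."""
    return control(decide(perception))
```

Note that `decide` and `control` are trivial here only because the `Perception` values are assumed to be correct, which is exactly the hard part.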
So which of these technologies has the highest barrier today and matters most?
First, we can rule out vehicle control. That is not to say vehicle control is simple: L1 autonomous driving can only assist the driver with either automatic acceleration and deceleration or automatic steering, and it took the industry a long time to progress to L2, which handles both at the same time.
However, it is on the whole a mechanical and electrical control engineering problem, and the related technology and supply chain are basically mature. Advanced autonomous driving will require redundant braking and steering for safety, but that is still engineering, and engineering problems can eventually be solved.
Second, for behavioral decision-making (in the broad sense), navigation software is already better than humans at tasks such as route planning. Deciding when to accelerate, brake, or turn is not difficult provided that environment perception is absolutely correct (it may involve some legal and ethical issues, but those lie outside the technical domain).
So what’s left is environment perception. It is a new challenge for the automotive industry, and it is the most critical link in achieving autonomous driving.
Two technical routes for environment perception
It is precisely because environment perception is so hard that the industry has split over technical routes. The most memorable moment came when Tesla CEO Elon Musk, while launching Autopilot 3.0, delivered a controversial line that took aim at much of the self-driving industry:
“Lidar is a fool’s errand. Anyone relying on lidar is doomed.”
Musk causes a stir yet again
What is Lidar? Why did Musk say that?
To answer this question, we need to look at the two technical routes for environment perception in autonomous driving: weak perception + super intelligence versus strong perception + strong intelligence.
1. Weak perception + super intelligence
In his speech, Musk championed the weak perception + super intelligence route, which relies mainly on cameras and deep learning for environment perception rather than on lidar. There is something bionic about this approach: just as people can drive with a pair of eyes, cars can use cameras to see their surroundings. For Musk, who regards first principles as the secret to disruptive innovation, this is not surprising.
I also like the natural elegance of this route, but the question is: when will super intelligence be achieved? A battle between technology routes comes down to the pace and speed of commercialization. A hundred years ago, the internal-combustion engine had none of the electromagnetic elegance of the electric motor, yet it still beat the electric car. A decade ago, plasma TVs were not bad at all, yet they lost to LCD TVs.
The real problem is that deep learning is still stuck at the awkward “recognition” stage. For example, the Nature article “Why deep-learning AIs are so easy to fool” gives an example: a distinctive “STOP” sign is recognized by the AI as a dumbbell or a racket once the viewing angle changes.
Note: the recognition results above are only examples from the article and do not represent the best recognition capability available today.
That’s embarrassing. With recognition results like these, would it even beat the village fool next door?
“Weak” perception is weak only relative to “super” intelligence. The reason we think the human eye is so powerful is that even the dullest human comes equipped with super-intelligent recognition capabilities, which we simply take for granted and never notice.
And the cases above are in good lighting; at night or in fog, rain, or snow, recognition gets even worse. When a Tesla crashed into a truck in the U.S., it was reportedly because the camera mistook the white truck for the sky.
If recognition is this shaky, the subsequent behavior prediction and logical reasoning are even less reliable. The current level of deep learning is therefore far from human “super intelligence”, and we don’t know when it will reach, let alone exceed, the human level.
With so much unknown about how fast deep learning will advance, the self-driving industry can’t afford to stop and wait. So here’s another idea: If super intelligence is temporarily out of reach, what if we could give cars more senses than the human eye? This is the technical route of strong perception + strong intelligence.
2. Strong perception + strong intelligence
Compared with weak perception + super intelligence, the biggest feature of the strong perception + strong intelligence route is the addition of a lidar sensor, which greatly improves perception.
Before introducing the principle of lidar, let’s start with a simple analogy. Lidar is a brute-force solution: it is like having clairvoyant eyes that scan every corner, so in theory you know everything that is around you. Add a modest learning algorithm, and you can map out where the obstacles are and know where the car can go.
Think of Second Brother in the cartoon Calabash Brothers (“Gourd Babies”). He has no brute-force skills like Iron Baby or Fire Baby, and as a pure-hearted child with no cunning (no super intelligence), he is captured without a fight when he meets the monsters. Yet with his clairvoyant eyes, he also won plenty of battles. This approach may sound a little too blunt, but it is the more feasible route to advanced autonomous driving at a time when deep learning has hit a wall.
In fact, there are quite a few players on this route: AI companies, ride-hailing companies, and traditional carmakers such as Google Waymo, Baidu Apollo, Uber, Ford, and General Motors are all in the strong perception + strong intelligence camp. Besides, brute-force solutions often have a beauty of their own. Take the Soviet Katyusha rocket launcher: it has no sophisticated guidance and does not need to know the enemy’s exact position; it simply delivers enough firepower at the start of the fight to guarantee coverage.
Key sensors for environment perception
Whichever technical route you take, the core of environment perception lies in an age-old term: the sensor. The cameras and lidar mentioned earlier are both sensors.
Besides these two sensors, which are closely tied to autonomous driving, there are two other types of radar: ultrasonic radar and millimeter-wave radar.
Is your head starting to spin from all these technical terms? Mine did at first, so I came up with a simple way to remember them, which I’ll share with you:
Camera: relies on others. A device that senses information through waves emitted by a third party (light is a wave too) is called a camera.
Radar: self-reliant. A device that senses information by transmitting its own waves is called a radar.
Types of radar: according to detection range, resolution, and other factors, radars fall into three types: ultrasonic radar, millimeter-wave radar (wavelength 1-10 mm), and laser radar (lidar). In general, lidar has the longest range and the highest resolution.
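To get a feel for the 1-10 mm figure above, the wavelength of an electromagnetic wave can be converted to a frequency with f = c / λ. The short sketch below does that conversion; the function names are made up for illustration, and the formula applies to electromagnetic waves such as millimeter waves and laser light, not to ultrasonic sound waves.

```python
# Speed of light in vacuum, in metres per second.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_mm_to_frequency_ghz(wavelength_mm: float) -> float:
    """Convert an electromagnetic wavelength in millimetres to a frequency in GHz."""
    wavelength_m = wavelength_mm / 1000.0
    return SPEED_OF_LIGHT_M_S / wavelength_m / 1e9


def is_millimeter_wave(wavelength_mm: float) -> bool:
    """True if the wavelength falls inside the 1-10 mm band quoted in the text."""
    return 1.0 <= wavelength_mm <= 10.0


if __name__ == "__main__":
    # Print the frequencies at the two edges of the millimeter-wave band.
    for wavelength in (1.0, 10.0):
        frequency = wavelength_mm_to_frequency_ghz(wavelength)
        print(f"{wavelength} mm -> {frequency:.1f} GHz")
```

So the 1-10 mm band corresponds to roughly 300 GHz down to 30 GHz.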
The principle of lidar can be understood as digitization: a laser beam travels in a straight line, so each beam is equivalent to sampling one point. In theory, if you sweep all the surrounding points, you know exactly what the surrounding environment looks like. The 3D scanning of car body shapes (reverse engineering) works on the same principle.
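The “sweep of points” idea can be made concrete with a little geometry. The sketch below assumes a pulsed lidar that measures distance by timing the round trip of a laser pulse (a common design, though the text above does not say which method is used), and it assumes an x-forward, y-left, z-up coordinate frame. All function names are hypothetical.

```python
import math

# Speed of light in vacuum, in metres per second.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def time_of_flight_to_range_m(round_trip_s: float) -> float:
    """Distance to the target, given the round-trip time of a laser pulse.

    The pulse travels out and back, hence the division by two.
    """
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def scan_point_to_xyz(range_m: float, azimuth_deg: float, elevation_deg: float):
    """Turn one beam measurement (distance plus two angles) into an (x, y, z) point.

    Assumed frame: x forward, y left, z up; azimuth is measured from the x axis
    toward y, elevation is measured upward from the horizontal plane.
    """
    azimuth = math.radians(azimuth_deg)
    elevation = math.radians(elevation_deg)
    horizontal = range_m * math.cos(elevation)  # projection onto the ground plane
    x = horizontal * math.cos(azimuth)
    y = horizontal * math.sin(azimuth)
    z = range_m * math.sin(elevation)
    return (x, y, z)
```

Repeating `scan_point_to_xyz` for every beam direction produces the set of points usually called a point cloud.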
Of course, real engineering applications involve many more details. As mentioned earlier, lidar has a very high resolution, which also means it can only scan a very small area at a time, so it has to solve two problems:
Spatial coverage: a small area is not enough; we need to see the whole scene. How do you stitch it together? Generally, vertical coverage comes from multiple lines, that is, from firing several beams at once, and horizontal coverage comes from rotation. If the rotation is mechanical, the device is called a mechanical lidar; if it is achieved with optical phased array technology instead, it is called a solid-state lidar.
Temporal consistency: the horizontal scan is done sequentially, so there is a time difference between points. During this short gap, the vehicle itself is moving and the surroundings are changing, so artificial intelligence is needed to “reconstruct the picture”. Of course, the points that lidar returns are precise and carry depth information, so only “strong intelligence” is needed, not “super intelligence”.
Lidar scanning: vertical coverage by multiple lines, horizontal coverage by rotation
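One purely geometric ingredient of the temporal-consistency problem is correcting for the vehicle’s own motion during a sweep. The sketch below is a heavily simplified illustration, not the method any particular lidar uses: it assumes the vehicle drives straight ahead at constant speed, the scene is static, and every point carries its capture time. The function name and the 2D (x forward, y left) representation are made up for the example.

```python
def deskew_points(points, speed_m_s, sweep_duration_s):
    """Re-express points captured during one sweep in the sensor frame at sweep end.

    points: list of (x, y, t) tuples, where (x, y) is the point in the sensor
        frame at the moment it was captured (x forward, y left, in metres) and
        t is the capture time in seconds since the start of the sweep.
    speed_m_s: assumed constant forward speed of the vehicle.
    sweep_duration_s: time taken by one full horizontal sweep.
    """
    corrected = []
    for x, y, t in points:
        # Distance the vehicle still travels after this point was captured.
        remaining_travel_m = speed_m_s * (sweep_duration_s - t)
        # The vehicle moves forward, so a static point ends up that much closer.
        corrected.append((x - remaining_travel_m, y))
    return corrected
```

For example, at 10 m/s with a 0.1 s sweep, a static wall measured at 20.0 m at the start of the sweep, 19.5 m halfway through, and 19.0 m at the end is placed at a single consistent distance after correction. Handling turns, moving objects, and sensor noise takes considerably more than this.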
Unlike lidar, cameras collect information as pixels, similar to what the human eye sees. A human, whose eyes come with a super (artificial) intelligence processor, can effortlessly identify the lanes, vehicles, pedestrians, and so on in the environment. For a vehicle, however, pixels are just a huge mass of meaningless numbers that must go through complicated steps such as abstraction and reconstruction, and only super intelligence can achieve human-level recognition.
Millimeter-wave radar comes last here; does that imply it is unimportant? On the contrary, millimeter-wave radar is very important:
Whether the route is weak perception + super intelligence or strong perception + strong intelligence, every technical route needs millimeter-wave radar. It is necessary at every foreseeable stage of autonomous driving, from today’s mature L2 to the future L3, L4, and L5. Precisely because millimeter-wave radar is necessary and uncontroversial, it is seldom compared or discussed, but that does not mean it is unimportant.