An hour later, Fang Hong arrived at the Quantitative Capital headquarters again.
Chen Yu's assistant came out to receive him and led him toward the reception room, saying: "Mr. Fang, Mr. Chen is in a meeting with the technical department. Please wait a moment, and I'll let him know."
Fang Hong said: "No need. Just take me to the conference room and I'll sit in and listen."
Hearing this, Chen Yu's assistant took out his cell phone and sent Chen Yu a message. A reply came back almost at once. The assistant turned to Fang Hong and said with a smile: "Mr. Fang, please come this way."
A short while later, Fang Hong entered the conference room where Chen Yu was holding the meeting. More than thirty people were present, and when they saw an unfamiliar young man walk in, everyone looked at him curiously.
They could tell that Fang Hong was about the same age as their boss Chen Yu, but unlike Chen Yu, he carried an air of effortless authority that none of them had possessed at that age. It made everyone realize that this young stranger was no ordinary person.
At this moment, Chen Yu saw Fang Hong looking his way and nodding in greeting. Fang Hong then smiled slightly, quietly found a seat in the conference room, and settled in to listen.
Chen Yu withdrew his gaze, looked around at the attendees, and continued: "...As for the basic idea behind artificial intelligence and the process of machine learning, simply put, it is how a computer teaches itself."
"Because all computer operations are based on mathematical operations, any machine learning idea is ultimately to transform a practical problem into a mathematical problem. In order for the computer to predict or identify something, it needs to construct a mathematical function first.
This mathematical function is called the prediction function.”
It might be hard for an outsider to imagine: Quantitative Capital was a diversified financial company, a non-bank investment firm in the eyes of most investors, yet here was its leader, a man who himself traded and invested, lecturing his own staff on this material.
But Fang Hong was entirely calm. This was actually quite normal; Wall Street, after all, had long gathered a crowd of top mathematicians and physicists.
At this moment, Chen Yu turned to the conference screen and said: "For example, a function predicting a full meal could be written as [fullness = n bowls of rice]. Is this prediction accurate? How many bowls of rice can a person eat, and what is the relationship between fullness and the number of bowls? Does it take one bowl to feel full, or three?"
"This requires actually trying it. If the prediction is that you will be full with two bowls of rice, but you actually need to eat three bowls of rice to be full, the error of one bowl is the loss. The function describing this loss is [3-n=1],
This is the loss function."
"Machine learning is the process of constantly trying to minimize the error. The method of finding the minimum loss is usually gradient descent. Once we find the minimum error, we will find that the error is minimum when [n=3], that is, the machine
If you learn to find the real rules, you will successfully solve the problem."
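Chen Yu's description maps directly onto a few lines of code. A minimal sketch in Python, assuming a squared-error version of the loss (the spoken loss [3 − n], squared so it can be differentiated) and an illustrative learning rate; neither detail comes from the lecture itself:

```python
# Gradient descent on the rice-bowl example: find the n that
# minimizes the loss. The true answer (3 bowls), the starting
# guess, and the learning rate are all illustrative assumptions.

def loss(n, actual=3.0):
    """Squared error between the predicted and actual bowls of rice."""
    return (actual - n) ** 2

def grad(n, actual=3.0):
    """Derivative of the loss with respect to the prediction n."""
    return -2.0 * (actual - n)

n = 2.0        # initial prediction: "full after two bowls"
lr = 0.1       # learning rate (step size)
for step in range(100):
    n -= lr * grad(n)   # step against the gradient to shrink the loss

print(round(n, 3))      # converges to 3.0, the "real rule"
```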
Chen Yu looked at the crowd again and said: "So machine learning is the search for patterns in data. Most of the time, its essence is the process of projecting the data into a coordinate system and then having the computer, through mathematical methods, draw a line that separates or fits that data."
"Different machine learning methods are using different mathematical models to project data and draw lines. From the last century to the present, different schools have found different methods and are good at solving different problems. There are a few that have had a huge impact.
Types: linear regression and logistic regression, k-nearest neighbor, decision tree, support vector machine, Bayesian classification and perceptron, etc."
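What "drawing a line" means in practice can likewise be sketched in code. The snippet below uses scikit-learn's logistic regression on a handful of invented 2-D points; the data, the library choice, and the clusters are illustrative assumptions, not details from the lecture:

```python
# "Drawing a line to separate data": fit a logistic regression to
# two invented clusters and read off the learned separating line.
from sklearn.linear_model import LogisticRegression

# Two clusters in a 2-D coordinate system, labeled 0 and 1.
X = [[0, 0], [1, 0], [0, 1], [4, 4], [5, 4], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression().fit(X, y)

# The learned line w1*x1 + w2*x2 + b = 0 separates the two clusters.
print(model.coef_, model.intercept_)
print(model.predict([[0.5, 0.5], [4.5, 4.5]]))  # -> [0 1]
```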
Fang Hong sat to one side and listened in silence. He was himself half an insider in the field of computer science, and he also held the foresight carried over from his previous life, so following along took no effort at all.
Chen Yu and his people clearly belonged to the neural network school, though they had gone a step further into deep reinforcement learning, and the predecessor of the neural network was the perceptron. Perceptron, neural network, deep learning: the three terms are, at bottom, playing with the same thing.
At this moment, Chen Yu went on unhurriedly: "The most basic idea of deep learning is to simulate the activity of brain neurons in order to construct the prediction function and the loss function. Since it's called a neural network, it must bear some relationship to the neurons of the human brain, and the algorithmic mechanism of a single perceptron does in fact simulate how a brain neuron operates."
A structural diagram of a brain neuron appeared on the screen.
"This is a neuron. Everyone knows its structure. This is the dendrite, and this is the axon. Signals from other neurons enter the neuron through the dendrite, and then are emitted through the axon. This is a nerve.
Yuan’s operating mechanism.”
"Now we mutate the neuron's tree into an input value, and mutate the axon into an output value, so the neuron becomes a picture like this. It is simpler to convert it into a mathematical formula, [
1
3=y], that’s the formula.”
"Yes, it's that simple. The most complex things are often created by the simplest things. Simple 0s and 1s shape the huge computer world, and four nucleotides are used to create complex life phenomena. A
Simple neuronal reflexes shape our brains."
Chen Yu paused for a moment and looked around the room again: "The crux of the problem is not how simple the basic structure is, but how we use that basic structure to build an enormous world. What makes the neuron remarkable is its activation mechanism, the so-called threshold."
"Each dendrite of a neuron continuously receives input signals, but not every input signal can cause the axon to output a signal, and each dendrite has a different weight when receiving input."
"For example, if you pursue a girl, you take various actions tirelessly. Today you gave her a bouquet of flowers, and tomorrow you treat her to a feast, but you find that none of these actions can impress her. Until one day you accompany her shopping for a day.
, she was suddenly moved and agreed to be your girlfriend, what does this mean? "
"It shows that not all input weights are the same. Shopping may have the highest weight among girls. Secondly, the accumulation of effects is not a linear and gradual process, but quantitative changes lead to qualitative changes."
"All inputs have no effect before a certain point, but once they reach a certain value, they are suddenly excited. Therefore, to imitate the activation characteristics of neurons, then make a modification to the formula just now."
"Each input requires a certain weight. Add a coefficient [w] to adjust the weight in front, and add a constant in the back to facilitate better adjustment of the threshold, so the function becomes like this."
Fang Hong too looked toward the big screen at the front of the conference room, on which a new mathematical formula was displayed:
[w1x1 + w2x2 + w3x3 + b = y]
Chen Yu looked at the formula on the screen and said: "To realize the activation process, we process the output value one step further by adding an activation function. For example, when the result is greater than 1, output 1; when it is less than 1, output 0. So the function becomes like this."
"However, this function does not seem round enough and is not differentiable everywhere, so it is difficult to handle. So I changed it to the igmoid function. Such a simple function can handle the classification problem."
"A single perceptron actually draws a line to separate two different things. A single perceptron can solve linear problems, but it is powerless for linearly inseparable problems. That means that even the simplest XOR problem cannot be solved.
Unable to handle."
Everyone present, including Fang Hong, understood the XOR problem; it is one of the basic operations of a computer.
At this point, Chen Yu posed the question himself: "If it can't even solve XOR, isn't that a death sentence?"
Chen Yu immediately answered his own question: "It's very simple: raise the dimension, the way a kernel function does. The reason the perceptron could grow into today's deep learning is that it went from one layer to many. The 'depth' of deep learning refers to the many layers of perceptrons; we usually call a neural network with more than three hidden layers a deep neural network. So how does adding layers let the perceptron solve the XOR problem?"
Chen Yu turned back to the screen, brought up the next slide, and said: "A computer has four basic logical operations: AND, OR, NOT, and XOR. There's no need to go into detail about them here. If we plot XOR in a coordinate system, this is what it looks like."
"The origin position is 0, y is 0, so take 0; when =1, y=0, if the two are different, take 1, together, here is also 1, and at this position, y is equal to 1, so take 0, in this
If we need to separate 0 and 1 on the picture, a straight line cannot do it."
"What to do? This depends on the nature of the XOR operation. Mathematically speaking, the XOR operation is actually a compound operation. It can actually be obtained through other operations. The proof process is too complicated and will not be expanded upon here."
"If we can use the perceptron to complete the operation in the brackets first, and then input the result into another perceptron to perform the outer layer operation, we can complete the doubt operation, and then the XOR problem will be so magical
Solved, while solving the problem, it also solved the problem of linear inseparability."
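A minimal sketch of that layered construction: the first layer computes the "inner" operations (here OR and NAND), and a second perceptron combines them. The hand-picked weights are illustrative; a trained network would learn its own:

```python
# Solving XOR by stacking perceptrons: layer 1 computes OR and
# NAND of the inputs, layer 2 ANDs the hidden results together.
# All weights are hand-picked for clarity, not learned.

def step(z):
    """Threshold activation: fire (1) once the input crosses zero."""
    return 1 if z > 0 else 0

def perceptron(x1, x2, w1, w2, b):
    return step(w1 * x1 + w2 * x2 + b)

def xor(x1, x2):
    # Layer 1: two perceptrons compute the bracketed operations.
    h_or   = perceptron(x1, x2,  1,  1, -0.5)   # x1 OR x2
    h_nand = perceptron(x1, x2, -1, -1,  1.5)   # NOT (x1 AND x2)
    # Layer 2: a third perceptron performs the outer AND.
    return perceptron(h_or, h_nand, 1, 1, -1.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor(a, b))   # prints 0, 1, 1, 0
```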
"What does this mean? It means that no matter how complex the data is, we can fit a suitable curve to separate them by adding layers. Adding layers is the nesting of functions. In theory, no matter how complex the problem is, we can
Through the combination of simple linear functions, in theory, multi-layer perceptrons can become a universal method that can solve various machine learning problems across fields."
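The claim about layering also holds beyond hand-picked weights: a small multi-layer network can learn XOR from the data itself. A sketch using scikit-learn's MLPClassifier; the hidden-layer size, solver, and random seed are illustrative assumptions, and a different seed may occasionally need a retry:

```python
# A small multi-layer perceptron learning XOR from data, rather
# than from hand-picked weights. Hyperparameters are illustrative.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer already makes XOR separable in the hidden space;
# lbfgs handles tiny datasets like this well.
net = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    random_state=0, max_iter=1000)
net.fit(X, y)

print(net.predict(X))  # expected: [0 1 1 0]
```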