Chapter 156 Academic Tool Man Get√ (1/3)
Eve Carly did not know why Lin Hui had asked so suddenly.
But she was not about to pass up a chance to get some guidance from him.
Eve Carly first explained to Lin Hui the role that vectors usually play in the West when calculating semantic text similarity.
Then she formally began to answer the question Lin Hui had asked her earlier:
“The introduction of vectors can make it easier for machines to process semantic text information.
Without vectors, we have few options for calculating semantic text similarity.
And the approaches we can choose without vectors are all more or less low-end.
Take string-based methods, for example, which compare the original texts directly.
They mainly include edit distance, longest common subsequence, and N-gram similarity.
Edit distance, for instance, measures the similarity between two texts by the minimum number of edit operations needed to turn one text into the other.
The algorithm defines three edit operations: insertion, deletion, and substitution.
Longest common subsequence is based on...
These metrics are even a bit like the way Microsoft Word compares documents.
String-based methods are simple in principle and easy to implement.
But they take no account of what words mean or how words relate to one another.
They cannot handle problems involving synonyms, polysemy, and the like.
At present, string-based methods are rarely used alone to calculate text similarity.
Instead, the calculation results of these methods are used as features to characterize text and are integrated into more complex methods.
In addition to this method, there is also..."
Lin Hui knew a fair bit about these things himself.
He just wanted to gauge, from what Eve Carly said, how far research in this time and space had progressed.
Measuring semantic text similarity by string edit operations and longest common subsequence was indeed a bit low-end.
But low-end did not mean useless, and the approach could not be called worthless.
Suppose there were a breakthrough in the field of text recognition.
If a method for judging text similarity were then combined with a text recognition algorithm,
string-based methods would actually be the most suitable choice.
After all, string-based comparison is the closest match to the intuitive logic of computer vision.
In fact, text recognition was a very common technology in later generations.
The screenshot tool in any chat software could handle text recognition tasks with ease.
But in this time and space, some software was even marketed with text recognition as its gimmick,
when all it actually did was scan a document and convert it into a PDF.
When it came to actual text recognition, it was hopelessly inefficient.
Lin Hui felt as if he had stumbled upon another business opportunity.
Although he had discovered a business opportunity, now was not the time to pursue it.
After all, text recognition belongs to the field of computer vision.
Computer vision, put simply, means letting machines see things.
It is considered a branch of artificial intelligence.
Research in this area enables computers and systems to obtain meaningful information from images, videos, and other visual inputs,
and to take action or provide recommendations based on that information.
If artificial intelligence gives computers the ability to think,
then computer vision gives them the ability to discover, observe, and understand.
Computer vision could not be called extremely complicated.
But its barrier to entry was at least much higher than that of natural language processing.
Clearly it was not the right time for Lin Hui to get involved.
However, Lin Hui was patient, and he silently kept the matter in mind.
Lin Hui felt that he should not be too short-sighted.
Some things seem useless now.
That does not mean they are useless in the long run.
Thinking of this, Lin Hui suddenly felt very lucky.
After his rebirth, the experience of his previous life let him handle things with greater ease.
Beyond that, rebirth had also changed the way he thought.
In many matters, Lin Hui would subconsciously consider the long-term value.
He would even find himself thinking about what things would be like ten or twenty years on.
With this long-term way of thinking,
Lin Hui felt that, given time, he would be able to reach a height that few others could reach.
But these thoughts were not something to share with outsiders.
He differed somewhat with Eve Carly on string-based methods of evaluating text similarity.
But Lin Hui did not show it; academic exchange was often a matter of seeking common ground while reserving differences.
Eve Carly went on to state her views:
"...I think it is indeed a good idea to introduce vectors into the measurement of semantic text similarity.
But once vectors are involved, it is like opening Pandora's box.
When vectors are used to process semantically complex text information,
they very easily form high-dimensional spaces and cause a dimensionality explosion.
Once that happens, the method often performs extremely badly in practice.
And dimensionality explosion happens often.
In fact, the problem has already greatly restricted our research.
Dear Lin, I wonder what you think about this issue?"
Lin Hui said: "Dimensionality explosion is mainly a matter of high dimensions being hard to handle.
In that case, why not consider reducing the high dimensions?"
Lin Hui's tone was perfectly calm.
It was as if he were describing something entirely natural.
Dimensionality reduction? Dimensionality reduction of high-dimensional things?
Listening to the interpreter's simultaneous translation,
Eve Carly felt like she was going to vomit blood.
She rather wished she had learned some Chinese.
She could not tell whether Lin Hui had simply meant turning high dimensions into low dimensions,
or whether he had spoken of converting some high-dimensional thing into a low-dimensional one and something had been dropped in translation.
It would be really bad if some important term had been omitted.
In the end, did Lin Hui mean converting high-dimensional data into low-dimensional data?
Or converting a high-dimensional model into a low-dimensional model?
Or does it have some other meaning?
Eve Carly wanted to ask.
But considering how thoughtful Lin Hui had earlier been toward Mina Kali,
Eve Carly did not feel it right to embarrass the interpreter Lin Hui had brought over this sort of thing.
She thought carefully about what Lin Hui's words might mean.
First of all, Eve Carly felt that what Lin Hui wanted to say was not to reduce high-dimensional data to low-dimensional data.
If high-dimensional data appears when performing natural language processing.
To be continued...