Chapter 90 Admiration from the top algorithm team_Crossing: 2014_Inokuma

Eve Carly is 25 years old and is a Ph.D. in computer science from the Massachusetts Institute of Technology in the United States.

In fact, few computer science majors go on to get a Ph.D.; most go their separate ways after completing a master's degree.

But Eve Carly has her own pursuits on the academic road.

Although this pursuit is almost destined to be a solitary journey, she has always enjoyed it, and her biggest motivation along the way is interest.

Besides interest, the other big reason is the pride her work brings her.

She also has reason to be proud as a member of the text summarization team in MIT's natural language processing research project.

After all, the world's most efficient extractive text summarization algorithm was developed by their team.

Eve Carly has always been proud of it.

However, this glory disappeared half an hour ago.

A new text summarization algorithm that is stronger than the extractive text summarization algorithm developed by their team has been released.

And it appears directly in Apple's App Store as a mature application.

Eve Carly learned about this only after receiving an email from Nick asking for help.

In fact, she was still a little skeptical after reading Nick's overwrought plea for help.

For a while she even thought that Nick, that arrogant, stupid, lucky idiot, had gotten the date of April Fools' Day wrong.

The algorithm in the software Nick used was ostensibly the responsibility of Yishuo's team.

But in fact, the Text Summarization Group of the Natural Language Processing Project at MIT is the real source of this algorithm.

The algorithm used in Nick’s software can be said to be the culmination of the hard work of everyone on Eve Carly’s research team.

Eve Carly is still very confident about the algorithm she personally participated in.

How could any software summarize news more efficiently than the algorithm they had developed?

It's not that she is arrogant and blindly confident.

In the past, the core algorithms of many apps that appeared in app stores under the banner of news summarization were actually very inefficient.

Even many news summarization programs that claimed to have unique algorithms turned out to be nothing special.

This time, the so-called Nanfeng APP claimed to be the most efficient on Earth and the most accurate in the world.

At the beginning, Eve Carly only regarded these slogans as gimmicks and did not take them seriously.

However, the reality is shocking. Far from being a paper tiger, this Nanfeng APP is a peerless beast.

At least when it comes to summarizing news, the algorithm used by Nanfeng APP is ridiculously efficient.

In quantitative testing, Eve Carly found that across 100 rounds of tests, Nanfeng APP summarized English news on average 241% faster than the software developed by Nick.

And that was not all. When Nanfeng APP ran on a virtual machine with more computing power, its average speed on English news summaries across 100 rounds of tests was 350% faster than that of their algorithm under the same conditions.

It can be said that their algorithm was beaten on every front.

Eve Carly couldn't understand how there could be an algorithm that was three times more efficient in extractive text summarization than the algorithm they developed.

According to their research, the potential of current extractive text summarization algorithms has almost been exhausted.

Could it be that the algorithm team of Nanfeng APP has found a new way to squeeze more potential out of the extractive text summarization algorithm?

Impossible, absolutely impossible.

After all, their research team is also a natural language processing algorithm team that brings together the world's leading technology experts.

It makes no sense that elites like them would be overtaken by others in the same direction.

If the Nanfeng APP algorithm team had not caught up from behind along the same path, had it overtaken them on a bend by taking a different route?

In other words, the algorithm of Nanfeng APP is definitely not the traditional extractive text summary algorithm, but a brand new summary algorithm.

Laymen watch the spectacle; experts see the craft.

Eve Carly quickly verified her guess against the inputs and outputs of the several sets of news summary tests she had run on Nanfeng APP earlier.

As expected, Nanfeng APP adopts a brand new text summarization algorithm.

As for the basis for judgment, it is very simple.

Extractive text summarization directly extracts words or complete phrases from the original text as a summary of the article.

This process does not produce words and phrases that are not in the original news text.

However, the news summaries Nanfeng APP produces contain many words and phrases that are not in the original news text.

In other words, the algorithm used in Nanfeng APP is definitely not an extractive algorithm, at least not just an extractive algorithm.

A major feature of this new algorithm in news summarization is that it will generate words and phrases that are not in the original news text.

Compared with traditional extractive text summarization, Eve Carly feels that this new summarization method in Nanfeng APP is more like a generative summarization method.

However, new questions immediately appeared in Eve Carly's mind.

How did the developer of Nanfeng APP come up with this brand-new algorithm tentatively called "generative summary algorithm"?

Her own team had also dabbled in so-called generative summarization, and in similar summarization algorithms that rely on neural networks, before.

At that time, they called it a "generalized summarization algorithm", but after many rounds of testing by their team, its actual performance was not ideal.

This algorithm, called generalized or generative text summarization, can generate expressions that do not appear in the original text, which makes it more flexible than the extractive summarization algorithm.

However, precisely because of this, generative summarization is more likely to produce factual errors, including content that contradicts the original text or common sense.

In addition, this generative text summarization algorithm tends to struggle visibly with long news articles.

Combining the generative summarization algorithm with the extractive one did improve the generative algorithm's ability to handle long news.

However, testing showed that without the generative algorithm dragging it down, the extractive algorithm performed even better.

To be safe, Eve Carly's team ultimately chose the traditional direction: further improving the speed and accuracy of extractive text summarization.

A direction they had abandoned had been picked up again by someone else?

It sounds a bit incredible, but the fact is that the developers of Nanfeng APP not only picked up the research direction they once abandoned, but also did it better than them, which can be said to be a slap in the face.

Eve Carly was a little confused. She couldn't figure out how the developers of Nanfeng APP had managed to forge a path in a direction her team had thought unfeasible.

But one thing is certain: although the developers of Nanfeng APP also use something like a generalized/generative algorithm, their algorithm is at least a generation ahead of the generative algorithm her team originally built.

Despite her confusion and the slap in the face, Eve Carly did not get very emotional, at least not as emotional as Nick had been in his email.

Years of research had long since given Eve Carly a rational temperament, unmoved by honor or disgrace.

Besides, technological advances come one after another.

If you fret over every temporary gain or loss, you had better change careers as soon as possible.

Excessive emotional fluctuations are not only unnecessary, but will also cloud rational judgment.

After experiencing Nanfeng APP in depth, Eve Carly had to admit that although this APP looks like a temporary translation software, the core algorithm is indeed very strong.

It even lives up to the software's promotional slogan: "The strongest on Earth".

In addition, the software's claims that its summarization speed and accuracy crush those of similar software are also true.

Wait, thinking of the "accuracy" emphasized in the promotional slogan of Nanfeng APP, Eve Carly suddenly thought of something.

Current news summarization software emphasizes speed in its marketing and rarely mentions accuracy.

It’s not that accuracy is unimportant in news summarization. On the contrary, it is extremely important; accuracy is arguably the most fundamental measure of whether a summarization algorithm is useful. Yet summarization algorithms rarely advertise their accuracy with precisely quantified figures.

The reason is simple: the industry currently lacks a unified standard for measuring accuracy.

It sounds incredible, but it is true. Evaluating the accuracy of a summary may seem easy, but in fact it is a rather difficult task.

It is hard to say that a summary has a standard answer. Unlike many tasks that have objective evaluation criteria, evaluating summaries relies on subjective judgment to some extent.

Summarization tasks lack a unified standard for measuring aspects of summary accuracy such as grammatical correctness, language fluency, and completeness of key information.

There are currently two methods for evaluating the quality of automatic text summarization: manual evaluation methods and automatic evaluation methods.

Manual evaluation involves inviting a number of experts to set standards and score the summaries by hand. This method is closer to the human reading experience.

However, it is time-consuming and labor-intensive. Not only can it not be used to evaluate automatic text summarization data at scale, it also does not fit the application scenarios of automatic text summarization.

Most importantly, when people with subjective views evaluate summaries, bias creeps in easily. After all, a thousand readers see a thousand Hamlets, and everyone has their own criteria for judging a news summary. A single evaluation team may be able to formulate a unified standard, but change the team and the standard is likely to change too.

As a result, the same summary can easily receive completely different accuracy ratings from different evaluation teams.

Evaluation teams vary widely, and a team that is clearly capable of building a good algorithm can easily be killed off before it succeeds because of an evaluation team's interference.

The text summarization algorithm of Eve Carly and her team once led the world.