LLMs are especially unsuitable for use as search.
1. search is a solved problem: traditional search and ranking algorithms vastly outperform LLMs while being far more efficient. Google search was exceptional a long time ago, before SEO and sponsored results ruined the internet, and LLMs are even more dangerously susceptible to SEO-like manipulation.
2. in practice, the context and source of a piece of information are just as important as the information itself, yet LLMs present information out of context and without the source, or worse, with fake sources. Users derive more understanding than they consciously realize from how information is presented on a page: where it sits, the writing style, and what other information surrounds it.
3. it’s been shown time after time that LLMs are uniquely bad at summarization, and that LLMs are not a “knowledge store”, but unfortunately they are still misused in both of these ways for search.
Search, summarization, and “knowledge store” are not valid use cases for LLM technology.
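To make point 1 concrete, here is a minimal sketch of the kind of traditional lexical ranking function (BM25-style scoring) that search engines have used for decades. The corpus and query are made up for illustration; real systems add indexing, stemming, and much more.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against a whitespace-tokenized query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    q_terms = query.lower().split()
    # document frequency of each query term across the corpus
    df = {t: sum(1 for d in tokenized if t in d) for t in q_terms}
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for t in q_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # term-frequency saturation, normalized by document length
            score += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "dogs chase cats in the yard",
    "ranking functions score documents against a query",
]
print(bm25_scores("ranking query", docs))  # third doc scores highest
```

Note that the scoring is cheap, deterministic, and auditable: every score can be traced back to term counts in a specific document, which is exactly the source attribution an LLM answer lacks.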
Can you please expand on point 3? Going only on anecdotes, I thought LLMs summarised extremely well, albeit with occasional hallucinations.
Intuitively, it seems at first like LLMs would be great at summarization. However, researchers who have analyzed and evaluated LLM summarization output have concluded that LLMs do not actually do summarization at all. Instead, they merely shorten the text, largely by removing repetition, based on statistical patterns in their training data rather than on an understanding of the specific content in front of them. Again, intuitively it might seem like shortening is still a useful capability, but in practice the results are almost never useful or what the user actually wanted.
Shortening is a fundamentally different task from summarization, because summarization requires understanding the content and its context in order to identify the key information and the point or purpose of the text. A statistical model of language simply cannot do that.
Our misleading intuitions about how LLMs work and how they can be used make them even less suitable for the task, because they appear to be doing what the user asked while actually doing something entirely different.
tl;dr: what they produce looks like a summary of the given content, until you really dig in and evaluate it, and then you realize it isn’t a summary of the content at all; it just looks like one.
Huh, that’s really interesting (and makes sense). Appreciate you taking the time to write it out.
Here Gamers Nexus talks about how YouTube mis-summarizes their content:
https://youtu.be/MrwJgDHJJoE