YATI handles the meaning of a query far better than its predecessors: the algorithm performs a deeper analysis of the text and aims to understand its essence. This means the search engine will more accurately determine which information is most relevant to the user's query.
As for ranking, we can predict that the semantic load of content will play a more significant role. That is, expert texts that fully answer the user's query will increasingly reach the TOP.
YATI Features:
Query reformulations and "pre-training for clicks". Yandex has a database of 1 billion query reformulations: [formulation 1] → no click → [formulation 2]. From such pairs, the model learns to predict the probability of a click (see the sketch after the list below).
Yandex.Toloka ratings. Relevance judgments collected from Toloka crowd workers.
Assessors' ratings. Expert relevance judgments from professional assessors.
In addition to these signals, the model takes several text streams as input:
query text;
query expansions;
the "good" (most relevant) parts of the document;
document streams: the anchor list and the query index for the document.
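Yandex has not published the exact feature pipeline, so the following is only a minimal conceptual sketch of how a click-prediction training example built from these inputs might look. All class names, fields, and values here are hypothetical illustrations, not Yandex's actual API:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RankingExample:
    """Hypothetical training example for a click-prediction ranker.

    Mirrors the inputs listed above: query text, query expansions,
    selected "good" document fragments, and document streams.
    """
    query: str                 # original query text
    expansions: List[str]      # reformulations / expanded queries
    good_fragments: List[str]  # most relevant parts of the document
    anchor_list: List[str]     # anchor texts of links to the document
    query_index: List[str]     # queries for which the document was shown
    clicked: bool = False      # target: did the user click?

def to_model_input(ex: RankingExample, sep: str = " [SEP] ") -> str:
    """Flatten all text streams into one sequence for a transformer encoder."""
    parts = [ex.query, *ex.expansions, *ex.good_fragments,
             *ex.anchor_list, *ex.query_index]
    return sep.join(parts)

# A reformulation pair: the first formulation got no click,
# so it becomes a negative example for this document.
no_click = RankingExample(
    query="buy winter tires",
    expansions=["winter tyres price"],
    good_fragments=["Summer tire catalog..."],
    anchor_list=["tire shop"],
    query_index=["summer tires"],
    clicked=False,
)
print(to_model_input(no_click))
```

A binary classifier trained on millions of such (input sequence, clicked) pairs is one plausible way to realize the "pre-training for clicks" idea described above.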
YATI and Google BERT
One of the latest updates from Yandex's main competitor in search, Google, was the introduction of the BERT algorithm. Like YATI, this neural network analyzes search queries together with their context rather than treating individual keywords in isolation: BERT processes the entire sentence.
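To illustrate what "processing the entire sentence" means, the publicly available Hugging Face transformers library exposes pretrained BERT checkpoints whose output vector for a word depends on the whole sentence around it. This is a minimal sketch with an illustrative checkpoint choice, not Google's production setup:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load a publicly available pretrained BERT checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# The word "bank" gets a different vector in each sentence,
# because BERT encodes every token in the context of the full query.
queries = ["open a bank account", "fishing on the river bank"]
inputs = tokenizer(queries, return_tensors="pt", padding=True)
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, tokens, 768)

# Compare the contextual vectors of "bank" in the two sentences.
bank_id = tokenizer.convert_tokens_to_ids("bank")
vecs = [hidden[i][inputs["input_ids"][i] == bank_id][0] for i in range(2)]
cos = torch.nn.functional.cosine_similarity(vecs[0], vecs[1], dim=0)
print(f"cosine similarity of the two 'bank' vectors: {cos:.2f}")  # < 1.0
```

The similarity is well below 1.0: the same word yields different representations in different contexts, which is exactly what older keyword-matching approaches could not capture.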
Both YATI and BERT aim to better understand the meaning of a search query. However, Yandex specialists claim that YATI copes with its tasks better, since in addition to the query text it also analyzes the texts of documents and learns to predict clicks.
The table below compares the quality of neural network-based algorithms on the ranking task, where "% NDCG" is the DCG quality metric normalized against the ideal ranking on a Yandex dataset. A value of 100% means the model ranks documents in exact descending order of their real offline ratings.
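For reference, normalized DCG compares the DCG of a model's ranking against the DCG of the ideal ordering of the same documents. The sketch below uses the standard exponential-gain formulation; Yandex may use a different gain or discount variant:

```python
import math
from typing import Sequence

def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: gains decay logarithmically with position."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances))

def ndcg(relevances: Sequence[float]) -> float:
    """DCG of the given order, normalized by the DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Offline ratings of documents, in the order the model ranked them.
print(f"{ndcg([3, 2, 3, 0, 1]) * 100:.1f}% NDCG")  # 100% = ideal ordering
```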
At the same time, it should be noted that BERT solves a much wider range of problems, of which recognizing the "meaning" of text is only one. A large family of language models is built on top of BERT: