Error Correction for Search Engine by Mining Bad Case

Jianyong DUAN  Tianxiao JI  Hao WANG  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.7   pp.1938-1945
Publication Date: 2018/07/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDP7284
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: query correction, Bad Case mining, N-gram model

Automatic correction of users' search terms is an important way for search engines to improve retrieval efficiency, accuracy, and user experience. In the era of big data, massive search engine logs can be analyzed and mined to uncover the information hidden in them, and statistical modeling of the query errors recorded in this log data yields better correction results. However, when an erroneous query cannot be found in the log, the information in the log cannot be used effectively to correct it. We call these undiscovered erroneous queries Bad Cases. This paper combines an error correction algorithm model with the mining and analysis of search engine query logs. First, we explore the Bad Cases that arise in the query error correction process by examining search engine query logs. Then we quantify the characteristics of these Bad Cases and build a model that allows a search engine to mine Bad Cases with these features automatically. Finally, we apply the mined Bad Cases to an N-gram error correction algorithm model to measure the impact of Bad Case mining on error correction. The experimental results show that error correction based on Bad Case mining clearly improves the precision and recall of automatic error correction. The user experience improves and the interaction becomes friendlier.
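
To make the idea of N-gram based query correction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a word-level bigram model with add-one smoothing trained on a toy query log, and it assumes the candidate corrections are already given; the abstract does not specify the N-gram order, the smoothing, or how candidates and Bad Cases enter the model. All function names and the sample log are hypothetical.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"


def train_bigram(queries):
    """Count bigram contexts and bigrams over a list of logged queries."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for q in queries:
        tokens = [START] + q.split() + [END]
        vocab.update(tokens)
        contexts.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(query, contexts, bigrams, vocab_size):
    """Add-one smoothed bigram log-probability of a query."""
    tokens = [START] + query.split() + [END]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def correct(query, candidates, model):
    """Return the highest-scoring string among the query and its candidates."""
    contexts, bigrams, vocab_size = model
    return max([query] + list(candidates),
               key=lambda q: log_prob(q, contexts, bigrams, vocab_size))


# Hypothetical toy log; a real system would train on large-scale query logs.
sample_log = [
    "weather forecast today",
    "weather forecast tomorrow",
    "weather forecast today",
    "cheap flights",
]
model = train_bigram(sample_log)
best = correct("wether forecast today",
               ["weather forecast today", "whether forecast today"],
               model)
print(best)
```

In this sketch, a misspelled query that never appears in the log still scores poorly because its bigrams are unseen, so a well-attested candidate wins. How mined Bad Cases would be fed back into such a model (for example, as additional training pairs or candidates) is left open here, since the abstract does not describe it.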