Academy & Industry Research Collaboration Center (AIRCC)

Volume 10, Number 16, November 2020

Linear Regression Evaluation of Search Engine Automatic Search Performance Based on Hadoop and R


Hong Xiong, University of California, USA


The automatic search performance of a search engine has become an essential factor in differentiating user experience. An efficient automatic search system can significantly improve search engine performance and increase user traffic. Hadoop has strong data integration and analysis capabilities, while R has excellent statistical capabilities for linear regression. This article proposes a linear regression approach based on Hadoop and R to quantify the efficiency of an automatic retrieval system. We use R's functional properties to transform the user's search results according to their linear correlations. In this way, the final output can be displayed in multiple forms rather than only as web page preview interfaces. This article offers feasible solutions to two drawbacks of current search engine algorithms: limited accuracy within the first one or two searches, and a lack of multiple types of search results. Using public datasets, we can conduct regression analysis personalized to the user's needs and optimize resource integration to surface the most relevant information.


Hadoop, R, search engines, linear regression, machine learning.