Data-intensive systems have flourished in recent decades, driven by an ever-increasing rate of data production. The Web graph is a characteristic example. Given the dynamism of the Web, we aim to study the Web graph by learning models of it and monitoring its evolution. The main problems we study are:
• identifying trends and patterns in the Web graph, using the spectral properties of the evolving Web adjacency matrix;
• monitoring Web pages’ rankings over time and predicting their future rankings;
• learning models of the evolving Web graph with statistical learning techniques.
The result of the proposed research will be a framework of approaches and algorithms that enables effective and efficient:
- query-based top-k list prediction (both future and historical lists);
- prediction-based crawling: using our predictive ranking models, crawling resources can be optimized while maintaining satisfactory top-k quality.
All of the above benefit resource management in large-scale Web search, and their added value lies in the potential adoption of these techniques by the Web search industry.