TY - CHAP
T1 - Web data reconciliation
T2 - Models and experiences
AU - Blanco, Lorenzo
AU - Crescenzi, Valter
AU - Merialdo, Paolo
AU - Papotti, Paolo
PY - 2012/12/1
Y1 - 2012/12/1
N2 - An increasing number of web sites offer structured information about recognizable concepts, relevant to many application domains, such as finance, sport, and commercial products. However, web data is inherently imprecise and uncertain, and conflicting values can be provided by different web sources. Characterizing the uncertainty of web data represents an important issue and several models have been recently proposed in the literature. This chapter illustrates state-of-the-art Bayesian models to evaluate the quality of data extracted from the Web and reports the results of an extensive application of the models on real-life web data. Experimental results show that for some applications even simple approaches can provide effective results, while sophisticated solutions are needed to obtain a more precise characterization of the uncertainty.
AB - An increasing number of web sites offer structured information about recognizable concepts, relevant to many application domains, such as finance, sport, and commercial products. However, web data is inherently imprecise and uncertain, and conflicting values can be provided by different web sources. Characterizing the uncertainty of web data represents an important issue and several models have been recently proposed in the literature. This chapter illustrates state-of-the-art Bayesian models to evaluate the quality of data extracted from the Web and reports the results of an extensive application of the models on real-life web data. Experimental results show that for some applications even simple approaches can provide effective results, while sophisticated solutions are needed to obtain a more precise characterization of the uncertainty.
UR - http://www.scopus.com/inward/record.url?scp=84893705112&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893705112&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-34213-4_1
DO - 10.1007/978-3-642-34213-4_1
M3 - Chapter
AN - SCOPUS:84893705112
SN - 9783642342127
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 15
BT - Search Computing
A2 - Ceri, Stefano
A2 - Brambilla, Marco
ER -