Towards a Better Understanding of Language Model Information Retrieval

dc.contributor.advisor	Sprinkhuizen-Kuyper, I.G.
dc.contributor.advisor	Weide, Th.P. van der
dc.contributor.author	Heijden, M. van der
dc.date.issued	2008-08-20
dc.description.abstract	Language models form a class of successful probabilistic models in information retrieval. However, knowledge of why some methods perform better than others in a particular situation remains limited. In this study we analyze which language model factors influence information retrieval performance. Starting from popular smoothing methods, we review which data features they use. Document length and a measure of document word distribution turn out to be the important factors, along with a distinction between how the probabilities of seen and unseen words are estimated. We propose a class of parameter-free smoothing methods, of which multiple specific instances are possible. Instead of parameter tuning, however, an analysis of data features should be used to decide upon a specific method. Finally, we discuss some initial experiments.	en_US
dc.identifier.uri	http://theses.ubn.ru.nl/handle/123456789/170
dc.language.iso	en	en_US
dc.thesis.faculty	Faculteit der Sociale Wetenschappen	en_US
dc.thesis.specialisation	Master Artificial Intelligence	en_US
dc.thesis.studyprogramme	Artificial Intelligence	en_US
dc.thesis.type	Master	en_US
dc.title	Towards a Better Understanding of Language Model Information Retrieval	en_US
