Abstract:
As part of an overarching project on information gathering and processing,
this thesis addresses the gathering of large quantities
of images belonging to a specific domain. A focused crawler specifically
designed to retrieve images, a focused image crawler, is proposed and
implemented. The crawling strategy is based primarily on an external
image classifier. Criteria are established to evaluate the performance of
the crawler in regard to what the overarching project might require. It was
found that the employed crawling strategy is not effective and can in some
cases misguide the crawler. Three out of four criteria to evaluate crawling
performance were met by the crawler, indicating that a focused image
crawler can be a suitable solution to the image retrieval problem. Finally,
suggestions are given for improvements on the presented approach.