Developing Eventscraper for Ugenda: How to keep a web scraper functional after a DOM change

dc.contributor.advisor: Vuurpijl, L.G.
dc.contributor.advisor: Grootjen, F.A.
dc.contributor.author: Ackermans, G.F.M.J.
dc.date.issued: 2016-08-25
dc.description.abstract: The goal of this thesis was to explore techniques that can be used to develop a web scraper that is still able to scrape web pages after their DOM has been altered. In this thesis, the modern applications of web scraping are discussed, as well as literature on existing web scraping approaches. A prototype web scraper, Eventscraper, was developed for the purpose of evaluating the performance of several web scraping techniques. This research proposes a new technique to handle DOM changes: Path distance search. It turned out to be infeasible to conduct an experiment to compare the performance of path distance search with existing techniques. However, a hypothesis on its performance has been formed, based on a detailed analysis of its behaviour. This research concludes with several suggestions for future research. [en_US]
dc.embargo.lift: 3000-12-31
dc.identifier.uri: http://theses.ubn.ru.nl/handle/123456789/5254
dc.language.iso: en [en_US]
dc.thesis.faculty: Faculteit der Sociale Wetenschappen [en_US]
dc.thesis.specialisation: Bachelor Artificial Intelligence [en_US]
dc.thesis.studyprogramme: Artificial Intelligence [en_US]
dc.thesis.type: Bachelor [en_US]
dc.title: Developing Eventscraper for Ugenda: How to keep a web scraper functional after a DOM change [en_US]
Files
Original bundle
Name: Ackermans, G._BSc_Thesis_2016.pdf
Size: 2.54 MB
Format: Adobe Portable Document Format
Description: Thesis text