Adaptable Crawler Specification Generation System for Leisure Activity RSS Feeds

Keywords
Issue Date
2015-07-08
Language
en
Abstract
When looking for an activity in a bar or trying to find a good movie, it is often difficult to find complete and correct information about the event. Hyperleap tries to solve this problem of poor information provision by bundling information from various sources and investing in thorough quality checking. Currently, information retrieval is performed using site-specific crawlers. When a crawler breaks, the feedback loop for fixing it involves several steps and requires someone with a computer science background. A crawler generation system has been created that uses directed acyclic word graphs to help address this feedback loop problem. The system allows users with no particular computer science background to create, edit, and test crawlers for RSS feeds. In this way, the feedback loop for broken crawlers is shortened, new sources can be incorporated into the database more quickly, and, most importantly, information about the latest movie screening, theater production, or conference reaches the people looking for it as fast as possible.
Faculty
Faculteit der Sociale Wetenschappen