Adaptable Crawler Specification Generation System for Leisure Activity RSS Feeds

Issue Date

2015-07-08

Language

en

Abstract

When looking for an activity in a bar or trying to find a good movie, it is often difficult to find complete and correct information about the event. Hyperleap tries to solve this problem of poor information provision by bundling information from various sources and investing in thorough quality checking. Currently, information retrieval is performed using site-specific crawlers. When a crawler breaks, the feedback loop for fixing it involves several steps and requires someone with a computer science background. A crawler generation system has been created that uses directed acyclic word graphs to help solve this feedback loop problem. The system allows users with no particular computer science background to create, edit, and test crawlers for RSS feeds. In this way, the feedback loop for broken crawlers is shortened, new sources can be incorporated into the database more quickly and, most importantly, information about the latest movie showing, theater production, or conference reaches the people looking for it as fast as possible.

Faculty

Faculteit der Sociale Wetenschappen