Text Classification - Classifying events to ugenda calendar genres

dc.contributor.advisor: Vuurpijl, L.G.
dc.contributor.advisor: Grootjen, F.A.
dc.contributor.advisor: Trooster, S.
dc.contributor.author: Crijns, T.
dc.date.issued: 2016-07-18
dc.description.abstract: Ugenda is a leading cultural event website that faces a challenge in the information management of its event calendar. The process of verifying and preparing information from venues is time-consuming, and the team is looking for a way to automate it. Certain event details, such as the event genres, are often missing. These are important for sorting the calendar. This thesis proposes a solution for automatically labeling events that lack a genre. The focus is on three subjects: event details, pre-processing techniques and classification methods. We try to find a combination that works well enough for an operating website. The pre-processing methods included natural language processing, HTML tag removal, and date, time and location feature mapping. The four classifiers used were support vector machines, logistic regression, naïve Bayes and random forest. Results show that the logistic regression classifier has the best performance with a complete setup of the proposed pre-processing methods and event details. An F1-score of 0.8110 was achieved, which is not enough for an operating website.
dc.identifier.uri: http://hdl.handle.net/123456789/1875
dc.language.iso: en
dc.thesis.faculty: Faculteit der Sociale Wetenschappen
dc.thesis.specialisation: Bachelor Artificial Intelligence
dc.thesis.studyprogramme: Artificial Intelligence
dc.thesis.type: Bachelor
dc.title: Text Classification - Classifying events to ugenda calendar genres
Files
Original bundle
Name: Crijns,T._Bachelor_Thesis_2016.pdf
Size: 380.38 KB
Format: Adobe Portable Document Format
Description: Thesis text