Organizing Flickr30k Using Text Clustering

Güclü, I.

Files

Guclu, I.-BSc-Thesis-2018.pdf (831.31 KB)

Authors

Güclü, I.

Issue Date

2018-06-18

Language

en

URI

https://theses.ubn.ru.nl/handle/123456789/7033

Abstract

Text clustering groups similar documents together based on their textual content. This thesis uses the captions provided with the Flickr30k dataset, which consists of captioned images of everyday life, to organize the images. Both main approaches to clustering, partitional and hierarchical, are implemented, using k-means and agglomerative clustering respectively. The performance of the two algorithms is assessed using internal validity measurements. The difference between the two algorithms was too small to judge which one performed better. However, the clusters they formed did differ: k-means distinguished ‘adult people’ from ‘young people’, whereas agglomerative clustering distinguished ‘people’ from ‘bullfighting’.
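
As a rough illustration of the comparison described in the abstract, the following minimal sketch clusters a handful of captions with k-means and with agglomerative clustering, then scores both results with one internal validity measure. It is not the thesis's implementation. The toy captions, the bag-of-words representation, the farthest-first seeding, the average linkage, and the choice of the silhouette coefficient are all assumptions made for this example; the abstract does not say which representation or measures were used.

```python
import math
from collections import Counter

# Hypothetical toy captions, invented for illustration (not taken from Flickr30k).
CAPTIONS = [
    "a man rides a bike down the street",
    "a man rides a bike on a road",
    "a child plays with a ball in the park",
    "a child plays with a ball on the grass",
]

def vectorize(texts):
    """Turn each caption into a unit-length bag-of-words vector (assumed representation)."""
    vocab = sorted({w for t in texts for w in t.split()})
    vectors = []
    for t in texts:
        counts = Counter(t.split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec])
    return vectors

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Partitional clustering: Lloyd's algorithm with deterministic farthest-first seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest center, then recompute each center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def agglomerative(points, k):
    """Hierarchical clustering: merge the closest pair (average linkage) until k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)  # j > i, so index i is unaffected by the pop
        clusters[i].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient, one common internal validity measure."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

vectors = vectorize(CAPTIONS)
km_labels = kmeans(vectors, 2)
ag_labels = agglomerative(vectors, 2)
km_score = silhouette(vectors, km_labels)
ag_score = silhouette(vectors, ag_labels)
print("k-means:", km_labels, round(km_score, 3))
print("agglomerative:", ag_labels, round(ag_score, 3))
```

Internal measures such as the silhouette coefficient need no ground-truth labels, which is why they suit a caption collection without predefined categories. On real data the two algorithms can reach similar scores while partitioning the captions differently, so inspecting the clusters themselves remains necessary.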

Supervisor

Kachergis, G.E.

Grootjen, F.A.

Faculty

Faculteit der Sociale Wetenschappen

Programme

Artificial Intelligence

Specialisation

Bachelor Artificial Intelligence

Collections

Faculteit der Sociale Wetenschappen
