Identifying Purchase Intentions by Extracting Information from Tweets
Social marketing is gaining importance as companies increasingly advertise on social media such as Facebook and Twitter. Consumers prefer personalised advertisements that relate to, for example, their hobbies, work, or interests. However, not all companies can afford to buy data about their potential customers. Therefore, I investigate whether an artificial intelligence approach can predict, from existing user-created content on social media, whether someone is a potential customer for a specific company or product. My approach bases its predictions on streams of short messages only, in contrast to the approaches adopted by companies such as Facebook, which involve many additional sources of data (e.g. mouse trajectories, browsing history). The predictions of the artificial intelligence approach are compared to annotations from a human expert. This annotator reads the timeline of every user in the dataset and assigns each user the label PC (Potential Customer) or non-PC (no Potential Customer). Many studies already investigate how models can process natural language in tweets and identify the sentiment of social posts, but combining these techniques to detect purchase intentions in tweets is a relatively unexplored field. In my approach, models have been trained using different machine learning algorithms and tested with 10-fold cross-validation. Two experiments have been conducted in which the performance of these models is evaluated by comparing each model's predictions to the annotations of the human expert. I have trained one model using knowledge-rich features, two models using knowledge-poor features, and one model using both. The results are compared to determine what each type of feature contributes when combined with the others. The results of the experiments answer the research question: To what extent can AI approach the predictions (potential customer vs. no potential customer) that are assigned by a human expert? The results show that the AI approach classifies a user as a potential customer too often. When the artificial intelligence model uses all available information, it can identify nearly 90% of the PC instances. However, the precision at this threshold is slightly below chance level. Several limitations motivate future work. One is that the dataset used in this study is based on the predictions of a human expert and not on actual purchase behaviour. The reliability of the artificial intelligence approach can be improved by using a dataset that contains tweets of users who are actual customers of a company. Such a dataset can be used to investigate whether the artificial intelligence model can identify someone's purchase intention before they become a real customer of the company.
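The trade-off described above, high recall on the PC class combined with low precision, can be illustrated with a minimal sketch. The function name `precision_recall` and the toy label lists are assumptions made for illustration only; they are not the data or the code used in this study.

```python
from typing import Sequence, Tuple


def precision_recall(
    gold: Sequence[str], predicted: Sequence[str], positive: str = "PC"
) -> Tuple[float, float]:
    """Return (precision, recall) for the positive class.

    `gold` holds the human expert's labels, `predicted` the model's labels.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    # True positives: expert and model both say PC.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: model says PC, expert says non-PC.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: expert says PC, model says non-PC.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels (not taken from the thesis dataset): the model
# labels users as PC too often, so it finds every PC but is often wrong.
gold = ["PC", "PC", "non-PC", "non-PC", "non-PC", "non-PC"]
predicted = ["PC", "PC", "PC", "PC", "PC", "non-PC"]

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the model recovers every user the expert labelled PC, yet fewer than half of its PC predictions agree with the expert, which is the pattern of over-predicting potential customers reported in the results.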
Faculteit der Sociale Wetenschappen