Dong Nguyen

Thesis

Text as social and cultural data: A computational perspective on variation in text

Public defense: March 10, 2017 at 16.30. University of Twente.

The defense is preceded by a symposium on Text as social and cultural data.

Massive digital datasets, such as social media data, are a promising source for studying social and cultural phenomena. They provide the opportunity to study language use and behavior in a variety of social situations on a large scale, often with detailed contextual information. However, to fully leverage their potential for research in the social sciences and the humanities, new computational approaches are needed.

This dissertation explores computational approaches to text analysis for studying cultural and social phenomena and focuses on two emerging areas: computational sociolinguistics and computational folkloristics. Both areas share the recognition that variation in text is often meaningful and may provide insights into social and cultural phenomena. To that end, the dissertation develops computational approaches to analyze and model variation in text.

Winner of the Overijssel PhD-Award (2017) and the Gerrit van Dijk Prize for the best thesis in Data Science (Dutch Data Science Awards 2018).