Hi, I am an assistant professor at Utrecht University. Previously I was a research fellow at the Alan Turing Institute. I was also affiliated with Edinburgh University. I completed my Ph.D. at the University of Twente. I received a master's degree from the Language Technologies Institute at Carnegie Mellon University and a bachelor's degree in Computer Science from the University of Twente. I have interned at Facebook (fall 2011), Microsoft Research (fall 2013), and Google (summer 2014). In fall 2015 I visited Georgia Tech.
Interested in doing a PhD with me in Utrecht? Please get in touch! More info here.
I have worked on various topics in Natural Language Processing and Information Retrieval. I'm especially interested in computational text analysis for research questions from the social sciences. Most of my research focuses on social media data.
- Computational sociolinguistics: The emerging research area of computational sociolinguistics (Nguyen et al. Computational Linguistics 2016) aims to model and analyze language variation and change using computational approaches.
Recent (selected) articles: automatic detection of semantic change (Shoemark and Liza et al., EMNLP 2019) and analysis of dialect variation in social media (Nguyen, book chapter to appear 2020).
Funded projects: NWO Veni (2020-2024).
- NLP for computational social science: The development of NLP tools to study social phenomena, often in close collaboration with researchers from the social sciences.
Recent (selected) articles: How we do things with words: Analyzing text as social and cultural data (Nguyen et al. arXiv 2019).
Funded projects: NWO digital society 'Digitale samenleving - de geïnformeerde burger' (2020-2024). I'm also involved in the Alan Turing Institute (UK) projects The (mis)informed citizen and Hate speech: measures and counter-measures.
- Explainable NLP: Making NLP systems transparent and/or explainable.
Recent articles: evaluation of local explanations (Nguyen, NAACL 2018).
Funded projects: UU Research IT innovation fund 'PROVEE: Progressive Explainable Embeddings' (2020).
I defended my PhD thesis Text as social and cultural data: A computational perspective on variation in text at the University of Twente in 2017. I explored computational approaches to text analysis for studying cultural and social phenomena, focusing on textual variation in social media and folk narratives. The thesis won the Overijssel PhD-Award (2017) and the Gerrit van Dijk Prize for the best thesis in Data Science (Dutch Data Science Awards 2018).
- I will give an invited tutorial on word embeddings at COMPTEXT 2020, May 14, 2020, Innsbruck, Austria.
- Together with Vincent Traag I organized a 4TU seminar on Computational Social Science, 7 April, 2017.
- I gave a tutorial on NLP for computational social science at the Language, Data and Knowledge (LDK) conference (2017). [webpage] [slides]
- I gave a tutorial on Computational Sociolinguistics at the 3rd International Conference on Computational Social Science (2017).
- Digital diasporas: Interdisciplinary perspectives conference in London, 2019.
- Lorentz workshop: New methods in computational sociolinguistics, Leiden 2018
- Computational sociolinguistics workshop at NWAV47, New York 2018. (slides are here)
- Workshop on "Bridging disciplines in analysing text as social and cultural data", 21-22 September 2017 at the Turing Institute in London.
- Co-organizer Federated Web Search track at TREC (2013-2014).
- Program Committees: EMNLP (2013, 2017-2018), ACL (2014-2018), EACL 2017, ECIR (2015-2019), NAACL (2016, 2018-2019), SIGIR short + full (2015-2016), CIKM (2015-2016), FAT* 2019, ICWSM 2019.
Area chair: ACL 2019 (Multidisciplinary and COI), EMNLP 2019 (Social Media and Computational Social Science), and EMNLP 2020 (senior area chair Computational Social Science and Social Media).
- Media coverage: New York Times, Time Magazine, New Scientist, Radio 538, Volkskrant, etc.
- Invited talks: Keynote at COMPTEXT 2020, Digital language research: Computation meets interaction workshop at Universität Hamburg (2019), University of Sheffield (2019), University of Exeter (2019), University of Greenwich (2019), Saarland University (2019), The Future of AI: Language, Technology and gender workshop at the University of Cambridge (2019), University of Cambridge (2019), SAGE Ocean Speaker Series (2018), Bell Labs Cambridge (2018), University of Amsterdam (2018), Data Science Festival meetup (2018), Understanding Euroscepticism Through the Lens of Big-Data workshop (2017), Women in Machine Intelligence Dinner (2017), etc.
- A video on YouTube of a talk I gave in Feb 2017 summarizing some of my work.
- Featured in Women in NLP spotlights.
- Special issue on computational sociolinguistics.
- Awards/Fellowships: KNAW Early Career Award (2019), Veni grant (2019), Gerrit van Dijk Award (2018), Overijssel PhD Award (2017), Alan Turing Institute Fellowship (2016), UT in the media award (2013), Fulbright scholarship (2009) and the WO Echo Beta Techniek award (2007).
- Teaching: At Utrecht University I'm one of the teachers of the Methods in AI Research course (AI MSc, intro to ML/NLP) and the social computing course (AI MSc). Next year I will teach a new course on human-centered machine learning (AI MSc).