Publications

Google Scholar profile | Github

2023

C. Fang, Q. Fang, and D. Nguyen. Epicurus at SemEval-2023 Task 4: Improving Prediction of Human Values behind Arguments by Leveraging Their Definitions. SemEval 2023 at ACL 2023 [pdf] [Github]

Y. Du and D. Nguyen. Measuring the instability of fine-tuning. ACL 2023 [pdf] [Github]

G. Juijn, N. Stoimenova, J. Reis and D. Nguyen. Perceived algorithmic fairness using organizational justice theory: An empirical case study on algorithmic hiring. AIES 2023 [pdf] [Github]

2022

A. Wegmann, M. Schraagen and D. Nguyen. Same author or just same topic? Towards content-independent style representations. Workshop on Representation Learning for NLP [pdf]. [Github]

Q. Fang, D. Nguyen, D. L. Oberski. Evaluating the construct validity of text embeddings with application to survey questions. EPJ Data Science [link]

I. Bilal, B. Wang, A. Tsakalidis, D. Nguyen, R. Procter, M. Liakata. Template-based abstractive microblog opinion summarisation. TACL, 2022 [link]

2021

Y. Du, Q. Fang and D. Nguyen. Assessing the reliability of word embedding gender bias measures. EMNLP 2021 [pdf]. [Github]

A. Wegmann and D. Nguyen. Does It Capture STEL? A Modular, Similarity-based Linguistic Style Evaluation Framework. EMNLP 2021 [pdf]. [Github]

P. Röttger, B. Vidgen, D. Nguyen, Z. Waseem, H. Margetts and J. Pierrehumbert. HateCheck: Functional tests for hate speech detection models. ACL 2021. [pdf] [Github] [MIT Technology Review]

B. Vidgen, D. Nguyen, H. Margetts, P. Rossini and R. Tromble. Introducing CAD: the Contextual Abuse Dataset. NAACL 2021. [pdf] [Github]

D. Nguyen, L. Rosseel and J. Grieve. On learning and representing social meaning in NLP: a sociolinguistic perspective. NAACL 2021. [pdf]

D. Nguyen. Dialect variation on social media. Similar Languages, Varieties, and Dialects (Studies in Natural Language Processing series), edited by Marcos Zampieri and Preslav Nakov. [link] [pdf (preprint)]

A. Robertson, F. F. Liza, D. Nguyen, B. McGillivray and S. Hale. Semantic journeys: Quantifying change in emoji meaning from 2012-2018. 4th International Workshop on Emoji Understanding and Applications in Social Media (Emoji2021) at ICWSM. [Emoji dashboard] [pdf]

2020

D. Nguyen, J. Grieve. Do word embeddings capture spelling variation? COLING 2020. [pdf] [Github] [video]

D. Nguyen, M. Liakata, S. DeDeo, J. Eisenstein, D. Mimno, R. Tromble, J. Winters. How we do things with words: Analyzing text as social and cultural data. Frontiers in Artificial Intelligence, section Language and Computation. [link]

N. Peinelt, D. Nguyen and M. Liakata. tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection. ACL 2020. [link] [Github]

2019

P. Shoemark*, F. F. Liza*, D. Nguyen, S. A. Hale, B. McGillivray. Room to glo: A systematic comparison of semantic change detection approaches with word embeddings. EMNLP 2019. [pdf] [Github] *These authors contributed equally to the study

N. Peinelt, M. Liakata and D. Nguyen. Aiming beyond the obvious: Identifying non-obvious cases in semantic similarity datasets. ACL 2019. [link] [Github]

B. Vidgen, A. Harris, D. Nguyen, R. Tromble, S. Hale and H. Margetts. Challenges and frontiers in abusive content detection. The 3rd Workshop on Abusive Language Online at ACL 2019. [pdf]

2018

D. Nguyen. Comparing automatic and human evaluation of local explanations for text classification. NAACL 2018. [pdf] [Github]

D. Nguyen, B. McGillivray, T. Yasseri. Emo, love, and god: Making sense of Urban Dictionary, a crowd-sourced online dictionary. Royal Society Open Science. [link]

F. Nanni, G. Glavaš, S. P. Ponzetto, S. Tonelli, N. Conti, A. Aker, A. Palmero Aprosio, A. Bleier, B. Carlotti, T. Gessler, T. Henrichsen, D. Hovy, C. Kahmann, M. Karan, A. Matsuo, S. Menini, D. Nguyen, A. Niekler, L. Posch, F. Vegetti, Z. Waseem, T. Whyte and N. Yordanova. Findings from the hackathon on understanding euroscepticism through the lens of textual data. ParlaCLARIN 2018 at LREC 2018. [pdf]

2017

D. Nguyen. Text as social and cultural data: A computational perspective on variation in text. PhD thesis, University of Twente. [pdf]

D. Nguyen and J. Eisenstein. A Kernel Independence Test for Geographical Language Variation. Computational Linguistics, Volume 43, Issue 3, Pages 567-592. [pdf] [Github] (also presented at NWAV44)

2016

D. Nguyen, A.S. Doğruöz, C.P. Rosé, and F.M.G. de Jong. Computational Sociolinguistics: A Survey. Computational Linguistics. Vol. 42, No. 3, Pages 537-593. [Link to CL article] [arXiv version, pdf].

D. Nguyen and L. Cornips. Automatic Detection of Intra-Word Code-Switching. The 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology. [pdf]

T. Meder, F. Karsdorp, D. Nguyen, M. Theune, D. Trieschnigg, I. Muiser: The Dutch Folktale Database: From Automatic Enrichment with Metadata towards Research into Variation, Narrative Building Blocks and 'Grammar'. Journal of American Folklore. [link]

2015

D. Nguyen, D. Trieschnigg and L. Cornips: Audience and the Use of Minority Languages on Twitter. ICWSM 2015. [pdf] [slides (LiM6)] (also presented at the 6th International Language in the Media conference)

D. Nguyen, T. van den Broek, C. Hauff, D. Hiemstra and M. Ehrenhard: #SupportTheCause: Identifying Motivations to Participate in Online Health Campaigns. EMNLP 2015. [pdf] [data]

T. Meder, D. Nguyen and R. Gravel: The Apocalypse on Twitter. Digital Scholarship in the Humanities. [link]

T. Demeester, D. Trieschnigg, K. Zhou, D. Nguyen and D. Hiemstra: FedWeb Greatest Hits: Presenting the New Test Collection for Federated Web Search. WWW 2015. [pdf] [data]

T. Demeester, R. Aly, D. Hiemstra, D. Nguyen and C. Develder: Predicting Relevance based on Assessor Disagreement: Analysis and Practical Applications for Search Evaluation. Information Retrieval Journal [link]

N. Dwi Prasetyo, C. Hauff, D. Nguyen, T. van den Broek and D. Hiemstra: On the Impact of Twitter-based Health Campaigns: A Cross-Country Analysis of Movember. The Sixth International Workshop on Health Text Mining and Information Analysis at EMNLP 2015 [pdf]

2014

D. Nguyen, D. Trieschnigg and M. Theune: Using Crowdsourcing to Investigate Perception of Narrative Similarity. CIKM 2014. [pdf] [slides]

D. Nguyen, D. Trieschnigg, A. S. Doğruöz, R. Gravel, M. Theune, T. Meder and F. de Jong: Why Gender and Age Prediction from Tweets is Hard: Lessons from a Crowdsourcing Experiment. COLING 2014. [pdf]

D. Nguyen, D. Trieschnigg and T. Meder: TweetGenie: Development, Evaluation, and Lessons Learned. COLING 2014 (demo paper). [pdf]

E. Papalexakis, D. Nguyen, and A. S. Doğruöz: Predicting Code-switching in Multilingual Communication for Immigrant Communities. The Workshop on Computational Approaches to Code Switching at EMNLP 2014.

T. Demeester, R. Aly, D. Hiemstra, D. Nguyen, D. Trieschnigg and C. Develder: Exploiting User Disagreement for Search Evaluation: an Experimental Approach. WSDM 2014. [pdf]

K. Zhou, T. Demeester, D. Nguyen, D. Hiemstra and D. Trieschnigg: Aligning Vertical Collection Relevance with User Intent. CIKM 2014. [pdf]

T. Demeester, D. Trieschnigg, D. Nguyen, K. Zhou, D. Hiemstra: Overview of the TREC 2014 Federated Web Search Track. TREC 2014. [pdf]

2013

D. Nguyen, A.S. Doğruöz: Word Level Language Identification in Online Multilingual Communication. EMNLP 2013. [pdf] [data]

D. Nguyen, R. Gravel, D. Trieschnigg and T. Meder: "How Old Do You Think I Am?": A Study of Language and Age in Twitter. ICWSM 2013. [pdf]
Press: NYTimes, NewScientist, Time, United Press International, Radio 538, etc.

D. Nguyen, R. Gravel, D. Trieschnigg and T. Meder: TweetGenie: Automatic Age Prediction From Tweets. ACM SIGWEB Newsletter, Issue Autumn, Autumn 2013.
[url] (Invited article, summary of our ICWSM 2013 paper)

D. Nguyen, D. Trieschnigg, M. Theune: Folktale Classification using Learning to Rank. ECIR 2013. [pdf] [slides]

T. Demeester, D. Trieschnigg, D. Nguyen and D. Hiemstra: Overview of the TREC-2013 Federated Web Search Track. TREC 2013. [pdf]

T. Demeester, D. Nguyen, D. Trieschnigg, C. Develder and D. Hiemstra: Snippet-based Relevance Predictions for Federated Web Search. ECIR 2013. [pdf]

D. Trieschnigg, D. Nguyen, M. Theune: Learning to Extract Folktale Keywords. Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities at ACL 2013. [pdf]

D. Trieschnigg, D. Nguyen, T. Meder: In search of Cinderella: A transaction log analysis of folktale searchers. Workshop on the Exploration, Navigation and Retrieval of Information in Cultural Heritage at SIGIR 2013. [pdf]

2012

D. Nguyen, T. Demeester, D. Trieschnigg, D. Hiemstra: Federated Search in the Wild. CIKM 2012. [pdf] [data]

D. Nguyen, D. Trieschnigg, T. Meder and M. Theune: Automatic classification of folk narrative genres. First International Workshop on Language Technology for Historical Text(s) at KONVENS 2012. [pdf]

D. Nguyen and D. Hiemstra: Ensemble clustering for result diversification. Proceedings of the Twenty First Text REtrieval Conference (TREC 2012). National Institute of Standards and Technology, special publication. [pdf]

T. Demeester, D. Nguyen, D. Trieschnigg, C. Develder and D. Hiemstra: What Snippets Say About Pages in Federated Web Search. AIRS 2012. [pdf]

2011

D. Nguyen and C. P. Rosé: Language use as a reflection of socialization in online communities. Workshop on Language in Social Media at ACL 2011. [pdf]

D. Nguyen, N. A. Smith and C. P. Rosé: Author Age Prediction from Text using Linear Regression. Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities at ACL 2011. [pdf]

2010

D. Nguyen and J. Callan: Combination of evidence for effective web search. The Nineteenth Text REtrieval Conference Proceedings (TREC 2010). National Institute of Standards and Technology, special publication. [pdf]

D. Nguyen, E. Mayfield and C. P. Rosé: An Analysis of Perspectives in Interactive Settings. Workshop on Social Media Analytics at KDD 2010. [pdf]

S. V. Oberoi, D. Nguyen, S. Finger and C. P. Rosé: Automatic extraction of conceptual maps from design team documents. 7th Symposium on International Design and Design Education (DEC), Montreal, Canada, August 15-18, 2010

S. V. Oberoi, D. Nguyen, G. Gweon, S. Finger and C. P. Rosé: DesignWebs: A Tool for Automatic Construction of Interactive Conceptual Maps from Document Collections. Intelligent Tutoring Systems (2) 2010: 387-389 [poster description, pdf]

H. Ai, R. Kumar, D. Nguyen, A. Nagasunder and C. P. Rosé: Exploring the Effectiveness of Social Capabilities and Goal Alignment in Computer Supported Collaborative Learning. Intelligent Tutoring Systems 2010: 134-143

2009

D. Nguyen, A. Overwijk, C. Hauff, R.B. Trieschnigg, D. Hiemstra, F.M.G. de Jong: WikiTranslate: Query Translation for Cross-lingual Information Retrieval using only Wikipedia. Proceedings CLEF 2008, LNCS, Springer, August 2009. [pdf] [poster]

A. Overwijk, D. Nguyen, C. Hauff, R.B. Trieschnigg, D. Hiemstra, F.M.G. de Jong: On the Evaluation of Snippet Selection for Information Retrieval. Proceedings CLEF 2008, LNCS, Springer, August 2009. [pdf]