Dr. Aron Culotta

Assistant Professor, Department of Computer Science
Northeastern Illinois University, Chicago, IL 60625



The proliferation of social media -- such as Twitter, Facebook, blogs, and Web forums -- has created an unprecedented, continuous stream of messages containing the thoughts, opinions, and beliefs of millions of people. In addition to the primary benefit users of this technology enjoy, a secondary benefit is emerging as scientists discover how to analyze this new data source to provide insights into society. Evidence is mounting that such analysis can be valuable in understanding public health, finance, politics, social unrest, and natural disasters.

The goal of my research is twofold: (1) to leverage this source of data to advance research in automated processing of informal human communication; and (2) to apply these techniques to analyze trends in social media and produce socially beneficial technology. My research contributions can be categorized into three main areas:

Press: Our work in social media analysis has been discussed in several press outlets, including the Wall Street Journal, The Atlantic, CNET, and the Communications of the ACM.
(see also my Google Scholar page)

2012 A demographic analysis of online sentiment during Hurricane Irene
Benjamin Mandel, Aron Culotta, John Boulahanis, Danielle Stark, Bonnie Lewis, Jeremy Rodrigue
NAACL-HLT Workshop on Language in Social Media, 2012
Lightweight methods to estimate influenza rates and alcohol sales volume from Twitter messages
Aron Culotta
Language Resources and Evaluation, Special Issue on Analysis of Short Texts on the Web, 2012
Preprint. Final version available at springer.com
2011 SampleRank: Training factor graphs with atomic gradients
Michael Wick, Khashayar Rohanimanesh, Kedar Bellare, Aron Culotta, Andrew McCallum
Proceedings of the International Conference on Machine Learning (ICML), 2011
2010 Detecting influenza epidemics by analyzing Twitter messages
Aron Culotta
arXiv:1007.4748v1 [cs.IR], 2010
Towards detecting influenza epidemics by analyzing Twitter messages
Aron Culotta
KDD Workshop on Social Media Analytics, 2010
2009 SampleRank: Learning preferences from atomic gradients
Michael Wick, Khashayar Rohanimanesh, Aron Culotta, Andrew McCallum
Neural Information Processing Systems (NIPS) Workshop on Advances in Ranking, 2009
An entity-based model for coreference resolution
Michael Wick, Aron Culotta, Khashayar Rohanimanesh, Andrew McCallum
SIAM International Conference on Data Mining, 2009
2008 Learning and inference in weighted logic with application to natural language processing
Aron Culotta
Ph.D. Thesis, University of Massachusetts, Amherst, 2008
2007 Canonicalization of Database Records using Adaptive Similarity Measures
Aron Culotta, Michael Wick, Robert Hall, Matthew Marzilli, Andrew McCallum
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2007
Sparse Message Passing Algorithms for Weighted Maximum Satisfiability
Aron Culotta, Andrew McCallum, Bart Selman, Ashish Sabharwal
New England Student Colloquium on Artificial Intelligence (NESCAI), 2007
Author Disambiguation using Error-driven Machine Learning with a Ranking Loss Function
Aron Culotta, Pallika Kanani, Robert Hall, Michael Wick, Andrew McCallum
Sixth International Workshop on Information Integration on the Web (IIWeb-07), 2007
First-Order Probabilistic Models for Coreference Resolution
Aron Culotta, Michael Wick, Robert Hall, Andrew McCallum
Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2007
2006 Corrective Feedback and Persistent Learning for Information Extraction
Aron Culotta, Trausti Kristjansson, Andrew McCallum, Paul Viola
Artificial Intelligence, 2006
Tractable Learning and Inference with High-Order Representations
Aron Culotta, Andrew McCallum
International Conference on Machine Learning Workshop on Open Problems in Statistical Relational Learning, 2006
Learning field compatibilities to extract database records from unstructured text
Michael Wick, Aron Culotta, Andrew McCallum
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2006
Practical Markov logic containing first-order quantifiers with application to identity uncertainty
Aron Culotta, Andrew McCallum
Human Language Technology Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing (HLT/NAACL), 2006
Integrating probabilistic extraction models and data mining to discover relations and patterns in text
Aron Culotta, Andrew McCallum, Jonathan Betz
Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2006
2005 Learning clusterwise similarity with first-order features
Aron Culotta, Andrew McCallum
Neural Information Processing Systems (NIPS) Workshop on the Theoretical Foundations of Clustering, 2005
A conditional model of deduplication for multi-type relational data
Aron Culotta, Andrew McCallum
University of Massachusetts IR-443, 2005
Joint deduplication of multiple record types in relational data
Aron Culotta, Andrew McCallum
ACM CIKM International Conference on Information and Knowledge Management, 2005
Reducing labeling effort for structured prediction tasks
Aron Culotta, Andrew McCallum
The Twentieth National Conference on Artificial Intelligence (AAAI), 2005
Gene prediction with conditional random fields
Aron Culotta, David Kulp, Andrew McCallum
University of Massachusetts, Amherst UM-CS-2005-028, 2005
2004 Dependency tree kernels for relation extraction
Aron Culotta, Jeffrey Sorensen
42nd Annual Meeting of the Association for Computational Linguistics (ACL), 2004
Interactive information extraction with constrained conditional random fields
Trausti Kristjansson, Aron Culotta, Paul Viola, Andrew McCallum
Nineteenth National Conference on Artificial Intelligence (AAAI), 2004
Best Paper Award (Honorable Mention)
Confidence estimation for information extraction
Aron Culotta, Andrew McCallum
Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2004
Extracting social networks and contact information from email and the Web
Aron Culotta, Ron Bekkerman, Andrew McCallum
First Conference on Email and Anti-Spam (CEAS), 2004
2003 Maximizing cascades in social networks
Aron Culotta
University of Massachusetts, 2003
Spring 2013 CS207: Programming II
CS300: Client-side Web Development
Fall 2012 CS207: Programming II
CS300: Client-side Web Development

Courses taught at previous institutions:
Some data sets I've created to train and evaluate machine learning algorithms: