Topic modeling is a simple way to analyze large volumes of unlabeled text. A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents (Wikipedia <http://en.wikipedia.org/wiki/Topic_model>). Thus, a "topic" consists of a cluster of words that frequently occur together. Using contextual clues, topic models can connect words with "similar" meanings and distinguish between uses of words with multiple meanings. For a general introduction to topic modeling, see, for example, Steyvers and Griffiths (2007).
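To make the idea of a "topic" as a cluster of co-occurring words concrete, here is a minimal sketch of Latent Dirichlet Allocation (LDA), one common topic model, fitted with collapsed Gibbs sampling as described by Steyvers and Griffiths (2007). It is only an illustration under stated assumptions: the function name lda_gibbs, the hyperparameter values alpha and beta, and the four toy English documents are invented for this example and are not part of the project material or of any particular toolkit.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenised documents with collapsed Gibbs sampling.

    Returns (doc_topic, topic_word): token counts per topic for each
    document, and word counts for each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]       # tokens per topic, per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                     # total tokens per topic

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]

                # Sample a new topic and add the token back.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


# Invented toy corpus: two "clinical" and two "literary" documents.
docs = [
    "patient treatment dose patient clinic".split(),
    "novel poem author novel reader".split(),
    "dose clinic treatment patient dose".split(),
    "author reader poem novel author".split(),
]

doc_topic, topic_word = lda_gibbs(docs, num_topics=2)

for k, counts in enumerate(topic_word):
    top = [w for w, c in counts.most_common(3) if c > 0]
    print(f"topic {k}: {top}")
```

On real collections one would normally use an established toolkit rather than a hand-written sampler; the sketch only shows how word co-occurrence within documents drives the topic assignments.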
The topic modeling resources will be applied to two kinds of textual material: i) Swedish literature collections and ii) Swedish biomedical texts. The purpose is, e.g., to identify topics that rose or fell in popularity; classify text passages (cf. Jockers, 2011); visualize topics with authors (cf. Meeks, 2011); and identify potential issues of interest for historians, literary scholars or others (cf. Yang et al., 2011).
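One possible way to see whether a topic rose or fell in popularity is to average each topic's share over all documents from the same year and compare the years. The sketch below assumes that a topic model has already produced a list of topic proportions for every document; the function name topic_share_by_year, the years, and the proportions are invented for illustration.

```python
from collections import defaultdict


def topic_share_by_year(doc_years, doc_topic_props):
    """Average the topic proportions of all documents from the same year."""
    sums = {}
    counts = defaultdict(int)
    for year, props in zip(doc_years, doc_topic_props):
        if year not in sums:
            sums[year] = [0.0] * len(props)
        for k, p in enumerate(props):
            sums[year][k] += p
        counts[year] += 1
    # Years are returned in chronological order.
    return {
        year: [s / counts[year] for s in sums[year]]
        for year in sorted(sums)
    }


# Invented example: four documents, two topics, two publication years.
doc_years = [1880, 1880, 1900, 1900]
doc_topic_props = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]]

shares = topic_share_by_year(doc_years, doc_topic_props)
for year, props in shares.items():
    print(year, [round(p, 2) for p in props])
```

In this made-up data the first topic's average share drops between the two years while the second one grows, which is the kind of trend the project aims to identify.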
Available software to be used:
Good programming skills
Not necessary to have Swedish as mother tongue!
Blei DM. 2012. Probabilistic topic models. Communications of the ACM. vol. 55 no. 4. <http://www.cs.princeton.edu/~blei/papers/Blei2012.pdf>
Jockers M. 2011. Who's your DH Blog Mate: Match-Making the Day of DH Bloggers with Topic Modeling. Matthew L. Jockers, posted 19 March 2010.
Meeks E. 2011. Comprehending the Digital Humanities. Digital Humanities Specialist, posted 19 February 2011.
Steyvers M. and Griffiths T. 2007. Probabilistic Topic Models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum. <http://psiexp.ss.uci.edu/research/papers/SteyversGriffithsLSABookFormatted.pdf>
Yang T., Torget A. and Mihalcea R. (2011) Topic Modeling on Historical Newspapers. Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities. The Association for Computational Linguistics, Madison, WI. pages 96–104.
Extensive Topic Modeling bibliography: <http://www.cs.princeton.edu/~mimno/topics.html>