Paper Machines is a topic modelling and visualization tool available as a plugin for Zotero. It applies a selection of text mining processes to Zotero bibliographic collections and lets users export a variety of visualizations of the results, such as word clouds, phrase nets or heat maps.
CATMA (Computer Aided Textual Markup and Analysis) is a free, open source markup and analysis tool from the University of Hamburg's Department of Languages, Literature and Media. It incorporates three interactive modules: a tagger for textual markup and markup editing, an analyzer incorporating a query language and predefined functions, and a query builder that lets users construct queries from combinations of predefined questions and then modify them manually for more specific questions. It also interfaces with the Voyant toolset. As of version 4.1, CATMA is a web application with collaborative work functions and improvements to its user interface, queries and corpus analysis capacity.
TextArc is a free visualization tool that represents an entire text on a single page. It combines elements of an index, a concordance and a summary in one place, encouraging the viewer to use its juxtapositions to uncover meaning. The web-based applet is preloaded with thousands of texts collected from Project Gutenberg. Note: TextArc was designed for browsers circa 2002 and requires Java SE 6 to run.
KH Coder is a tool for quantitative content analysis and text mining that has been under continuous development since 2001. It was originally developed for Japanese text and now supports numerous other languages, including English, Italian and French. KH Coder also has computational linguistics applications.
LitStats is a tool for the statistical analysis of natural language texts, developed in the 1980s by Dr. Stephen Reimer of the University of Alberta. From an ASCII text, it can generate word frequency counts, word lengths, initial letter frequencies, sentence length frequencies and verbal segment frequencies. It was originally developed for an IBM 3033, and can still be run on systems using Windows XP.
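As a rough illustration of the kinds of counts listed above, the following Python sketch computes word, word-length, initial-letter and sentence-length frequencies from a plain text. It is not LitStats's own code, and it does not claim to reproduce its results. The function name `text_stats` and the tokenization rules (what counts as a word, where a sentence ends) are assumptions made for this example, and verbal segment frequencies are left out because the term is not defined here.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, optionally followed by
# one apostrophe-joined suffix (e.g. "don't"). LitStats may tokenize differently.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")

# Assumption: a sentence ends at one or more of '.', '!' or '?'.
SENTENCE_END_RE = re.compile(r"[.!?]+")


def text_stats(text: str) -> dict:
    """Return simple frequency tables for a plain (ASCII) text."""
    words = WORD_RE.findall(text.lower())

    # Number of words in each sentence; segments containing no words are skipped.
    sentence_lengths = [
        len(WORD_RE.findall(segment.lower()))
        for segment in SENTENCE_END_RE.split(text)
    ]
    sentence_lengths = [n for n in sentence_lengths if n > 0]

    return {
        "word_freq": Counter(words),
        # Word length counts every character in the token, apostrophe included.
        "word_length_freq": Counter(len(w) for w in words),
        "initial_letter_freq": Counter(w[0] for w in words),
        "sentence_length_freq": Counter(sentence_lengths),
    }
```

Calling `text_stats("The cat sat. The dog ran away!")["word_freq"].most_common(3)` lists the most frequent words first. The other three tables are ordinary `Counter` objects and can be inspected in the same way.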
DocuBurst is a free web-based visualization tool for exploring the contents of a text. Visitors can upload their own texts or view those provided by others. DocuBurst presents an interactive chart called a ‘radial sunburst’ diagram, which organizes the nouns extracted from the user-supplied text by meaning and colours them by frequency, revealing common themes in the text. Proper nouns (e.g. character names) are shown in a linked word cloud. The visualization can be zoomed, filtered, or refocused to target types of words of interest (e.g. “animal” words or “feeling” words), and a comparison tool contrasts word use across two documents. DocuBurst views can be bookmarked, annotated, shared, and embedded in a user's own website.