TAPoR 2.0

Discover Research Tools for Textual Study

  • Browse Tools by Type or Tag
  • Search and Use Tools
  • Read and Create Tool Reviews
  • Contribute and Advertise Tools

Popular Tools

LIWC (Linguistic Inquiry and Word Count)

LIWC (Linguistic Inquiry and Word Count) is a text analysis program available for purchase. It calculates the degree to which various categories of words are used in a text, and can process texts ranging from e-mails to speeches, poems and transcribed ...

Voyant Corpus Summary

Corpus Summary is a tool that provides a simple, textual overview of the current corpus. Features of this tool include number of words, number of unique words, longest documents, highest vocabulary density, most frequent words, notable peaks in frequency, ...

Prism

Prism is a visualization tool that represents texts based on crowdsourced interpretations. Users are encouraged to highlight words in a text according to categories ("facets"). Users' individual interpretations contribute to a visualization combining ...

Stanford Vis Group: d3.js - Data Driven Documents

D3.js is a free, open source JavaScript library for manipulating documents with data utilizing HTML5, SVG and CSS3. It is designed to create visualizations that work with current web standards to make the best possible use of the most recent browsers ...

Voyant Lava

Lava allows you to view multiple levels of a corpus in a three-dimensional environment. Clicking on certain documents within the corpus expands the Lava visualization in a ring to explore further.

TUSTEP

TUSTEP (Tübingen System of Text Processing Tools) is a free, open source, widely used toolbox for text processing. It is aimed at scholarly audiences, can work with texts in both Latin and non-Latin scripts, and is primarily designed for humanities applications. ...

WordStat

WordStat is a commercial software package for content analysis and text mining on natural language text such as interview transcripts. It also includes features for automatic document tagging and classification.

Mallet

Mallet is a Java-based library and command-line framework that provides statistical and machine learning tools for use with natural language processing.

Principal Components Analysis on Plain Text - Beta (TAPoRware)

This tool applies Principal Components Analysis rules to a text to generate relationships between words and text units. It works best with large texts where users can specify units of over 500 words. HTML and XML versions are not currently available. ...

IBM: Many Eyes

Many Eyes is a free collection of data visualization tools enabling exploration and discussion of the data. Users who post comments on a visualization may also save their view for others to see in conjunction with their comment. Visualizations can be ...

SCAN

SCAN was a conversational programming language available in the 1970s for text analysis. It was specific to text processing and could be used to divide a text into sentences or words or split on separators. It was capable of running counts on a text, printing ...

DISCAN

DISCAN was a tool for content and discourse analysis originally available for mainframe computers, and adapted for PCs with DOS 3.0 in 1991. It accepted ASCII files, could create KWIC and cooccurrence listings, and had modules for both creating content ...

word tree

word tree is a free, web-based tool for generating dynamic word trees from user-supplied texts. Users can paste their text directly in the box provided, enter a URL or Twitter handle in the search bar, or install the bookmarklet into their browser's ...

HyperPo

HyperPo is an important legacy tool, developed as the first web-based text analysis tool aimed at humanities scholars and available from 1996 through 2006. Users could input a web address, upload a file or directly enter text for analysis. HyperPo's interface ...

Transana

Transana is a free, open source program for transcribing and analyzing large collections of video and audio data, offered by the University of Wisconsin-Madison Center for Education Research and aimed at academic researchers. It enables users to manually ...

Keywords Finder - Beta (TAPoRware)

This tool identifies keywords or key phrases within a user-specified text, using the assumption that they will appear with the greatest frequency. It applies a stemmer to every word. Plain text input is recommended. All tags will be stripped from an ...

DM (Digital MappaeMundi)

The DM (Digital MappaeMundi) is an environment for studying and annotating images and texts. It enables users to link together images, texts, or fragments of images or texts, such as a textual annotation of an image or text. It is aimed at scholars ...

Stanford NLP Group: Stanford Word Segmenter

Stanford Word Segmenter is a free, open source Java-based tokenization tool for Chinese and Arabic text that integrates token pre-processing, or segmentation. For Arabic, the tool processes text according to the Penn Arabic Treebank 3 standard. For ...

Mandala Browser

Mandala Browser is a rich-prospect browsing interface for exploring a data set in .txt, .rtf, .pdf, .csv or .xml format. Searches can be constrained by columns or fields. A version of this tool is also available in the Voyant toolset.

CLAS (Computerized Language Analysis System)

CLAS (Computerized Language Analysis System) was an important historic text analysis system available in the 1970s. It was written in PL/I for IBM 360/370 punch card machines and performed standard statistical tests and concordances on natural language ...

List Words - XML (TAPoRware)

This tool lists words in an XML document, either uploaded by the user or from a web address. List Words works with relatively small texts of under a megabyte in size. It is part of the TAPoRware collection of tools; there are HTML and plain text versions ...
View tools by tag:
1960s 1970s 1980s 1990s 2000s 2010s American Canadian Comparator Dutch English English (language) French (language) German Historic Java Metadata Multilingual Natural language processing Social media