TAPoR 2.0

Discover Research Tools for Textual Study

  • Browse Tools by Type or Tag
  • Search and Use Tools
  • Read and Create Tool Reviews
  • Contribute and Advertise Tools

Popular Tools
User Recommended Tools
Random Tools

word tree

word tree is a free, web-based tool for generating dynamic word trees from user-supplied texts. Users can paste their text directly in the box provided, enter a URL or Twitter handle in the search bar, or install the bookmarklet into their browser's ...

Voyant Document Term Frequencies

Document Term Frequencies shows word frequencies for each document in the corpus. You can see the selected word at the top of the window highlighted in yellow. Its relevance to the documents is shown in the table below.

WordStat

WordStat is a commercial software package for content analysis and text mining on natural language text such as interview transcripts. It also includes features for automatic document tagging and classification.

Trend Miner

Trend Miner is a tool designed to enable portable open-source real-time methods for cross-lingual mining and summarizing of large-scale stream media. It combines elements from natural language processing, knowledge-based reasoning, machine learning, ...

Voyant Corpus Summary

Corpus Summary is a tool that provides a simple, textual overview of the current corpus. Features of this tool include number of words, number of unique words, longest documents, highest vocabulary density, most frequent words, notable peaks in frequency, ...

INTEX

INTEX is a linguistic development environment that was active until 2005. It includes large-coverage dictionaries and grammars, can parse texts of several million words in real time, and provides tools to create and maintain large-coverage lexical resources, morphological ...

List Words - Plain Text (TAPoRware)

This tool lists the words in a plain text document, either uploaded by the user or retrieved from a web address. List Words works with relatively small texts of under a megabyte in size. It is part of the TAPoRware collection of tools; there are XML and HTML versions ...
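
As a rough illustration of what a word-listing tool of this kind produces, the sketch below counts the words in a plain text string and returns them by descending frequency. It is not TAPoRware's implementation; the function name `list_words`, the tokenizing pattern, and the lowercasing step are all assumptions made for the example.

```python
import re
from collections import Counter


def list_words(text, top_n=None):
    """Return (word, count) pairs sorted by descending count.

    Hypothetical helper: words are runs of letters, optionally with one
    internal apostrophe (e.g. "don't"), and are lowercased before counting.
    """
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    return Counter(words).most_common(top_n)


if __name__ == "__main__":
    sample = "The cat sat on the mat. The mat was flat."
    for word, count in list_words(sample, top_n=3):
        print(f"{word}\t{count}")
```

A real tool would also need to handle file upload, fetching from a web address, and options such as stop-word lists, none of which are shown here.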

Voyant Lava

Lava allows you to view multiple levels of a corpus in a three-dimensional environment. Clicking on certain documents within the corpus expands the Lava visualization in a ring to explore further.

TokenX

TokenX is a free text visualization and analysis tool for XML documents. It offers a web-based environment and can generate word clouds, highlight parts of a text such as words, non-words and punctuation, produce KWIC (keyword in context) listings and more. TokenX also ...

Voyant Term Frequencies Chart

Term Frequencies Chart shows how terms are distributed across document(s) in a corpus (documents are shown in the order in which they were added).

Mallet

Mallet is a Java-based library and command-line framework that provides statistical and machine learning tools for use with natural language processing.

BIBCON

BIBCON was a key-word-out-of-context system for concordances developed in FORTRAN and available in the 1960s. It was a modified version of another system, KWIC.

TextSTAT

TextSTAT is a free text analysis tool offered by Niederländische Philologie, FU Berlin. It is a simple program designed to accept plain text, HTML, Word and OpenOffice files to produce word frequency lists and concordances, and versions are available ...

SCAN

SCAN was a conversational programming language available in the 1970s for text analysis. It was specific to text processing and could be used to divide a text into sentences or words or to split it on separators. It was capable of running counts on a text, printing ...

Concordance - HTML (TAPoRware)

This tool finds the context for a specified word or pattern anywhere in an HTML document, and can be narrowed to only the text within specified tags. Users may specify the context length, and whether the tool returns the context length in words, sentences, ...
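
The idea of a concordance with a user-specified context length can be sketched as follows. This is only an illustration under stated assumptions, not the TAPoRware code: it works on plain text rather than HTML, measures context in words only, matches a single keyword case-insensitively, and the function name `kwic` and its `width` parameter are hypothetical.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, match, right context) for each hit.

    Hypothetical helper: tokens are runs of word characters, and the
    context on each side is at most `width` tokens long.
    """
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    for left, match, right in kwic(sample, "the", width=2):
        print(f"{left:>20} [{match}] {right}")
```

Restricting the search to the text inside particular tags, as the tool description mentions, would require parsing the HTML first; that step is left out of this sketch.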

Voyant Reader

Reader acts as a method of reading all documents within a specified corpus. It does not provide text analysis but rather a method of viewing the contents of a corpus.

HyperPo

HyperPo is an important legacy tool, developed as the first web-based text analysis tool aimed at humanities scholars and available from 1996 through 2006. Users could input a web address, upload a file or directly enter text for analysis. HyperPo's interface ...

DISCAN

DISCAN was a tool for content and discourse analysis originally available for mainframe computers, and adapted for PCs with DOS 3.0 in 1991. It accepted ASCII files, could create KWIC and cooccurrence listings, and had modules for both creating content ...

Voyant Corpus Summary

Corpus Summary is a tool that provides a simple, textual overview of the current corpus. Features of this tool include number of words, number of unique words, longest documents, highest vocabulary density, most frequent words, notable peaks in frequency, ...

DM (Digital MappaeMundi)

The DM (Digital MappaeMundi) is an environment for studying and annotating images and texts. It enables users to link together images, texts, or fragments of images or texts, such as a textual annotation of an image or text. It is aimed at scholars ...

Tokenize - HTML (TAPoR)

This tool splits an HTML document at specified points into 'tokens' - words, lines, sentences, paragraphs or characters. The user can specify characters, patterns, or tags upon which to separate tokens, and choose to have the results listed separator ...
View tools by tag:
1960s 1970s 1980s 1990s 2000s 2010s American Annotation Canadian Comparator English English (language) French (language) German Historic Java Metadata Multilingual Natural language processing Social media