TAPoR 2.0

Discover Research Tools for Textual Study

  • Browse Tools by Type or Tag
  • Search and Use Tools
  • Read and Create Tool Reviews
  • Contribute and Advertise Tools

Popular Tools
User Recommended Tools
Random Tools

Concordance - HTML (TAPoRware)

This tool finds the context for a specified word or pattern anywhere in an HTML document, and can be narrowed to only the text within specified tags. Users may specify the context length, and whether the tool returns the context length in words, sentences, ...

VINCI

VINCI is a natural language generation environment, first introduced in 1986 and with a sustained web presence. It provides linguists with a collection of linguist-friendly metalanguages for modelling natural language. It can generate sentences and ...

UNICON

UNICON was a concordance generator written in FORTRAN IV and available in the 1960s. It was available first for the IBM 7094, and for IBM 1410/7090 computers after 1970.

Word Cloud - Beta (TAPoRware)

This tool generates a word cloud of the top frequency words from a text document, with word size determined by its frequency. The user can specify how many words are to be included from the document, whether to apply a modified Glasgow Stop Words list, ...

EURAC: Double Tree

Double Tree is a free, open source Java application providing a visualization component for supporting exploratory corpus analysis. It focuses particularly on analyzing concordances, and can also represent a KWIC for a single word by collapsing the ...

InTEXT

InTEXT is a legacy, commercial suite of programs designed to supplement web search and intranet management, and to accomplish a variety of document generation, search and manipulation functions. It includes natural language query, full-text search and retrieval ...

word tree

word tree is a free, web-based tool for generating dynamic word trees from user-supplied texts. Users can paste their text directly in the box provided, enter a URL or Twitter handle in the search bar, or install the bookmarklet into their browser's ...

TUSTEP

TUSTEP (Tübingen System of Text Processing Tools) is a free, open source, widely used toolbox for text processing. It is aimed at scholarly audiences, can work with texts in both Latin and non-Latin scripts, and is primarily designed for humanities applications. ...

XTRACT

XTRACT was a tool for lexical collocation developed by Frank Smadja, then of Columbia University. It was designed to use statistical techniques to identify collocations of arbitrary length, and to generate syntactic relationships between words. This ...

TXM

TXM is a free, open source text corpus analysis environment. Its features include concordance, collocate search, frequencies based on the CQP full text search engine and statistical functions based on R packages. It can export results in CSV, XML or ...

Voyant Reader

Reader acts as a method of reading all documents within a specified corpus. It does not provide text analysis but rather a method of viewing the contents of a corpus.

INKE: Dynamic Table of Contexts

Dynamic Table of Contexts is a free tool for supplementing a traditional table of contents for a text. It allows users to add or subtract index items based on XML encoding. This tool requires the creation of an account.

SentiStrength

SentiStrength is a tool for sentiment analysis available for free to academic users (with registration), in Java and Windows-optimized versions. It estimates the strength of sentiment in short texts, and can handle informal language. Strength is expressed ...

Ethnograph

Ethnograph is a long-standing legacy software package, maintained into the present, for quantitative analysis and data management. It contains features for search, applying metadata and conducting corpus analysis.

PC-KIMMO

PC-KIMMO is a tool for morphological parsing available since 1985. It is designed to generate and/or parse words, for use by computational linguists, descriptive linguists and others interested in natural language processing. Though this tool is no ...

Twitter Capture and Analysis Toolset (DMI-TCAT)

Twitter Capture and Analysis Toolset (DMI-TCAT) is a free, open source tool for capturing and analyzing tweets. The tool's web interface is currently closed to researchers outside the University of Amsterdam's Media Studies department; however, the ...

Co-Occurrence - HTML (TAPoRware)

This tool looks for two words a certain distance apart from one another in an HTML document, within the user-specified limits of words, sentences or lines. The results can be narrowed to only include words found within certain tags. XML and plain text ...

TEXTPACK V

TEXTPACK V is a historic collection of interrelated text analysis utilities first released for mainframe computers in the 1970s. With the fifth edition, released in the 1980s, it was ported from FORTRAN to run on PC.

Wmatrix

Wmatrix is a free tool for corpus analysis and comparison. It provides a web interface for USAS and CLAWS, in addition to enabling standard corpus linguistic functions such as frequency lists and concordances.

Voyant Document Term Frequencies

Document Term Frequencies shows word frequencies for each document in the corpus. You can see the selected word at the top of the window highlighted in yellow. Its relevance to the documents is shown in the table below.

Ngram Statistics Package (NSP)

The Ngram Statistics Package (NSP) is a free suite, developed by Ted Pedersen and his team, for identifying word and character ngrams in large corpora. It also generates frequency data and co-occurrences, and can generate correlations between two files. ...
View tools by tag:
1960s 1970s 1980s 1990s 2000s 2010s American Canadian Comparator Dutch English English (language) French (language) German Historic Java Metadata Multilingual Natural language processing Social media