TAPoR 2.0

Discover Research Tools for Textual Study

  • Browse Tools by Type or Tag
  • Search and Use Tools
  • Read and Create Tool Reviews
  • Contribute and Advertise Tools

TAPoR 2.5 is scheduled for decommissioning.
Please visit TAPoR 3

Popular Tools
User Recommended Tools
Random Tools

Voyant Bubbles

Bubbles reads the words in a document (or corpus) and displays the highest-frequency words within proportionately large bubbles. Once you arrive at Bubbles, insert or upload your content and let the tool perform its analysis.

LodLive

LodLive is a web-based tool developed to demonstrate Linked Data standards applied to browsing RDF resources with a simple interface. Users can search preset datasets, including DBpedia and Freebase, or search a resource available at a web address of ...

Siena (Simulation Investigation for Empirical Network Analysis)

Siena (Simulation Investigation for Empirical Network Analysis) is a free program for statistical analysis developed particularly to work with social network data, but also applicable to other forms of network data. It models network dynamics via Markov ...

Voyant Mandala

Mandala is a visualization tool that imports “textual” files to perform analysis on the frequency and linkage of words. For example, you may import a play and find the linkage and frequency between a word and its speaker.  

Voyant Corpus Grid

Corpus Grid shows an overview of the corpus, including each document's title, number of word tokens (total words), number of word types (unique words), and lexical density (the ratio of tokens to types).
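The figures listed above can be illustrated with a minimal Python sketch. This is not Voyant's implementation: the function name `corpus_grid_row`, the tokenization rule, and the sample text are assumptions made for illustration. The ratio follows the wording above (tokens divided by types); other tools often report the inverse, types divided by tokens.

```python
import re

def corpus_grid_row(title, text):
    """Return title, token count, type count and tokens-per-type ratio for one document."""
    # Assumption: a token is a lowercased run of letters, digits or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    types = set(tokens)  # unique words
    # Ratio of tokens to types, as worded above; empty documents get 0.0.
    ratio = len(tokens) / len(types) if types else 0.0
    return {
        "title": title,
        "tokens": len(tokens),
        "types": len(types),
        "tokens_per_type": ratio,
    }

# Hypothetical sample document.
row = corpus_grid_row("Sample", "the cat sat on the mat and the dog sat too")
print(row)
```

A corpus overview would simply be a list of such rows, one per document.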

Voyant Cirrus

Cirrus is a visualization tool that displays a word cloud based on the frequency of words appearing in one or more documents. One can click on any word in the cloud to obtain more detailed information about it.

WordHoard

WordHoard is a tool for the study of large texts or transcribed speech. It annotates or tags texts by applying morphological, lexical, prosodic, and narratological criteria. Users may apply WordHoard to their own texts, or to the corpora included with ...

GATE (General Architecture for Text Engineering)

GATE (General Architecture for Text Engineering) is free, open source software offered by the University of Sheffield since 1995. It provides a framework for users to gather a corpus, apply an ontology to it, develop markup, automate the application ...

Pundit

Pundit is a free, Creative Commons tool developed by the SemLib Project for creating structured annotations of web pages. These annotations can be collected in virtual notebooks and shared to create collaborative structured data. Annotations may be ...

Textexture

Textexture is a free, web-based tool for visualizing texts as a network. The visualization gives a quick visual summary of the text. Clicking on nodes brings up the excerpts the tool has identified as most relevant, and permits users to locate similar ...

Distribution Graph - XML (TAPoRware)

This tool creates a graphical distribution list of words found within specific XML elements. HTML and plain text versions are also available within the TAPoRware toolsets.

SemLens

SemLens is a tool for visualizing and analyzing RDF data developed as part of the Visual Data Web toolset and based on the Adobe Flex open source framework. Using SPARQL to access the user's dataset, it arranges RDF data into a scatter plot which can ...

NLTK 2.0 (Natural Language Toolkit)

NLTK 2.0 (Natural Language Toolkit) is a free, open source collection of Python modules, linguistic data and documentation for research and development in natural language processing and text analytics, with distributions for Windows, Mac OS X and Linux. ...

OrlandoVision (OVis)

An application for visualizing a specific collection of authors, and the links or associations between them.  Links are determined by co-occurrence in the Orlando dataset.  The current dataset consists of authors, other people associated with them, ...

Stanford NLP Group: Stanford Tokenizer

Stanford Tokenizer is a free Java implementation for dividing an English text into tokens such as words, and is part of the Stanford Natural Language Processing toolset. This tool is not available on its own, but is bundled with other tools in the same ...

Voyant Bubblelines

Bubblelines is a visualization tool that helps users understand patterns of word repetition in one or more documents. Each document is represented as a horizontal line and each search term is represented as a bubble – the bubble represents the frequency ...

Concordance - HTML (TAPoRware)

This tool finds the context for a specified word or pattern anywhere in an HTML document, and can be narrowed to only the text within specified tags. Users may specify the context length, and whether the tool returns the context length in words, sentences, ...
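As a rough illustration of what a concordance does, the sketch below collects the words on either side of each match. It is not the TAPoRware code: it works on plain text rather than HTML, ignores tag filtering, measures context in words only, and the function name `concordance` and its `width` parameter are hypothetical.

```python
import re

def concordance(text, pattern, width=3):
    """Return (left, match, right) tuples, with context measured in words."""
    words = text.split()
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for i, word in enumerate(words):
        # Assumption: strip surrounding punctuation before matching the pattern.
        if rx.fullmatch(word.strip(".,;:!?\"'()")):
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits

# Hypothetical sample text.
sample = "The whale surfaced. Ahab saw the whale and the whale saw Ahab."
hits = concordance(sample, r"whale", width=2)
for left, match, right in hits:
    print(f"{left:>12} | {match} | {right}")
```

Measuring context in sentences or characters instead would only change how `left` and `right` are sliced.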

Mallet

Mallet is a Java-based library and command-line framework that provides statistical and machine learning tools for natural language processing.

Textalyser

Textalyser is a free web-based text analysis tool offered by the Bernhard Huber Internet Engineering Company. Users can paste text into the provided entry field, upload a file or provide a URL for analysis. Textalyser provides detailed statistics on ...

Aggregator - Other (TAPoRware)

This tool aggregates texts/subtexts from different locations into a single text. The original texts can be from a user-specified web page or files located on one's computer. Aggregating subtexts requires all documents to share a common subtext tag, ...

RSiena

RSiena is a free, open source social network analysis package for R. It replaces Siena (Simulation Investigation for Empirical Network Analysis), which was a stand-alone program for Windows. Like its predecessor, RSiena is optimized for social network ...
View tools by tag:
1960s 1970s 1980s 1990s 2000s 2010s American Canadian Comparator Dutch English English (language) French (language) German Historic Java Metadata Multilingual Natural language processing Social media