TAPoR 2.0

Discover Research Tools for Textual Study

Search by Attribute: Text cleaning

Extract Text - HTML (TAPoRware)


This tool extracts text found within specific tags in an HTML document. It is part of the TAPoRware toolset; an XML version is also available.
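The idea of pulling text out of a chosen tag can be sketched in a few lines of Python. This is only an illustration of the general technique using the standard library's html.parser module, not the TAPoRware implementation; the names TagTextExtractor and extract_text are hypothetical.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect the text that appears inside a chosen HTML tag (hypothetical helper)."""

    def __init__(self, target_tag):
        super().__init__()
        self.target_tag = target_tag.lower()
        self.depth = 0      # number of target tags currently open
        self.current = []   # text fragments of the element being read
        self.results = []   # one string per outermost matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.target_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.target_tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The outermost matching element just closed: store its text.
                self.results.append("".join(self.current).strip())
                self.current = []

    def handle_data(self, data):
        # Keep text only while we are inside a matching element.
        if self.depth > 0:
            self.current.append(data)


def extract_text(html, tag):
    """Return the text of every element named `tag`, in document order."""
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results


sample = "<html><body><h1>Title</h1><p>First <b>bold</b> para.</p><p>Second.</p></body></html>"
print(extract_text(sample, "p"))
```

The sketch assumes every matching tag is explicitly closed; real-world HTML that omits closing tags such as </p> would need a more forgiving parser.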

Extract Text - XML (TAPoRware)


This tool extracts text found within specific tags in an XML document. It is part of the TAPoRware toolset; an HTML version is also available.

Get TEI Meta Data - Beta (TAPoRware)


This tool extracts metadata from TEI-compatible XML documents and displays it in name/value format. It is only available for XML.
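As a rough illustration of what name/value metadata extraction from a TEI header might look like, the following Python sketch lists every leaf element under teiHeader together with its text. It assumes a TEI P5 document in the http://www.tei-c.org/ns/1.0 namespace; the function name tei_metadata is hypothetical, and the fields the TAPoRware tool actually reports may differ.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace; older, un-namespaced TEI files would not match (assumption).
TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def tei_metadata(xml_text):
    """Return (name, value) pairs for leaf elements inside the teiHeader."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{TEI_NS}teiHeader")
    if header is None:
        return []
    pairs = []
    for elem in header.iter():
        text = (elem.text or "").strip()
        # Keep only elements that have no children and carry some text.
        if len(elem) == 0 and text:
            pairs.append((elem.tag.replace(TEI_NS, ""), text))
    return pairs


sample_tei = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader><fileDesc><titleStmt>"
    "<title>Sample</title><author>A. Writer</author>"
    "</titleStmt></fileDesc></teiHeader>"
    "<text><body><p>Body</p></body></text>"
    "</TEI>"
)
for name, value in tei_metadata(sample_tei):
    print(f"{name}: {value}")
```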

Extract Text From HTML - Beta (TAPoRware)


This tool extracts text from user-specified HTML tags, elements and attributes. There is no XML counterpart at present.

Web Page Cleaner - Beta (TAPoRware)


This tool removes all HTML formatting from a web page or an uploaded HTML file, leaving the text for further processing. It is particularly good for preparing text-intensive web pages for analysis as plain text.
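Stripping markup to leave plain text can be approximated with Python's standard html.parser module. The sketch below is an assumption about the general approach, not the Web Page Cleaner's own code; TextOnly and clean_page are hypothetical names, and skipping script and style content is a design choice made here for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep visible text and drop tags, scripts and stylesheets (hypothetical helper)."""

    SKIP = {"script", "style"}  # elements whose content is not readable text

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def clean_page(html):
    """Return the page text with markup removed and whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


page = (
    "<html><head><style>p{color:red}</style></head>\n"
    "<body><p>Hello <i>world</i>!</p>\n"
    "<p>Goodbye.</p><script>var x=1;</script></body></html>"
)
print(clean_page(page))
```

Because the fragments are joined without separators, adjacent block elements run together unless the source has whitespace between them; a fuller cleaner would insert breaks at block-level tags.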

MorphAdorner

Author(s):  Philip R. "Pib" Burns

MorphAdorner, now in version 2.0, is a Java command-line program for the adornment of words in a text. At present, available adornments include standard spellings, parts of speech and lemmata, in addition to tokenization, the recognition of sentence boundaries and the extraction of names and places. Version 2.0 features a separate MorphAdorner Server, improved integration with MONK's Abbot and improved processing of the Early English Books Online (EEBO), Eighteenth Century Collections Online (ECCO) and Evans Early American Imprint Collection corpora.

Stanford HCI Group: Gigapixel


Gigapixel is a free tool to facilitate experiments in collaborative workspaces, enabling printed visualizations to be augmented with projectors and mobile devices. This tool has been succeeded by PaperToolKit, but it continues to be available as a reference resource.