TAPoR 2.0

Discover Research Tools for Textual Study

Last Updated: Nov 27, 2011

This tool lists the words in an XML document that is either uploaded by the user or fetched from a web address. List Words works with relatively small texts, under a megabyte in size. It is part of the TAPoRware collection of tools; HTML and plain text versions are available as well.

Attributes
Ease of use: Easy
Tool family: Taporware
Type of analysis: Statistical
Type of license: Free
Warning: Still in development
Web usable: Run in browser

User Supplied Tags
2000s, English (language)

Comments
Amy Dyrbye 01
February 08, 2012 12:22 AM

List Words (XML) is a free, web-based tool that runs in a browser window. It is simple to use: it counts the words in a document, either hosted at a web address or uploaded from the user's files, and generates a list of them.

Users may apply the provided Glasgow stop list, upload their own, or work with the full word list. The tool also generates basic statistics, including the total number of words, the number of unique words, and the number of appearances of particular words. Results can be restricted to words matching a specified pattern or to words within specified tags.
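
For readers who want to reproduce this kind of count offline, here is a minimal Python sketch. It is not TAPoRware's implementation: the function name `list_words`, its parameters, the tokenizing regular expression, and the four-word stop list are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative stop list only; this is NOT the Glasgow stop list.
SAMPLE_STOP_WORDS = frozenset({"the", "a", "of", "and"})

# Assumed tokenizer: runs of letters, optionally joined by hyphens or
# apostrophes, so a hyphenated compound stays together as one word.
WORD_RE = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def list_words(xml_text, stop_words=frozenset(), pattern=None, tag=None):
    """Count lower-cased words in an XML string.

    stop_words: words to leave out of the count.
    pattern:    optional regex; only words that match it in full are kept.
    tag:        optional element name; only text inside such elements is
                counted (assumes the element is not nested inside itself).
    """
    root = ET.fromstring(xml_text)
    elements = root.iter(tag) if tag else [root]
    counts = Counter()
    for element in elements:
        text = " ".join(element.itertext()).lower()
        for word in WORD_RE.findall(text):
            if word in stop_words:
                continue
            if pattern is not None and not re.fullmatch(pattern, word):
                continue
            counts[word] += 1
    return counts


sample = "<doc><title>The Wine-Dark Sea</title><p>The sea, the sea of wine.</p></doc>"
counts = list_words(sample, stop_words=SAMPLE_STOP_WORDS)
print("total words:", sum(counts.values()))
print("unique words:", len(counts))
print(counts.most_common())
```

The total is the sum of the counts and the number of unique words is the number of keys, so both statistics fall out of a single `Counter`.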

Other features include preset sort options, an inflectional stemmer, and output formats including XML Tree and tab-delimited. Most notably, the HTML output option includes a small distribution graph for the most frequent words in the document. While the tool can handle novel-sized texts, applying the inflectional stemmer slows the tool down in proportion to the size of the text.

The tool's tab-delimited output has problems with some characters: for example, cæsar becomes "c¾sar," and punctuation such as quotation marks is often replaced with its Unicode equivalent. More generally, words joined by hyphens (such as "wine-dark-sea") are counted as one word, opening quotation marks are appended to the word they are adjacent to, and any web addresses in the text are fragmented into several small words (e.g., "//www"). Users are advised to watch for such instances and adjust the results accordingly.
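
One way to make such adjustments is to post-process the exported list. The sketch below is hypothetical: it assumes each line of the tab-delimited output holds a word, a tab, and a count, which may not match the tool's actual column layout, and the function name `clean_counts` is invented for this example. It strips quotation marks stuck to a word, optionally splits hyphenated compounds, and merges the counts.

```python
from collections import Counter

# Quotation marks that may end up attached to a word (straight and curly).
QUOTE_CHARS = "\"'\u2018\u2019\u201c\u201d"


def clean_counts(tsv_text, split_hyphens=True):
    """Merge word counts after removing stray quotes and splitting hyphens.

    Assumes lines of the form "word<TAB>count"; lines whose count is not
    an integer (for example a header row) are skipped.
    """
    merged = Counter()
    for line in tsv_text.splitlines():
        if not line.strip():
            continue
        word, _, count = line.partition("\t")
        try:
            n = int(count.strip())
        except ValueError:
            continue
        # Note: this also strips a leading apostrophe, as in "'tis".
        word = word.strip().strip(QUOTE_CHARS)
        parts = word.split("-") if split_hyphens else [word]
        for part in parts:
            if part:
                merged[part] += n
    return merged


exported = '"sea\t2\nsea\t3\nwine-dark-sea\t1\n'
print(clean_counts(exported))
print(clean_counts(exported, split_hyphens=False))
```

Mis-encoded characters and fragmented web addresses are not handled here; those still need to be checked by hand against the source text.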

Despite these limitations, List Words (XML) is an effective way to get a quick breakdown of the text. Versions are also available for HTML and plain text documents.

Contribute
Public contributions are currently turned off due to spam issues. Please contact us to get an account.