This tool lists the words in a plain text document, either uploaded by the user or retrieved from a web address. List Words works with relatively small texts of under a megabyte in size. It is part of the TAPoRware collection of tools; XML and HTML versions are available as well.
| Documentation | Attributes | User Supplied Tags |
|---|---|---|
| Documentation: http://tada.mcmaster.ca/Main/TAPoRwarePlainListWords; Author(s): Geoffrey Rockwell et al. | | 2000s, Canadian (language) |
List Words (Plain Text) is a free, web-based tool designed to run in a browser window. It is simple to use: it counts and lists all the words in a document, either hosted at a web address or uploaded from the user's files.
Users may apply the provided Glasgow stop list, upload their own, or work with the full list. The tool also generates basic statistics, including the total number of words, the number of unique words, and the number of appearances of particular words. The list can also be restricted to words matching a specified pattern.
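The counting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the TAPoRware implementation: the function name `list_words`, the tokenizer regex, and the tiny stop list in the usage lines are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters/digits, allowing internal apostrophes.
WORD_RE = re.compile(r"[^\W_]+(?:'[^\W_]+)*")


def list_words(text, stop_words=frozenset(), pattern=None):
    """Count words in `text`, optionally dropping stop words and
    keeping only words that fully match the regex `pattern`."""
    words = [w.lower() for w in WORD_RE.findall(text)]
    kept = [w for w in words if w not in stop_words]
    if pattern is not None:
        matcher = re.compile(pattern)
        kept = [w for w in kept if matcher.fullmatch(w)]
    counts = Counter(kept)
    return {
        "total_words": len(words),     # every token in the document
        "listed_words": len(kept),     # tokens left after filtering
        "unique_words": len(counts),   # distinct words left after filtering
        "counts": counts,              # word -> number of appearances
    }


# Usage with a made-up sentence and a made-up two-word stop list.
result = list_words("The cat and the hat. The cat sat.",
                    stop_words={"the", "and"})
print(result["total_words"], result["unique_words"], result["counts"]["cat"])
```

Passing `pattern="c.*"` to the same call would keep only the words beginning with "c".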
Other features include preset sort options, an inflectional stemmer, and output formats including XML Tree and tab-delimited text. Most notably, the HTML output option includes a small distribution graph for the most frequent words in the document. While the tool can handle novel-sized texts, applying the inflectional stemmer slows it down in proportion to the size of the text.
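For readers who want to reproduce a sorted, tab-delimited word list outside the tool, a minimal Python sketch follows. The function name `to_tab_delimited`, the two sort keys, and the `word`/`count` header row are assumptions chosen for the example; the tool's actual column layout and sort presets may differ.

```python
from collections import Counter


def to_tab_delimited(counts, sort_by="frequency"):
    """Render a word -> count mapping as tab-delimited text."""
    if sort_by == "frequency":
        # Most frequent first; ties broken alphabetically.
        rows = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    elif sort_by == "alphabetical":
        rows = sorted(counts.items())
    else:
        raise ValueError(f"unknown sort option: {sort_by!r}")
    lines = ["word\tcount"]
    lines.extend(f"{word}\t{count}" for word, count in rows)
    return "\n".join(lines)


# Usage with made-up counts.
sample = Counter({"sea": 3, "wine": 1, "dark": 1})
print(to_tab_delimited(sample))
print(to_tab_delimited(sample, sort_by="alphabetical"))
```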
This tool has problems with some characters: for example, cæsar becomes "c¾sar", and punctuation such as quotation marks is often replaced with its Unicode equivalent. In addition, words joined by hyphens (such as "wine-dark-sea") are counted as one word, and opening quotation marks are appended to the word they are adjacent to. Users are advised to watch for such instances and adjust the results accordingly.
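Adjusting the results by hand can be partly automated. The Python sketch below shows one possible clean-up pass; the helper names are hypothetical. The encoding repair rests on an assumption: that the "c¾sar" garbling arises from Mac Roman bytes being read as Latin-1 (æ is byte 0xBE in Mac Roman, and 0xBE is ¾ in Latin-1). The source does not state the cause, so treat that part as a guess that happens to fit the example.

```python
# Straight and curly opening quotation marks that may cling to a word.
LEADING_QUOTES = "\"'\u2018\u201c"


def normalize_token(token):
    """Strip leading quotation marks and split hyphenated compounds,
    returning the component words as a list."""
    stripped = token.lstrip(LEADING_QUOTES)
    return [part for part in stripped.split("-") if part]


def repair_mojibake(text, wrong="latin-1", right="mac_roman"):
    """Undo a mis-decoding. ASSUMPTION: the bytes were Mac Roman but
    were decoded as Latin-1; other texts may need a different pair."""
    return text.encode(wrong).decode(right)


print(normalize_token('"wine-dark-sea'))
print(repair_mojibake("c\u00besar"))
```

Whether a hyphenated compound should count as one word or several is an editorial decision; the split here simply shows how to get the alternative count.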
Despite these limitations, List Words (Plain Text) is an effective way to get a quick breakdown of a text. Versions are also available for HTML and XML documents.

February 08, 2012 12:22 AM