TAPoR 2.0

Discover Research Tools for Textual Study

Last Updated: Jan 11, 2012

This tool removes all HTML formatting from a web page or an uploaded HTML file, leaving the text for further processing. It is particularly good for preparing text-intensive web pages for analysis as plain text.

Attributes
Ease of use: Easy
Tool family: Taporware
Type of analysis: Text cleaning
Type of license: Free
Warning: Prototype, still in development
Web usable: Run in browser
User Supplied Tags: 2000s, English (language)
Comments
Amy Dyrbye 01
February 28, 2012 12:50 AM

Web Page Cleaner (Beta) is a free, simple-to-use, web-based tool that runs in a browser window. It processes an HTML document at a user-specified web address or from the user's files and removes all HTML tagging.

This tool is basic, offering only two options: users may either strip tags from their document, or convert it to plain text.

The tool has a few problems. It replaces some punctuation, other non-alphabetic characters, and accented letters with Unicode equivalents. When the strip-tags option is selected, the tool also runs previously tagged text together, which causes problems wherever no space came before or after a tagged chunk.
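
The run-together problem can be illustrated with a minimal Python sketch. This is not the tool's actual code, which is not shown here; the names `TextExtractor`, `strip_tags`, and `separator` are hypothetical and chosen for illustration only. The sketch uses the standard library's `html.parser` and shows how joining text chunks with an empty string fuses adjacent elements, while joining with a space keeps them apart.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags and discard the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called once for each run of text found between tags.
        self.chunks.append(data)


def strip_tags(html, separator=""):
    """Return the text of `html` with tags removed.

    `separator` (a hypothetical parameter) is placed between text chunks
    that were split by tags. An empty separator reproduces the
    run-together behaviour described in the review.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    joined = separator.join(parser.chunks)
    # Collapse any runs of whitespace into single spaces.
    return " ".join(joined.split())


sample = "<h1>Title</h1><p>First paragraph.</p><p>Second.</p>"
print(strip_tags(sample))       # TitleFirst paragraph.Second.
print(strip_tags(sample, " "))  # Title First paragraph. Second.
```

Joining with a space is a trade-off: it would also split a word that is only partly tagged, such as `<b>bold</b>ly`. This sketch also keeps the contents of `script` and `style` elements, which a fuller cleaner would probably want to skip.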

Despite these limitations, Web Page Cleaner (Beta) is an effective way to quickly convert an HTML document to untagged text or plain text.
