TAPoR 2.0

Discover Research Tools for Textual Study

R
Last Updated: May 21, 2013

R is an open source programming language designed for statistical analysis and graphics, with libraries for parallel computing. R began its life as a research project at the University of Auckland, but has since expanded to become a collaboratively developed open source project that is part of the GNU Project. R has libraries for data import, regular-expression-based data splitting, and data visualization. R can be run either as a script or as an interactive working environment.

Attributes
Author(s): GNU
Background processing: Not applicable
Ease of use: Difficult
Legacy: Sustained to present
Popularity: Widely used
Type of analysis: Miscellaneous, Statistical, Text cleaning, Visualization
Type of license: Open source
Web usable: Software you download and install
User Supplied Tags: 1990s
Comments
Ryan Chartier
February 22, 2013 01:32 AM

Overview and Setup

R is a free, open source programming language and statistical environment that is part of the GNU Project. R has powerful libraries for parallel computing and is especially adept at computing on large data sets. The R website distributes installers for all major operating systems. R itself runs in a command-line environment; however, a separate, optional program, RStudio, provides a graphical development environment.

Features

R was not originally designed for textual analysis, and its abilities go beyond the direct needs of the average text analysis researcher. However, R’s vector data type, its ability to work on several data sources simultaneously, and its built-in visualization tools make graphing data exceptionally easy. With some coaxing, R provides an ideal environment for large-scale data processing and analysis. R provides all the tools necessary for splitting and combining texts, but prior knowledge of regular expressions is needed to get the most out of them. R’s vector data structure scales easily between small and large data sets. R also includes one-command graph generation for all common graph and data types.
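As a rough illustration of the features described above, the following minimal sketch splits a text with a regular expression, counts words using vector operations, and draws a graph with a single command. The sample sentence and the variable names (text, tokens, freq, rel_freq) are made up for this example; only base R functions are used.

```r
# Illustrative sample text (an assumption for this sketch);
# any character vector would work the same way.
text <- "The quick brown fox jumps over the lazy dog. The dog sleeps."

# Split on runs of non-letter characters (a regular expression),
# then lower-case every token in one vectorized call.
tokens <- tolower(unlist(strsplit(text, "[^A-Za-z]+")))
tokens <- tokens[tokens != ""]  # drop any empty strings left by the split

# Count occurrences of each word and sort by frequency.
freq <- sort(table(tokens), decreasing = TRUE)

# Vectorized arithmetic: relative frequencies without an explicit loop.
rel_freq <- freq / sum(freq)

# One command draws a bar chart of the five most frequent words.
barplot(freq[1:5], main = "Most frequent words", ylab = "Count")
```

The same calls apply unchanged to a vector holding an entire corpus, which is one way to read the claim that R's vector data structure scales between small and large data sets.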

Conclusion

R is a programming language, yet it does several things differently than most conventional scripting languages. R does not rely heavily on control structures, which makes it easier to learn for someone not familiar with recursive programming. Anyone with a solid understanding of flows and tokenization should be able to pick up the programming style easily. However, R is a tool far bigger than text analysis; its features may cause some confusion for self-learners, especially those not already familiar with statistical concepts. R is an excellent tool for large and long-term research projects as well as for toying around with smaller data sets. An excellent introduction to R targeted at humanist scholars can be found in the book ‘Quantitative Corpus Linguistics with R’ by Stefan Th. Gries (link below), while more experienced programmers might want to look at the official R tutorial on the R website.

http://www.amazon.ca/Quantitative-Corpus-Linguistics-Practical-Introduction/dp/0415962706

http://cran.r-project.org/doc/manuals/R-intro.html
