
DR. SOHAIL AKRAM
INTRODUCTION TO R
R

R is a computer language that allows the user to program
algorithms and use tools that have been programmed by
others.

R was originally written by Ross Ihaka and Robert Gentleman,
at the University of Auckland.

It is an implementation of the S language, which was
principally developed by John Chambers.

R
R is an Open Source (and freely available)
environment for statistical computing and graphics.

The Comprehensive R Archive Network (CRAN) links
provide binary downloads for Windows, for Mac OS
X and for several flavours of Linux.

Source code is also available.


R
R is under active development - typically two major
releases per year.

R provides data manipulation, display facilities and
most statistical procedures. It can be extended with
packages containing data, code and
documentation.

Currently there are more than 2400 contributed
packages in the CRAN.

R HISTORY
Statistical programming language S developed at
Bell Labs starting in 1976 (the same lab that created UNIX)
Intended to interactively support research and
data analysis projects
Exclusively licensed to Insightful (S-Plus)
R: Open source platform similar to S developed by
R. Gentleman and R. Ihaka (U of Auckland, NZ)
during the 1990s
Since 1997: international R-core developing team
Updated versions available every couple of months

WHAT CAN YOU DO WITH R?
You can ...
do calculations

perform statistical analysis (using available
code)

create powerful graphics

write your own functions
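
The four uses listed above can be sketched in a few lines of R. This is a minimal illustration, assuming only base R and its built-in `cars` dataset; the helper name `coef_var` is hypothetical and chosen just for this example.

```r
# Calculations: R works as an interactive calculator
x <- c(2, 4, 6, 8, 10)
total <- sum(x)
avg   <- mean(x)

# Statistical analysis using available code: a simple linear regression
# on the built-in 'cars' dataset (speed vs. stopping distance)
fit <- lm(dist ~ speed, data = cars)
print(coef(fit))

# Graphics: scatter plot with the fitted regression line added
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit)

# Your own function: coefficient of variation (hypothetical helper name)
coef_var <- function(v) sd(v) / mean(v)
cv <- coef_var(x)
print(cv)
```

Each step builds on objects created earlier in the session, which is the interactive style the later slides describe.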
WHAT R IS AND WHAT IT IS NOT
R is
a programming language
a statistical package
an interpreter
Open Source

R is not
a database
a collection of black boxes
a spreadsheet software package
commercially supported

OPEN SOURCE
Provides full access to algorithms and their implementation.
Gives you the ability to fix bugs and extend software.
Provides a forum allowing researchers to explore and expand
the methods used to analyze data
Is the product of thousands of leading experts in the fields they
know best.
Ensures that scientists around the world - and not just ones in
rich countries - are the co-owners of the software tools
needed to carry out research.
Promotes reproducible research by providing open and
accessible tools.
Most of R is written in R! This makes it quite easy to see what
functions are actually doing.
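
As a small sketch of what "seeing what functions are doing" looks like in practice, typing a function's name without parentheses prints its definition. The example uses `sd` from base R; note that some functions are implemented in compiled code, in which case R shows only a call into that code rather than R source.

```r
# Typing a function's name without parentheses prints its definition
sd

# The source can also be captured as text, e.g. to search through it
sd_source <- deparse(sd)
print(sd_source)
```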


R ADVANTAGES
Fast and free.
State of the art: Statistical researchers provide their
methods as R packages. SPSS and SAS are years
behind R!
2nd only to MATLAB for graphics.
Mx, WinBugs, and other programs use or will use R.
Active user community
Excellent for simulation, programming, computer
intensive analyses, etc.
Forces you to think about your analysis.
Interfaces with database storage software (SQL)
R DISADVANTAGES
Not user friendly at the start - steep learning curve,
minimal GUI.
No commercial support; figuring out correct
methods or how to use a function on your own can
be frustrating.
Easy to make mistakes and not notice them.
Working with large datasets is limited by RAM
Data prep & cleaning can be messier & more
mistake prone in R vs. SPSS or SAS
R VS COMMERCIAL PACKAGES
R:
Many different datasets (and other
objects) available at the same time
Datasets can be of any dimension
Functions can be modified
Experience is interactive - you
program until you get exactly what
you want

Commercial packages:
One dataset available at a
given time
Datasets are rectangular
Functions are proprietary
Experience is passive - you
choose an analysis and they
give you everything they think
you need
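
The R side of this comparison can be illustrated with a short sketch. It assumes only base R and the built-in `cars` and `iris` datasets; the names `study` and `mean_trimmed` are hypothetical and exist only for this example.

```r
# Several datasets and other objects can live in the workspace together
speeds  <- cars   # built-in data frame, 50 rows x 2 columns
flowers <- iris   # built-in data frame, 150 rows x 5 columns

# Data need not be rectangular: a list can hold pieces of different shapes
study <- list(
  scores = c(90, 85, 77),
  groups = matrix(1:6, nrow = 2),
  notes  = "pilot run"
)

# Functions are ordinary objects, so you can write a modified version;
# 'mean_trimmed' is a hypothetical wrapper around base R's mean()
mean_trimmed <- function(x, ...) mean(x, trim = 0.1, ...)

# List everything currently defined in the workspace
print(ls())
```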
R VS COMMERCIAL PACKAGES
R:
One stop shopping - almost
every analytical tool you can
think of is available
R is free and will continue to
exist. Nothing can make it go
away, its price will never
increase.

Commercial packages:
Tend to have limited scope,
forcing you to learn additional
programs; extra options cost more
and/or require you to learn a
different language (e.g., SPSS
Macros)
They cost money. There is no
guarantee they will continue to
exist, but if they do, you can bet
that their prices will always increase

INSTALLING R
Go to http://cran.r-project.org/ and select either:

MacOS X
Windows and base

Select to download the latest version: 2.14.1

Install and Open.

GETTING STARTED
The R GUI.

R PACKAGES
Applications of R normally use a package; i.e., a
library of special functions designed for a specific
problem.
Hundreds of packages are available, mostly written
by users.
A user normally loads only a handful of packages
for a particular analysis (e.g., library(MASS)).
Standards determine how a package is structured, so that
it works well with other packages and creates new
data types in an easily used manner.
Standardization makes it easy for users to learn new
packages.
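
A minimal sketch of loading and using a package, following the library(MASS) example above. It assumes MASS is installed, which is normally the case because it ships with standard R distributions; the package name in the final comment is a placeholder, not a real package.

```r
# Load a package for the current session
library(MASS)

# Use a function the package provides: show a decimal as a fraction
half <- fractions(0.5)
print(half)

# Packages can also bundle datasets, e.g. 'Animals'
# (body and brain weights of land animals)
data(Animals)
print(head(Animals))

# Packages that are not yet installed are fetched from CRAN first
# (not run here; "somePackage" is a placeholder name):
# install.packages("somePackage")
```

A package only needs to be installed once, but library() must be called in every new session that uses it.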
