R guide [1]

If you are already familiar with R, you can skip this guide.

Install R

Download R from https://www.r-project.org/ and install it. Even if you have previously installed R, it is a good idea to re-install it to be sure your copy is up to date. During installation, you may be asked to choose a site (a CRAN mirror). It doesn’t matter which you choose; I scrolled down to USA and chose the CMU stat department.

Updating RStudio

If you don’t already have RStudio, go to the next step. If you already have RStudio, update it as follows. Launch RStudio. On the main menu, click Help/Check for Updates. If there are updates, follow the update instructions.

Installing RStudio

If you have not previously installed RStudio, download and install the free desktop version: https://rstudio.com/products/rstudio/download/

You will probably want to print this tutorial so that you can refer to it as you work your way through the R script that you will load below.

Some basics of R

You can type commands one at a time next to the blue arrow (the > prompt) in the Console. For example, suppose you want to know the square root of 7. Click next to the blue arrow in your RStudio window, type 7^.5 and hit Enter to see the answer (about 2.65).
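As a small illustration of typing commands at the prompt, the lines below can be entered one at a time. The sqrt() function and the name root7 do not appear in this guide; sqrt() is base R's built-in square root, used here only as a cross-check, and root7 is an illustrative name.

```r
# Square root of 7, written as a power
7^.5

# The same value using base R's sqrt() function
sqrt(7)

# Store a result under a name so it can be reused later
root7 <- 7^.5
round(root7, 2)
```

Storing a result with <- is optional at this stage, but it is the pattern the sample script later in this guide relies on.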

You can enter commands as described above, but it is typically more convenient to collect a set of commands in an R file. You can then save the file and avoid retyping commands that you wish to use again in the future. We will use the term “script” to refer to a file that contains R commands. The script for this tutorial is named: R_Script_Tutorial_1.R

For RStudio to recognize a file as a set of R commands, the file name should end, as above, with .R

Choose a folder on your computer to contain your R materials. This will be your “working directory”.

Set Your Working Directory

To set a working directory, you can type the following command:

    setwd(YOURDIRECTORY)

where YOURDIRECTORY is the path to the folder you would like to use as your working directory. The path must be enclosed in quotation marks. For example, on my computer, I chose my desktop as the working directory:

    setwd('/Users/anhnguye/Desktop')
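It can help to check where R is currently pointing before and after changing the working directory. The sketch below is one way to do that with base R functions; tempdir() is only a stand-in so the example runs anywhere, and my_dir and old_dir are illustrative names. Replace the stand-in with your own folder path.

```r
# Show the current working directory
old_dir <- getwd()
print(old_dir)

# tempdir() is a stand-in here; replace it with your own folder path
my_dir <- tempdir()

# Only switch if the folder exists, to avoid a
# "cannot change working directory" error
if (dir.exists(my_dir)) {
  setwd(my_dir)
}

# Confirm the change, then list the files R can now find by bare file name
print(getwd())
print(list.files())
```

If list.files() does not show your script or data file, the working directory is probably not the folder you intended.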

Alternatively, click the following sequence in the menu at the upper left of the window: Session/Set Working Directory/Choose Directory. Then navigate to the directory containing your R script and click Select Folder.

Install Packages

R code is shared and organized using “packages”. Many R procedures require a combination of several commands, and a package bundles such commands into functions that you can install once and reuse.

For this course, I would like you to install a package called ggplot2 that will help us produce charts and graphs.

The command is the following:

    install.packages('ggplot2')

The package will take a while to download and install. Much text will fly by in the Console as this proceeds. Some of the text will be in red font, but that’s OK while packages are downloading. You may be asked whether you want to update some packages you already have. If so, click Yes. You will only need to run this command again when you want to install updates, which you will do infrequently, e.g., once or twice per year.
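Because installation is slow, it can be convenient to install a package only when it is not already present. The following is a minimal sketch of that idea, not part of the original course material; install_if_missing is a hypothetical helper name, while requireNamespace() and install.packages() are standard R functions. The demonstration call uses the built-in stats package so that nothing is downloaded when the example runs.

```r
# Hypothetical helper: install a package only if R cannot find it
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Safe demonstration: 'stats' ships with R, so no download happens
available <- install_if_missing("stats")
print(available)

# For this course you would call:
# install_if_missing("ggplot2")
```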

Load Packages

Packages need to be loaded each time you launch R. Run the following command to load ggplot2:

    library('ggplot2')

Alternatively, you can run:

    require('ggplot2')

Note: If you are writing a script, the package only needs to be loaded once (usually by placing the command at the beginning of the script).
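The two loading commands behave differently when a package is missing: library stops with an error, while require gives a warning and returns FALSE. The sketch below illustrates this with a deliberately made-up package name (notARealPackage123 is an assumption chosen because it should not exist on any machine), so it runs without ggplot2 being installed.

```r
# A made-up package name that should not be installed anywhere
fake_pkg <- "notARealPackage123"

# require() returns FALSE (with a warning) when the package is missing
ok <- suppressWarnings(
  require(fake_pkg, character.only = TRUE, quietly = TRUE)
)
print(ok)

# library() stops with an error instead; tryCatch() captures it here
res <- tryCatch(
  library(fake_pkg, character.only = TRUE),
  error = function(e) "error"
)
print(res)
```

For that reason library is often the safer choice at the top of a script: a missing package is reported immediately rather than causing a confusing failure several lines later.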

Sample R script

Save the following commands into a text file in your working directory and name it sample_script.R. Also save the following data file into your working directory.

Link: TBA

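    # Replace 'yourworkingdir' with the path to your own working directory
    setwd('yourworkingdir')
    require('ggplot2')

    # Read the chocolate data from the working directory
    chocolate_data <- read.csv('data_chocolate.csv')

    # Bar chart of market share by company
    ggplot(chocolate_data, aes(x = Company, y = Market.Share)) +
      geom_bar(stat = 'identity')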

The output looks like this:

## Loading required package: ggplot2
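Before plotting, it is often worth checking what read.csv actually produced. The sketch below writes a tiny stand-in CSV to a temporary folder and reads it back; the file name, the header "Market Share", and the three rows are invented placeholders, not the course data. It also shows one plausible reason a column can end up named Market.Share: by default read.csv converts a header containing a space into a valid R name.

```r
# Write a tiny stand-in CSV; the rows are invented placeholders, not real data
csv_path <- file.path(tempdir(), "example_chocolate.csv")
writeLines(c("Company,Market Share",
             "Company A,50",
             "Company B,30",
             "Company C,20"), csv_path)

example_data <- read.csv(csv_path)

# read.csv turns the header "Market Share" into the valid R name Market.Share
names(example_data)

# Quick checks before plotting
head(example_data)   # first rows
str(example_data)    # column types
```

If a ggplot command complains that a column cannot be found, comparing names() against the names used in aes() is a reasonable first step.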

Note that when a line ends in a way that leaves a command incomplete (for example, with a trailing +), R continues reading the command on the next line.

For example, BOTH of the following commands work:

    ggplot(chocolate_data, aes(x=Company, y=Market.Share)) + 
        geom_bar(stat='identity')

    ggplot(chocolate_data, aes(x=Company, y=Market.Share)) + geom_bar(stat='identity')

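But NOT this, because the first line is already a complete command, so R runs it on its own and then reports an error on the line that starts with +:

    ggplot(chocolate_data, aes(x=Company, y=Market.Share)) 
        + geom_bar(stat='identity')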
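The same line-continuation rule can be seen with plain arithmetic, which needs no packages or data. This is only an illustrative sketch; total, partial and wrapped are made-up names.

```r
# A trailing + leaves the command unfinished, so R reads the next line too
total <- 1 +
  2
total

# Here the first line is already complete, so partial is set to 1;
# the next line is then a separate command (a plain "+ 2")
partial <- 1
+ 2
partial

# Wrapping in parentheses is another way to keep a multi-line command open
wrapped <- (1
            + 2)
wrapped
```

With arithmetic the misplaced + silently gives a different answer rather than an error, which is why ending each unfinished line with the operator is the safer habit.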

To run all commands in an R script, we use the function source (the function load is for restoring saved R data files, not for running scripts):

    source('sample_script.R')
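One detail to be aware of when running a whole script with source(): results that would be shown automatically at the Console are not printed by default. The sketch below demonstrates this with a throwaway two-line script written to a temporary folder; demo_script.R and x are illustrative names, not files from this course.

```r
# Create a small throwaway script in a temporary folder
script_path <- file.path(tempdir(), "demo_script.R")
writeLines(c("x <- 7^.5", "x"), script_path)

# Runs silently: x is created, but its value is not shown
source(script_path)

# echo = TRUE shows each command and prints its result
source(script_path, echo = TRUE)
```

The same applies to charts: a ggplot command inside a sourced script is only drawn if it is wrapped in print() or if the script is run with echo = TRUE.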

  1. Much of this material was taken from Professor Dennis Epple’s statistics course. All errors are mine.