RStudio is a complete environment for working with the R language. Once you have got used to it, you will find that it makes working with R far more productive than using the plain R console. However, some of the concepts involved in using RStudio may be new. RStudio provides an interface for working with R code, rather than an interface for running analyses directly.
There are a lot of new concepts to get used to, so you may have to work through the videos in this class several times before you are comfortable with the RStudio way of working.
At the end of the class you will have learned
The RStudio server version runs directly through any web browser. There is no need to install any software on your laptop, PC or tablet.
Access to the server is through the following URL. This works both on and off campus.
http://r.bournemouth.ac.uk:8789/
The RStudio server is an integrated platform for doing the following …
Advanced features can be used without any programming skills through sharing scripts. However, you do need to become familiar with some new concepts in order to use the server. The RStudio server is ideal for collaborative work. You have your own permanent space on the server for saving your own work and building up a portfolio of useful analyses. Only one person can be logged in at any one time under your username. However, I can log into your user space at any time in order to help correct any errors and to give you advice on the analysis.
Once you are logged in you will see three sections of the interface by default. This will change to four sections when you begin using scripts in the interface.
Look carefully at the interface and learn to recognise the sections.
The next steps involve moving around the RStudio interface and trying to understand what you are looking at. The best advice at this point is not to try pointing and clicking on many of the menu options. This is because the way we work in R is quite different to the way you may have been used to running analyses in programs such as SPSS. Watch this video to hear some tips on what to do, and just as importantly what not to do.
Although you don’t need to remember them, it can be handy to navigate the interface using shortcuts. This video shows how to zoom to different panes within the interface using the keyboard.
This is an optional video showing a little trick that can help you learn the keyboard shortcuts. Again, you don’t need to remember this to use the interface and you can skip this video if you wish.
Now that you have seen some of the basics of the RStudio environment, it is time to try running some simple R code. You can do this by typing code into the console.
Of course, at this stage you don’t know how to write R code! However, if you follow the instructions in this video carefully you will see R in action. This code will produce a simple histogram with simulated data.
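For reference, a histogram of simulated data can be produced with only a few lines typed into the console. The sketch below is an illustration, not necessarily the exact code used in the video; the object name `x`, the sample size of 100 and the seed are assumptions chosen for the example.

```r
# Fix the random seed so the simulated values are the same on every run
# (an assumption for this example; the video may not do this).
set.seed(1)

# Simulate 100 values from a normal distribution (mean 0, sd 1 by default).
x <- rnorm(100)

# Draw a histogram of the simulated values in the plots pane.
hist(x)
```

Each line can be typed into the console and run by pressing Enter.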
A key concept to understand when using the server is that your home directory on the server is like a directory (folder) on your PC. In one sense it is rather like the university H drive. However the data, the instructions for processing the data and the software (R and R packages) are all “encapsulated” on the server. So this is different from your H drive. You cannot run analyses using the standard university server. You can run analyses within the RStudio server.
In order to move data files and scripts into your home directory within the server you must upload them. You will see buttons labelled New Folder, Upload, Delete, Rename and More. If you click on the More button you will also find an option to Export your files. The upload and the export buttons are frequently used to move files onto the server and to directly move files off the server. It is very important to be aware of this concept. Files saved on the server will always be available for use later. In contrast active analyses that take place in the server memory, as opposed to the server’s hard disk space, will be temporary and will be lost between sessions.
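The difference between objects held in memory and files held on disk can be sketched as follows. The data frame, its column names and the file name `counts.rds` are all hypothetical, and the temporary folder stands in for a folder in your home directory.

```r
# An object created during a session exists only in memory.
d <- data.frame(site = c("a", "b", "c"), count = c(4, 7, 2))

# Saving it to a file puts a copy on disk.
# The path is a hypothetical temporary location; on the server you would
# normally use a file name inside your home directory or project.
rds_path <- file.path(tempdir(), "counts.rds")
saveRDS(d, rds_path)

# In a later session the object can be restored from the file.
d_restored <- readRDS(rds_path)
```

Here `saveRDS()` and `readRDS()` are base R functions; writing a CSV file with `write.csv()` would serve the same purpose for a simple table.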
To see the way data is moved between the server and your PC or laptop watch this video. Note that the command that you type into the console in order to add the example data set is aqm::add_file()
It is important to draw a clear distinction between uploading data onto the RStudio server and actually working with data in R. Uploading a data file simply involves moving the file from one place to another. Loading data into R involves turning the data held in the data file into data held in a format R can work with.
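The distinction can be illustrated with a short sketch. Because no real upload can be shown here, the first two lines create a small stand-in file; the file name `example.csv` and its contents are hypothetical.

```r
# Stand-in for an uploaded file: write a small CSV to a temporary folder.
csv_path <- file.path(tempdir(), "example.csv")
writeLines(c("site,height", "a,1.2", "b,3.4"), csv_path)

# At this point the file exists on disk, but R holds none of its data.
file.exists(csv_path)

# Loading: read the contents of the file into a data frame that R can work with.
d <- read.csv(csv_path)

# Inspect the structure of the loaded data.
str(d)
```

Only after `read.csv()` has run does the object `d` appear in the environment pane.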
The video below shows a technique that you might stumble across yourself if you explore the RStudio interface. **This is not recommended practice.** I include the example in order to dissuade users from following it.
You can use RStudio without opening projects. However, projects make organising your work much simpler. A project is a set of instructions to restore the server to the same state that it was in when you closed the project. So if you are analysing a range of data sets you can use one project per data set to keep your work organised.
To form a new project and add a new folder
Data files are added to the project using the upload button in the files pane (bottom right). If you want to upload multiple files at once (e.g. shapefiles) you should first compress them into a zip file. The zip file will expand when uploaded. Although R can read data from many different formats, the data files that you upload must be in some form of conventional format. The easiest approach is to save each table as a single comma-separated values (.csv) file. The first line should contain short variable names with no spaces. The variable definitions should be kept separately and referred to when writing figure labels and captions, but not used in the column headers.
You can still use Excel spreadsheets to store your data. However, be careful to keep the raw data in a simple, R-readable format. If you are used to using Excel and wish to combine the use of Excel and R then you can copy the raw data on the first sheet over to a second sheet for analysis using Excel. Do not add anything else, such as figures and compiled statistics, to the sheet of raw data in Excel.
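As a sketch of why short column names and separate variable definitions work well together, the example below keeps the definitions in a named vector and uses them only for the axis labels. The variable names `dbh` and `ht`, the values and the label text are all hypothetical.

```r
# Hypothetical data with short, space-free column names, as they would
# appear in the first line of the CSV file.
d <- data.frame(dbh = c(12.1, 30.5, 22.8),
                ht  = c(8.2, 15.4, 11.9))

# Variable definitions kept separately from the data.
labels <- c(dbh = "Diameter at breast height (cm)",
            ht  = "Tree height (m)")

# The short names are used in the code; the definitions are used for labels.
plot(d$dbh, d$ht, xlab = labels[["dbh"]], ylab = labels[["ht"]])
```

The code stays compact because the column names are short, while the figure still carries full descriptions and units.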
The video below takes you through the steps involved in starting a project and loading a CSV file.
This course will concentrate on the use of markdown documents as a way of running R code. There are many advantages of using markdown.
In order to produce a markdown document you need to follow these steps. RStudio “helpfully” produces a default template document for you to edit. This can seem rather confusing the first time you see it.
Now try pressing the “knit” button on the top right pane. You will see the default demonstration document that was produced as a template “knit” into a simple data report. This is not yet using your data of course.
The steps above will always produce the default “demo” markdown document. Every time you start a new markdown document, RStudio will start off with this one.
You should take a careful look at the logic of the demo document. It consists of “chunks” of R code that produce output in the form of tables and figures embedded in text. The R code automatically produces output and adds it to the document after knitting. So if you have R code available that will run an analysis that you are interested in, you don’t have to remember any other steps in order to run it. Simply ensure that the data being added to the analysis are appropriate for the type of analysis being run and you can obtain the same results with your own data. This will be the way R is used in this course.
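To give an idea of what sits inside a chunk, the lines below are the kind of code found in the demo document. The template may differ between RStudio versions, so treat this as an illustration; `cars` and `pressure` are small data sets that come built into R.

```r
# A chunk that produces a table of summary statistics in the knitted report.
summary(cars)

# A chunk that produces a figure in the knitted report.
plot(pressure)
```

Replacing the built-in data sets with a data frame of your own is, in essence, how the same code is reused for a new analysis.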
When data has been loaded into R it is often convenient to provide the data set to others. If the data set is very large you would send them the original data file, together with the R script that loads the data and runs the analysis. In the case of small data sets it is convenient to embed the data itself within the report. In this way you only need to provide a single file in order to share both the data and the analysis.
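One simple way to keep a small data set inside the document itself, which is not necessarily the approach used later in this course, is to write the table as text within a code chunk. The column names and values below are hypothetical.

```r
# Read a small table directly from text written inside the chunk,
# so no separate data file is needed.
d <- read.csv(text = "site,count
a,4
b,7
c,2")

# The result is an ordinary data frame, ready for analysis.
str(d)
```

This relies on the `text` argument of `read.csv()`, which treats the supplied string as if it were the contents of a file.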