DataScience+ An online community for showcasing R & Python tutorials. It operates as a networking platform for data scientists to promote their talent and get hired. Our mission is to empower data scientists by bridging the gap between talent and opportunity.

Building Packages in R – Part 0: Setting Up R

One of the most highly touted features of R is that it allows anyone to create packages. Packages are collections of functions that help end users analyze their data more quickly and efficiently. The package framework is not only a way for experts to distribute their statistical approaches and techniques to a wide audience via CRAN; it also lets us collect the functions we define ourselves in a single place. Packages are a great way to make your analysis more efficient if you often find yourself repeating the same steps. Over the coming weeks I will try to give a brief and simple introduction to the basics of building R packages for your own use. Mostly, it will contain things I wish someone had told me when I started combining my own functions into packages.

This week we’ll start with the prerequisites for your R setup, which is why I labeled it Part 0.

Using an Editor

When writing functions and creating packages, it is extremely important to write your code in an environment that offers more than the basic R GUI. Many people use RStudio, which I can also wholeheartedly recommend. A main advantage of RStudio is that it is a cross-platform IDE, meaning that your R code will look and feel the same on a Windows machine and on a MacBook (which can be incredibly helpful if you’re teaching R). The same goes for Emacs with ESS. Most other alternatives are more OS-specific: Notepad++ with the NppToR add-on is a good choice on Windows, and gedit with the rgedit extension is what I can recommend if you’re running Ubuntu Linux. No matter which of these you choose, they all give you much more information about your code than standard R and, in some cases, come with a plethora of debugging features that are quite handy for writing your own packages.

Setting up R in Ubuntu (15.04)

At this point I’ll assume that you have R installed on your machine. If you don’t, follow these instructions on the official site. For creating your own packages it is important that you install not only the base version of R but also its development package, which you can do from the terminal.

sudo apt-get install r-base r-base-dev
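
To confirm that the installation worked, you can check that R and a build toolchain are on your PATH. This is only a rough sketch: the helper name missing_tools is made up for illustration, and the list of tools is my assumption about what the development package is expected to provide, so adjust it to your system.

```shell
# Print every command from the argument list that cannot be found on PATH.
# Prints nothing if all of them are available.
# (missing_tools is a hypothetical helper name, not part of R or Ubuntu.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# R itself, plus the compilers and make that package building usually needs.
# This list is an assumption; trim or extend it as required.
missing_tools R Rscript gcc g++ gfortran make
```

If the last line prints nothing, everything in the list was found. Any name it does print points to a tool that is still missing.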

Additionally, there are two packages I would strongly recommend using when you are building your own R packages: devtools and roxygen2. The catch with the former is that it requires some additional system libraries to be installed on your machine. You can install them from the terminal.

sudo apt-get install libcurl4-openssl-dev libxml2-dev libssl-dev

Once these are installed you can open up R and use install.packages() to install these two packages.

install.packages(c("devtools", "roxygen2"))
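
If you prefer to script this step, the two packages can also be installed from the terminal with Rscript. The sketch below rests on a few assumptions: the function names r_pkg_vector and install_r_packages are made up for illustration, and the mirror URL is just one CRAN mirror that you may want to swap for another.

```shell
# Build an R character vector literal such as c("a", "b") from the arguments.
# (r_pkg_vector is a hypothetical helper name.)
r_pkg_vector() {
  out=""
  for pkg in "$@"; do
    out="${out:+$out, }\"$pkg\""
  done
  printf 'c(%s)\n' "$out"
}

# Install the named packages without opening an interactive R session.
# The mirror URL is an assumption; replace it with the one you prefer.
install_r_packages() {
  Rscript -e "install.packages($(r_pkg_vector "$@"), repos = 'https://cloud.r-project.org')"
}

# Usage (needs R and a network connection):
#   install_r_packages devtools roxygen2
```

Setting repos explicitly avoids the mirror prompt that install.packages() would otherwise show, which is what makes this usable in a non-interactive script.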


Setting up R in Windows

If you are running Windows, the initial setup is a bit more complicated, but with a few tweaks it is possible to make package building work just fine. Because I run Ubuntu on my machine I am no expert at setting this up, but by following this excellent guide by Steven Mosher I was able to make it work in 10 minutes on Windows 8.1 (and it still works after the upgrade to Windows 10).

Setting up R in Mac OS X

If you are running Mac OS X you don’t need to take any extra steps in your R setup. It should work straight out of the box – but be aware that the current R version (3.2.2 as of this writing) will only run on Mac OS X 10.9 or higher.
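
If you are unsure whether your system meets the 10.9 requirement, you can check it from the terminal. This is a minimal sketch that assumes the sw_vers command that ships with OS X. The helper name version_at_least is made up for illustration, and it only compares the major and minor parts of a version written as major.minor or major.minor.patch.

```shell
# Succeed if version $1 (major.minor[.patch]) is at least major.minor $2.
# (version_at_least is a hypothetical helper name.)
version_at_least() {
  have_major=${1%%.*}; rest=${1#*.}; have_minor=${rest%%.*}
  need_major=${2%%.*}; rest=${2#*.}; need_minor=${rest%%.*}
  [ "$have_major" -gt "$need_major" ] ||
    { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Only run the check where sw_vers exists, i.e. on a Mac.
if command -v sw_vers >/dev/null 2>&1; then
  os_version=$(sw_vers -productVersion)
  if version_at_least "$os_version" 10.9; then
    echo "OS X $os_version meets the 10.9 minimum"
  else
    echo "OS X $os_version is older than 10.9"
  fi
fi
```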

So, now that we’ve gotten this out of the way, we can start taking the first steps towards building our own R package.

Next time, we’ll create some functions and start the initial package build.