DataScience+ We publish R tutorials from scientists at academic and scientific institutions, with the goal of giving everyone in the world free access to knowledge. Our tutorials cover topics including statistics, data manipulation, and visualization.

Using the apply Family of Functions in R

In this article, I will demonstrate how to use the apply family of functions in R. Each of them runs a function over the elements of a matrix, list, or vector, so you can avoid writing an explicit loop.

apply

apply runs a function over the rows or columns of a matrix.
For example, let’s create a sample dataset:

data <- matrix(c(1:10, 21:30), nrow = 5, ncol = 4)
data

     [,1] [,2] [,3] [,4]
[1,]    1    6   21   26
[2,]    2    7   22   27
[3,]    3    8   23   28
[4,]    4    9   24   29
[5,]    5   10   25   30

Now we can use the apply function to find the mean of each row as follows:

apply(data, 1, mean)
[1] 13.5 14.5 15.5 16.5 17.5

The second argument, MARGIN, selects the dimension to work over: 1 means rows and 2 means columns. With c(1, 2), the function is applied to each individual cell of the matrix.
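To make the second argument (named MARGIN in R's documentation) concrete, here is a minimal sketch. It rebuilds the matrix from above under the illustrative name m, so the block runs on its own, and then computes column means and a per-cell transformation.

```r
# Rebuild the 5 x 4 example matrix (the name `m` is illustrative)
m <- matrix(c(1:10, 21:30), nrow = 5, ncol = 4)

# MARGIN = 2: apply the function to each column
col_means <- apply(m, 2, mean)
col_means
# [1]  3  8 23 28

# MARGIN = c(1, 2): apply the function to every cell individually,
# which returns a matrix with the same dimensions as the input
doubled <- apply(m, c(1, 2), function(v) v * 2)
doubled[1, ]
# [1]  2 12 42 52
```

For plain row or column means, the built-in rowMeans and colMeans give the same results and are usually faster.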

lapply

lapply is similar to apply, but it works on a list (or vector) instead of a matrix: it applies the function to each element and always returns a list of the same length.
Let’s create a list:

data <- list(x = 1:5, y = 6:10, z = 11:15)
data

$x
[1] 1 2 3 4 5

$y
[1]  6  7  8  9 10

$z
[1] 11 12 13 14 15

Now, we can use lapply to apply a function to each element in the list. For example:

lapply(data, FUN = median)

$x
[1] 3
$y
[1] 8
$z
[1] 13
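lapply is not limited to built-in functions called with their defaults. The sketch below, which uses an illustrative list called scores so that it runs on its own, shows two common variations: passing extra arguments through to the function, and supplying an anonymous function.

```r
# A small list with the same shape as the example above
# (the name `scores` is illustrative)
scores <- list(x = 1:5, y = 6:10, z = 11:15)

# Arguments given after the function are passed on to it for every element
upper_quartile <- lapply(scores, quantile, probs = 0.75)

# An anonymous function works when no built-in does what you need
value_range <- lapply(scores, function(v) max(v) - min(v))
value_range$x
# [1] 4
```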

sapply

sapply works like lapply, but it tries to simplify the result: when every element yields a single value, you get a vector instead of a list.

sapply(data, FUN = median)

 x  y  z 
 3  8 13 
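What sapply returns depends on what the function returns for each element. The following sketch, again using an illustrative list named scores, shows the three cases you are likely to meet: a vector, a matrix, and a list that could not be simplified.

```r
scores <- list(x = 1:5, y = 6:10, z = 11:15)

# One value per element: simplified to a named vector
medians <- sapply(scores, median)

# Two values per element: simplified to a matrix with one column per element
ranges <- sapply(scores, range)
dim(ranges)
# [1] 2 3

# Results of different lengths cannot be simplified, so a list comes back
ragged <- sapply(list(a = 1:2, b = 1:3), function(v) v * 2)
is.list(ragged)
# [1] TRUE
```

Because the return type can change with the data, code that must always get the same type back may be better served by vapply, which makes you declare the expected shape of each result.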

tapply

tapply splits a vector into groups defined by a second vector, usually the levels of a factor, and then applies the function to each group.
For example, in the mtcars dataset:

library(datasets)
tapply(mtcars$wt, mtcars$cyl, mean)

       4        6        8 
2.285727 3.117143 3.999214 

The tapply function first groups the cars together based on the number of cylinders they have, and then calculates the mean weight for each group.
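The two steps described above can be written out by hand, which may help clarify what tapply does internally. This is only a sketch of the idea using split and sapply, not a description of how tapply is actually implemented. The variable names are illustrative.

```r
# mtcars ships with R in the datasets package, which is attached by default

# Step 1: split the weights into one vector per cylinder count
wt_by_cyl <- split(mtcars$wt, mtcars$cyl)

# Step 2: apply the function to each group
manual_means <- sapply(wt_by_cyl, mean)

# tapply does both steps in one call
tapply_means <- tapply(mtcars$wt, mtcars$cyl, mean)

# Compare the values, ignoring attributes such as dimensions
all.equal(as.vector(manual_means), as.vector(tapply_means))
# [1] TRUE
```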

mapply

mapply is a multivariate version of sapply. It applies the function to the first elements of all the arguments, then to the second elements, and so on. For example:

x <- 1:5
b <- 6:10
mapply(sum, x, b)

[1]  7  9 11 13 15

It adds 1 to 6, 2 to 7, and so on.
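The same element-by-element pairing works with any function that takes several arguments. Here is a small sketch with made-up inputs (base and exponent are illustrative names). It also shows MoreArgs, which holds arguments that should stay fixed across all calls.

```r
# Illustrative inputs: three base values and three exponents
base <- c(2, 3, 4)
exponent <- c(3, 2, 1)

# The function receives one element from each vector per call:
# 2^3, then 3^2, then 4^1
powers <- mapply(function(b, e) b^e, base, exponent)
powers
# [1] 8 9 4

# MoreArgs holds arguments that are the same for every call:
# rep(1, times = 2), rep(2, times = 2), rep(3, times = 2)
repeated <- mapply(rep, 1:3, MoreArgs = list(times = 2))
dim(repeated)
# [1] 2 3
```

As with sapply, the results are simplified where possible, which is why the second call returns a matrix with one column per input element.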

Let me know if you have any questions by leaving a comment below or reaching out to me on Twitter.