In this article, I will demonstrate how to use the apply family of functions in R. They let you run a function over the elements of a matrix, list, or vector without writing an explicit loop.
apply
apply applies a function to the rows or columns of a matrix.
For example, let’s create a sample dataset:
    data <- matrix(c(1:10, 21:30), nrow = 5, ncol = 4)
    data
         [,1] [,2] [,3] [,4]
    [1,]    1    6   21   26
    [2,]    2    7   22   27
    [3,]    3    8   23   28
    [4,]    4    9   24   29
    [5,]    5   10   25   30
Now we can use the apply function to find the mean of each row as follows:
    apply(data, 1, mean)
    [1] 13.5 14.5 15.5 16.5 17.5
The second argument, MARGIN, is the dimension to apply the function over: 1 means rows and 2 means columns. With c(1, 2), the function is applied to every element of the matrix individually.
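As a minimal sketch of the other two settings, the snippet below rebuilds the same matrix and tries MARGIN = 2 and MARGIN = c(1, 2). The names m, col_means, and doubled are only illustrative.

```r
# Same matrix as in the example above; `m` is an illustrative name.
m <- matrix(c(1:10, 21:30), nrow = 5, ncol = 4)

# MARGIN = 2: apply the function to each column.
col_means <- apply(m, 2, mean)
col_means
# [1]  3  8 23 28

# MARGIN = c(1, 2): apply the function to every cell individually.
# The result keeps the shape of the input matrix.
doubled <- apply(m, c(1, 2), function(v) v * 2)
dim(doubled)
# [1] 5 4
```

For simple arithmetic like doubling, m * 2 is the more direct choice. The cell-wise form is mainly useful when the function is not already vectorized.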
lapply
lapply is similar to apply, but it takes a list (or vector) as input and always returns a list, with one result per input element.
Let’s create a list:
    data <- list(x = 1:5, y = 6:10, z = 11:15)
    data
    $x
    [1] 1 2 3 4 5

    $y
    [1]  6  7  8  9 10

    $z
    [1] 11 12 13 14 15
Now, we can use lapply to apply a function to each element in the list. For example:
    lapply(data, FUN = median)
    $x
    [1] 3

    $y
    [1] 8

    $z
    [1] 13
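The function does not have to be a built-in one. The sketch below, which uses the illustrative names values, spreads, and trimmed, passes an anonymous function and then forwards an extra argument to mean.

```r
# Same list as above, under an illustrative name.
values <- list(x = 1:5, y = 6:10, z = 11:15)

# An anonymous function: the spread (max minus min) of each element.
spreads <- lapply(values, function(v) max(v) - min(v))
spreads$x
# [1] 4

# Arguments given after the function are forwarded to it on every call,
# so this computes mean(v, trim = 0.2) for each element.
trimmed <- lapply(values, mean, trim = 0.2)
trimmed$y
# [1] 8
```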
sapply
sapply works like lapply, but simplifies the result to a vector (or a matrix) when it can, and falls back to a list when it cannot.
    sapply(data, FUN = median)
     x  y  z
     3  8 13
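What sapply returns depends on what the function returns for each element. Here is a small sketch of the three cases, with illustrative names throughout:

```r
# Same list as above, under an illustrative name.
values <- list(x = 1:5, y = 6:10, z = 11:15)

# One value per element: the result is a named vector.
medians <- sapply(values, median)

# Two values per element (min and max): the result is a matrix
# with one column per list element.
ranges <- sapply(values, range)
dim(ranges)
# [1] 2 3

# Results of different lengths cannot be simplified, so a list comes back.
ragged <- sapply(values, function(v) v[v > 7])
is.list(ragged)
# [1] TRUE
```

Because the return type can change with the data, it is worth checking the shape of the result before relying on it in later code.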
tapply
tapply splits a vector into groups defined by a second vector, usually a factor, and then applies the function to each group.
For example, in the mtcars dataset:
    library(datasets)
    tapply(mtcars$wt, mtcars$cyl, mean)
           4        6        8
    2.285727 3.117143 3.999214
The tapply function first groups the cars together based on the number of cylinders they have, and then calculates the mean weight for each group.
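To make the two steps visible, the grouping and the calculation can be written out separately with split and sapply. This is only a sketch of the idea, not how tapply is implemented internally, and the names groups, by_hand, and with_tapply are illustrative.

```r
# mtcars comes from the datasets package, which R attaches by default.

# Step 1: split the weights into one vector per cylinder count.
groups <- split(mtcars$wt, mtcars$cyl)
names(groups)
# [1] "4" "6" "8"

# Step 2: apply the function to each group.
by_hand <- sapply(groups, mean)

# The two-step version gives the same numbers as tapply.
with_tapply <- tapply(mtcars$wt, mtcars$cyl, mean)
all.equal(as.vector(by_hand), as.vector(with_tapply))
# [1] TRUE
```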
mapply
mapply is a multivariate version of sapply. It applies the specified function to the first elements of all the arguments, then to the second elements, and so on. For example:
    x <- 1:5
    b <- 6:10
    mapply(sum, x, b)
    [1]  7  9 11 13 15
It adds 1 to 6, 2 to 7, and so on.
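The same pairing works with any function that takes several arguments. A short sketch, using the illustrative names products and repeats:

```r
x <- 1:5
b <- 6:10

# A two-argument function, called once per pair: (1, 6), (2, 7), ...
products <- mapply(function(left, right) left * right, x, b)
products
# [1]  6 14 24 36 50

# When the results differ in length, mapply returns a list instead
# of a vector: rep(1, 3), rep(2, 2), rep(3, 1).
repeats <- mapply(rep, 1:3, 3:1)
is.list(repeats)
# [1] TRUE
```

For plain element-wise arithmetic, x + b or x * b is simpler and faster. mapply is more useful when the function is not vectorized over its arguments.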
Let me know if you have any questions by leaving a comment below or reaching out to me on Twitter.