In this post, we will show how to create vectors, factors, lists, matrices and datasets in R.
The vector is a very important tool in R programming. Matrices and data frames are built from vectors.
Vectors can have numeric, character and logical values. The function
c() is used to create vectors in R programming.
For example, let's create a numeric vector:
# numeric
x <- c(1, 3, 2, 5.2, -4, 5, 12)
x

1 3 2 5.2 -4 5 12
Also, we can have a character vector:
# character
y <- c("red", "blue", "green", "no color")
y

"red" "blue" "green" "no color"
Finally, we can create logical vectors:
# logical
z <- c(TRUE, TRUE, FALSE)
z

TRUE TRUE FALSE
Additionally, you can create a vector that combines numeric and character values. A vector can hold only one type, so R converts the numbers to character strings. We can check whether the resulting vector is numeric or character.
# numeric and character
x <- c(1, 2.2, "blue")
x
# check if it is numeric
is.numeric(x)
# check if it is character
is.character(x)

"1" "2.2" "blue"
FALSE
TRUE
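The conversion described above can be inspected directly. The following is a minimal sketch, not part of the original post; the variable names mixed, nums and flags are illustrative, and class() and as.numeric() are base R functions the post does not otherwise use.

```r
# Mixing types in c() converts every element to the most general type.
mixed <- c(1, 2.2, "blue")
class(mixed)   # "character"

# Converting back to numbers turns non-numeric strings into NA
# (R also emits a warning, silenced here).
nums <- suppressWarnings(as.numeric(mixed))

# Logical values mixed with numbers become 1 (TRUE) and 0 (FALSE).
flags <- c(TRUE, FALSE, 3)
class(flags)   # "numeric"
```

This is why it is worth checking a vector's type before doing arithmetic on it: a single stray string silently changes the whole vector.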
Sometimes we want to know the number of elements in a vector, in other words its length. The function length() returns it.
x <- c(1,2,6,4,7)
length(x)

5
A simple way to generate a vector in arithmetic progression is to use the
seq() function.
x <- seq(from=2, to=10, by=2)
x

2 4 6 8 10
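As a hedged sketch of two related ways to build the same kind of sequence: the length.out argument of seq() and the colon operator are standard base R, but the post itself only shows the by argument, and the variable names below are illustrative.

```r
# Fix the step size, as in the post's example.
by_step <- seq(from = 2, to = 10, by = 2)

# Fix the number of elements instead and let R compute the step.
by_count <- seq(from = 2, to = 10, length.out = 5)

# For steps of 1, the colon operator is a shorthand.
ones <- 1:5
```

Using length.out is convenient when you know how many points you need but not the spacing between them.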
Factors are similar to vectors in R, but they carry extra meaning: a factor has levels, a fixed set of categories that its values can take. Levels are widely used in medical research. For example, smoking status could have three levels: never smoker, former smoker, and current smoker. When we code smoking status, we can write 0, 1, and 2 for never, former, and current smoker, respectively. To create a factor, the function
factor() is used.
Create a vector with 6 elements:
s <- c(0, 1, 2, 1, 0, 0)
s

0 1 2 1 0 0
To turn this vector into a factor, use the function factor():
sf <- factor(s)
sf

0 1 2 1 0 0
Levels: 0 1 2
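The numeric codes can also be given readable names. The sketch below assumes the coding described earlier (0 = never, 1 = former, 2 = current smoker); the levels and labels arguments of factor() are standard base R, and the label strings and the name smoking are chosen here for illustration.

```r
s <- c(0, 1, 2, 1, 0, 0)

# Map each numeric code to a descriptive label.
smoking <- factor(s,
                  levels = c(0, 1, 2),
                  labels = c("never", "former", "current"))

levels(smoking)   # the three labels, in the order given
table(smoking)    # how many observations fall in each level
```

Labelled levels make tables and model output easier to read, and they make a miscoded value easier to spot.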
When you conduct your analysis, make sure that you have coded your factors accurately.
Lists are like vectors, but unlike vectors, lists can combine different types of objects. For example, let's suppose that we want to create a list of medical records. Medical records contain the diagnosis, age, and treatment of patients.
The function to create lists is list().
x <- list(diagnosis="Gastritis", age=79, medication=TRUE)
x

$diagnosis
"Gastritis"

$age
79

$medication
TRUE
Now that you have created a list, let's see how we can work with it. You may want to access an individual element of the list.
x$age
x$medication

79
TRUE
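The $ operator is one of several ways to reach into a list. As a minimal sketch reusing the same medical-record list, double and single square brackets (standard base R, not shown in the post) behave differently:

```r
x <- list(diagnosis = "Gastritis", age = 79, medication = TRUE)

# Double brackets return the element itself, like x$age.
age_value <- x[["age"]]

# Single brackets return a shorter list that still wraps the element.
age_sublist <- x["age"]

# Elements can also be accessed by position.
first_element <- x[[1]]
```

Double brackets are the usual choice when the element name is stored in a variable, since $ only accepts a literal name.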
Sometimes you may want to know the size of the list, and for this use the function length().
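As a minimal sketch, length() reports how many elements a list holds, and names() (not mentioned in the post, added here for illustration) returns their names:

```r
x <- list(diagnosis = "Gastritis", age = 79, medication = TRUE)

# Number of top-level elements in the list.
n_elements <- length(x)

# Names of those elements, in order.
element_names <- names(x)
```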
Matrices are vectors with two dimensions; therefore, matrices have rows and columns. To define the number of rows and columns, use the arguments nrow and
ncol, respectively. Similarly to vectors, matrices can hold numbers, characters, or logical values.
# create matrix with 6 elements
y <- matrix(1:6, nrow=3, ncol=2)
y

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
Or you can give only one dimension and let R work out the other.
# create matrix with 10 elements
y <- matrix(1:10, nrow=2)
# the number of rows is 2, so the number of columns will be 5
y

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
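To confirm the shape R chose, the dimensions can be queried after the matrix is built. This is a small sketch using the base R functions dim(), nrow() and ncol(), which the post does not call directly; the variable names are illustrative.

```r
y <- matrix(1:10, nrow = 2)

# Both dimensions at once: rows first, then columns.
shape <- dim(y)

# Each dimension separately.
n_rows <- nrow(y)
n_cols <- ncol(y)

# Single elements are indexed as [row, column].
value <- y[2, 3]
```

Because the matrix is filled column by column, the element in row 2, column 3 is the sixth value of the input vector.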
Another way of creating matrices is to use the column-binding function
cbind() or the row-binding function rbind().
# create vectors
x <- 2:5
y <- 9:12
# bind the vectors as rows
rbind(x,y)
# bind the vectors as columns
cbind(x,y)

  [,1] [,2] [,3] [,4]
x    2    3    4    5
y    9   10   11   12

     x  y
[1,] 2  9
[2,] 3 10
[3,] 4 11
[4,] 5 12
You can also create a matrix by defining the vector of values together with the names of the rows and columns.
# create matrix with 4 elements
cells <- c(2,5,12,30)
colname <- c("Jan", "Feb")
rowname <- c("Apple", "Orange")
y <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rowname, colname))
y

       Jan Feb
Apple    2   5
Orange  12  30
As you see above, the argument
byrow=TRUE fills the matrix row by row; you can set it to
FALSE (the default) to fill it column by column instead.
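The effect of byrow is easiest to see side by side. A minimal sketch with the same cell values (the names by_row and by_col are illustrative, and row and column names are left out to keep the comparison short):

```r
cells <- c(2, 5, 12, 30)

# Fill row by row: the first row is 2, 5 and the second is 12, 30.
by_row <- matrix(cells, nrow = 2, byrow = TRUE)

# Fill column by column (the default): the first column is 2, 5.
by_col <- matrix(cells, nrow = 2, byrow = FALSE)

# For a square matrix like this one, each result is the transpose of the other.
same_as_transpose <- identical(by_col, t(by_row))
```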
Datasets (data frames) are similar to matrices, but unlike a matrix, a data frame can contain both numeric and character columns. Therefore, a data frame can have one column with numbers and another column with characters. The function used to create data frames is data.frame().
Let’s create a simple dataset.
hospital <- c("New York", "California")
patients <- c(150, 350)
df <- data.frame(hospital, patients)
df

    hospital patients
1   New York      150
2 California      350
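Frequently we are interested in looking at the structure of the dataset we use, and for this we use the function str().
str(df)

'data.frame': 2 obs. of 2 variables:
 $ hospital: Factor w/ 2 levels "California","New York": 2 1
 $ patients: num 150 350

Note that hospital appears as a factor because this output comes from an R version earlier than 4.0.0. Since R 4.0.0, data.frame() keeps character columns as character (chr) unless stringsAsFactors = TRUE is set.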
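Whether the hospital column is stored as a factor or as plain character depends on the stringsAsFactors argument of data.frame(), whose default changed in R 4.0.0. The sketch below sets it explicitly so the result does not depend on the R version; the names df_chr and df_fct are illustrative.

```r
hospital <- c("New York", "California")
patients <- c(150, 350)

# Keep text columns as character.
df_chr <- data.frame(hospital, patients, stringsAsFactors = FALSE)

# Convert text columns to factors, as in the str() output shown in the post.
df_fct <- data.frame(hospital, patients, stringsAsFactors = TRUE)

class(df_chr$hospital)   # "character"
class(df_fct$hospital)   # "factor"

# A column can be pulled out with $, just like a list element.
total_patients <- sum(df_chr$patients)
```

Setting the argument explicitly is a reasonable habit when code has to run on both older and newer R installations.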
Here we end this post. Post a comment if you have any questions.