DataScience+ We publish R tutorials from scientists at academic and scientific institutions with the goal of giving everyone in the world access to free knowledge. Our tutorials cover different topics, including statistics, data manipulation, and visualization.
Introduction

How to create Vectors, Factors, Lists, Matrices and Datasets with R Programming

In this post we will show how to create vectors, factors, lists, matrices, and datasets in R.

Vectors

The vector is a fundamental data structure in R programming. Matrices and data frames are built from vectors.
Vectors can have numeric, character and logical values. The function c() is used to create vectors in R programming.

For example, let's create a numeric vector:

# numeric
x <- c(1, 3, 2, 5.2, -4, 5, 12)
x
1  3  2  5.2 -4  5 12

We can also create a character vector:

# character
y <- c("red", "blue", "green", "no color")
y
"red" "blue" "green" "no color"

Finally, we can create logical vectors:

# logical
z <- c(TRUE, TRUE, FALSE)
z
TRUE TRUE FALSE

Additionally, you can combine numeric and character values in one call to c(). A vector can hold only one type, so R coerces every element to character. We can check whether the resulting vector is numeric or character.

# numeric and character
x <- c(1, 2.2, "blue")
x
"1" "2.2" "blue"
# check if it is numeric
is.numeric(x)
FALSE
# check if it is character
is.character(x)
TRUE
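
The coercion described above can be explored a little further. The following is a minimal sketch, not part of the original post: the variable names x_num and mixed are chosen only for illustration. It shows that converting the coerced vector back to numeric yields NA for strings that are not numbers, and that logical values mixed with numbers become numeric.

```r
# Mixing types in c() coerces every element to the most flexible type.
x <- c(1, 2.2, "blue")
class(x)                      # "character"

# Converting back to numeric turns non-numeric strings into NA.
# as.numeric() emits a warning here, which we suppress for the example.
x_num <- suppressWarnings(as.numeric(x))
x_num                         # 1.0 2.2 NA

# Logical values mixed with numbers are coerced to numeric (TRUE -> 1, FALSE -> 0).
mixed <- c(TRUE, FALSE, 3)
class(mixed)                  # "numeric"
```

If you need to keep values of different types side by side without coercion, a list is the more suitable structure.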

Sometimes we want to know the number of elements in a vector, in other words its length. The function length() returns it.

x <- c(1,2,6,4,7)
length(x)
5

A simple way to generate a vector whose values form an arithmetic progression is the seq() function.

x <- seq(from=2, to=10, by=2)
x
2 4 6 8 10
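
Besides the from/to/by form shown above, base R offers a few related ways to generate regular vectors. The sketch below is an illustrative addition (the names a, b and r are arbitrary) using the colon operator, the length.out argument of seq(), and rep().

```r
# Colon operator: consecutive integers from 1 to 5.
a <- 1:5
a                                   # 1 2 3 4 5

# length.out fixes the number of elements instead of the step size.
b <- seq(from = 0, to = 1, length.out = 5)
b                                   # 0.00 0.25 0.50 0.75 1.00

# rep() repeats the given values a number of times.
r <- rep(c("a", "b"), times = 2)
r                                   # "a" "b" "a" "b"
```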

Factors

Factors are similar to vectors in R, but they represent categorical data: each value belongs to one of a fixed set of levels. Levels are widely used in medical research. For example, smoking status could have 3 levels: never smoker, former smoker, and current smoker. When we code smoking we can write 0, 1, 2 for never, former, and current smoker, respectively. To create a factor, use the function factor().

Create a vector with 6 elements:

s <- c(0, 1, 2, 1, 0, 0)
s

0 1 2 1 0 0

To convert this vector to a factor, use the function factor():

sf <- factor(s)
sf

0 1 2 1 0 0
Levels: 0 1 2

When you conduct your analysis, make sure that you have coded your factors accurately.
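
One way to reduce coding mistakes is to attach readable labels to the numeric codes. The following sketch assumes the 0/1/2 coding for never, former, and current smokers described above; the variable names smoking and counts are illustrative only.

```r
# Smoking status coded as 0 = never, 1 = former, 2 = current.
s <- c(0, 1, 2, 1, 0, 0)

# levels lists the codes that occur; labels gives each code a readable name.
smoking <- factor(s, levels = c(0, 1, 2),
                  labels = c("never", "former", "current"))
smoking
levels(smoking)           # "never" "former" "current"

# table() counts how many observations fall into each level.
counts <- table(smoking)
counts                    # never: 3, former: 2, current: 1
```

With labels in place, summaries such as table() show the category names instead of the raw codes, which makes a wrong coding easier to spot.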

Lists

Lists are also vectors, but unlike the vectors above, lists can combine different types of objects. For example, suppose that we want to create a list holding a medical record. A medical record contains the diagnosis, age, and treatment of a patient.

The function to create lists is list().

x <- list(diagnosis="Gastritis", age=79, medication=TRUE)
x

$diagnosis
"Gastritis"
$age
79
$medication
TRUE

Now that you have created a list, let's see how to work with it. You may want to access an individual element of the list.

x$age
79

x$medication
TRUE

Sometimes you may want to know the size of the list. For this, use the function length().

length(x)
3
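
In addition to the $ operator, list elements can be reached with double and single square brackets. The sketch below is an illustrative addition to the medical record example; the weight element is a hypothetical field added only to show how a list grows.

```r
x <- list(diagnosis = "Gastritis", age = 79, medication = TRUE)

# names() returns the element names.
names(x)               # "diagnosis" "age" "medication"

# Double brackets return the element itself.
x[["age"]]             # 79

# Single brackets return a shorter list that contains the element.
sub <- x["age"]
class(sub)             # "list"

# Assigning to a new name appends an element (weight is a hypothetical field).
x$weight <- 68
length(x)              # 4
```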

Matrices

Matrices are vectors with two dimensions, so a matrix has rows and columns. To define the number of rows and columns, use the nrow and ncol arguments of matrix(), respectively. Like vectors, matrices can hold numeric, character, or logical values, but all elements of one matrix must have the same type.

# create matrix with 6 elements
y <- matrix(1:6, nrow=3, ncol=2)
y
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

You can also specify only one dimension and let R compute the other:

# create matrix with 10 elements
y <- matrix(1:10, nrow=2)
# the number of rows is 2, so there will be 5 columns
y
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
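
Once a matrix exists, its dimensions and individual cells can be inspected. The following is a small sketch on the 2 x 5 matrix created above, using the base R function dim() and bracket indexing with a row index followed by a column index.

```r
y <- matrix(1:10, nrow = 2)

# dim() returns the number of rows and columns.
dim(y)        # 2 5

# Index with [row, column]; leave one side empty to take a whole row or column.
y[2, 3]       # 6
y[1, ]        # 1 3 5 7 9
y[, 5]        # 9 10
```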

Another way of creating matrices is to combine vectors with the column-binding function cbind() or the row-binding function rbind().

# create vectors
x <- 2:5
y <- 9:12
# bind the vectors as rows
rbind(x,y)
  [,1] [,2] [,3] [,4]
x    2    3    4    5
y    9   10   11   12
# bind the vectors as columns
cbind(x,y)
     x  y
[1,] 2  9
[2,] 3 10
[3,] 4 11
[4,] 5 12

You can also create a matrix by defining the vector of values together with the names of the rows and columns.

# create matrix with 4 elements
cells <- c(2,5,12,30)
colname <- c("Jan", "Feb")
rowname <- c("Apple", "Orange")
y <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rowname, colname))
y
       Jan Feb
Apple    2   5
Orange  12  30

As you see above, the argument byrow=TRUE fills the matrix row by row. The default is byrow=FALSE, which fills it column by column.
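
To make the effect of byrow concrete, the sketch below builds the same four values both ways, without dimnames. The names by_row and by_col are illustrative only.

```r
cells <- c(2, 5, 12, 30)

# Fill row by row: the first row is 2 5, the second row is 12 30.
by_row <- matrix(cells, nrow = 2, byrow = TRUE)
by_row[1, ]                   # 2 5

# Fill column by column (the default): the first column is 2 5.
by_col <- matrix(cells, nrow = 2, byrow = FALSE)
by_col[1, ]                   # 2 12

# The two results are transposes of each other.
all(by_col == t(by_row))      # TRUE
```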

Datasets

Datasets (data frames) are similar to matrices, but unlike a matrix, a data frame can hold columns of different types. For example, a data frame can have one column with numbers and another column with character values. The function used to create data frames is data.frame().

Let’s create a simple dataset.

hospital <- c("New York", "California")
patients <- c(150, 350)
df <- data.frame(hospital, patients)
df

hospital   patients
New York        150
California      350

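Frequently we are interested in looking at the structure of the dataset we use, and for this we use the function str(). The output shown here comes from an R version before 4.0.0, where data.frame() converted character columns to factors by default. Since R 4.0.0, the hospital column is reported as chr unless you set stringsAsFactors=TRUE.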

str(df)

'data.frame':	2 obs. of  2 variables:
 $ hospital: Factor w/ 2 levels "California","New York": 2 1
 $ patients: num  150 350

Here we end this post. Post a comment if you have any questions.