In this post, we will show how to create vectors, factors, lists, matrices, and datasets (data frames) in R.

Vectors

The vector is a very important tool in R programming. Matrices and data frames are built from vectors.
Vectors can hold numeric, character, or logical values. The function c() is used to create vectors in R.

For example, let's create a numeric vector:

# numeric
x <- c(1, 3, 2, 5.2, -4, 5, 12)
x
1.0  3.0  2.0  5.2 -4.0  5.0 12.0

We can also create a character vector:

# character
y <- c("red", "blue", "green", "no color")
y
"red" "blue" "green" "no color"

Finally, we can create logical vectors:

# logical
z <- c(TRUE, TRUE, FALSE)
z
TRUE TRUE FALSE

Additionally, you can create a vector that combines numeric and character values. Because a vector can hold only one type, R converts every element to character. We can confirm this by checking whether the vector is numeric or character.

# numeric and character
x <- c(1, 2.2, "blue")
x
# check if it is numeric
is.numeric(x)
# check if it is character
is.character(x)
"1" "2.2" "blue"
FALSE
TRUE
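The conversion seen above follows a fixed order in base R: logical values are promoted to numeric, and numeric values are promoted to character. The following is a minimal sketch of that behaviour; the variable names a, b and b_num are illustrative and not part of the original example.

```r
# logical + numeric -> numeric (TRUE becomes 1)
a <- c(TRUE, 2, 3.5)
class(a)   # "numeric"

# numeric + character -> character
b <- c(1, 2.2, "blue")
class(b)   # "character"

# converting back to numeric: strings that are not numbers become NA
# (R raises a warning here, which we silence for the sketch)
b_num <- suppressWarnings(as.numeric(b))
is.na(b_num)   # FALSE FALSE TRUE
```

If you need to keep values of different types side by side without conversion, a list or a data frame is usually the better choice.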

Sometimes we want to know how many elements a vector has, in other words, the length of the vector. The function length() returns it.

x <- c(1,2,6,4,7)
length(x)
5

A simple way to generate a vector whose values form an arithmetic progression is the seq() function.

x <- seq(from=2, to=10, by=2)
x
2 4 6 8 10
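Besides a fixed step, base R offers two other common ways to build a sequence: the colon operator for steps of 1, and the length.out argument of seq() when you know how many elements you want rather than the step. A short sketch (the names a and b are illustrative):

```r
# the colon operator creates an integer sequence with step 1
a <- 1:5
a            # 1 2 3 4 5

# length.out fixes the number of elements; R computes the step
b <- seq(from = 0, to = 1, length.out = 5)
b            # 0.00 0.25 0.50 0.75 1.00
length(b)    # 5
```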

Factors

Factors are similar to vectors in R, but they carry an additional meaning: a factor represents a categorical variable, and its possible values are called levels. Levels are widely used in medical research. For example, smoking status could have 3 levels: never smoker, former smoker, and current smoker. When we code smoking status, we can write 0, 1, and 2 for never, former, and current smoker, respectively. To create a factor, use the function factor().

Create a vector with 6 elements:

s <- c(0, 1, 2, 1, 0, 0)
s
0 1 2 1 0 0

To convert this vector into a factor, use the function factor():

sf <- factor(s)
sf
0 1 2 1 0 0
Levels: 0 1 2

When you conduct your analysis, make sure that you have coded your factors accurately.
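One way to reduce coding mistakes is to attach descriptive labels to the numeric codes, using the levels and labels arguments of factor(). The sketch below reuses the smoking example; the label strings are our own choice for illustration.

```r
s <- c(0, 1, 2, 1, 0, 0)

# map the codes 0, 1, 2 to readable labels, in that order
sf <- factor(s,
             levels = c(0, 1, 2),
             labels = c("never", "former", "current"))

levels(sf)        # "never" "former" "current"
as.character(sf)  # "never" "former" "current" "former" "never" "never"

# table() counts how often each level occurs: never 3, former 2, current 1
counts <- table(sf)
```

With labels in place, summaries and plots show "never", "former", and "current" instead of bare numbers, which makes a miscoded category easier to spot.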

Lists

Lists are similar to vectors, but unlike vectors, lists can combine different types of objects. For example, let's suppose that we want to create a list for a medical record. A medical record contains the diagnosis, age, and treatment of a patient.

The function to create lists is list().

x <- list(diagnosis="Gastritis", age=79, medication=TRUE)
x
$diagnosis
"Gastritis"
$age
79
$medication
TRUE

Now that you have created a list, let's see how we can work with it. You may want to access an individual element of the list, which you can do with the $ operator.

x$age
x$medication
79
TRUE
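The $ operator is not the only way to reach a list element. In base R, double square brackets return the element itself (by name or by position), while single square brackets return a shorter list. A minimal sketch using the same medical record; the name sub is illustrative:

```r
x <- list(diagnosis = "Gastritis", age = 79, medication = TRUE)

x[["age"]]   # 79, the same as x$age
x[[1]]       # "Gastritis", access by position

# single brackets keep the result wrapped in a list
sub <- x["age"]
class(sub)   # "list"

# names() lists the element names
names(x)     # "diagnosis" "age" "medication"
```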

Sometimes you may want to know the size of the list. For this, use the function length().

length(x)
3

Matrices

Matrices are vectors with two dimensions; therefore, matrices have rows and columns. To define the number of rows and columns, use the nrow and ncol arguments of matrix(), respectively. Like vectors, matrices can hold numeric, character, or logical values.

# create matrix with 6 elements
y <- matrix(1:6, nrow=3, ncol=2)
y
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

Alternatively, you can specify only the number of rows and let R work out the number of columns.

# create matrix with 10 elements
y <- matrix(1:10, nrow=2)
# the number of rows is 2, so the number of columns will be 5
y
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
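Once a matrix exists, base R can report its dimensions and extract individual cells or whole rows. The following sketch reuses the 2 x 5 matrix from the example above:

```r
y <- matrix(1:10, nrow = 2)

dim(y)    # 2 5  (rows, then columns)
nrow(y)   # 2
ncol(y)   # 5

# index with [row, column]
y[2, 3]   # 6

# leave the column index empty to get a whole row
y[1, ]    # 1 3 5 7 9
```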

Another way of creating matrices is to combine vectors with the column-binding function cbind() or the row-binding function rbind().

# create vectors
x <- 2:5
y <- 9:12
# bind the vectors as rows
rbind(x,y)
# bind the vectors as columns
cbind(x,y)
  [,1] [,2] [,3] [,4]
x    2    3    4    5
y    9   10   11   12

     x  y
[1,] 2  9
[2,] 3 10
[3,] 4 11
[4,] 5 12

You can also create a matrix by defining a vector of values together with the names of the rows and columns.

# create matrix with 4 elements
cells <- c(2,5,12,30)
colname <- c("Jan", "Feb")
rowname <- c("Apple", "Orange")
y <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rowname, colname))
y
       Jan Feb
Apple    2   5
Orange  12  30

As you can see above, the argument byrow=TRUE fills the matrix row by row. You can set it to FALSE (the default) to fill the matrix column by column instead.
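To make the effect of byrow concrete, here is a small sketch that builds the same fruit matrix both ways and reads a cell by its row and column names. The names by_row, by_col and dn are illustrative.

```r
cells <- c(2, 5, 12, 30)
dn <- list(c("Apple", "Orange"), c("Jan", "Feb"))

# filled row by row: first row is 2, 5
by_row <- matrix(cells, nrow = 2, ncol = 2, byrow = TRUE, dimnames = dn)

# filled column by column: first column is 2, 5
by_col <- matrix(cells, nrow = 2, ncol = 2, byrow = FALSE, dimnames = dn)

# the same cell holds a different value depending on the fill order
by_row["Orange", "Jan"]   # 12
by_col["Orange", "Jan"]   # 5
```

Checking one or two named cells like this is a quick way to confirm that the values ended up where you intended.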

Datasets

Datasets (data frames) are similar to matrices, but whereas all elements of a matrix share a single type, each column of a data frame can have its own type. Therefore, a data frame can have one column with numbers and another column with character values. The function used to create data frames is data.frame().

Let’s create a simple dataset.

hospital <- c("New York", "California")
patients <- c(150, 350)
df <- data.frame(hospital, patients)
df
hospital   patients
New York        150
California      350
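Columns and cells of a data frame can be read much like list elements and matrix cells. A minimal sketch using the same hospital data and base R only:

```r
hospital <- c("New York", "California")
patients <- c(150, 350)
df <- data.frame(hospital, patients)

# a column is accessed with $, like a list element
df$patients         # 150 350

# a single cell is accessed with [row, column]
df[2, "patients"]   # 350

# dimensions and column names
nrow(df)            # 2
ncol(df)            # 2
names(df)           # "hospital" "patients"

# columns are ordinary vectors, so vector functions work on them
sum(df$patients)    # 500
```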

Frequently we are interested in the structure of the dataset we are using. For this, we use the function str():

str(df)
'data.frame':	2 obs. of  2 variables:
 $ hospital: Factor w/ 2 levels "California","New York": 2 1
 $ patients: num  150 350
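Note that the str() output above, where hospital is a Factor, is what R versions before 4.0.0 produce. Since R 4.0.0, data.frame() keeps character columns as character by default. If the distinction matters for your analysis, you can state it explicitly with the stringsAsFactors argument. A short sketch (df_chr and df_fct are illustrative names):

```r
hospital <- c("New York", "California")
patients <- c(150, 350)

# keep hospital as a character column
df_chr <- data.frame(hospital, patients, stringsAsFactors = FALSE)
class(df_chr$hospital)    # "character"

# convert hospital to a factor, as in the output above
df_fct <- data.frame(hospital, patients, stringsAsFactors = TRUE)
class(df_fct$hospital)    # "factor"
levels(df_fct$hospital)   # "California" "New York"
```

Setting the argument explicitly makes a script behave the same way regardless of the R version it runs on.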

Here we end this post. Post a comment if you have any questions.