
Introduction

In this post we will show how to create vectors, factors, lists, matrices and datasets in R.

The vector is a very important tool in R programming. Vectors are the building blocks from which we create matrices and data frames.

Vectors can hold numeric, character or logical values. The function `c()` is used to create vectors in R.

For example, let's create a numeric vector:

    # numeric
    x <- c(1, 3, 2, 5.2, -4, 5, 12)
    x
    #> 1 3 2 5.2 -4 5 12

We can also have a character vector:

    # character
    y <- c("red", "blue", "green", "no color")
    y
    #> "red" "blue" "green" "no color"

Finally, we can create logical vectors:

    # logical
    z <- c(TRUE, TRUE, FALSE)
    z
    #> TRUE TRUE FALSE

Additionally, you can create a vector that combines numeric and character values. Because a vector can hold only one type, R converts every element to character in this case. We can check whether the resulting vector is numeric or character.

    # numeric and character
    x <- c(1, 2.2, "blue")
    x
    #> "1" "2.2" "blue"

    # check if it is numeric
    is.numeric(x)
    #> FALSE

    # check if it is character
    is.character(x)
    #> TRUE
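To make the conversion rule more concrete, here is a small sketch of how `c()` picks a common type. The variable names `mixed`, `num_lgl` and `back` are only illustrative and are not part of the example above.

```r
# Mixing numbers and text: everything becomes character.
mixed <- c(1, 2.2, "blue")
class(mixed)      # "character"

# Mixing logical and numeric values: TRUE/FALSE become 1/0.
num_lgl <- c(TRUE, FALSE, 3)
class(num_lgl)    # "numeric"

# Converting back with as.numeric(): "blue" is not a number,
# so it turns into NA (R normally warns about this; the warning
# is suppressed here to keep the output clean).
back <- suppressWarnings(as.numeric(mixed))
back              # 1.0 2.2 NA
```

If you need to keep values of different types together without conversion, a list is the better choice.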

Sometimes we want to know how many elements a vector has, in other words the length of the vector:

    x <- c(1,2,6,4,7)
    length(x)
    #> 5

A simple way to generate a vector in arithmetic progression is to use the `seq()` function:

    x <- seq(from=2, to=10, by=2)
    x
    #> 2 4 6 8 10
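Besides a fixed step, `seq()` can also be told how many values to produce. The sketch below compares the two forms with the colon operator; the variable names are chosen only for illustration.

```r
# by: fixed step size between consecutive values
evens <- seq(from = 2, to = 10, by = 2)
evens          # 2 4 6 8 10

# length.out: fixed number of equally spaced values
five_points <- seq(from = 0, to = 1, length.out = 5)
five_points    # 0.00 0.25 0.50 0.75 1.00

# the colon operator is shorthand for a step of 1
one_to_five <- 1:5
one_to_five    # 1 2 3 4 5
```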

Factors are similar to vectors in R, but they carry additional meaning: factors have levels. Levels are widely used in medical research. For example, smoking status could have three levels: never smoker, former smoker and current smoker. When we code smoking, we can write 0, 1 and 2 for never, former and current smoker, respectively. To create a factor, use the function `factor()`.

Create a vector with 6 elements:

    s <- c(0, 1, 2, 1, 0, 0)
    s
    #> 0 1 2 1 0 0

To turn this vector into a factor, use the function `factor()`:

    sf <- factor(s)
    sf
    #> 0 1 2 1 0 0
    #> Levels: 0 1 2
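The numeric codes 0, 1 and 2 are not very readable on their own. As a sketch, the `levels` and `labels` arguments of `factor()` can attach the smoking categories described earlier to the codes; the variable name `smoking` is just an illustrative choice.

```r
s <- c(0, 1, 2, 1, 0, 0)

# Map each code to a descriptive label, in the order of `levels`.
smoking <- factor(s,
                  levels = c(0, 1, 2),
                  labels = c("never", "former", "current"))

levels(smoking)         # "never" "former" "current"
as.character(smoking)   # "never" "former" "current" "former" "never" "never"

# Count how many observations fall into each level.
table(smoking)
```

Labelled levels make summaries and plots easier to read, and they make coding mistakes easier to spot.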

When you conduct your analysis make sure that you have coded factors accurately.

Lists are similar to vectors, but unlike vectors, lists can combine different types of objects. For example, suppose we want to create a list holding a medical record. A medical record contains the diagnosis, age and treatment of a patient.

The function to create lists is `list()`.

    x <- list(diagnosis="Gastritis", age=79, medication=TRUE)
    x
    #> $diagnosis
    #> "Gastritis"
    #>
    #> $age
    #> 79
    #>
    #> $medication
    #> TRUE

Now that you have created a list, let's see how to work with it. You may want to access an individual element of the list.

    x$age
    #> 79

    x$medication
    #> TRUE
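The `$` operator is not the only way to reach into a list. The following sketch shows double and single brackets as well, and how an element can be changed or added. The list is called `record` here, and the `ward` element with its value is a made-up example that does not appear in the original data.

```r
record <- list(diagnosis = "Gastritis", age = 79, medication = TRUE)

# Double brackets return the element itself (same as record$age).
age_value <- record[["age"]]
class(age_value)      # "numeric"

# Single brackets return a smaller list containing the element.
age_sublist <- record["age"]
class(age_sublist)    # "list"

# Change an existing element and add a new one (hypothetical field).
record$age <- 80
record$ward <- "B2"
names(record)         # "diagnosis" "age" "medication" "ward"
```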

Sometimes you may want to know the size of the list; for this, use the function `length()`.

    length(x)
    #> 3

Matrices are vectors with two dimensions, so they have rows and columns. To define the number of rows and columns, use the `nrow` and `ncol` arguments of `matrix()`, respectively. Like vectors, matrices can hold numbers, characters or logical values.

    # create matrix with 6 elements
    y <- matrix(1:6, nrow=3, ncol=2)
    y
    #>      [,1] [,2]
    #> [1,]    1    4
    #> [2,]    2    5
    #> [3,]    3    6

You can also give only one of the two dimensions and let R work out the other:

    # create matrix with 10 elements
    y <- matrix(1:10, nrow=2) # the number of rows is 2, so there will be 5 columns
    y
    #>      [,1] [,2] [,3] [,4] [,5]
    #> [1,]    1    3    5    7    9
    #> [2,]    2    4    6    8   10
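Once a matrix exists, you can query its dimensions and pick out single cells, rows or columns. This is a minimal sketch using a 2 x 5 matrix built from the values 1 to 10; note that `nrow()` and `ncol()` are also available as functions that report the dimensions of an existing matrix.

```r
y <- matrix(1:10, nrow = 2)

# Dimensions: rows first, then columns.
dim(y)     # 2 5
nrow(y)    # 2
ncol(y)    # 5

# Indexing uses [row, column]; leave one side empty to take all of it.
y[2, 3]    # 6
y[1, ]     # 1 3 5 7 9
y[, 5]     # 9 10
```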

Another way of creating matrices is to use the functions `cbind()` (column-binding) or `rbind()` (row-binding).

    # create vectors
    x <- 2:5
    y <- 9:12

    # bind by rows
    rbind(x,y)
    #>   [,1] [,2] [,3] [,4]
    #> x    2    3    4    5
    #> y    9   10   11   12

    # bind by columns
    cbind(x,y)
    #>      x  y
    #> [1,] 2  9
    #> [2,] 3 10
    #> [3,] 4 11
    #> [4,] 5 12

You can also create a matrix by defining the vector of cell values together with the names of the columns and rows.

    # create matrix with 4 elements
    cells <- c(2,5,12,30)
    colname <- c("Jan", "Feb")
    rowname <- c("Apple", "Orange")
    y <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rowname, colname))
    y
    #>        Jan Feb
    #> Apple    2   5
    #> Orange  12  30

As you can see above, the argument `byrow=TRUE` fills the cells row by row; you can change it to `FALSE` to fill them column by column.
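To see the effect of `byrow`, it helps to build the same cells both ways and compare. The names `by_row` and `by_col` in this sketch are illustrative only.

```r
cells <- c(2, 5, 12, 30)

# Fill row by row: the first row gets 2 and 5.
by_row <- matrix(cells, nrow = 2, byrow = TRUE)
by_row[1, ]    # 2 5

# Fill column by column (the default): the first column gets 2 and 5.
by_col <- matrix(cells, nrow = 2, byrow = FALSE)
by_col[1, ]    # 2 12

# One layout is simply the transpose of the other.
identical(by_row, t(by_col))    # TRUE
```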

Datasets (data frames) are similar to matrices, but unlike a matrix, a data frame can contain both numeric and character elements. Therefore, a data frame can have one column with numbers and another column with characters. The function used to create data frames is `data.frame()`.

Let’s create a simple dataset.

    hospital <- c("New York", "California")
    patients <- c(150, 350)
    df <- data.frame(hospital, patients)
    df
    #>   hospital patients
    #>   New York      150
    #> California      350

Frequently we are interested in looking at the structure of the dataset we use, and for this we use the function `str()`:

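    str(df)
    #> 'data.frame': 2 obs. of 2 variables:
    #>  $ hospital: Factor w/ 2 levels "California","New York": 2 1
    #>  $ patients: num 150 350

Note that `hospital` appears as a factor because this output comes from a version of R older than 4.0.0. From R 4.0.0 onward, `data.frame()` keeps strings as character vectors by default, so the column is reported as `chr` unless you set `stringsAsFactors = TRUE`.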
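Here is a short sketch of a few common operations on a data frame like this one: reading a column, filtering rows and adding a column. `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version, and the `beds` column with its values is invented purely for illustration.

```r
hospital <- c("New York", "California")
patients <- c(150, 350)
df <- data.frame(hospital, patients, stringsAsFactors = FALSE)

# Read a single column as a vector.
df$patients                      # 150 350

# Keep only the rows that satisfy a condition.
busy <- df[df$patients > 200, ]
busy$hospital                    # "California"

# Add a new column (hypothetical values).
df$beds <- c(40, 90)
ncol(df)                         # 3
```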

Here we end this post. Post a comment if you have any questions.