# Getting Started with R - Part 5: Matrices - Creating, Filling and Subsetting

Often we need to arrange data in a structure with rows and columns, or to represent a collection of vectors. R has a built-in matrix type that gives us exactly this kind of basic row-and-column structure.

I am posting this tutorial as I learn R. I will respond to feedback for errata in the comments.

## Constructing a matrix with the `matrix` function

Before we start, let's look at the basics in the online help by executing `?matrix`. Doing that reveals the signature of the function:

``````
matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
       dimnames = NULL)
``````

Reading further in the help you’ll see we can supply a vector for the data, or use a type that can be coerced to a vector. We can specify the number of rows and columns and how the vector should be used to fill the matrix, and we can supply dimension names (which I’ll cover a bit later). For now let us create a matrix.

If we simply call `matrix()`, every parameter takes its default value, which is what follows each `=` in the signature shown in the help file. `=` can also act as an assignment operator much like `<-`, but inside a signature or a call it has a narrower job: it pairs a parameter name with a value. So our simple `matrix()` call will create a 1 by 1 matrix with an `NA` value in it.

``````
matrix()
``````

yields

``````
     [,1]
[1,]   NA
``````

What on earth are those commas? Let us create a matrix with 1 row and 3 columns. Notice that I can set selected arguments by name, in the form `argument_name=value`, without needing to list them all in order.

Note: Here `=` binds a value to the `ncol` parameter, and that binding is limited to the call. It does not create a variable in our workspace, so we cannot read the value 3 back from `ncol` at the top level after the `matrix` call.
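To make that distinction concrete, here is a minimal sketch. The name `my_cols` is hypothetical and only there for illustration; the sketch contrasts naming an argument with `=` against a real assignment with `<-` placed inside a call.

```r
# `=` inside a call only names the argument for this one call.
m <- matrix(ncol = 3)
dim(m)        # 1 3

# `<-` inside a call is a real assignment: it creates `my_cols` in the
# workspace, and its value is passed positionally (as `data`, not `ncol`).
m2 <- matrix(my_cols <- 3)
my_cols       # 3
dim(m2)       # 1 1
```

With that out of the way, here is the 1 by 3 matrix: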

``````
vectorlike_matrix <- matrix(ncol=3)
vectorlike_matrix
``````

yields

``````
     [,1] [,2] [,3]
[1,]   NA   NA   NA
``````

Now we have some more clarity. Matrix values are indexed by row, then column, in this fashion: `[row,column]`. To make this clearer, let us add another row. To help us, I need to introduce some new functions.

## `rbind()` and `cbind()`

Simply put, these add rows or columns to a matrix, or combine matrices. Let's combine the `vectorlike_matrix` above with itself to create a two by three matrix.

``````
a_proper_matrix <- rbind(vectorlike_matrix, vectorlike_matrix)
a_proper_matrix
``````

yields what we expect:

``````
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
``````

`rbind` and `cbind` are so similar that the online help shares a single entry for them. I’ll let you explore `cbind` by yourself.
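If you want a head start, here is a small sketch of `cbind` next to `rbind`. The variable names `wide` and `from_vectors` are mine, chosen only for illustration.

```r
vectorlike_matrix <- matrix(ncol = 3)

# cbind glues matrices side by side: 1 x 3 next to 1 x 3 gives 1 x 6.
wide <- cbind(vectorlike_matrix, vectorlike_matrix)
dim(wide)            # 1 6

# Both functions also accept plain vectors; rbind treats each one as a row.
from_vectors <- rbind(1:3, 4:6)
dim(from_vectors)    # 2 3
from_vectors[2, 1]   # 4
```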

## Filling our matrix

When we created our matrix we didn’t specify data at creation time. We could have done so using a vector that would be unpacked row-wise or column-wise depending on the `byrow` parameter. The `byrow` parameter is `FALSE` if not specified, so let us see what that looks like. Also, if we provide data we only need to specify either the column or row count; R works out the other from the length of the data. No need to do extra work.

``````
two_by_three <- matrix(1:6, nrow=2)
two_by_three
``````

We can see that the data was filled by column, as expected:

``````
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
``````
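For contrast, here is a short sketch of the same data unpacked with `byrow = TRUE`. The names `by_column` and `by_row` are only illustrative.

```r
by_column <- matrix(1:6, nrow = 2)                # default: byrow = FALSE
by_row    <- matrix(1:6, nrow = 2, byrow = TRUE)  # fill one row at a time
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# Giving only ncol works too; R derives nrow from the data length.
dim(matrix(1:6, ncol = 3))   # 2 3
```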

## Basic indexing

The `[]` operator on a matrix looks suspiciously like the one we had for vectors, so let's try it out. Try indexing from 1 to 6 on our `two_by_three` and you will see that we get our values back with a single index, without specifying two coordinates. For instance, `two_by_three[4]` yields 4. In case you are wondering, this has nothing to do with the `byrow` option during construction. R stores a matrix column by column, so single indexing always walks down the first column, then the second, and so on, regardless of how the data was unpacked; the values come back as 1 to 6 here only because we also filled by column. This is great news, as there are many cases where the two-coordinate indexing of matrices just gets in the way. Most of the time, though, it's still easier to index by row and column.
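To see the order that single indexing follows, it helps to use a matrix that was filled by row. This is a minimal sketch; `by_row` is a name I made up for it.

```r
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

by_row[2]     # 4, the second cell going down column 1
by_row[1:6]   # 1 4 2 5 3 6
```

Two-coordinate indexing, by contrast, names the row and column explicitly: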

``````
two_by_three[2,3]
``````

returns the value 6 as expected. But what if we want to work with rows or columns and treat them as vectors? Our output above describing the matrix already revealed how this is done: simply omit the index for the dimension you want to include completely. For instance, `[,2]` gives me all rows, and only column two. This is similar to what we saw in our previous lesson when we set all of a vector’s values. We wanted to update every value without replacing the vector itself, so we used empty brackets `[]` to select all values to be updated. We can do the same here.

Say we want to fill our matrix. We can use the `rep()` function to repeat a value…

``````
zero_matrix <- matrix(rep(0.0, 12), nrow=3)
``````

…or we could have set all the values later, like this:

``````
fill_me_up <- matrix(nrow=3, ncol=4)
fill_me_up[] <- 0
fill_me_up
``````
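As a quick check that these routes end up in the same place, the sketch below builds the zero matrix three ways. The third form, passing a single value as `data` and letting `matrix()` recycle it, is not used above; I include it only as another base R option. All variable names are illustrative.

```r
via_rep <- matrix(rep(0.0, 12), nrow=3)

via_replace <- matrix(nrow=3, ncol=4)      # starts out full of NA
via_replace[] <- 0                         # empty brackets: every cell

via_recycle <- matrix(0, nrow=3, ncol=4)   # a single value is recycled

identical(via_rep, via_replace)   # TRUE
identical(via_rep, via_recycle)   # TRUE
```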

It would have been perfectly legal to use `fill_me_up[,] <- 0`. We can extend this idea to setting entire rows or columns as vectors:

``````
fill_me_up[,1] <- 1
fill_me_up
``````

will set every value in the first column to 1.

We can also set a specific cell with basic indexing, as you’d expect:

``````
fill_me_up[2,1] <- 2
fill_me_up
``````

So far I’ve mainly shown the replace side of the extract/replace `[]` operator. You can extract columns, rows, and cells in the same way as our replace operations. The only difference is that instead of setting values we yield them or assign them to other variables.

``````
row_one <- fill_me_up[1,]
column_two <- fill_me_up[,2]
cell_3_2 <- fill_me_up[3, 2]
fill_me_up[2, 2]
``````
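One detail worth knowing here: a single extracted row or column comes back as a plain vector, not a one-row matrix. The sketch below rebuilds `fill_me_up` so it runs on its own and shows the `drop = FALSE` argument of `[`, which keeps the matrix shape. The name `row_one_matrix` is hypothetical.

```r
fill_me_up <- matrix(0, nrow = 3, ncol = 4)
fill_me_up[, 1] <- 1
fill_me_up[2, 1] <- 2

row_one <- fill_me_up[1, ]
row_one               # 1 0 0 0
is.matrix(row_one)    # FALSE: a single row is returned as a vector

# drop = FALSE keeps the result as a 1 x 4 matrix instead.
row_one_matrix <- fill_me_up[1, , drop = FALSE]
dim(row_one_matrix)   # 1 4
```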

## Using the extract/replace operator beyond just indexing

In Part 4: Vector Extracting, Replacing and Excluding, we saw that we can use combined integer indexes; for example, `myvector[c(1,3)]` would select the first and third elements. We can do the same here. To avoid typing the same value repeatedly, we can use the `rep()` function.

``````
three_by_three_identity_matrix <- matrix(rep(0.0, 9), nrow=3)
three_by_three_identity_matrix[c(1,5,9)] <- 1.0
three_by_three_identity_matrix
``````

Here you can see an excellent example of how the single indexing of a matrix helps us: cells 1, 5 and 9 are exactly the diagonal of a 3 by 3 matrix. Let's generalize that matrix a bit. I will use the `seq()` function to generate a sequence of numbers with a step.

``````
matrix_dim <- 10
num_cells <- matrix_dim ^ 2
identity_matrix <- matrix(rep(0.0, num_cells), nrow=matrix_dim)
identity_matrix[seq(1, num_cells, matrix_dim + 1)] <- 1.0
identity_matrix
``````
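Why `matrix_dim + 1` as the step? Single indexing runs down the columns, so the diagonal cell in row and column `i` sits at position `(i - 1) * matrix_dim + i`, and consecutive diagonal cells are `matrix_dim + 1` apart. The sketch below checks this on a smaller matrix against base R's `diag()`, which builds an identity matrix directly. The names `n`, `by_hand` and `diagonal_positions` are just for illustration.

```r
n <- 4
num_cells <- n ^ 2

by_hand <- matrix(rep(0.0, num_cells), nrow=n)
diagonal_positions <- seq(1, num_cells, n + 1)
diagonal_positions                  # 1 6 11 16
by_hand[diagonal_positions] <- 1.0

# diag(n) returns the n x n identity matrix.
all(by_hand == diag(n))             # TRUE
```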

We are not restricted to using only the single-index syntax. We can apply an index vector to the row or column side of the `,` inside the matrix `[]` operator. For instance:

``````
matrix_dim <- 10
x_matrix <- matrix(ncol=matrix_dim, nrow=matrix_dim)
x_matrix[] <- " "
x_matrix[c(3:5, 9), c(3, 7:9)] <- "X"
x_matrix[, c(1, 10)] <- "|"
x_matrix
``````

This gives us the following output:

``````
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] "|"  " "  " "  " "  " "  " "  " "  " "  " "  "|"
 [2,] "|"  " "  " "  " "  " "  " "  " "  " "  " "  "|"
 [3,] "|"  " "  "X"  " "  " "  " "  "X"  "X"  "X"  "|"
 [4,] "|"  " "  "X"  " "  " "  " "  "X"  "X"  "X"  "|"
 [5,] "|"  " "  "X"  " "  " "  " "  "X"  "X"  "X"  "|"
 [6,] "|"  " "  " "  " "  " "  " "  " "  " "  " "  "|"
 [7,] "|"  " "  " "  " "  " "  " "  " "  " "  " "  "|"
 [8,] "|"  " "  " "  " "  " "  " "  " "  " "  " "  "|"
 [9,] "|"  " "  "X"  " "  " "  " "  "X"  "X"  "X"  "|"
[10,] "|"  " "  " "  " "  " "  " "  " "  " "  " "  "|"
``````
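Note that a row vector and a column vector together select every combination of the listed rows and columns, not matched pairs. A smaller sketch makes this easier to see; the matrix `m` is made up for illustration. The last line, which indexes with a two-column matrix of (row, column) pairs, is a base R feature not covered above.

```r
m <- matrix(1:12, nrow = 3)

# Rows 1 and 3 crossed with columns 2 and 4: four cells in total.
m[c(1, 3), c(2, 4)]
#      [,1] [,2]
# [1,]    4   10
# [2,]    6   12

# To pick only the cells (1,2) and (3,4), pass the pairs as a matrix.
m[cbind(c(1, 3), c(2, 4))]   # 4 12
```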

Extracting a subset from a matrix is again similar to replacing values. We either yield the value or assign it, as we saw above.

``````
double_x_pipe <- x_matrix[3:4, 9:10]
double_x_pipe
``````

shows that `double_x_pipe` is a nice sub-matrix:

``````
     [,1] [,2]
[1,] "X"  "|"
[2,] "X"  "|"
``````

But look at the row and column labels. They are renumbered from 1 within the sub-matrix, rather than keeping the positions (rows 3 to 4, columns 9 to 10) the cells had in `x_matrix`. We will address this next.
