Getting Started with R - Part 2: Vector Basics

Building on the simple data classes covered in Part 1: The Console and Variables, we now look at the first higher-dimensional construct: vectors.

I am posting this tutorial as I learn R. I will respond to feedback for errata in the comments.

Combining values

Vectors are created by combining values into a single structure with the c() function. Think c for combine. We can create vectors of each of our simple types: numeric, logical, character and integer.

combined_numeric <- c(12.3, 34.5, 67.8)
combined_logical <- c(TRUE, FALSE, TRUE, TRUE)
combined_integer <- c(1L, 2L, 4L)  # the L suffix makes these integers; plain 1, 2, 4 would be numeric

combined_numeric
combined_logical
combined_integer

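Try creating a vector of the character type on your own. Also use class() to find out the class of each vector you created. Was it what you expected? class() reports the type shared by all elements of the vector, not just the first one. Note that whole numbers written without the L suffix, such as c(1, 2, 4), are stored as numeric rather than integer.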
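As a small sketch of that exercise, the snippet below checks the class of a few vectors. The variable names (combined_character, whole_numbers, true_integers) are made up for illustration and are not used elsewhere in this post.

```r
# A character vector: every element is a string
combined_character <- c("a", "b", "c")
class(combined_character)   # "character"

# Number literals without a suffix are stored as doubles
whole_numbers <- c(1, 2, 4)
class(whole_numbers)        # "numeric"

# The L suffix asks R for true integers
true_integers <- c(1L, 2L, 4L)
class(true_integers)        # "integer"
```

The result describes the vector as a whole, because every element of a vector created with c() on simple values has the same type.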

Coercing

What happens if we mix types? Go ahead and try.

combined_mixed <- c(12.3, FALSE, 4)
combined_mixed

FALSE was converted to 0.0 and 4 is shown as 4.0. This is called coercion, and it can get a lot worse than this sample. What happens if we try to combine those with a character element?

combined_basic_classes <- c(12.3, FALSE, 4, "Arggh")
combined_basic_classes

Call class() on combined_mixed and combined_basic_classes. Do you understand the results? class() reports the single type that all elements were coerced to: numeric for combined_mixed and character for combined_basic_classes.

Careful! When different classes are combined into a vector, the values are coerced to the one type that can represent them all, often numeric or character.
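To see the direction coercion takes, here is a minimal sketch that mixes two types at a time and asks for the resulting class. The values are arbitrary examples.

```r
# Each call mixes two types; class() shows which type wins
class(c(TRUE, 2L))       # "integer"
class(c(2L, 3.5))        # "numeric"
class(c(3.5, "text"))    # "character"

# Once a string is present, every element becomes a string
mixed <- c(12.3, FALSE, 4, "Arggh")
mixed                    # "12.3"  "FALSE" "4"     "Arggh"
```

In each case the more general type wins, roughly in the order logical, integer, numeric, character.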

Basic Indexing

We have already touched on the fact that R uses 1-based indexing. This differs from the many programming languages that use 0-based indexing.

combined_numeric <- c(12.3, 34.5, 67.8)
combined_logical <- c(TRUE, FALSE, TRUE, TRUE)
combined_integer <- c(1L, 2L, 4L)

combined_numeric[1]
combined_logical[2]
combined_integer[3]

The [] syntax denotes the extract/replace operator. For now, think of it as a way to index values; we’ll dig into its advanced properties later.
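The "replace" half of that name can be sketched in a couple of lines. The vector name scores is a hypothetical example, not something from earlier in the post.

```r
scores <- c(12.3, 34.5, 67.8)

# Extract: read the second element
scores[2]            # 34.5

# Replace: assign a new value to the second element
scores[2] <- 99.9
scores               # 12.3 99.9 67.8
```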

As you can see, the first, second and third values of the respective vectors were returned. What if we pass an index outside the range, such as combined_integer[4]? Try it.

combined_integer <- c(1L, 2L, 4L)
combined_integer[4]

Did you get what you expected? Many programming languages would throw an “Index out of bounds” exception. R returns a polite NA value, which simply means “Not Available” (the value is missing). NA is somewhat analogous to a null value, in that it shows the absence of a value. In the next post it will hopefully become clearer why NA is returned. As you will soon see, R also has a separate NULL value.
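A short sketch of how to detect and reason about that NA follows. The names values and missing_value are illustrative only.

```r
values <- c(1, 2, 4)

# Reading past the end returns NA instead of raising an error
missing_value <- values[4]
is.na(missing_value)     # TRUE

# Reading out of range does not change the vector itself
length(values)           # 3

# NA propagates through arithmetic
missing_value + 1        # NA
```

Because nothing fails loudly, is.na() is a handy check when an index might be out of range.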

Try the following

combined_integer <- c(1L, 2L, 4L)
combined_integer[0]

Bet you were expecting an NA value. This is an odd case: the result is a vector of length zero, printed as integer(0) or numeric(0) depending on the type of the vector. The online help states “Rows with an index 0 are ignored” and “An index of NULL is treated as integer(0)”. So it was essentially the same as calling combined_integer[NULL].
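The zero-index behaviour can be checked directly. This is a minimal sketch with a throwaway vector named values.

```r
values <- c(1, 2, 4)

# Index 0 selects nothing, so the result has length zero
empty <- values[0]
length(empty)                        # 0

# Indexing with NULL gives the same empty result
identical(values[0], values[NULL])   # TRUE

# A zero mixed in with positive indices is simply dropped
values[c(0, 2)]                      # 2
```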

Help

Before I continue with more advanced ways of extracting vector values, a quick note on the help system. To get help on a function, simply prefix its name with ?. For instance, to get help on the combine function we can use:

?c

or in some cases we need to quote the item for which we want help, like this:

?"["
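The ? shortcut has a function form, help(), which takes the topic as a quoted string. The sketch below shows the equivalents of the two lookups above.

```r
# Same page as ?c
help("c")

# Same page as ?"["
help("[")
```

The function form is convenient when the topic contains special characters, since it is always passed as a quoted string.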

Naming and combining vectors

Sometimes working with vectors is easier if we can name the indices. To do that we can use the names function:

probability_of_rain <- c(0.8, 0.2, 0.05, 0.4, 0.65)
names(probability_of_rain) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
probability_of_rain

Now we see a nice output showing the labels over each indexed value:

   Monday   Tuesday Wednesday  Thursday    Friday 
     0.80      0.20      0.05      0.40      0.65 
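Once the names are set, they can also be used inside [] in place of a position. A minimal sketch, reusing the same vector:

```r
probability_of_rain <- c(0.8, 0.2, 0.05, 0.4, 0.65)
names(probability_of_rain) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Look up a value by its name instead of its position
probability_of_rain["Wednesday"]

# The names themselves are an ordinary character vector
names(probability_of_rain)

# Positional indexing still works on a named vector
probability_of_rain[3]
```

Both the name lookup and the positional lookup return the Wednesday entry, label included.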

Cool. Say we have another two vectors like this:

probability_of_rain_work <- c(0.8, 0.2, 0.05, 0.4, 0.65)
names(probability_of_rain_work) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
probability_of_rain_play <- c(0.1, 0.0)
names(probability_of_rain_play) <- c("Saturday", "Sunday")

How can we combine the two vectors? Combine? Combine? Yes, combine as in c():

c(probability_of_rain_play, probability_of_rain_work)

This yields a nice result:

 Saturday    Sunday    Monday   Tuesday Wednesday  Thursday    Friday 
     0.10      0.00      0.80      0.20      0.05      0.40      0.65 

Of course, as before, we can assign the result to a new variable as well. If you look at ?c, you will notice that the signature accepts a parameter use.names with a default value of TRUE. This is why our index names were preserved.
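To see what use.names changes, here is a small sketch with shortened vectors. The names work, play, week and unnamed are illustrative, and the name = value form inside c() is just a compact way to set names at creation time.

```r
work <- c(Monday = 0.8, Tuesday = 0.2)
play <- c(Saturday = 0.1, Sunday = 0.0)

# Default behaviour: names are carried over into the result
week <- c(play, work)
names(week)       # "Saturday" "Sunday" "Monday" "Tuesday"

# With use.names = FALSE the values are kept but the names are dropped
unnamed <- c(play, work, use.names = FALSE)
names(unnamed)    # NULL
```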
