4 ✨ Creating New Data
Whew! Now that we’ve learned how to filter properly and select columns, we can talk about non-destructive operations: ones that add to our data instead of overwriting it.
4.1 Creating New Columns using a Vector
Let’s return to our original friends dataframe. Rather than replacing the original data, we might prefer to add columns to achieve the same goal. Let’s return to the example where I want to include my friends’ last names. A smarter thing to do would be to add a column.
last_names<- c("A", "B", "C", "D", "E", "J")
#Method 1: Use $
friends$last_names<- last_names
#Method 2: Use []. Notice that the column name is in quotes, because this is a name.
#Without the quotes, R would treat each element of the vector last_names as a column name and create a new column for each one.
friends[ ,'last_names']<- last_names
friends
## names ages DC_Resident fav_number last_names
## 1 Abram 34 TRUE 1.00 A
## 2 Bryant 35 FALSE 2.17 B
## 3 Colleen 32 FALSE 26.00 C
## 4 David 29 TRUE 7.00 D
## 5 Esther 30 FALSE 10.00 E
## 6 Jeremiah 30 TRUE 9.00 J
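As a quick check that the two methods really do the same thing, here is a minimal sketch. It assumes the friends dataframe can be rebuilt from the printed output above (the original construction code is not shown in this section), and the names by_dollar and by_bracket are made up for illustration.

```r
# Rebuild friends from the printed output (an assumption, see above).
friends <- data.frame(
  names = c("Abram", "Bryant", "Colleen", "David", "Esther", "Jeremiah"),
  ages = c(34, 35, 32, 29, 30, 30),
  DC_Resident = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE),
  fav_number = c(1, 2.17, 26, 7, 10, 9)
)
last_names <- c("A", "B", "C", "D", "E", "J")

# Work on copies so the original friends stays untouched.
by_dollar <- friends
by_dollar$last_names <- last_names        # Method 1: $

by_bracket <- friends
by_bracket[, "last_names"] <- last_names  # Method 2: [] with a quoted name

# Both copies now hold the same new column.
same_column <- identical(by_dollar$last_names, by_bracket$last_names)
same_column

# The original columns are still there; we only added one.
names(by_dollar)
```

Because we added a column instead of overwriting one, nothing from the original four columns was lost.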
4.2 Merging two columns together into a new column
The last names are great, but we can improve even on this by paste()ing the names column and the last_names column.
friends$full_names<- paste(friends$names, friends$last_names)
friends
## names ages DC_Resident fav_number last_names
## 1 Abram 34 TRUE 1.00 A
## 2 Bryant 35 FALSE 2.17 B
## 3 Colleen 32 FALSE 26.00 C
## 4 David 29 TRUE 7.00 D
## 5 Esther 30 FALSE 10.00 E
## 6 Jeremiah 30 TRUE 9.00 J
## full_names
## 1 Abram A
## 2 Bryant B
## 3 Colleen C
## 4 David D
## 5 Esther E
## 6 Jeremiah J
The paste() function takes separate elements and combines them into a single string, one pair at a time. Notice that this is different from the c() function, which keeps the elements separate inside a single vector.
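To see the difference between paste() and c() side by side, here is a small sketch using two short vectors. The names first, last, pasted, and combined are made up for illustration.

```r
first <- c("Abram", "Bryant")
last <- c("A", "B")

# paste() joins the vectors pair by pair, so we get one string per friend.
pasted <- paste(first, last)
pasted            # "Abram A" "Bryant B"
length(pasted)    # 2

# c() just lines the elements up one after another in a longer vector.
combined <- c(first, last)
combined          # "Abram" "Bryant" "A" "B"
length(combined)  # 4

# The sep argument controls what goes between the pasted pieces.
with_underscore <- paste(first, last, sep = "_")  # "Abram_A" "Bryant_B"
no_space <- paste0(first, last)                   # "AbramA" "BryantB"
```

By default paste() puts a single space between the pieces, which is why the full_names column above reads "Abram A" rather than "AbramA".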
There are many other ways to combine columns, like the stringr package or the tidyr package, but they are outside the scope of this book.
4.3 Merging two numerical columns into a new column
Combining two columns this way doesn’t make sense for numerical vectors. After all, if we want to combine 4 and 1, we don’t want 4, 1; we want 5 as our answer. Let’s say I want to add my friends’ ages to their favorite numbers (no reason why). Since we can add, subtract, or apply any other mathematical operation to two vectors, element by element, as long as they have the same number of elements, this is easy to do.
friends$age_and_fav_number<- friends$ages + friends$fav_number
friends
## names ages DC_Resident fav_number last_names
## 1 Abram 34 TRUE 1.00 A
## 2 Bryant 35 FALSE 2.17 B
## 3 Colleen 32 FALSE 26.00 C
## 4 David 29 TRUE 7.00 D
## 5 Esther 30 FALSE 10.00 E
## 6 Jeremiah 30 TRUE 9.00 J
## full_names age_and_fav_number
## 1 Abram A 35.00
## 2 Bryant B 37.17
## 3 Colleen C 58.00
## 4 David D 36.00
## 5 Esther E 40.00
## 6 Jeremiah J 39.00
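The same arithmetic can be tried on plain vectors, outside of a dataframe. This is a minimal sketch that assumes the ages and favorite numbers printed above; the name doubled_ages is made up for illustration.

```r
ages <- c(34, 35, 32, 29, 30, 30)
fav_number <- c(1, 2.17, 26, 7, 10, 9)

# Addition happens element by element: first with first, second with second...
age_and_fav_number <- ages + fav_number
age_and_fav_number

# Other operators work the same way.
age_minus_fav <- ages - fav_number

# A single number is reused for every element.
doubled_ages <- ages * 2
doubled_ages
```

Columns of the same dataframe always have the same number of rows, so the lengths match automatically. With loose vectors of different lengths, R reuses the shorter one instead of stopping, so it is worth checking length() first.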