# Install the packages on first use, then load them
if(!require(readr)) install.packages("readr")
if(!require(dplyr)) install.packages("dplyr")
if(!require(tidyr)) install.packages("tidyr")
library(readr)
library(dplyr)
library(tidyr)

# Read the practice dataset directly from GitHub
github= paste0("https://raw.githubusercontent.com/",
               "agronomy4future/raw_data_practice/",
               "main/yield_per_location.csv")
df= data.frame(read_csv(url(github), show_col_types= FALSE))
# Reshape from wide (one column per location) to long format
df.transpose= data.frame(df %>%
  pivot_longer(
    cols= starts_with("Location"),
    names_to= "Location",
    values_to= "Yield"))
print(head(df.transpose, 12))
Genotype Nitrogen Block Location Yield
1 CV1 N0 I Location1 98.0
2 CV1 N0 I Location2 96.5
3 CV1 N0 I Location3 115.8
4 CV1 N0 I Location4 94.1
5 CV1 N0 I Location5 82.8
6 CV1 N0 I Location6 115.8
7 CV1 N0 I Location7 110.0
8 CV1 N0 I Location8 97.9
9 CV1 N0 I Location9 107.6
10 CV1 N0 I Location10 128.6
11 CV1 N0 I Location11 74.3
12 CV1 N0 I Location12 121.3
.
.
.
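The dataset is now in long format, with one row for each combination of genotype, nitrogen level, block, and location. Suppose I want to replace specific text values. First, let’s change “CV1” to “Genotype1” in the Genotype column.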
df.transpose= df.transpose %>%
mutate(Genotype = gsub("CV1", "Genotype1", Genotype))
print(head(df.transpose, 5))
Genotype Nitrogen Block Location Yield
1 Genotype1 N0 I Location1 98.0
2 Genotype1 N0 I Location2 96.5
3 Genotype1 N0 I Location3 115.8
4 Genotype1 N0 I Location4 94.1
5 Genotype1 N0 I Location5 82.8
In the same way, let’s change “Location1” to “Site1”.
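# Anchor the pattern with ^ and $ so that only the exact value "Location1" is
# replaced; a plain "Location1" would also match Location10, Location11, and Location12
df.transpose= df.transpose %>%
  mutate(Location= gsub("^Location1$", "Site1", Location))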
print(head(df.transpose, 5))
Genotype Nitrogen Block Location Yield
1 Genotype1 N0 I Site1 98.0
2 Genotype1 N0 I Location2 96.5
3 Genotype1 N0 I Location3 115.8
4 Genotype1 N0 I Location4 94.1
5 Genotype1 N0 I Location5 82.8
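One thing to keep in mind here: gsub() matches its pattern anywhere in a string, so an unanchored "Location1" also matches the start of "Location10", "Location11", and "Location12". The minimal sketch below uses a small made-up vector (the name loc is only for illustration, it is not part of the dataset) to compare an unanchored pattern with one anchored by ^ and $.

```r
# Hypothetical toy vector, not the real dataset
loc <- c("Location1", "Location2", "Location10", "Location12")

# Unanchored pattern: every string that contains "Location1" is changed
unanchored <- gsub("Location1", "Site1", loc)
print(unanchored)

# Anchored pattern: only the exact string "Location1" is changed
anchored <- gsub("^Location1$", "Site1", loc)
print(anchored)
```

With the unanchored pattern, "Location10" and "Location12" become "Site10" and "Site12" as a side effect. With the anchored pattern they are left untouched.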
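However, there is a catch. This approach renames one value at a time, so with 12 locations (or with genotypes numbered up to CV10) we would have to repeat the code once for every value. To avoid this kind of tedious, repetitive work, I suggest replacing only the shared prefix, as in the following code.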
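# Run this on the original df.transpose (re-create it with the pivot_longer() step),
# since the two examples above overwrote its Genotype and Location values
dataA= df.transpose %>%
  mutate(
    Site = as.numeric(gsub("Location", "", Location)),
    SiteInfo = gsub("Location", "Site", Location)
  )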
print(head(dataA, 5))
Genotype Nitrogen Block Location Yield Site SiteInfo
1 CV1 N0 I Location1 98.0 1 Site1
2 CV1 N0 I Location2 96.5 2 Site2
3 CV1 N0 I Location3 115.8 3 Site3
4 CV1 N0 I Location4 94.1 4 Site4
5 CV1 N0 I Location5 82.8 5 Site5
This code creates a new dataframe called dataA by taking df.transpose and applying mutate() to add two new columns. The first column, Site, is created by removing the word "Location" from each entry in the Location column using gsub() and then converting the remaining numeric string to an actual number using as.numeric(), so for example "Location3" becomes simply 3. The second column, SiteInfo, is created by replacing the word "Location" with "Site" in each entry of the Location column, so for example "Location3" becomes "Site3". The original Location column is kept intact, and the two new columns are added alongside it in the resulting dataframe dataA.
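One practical reason to keep a numeric column next to the text label is ordering. The sketch below is only an illustration on a made-up vector (loc, site, and site_info are hypothetical names): it applies the same two gsub() calls and then orders the labels by the numeric part.

```r
# Hypothetical toy vector, deliberately out of order
loc <- c("Location10", "Location2", "Location1")

site      <- as.numeric(gsub("Location", "", loc))  # numeric part only
site_info <- gsub("Location", "Site", loc)          # relabelled text

# Character sorting compares the strings character by character,
# so "Location10" usually lands before "Location2"
print(sort(loc))

# Ordering by the numeric column gives the natural 1, 2, 10 sequence
print(site_info[order(site)])
```

If any entry does not follow the "Location" plus number pattern, as.numeric() returns NA for it with a warning, so it is worth checking the new column for missing values.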
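The same pattern works for the Genotype column: removing “CV” gives a numeric code, and replacing “CV” with “Genotype” gives a new label. Note that this code starts again from df.transpose and overwrites dataA, so the Site and SiteInfo columns from the previous step are not carried over.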
dataA= df.transpose %>%
mutate(
CV = as.numeric(gsub("CV", "", Genotype)),
CVInfo = gsub("CV", "Genotype", Genotype)
)
print(tail(dataA, 5))
Genotype Nitrogen Block Location Yield CV CVInfo
320 CV3 N2 III Location8 117.9 3 Genotype3
321 CV3 N2 III Location9 129.6 3 Genotype3
322 CV3 N2 III Location10 154.8 3 Genotype3
323 CV3 N2 III Location11 89.5 3 Genotype3
324 CV3 N2 III Location12 146.0 3 Genotype3
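If all four new columns are wanted in the same data frame, one option is to put the four expressions in a single mutate() call. The following is a minimal sketch on a made-up data frame (toy and toy_all are hypothetical names, and the three rows are illustrative rather than real records); with the real data, df.transpose would take the place of toy.

```r
library(dplyr)

# Hypothetical toy data standing in for df.transpose
toy <- data.frame(
  Genotype = c("CV1", "CV2", "CV3"),
  Location = c("Location1", "Location10", "Location12"),
  Yield    = c(98.0, 128.6, 146.0),
  stringsAsFactors = FALSE
)

# Build the numeric codes and the new labels for both columns at once
toy_all <- toy %>%
  mutate(
    Site     = as.numeric(gsub("Location", "", Location)),
    SiteInfo = gsub("Location", "Site", Location),
    CV       = as.numeric(gsub("CV", "", Genotype)),
    CVInfo   = gsub("CV", "Genotype", Genotype)
  )

print(toy_all)
```

The original Genotype and Location columns stay in place, so the result can always be checked against the source values.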

We aim to develop open-source code for agronomy
© 2022 – 2025 https://agronomy4future.com – All Rights Reserved.
Last Updated: 05/24/2026