[Data article] Visualizing Responsiveness: Integrating Raw Data for a Holistic Dataset View

■ [R package] Embedding Key Descriptive Statistics within Original Data (Feat. descriptivestat)
■ [R package] Calculate the responsiveness of each treatment relative to a control (Feat. deltactrl)
In my previous posts, I introduced two R packages. The first, descriptivestat(), displays raw data along with mean values and additional descriptive statistics. The second, deltactrl(), calculates the responsiveness of dependent variables relative to a control. In this article, I will demonstrate how combining these two packages makes it easy to create different types of figures that help us better understand a dataset.
First, I will load a dataset.
if (!require("rio")) install.packages("rio")
library(rio)
url= "https://github.com/agronomy4future/raw_data_practice/raw/main/wheat_grain_size_big_data.RData"
df= import(url)
df1=subset(df, fungicide!="N/A" & Genotype=="Peele")
df1= subset(df1, select = c(-fertilizer, -Shoot, -Length.mm., -Width.mm.))
print(tail(df1,5))
Field Genotype Block fungicide planting_date Area.mm2.
.
.
.
96315 South Peele III No early 13.687
96316 South Peele III No early 11.058
96317 South Peele III No early 9.154
96318 South Peele II No late 18.092
96319 South Peele II No late 18.092
This dataset contains 96,319 rows, making it difficult to perform certain calculations in Excel. In this case, R programming is an efficient tool for conducting data analysis. I will create a bar graph based on this dataset to analyze how the grain area differs at different planting dates, depending on whether fungicide is applied.
First, I will capitalize the planting date labels and set their order as a factor (Early, Normal, Late).
if (!require("dplyr")) install.packages("dplyr")
library(dplyr)
dataA= df1 %>%
mutate (planting_date= case_when(
planting_date== "early" ~ "Early",
planting_date== "normal" ~ "Normal",
planting_date== "late" ~ "Late",
TRUE ~ as.character(planting_date)
))
dataA$planting_date=factor(dataA$planting_date, levels=c("Early","Normal","Late"))
print(tail(dataA,5))
Field Genotype Block fungicide planting_date Area.mm2.
.
.
.
96315 South Peele III No Early 13.687
96316 South Peele III No Early 11.058
96317 South Peele III No Early 9.154
96318 South Peele II No Late 18.092
96319 South Peele II No Late 18.092
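The reason for the factor() step can be seen in a small sketch. The vector below is a made-up toy example, not part of the wheat dataset: without explicit levels, R sorts the labels alphabetically, which would place "Late" before "Normal" on the x-axis.

```r
# Toy vector (hypothetical values, not taken from the wheat dataset)
planting <- c("Late", "Early", "Normal", "Early")

# Default behavior: levels are sorted alphabetically
default_order <- levels(factor(planting))

# Explicit levels: chronological order, as set for dataA above
chrono_order <- levels(factor(planting, levels = c("Early", "Normal", "Late")))

print(default_order)
print(chrono_order)
```

ggplot2 draws discrete axes in level order, so setting the levels once on the data keeps every later figure in chronological order.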
Next, I will summarize the dataset to create a bar graph.
if (!require("dplyr")) install.packages("dplyr")
library(dplyr)
dataB= data.frame(dataA %>%
group_by(fungicide, planting_date) %>%
dplyr::summarize(across(c(Area.mm2.),
.fns= list(Mean=~mean(., na.rm= TRUE),
SD= ~sd(., na.rm= TRUE),
n=~length(.),
se=~sd(.,na.rm= TRUE) / sqrt(length(.))))))
print(dataB)
fungicide planting_date Area.mm2._Mean Area.mm2._SD Area.mm2._n Area.mm2._se
No Early 13.97312 2.479379 2054 0.05470698
No Normal 15.78524 3.087877 2217 0.06558086
No Late 17.65699 2.787904 1628 0.06909563
Yes Early 14.23392 2.569342 2243 0.05425093
Yes Normal 16.87129 2.610428 2434 0.05291166
Yes Late 17.59145 2.746128 1636 0.06789364
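As a quick sanity check, the standard error in the table can be reproduced by hand as SD / sqrt(n). The sketch below copies the values from the first row of dataB. The second part uses a made-up toy vector to illustrate one caveat: length() also counts missing values, so if a column contained NA, counting only the observed values would probably be the safer choice.

```r
# Values copied from the first row of dataB (No fungicide, Early planting)
sd_area <- 2.479379
n_area  <- 2054

# Standard error of the mean = SD / sqrt(n)
se_area <- sd_area / sqrt(n_area)
print(se_area)

# Toy vector (hypothetical) showing why missing values matter for n
x <- c(10, 12, NA, 14)
n_all      <- length(x)        # counts the NA as well
n_observed <- sum(!is.na(x))   # counts only non-missing values
se_observed <- sd(x, na.rm = TRUE) / sqrt(n_observed)
print(c(n_all = n_all, n_observed = n_observed, se_observed = se_observed))
```

The first result should agree with the Area.mm2._se value reported for that row, up to rounding of the copied SD.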
Okay! Let’s create a bar graph.
if(!require("ggplot2")) install.packages("ggplot2")
library(ggplot2)
Fig1=ggplot(data=dataB, aes(x=planting_date, y=Area.mm2._Mean, fill=fungicide)) +
geom_bar(stat="identity", position="dodge", width=0.91, size=1) +
geom_errorbar(aes(ymin=Area.mm2._Mean-Area.mm2._SD, ymax=Area.mm2._Mean+Area.mm2._SD),
position=position_dodge(0.9), width=0.5) +
scale_fill_manual(name="Fungicide", values= c("grey35", "grey75")) +
geom_text(aes(family="serif", x=0.8, y=18, label="e"), size=5, color="black") +
geom_text(aes(family="serif", x=1.2, y=19, label="d"), size=5, color="black") +
geom_text(aes(family="serif", x=1.8, y=20, label="c"), size=5, color="black") +
geom_text(aes(family="serif", x=2.2, y=21, label="b"), size=5, color="black") +
geom_text(aes(family="serif", x=2.8, y=23, label="a"), size=5, color="black") +
geom_text(aes(family="serif", x=3.2, y=23, label="a"), size=5, color="black") +
geom_text(aes(family="serif", x=2, y=25, label="***"), size=6, color="red") +
scale_y_continuous(breaks = seq(0, 30, 5), limits = c(0, 30)) +
labs(x="", y="Grain area (mm2)") +
theme_classic(base_size= 15, base_family = "serif") +
theme(legend.position=c(0.15,0.87),
legend.title= element_text(family="serif", size=14, color="black"),
legend.key=element_rect(color="white", fill=alpha("white", 0.5)),
legend.text=element_text(family="serif", face="plain",
size=13, color="black"),
legend.background= element_rect(fill=alpha("white", 0.5)),
panel.border= element_rect(color="black", fill=NA, linewidth=0.5),
axis.line= element_line(linewidth= 0.5, colour= "black"),
strip.background=element_rect(color="white",
linewidth=0.5, linetype="solid"))
Fig1+windows(width=5.5, height= 5)
ggsave("C:/Users/agron/Fig1.jpg",
Fig1, width=9*2.54, height=5*2.54, units="cm", dpi=1000)

This figure shows how grain area differs at various planting dates, with or without fungicide application. At later planting dates, grain area tends to be greater (the error bar represents the standard deviation).
While this figure is useful, if our primary interest is not the grain area differences but rather the effect of fungicide, it would be better to focus on the responsiveness of grain area to fungicide, rather than displaying the grain area itself.
Additionally, to provide a more comprehensive view of the dataset, using raw data points instead of mean values would better capture the variation in the data.
The two R packages, descriptivestat() and deltactrl(), make it easy to conduct such data analysis.

First, let’s load the necessary libraries.
if(!require(remotes)) install.packages("remotes")
library(remotes)
# deltactrl
if (!requireNamespace("deltactrl", quietly = TRUE)) {
remotes::install_github("agronomy4future/deltactrl", force= TRUE)
}
library(deltactrl)
# descriptivestat
if (!requireNamespace("descriptivestat", quietly = TRUE)) {
remotes::install_github("agronomy4future/descriptivestat", force= TRUE)
}
library(descriptivestat)
I will calculate the responsiveness of grain area to fungicide application. The ‘fungicide’ column contains two levels: ‘Yes’ (fungicide application) and ‘No’ (no fungicide). I will use ‘No’ as the control (baseline), and the responsiveness will be calculated as (Treatment - Control) / Control, which translates to (Yes - No) / No.
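To make the formula concrete, here is a small sketch that applies it to the group means already printed in dataB. This is only arithmetic on the summary table, not a call to deltactrl(); the data frame name means is my own choice for illustration.

```r
# Group means copied from the dataB summary table
means <- data.frame(
  planting_date = c("Early", "Normal", "Late"),
  No  = c(13.97312, 15.78524, 17.65699),   # control (no fungicide)
  Yes = c(14.23392, 16.87129, 17.59145)    # treatment (fungicide)
)

# (Treatment - Control) / Control
means$responsiveness <- (means$Yes - means$No) / means$No
print(means)
```

On these means, the Normal planting date shows the largest positive value and the Late planting date is slightly negative. The deltactrl() call that follows works on the raw observations rather than on this summary table.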
dataC= deltactrl(
data= dataA,
group_vars= c("planting_date"),
treatment_var= fungicide,
control_label= No,
response_vars= c("Area.mm2.")
)
print(tail(dataC,5))
Field Genotype Block fungicide planting_date Area.mm2. responsive_Area.mm2.
.
.
.
South Peele III No Early 13.7 NA
South Peele III No Early 11.1 NA
South Peele III No Early 9.15 NA
South Peele II No Late 18.1 NA
South Peele II No Late 18.1 NA
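The NA values above appear because these rows are control rows, which have nothing to be compared against. As a rough mental model, the per-row calculation could look like the base R sketch below. It rests on an assumption I have not verified against the package source: that each treated observation is compared with the mean of the control rows in the same group. The toy data and column names are hypothetical.

```r
# Toy data (hypothetical values, not from the wheat dataset)
toy <- data.frame(
  planting_date = rep(c("Early", "Late"), each = 4),
  fungicide     = rep(c("No", "No", "Yes", "Yes"), times = 2),
  area          = c(10, 12, 11, 13.2, 20, 20, 19, 22)
)

# Mean of the control ("No") rows within each planting date
ctrl <- toy[toy$fungicide == "No", ]
ctrl_mean <- tapply(ctrl$area, ctrl$planting_date, mean)

# Look up the control mean that belongs to each row's planting date
baseline <- unname(ctrl_mean[as.character(toy$planting_date)])

# (Treatment - Control) / Control for treated rows; NA for control rows
# (assumption: the control mean of the same group is the reference)
toy$responsive_area <- ifelse(toy$fungicide == "No", NA,
                              (toy$area - baseline) / baseline)
print(toy)
```

In this toy example the control means are 11 (Early) and 20 (Late), so a treated value of 22 at the Late date gives (22 - 20) / 20 = 0.1.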
In the deltactrl() call above, control_label= No designates the control, and group_vars= c("planting_date") calculates responsiveness separately within each planting date. With these settings, I calculate the responsiveness of the grain area (Area.mm2.).
The result is stored in a new column, responsive_Area.mm2., which indicates how the grain area changes when fungicide is applied. Next, I’ll add descriptive statistics to dataC.
dataD= descriptivestat(data= dataC, group_vars= c("planting_date","fungicide"),
value_vars= c("responsive_Area.mm2."),
output_stats= c("sd"))
I added the standard deviation with output_stats= c("sd"), calculated by grouping the data by planting date and fungicide application. Since ‘no fungicide’ is the baseline, it is not necessary to include it in the final output, so I will remove those rows.
dataE= subset(dataD, fungicide!="No")
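The figure code that follows filters on category=="observed" and category=="mean" and reads the SD from a column named sd.responsive_Area.mm2., so dataE evidently keeps raw rows and summary rows in one table. The base R sketch below builds a similar layout from toy values. It is only my illustration of that structure, with hypothetical column names resp and sd.resp, and is not the package's actual implementation.

```r
# Toy responsiveness values (hypothetical)
obs <- data.frame(
  planting_date = rep(c("Early", "Late"), each = 3),
  resp          = c(0.1, 0.2, 0.3, -0.1, 0.0, 0.1)
)
obs$category <- "observed"   # raw rows
obs$sd.resp  <- NA_real_     # SD is only filled in on summary rows

# One summary row per group: the mean sits in the value column, the SD beside it
grp_mean <- aggregate(resp ~ planting_date, data = obs, FUN = mean)
grp_sd   <- aggregate(resp ~ planting_date, data = obs, FUN = sd)
summary_rows <- data.frame(
  planting_date = grp_mean$planting_date,
  resp          = grp_mean$resp,
  category      = "mean",
  sd.resp       = grp_sd$resp
)

# Raw and summary rows stacked into a single table
combined <- rbind(obs, summary_rows)
print(combined)
```

Keeping both kinds of rows in one table means a single data frame can feed the jittered points (observed rows) and the mean with error bars (mean rows) by subsetting on category.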
Let’s create the figure.
if(!require(ggplot2)) install.packages("ggplot2")
library(ggplot2)
Fig2= ggplot() +
geom_jitter(data= subset(dataE, category=="observed"),
aes(x= planting_date, y= responsive_Area.mm2., fill= planting_date,
shape=planting_date),
width=0.2, alpha=0.5,
size=2, color="grey75") +
geom_errorbar(data= subset(dataE, category=="mean"),
aes(x= planting_date, ymin= responsive_Area.mm2.-sd.responsive_Area.mm2.,
ymax=responsive_Area.mm2.+sd.responsive_Area.mm2.),
width= 0.1, color= "black") +
geom_point(data= subset(dataE, category=="mean"),
aes(x= planting_date, y= responsive_Area.mm2., fill= planting_date,
shape=planting_date),
size=4, color="black", stroke=1.5) +
geom_text(aes(family="serif", x=1, y=0.65, label="b"), size=5, color="black") +
geom_text(aes(family="serif", x=2, y=0.7, label="a"), size=5, color="black") +
geom_text(aes(family="serif", x=3, y=0.55, label="c"), size=5, color="black") +
geom_text(aes(family="serif", x=2, y=0.85, label="***"), size=6, color="red") +
scale_fill_manual(values= c("darkred", "grey35", "darkblue")) +
scale_shape_manual(values = c(21,22,23)) +
geom_hline(yintercept=0, linetype="dashed", color="black", linewidth=0.5) +
scale_y_continuous(breaks=seq(-1,1,0.5), limits = c(-1,1)) +
#facet_wrap(~ genotype, scales = "free") +
labs(x= NULL, y="Responsiveness to
fungicide application") +
theme_classic(base_size=18, base_family="serif") +
theme(
legend.position="none",
legend.key=element_rect(color="white", fill="white"),
legend.text=element_text(family="serif", face="plain",
size=15, color= "black"),
legend.background=element_rect(fill=alpha("white", 0.5)),
strip.background= element_rect(color="white", linewidth=0.5, linetype="solid"),
panel.border= element_rect(color="black", fill=NA, linewidth=0.5),
panel.grid.major= element_line(color="grey90", linetype="dashed"),
axis.line= element_blank()
)
Fig2+windows(width=5.5, height= 5)
ggsave("C:/Users/agron/Fig2.jpg",
Fig2, width=9*2.54, height=5*2.54, units="cm", dpi=1000)

This figure shows how grain area responds to fungicide application at different planting dates. The dashed line at 0 marks the no-fungicide baseline: a responsiveness of 0 means there is no difference in grain area between the fungicide and non-fungicide treatments. The figure indicates that grain area is most responsive to fungicide application at the normal planting date.
If you copy and paste this entire code into your R console, you will generate the same figure shown above.
if (!require("rio")) install.packages("rio")
if(!require(remotes)) install.packages("remotes")
if (!requireNamespace("deltactrl", quietly = TRUE)) {
remotes::install_github("agronomy4future/deltactrl", force= TRUE)
}
if (!requireNamespace("descriptivestat", quietly = TRUE)) {
remotes::install_github("agronomy4future/descriptivestat", force= TRUE)
}
if (!require("dplyr")) install.packages("dplyr")
if(!require("ggplot2")) install.packages("ggplot2")
library(rio)
library(remotes)
library(deltactrl)
library(descriptivestat)
library(dplyr)
library(ggplot2)
url= "https://github.com/agronomy4future/raw_data_practice/raw/main/wheat_grain_size_big_data.RData"
df= import(url)
df1=subset(df, fungicide!="N/A" & Genotype=="Peele")
df1= subset(df1, select = c(-fertilizer, -Shoot, -Length.mm., -Width.mm.))
dataA= df1 %>%
mutate (planting_date= case_when(
planting_date== "early" ~ "Early",
planting_date== "normal" ~ "Normal",
planting_date== "late" ~ "Late",
TRUE ~ as.character(planting_date)
))
dataA$planting_date=factor(dataA$planting_date, levels=c("Early","Normal","Late"))
dataC= deltactrl(
data= dataA,
group_vars= c("planting_date"),
treatment_var= fungicide,
control_label= No,
response_vars= c("Area.mm2.")
)
dataD= descriptivestat(data= dataC, group_vars= c("planting_date","fungicide"),
value_vars= c("responsive_Area.mm2."),
output_stats= c("sd"))
dataE= subset(dataD, fungicide!="No")
Fig2= ggplot() +
geom_jitter(data= subset(dataE, category=="observed"),
aes(x= planting_date, y= responsive_Area.mm2., fill= planting_date,
shape=planting_date),
width=0.2, alpha=0.5,
size=2, color="grey75") +
geom_errorbar(data= subset(dataE, category=="mean"),
aes(x= planting_date, ymin= responsive_Area.mm2.-sd.responsive_Area.mm2.,
ymax=responsive_Area.mm2.+sd.responsive_Area.mm2.),
width= 0.1, color= "black") +
geom_point(data= subset(dataE, category=="mean"),
aes(x= planting_date, y= responsive_Area.mm2., fill= planting_date,
shape=planting_date),
size=4, color="black", stroke=1.5) +
geom_text(aes(family="serif", x=1, y=0.65, label="b"), size=5, color="black") +
geom_text(aes(family="serif", x=2, y=0.7, label="a"), size=5, color="black") +
geom_text(aes(family="serif", x=3, y=0.55, label="c"), size=5, color="black") +
geom_text(aes(family="serif", x=2, y=0.85, label="***"), size=6, color="red") +
scale_fill_manual(values= c("darkred", "grey35", "darkblue")) +
scale_shape_manual(values = c(21,22,23)) +
geom_hline(yintercept=0, linetype="dashed", color="black", linewidth=0.5) +
scale_y_continuous(breaks=seq(-1,1,0.5), limits = c(-1,1)) +
#facet_wrap(~ genotype, scales = "free") +
labs(x= NULL, y="Responsiveness to
fungicide application") +
theme_classic(base_size=18, base_family="serif") +
theme(
legend.position="none",
legend.key=element_rect(color="white", fill="white"),
legend.text=element_text(family="serif", face="plain",
size=15, color= "black"),
legend.background=element_rect(fill=alpha("white", 0.5)),
strip.background= element_rect(color="white", linewidth=0.5, linetype="solid"),
panel.border= element_rect(color="black", fill=NA, linewidth=0.5),
panel.grid.major= element_line(color="grey90", linetype="dashed"),
axis.line= element_blank()
)
Fig2+windows(width=5.5, height= 5)


I created two different formats of the figure using the same dataset. The bar graph (left figure) is the typical format we use, while the raw data with mean graph (right figure) provides a different insight for analyzing the data.
What is more, calculating responsiveness reduces the number of groups that need to be displayed (in this figure, ‘No fungicide’ has been omitted because it serves as the baseline). This approach is also useful when there are many treatment levels that would take up too much space in the figure.
■ code summary: https://github.com/agronomy4future/r_code/blob/main/%5BData_article%5D_Visualizing_Responsiveness_Integrating_Raw_Data_for_a_Holistic_Dataset_View.ipynb

We aim to develop open-source code for agronomy
© 2022 – 2025 https://agronomy4future.com – All Rights Reserved.
Last Updated: 09/06/2025