How to calculate responsiveness in response to control using R?

In my previous post, I explained how to quantify phenotypic plasticity and introduced the concept of ‘responsiveness.’

□ Quantifying Phenotypic Plasticity of Crops

I introduced a formula to calculate responsiveness as (Treatment - Control) / Control.

Genotype	Control	Treatment	Responsiveness
A	100	90	-10.0%
B	120	70	-41.7%
C	115	90	-21.7%
D	95	85	-10.5%
E	110	105	-4.5%

However, when analyzing data, the format may not always be the same as above. Mostly, treatments (independent variable) are arranged in one column, and the dependent variable is arranged in another column. In this case, how to calculate responsiveness calculated as (Treatment - Control) / Control?

Let’s upload one data.

if(!require(readr)) install.packages("readr")
library(readr)

github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/fertilizer_treatment.csv"
dataA= data.frame(read_csv(url(github), show_col_types = FALSE))

print(head(dataA,5))
    Genotype Block variable value
1 Genotype_A     I  Control  42.9
2 Genotype_A    II  Control  41.6
3 Genotype_A   III  Control  28.9
4 Genotype_A    IV  Control  30.8
5 Genotype_B     I  Control  53.3
.
.
.

This data has two treatments (independent variables) with 4 replicates.

xtabs(~variable + Genotype, data=dataA)

            Genotype
variable      Genotype_A Genotype_B Genotype_C Genotype_D
  Control              4          4          4          4
  Fertilizer1          4          4          4          4
  Fertilizer2          4          4          4          4
  Fertilizer3          4          4          4          4

Now, I’ll calculate the ‘responsiveness’ of yield for each fertilizer in response to control.

if(!require(dplyr)) install.packages("dplyr")
library(dplyr)

dataA=data.frame(dataA %>%
      group_by(Genotype) %>%
      mutate(responsiveness=ifelse(variable!="Control", 
      (value-value[variable=="Control"])/value[variable=="Control"], NA)))

print(head(dataA,5))
    Genotype Block variable value responsiveness
1 Genotype_A     I  Control  42.9             NA
2 Genotype_A    II  Control  41.6             NA
3 Genotype_A   III  Control  28.9             NA
4 Genotype_A    IV  Control  30.8           NA
5 Genotype_B     I  Control  53.3             NA

I’ll check if this calculation is correct. Let’s download this data as an Excel file.

if(!require(writexl)) install.packages("writexl")
library(writexl)

write_xlsx (dataA,"C:/Users/Desktop/dataA.xlsx")

Then, I compared what I calculated manually with what R calculated, and they are the same.

How to calculate when several groups exist?

The previous data only includes replicates from a single group (genotype). My next question is how to calculate when multiple groups are present?

This dataset involves different planting rows, including outside, east, west, and middle divisions. It analyzes biomass yield per wheat head and stem in response to three conditions: control, tr1, and tr2. How should I calculate responsiveness for each group combination in this scenario? I already calculated responsiveness using excel.

library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/responsiveness.csv"
dataA= data.frame(read_csv(url(github),show_col_types = FALSE))

print(head(dataA,5))
  tissue     row plot treatment  yield responsiveness
1   Head outside    5   Control 159.30           <NA>
2   Head outside    5       Tr1  59.70         -62.5%
3   Head outside    5       Tr2  97.50         -38.8%
4   Stem outside    5   Control  74.65           <NA>
5   Stem outside    5       Tr1  34.05         -54.4%

Now I’ll calculate the responsiveness using R by groups.

if(!require(dplyr)) install.packages("dplyr")
library(dplyr)

dataB=data.frame(dataA %>%
      group_by(tissue, row, plot) %>%
      mutate(Calculation=ifelse(treatment!="Control", 
      (yield-yield[treatment=="Control"])/yield[treatment=="Control"], NA)))

print(head(dataB,5))
  tissue     row plot treatment  yield responsiveness Calculation
1   Head outside    5   Control 159.30           <NA>          NA
2   Head outside    5       Tr1  59.70         -62.5%  -0.6252354
3   Head outside    5       Tr2  97.50         -38.8%  -0.3879473
4   Stem outside    5   Control  74.65           <NA>          NA
5   Stem outside    5       Tr1  34.05         -54.4%  -0.5438714

However, as you can see, the calculated value differs from what I calculated using excel This issue can occur when the ‘plyr’ package is installed. To resolve it, you can either uninstall the ‘plyr’ package or use the dplyr::mutate function.

if(!require(dplyr)) install.packages("dplyr")
library(dplyr)

dataB=data.frame(dataA %>%
      group_by(tissue, row, plot) %>%
      dplyr::mutate(Calculation=ifelse(treatment!="Control", 
      (yield-yield[treatment=="Control"])/yield[treatment=="Control"], NA)))

print(head(dataB,5))
  tissue     row plot treatment  yield responsiveness Calculation
1   Head outside    5   Control 159.30           <NA>          NA
2   Head outside    5       Tr1  59.70         -62.5%  -0.6252354
3   Head outside    5       Tr2  97.50         -38.8%  -0.3879473
4   Stem outside    5   Control  74.65           <NA>          NA
5   Stem outside    5       Tr1  34.05         -54.4%  -0.5438714

Now, the value will be the same.

How about calculating responsiveness using the mean?

Next, I want to determine responsiveness by averaging the yield. Therefore, the first step is to calculate the average yield.

library (plyr)

dataC=ddply(dataA, c("tissue","row","treatment"), summarise, mean=mean(yield))
colnames(dataC)[4]=c("yield")

dataC
   tissue     row treatment    yield
1    Head    east   Control  71.8000
2    Head    east       Tr1  52.7375
3    Head    east       Tr2  45.1750
4    Head  middle   Control  62.9000
5    Head  middle       Tr1  57.8250
6    Head  middle       Tr2  37.2250
7    Head outside   Control 118.0500
8    Head outside       Tr1 137.6500
9    Head outside       Tr2  96.8250
10   Head    west   Control  73.8750
11   Head    west       Tr1  60.9125
12   Head    west       Tr2  47.6000
13   Stem    east   Control  48.5500
14   Stem    east       Tr1  35.1875
15   Stem    east       Tr2  53.9375
16   Stem  middle   Control  41.3750
17   Stem  middle       Tr1  36.3750
18   Stem  middle       Tr2  63.6250
19   Stem outside   Control  66.9500
20   Stem outside       Tr1  71.7750
21   Stem outside       Tr2  79.1000
22   Stem    west   Control  50.6250
23   Stem    west       Tr1  38.1500
24   Stem    west       Tr2  55.2250

Then, I’ll calculate responsiveness again.

library(dplyr)

dataD=data.frame(dataC %>%
      group_by(tissue, row) %>%
      dplyr::mutate(responsiveness=ifelse(treatment!="Control", 
      (yield-yield[treatment=="Control"])/yield[treatment=="Control"], NA)))

dataD
   tissue     row treatment    yield responsiveness
1    Head    east   Control  71.8000             NA
2    Head    east       Tr1  52.7375    -0.26549443
3    Head    east       Tr2  45.1750    -0.37082173
4    Head  middle   Control  62.9000             NA
5    Head  middle       Tr1  57.8250    -0.08068362
6    Head  middle       Tr2  37.2250    -0.40818760
7    Head outside   Control 118.0500             NA
8    Head outside       Tr1 137.6500     0.16603134
9    Head outside       Tr2  96.8250    -0.17979670
10   Head    west   Control  73.8750             NA
11   Head    west       Tr1  60.9125    -0.17546531
12   Head    west       Tr2  47.6000    -0.35566836
13   Stem    east   Control  48.5500             NA
14   Stem    east       Tr1  35.1875    -0.27523172
15   Stem    east       Tr2  53.9375     0.11096807
16   Stem  middle   Control  41.3750             NA
17   Stem  middle       Tr1  36.3750    -0.12084592
18   Stem  middle       Tr2  63.6250     0.53776435
19   Stem outside   Control  66.9500             NA
20   Stem outside       Tr1  71.7750     0.07206871
21   Stem outside       Tr2  79.1000     0.18147872
22   Stem    west   Control  50.6250             NA
23   Stem    west       Tr1  38.1500    -0.24641975
24   Stem    west       Tr2  55.2250     0.09086420

Tip!! How about we consider reshaping the data format??

Let’s go back to dataA. After going through the above process to calculate responsiveness due to variables being in the same column, how about we reshape the data format to facilitate the calculation of responsiveness more easily?

dataA
     Genotype Block    variable value
1  Genotype_A     I     Control  42.9
2  Genotype_A    II     Control  41.6
3  Genotype_A   III     Control  28.9
4  Genotype_A    IV     Control  30.8
5  Genotype_B     I     Control  53.3
6  Genotype_B    II     Control  69.6
7  Genotype_B   III     Control  45.4
8  Genotype_B    IV     Control  35.1
9  Genotype_C     I     Control  62.3
10 Genotype_C    II     Control  58.5
.
.
.

Let’s use the below code to reshape data format.

library(dplyr)
library(tidyr)
dataB= data.frame(dataA %>%
                    group_by(Genotype, Block, variable) %>%
                    spread(key=variable, value=value))

dataB
     Genotype Block Control Fertilizer1 Fertilizer2 Fertilizer3
1  Genotype_A     I    42.9        53.8        49.5        44.4
2  Genotype_A    II    41.6        58.5        53.8        41.8
3  Genotype_A   III    28.9        43.9        40.7        28.3
4  Genotype_A    IV    30.8        46.3        39.4        34.7
5  Genotype_B     I    53.3        57.6        59.8        64.1
6  Genotype_B    II    69.6        69.6        65.8        57.4
7  Genotype_B   III    45.4        42.4        41.4        44.1
8  Genotype_B    IV    35.1        51.9        45.4        51.6
9  Genotype_C     I    62.3        63.4        64.5        63.6
10 Genotype_C    II    58.5        50.4        46.1        56.1
11 Genotype_C   III    44.6        45.0        62.6        52.7
12 Genotype_C    IV    50.3        46.7        50.3        51.8
13 Genotype_D     I    75.4        70.3        68.8        71.6
14 Genotype_D    II    65.6        67.3        65.3        69.4
15 Genotype_D   III    54.0        57.6        45.6        56.6
16 Genotype_D    IV    52.7        58.5        51.0        47.4

Then we can easily calculate the responsiveness.

dataB$R_Fertilizer1=(dataB$Fertilizer1-dataB$Control)/dataB$Control
dataB$R_Fertilizer2=(dataB$Fertilizer2-dataB$Control)/dataB$Control
dataB$R_Fertilizer3=(dataB$Fertilizer2-dataB$Control)/dataB$Control

dataB
     Genotype Block Control Fertilizer1 Fertilizer2 Fertilizer3 R_Fertilizer1 R_Fertilizer2 R_Fertilizer3
1  Genotype_A     I    42.9        53.8        49.5        44.4    0.25407925   0.153846154   0.153846154
2  Genotype_A    II    41.6        58.5        53.8        41.8    0.40625000   0.293269231   0.293269231
3  Genotype_A   III    28.9        43.9        40.7        28.3    0.51903114   0.408304498   0.408304498
4  Genotype_A    IV    30.8        46.3        39.4        34.7    0.50324675   0.279220779   0.279220779
5  Genotype_B     I    53.3        57.6        59.8        64.1    0.08067542   0.121951220   0.121951220
6  Genotype_B    II    69.6        69.6        65.8        57.4    0.00000000  -0.054597701  -0.054597701
7  Genotype_B   III    45.4        42.4        41.4        44.1   -0.06607930  -0.088105727  -0.088105727
8  Genotype_B    IV    35.1        51.9        45.4        51.6    0.47863248   0.293447293   0.293447293
9  Genotype_C     I    62.3        63.4        64.5        63.6    0.01765650   0.035313002   0.035313002
10 Genotype_C    II    58.5        50.4        46.1        56.1   -0.13846154  -0.211965812  -0.211965812
11 Genotype_C   III    44.6        45.0        62.6        52.7    0.00896861   0.403587444   0.403587444
12 Genotype_C    IV    50.3        46.7        50.3        51.8   -0.07157058   0.000000000   0.000000000
13 Genotype_D     I    75.4        70.3        68.8        71.6   -0.06763926  -0.087533156  -0.087533156
14 Genotype_D    II    65.6        67.3        65.3        69.4    0.02591463  -0.004573171  -0.004573171
15 Genotype_D   III    54.0        57.6        45.6        56.6    0.06666667  -0.155555556  -0.155555556
16 Genotype_D    IV    52.7        58.5        51.0        47.4    0.11005693  -0.032258065  -0.032258065

Then, let’s reshape data again.

dataC= data.frame(pivot_longer(dataB, 
                   cols= starts_with("R_Fertilizer"), 
                   names_to= "Responsiveness", 
                   values_to= "Value"))
dataC
     Genotype Block Control Fertilizer1 Fertilizer2 Fertilizer3 Responsiveness        Value
1  Genotype_A     I    42.9        53.8        49.5        44.4  R_Fertilizer1  0.254079254
2  Genotype_A     I    42.9        53.8        49.5        44.4  R_Fertilizer2  0.153846154
3  Genotype_A     I    42.9        53.8        49.5        44.4  R_Fertilizer3  0.153846154
4  Genotype_A    II    41.6        58.5        53.8        41.8  R_Fertilizer1  0.406250000
5  Genotype_A    II    41.6        58.5        53.8        41.8  R_Fertilizer2  0.293269231
6  Genotype_A    II    41.6        58.5        53.8        41.8  R_Fertilizer3  0.293269231
7  Genotype_A   III    28.9        43.9        40.7        28.3  R_Fertilizer1  0.519031142
8  Genotype_A   III    28.9        43.9        40.7        28.3  R_Fertilizer2  0.408304498
9  Genotype_A   III    28.9        43.9        40.7        28.3  R_Fertilizer3  0.408304498
10 Genotype_A    IV    30.8        46.3        39.4        34.7  R_Fertilizer1  0.503246753

We aim to develop open-source code for agronomy ([email protected])

Last Updated: 05/11/2023