Converting Rows to Columns in R: A Guide to Transposing Data (feat. pivot_wider and pivot_longer)

Converting Rows to Columns in R: A Guide to Transposing Data (feat. pivot_wider and pivot_longer)

When data is arranged, it can be structured either vertically (row-based) or horizontally (column-based). The choice depends on your preference for organizing data. However, when running statistics, data should be arranged row-based, as variables need to be in the same column. On the other hand, when calculating per variable, it is much easier to organize data column-based, allowing for simpler calculations. Regardless of the approach, well-organized data is essential, and the ability to restructure data is a valuable skill. Today, I will introduce how to automatically reshape data both vertically and horizontally using R.

Let’s upload a dataset in R.

if(!require(readr)) install.packages("readr")
library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/wheat_grain_Fe_uptake.csv"
dataA= data.frame(read_csv(url(github), show_col_types= FALSE))

head(dataA, 10)
   Location Season Genotype Reps Iron_ton_ha      Stage   Fe
1      East   2021      CV9    1      7.5833 Vegetative 0.36
2      East   2021      CV9    2      4.1246 Vegetative 0.32
3      East   2021      CV9    3      3.0873 Vegetative 0.39
4      East   2021      CV9    4      7.4287 Vegetative 0.37
5      East   2021      CV9    5     11.4825 Vegetative 0.36
6      East   2021      CV9    6      6.2228 Vegetative 0.36
7      East   2021      CV9    7      6.0621 Vegetative 0.32
8      East   2021      CV9    8      3.5002 Vegetative 0.33
9      East   2021      CV9    9      2.7580 Vegetative 0.37
10     East   2021      CV9   10      7.6637 Vegetative 0.33
.
.
.

Now, I’d like to move the variables in the ‘Stage‘ column to individual columns. I’ll use pivot_wider()

if(!require(dplyr)) install.packages("dplyr")
if(!require(tidyr)) install.packages("tidyr")
library(dplyr)
library(tidyr)

dataB= data.frame(dataA %>%
                  group_by(Location, Season, Genotype, Iron_ton_ha) %>%
                  pivot_wider(names_from= Stage, values_from=Fe))

head(dataB, 10)
   Location Season Genotype Reps Iron_ton_ha Vegetative Reproductive Maturity
1      East   2021      CV9    1      7.5833       0.36         0.26     0.07
2      East   2021      CV9    2      4.1246       0.32         0.22     0.07
3      East   2021      CV9    3      3.0873       0.39         0.22     0.06
4      East   2021      CV9    4      7.4287       0.37         0.24     0.06
5      East   2021      CV9    5     11.4825       0.36         0.28     0.16
6      East   2021      CV9    6      6.2228       0.36         0.30     0.07
7      East   2021      CV9    7      6.0621       0.32         0.29     0.16
8      East   2021      CV9    8      3.5002       0.33         0.30     0.10
9      East   2021      CV9    9      2.7580       0.37         0.27     0.09
10     East   2021      CV9   10      7.6637       0.33         0.30     0.15
.
.
.

This is perfectly transposed from a vertical to a horizontal structure, but it only works correctly when all values are fully grouped. When values are repeated without grouping, pivot_wider() does not work as expected.

if(!require(readr)) install.packages("readr")
library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/refs/heads/main/treatment_without_grouping.csv"
dataC= data.frame(read_csv(url(github), show_col_types= FALSE))
dataC=subset(dataC, select=c(-Biomass))

head(dataC, 10)
   Treatment Block Yield
1    Control     I  75.3
2    Control     I  45.4
3    Control     I  92.2
4    Control     I  32.5
5    Control     I  85.1
6    Control     I  69.6
7    Control     I  84.1
8    Control     I  41.5
9    Control     I  38.4
10   Control    II  76.9
.
.
.

This dataset contains Treatment and Block for Yield; however, as shown, there are 9 replicates per block. When using pivot_wider(), the structure breaks due to the lack of grouping for Block.

dataD= data.frame(dataC %>%
                    group_by(Treatment) %>%
                    pivot_wider(names_from= Block, values_from=Yield))

head(dataD, 10)
   Treatment
1    Control
2 Fertilizer                                                                                                 I
1 75.3, 45.4, 92.2, 32.5, 85.1, 69.6, 84.1, 41.5, 38.4, 56.8, 44.8, 29.2, 66.1, 65.2, 48.2, 38.4, 61.8, 49.5
2 47.9, 27.7, 32.6, 29.9, 43.1, 45.2, 29.3, 33.6, 58.7, 49.7, 37.3, 53.3, 19.8, 31.9, 27.9, 40.6, 41.8, 18.1                                                                                                II

In this case, we can modify the code as follows. I added a grouping variable named rep. When grouping by Treatment and rep, the data structure works correctly.

dataE= dataC %>%
             group_by(Treatment, Block) %>%
             mutate(rep= row_number()) %>%
             ungroup()
                  
head(dataE, 3)
  Treatment Block Yield   rep
1 Control   I      75.3     1
2 Control   I      45.4     2
3 Control   I      92.2     3
.
.
.

dataF= data.frame(dataE %>%
                    group_by(Treatment, rep) %>%
                    pivot_wider(names_from= Block, values_from=Yield))

head(dataF, 15)
    Treatment rep    I    II   III    IV
1     Control   1 75.3  76.9 101.5  14.3
2     Control   2 45.4  70.4  63.7  58.1
3     Control   3 92.2  83.8  80.7  69.8
4     Control   4 32.5  36.3  92.6  56.7
5     Control   5 85.1  62.1  78.4  99.7
6     Control   6 69.6  92.0  69.5  39.5
7     Control   7 84.1  71.0  93.2  72.0
8     Control   8 41.5  67.6  40.4  62.3
9     Control   9 38.4 131.4  73.3 142.6
10 Fertilizer   1 47.9  44.2  34.1  30.8
11 Fertilizer   2 27.7  26.0  35.6  25.7
12 Fertilizer   3 32.6  23.4  28.7  28.3
13 Fertilizer   4 29.9  33.7  37.6  32.5
14 Fertilizer   5 43.1  44.2  36.8  31.4
15 Fertilizer   6 45.2  49.6  46.6  19.5
.
.
.

Let’s change it as vertically again.

dataG= data.frame(dataF %>%
                    pivot_longer(
                      cols= c(I,II,III,IV),
                      names_to= "Block",
                      values_to= "Yield"))

head(dataG, 10)
   Treatment rep Block Yield
1    Control   1     I  75.3
2    Control   1    II  76.9
3    Control   1   III 101.5
4    Control   1    IV  14.3
5    Control   2     I  45.4
6    Control   2    II  70.4
7    Control   2   III  63.7
8    Control   2    IV  58.1
9    Control   3     I  92.2
10   Control   3    II  83.8
.
.
.

We aim to develop open-source code for agronomy ([email protected])

© 2022 – 2025 https://agronomy4future.com – All Rights Reserved.

Last Updated: 12/21/2025

Comments are closed.