Browsed by
Category: R programming

How to create separate linear and quadratic regression graphs for each group in the same panel using R?

When we draw regression lines for a group, they are usually of the same type, such as simple linear regression. Here is an example using yield data for different nitrogen rates per genotype. Then, the regression graph for each group would be shown below. I think it would be better to show the quadratic regression line for genotype A. In this case, how can we create separate linear and quadratic regression graphs for each group in the same panel? Data…

[STAT Article] What is the statistical method for comparing whether the slopes and y-intercepts in a regression model are the same or not (Feat. ANCOVA using R and SAS)?

To gain a basic understanding of the topic, I recommend reading the following post: Analysis of Covariance (ANCOVA). I have a dataset as shown below, and I would like to analyze crop yield and height based on different fertilizer types (Control, Slow-release, and Fast-release). The experimental design is a Completely Randomized Design (CRD) with 10 replicates.

Rep  Fertilizer  Yield  Height  Fertilizer  Yield  Height  Fertilizer  Yield  Height
1    Control     12.2   45.0    Slow        16.6   63.0    Fast        9.5    52.0
2    Control     12.4   52.0…

What is the F-ratio in statistics?

Today, I will explain the meaning of the F-value in significance testing. Let me give you an example. Suppose we want to determine whether yield differs among varieties (A, B, C). There are 12 experimental units in total (3 varieties x 4 replicates). What would happen if there is a significant difference in yield between varieties A and C? If there is a large difference in yield between these varieties, the…

Advanced Text Formatting in R STUDIO Graphs: Superscripts and Subscripts

Sometimes, when creating graphs using R, we need to include superscripts or subscripts in axis text or titles. In this post, I will show how to enter text with superscripts or subscripts. I will generate a simple dataset and draw a graph to demonstrate. Here, I want to add superscripts or subscripts to the axis titles of the graph. For example, for the x-axis, I want to name it as “GenotypeTM” and for the y-axis, I…

What is logistic regression (feat. odds, odds ratio and model equation)?

Logistic regression is a type of statistical analysis used to model the relationship between a binary (yes/no) dependent variable and independent variables. The goal of logistic regression is to find a relationship between the independent variables (x) and the probability of a particular outcome for the dependent variable (y). The logistic regression model calculates the probability of a certain outcome by applying a logistic function to the linear combination of the independent variables. Here is one example. Sulphur improves plant…

In R, how to adjust the unit of axis in graph?

When we make graphs, the axis values can be so large that the tick labels overlap. Here is an example. Now, I’d like to change the unit of the numbers. For example, I want to divide each value by 1000, so that the x-axis shows 5 to 30. This can be done by adding a little code to the plot.
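
One possible way to do this in ggplot2 is sketched below. The data, the column names, and the helper `to_thousands()` are made up for illustration, and the post itself may use a different approach (for example, dividing the column before plotting).

```r
library(ggplot2)

# Hypothetical data: x values in the thousands (5,000 to 30,000)
df <- data.frame(
  x = seq(5000, 30000, by = 5000),
  y = c(12, 18, 25, 29, 34, 40)
)

# Hypothetical helper: show each tick value divided by 1000
to_thousands <- function(v) v / 1000

# The data stay untouched; only the tick labels are rescaled
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  scale_x_continuous(
    breaks = seq(5000, 30000, by = 5000),
    labels = to_thousands
  ) +
  labs(x = "x (thousands)", y = "y")
# print(p) draws the graph
```

Relabeling the ticks keeps the original values in the data, which is convenient when the same data frame is also used for analysis.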

What is split-split-plot design in agronomy research (feat. using R and SAS)?

In my previous post (What is split-plot design in agronomy research?), I explained what a split-plot design and its statistical model are, and how it differs from RCBD. The main difference is that in a split-plot design the error is divided into two (error a and b), increasing the significance of the interaction between the main plot and sub-plot. Now our interest lies in cases where we have three factors. In a split-plot design, we typically…

An Introduction to Residual Analysis in Simple Linear Regression Models

Sample No.   x    y
1           10   30
2           20   40
3           30   50
4           40   80
5           50   90
6           60  100
7           70  120

Here is a dataset that allows us to analyze the relationship between x and y and obtain the model equation, y = β0 + β1x. Although statistical programs can provide us with results in just 10 seconds, it is more important to understand the principles behind the calculations than to simply know how to run the…
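
For reference, a minimal base R sketch that fits this model to the seven observations above and extracts the residuals could look like this. The object names (`df`, `fit`, `res`) are arbitrary choices, not taken from the post.

```r
# The seven observations from the table above
df <- data.frame(
  x = c(10, 20, 30, 40, 50, 60, 70),
  y = c(30, 40, 50, 80, 90, 100, 120)
)

# Fit y = b0 + b1 * x by ordinary least squares
fit <- lm(y ~ x, data = df)
coef(fit)

# Residual = observed y minus fitted y
res <- residuals(fit)
data.frame(df, fitted = fitted(fit), residual = res)

# With an intercept in the model, the residuals sum to (numerically) zero
sum(res)
```

Plotting `res` against `fitted(fit)` is the usual next step for checking whether the linear model is adequate.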

Data filtering using R Studio

When you conduct statistical analysis, you might want to include or exclude some variables. For example, here is a dataset. It shows how yield, grain number (GN) and average grain weight (AGW) differ under two fertilizer levels (N0, N1) in five genotypes (CV1 – CV5). That is, there are 10 treatments [Genotype (5) x Nitrogen (2) = 10]. The replicates are 3 blocks, so there are 30 experimental units [10 treatments x 3 blocks = 30]. What…

How to analyze linear plateau model in R Studio?

When we talk about regression, it’s usually about the simple linear regression model, which describes the relationship between two variables. For background, see: Simple linear regression (1/5): correlation and covariance; Simple linear regression (2/5): slope and intercept of linear regression model. A linear plateau model is similar to a simple linear model, but it is a segmented model. It focuses on the critical value (the x-value above which there is no further increase in y), indicating the plateau value (the statistically highest value…

In R, how to subtract the mean from each value?

In my previous post (In R, how to add extra column and row to calculate mean respectively?), I explained how to add an extra column and row to hold the means. Now, I’d like to subtract the mean from each value in each column. This will be the genotypic effect.
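
A minimal base R sketch of this step is given below. The data values and the names `dataA`, `col_means` and `centered` are hypothetical, and the post may use a different function to do the same thing.

```r
# Hypothetical yields: rows are environments, columns are genotypes
dataA <- data.frame(
  CV1 = c(10, 12, 14),
  CV2 = c(20, 22, 27),
  CV3 = c(30, 31, 35)
)

# Mean of each column
col_means <- colMeans(dataA)

# Subtract each column's mean from every value in that column
centered <- sweep(as.matrix(dataA), 2, col_means, FUN = "-")
centered

# After centering, every column mean is zero
colMeans(centered)
```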

In R, how to add extra column and row to calculate mean respectively?

Let’s generate a data table. Now, I’d like to calculate the mean of each column and row. For example, I want to calculate the mean of ENV1 to ENV5, and also of CV1 to CV5. First, I’ll calculate the mean of each row (ENV1 to ENV5). I dropped the Environment column (dataA %>% select(-Environment)) because it is not numeric. Next, I’ll calculate the mean of each column (CV1 to CV5). Now, the mean of each column and row has been calculated.
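
The steps above can be sketched in base R as follows. The yield values are invented, the names `num`, `mean_row` and `dataB` are assumptions, and the post uses dplyr's select() for the column-dropping step where this sketch uses base R indexing.

```r
# Hypothetical data: five environments (rows) by five genotypes (columns)
dataA <- data.frame(
  Environment = paste0("ENV", 1:5),
  CV1 = c(50, 52, 48, 55, 60),
  CV2 = c(45, 47, 49, 51, 53),
  CV3 = c(62, 58, 60, 64, 66),
  CV4 = c(40, 42, 41, 45, 47),
  CV5 = c(55, 57, 59, 61, 63)
)

# Drop the non-numeric Environment column before averaging
num <- dataA[, names(dataA) != "Environment"]

# Mean of each row (one value per environment), added as a new column
dataA$Mean <- rowMeans(num)

# Mean of each column (one value per genotype, plus the grand mean),
# added as a new row labelled "Mean"
mean_row <- data.frame(Environment = "Mean", t(colMeans(dataA[, -1])))
dataB <- rbind(dataA, mean_row)
dataB
```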

What is Probability Density Function (PDF) and Cumulative Distribution Function (CDF): How to calculate using Excel and R?

When we analyze data, we may need to show graphs depicting normal distributions. These graphs differ from density graphs as they convey various concepts that simple bar graphs cannot. While it is easy to draw these graphs in Excel, understanding the underlying concepts is crucial. In this article, I will explain what the Probability Density Function (PDF) is, and I will show how we can calculate it in both Excel and R. Here is a dataset of 1,000 individual wheat…
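
As a rough illustration of the R side, the normal PDF and CDF can be computed with dnorm() and pnorm(). The simulated `grain_weight` vector below is a stand-in, not the wheat dataset from the post, and its mean and standard deviation are arbitrary.

```r
set.seed(1)

# Hypothetical stand-in for the data: 1,000 simulated values
grain_weight <- rnorm(1000, mean = 40, sd = 5)

m <- mean(grain_weight)
s <- sd(grain_weight)
x <- sort(grain_weight)

# PDF: height of the normal curve at each x
# (Excel equivalent: NORM.DIST(x, mean, sd, FALSE))
pdf_vals <- dnorm(x, mean = m, sd = s)

# CDF: probability of observing a value <= x
# (Excel equivalent: NORM.DIST(x, mean, sd, TRUE))
cdf_vals <- pnorm(x, mean = m, sd = s)

head(data.frame(x, pdf_vals, cdf_vals))
```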

How to change the name of columns in R?

Let’s upload a dataset to R. Now, I’d like to change the column names as follows:

field → location
genotype → variety
block → reps
treatment → experiment
shoot → branch
grain_number → GN
grain_weight → GW

I’ll introduce two ways to change column names: 1) using colnames() 2) using rename() in the dplyr package. This time, I’ll use the dplyr package.
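
A small sketch of the rename() approach is shown here. The column names come from the mapping above, while the data values and the object names `df` and `df2` are placeholders.

```r
library(dplyr)

# Hypothetical one-row dataset that only carries the original column names
df <- data.frame(
  field = "F1", genotype = "G1", block = 1, treatment = "T1",
  shoot = 3, grain_number = 120, grain_weight = 4.2
)

# rename() takes new_name = old_name pairs
df2 <- df %>%
  rename(
    location   = field,
    variety    = genotype,
    reps       = block,
    experiment = treatment,
    branch     = shoot,
    GN         = grain_number,
    GW         = grain_weight
  )

names(df2)
```

With colnames(), the equivalent is to assign a full vector of new names in the original column order.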

How to upload a file from GitHub to R?

I uploaded a .csv file to GitHub. Now I want to analyze this data in R. I could simply download the file and then import it into R, but let’s load it directly from GitHub instead. First, we need to know the URL address of this file. If you click your file name in GitHub, you can find the “Raw” button. So, let’s click this button. Then, in the address bar of your web browser, you can obtain the URL…
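
The idea can be sketched as follows. The account, repository, branch and file name are placeholders rather than the author's actual file, so the read.csv() call is left commented out until a real raw URL is supplied.

```r
# Hypothetical placeholders: replace with your own account, repository,
# branch and file name
user   <- "your-account"
repo   <- "your-repo"
branch <- "main"
file   <- "data.csv"

# The "Raw" button leads to an address of this form
raw_url <- paste("https://raw.githubusercontent.com",
                 user, repo, branch, file, sep = "/")
raw_url

# read.csv() accepts a URL directly; uncomment once raw_url is real
# df <- read.csv(raw_url)
# head(df)
```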

How to easily change legend name inside a graph in R?

I’ll generate a dataset. Then, I’ll make a bar graph from it. To make a bar graph, the data should be summarized first. Now, I want to change the legend labels from N0 to 0kg N/ha, and from N1 to 200kg N/ha. We can simply add code like this: scale_fill_manual(labels=c("0kg N/ha","200kg N/ha"), values=c("grey75","grey25")) What if we want to change the title of the legend from Fertilizer to Treatment? Just add one more line like this: labs(fill="Treatment", x="Genotype", y="Yield")
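
Put together, a complete example might look like the sketch below. The summarized data are invented; only the scale_fill_manual() and labs() calls come from the text above.

```r
library(ggplot2)

# Hypothetical summarized data: mean yield per genotype and fertilizer level
df <- data.frame(
  Genotype   = rep(c("CV1", "CV2"), each = 2),
  Fertilizer = rep(c("N0", "N1"), times = 2),
  Yield      = c(40, 55, 42, 60)
)

p <- ggplot(df, aes(x = Genotype, y = Yield, fill = Fertilizer)) +
  geom_col(position = position_dodge()) +
  # Legend labels: N0 -> "0kg N/ha", N1 -> "200kg N/ha"
  scale_fill_manual(labels = c("0kg N/ha", "200kg N/ha"),
                    values = c("grey75", "grey25")) +
  # Legend title: Fertilizer -> Treatment
  labs(fill = "Treatment", x = "Genotype", y = "Yield")
# print(p) draws the graph
```

The labels are matched to the fill levels in order, so N0 must sort before N1 for the mapping to be correct.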

What is Finlay-Wilkinson Regression Model?

The genotype is dependent on environmental changes. One genotype may strongly respond to certain environmental conditions, while another genotype may weakly respond to the same conditions. If some genotypes strongly respond under better conditions, they would be adaptable to the environment. Adaptability refers to the flexibility of a genotype in its response to improved environments. If a certain genotype exhibits high performance across a wide range of environmental conditions, it would be considered to have broad adaptation. To achieve this…

What is a nested model in statistics?

A tomato farmer is growing tomato seedlings, and all of a sudden he wants to investigate the amount of calcium in the leaves. So, he selected four tomato seedlings, randomly chose three leaves from each seedling, and measured the amount of calcium. He measured each leaf twice. This experimental design is shown in the table below. y111 means the amount of calcium in the 1st seedling – 1st leaf – 1st replicate. Then, y432 will mean the outcome of…

How to calculate the optimum sample size for 2-Sample t test (using R and G*Power program)?

When we set up our experimental design, it is not easy to decide the sample size because we don’t know exactly how many samples are required for our experiments. Of course, the more, the better. However, eventually, we need to decide on an appropriate sample size according to our time and resources. For example, if we want to know the average height of students at the University of Guelph, the best way is to measure all students’ height. According to Wikipedia, total number…

What is ANCOVA (1/3)? The basic concept

Today, I will explain Analysis of Covariance (ANCOVA). ANCOVA is a statistical technique that involves including covariates, which are additional variables that may impact the dependent variable (y) in addition to the independent variable (x). I have a dataset as shown below, and I would like to analyze crop yield based on different fertilizer types (Control, Slow-release, and Fast-release). The experimental design is a Completely Randomized Design (CRD) with 10 replicates.

Rep  Fertilizer  Yield  Fertilizer  Yield  Fertilizer  Yield
1    Control…

In R STUDIO, how to reverse the order of x-axis (numeric), and also change the direction of graph?

Here is a dataset. Then I make a regression graph. Now, I’d like to reverse the x-axis so that it runs in descending order (60 to 0). So I delete the code scale_x_continuous(breaks = seq(-0, 60, 10), limits = c(-0, 60)) and add a new line, scale_x_reverse(limits = c(60,0)). So the whole code is below. Now, you can see that the order of the x-axis is changed and the direction of the graph is changed as well. How to adjust unit of the axis? To adjust…
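
For illustration, a minimal version of such a graph with the reversed axis could be written as below. The data are made up; only the scale_x_reverse() call is taken from the text.

```r
library(ggplot2)

# Hypothetical data with x running from 0 to 60
df <- data.frame(
  x = seq(0, 60, by = 10),
  y = c(5, 9, 14, 18, 25, 29, 33)
)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Replaces scale_x_continuous(): the axis now runs from 60 down to 0
  scale_x_reverse(limits = c(60, 0))
# print(p) draws the graph
```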

Displaying Axis Values as Percentages in R Studio with Simple Code

Let’s create a simple dataset and draw a bar graph with this data. The values on the y-axis are shown as decimals. I would like to display them as percentages. So, I will insert the code labels=scales::percent inside the scale_y_continuous() function. With this change, the values on the y-axis are displayed as percentages.
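
A short sketch of the complete code, using an invented dataset with hypothetical column names `group` and `ratio`:

```r
library(ggplot2)

# Hypothetical data: proportions stored as decimals
df <- data.frame(
  group = c("A", "B", "C"),
  ratio = c(0.25, 0.40, 0.35)
)

p <- ggplot(df, aes(x = group, y = ratio)) +
  geom_col() +
  # Format the y-axis tick labels as percentages
  scale_y_continuous(labels = scales::percent)
# print(p) draws the graph

# The formatter can also be called directly on a number
scales::percent(0.25)
```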

Uploading Excel Data in R and Converting it to Code for Improved Management

Recently, I have uploaded a large excel dataset into R. This data consists of 7,849 columns. Upon checking the excel file size, it’s approximately 1MB in size. Now, I’d like to share this data with someone else. However, instead of attaching it as an excel file, I want to send it as R code, allowing them to work with the data directly in R. Therefore, all I need to do is convert this data into code. Below is how you…

Exporting Individual Graph Images with R Studio and ggsave()

After creating a graph using R, repeatedly copying and pasting it to move it becomes a cumbersome task. Today, I’ll demonstrate how to easily relocate the graph. Let’s generate some data and draw a graph to demonstrate. Running the code like this will display the graph in the Plot window. Then, each time, you’ll need to click Export, save it with a different name, or copy the image to place the graph where you want. In reality, this task is…
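
As a rough sketch of the ggsave() approach named in the title: the data, the file name, and the image dimensions below are assumptions, and the file is written to a temporary folder only so the example can run anywhere.

```r
library(ggplot2)

# Hypothetical data and graph
df <- data.frame(genotype = c("A", "B", "C"), yield = c(30, 42, 37))
p <- ggplot(df, aes(x = genotype, y = yield)) +
  geom_col()

# Write the graph straight to an image file instead of using Export.
# Replace tempdir() with the folder where you want the image.
out_file <- file.path(tempdir(), "yield_plot.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 300)

file.exists(out_file)
```

Passing `plot = p` explicitly avoids relying on whichever graph was displayed last.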

Exploring Axis Title and Text Spacing Adjustment in R Studio for Graphs

If you visit FAOSTAT (https://www.fao.org/faostat/en/), you can download high-quality data related to agriculture. Recently, I conducted an analysis of the trends in global and European wheat harvest quantities. As a result, I performed data analysis similar to the following. The complete code for the above graph is as follows: In the above graph, it seems that the axis title labels are too close to the axis text. I’d like to increase the spacing a bit. From now on, I’ll be…

Creating a Data Frame in R Studio

Today, I will show you how to create a data frame using R Studio. We have several variables that we will combine into a data frame. The ‘nation’ variable consists of five countries: “USA”, “GERMANY”, “NETHERLANDS”, “DENMARK”, and “KOREA”. We also have some survey data on the happiness and economic power of each country. To create a data frame in R, we can use the data.frame() function to combine all variables. In this example, I have written the code as…
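
A minimal sketch of this step follows. The country names come from the text; the `happiness` and `economic_power` values are invented placeholders, not the survey data from the post.

```r
# Country names as given in the text
nation <- c("USA", "GERMANY", "NETHERLANDS", "DENMARK", "KOREA")

# Hypothetical survey scores, for illustration only
happiness      <- c(7.0, 6.9, 7.4, 7.6, 5.9)
economic_power <- c(9.5, 8.0, 6.5, 6.0, 7.5)

# Combine the vectors column-wise into one data frame
df <- data.frame(nation, happiness, economic_power)

str(df)
```

All vectors passed to data.frame() must have the same length (or a length that recycles evenly), otherwise R stops with an error.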

Creating Visual Emphasis: Adding Dotted Boxes to Graphs in R Studio

I’ll explain how to insert a box in a graph to highlight part of it. I’ll generate some data. This data contains the yield and standard error for five different genotypes. I’ll create a bar chart to visualize it. In this graph, genotypes D and E exhibit greater yields compared to the other genotypes. My current objective is to emphasize genotypes D and E by adding a dotted box. To achieve this, we can use geom_rect(). For geom_rect(), I set…
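
One way this could look is sketched below. The yields, standard errors, box coordinates and colors are all assumptions, since the post's own settings are cut off above.

```r
library(ggplot2)

# Hypothetical yields and standard errors for five genotypes
df <- data.frame(
  genotype = c("A", "B", "C", "D", "E"),
  yield    = c(30, 32, 35, 48, 50),
  se       = c(2, 2.5, 2, 3, 2.5)
)

# Box corners: on a discrete axis, D and E sit at positions 4 and 5
box <- data.frame(xmin = 3.5, xmax = 5.5, ymin = 0, ymax = 58)

p <- ggplot(df, aes(x = genotype, y = yield)) +
  geom_col(fill = "grey60") +
  geom_errorbar(aes(ymin = yield - se, ymax = yield + se), width = 0.2) +
  # Unfilled rectangle with a dotted outline around D and E
  geom_rect(data = box,
            aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            inherit.aes = FALSE,
            fill = NA, color = "red", linetype = "dotted")
# print(p) draws the graph
```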

In R STUDIO, how to apply the same font type and size in ggplot?

First, let’s generate a simple dataset. Then I’ll make a bar graph using ggplot2. I made the bar graph above, but in the code I repeated the font type and size over and over to apply the same settings to the graph title and text (and to the x and y axes). I want to reduce this repeated code, and the solution is to use theme_grey(). axis.title.x= element_text (family="serif", size=15, color="black"), axis.title.y=…
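
A sketch of the idea, with made-up data: theme_grey() accepts a base font size and family, so they only need to be written once. Note that some elements (axis tick text, for instance) are sized relative to the base size, so this is a starting point rather than an exact replacement for every element_text() setting.

```r
library(ggplot2)

# Hypothetical data
df <- data.frame(genotype = c("A", "B", "C"), yield = c(30, 42, 37))

p <- ggplot(df, aes(x = genotype, y = yield)) +
  geom_col() +
  labs(title = "Yield by genotype", x = "Genotype", y = "Yield") +
  # One call sets the base font size and family for the whole graph
  theme_grey(base_size = 15, base_family = "serif")
# print(p) draws the graph
```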

Creating Stacked Bar Graphs in R Studio: A Step-by-Step Guide

Today, I’ll introduce how to create stacked bar graphs using R Studio. To start, I will generate a data table as shown below. I’ll make stacked bar graphs using this data table. First of all, it’s necessary to summarize the data. I’ll use the ddply() function. If I use this code, an error message pops up. This is because when generating the data, I used double quotation marks, such as yield = c(rep("15", 5), rep("18", 5), rep("20", 8), rep("14", 7), rep("21",…
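
The underlying problem can be reproduced in a few lines of base R. This is a simplified sketch with only two of the yield groups and an invented `genotype` column, and it uses aggregate() in place of ddply() so that no extra package is needed.

```r
# Quoted numbers are stored as character, not numeric
df <- data.frame(
  genotype = rep(c("A", "B"), each = 5),
  yield    = c(rep("15", 5), rep("18", 5))
)
is.numeric(df$yield)   # FALSE: means cannot be computed on text

# Convert the column to numeric before summarizing
df$yield <- as.numeric(df$yield)

# Summarize: mean yield per genotype
summary_df <- aggregate(yield ~ genotype, data = df, FUN = mean)
summary_df
```

Writing the values without quotation marks when generating the data, for example rep(15, 5), avoids the conversion step altogether.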
