What is logistic regression (feat. odds, odds ratio and model equation)?
Logistic regression is a type of statistical analysis used to model the relationship between a binary (yes/no) dependent variable and independent variables. The goal of logistic regression is to find a relationship between the independent variables (x) and the probability of a particular outcome for the dependent variable (y). The logistic regression model calculates the probability of a certain outcome by applying a logistic function to the linear combination of the independent variables. Here is one example. Sulphur improves plant…
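[Data Column] Predicting intermediate data with linear interpolation
Today I will explain linear interpolation, a method for predicting values that fall between data points. For example, when collecting data in the field, we cannot collect data every day, so we collect it at regular intervals (weekly, biweekly, etc.). However, sometimes the data has to be presented on a daily basis. As another example, suppose we investigate how crop yield responds to nitrogen fertilizer rates of 0 kg/ha, 30 kg/ha, 60 kg/ha, and 120 kg/ha. If we need to show the yield at every nitrogen rate from 0 to 120, how can we estimate the data? In this situation…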
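As a rough illustration of the idea, the Python sketch below interpolates yield linearly between the measured nitrogen rates. The yield values and the function name `linear_interpolate` are hypothetical and chosen only for illustration; they do not come from the original post.

```python
def linear_interpolate(x, xs, ys):
    """Estimate y at x by linear interpolation between known (xs, ys) points.

    xs must be sorted in increasing order and x must lie within the measured range.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the measured range")
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        y0, y1 = ys[i], ys[i + 1]
        if x0 <= x <= x1:
            # Move along the straight line joining the two neighbouring points.
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Nitrogen rates from the example (kg/ha); the yields are made-up values.
nitrogen = [0, 30, 60, 120]
yield_t_ha = [4.0, 5.5, 6.4, 7.0]

estimate_90 = linear_interpolate(90, nitrogen, yield_t_ha)
print(estimate_90)
```

The estimate at 90 kg/ha lies halfway between the hypothetical yields measured at 60 and 120 kg/ha, which is what a straight line between those two points implies.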
In R, how to adjust the unit of axis in graph?
When we make graphs, large axis values can make the tick labels overlap. Here is an example. Now, I’d like to change the unit of the numbers. For example, I want to divide each value by 1000 so that the x-axis shows 5 to 30. We can add the code below.
What is split-split-plot design in agronomy research (feat. using R and SAS)?
In my previous post, I explained what a split-plot design and its statistical model are, and how it differs from RCBD. What is split-plot design in agronomy research? I explained that the main difference between split-plot design and RCBD is that in a split-plot design the error is divided into two terms (error a and error b), increasing the significance of the interaction between the main plot and sub-plot. Now our interest lies in cases where we have three factors. In a split-plot design, we typically…
How to use Google Colab for Python (power tool to analyze data)?
Google Colaboratory (a.k.a. Colab) is a cloud-based platform that provides a Jupyter notebook environment in which users can write and run Python code. You don’t have to install Anaconda to use a Jupyter notebook. If you have a Google account, you can start analyzing data right away. Google Colab is a powerful tool for collaborative coding and data analysis, providing users with an easy-to-use platform and a wide range of features and resources. I will introduce how to set up Google Colab. Step 1)…
An Introduction to Residual Analysis in Simple Linear Regression Models
Sample No.    x     y
1            10    30
2            20    40
3            30    50
4            40    80
5            50    90
6            60   100
7            70   120
Here is a dataset that allows us to analyze the relationship between x and y and obtain the model equation, y= β0 + β1x. Although statistical programs can provide us with results in just 10 seconds, it is more important to understand the principles behind the calculations than to simply know how to run the…
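A minimal Python sketch, assuming ordinary least squares is the fitting method, shows how β0, β1 and the residuals could be computed by hand for the seven samples above. The variable names are illustrative only.

```python
# The seven (x, y) samples from the dataset above.
x = [10, 20, 30, 40, 50, 60, 70]
y = [30, 40, 50, 80, 90, 100, 120]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of squares and cross-products around the means.
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

beta1 = sxy / sxx                # slope
beta0 = y_mean - beta1 * x_mean  # intercept

# Residual = observed y minus the value predicted by the fitted line.
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]

print(round(beta0, 3), round(beta1, 3))
print([round(r, 2) for r in residuals])
```

Under this assumption the slope comes out at about 1.54 and the intercept at about 11.43, and the residuals sum to zero, as they always do for a least-squares line with an intercept.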
Data filtering using R Studio
When you conduct statistical analysis, you might want to include or exclude some variables. For example, here is a dataset. It shows how yield, grain number (GN) and average grain weight (AGW) differ between two fertilizer treatments (N0, N1) in five genotypes (CV1 – CV5). That is, there will be 10 treatments [Genotype (5) x Nitrogen (2) = 10]. There are 3 replicates (blocks), and therefore there will be 30 experimental units [10 treatments x 3 blocks = 30]. What…
What is odds, log odds and logit (feat. Slam Dunk story)?
Odds and logit are the basic concepts needed to understand logistic regression. Today I’ll explain them as simply as I can. Do you know the comic book ‘Slam Dunk’? I’ll explain odds with this story. 1) Odds Now, Shohoku high school is playing games against other high schools in the tournament. In the first round, Shohoku high school won 4 games and lost 6 games out of 10 games. Now the winning odds of Shohoku high school is 4/6 ≈…
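The relationship between probability, odds and log odds in the Shohoku example can be sketched in a few lines of Python. The variable names are illustrative; only the 4 wins and 6 losses come from the text.

```python
import math

wins, losses = 4, 6

# Probability of winning: wins out of all games played.
probability = wins / (wins + losses)

# Odds: wins relative to losses, which equals p / (1 - p).
odds = wins / losses

# Log odds (the logit of the probability) is the natural log of the odds.
log_odds = math.log(odds)
logit = math.log(probability / (1 - probability))

print(round(probability, 3), round(odds, 3), round(log_odds, 3))
```

Because the odds are below 1 (more losses than wins), the log odds are negative. Equal wins and losses would give odds of 1 and log odds of 0.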
How to delete all data at a time using code in R?
I ran a lot of code in R and now I want to delete all the resulting data at once. You can click the broom icon to delete everything, but you can also use the code below.
How to analyze quadratic plateau model in R Studio?
Previous post: How to analyze linear plateau model in R Studio? In my previous post, I explained how to analyze a linear plateau model. I simulated yield data for five different crop varieties with different sulphur applications, and suggested that the optimum sulphur application would be 23.3 kg/ha based on the linear plateau model. This time, I’ll explain how to analyze a quadratic plateau model with the same data using R Studio. 1) Data upload If you run the below code, the…
How to analyze linear plateau model in R Studio?
When we talk about regression, it’s usually about the simple linear regression model. This is about the relationship between two variables. FYI: Simple linear regression (1/5)- correlation and covariance; Simple linear regression (2/5)- slope and intercept of linear regression model. The linear plateau model is similar to the simple linear model, but it is a segmented model, and this statistical model is interested in the critical value (the x-value above which there is no further increase in y), indicating the plateau value (the statistically highest value…
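[Canada Agriculture Diary] University of Guelph – Frank Schofield
An office colleague had a faculty job interview with a US university, so I left the office for a while. I went over to the University Center building across from the office. The University Center houses a variety of amenities at the University of Guelph. The cafeteria, an optician, a dentist, a pharmacy, a print shop and other amenities for students are located on its first floor. I needed somewhere to spend about 30 minutes, so I decided to look around the University Center floor by floor. On the fourth floor I found a lot of information about the history of the University of Guelph. While looking over the bulletin boards, I spotted the word Korea and took a closer look…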
In Excel, how to use If function with 3 conditions?
Here is a dataset. P-values are summarized for genotypes at different fields. Now I’d like to add symbols: *, **, *** and n.s. If the p-value is less than 0.05, it will be *; if the p-value is less than 0.01, it will be **; if the p-value is less than 0.001, it will be ***; and if none of the conditions is met, it will be n.s. So, I’ll add this code in column D in Excel, and drag this code to…
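The same three-condition logic can be sketched in Python to make the order of the checks explicit. This is not the Excel formula from the post; the function name `significance_symbol` is hypothetical. The strictest threshold is tested first, because a p-value below 0.001 is also below 0.01 and 0.05.

```python
def significance_symbol(p_value):
    """Map a p-value to a significance symbol, testing the strictest threshold first."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "n.s."


# Made-up p-values, for illustration only.
for p in [0.0004, 0.003, 0.03, 0.2]:
    print(p, significance_symbol(p))
```

If the checks ran in the opposite order (0.05 first), every significant p-value would receive a single *, which is the usual pitfall with nested conditions.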
Simple linear regression (4/5)- t value on the slope and intercept
Simple Linear Regression Series 1) Simple linear regression (1/5)- correlation and covariance 2) Simple linear regression (2/5)- slope and intercept of linear regression model 3) Simple linear regression (3/5)- standard error of slope and intercept 4) Simple linear regression (4/5)- t value on the slope and intercept 5) Simple linear regression (5/5)- Coefficient of determination In my previous post, I explained how to calculate standard error of slope and intercept in simple linear regression model. Now, I’ll explain how to calculate t…
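[Canada Agriculture Diary] The snowy University of Guelph campus
At the end of January, heavy snow fell across Ontario. While working in the office, I took a short break to refresh and walked around the campus for the first time in a while. The snowy campus seen from the office looks very pretty. I left the building and headed toward the library. The snow-covered trees lining both sides of the path make a beautiful winter scene. This is Johnston Hall, the main building of the university. With its European look, it serves as a student residence. I then wandered around other parts of the campus. I am told this winter scenery lasts until the end of April. Before I first came to Canada, the fact that winter wheat is sown in October and stem elongation (GS30) comes in April…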
In R, how to subtract the mean from each value?
In my previous post, I explained how to add an extra column and row to calculate the respective means. In R, how to add extra column and row to calculate mean respectively? Now, I’d like to subtract the mean from each value in each column. This will be the genotypic effect.
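The idea of centering each column on its own mean can be sketched in Python as follows. The post itself works in R; the table values here are made up, and the column names CV1 to CV3 are only an assumed layout.

```python
# Hypothetical values: one list per genotype column, one entry per environment.
data = {
    "CV1": [50.0, 55.0, 60.0],
    "CV2": [40.0, 48.0, 56.0],
    "CV3": [62.0, 66.0, 70.0],
}

centered = {}
for name, values in data.items():
    column_mean = sum(values) / len(values)
    # Subtract the column mean from every value in that column.
    centered[name] = [value - column_mean for value in values]

for name, values in centered.items():
    print(name, values)
```

After centering, each column sums to zero, so the remaining values show only the deviation of each observation from its column mean.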
In R, how to add extra column and row to calculate mean respectively?
Let’s generate a data table. Now, I’d like to calculate the mean of each column and row. For example, I want to calculate the mean of ENV1 to ENV5, and also of CV1 to CV5. First, I’ll calculate the mean of each row (ENV1 to ENV5). I dropped the Environment column (dataA %>% select(-Environment)) because it is not numeric. Now, I’ll calculate the mean of each column (CV1 to CV5). Now, the mean of each column and row has been calculated.
[Maize Article] GxE interaction in terms of stability
GxE interaction is when the phenotypic difference between a pair of genotypes is larger or smaller in one environment than in another environment. It is important to understand that genotypes having different phenotypic values in two environments is not the same as GxE interaction. Please look at the graph above. Both genotypes had a different phenotype in the two different environments (Env1 and Env2), but there was no interaction in this case because the difference between the phenotypic values was…
MandalArt chart
The MandalArt chart became famous thanks to Shohei Ohtani, a baseball player in MLB. This method pinpoints eight specific ways to reach one major goal. News article: How Shohei Ohtani Visualized His Baseball Success. As a crop physiologist, I set up my own MandalArt chart to visually organize and explore the interconnected aspects of crop physiology. © 2022 – 2023 https://agronomy4future.com
Simple linear regression (3/5)- standard error of slope and intercept
Previous posts: Simple linear regression (1/5)- correlation and covariance; Simple linear regression (2/5)- slope and intercept of linear regression model. In my previous post, I explained how to calculate the slope (β1) and intercept (β0) of a linear regression model. If you followed my previous posts closely, you will get the above result, y= 89.0 + 1.5x. Now our interest is how to calculate the standard error of the intercept and slope (Red box). Here is the equation to obtain standard error of…
Simple linear regression (2/5)- slope and intercept of linear regression model
Previous post: Simple linear regression (1/5)- correlation and covariance. In my previous post, I explained correlation and covariance. Now, I’ll explain the slope (β1) and intercept (β0) of a linear regression model. In the whole picture of a linear regression model, β1 is calculated as β1 = r * Sy / Sx. We already know how to calculate the correlation (r), so we only need to calculate the ratio between the standard deviations of y and x. Let’s go back to the…
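A short Python sketch can check the identity β1 = r * Sy / Sx against the direct least-squares slope. The data below are hypothetical and are not the dataset used in the post; sample standard deviations (n - 1 denominator) are assumed.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

sxx = sum((xi - x_mean) ** 2 for xi in x)
syy = sum((yi - y_mean) ** 2 for yi in y)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# Sample standard deviations and Pearson correlation.
s_x = math.sqrt(sxx / (n - 1))
s_y = math.sqrt(syy / (n - 1))
r = sxy / math.sqrt(sxx * syy)

beta1_from_r = r * s_y / s_x   # slope via correlation and the SD ratio
beta1_direct = sxy / sxx       # slope via least squares directly
beta0 = y_mean - beta1_from_r * x_mean

print(round(beta1_from_r, 4), round(beta1_direct, 4), round(beta0, 4))
```

Both routes give the same slope, because r * Sy / Sx simplifies algebraically to Sxy / Sxx.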
Simple linear regression (1/5)- correlation and covariance
Starting today, I’ll explain the simple linear regression model. There is a lot of information about linear regression on websites, but I believe I can tell you what most people don’t mention. My philosophy on data analysis and statistics is to fully understand the concept, not simply follow what software programs say. Therefore I usually calculate statistical concepts by hand, and only when my hand calculation exactly matches what the software provides do I say I understand the concept. In this context,…
geom_hline(data=data.frame(variety=c("A", "B")), aes(yintercept=c(195.4028, 206.0819)), linetype="dashed", color="Dark blue") +
What is Probability Density Function (PDF) and Cumulative Distribution Function (CDF): How to calculate using Excel and R ?
When we analyze data, we may need to show graphs depicting normal distributions. These graphs differ from density graphs as they convey various concepts that simple bar graphs cannot. While it is easy to draw these graphs in Excel, understanding the underlying concepts is crucial. In this article, I will explain what the Probability Density Function (PDF) is, and I will show how we can calculate it in both Excel and R. Here is a dataset of 1,000 individual wheat…
R-Squared in ANOVA: A Practical Approach to Calculation and Interpretation
Every time we discuss R2, we typically associate it with regression models. However, R2 also has a significant role in ANOVA. There seems to be less information available on how to calculate and interpret R2 in ANOVA, so today’s topic will focus on how to interpret this measure in the context of ANOVA. Let’s consider an example dataset. Suppose we measured the final yield at varying nitrogen levels. We established three replicates as a block. Consequently, this model will be…
In Excel, how to adjust x-y axis of graph at a time using VBA?
All the VBA code I have shared is collected on my GitHub. https://github.com/agronomy4future/VBA/blob/main/adjusting_axis Here is a dataset, and I made three bar graphs, one per location. You can download the data from my GitHub. https://github.com/agronomy4future/raw_data_practice/blob/main/VBA_practice.csv Now, I’d like to add titles to the x- and y-axis, and adjust the range and unit of the y-axis. Of course, we can change this in each graph, but if there are 100 graphs, will you still do it one by one? We don’t have time to…
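[Canada Agriculture Diary] 2022 maize harvest in the Guelph area
I wrapped up the 2022 maize field research and finished the harvest. Planting date: May 9, 2022. Flowering date: July 27, 2022. Maturity date: October 28, 2022. Harvest date: November 14, 2022.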
Attendance of the 2022 ASA, CSSA and SSSA International Annual Meeting at Baltimore, MD
I attended the 2022 ASA, CSSA and SSSA International Annual Meeting at Baltimore, MD, and presented my Ph.D research topic.
code
geom_text(aes(family="serif",fontface=6), x=65, y=0.02, label=paste("Pistolo\n",pistolo_CT_MEAN, "±", round(pistolo_CT_SE,digits=2)),size=7, col="Black")