How to upload data from Google Drive to Google Colab in an R environment?

■ How to use Google Colab for Python (power tool to analyze data)? In my previous post, I showed how to upload data from Google Drive to Google Colab in a Python environment. Google Colab is primarily Python-based, but we can now change the runtime to R and run R code in Google Colab. Today, I will show how to upload data from Google Drive to Google Colab in an R environment. First, let’s upload a data file to Google Drive…

[R package] R-Squared Calculation in Simple Linear Regression with Zero Intercept (Feat. Intercept0)

In my previous article, I argued that when the intercept is forced to zero in a simple linear regression model, the usual calculation R² = SSR / SST is incorrect. Instead, when the intercept is forced to zero, R² should be calculated as follows: R² = 1 – SSE (when intercept is 0) / SST (when intercept exists). ■ R-Squared Calculation in Linear Regression with Zero Intercept This is because only the SSE (Sum of Squared Errors) is calculated as Σ(yi – ŷi)², regardless of the presence of an…
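The calculation described in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not the Intercept0 package itself: the function name and the data are hypothetical, and SST is assumed to be the usual sum of squared deviations of y from its mean.

```python
def r_squared_zero_intercept(x, y):
    """Fit y = slope * x (no intercept) and return (slope, R^2).

    R^2 is computed as 1 - SSE(zero-intercept fit) / SST(about the mean of y).
    """
    n = len(y)
    # Least-squares slope for a line forced through the origin
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    # SSE: squared residuals from the zero-intercept fit
    sse = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    # SST: squared deviations of y from its mean (the SST of the intercept model)
    y_mean = sum(y) / n
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return slope, 1 - sse / sst


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, r2 = r_squared_zero_intercept(x, y)
print(f"slope = {slope:.4f}, R^2 = {r2:.4f}")
```

Because the zero-intercept SSE can exceed SST when the line through the origin fits poorly, this value can be negative, which is itself a useful warning sign.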

A Step-by-Step Guide to Creating an R Package and Uploading to GitHub

1) Create a folder and R package
I’ll create an R package named kimindex. This package will contain simple code to predict grain weight from the area of a wheat grain. I have already developed a model equation: y = x^1.32, where y is the wheat grain weight and x is the grain area.
Install the available package if it is not installed:
if (!requireNamespace("available", quietly = TRUE)) {install.packages("available")}
Check package name availability:
available::available("interptools")
The basic R package files for kimindex have been…
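For readers who want to see the model equation in action before packaging it, here is a small Python sketch of y = x^1.32. The function name and the example areas are hypothetical, and the excerpt does not state the units, so none are assumed here.

```python
def predict_grain_weight(area):
    """Predict wheat grain weight from grain area using y = x^1.32."""
    if area < 0:
        raise ValueError("grain area must be non-negative")
    return area ** 1.32


# Hypothetical grain areas, for illustration only
for area in (5.0, 10.0, 15.0):
    print(f"area = {area:5.1f} -> predicted weight = {predict_grain_weight(area):.2f}")
```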

How to Upload Data from GitHub Using R and Python?

I have soybean yield data that I want to upload to GitHub and access from R. First, let’s upload the data to GitHub. The data should be in .csv format. Click Add file, choose Upload files, and, after uploading, select the Raw button to view the .csv data as text. You can then find the address for this data, starting with https://raw.githubusercontent.com/… Let’s copy this address. Next, I’ll bring this data into R from GitHub. Before that, let’s…

[Data article] Simulating Crop Growth Over Time Using a Sigmoid Growth Model

I’m planning to collect biomass samples frequently to observe how biomass accumulation differs among treatments or varieties over time. I assume that growth will follow a sigmoid curve: slow accumulation during the early growing stage, followed by rapid growth, and eventually a plateau. I want to visualize this curve through simulation, and here is the Python code to demonstrate it. First, let’s import the required packages. I’ll also set a seed for reproducibility. Next,…
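The post's own code is not shown in this excerpt, so the following is only a sketch of the idea. It assumes a three-parameter logistic function as the sigmoid, and every parameter value (maximum biomass, growth rate, midpoint, noise level) is hypothetical.

```python
import math
import random

random.seed(42)  # fixed seed so the simulated noise is reproducible


def logistic_biomass(day, max_biomass, rate, midpoint):
    """Three-parameter logistic curve: slow start, rapid growth, plateau."""
    return max_biomass / (1 + math.exp(-rate * (day - midpoint)))


# Hypothetical sampling days and curve parameters, for illustration only
days = list(range(0, 121, 10))
simulated = [
    # Add Gaussian sampling noise and keep biomass non-negative
    max(0.0, logistic_biomass(d, 1000.0, 0.1, 60.0) + random.gauss(0, 20.0))
    for d in days
]

for d, b in zip(days, simulated):
    print(f"day {d:3d}: {b:7.1f}")
```

Changing the rate or midpoint per treatment is one simple way to simulate how varieties might differ in the timing of rapid growth.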

[STAT Article] RMSE Calculation with Excel and R: A Comprehensive Guide

When running statistical programs, you might encounter RMSE (Root Mean Square Error). For example, the table below shows RMSE values obtained from SAS, indicating that it is ca. 2.72. I’m curious about how RMSE is calculated. Below is the equation for RMSE. First, calculate the difference between the estimated and observed values: (ŷi – yi), and then square the difference: (ŷi – yi)². Second, calculate the sum of squares: Σ(ŷi – yi)². Third, divide the sum of squares by the…
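The steps listed above can be sketched in Python as follows. The excerpt is cut off before it names the divisor, so the divisor is left as a parameter here: the default (the number of observations) is the common textbook definition, while regression output often divides by the residual degrees of freedom instead. The function name and data are hypothetical.

```python
import math


def rmse(observed, predicted, divisor=None):
    """Root mean square error, following the steps in the text.

    divisor: what the sum of squares is divided by. Defaults to the number
    of observations (an assumption; the excerpt does not state it).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    # Steps 1 and 2: square each difference and sum the squares
    sum_of_squares = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # Step 3: divide by the chosen divisor, then take the square root
    if divisor is None:
        divisor = len(observed)
    return math.sqrt(sum_of_squares / divisor)


# Hypothetical observed and estimated values, for illustration only
observed = [10.0, 12.0, 15.0, 19.0]
predicted = [11.0, 12.5, 14.0, 18.0]
print(f"RMSE (divide by n)     = {rmse(observed, predicted):.4f}")
print(f"RMSE (divide by n - 2) = {rmse(observed, predicted, divisor=len(observed) - 2):.4f}")
```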

What is split-plot design in agronomy research?

Split-plot design has been widely used, particularly in agronomy research. In a split-plot design, the experimental units are divided into smaller units. Split-plot designs are useful when some factors are difficult or expensive to change or when the levels of the factors cannot be randomized (I’ll explain in detail later). A split-plot design consists of one whole plot and one subplot. The whole plot factor is randomly assigned to the experimental units, while the subplot factor is applied to a smaller…

[Data article] Let’s Predict Intermediate Data Using Linear Interpolation

Today, I’ll explain linear interpolation, a method for predicting values that fall between data points. For example, when collecting data in the field, we cannot collect data every day. So we collect data at regular intervals (weekly, biweekly, etc.). However, when presenting the data, we sometimes need to show it on a daily basis. As another example, suppose we are investigating differences in crop yield in response to nitrogen fertilizer rates of 0 kg/ha, 30 kg/ha, 60 kg/ha, and 120 kg/ha. If we need to show the yield difference at every nitrogen rate from 0 to 120, how can we estimate the data? In this situation…

[R Package] Convert Data into Code Instantly – Save as a Script with One Line

When uploading data to R, we sometimes worry about losing track of the data over time. This is because we save data in different folders according to various projects, and we might forget where we stored it. Additionally, if the file path changes, it can be difficult to upload the data directly and locate its current location. Therefore, a better approach is to save the data as code, allowing us to access it directly when opening the R file where…

[R package] An easy way to use interpolation code to predict in-between data points

In my previous post, I explained how to calculate interpolation to predict in-between data points. ■ [Data article] Predicting Intermediate Data Points with Linear Interpolation in Excel and R To make interpolation calculations easier, particularly for groups, I recently developed a new R package, interpolate(). First, let’s upload a dataset. This dataset contains chlorophyll content measurements for sorghum and soybean. I measured chlorophyll content every 10 days between 65 and 125 days after sowing, with four replicates at each time…

[Data article] Predicting Intermediate Data Points with Linear Interpolation in Excel and R

Today, I’ll explain the interpolation technique used to predict in-between data points. For example, when collecting field data, we might not be able to gather information every day, so we establish our own interval (e.g., weekly or bi-weekly). However, when presenting the data, it might be necessary to show it on a daily basis. As another example, consider investigating yield differences in response to varying continuous variables, such as nitrogen at levels of 0, 30, 60, 120. What if we…
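As a rough illustration of the technique, the sketch below interpolates linearly between measured points. The nitrogen levels come from the text; the yield values, the function name, and the variable names are hypothetical.

```python
def interpolate_at(x, xs, ys):
    """Linearly interpolate y at x from known points.

    xs must be sorted in strictly increasing order; ys are the measured values.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the measured range")
    # Find the pair of neighbouring points that brackets x
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


nitrogen = [0, 30, 60, 120]          # levels from the text
grain_yield = [2.0, 3.5, 4.4, 5.0]   # hypothetical yields, for illustration only

for n_rate in (15, 45, 90):
    print(f"N = {n_rate:3d}: estimated yield = {interpolate_at(n_rate, nitrogen, grain_yield):.2f}")
```

The function refuses to extrapolate beyond the measured range, since a straight line between two points says nothing reliable about values outside them.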

How to Combine Files and Create a New Data Table in MySQL

In my previous post, I showed how to combine multiple files into one using Access, and now I’ll explain how to do the same using MySQL. The SQL syntax is similar in both programs, so the code is nearly the same. First, I uploaded three different datasets to MySQL, and I want to combine them into one. I’ll use UNION to combine all the data. Now I want to save the combined result as a new data table. So, I’ll use this code. Now, new…
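The post's SQL is not shown in this excerpt. To keep the example runnable without a database server, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; the table names, column names, and rows are all hypothetical. The `UNION` and `CREATE TABLE ... AS SELECT` statements used here are also valid MySQL syntax.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (an assumption for runnability)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Three hypothetical datasets with the same columns
for name in ("field_a", "field_b", "field_c"):
    cur.execute(f"CREATE TABLE {name} (plot TEXT, grain_yield REAL)")
cur.execute("INSERT INTO field_a VALUES ('A1', 4.2)")
cur.execute("INSERT INTO field_b VALUES ('B1', 3.8)")
cur.execute("INSERT INTO field_c VALUES ('C1', 5.1)")

# Combine the three tables and save the result as a new table.
# UNION drops duplicate rows; UNION ALL would keep them.
cur.execute(
    """
    CREATE TABLE combined AS
    SELECT plot, grain_yield FROM field_a
    UNION
    SELECT plot, grain_yield FROM field_b
    UNION
    SELECT plot, grain_yield FROM field_c
    """
)

rows = cur.execute("SELECT plot, grain_yield FROM combined ORDER BY plot").fetchall()
for row in rows:
    print(row)
```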

How to Rename Variables within Columns in R (feat. case_when() code)?

In my previous post, I showed how to change variable names within columns. In that post, I provided simple code to rename variables and also used the stringr package for renaming them. ■ How to Rename Variables within Columns in R? Today I’ll introduce another way to rename variables, using the dplyr package. In my previous post, using the simple data above, I showed how to rename variables. For example, we can rename variables using the code below. Or,…

How to convert to a .json file using Python?

Sometimes we need to convert our data to .json format, and I will introduce an easy way to do it using Python. I will use Google Colab. First, let’s mount Google Drive to Google Colab. Second, let’s upload a dataset from GitHub. I’ll convert this data to a .json file and download it to my PC. Alternatively, I can save it directly in Google Colab. Now the .json file is created. Let’s upload this .json file to Google Colab. When uploading…
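The conversion step itself can be sketched with the Python standard library alone. This is not the post's code: the CSV content is a hypothetical stand-in for the GitHub dataset, and the Colab-specific steps (mounting Drive, downloading the file) are left out.

```python
import csv
import io
import json

# Hypothetical CSV content standing in for the dataset, for illustration only
csv_text = "treatment,grain_yield\nN0,3.1\nN1,4.0\nN2,4.6\n"

# Read each CSV row into a dict keyed by the header names
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Serialize the list of row dicts as JSON text
json_text = json.dumps(rows, indent=2)
print(json_text)

# To save to disk instead (hypothetical file name):
# with open("data.json", "w") as f:
#     json.dump(rows, f, indent=2)
```

Note that `csv.DictReader` reads every value as a string, so numeric columns need converting before the dump if numbers are wanted in the JSON output.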

How to Use Temporary Tables for Quick Calculations in MySQL?

In MySQL, we sometimes need to calculate an average or another statistic for filtered data. This is much easier if we create temporary tables for the filtered data. Here is an example. First, let’s create a database. Second, I’ll create a data table. Let’s check that the data table was created correctly. Now, I want to calculate the average root and total biomass per treatment. Next, I want to calculate the average again, but excluding the treatment N3. So, I’ll run this…
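The workflow above can be sketched as follows. Python's built-in sqlite3 module stands in for MySQL so the example runs without a server; the table name, column names, and values are hypothetical. `CREATE TEMPORARY TABLE ... AS SELECT` is also valid MySQL syntax.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (an assumption for runnability)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical biomass data per treatment
cur.execute("CREATE TABLE biomass (treatment TEXT, root_biomass REAL, total_biomass REAL)")
cur.executemany(
    "INSERT INTO biomass VALUES (?, ?, ?)",
    [
        ("N1", 1.0, 5.0), ("N1", 3.0, 7.0),
        ("N2", 2.0, 6.0), ("N2", 4.0, 8.0),
        ("N3", 9.0, 20.0),
    ],
)

# Temporary table holding only the filtered rows (treatment N3 excluded);
# it disappears when the connection closes
cur.execute(
    "CREATE TEMPORARY TABLE biomass_filtered AS "
    "SELECT * FROM biomass WHERE treatment <> 'N3'"
)

# Averages per treatment, computed on the filtered data
averages = cur.execute(
    "SELECT treatment, AVG(root_biomass), AVG(total_biomass) "
    "FROM biomass_filtered GROUP BY treatment ORDER BY treatment"
).fetchall()
for row in averages:
    print(row)
```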

Visualizing Geospatial Data with Folium in Python

Recently, I saw the QS World University Rankings by Subject 2024: Agriculture & Forestry, which shows the global university rankings for agriculture and forestry science. That made me curious to mark the U.S. agricultural universities on a map to see where they are located. I found that the Folium package in Python provides an excellent GIS map with an easy process, and I am sharing the code here. First, using Python, I’ll…

How to automatically insert linear regression equation in graph in RStudio?

Sometimes, we need to insert a linear regression equation inside a graph, but it’s annoying to type the equation every time we generate a linear regression graph. Using stat_poly_eq(), we can insert a linear regression equation automatically. Let’s generate a data frame. Then, I’ll generate a regression graph. Now let’s run a linear regression. The linear model equation is y = 9.1429 + 1.5357x and R² is 0.9245. Now I’ll insert this equation automatically using stat_poly_eq(). I’ll add the…

[Article] Tiny Plants Reveal Big Potential for Boosting Crop Efficiency – Boyce Thompson Institute

Scientists have long sought ways to help plants turn more carbon dioxide (CO₂) into biomass, which could boost crop yields and even combat climate change. Recent research suggests that a group of unique, often overlooked plants called hornworts may hold the key. “Hornworts possess a remarkable ability that is unique among land plants: they have a natural turbocharger for photosynthesis,” said Tanner Robison, a graduate student at the Boyce Thompson Institute (BTI) and first author of the paper recently published in Nature Plants….

[Wise Cornell Life 101] How to Get from Korea to Ithaca

“Wise Cornell Life 101” is a project that aims to share useful information with newcomers by organizing, in chronological order, what I experienced while first settling in. For reference, the “Wise Urbana-Champaign Life 101” project was successful and provided a variety of local information to people newly arriving in Urbana-Champaign (e.g., How to Get from Korea to Urbana-Champaign). Table of contents 1. [Wise Cornell Life 101] How to Get from Korea to Ithaca I wrapped up my work with the research group at the University of Illinois at Urbana-Champaign and moved safely to Ithaca. From the Midwest, where field crop research is active, field crops somewhat…

[STAT Article] Steps to Calculate Log-Likelihood Prior to AIC and BIC: [Part 2] ANOVA model

In my previous post, I explained how to calculate the Log-Likelihood, AIC, and BIC in a regression model. In this post, I will demonstrate the same concepts, but in the context of an ANOVA model. Here I have one dataset. Let’s say this data represents yield in response to different fertilizer types (Control, Slow, and Fast), and I want to determine the effect of fertilizer type on yield. Therefore, I will perform a one-way ANOVA. Now, I observe that the…

2024 ASA, CSSA, SSSA International Annual Meeting in San Antonio, TX

I went to San Antonio to attend the ASA, CSSA, SSSA International Annual Meeting. It was my first time visiting Texas, and the weather was great! I gave an oral presentation on the agrivoltaics study I have conducted over the last two seasons. The title was ‘Shading Impacts on Sorghum and Soybean Grain Yields Under Agrivoltaics Systems: Source-Sink Strength in Response to Shading.’ Agrivoltaic (AV) systems induce shading throughout the entire crop growth period, and understanding how these shading patterns…

[STAT Article] Steps to Calculate Log-Likelihood Prior to AIC and BIC: [Part 1] regression model

Here I have one dataset. I want to predict grain weight using grain dimension data such as length, width, and area, and identify the best prediction model for estimating grain weight. As a result, I developed the following models. I’ll calculate the Log-Likelihood for each model. To do that, I need to know each model equation. Now that I have each model equation, I’ll calculate the Log-Likelihood. For a linear regression model, the Log-Likelihood (LL) is defined as: where: n is the…
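The formula itself is missing from this excerpt, so the sketch below uses the standard Gaussian log-likelihood for a linear model, LL = -(n/2)·ln(2π) - (n/2)·ln(σ²) - SSE/(2σ²), with the maximum-likelihood variance σ² = SSE/n. That choice is an assumption and may differ from the post's notation. AIC = 2k - 2LL and BIC = k·ln(n) - 2LL are the usual definitions; what counts toward k (for example, whether the error variance is included) is also an assumption to check against your software. The residuals are hypothetical.

```python
import math


def gaussian_log_likelihood(residuals):
    """Log-likelihood of a linear model with normal errors, using sigma^2 = SSE / n."""
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    sigma2 = sse / n  # maximum-likelihood estimate of the error variance
    return (
        -n / 2 * math.log(2 * math.pi)
        - n / 2 * math.log(sigma2)
        - sse / (2 * sigma2)
    )


def aic(log_likelihood, k):
    """Akaike information criterion; k is the number of estimated parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion; n is the number of observations."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical residuals (observed minus fitted), for illustration only
residuals = [1.0, -1.0, 1.0, -1.0]
ll = gaussian_log_likelihood(residuals)
k = 3  # assumed: intercept, slope, and error variance
print(f"LL = {ll:.4f}, AIC = {aic(ll, k):.4f}, BIC = {bic(ll, k, len(residuals)):.4f}")
```

When comparing candidate models on the same data, the model with the lower AIC or BIC is preferred.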

[STAT Article] Step-by-Step Guide to Calculating and Analyzing Principal Component Analysis (PCA) by Hand

Principal Component Analysis (PCA) is a statistical technique used for dimensionality reduction while preserving as much variability in the data as possible. It transforms the original variables in a dataset into a new set of uncorrelated variables called principal components, ordered by the amount of variance they capture from the original dataset. Here are the steps of Principal Component Analysis (PCA). 1. Standardize the Data: Since PCA is affected by the scale of the variables, it often begins with standardizing the…
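To make the by-hand procedure concrete, here is a sketch for the simplest case of two variables. After standardizing, the correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + |r| and 1 - |r|, so no numerical eigen-solver is needed. The data and function names are hypothetical, and the remaining steps of the post are not shown in this excerpt.

```python
import math
import statistics


def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pca_two_variables(x, y):
    """PCA on the correlation matrix of two variables, computed by hand."""
    zx, zy = standardize(x), standardize(y)
    n = len(x)
    # Correlation coefficient of the standardized variables
    r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)
    # Eigenvalues of [[1, r], [r, 1]], largest first
    eigenvalues = [1 + abs(r), 1 - abs(r)]
    # Eigenvectors are (1, sign)/sqrt(2) and (1, -sign)/sqrt(2)
    sign = 1 if r >= 0 else -1
    s = 1 / math.sqrt(2)
    pc1 = [s * (a + sign * b) for a, b in zip(zx, zy)]
    pc2 = [s * (a - sign * b) for a, b in zip(zx, zy)]
    # Share of the total variance (2, for two standardized variables)
    explained = [ev / 2 for ev in eigenvalues]
    return r, eigenvalues, explained, pc1, pc2


# Hypothetical measurements, for illustration only
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
r, eigenvalues, explained, pc1, pc2 = pca_two_variables(x, y)
print(f"r = {r:.3f}, eigenvalues = {eigenvalues}, explained = {explained}")
```

The variance of each component's scores equals its eigenvalue, which is a quick way to check a hand calculation.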

Understanding Mean Absolute Error (MAE) in ANOVA: A Step-by-Step Guide to Calculation in Excel

Mean Absolute Error (MAE) is a metric used to measure the accuracy of a model’s predictions. It calculates the average magnitude of the errors in a set of predictions, without considering their direction. In other words, MAE measures the average absolute difference between the actual values and the predicted values. MAE is typically used in the context of regression analysis and prediction error evaluation, rather than in ANOVA (Analysis of Variance), which focuses on comparing the means of different groups….
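The definition above translates directly into code. In the sketch below, the predicted value for each observation is taken to be its group mean, which is an assumption about how MAE would be applied in an ANOVA setting; the group names and values are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical yields per group, for illustration only
groups = {"Control": [10.0, 12.0], "Slow": [14.0, 18.0]}

# Assumed ANOVA-style prediction: each observation is predicted by its group mean
actual, predicted = [], []
for values in groups.values():
    group_mean = sum(values) / len(values)
    actual.extend(values)
    predicted.extend([group_mean] * len(values))

anova_mae = mean_absolute_error(actual, predicted)
print(f"MAE = {anova_mae}")
```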

Practices in Data Normalization using normtools() in R

■ [R package] Normalization Methods for Data Scaling (Feat. normtools) In my previous post, I introduced the R package normtools(), which I developed to normalize data using various methods. This time, I’ll demonstrate how to use the R package normtools() for data normalization. 1. Data upload This data includes kernel number (KN), average kernel weight (AGW), and grain yield (GY) for different corn varieties across various years, populations, and locations. 2. Data normalization This is the normtools() package. First, I’ll…

Sorghum panicle damage

The damage to sorghum grain can result from a variety of causes, including environmental, biological, and mechanical factors. Here are some common causes: 1. Excessive Rainfall and Humidity 2. Pest Infestation 3. Temperature Stress 4. Mechanical Damage During Harvest 5. Soil Conditions 6. Delayed Harvest 7. Post-Harvest Factors To minimize sorghum grain damage, it is crucial to manage environmental conditions, ensure proper timing of harvest, and implement effective pest control and storage techniques. ■ References □ Rain devastates Downs sorghum…

How to install Llama 3 on your PC?

Llama 3, or Large Language Model Meta AI 3, is an advanced iteration of Meta’s language models, designed to facilitate a wide array of natural language processing tasks with enhanced capabilities. This model leverages state-of-the-art techniques in deep learning and transformer architectures, providing improved performance in text generation, comprehension, and contextual awareness. We can install Llama 3 on a PC as follows. 1. Visit ollama.com and click the Download button. Select your OS and download. https://ollama.com After downloading, run the OllamaSetup file….

[R package] Normalization Methods for Data Scaling (Feat. normtools)

■ [Data article] Data Normalization Techniques: Excel and R as the Initial Steps in Machine Learning In my previous post, I explained how to normalize data using various methods and demonstrated how to perform the calculations for each method. To simplify these calculations, I recently developed an R package that easily generates normalized data. 1. Install the normtools() package 2. Basic code format 3. Practice with actual dataset (data upload) 4. Normalize data 4.1. Z-test normalization 4.2. Robust Scaling 4.3….
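As a rough illustration of the first two methods in the list, the sketch below implements them with the Python standard library. It does not reproduce the normtools package: item 4.1 is assumed here to mean z-score standardization, (x - mean) / standard deviation, robust scaling is assumed to be (x - median) / IQR, and the data are hypothetical.

```python
import statistics


def z_score(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def robust_scale(values):
    """Center on the median and scale by the interquartile range (IQR)."""
    median = statistics.median(values)
    # method="inclusive" interpolates quartiles within the observed range
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical measurements, for illustration only
data = [1.0, 2.0, 3.0, 4.0, 5.0]
print("z-score:", [round(z, 3) for z in z_score(data)])
print("robust :", robust_scale(data))
```

Robust scaling is less sensitive to outliers than z-scores because the median and IQR barely move when a single extreme value is added.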
