The Best Linear Unbiased Estimator (BLUE): Step-by-Step Guide

In this session, I will introduce the method of calculating the Best Linear Unbiased Estimator (BLUE). Instead of simply listing formulas as many websites do to explain BLUE, this post aims to help readers understand the process of calculating BLUE with an actual dataset using R. I have the following data. The dataset comprises three … Read more

The Geometric Completion of Spacetime

Time is not a flowing river, but a massive, frozen block of ice where the past, present, and future coexist simultaneously within a four-dimensional spacetime continuum. As Albert Einstein famously noted: “The distinction between past, present, and future is only a stubbornly persistent illusion.” 1. The Geometric Completion of Spacetime As three-dimensional beings, we only … Read more

My own Twitter-style news feed

Today, I developed my own Twitter-style news feed, which you can see now. I added a DNS record to my website, agronomy4future.com, and the current address is today.agronomy4future.com. Do I know how to code? No. Did I major in programming? No. But I developed my own platform by collaborating with Claude. I simply shared my … Read more

facetext() R Package: Easy Text Annotation for ggplot2 Faceted Plots

In my previous post, I demonstrated how to add distinct text to panels created with facet_wrap() or facet_grid(), and I proposed the following simple method. I first created a data frame containing the labels, which I then used to map the text to the corresponding panels. Let’s practice using this method. This data is from … Read more

Database Flow Map

Recently, I set up my database workflow using a Virtual Private Server (VPS) and developed R code to easily access and analyze this database. I’ll introduce them one by one. 1) Data in local PC to MySQL Server I want to upload the data on my PC to a MySQL server hosted on a Virtual … Read more

Firth’s Logistic Regression: Solving the Problem of Separation

In my previous post, I explained what logistic regression is, along with odds, the odds ratio, and the model equation. Today, I’ll introduce Firth’s Logistic Regression. Standard logistic regression is mostly used when the sample size is sufficiently large and the outcome variable (0 and 1) is well-balanced. Also, it is used when there is no … Read more

Build a Cloud-Synced CLI Worklog: The Ultimate Minimalist Setup for Linux

In a modern workflow, we constantly bounce between various web-based note-taking tools. For a developer or researcher, the act of opening a browser, logging in, and navigating to a page is a significant context switch. When you are deep in a coding session or analyzing data in the terminal, grabbing the mouse to record a … Read more

How to Sync OneDrive (or Google Drive or Box) on Linux using Rclone?

On Linux, OneDrive cannot be downloaded and synchronized as seamlessly as it is in a Windows environment. Today, I will provide a step-by-step guide on how to install and sync OneDrive on your Linux PC. 1) Set up a new remote connection. First, let’s open the terminal using the shortcut Ctrl + Alt + T … Read more

phenokio() R Package: Grain Size Analysis – Length, Width, and Area Metrics

When analyzing grain size, we’ve used high-throughput image scanning machines. However, if we can detect grains using R code, estimating grain size becomes possible. In my previous R function, colorcapture(), we were able to detect fruit size and estimate its surface area in 2D. Recently, I developed a new R function called phenokio() specifically designed … Read more

[Agronomy article] Nitrogen Cycle in Soil

1) Nitrogen Immobilization Nitrogen immobilization is the process by which soil microorganisms take up inorganic nitrogen (mainly NH₄⁺ and NO₃⁻) from the soil solution and convert it into organic forms inside their own cells. In other words, the N is temporarily “locked up” in microbial cells and is not available for plants. Microbes need nitrogen … Read more

Literature Mining for Meta-Analysis Using the scopusmining Package

Meta-analysis is a quantitative method that synthesizes results from multiple independent studies to identify overall patterns, effect sizes, and sources of variability for specific treatments or research questions. It is particularly powerful for summarizing existing evidence and placing new findings within the context of current scientific trends through transparent and reproducible statistical integration of prior … Read more

Converting Rows to Columns in R: A Guide to Transposing Data (feat. pivot_wider and pivot_longer)

When data is arranged, it can be structured either vertically (row-based) or horizontally (column-based). The choice depends on your preference for organizing data. However, when running statistics, data should be arranged row-based, as variables need to be in the same column. On the other hand, when calculating per variable, it is much easier to organize … Read more

[R package] Log Response Ratio and Effect Size Calculation (feat. lrr)

When conducting a meta-analysis, results from multiple studies must be synthesized despite differences in measurement scales, experimental designs, and reported units. To enable meaningful comparisons across studies, effect sizes are commonly used. One of the most widely applied effect size metrics in ecological and agricultural research is the Log Response Ratio (LRR). The Log Response … Read more
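As a rough sketch of the metric named above, the Log Response Ratio is usually written as the natural log of the treatment mean over the control mean. The symbols here are assumptions for illustration and are not taken from the lrr package: \bar{X}_T and \bar{X}_C are the treatment and control means, s_T and s_C their standard deviations, and n_T and n_C the sample sizes. The variance line is a commonly used approximation and may differ from what the package computes.

```latex
\begin{align*}
\mathrm{LRR} &= \ln\!\left(\frac{\bar{X}_T}{\bar{X}_C}\right)
              = \ln \bar{X}_T - \ln \bar{X}_C \\
% commonly used approximate sampling variance of LRR
v_{\mathrm{LRR}} &\approx \frac{s_T^2}{n_T\,\bar{X}_T^2} + \frac{s_C^2}{n_C\,\bar{X}_C^2} \\
% back-transformation to a percentage change relative to the control
\%\,\text{change} &= \left(e^{\mathrm{LRR}} - 1\right) \times 100
\end{align*}
```

An LRR of 0 means no difference between treatment and control; positive values mean the treatment mean is larger.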

Confidence interval (CI) formula for a two-sample t-test

When performing a t-test, a confidence interval can be obtained. Below, I describe how to calculate the confidence interval manually, step by step: 1) Difference in means, 2) Pooled variance, 3) Standard error, 4) Degrees of freedom, 5) Confidence interval. Let’s calculate the confidence interval. We aim to develop open-source code for agronomy … Read more
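The five steps listed above can be sketched as the following formulas for the equal-variance (pooled) two-sample case. The notation is an assumption for illustration and may differ from the full post: \bar{x}_i, s_i^2, and n_i are the mean, sample variance, and size of group i, and 1 - \alpha is the confidence level.

```latex
\begin{align*}
% 1) Difference in means
d &= \bar{x}_1 - \bar{x}_2 \\
% 2) Pooled variance
s_p^2 &= \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} \\
% 3) Standard error of the difference
\mathrm{SE} &= \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \\
% 4) Degrees of freedom
\mathrm{df} &= n_1 + n_2 - 2 \\
% 5) Confidence interval
\mathrm{CI}_{1-\alpha} &= d \pm t_{1-\alpha/2,\ \mathrm{df}} \times \mathrm{SE}
\end{align*}
```

If the two groups cannot be assumed to share a variance, the pooled steps no longer apply and a Welch-type standard error and degrees of freedom would be used instead.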

Understanding Bayes’ Theorem Step by Step

Recently, I’ve been focusing on Bayesian statistics. To organize the concepts for myself as well, I’m going to explain Bayes’ theorem as simply as possible. Let’s import a dataset from Kaggle. https://www.kaggle.com/datasets/cameronseamons/electronic-sales-sep2023-sep2024 This dataset contains information used to analyze customer purchasing behavior at an electronics store. You can download it from Kaggle after creating an … Read more

[R package] Cook’s Distance Diagnostics and Outlier Detection (Feat. datacooks)

In my previous post, I explained how to calculate Cook’s Distance step by step, and noted that in R you can simply use the function cooks.distance(). However, this simple function only provides the Cook’s Distance values. That post walked through the formula for Cook’s Distance, including how to compute residuals, … Read more

How to analyze quadratic plateau model in R Studio?

Previous post: How to analyze linear plateau model in R Studio? In my previous post, I explained how to analyze the linear plateau model. I simulated yield data for five different crop varieties with varying sulphur applications and suggested that the optimum sulphur application would be 23.3 kg/ha based on the linear plateau model. In … Read more

What is the Gamma Distribution? Shape and Scale Parameters, and the Probability Density Function (PDF)

The Gamma distribution is a flexible family of continuous probability distributions defined only for positive values (x > 0). It’s commonly used to model quantities that represent time, size, or waiting periods: anything that can’t go below zero and often shows right-skewed behavior (a long tail toward larger values). In essence, the Gamma distribution describes how likely different … Read more
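For reference, a minimal statement of the probability density function in the shape and scale parameterization named in the title. The symbols k (shape) and \theta (scale) are assumed here and may differ from the notation in the full post.

```latex
\begin{align*}
f(x;\,k,\theta) &= \frac{x^{k-1}\,e^{-x/\theta}}{\Gamma(k)\,\theta^{k}},
  \qquad x > 0,\; k > 0,\; \theta > 0 \\
% Gamma function used as the normalizing constant
\Gamma(k) &= \int_0^{\infty} t^{k-1} e^{-t}\,dt \\
% mean and variance under this parameterization
\mathbb{E}[X] &= k\theta, \qquad \operatorname{Var}(X) = k\theta^{2}
\end{align*}
```

With k = 1 this reduces to the exponential distribution with mean \theta. Larger k moves the mass away from zero and makes the curve less skewed.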

How to Set Up RStudio Server on a Linux-Based Virtual Private Server (VPS)

A Virtual Private Server (VPS) is a virtualized computer within a larger physical server. It acts like an independent server, offering dedicated resources and control at a lower cost than a full physical machine. VPS hosting uses this setup to give users private, customizable environments for web hosting or applications. Some reputable VPS providers include … Read more

[Data article] How to Import Data from MySQL Server to R?

In my previous post, I introduced how to import data into a Cloud MySQL database using Python from the Command Prompt. By typing the code below in your Command Prompt, you can automatically import data into your MySQL server. This is the next step. After importing data into the MySQL server, what if I want … Read more

[R package] Segment and Measure Colored Objects in Images (Feat. colorcapture)

This colorcapture() R function provides easy image analysis to estimate fruit surface area. 1. Install the package Before installing colorcapture(), please download Rtools (https://cran.r-project.org/bin/windows/Rtools), and install the following package: colorcapture(). 2. Basic code If you want to change other criteria, type ?colorcapture to see detailed information about the colorcapture() function. The default code is set to capture the … Read more

[R package] Segment and Measure Green Objects in Images (Feat. greencapture)

Leaf area and fruit surface area have usually been measured manually, which is time-consuming and often inaccurate, especially when the shape is not perpendicular. For more reliable measurement, image analysis provides a good alternative. To facilitate this process, I developed an R function, greencapture(), that captures the green area of leaves or … Read more

[R package] Compute Cumulative Summaries of Grouped Data (Feat. datacume)

When analyzing data, we sometimes need to analyze cumulative data. When calculating cumulative data, grouping is important, and it takes time to perform the grouping and calculations. To simplify this process, I developed an R package called datacume(). Let’s upload the dataset. This dataset contains biomass measurements across different treatments over time, recorded for various … Read more