[Data article] Data Normalization Techniques: Excel and R as the Initial Steps in Machine Learning

In my previous post, I explained why data normalization is necessary when visualizing data. That post shows how data can be rescaled to suit our purposes.

□ Why is data normalization necessary when visualizing data?

Today, I’ll introduce several methods for data normalization, using the biomass and N and P uptake data available on my GitHub. I also aim to create regression graphs illustrating the relationship between biomass and either nitrogen or phosphorus…

[Data article] Why is data normalization necessary when visualizing data?

Data normalization is necessary when visualizing data for several reasons, and I believe the most important one is scale uniformity. Different variables can have vastly different scales and units. For example, grain yield might be expressed in Mg/ha, while nutrient contents are typically expressed in %. Normalizing these data to a common scale (such as 0 to 1) allows them to be compared and visualized on the same axis without one overshadowing the other because of its scale. Additionally,…
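
Rescaling to a common 0 to 1 range is often done with min-max normalization. The block below is a minimal sketch of that idea in Python, not the code from the post; the function name `min_max_normalize` and the example values are assumptions made up for illustration.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the 0-1 range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical example values (not taken from the author's dataset).
grain_yield_mg_ha = [4.0, 6.0, 8.0, 10.0]  # grain yield in Mg/ha
nutrient_pct = [1.0, 1.5, 2.0, 3.0]        # nutrient content in %

yield_scaled = min_max_normalize(grain_yield_mg_ha)
nutrient_scaled = min_max_normalize(nutrient_pct)

print(yield_scaled)
print(nutrient_scaled)
```

After rescaling, both variables run from 0 to 1, so they can share one axis even though the raw units differ.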

How to draw a y-axis border when using facet_wrap() in R? (feat. scales="free")

Here is one dataset, and I’ll use facet_wrap() to create bar graphs. First, let’s summarize the data. Then, I’ll create a bar graph using facet_wrap() to divide panels by irrigation. Now, I want to draw a y-axis border for the ‘Irrigation_Yes’ panel. We can achieve this simply by adding scales="free".

© 2022 – 2023 https://agronomy4future.com

How to randomize treatments using R?

Setting up an experimental design that fits your research goal is the first step toward a successful experiment. In agronomy studies, experimental design involves the combination of treatments deployed in the field, and these treatments should be randomized. Randomization is important because it helps protect experiments from biases caused by physical or biological factors. Of course, there are no specific, unconditional rules for randomization. In a very old-fashioned way, you can write treatment numbers on paper, and…
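
As a rough illustration of what randomizing treatments means in practice, the sketch below shuffles the treatment order independently within each block, as in a randomized complete block design. It is written in Python rather than R and is not the code from the post; the block layout, the treatment labels, and the function name `randomize_rcbd` are all assumptions.

```python
import random


def randomize_rcbd(treatments, n_blocks, seed=None):
    """Return one randomized treatment order per block.

    Every block contains each treatment exactly once, in an
    independently shuffled order.
    """
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)  # copy so the input list is not modified
        rng.shuffle(order)
        layout.append(order)
    return layout


# Hypothetical treatment labels and block count.
treatments = ["T1", "T2", "T3", "T4"]
layout = randomize_rcbd(treatments, n_blocks=3, seed=42)

for i, block in enumerate(layout, start=1):
    print(f"Block {i}: {block}")
```

Recording the seed alongside the field map makes it possible to regenerate exactly the same layout later.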

Achieving Smooth Curve Graphs with R

□ How to convert character to POSIXct format in R?

In my previous post, I created a curve graph. The curve is not very smooth, and I want to make it smoother. To do this, I will add geom_smooth() with method="gam".

Code summary: https://github.com/agronomy4future/r_code/blob/main/Achieving_Smooth_Curve_Graphs_with_R.ipynb

How to convert character to POSIXct format in R?

Here is one dataset. Let’s check the data type of each variable. The time column is in character format; when the data is opened in Excel, it is treated as text. I want to create a time series graph, but this is not possible while the variable is in text format, so we need to convert the text to a time format. After the conversion, we can adjust the time axis using scale_x_datetime().

Full summary: https://github.com/agronomy4future/r_code/blob/main/How_to_convert_character_to_POSIXct_format_in_R.ipynb

How to Convert Time to Numeric for Line Graphs in R?

Here is one dataset. With this data, I’ll create a line graph to show the change in day length over time. First, let’s transpose the columns to rows using pivot_longer(). I’ll sort the data by Day and Month, but since the month column is in text format, sorting it from January to December directly isn’t feasible. Therefore, I’ll add a number corresponding to each month for sorting purposes. Now, I can sort by ‘month1’ and ‘Day’ from January 1 to…

Summarizing Data by Group: Mean and Standard Error with MS Azure

□ Creating an Azure SQL Database: A step-by-step guide

In my previous post, I introduced how to set up Azure SQL Database. Today, let’s practice some SQL coding!

1) To create data tables
I created two data tables, YieldData and BiomassData.

2) To summarize data
I will summarize the data tables by calculating the mean and standard error for each.

How to merge two datasets?
Here is one more tip. I want to merge two datasets. Here is the…

Converting Character Values to Numeric in R: A How-To Guide

First, let’s create a dataset and observe the data format of each column. I have two sets of yield data: one in character format (yield column) and the other in numeric format (yield1 column).

How to convert missing values to 0 when data is numeric?
When the data is numeric (yield1 column) and contains missing values, how can we replace them with 0? Or you can also use the following code.

How to convert missing values to 0…

How to add separate text to panels divided by facet_wrap() in R?

□ Graph Partitioning Using facet_wrap() in R Studio
□ How to customize the title format in facet_wrap()?

In my previous posts, I introduced how to divide panels in one figure using facet_wrap(). Today, I’ll introduce how to add separate text to panels. First, let’s make sure we have the required packages installed. I’ll create a dataset as shown below: Next, I’ll reshape the dataset into columns to facilitate data analysis. And then, I’ll summarize this data using descriptive statistics. Finally, I’ll…

The Agrivoltaics Image created from DALL∙E3

DALL·E3, developed by OpenAI, is an advanced AI model capable of generating images from textual descriptions. It can create images based on a wide variety of prompts, ranging from straightforward descriptions to more imaginative or abstract concepts. ChatGPT – DALL·E (openai.com) I requested images from DALL·E depicting Agrivoltaics farming, and these are the results.

Quantifying pre- and post-anthesis heat waves on grain number and grain weight of contrasting wheat cultivars

The study titled “Quantifying pre- and post-anthesis heat waves on grain number and grain weight of contrasting wheat cultivars” investigates the impact of heat stress on wheat productivity. As temperatures rise, wheat faces challenges in maintaining grain yield. Heat stress adversely affects two critical yield components: grain number per m² (GN) and average grain weight (AGW). However, it remains unclear whether the sensitivity of these components differs and…

How to summarize data using Python?

In my previous post, I demonstrated how to create a data table using Python. If you’re interested, please refer to the post below.

■ How to create a data table in Python?

I’ll summarize this data by mean and standard error.

Full code: https://github.com/agronomy4future/python_code/blob/main/How_to_summarize_data_using_Python.ipynb
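
For readers who want the idea without opening the notebook, here is a minimal sketch of a mean and standard error summary by group. It uses only the Python standard library and is not the code from the linked notebook; the function name `summarize_by_group`, the group labels, and the values are assumptions.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def summarize_by_group(rows):
    """Return {group: (mean, standard_error)} for (group, value) pairs."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)

    summary = {}
    for group, values in groups.items():
        # Standard error = sample standard deviation / sqrt(n).
        # It is undefined for a single observation, so report NaN there.
        if len(values) > 1:
            se = stdev(values) / sqrt(len(values))
        else:
            se = float("nan")
        summary[group] = (mean(values), se)
    return summary


# Hypothetical data: (group label, measured value).
rows = [("A", 10.0), ("A", 12.0), ("A", 14.0), ("B", 20.0), ("B", 24.0)]
summary = summarize_by_group(rows)

for group, (avg, se) in summary.items():
    print(f"{group}: mean={avg:.2f}, se={se:.2f}")
```

Note that `stdev` is the sample standard deviation (n - 1 in the denominator), which is the usual choice when the standard error is reported for replicated field data.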

Generating Graphs and Summarizing Data Tables in Data Analyst By ChatGPT (feat. texting to coding)

If you upgrade to ChatGPT Plus, you can access Data Analyst, where “you can create graphs by texting instead of coding”. Let’s upload a dataset into Data Analyst. This dataset contains data about Fe uptake in wheat grains. If you run the following R code, you can download the data from my GitHub. After downloading the data, let’s proceed to Data Analyst, click the upload button, and upload the data file. ChatGPT – Data Analyst Starting now, I’ll be…

Efficient Data Management: Variable Filtering in SAS Studio

Today, I’ll introduce how to filter variables after uploading data to SAS Studio. First, let’s upload data to SAS Studio. I’ll summarize data based on Treatment_modified. I want to summarize for each unique ID, so I’ll add numbers from 1 up to the end and summarize the data again. Then, I’ll download this data to my PC.

1) To upload data to SAS
First, let’s upload the data to SAS. I’ll assign this data to the Test table that I…

Exploring Machine Learning Fundamentals: Predicting Survival on the Titanic

In 2024, one of my goals is to learn machine learning and to publish a crop physiology paper that uses machine learning in an academic journal. While taking online and offline machine learning courses, I discovered Kaggle, a popular platform for data science and machine learning competitions, datasets, and tutorials. Kaggle provides excellent datasets for practicing basic machine learning and data analysis. If you visit the Kaggle website: Titanic – Machine Learning from Disaster, you can access and download various datasets…

How to import Kaggle datasets directly into Google Colab?

Kaggle is a popular online platform for data science and machine learning competitions, datasets, and tutorials. You can find high-quality data on Kaggle to practice data analysis. I have uploaded some of my data on Kaggle to share it with others. Recently, I’ve begun learning machine learning, and one of the most fundamental datasets for this purpose is the Titanic dataset. By visiting the website below, you can download the Titanic survivor data and practice machine learning with this foundational…

In R, Drawing Lines with Different X-axis Starting Positions

In R, I want to draw a line in a graph, and first, I’ll create the data. Next, I’ll create a bar graph. In this graph, I want to draw a horizontal line. The code to draw lines is introduced in the post below. □ Drawing Lines in ggplot() I added a horizontal line to represent the mean yield of all cultivars. Next, I would like to draw a horizontal line starting from Cultivar B. How can this be achieved?…

How to run R codes in Visual Studio Code? A Step-by-Step Guide

Visual Studio Code is a free and open-source code editor developed by Microsoft. It is a versatile editor that supports a wide range of programming languages, including, but not limited to, R, Python, SQL, JavaScript, TypeScript, Java, C#, and many others. The software provides a unified and user-friendly interface for developers working with different languages, making it a popular choice across various programming communities. Now, I’m working on my SQL code in Visual Studio Code, and recently, I’ve also been…

Matching Datasets in R: An Approach Comparable to Excel’s VLOOKUP Function

I have two datasets. I want to combine them, but they have different numbers of rows: in dataB, the 3rd replicate for Tr1 and the 2nd replicate for Tr3 were deleted due to environmental errors. In this case, simply combining the two datasets is not feasible. One option is to stack them row-wise using the rbind() function. However, my goal is to combine the two…

How to run R codes in Google Colab?

Google Colab is essentially a hosted Jupyter notebook environment, and by default it runs Python code. However, it is also possible to use R code in Google Colab. If you’re unfamiliar with Google Colab, please read the post below to grasp its general concept. □ How to use Google Colab for Python (power tool to analyze data)? When opening a new Google Colab window, navigate to Runtime in the menu, choose Change runtime type, and a new window will appear,…
