R Programming Journal-shanzay

Posts

Assignment 8

- October 19, 2025

In this project, I learned how to load data into R, summarize it, filter it, and save the results in different file types. First, I set my working directory so that R could find the input files and save output in one tidy location. After that, I used read.table() to read the data into a data frame. This helped me understand how headers and separators work when reading real data. Next, I used the ddply() function from the plyr package to find the average grade and age for each gender, which taught me how to produce grouped summaries in R. Lastly, I used the subset() and grepl() functions to practice filtering data: I selected all the students whose names started with "i" and saved both the names-only and the full filtered results. By following these steps, I learned how to use R to clean data, analyze it, and write the results to a file automatically. These are important skills for working with datasets quickly.

# R Code
setwd("C:/Users/shanz/OneDrive/Documents/Assigment 6")
x <- read.table("Assignme...
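The load, summarize, filter, and save workflow described in this post can be sketched as follows. This is a minimal sketch, not the assignment's own code: the inline data, the column names (Name, Age, Sex, Grade), and the output file name are hypothetical, and base R's aggregate() stands in for plyr's ddply() so the block runs without extra packages.

```r
# Hypothetical inline data standing in for the assignment's input file
raw <- "Name,Age,Sex,Grade
Ian,21,Male,80
Iris,22,Female,90
Omar,23,Male,70
Ines,20,Female,100"

# header = TRUE treats the first row as column names; sep sets the delimiter
students <- read.table(text = raw, header = TRUE, sep = ",",
                       stringsAsFactors = FALSE)

# Grouped summary: mean Age and Grade per Sex
# (base-R stand-in for plyr::ddply, chosen to avoid a package dependency)
by_sex <- aggregate(cbind(Age, Grade) ~ Sex, data = students, FUN = mean)

# Keep rows whose Name starts with "i", ignoring case
i_students <- subset(students, grepl("^i", Name, ignore.case = TRUE))

# Save the filtered rows to a temporary CSV file, then read them back
out_file <- file.path(tempdir(), "i_students.csv")
write.csv(i_students, out_file, row.names = FALSE)
reloaded <- read.csv(out_file, stringsAsFactors = FALSE)

print(by_sex)
print(i_students$Name)
```

Writing to tempdir() keeps the sketch from touching a real working directory; in the assignment, setwd() plays that role.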

Module 7. Assignment

- October 11, 2025

This week, I studied Object-Oriented Programming (OOP) in R and learned that R treats everything as an object, including integers, vectors, data frames, and functions. The lecture covered the two main OOP systems used in R: S3 and S4. S3 is the older, simpler system: you can quickly attach a class to a list and write custom print methods for it. S4 is the newer, more formal system, with built-in validation and explicit class definitions that use slots. We also examined how common generic functions such as summary(), print(), and plot() behave differently for different kinds of objects. Ultimately, I was able to create my own S3 and S4 objects, implement basic methods, and understand how R decides which version of a function to call, a process known as method dispatch.

# Load the built-in mtcars dataset
data("mtcars")
# Show the first few rows
head(mtcars)
# Describe its structure
str(mtcars)
# Test Generic Function...
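The difference between the two systems can be illustrated with a short sketch. The class names (student_s3, StudentS4), the fields, and the GPA validity rule below are hypothetical and are not taken from the assignment.

```r
library(methods)  # S4 support; attached by default in interactive sessions

# S3: attach a class attribute to a list and define a print method for it
student_s3 <- structure(list(name = "Alex", gpa = 3.5), class = "student_s3")

print.student_s3 <- function(x, ...) {
  cat("S3 student:", x$name, "GPA:", x$gpa, "\n")
  invisible(x)
}

# S4: formal class with typed slots and a validity check
setClass("StudentS4",
         representation(name = "character", gpa = "numeric"),
         validity = function(object) {
           if (length(object@gpa) != 1 || object@gpa < 0 || object@gpa > 4) {
             return("gpa must be a single number between 0 and 4")
           }
           TRUE
         })

# S4 objects are displayed through the show() generic
setMethod("show", "StudentS4", function(object) {
  cat("S4 student:", object@name, "GPA:", object@gpa, "\n")
})

student_s4 <- new("StudentS4", name = "Alex", gpa = 3.5)

print(student_s3)   # method dispatch picks print.student_s3
print(student_s4)   # printing an S4 object goes through show()

# The validity function rejects an out-of-range GPA at construction time
bad <- tryCatch(new("StudentS4", name = "Sam", gpa = 9),
                error = function(e) "invalid object")
print(bad)
```

S3 dispatch relies only on the class attribute and the function naming convention generic.class, while S4 checks slot types and the validity rule when the object is created.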

Module 6 – Linear Algebra in R (Part 2)

- October 05, 2025

# 1. Matrix Addition & Subtraction
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)
# Addition
A_plus_B <- A + B
A_plus_B
# Subtraction
A_minus_B <- A - B
A_minus_B

Explanation: Matrix addition and subtraction work element-by-element on matrices of the same size, combining or contrasting values at the same positions. This is useful for quickly aggregating or comparing structured numeric data.

# 2. Create a Diagonal Matrix
D <- diag(c(4, 1, 2, 3))
D

Explanation: diag() places the supplied numbers along the main diagonal and fills all other entries with zeros. Diagonal matrices are commonly used for identity/scaling operations and as building blocks in linear algebra.

# 3. Construct a Custom 5 × 5 Matrix
M <- diag(3, 5, 5)
M[1, 2:5] <- 1
M[2:5, 1] <- 2
M

Explanation: started with a diagonal of 3’s and then...

Assignment #5

- September 28, 2025

Why solve(A) and det(A) work

A is a square matrix (10×10), so det(A) is defined. It equals 0, which means that A is singular (not invertible). Because det(A) = 0, solve(A) correctly raises a singular-system error (there is no inverse).

Why operations on B fail (non-square matrix)

B is not square (10×100). Inverses and determinants are only defined for square matrices, so both calls fail by definition.

A determinant close to 0 indicates near-singularity and numerically unstable computations. To solve a linear system, it is better to call solve(A, b) (or qr.solve()/SVD) than to form an explicit inverse; it is more reliable and faster.

https://github.com/shanzay28/r-programming-assignments/edit/main/Doing%20Math%20in%20R%20-%20Part%201%20-README.md
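The behaviour described here can be reproduced with a short sketch. The post gives only the dimensions of A and B, so the matrix contents below are an assumption, as are the helper try_call() and the small invertible system M, b used to compare solve(M, b) with an explicit inverse.

```r
# Assumed matrices; only the dimensions (10x10 and 10x100) come from the post
A <- matrix(1:100, nrow = 10)    # square, but its columns are linearly dependent
B <- matrix(1:1000, nrow = 10)   # 10 x 100, not square

# Hypothetical helper: return the error message instead of stopping the script
try_call <- function(expr) tryCatch(expr, error = function(e) conditionMessage(e))

det_A <- det(A)                 # defined, and numerically zero for this A
inv_A <- try_call(solve(A))     # error message: the system is singular
det_B <- try_call(det(B))       # error message: B is not square
inv_B <- try_call(solve(B))     # error message: B is not square

print(det_A)
print(inv_A)
print(det_B)
print(inv_B)

# For an invertible system, solve(M, b) avoids forming the inverse explicitly
M <- matrix(c(4, 1, 2, 3), nrow = 2)
b <- c(1, 2)
x_direct  <- solve(M, b)
x_inverse <- as.vector(solve(M) %*% b)
print(all.equal(x_direct, x_inverse))
```

Both routes give the same answer on this tiny well-conditioned system; the advantage of solve(M, b) shows up in speed and numerical stability on larger or nearly singular problems.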

Assignment 4 - Programming Structure in R

- September 21, 2025

The side-by-side boxplots show that patients who received a Bad rating on the first assessment and a High rating on the second assessment usually had higher blood pressure. The High final decision category is also associated with high blood pressure. In other words, in this small sample the doctors' decisions appear to track BP: patients labeled as more worrying tend to have higher BP readings. The histograms show two things: Visit Frequency is clustered between 0.2 and 0.6, while Blood Pressure is considerably more spread out and has distinct outliers, such as a very low value at 30 and a very high value over 200. In a genuine clinical dataset, values this extreme would prompt data-quality checks (measurement error, unit mix-ups, or true but uncommon occurrences) before looking for patterns. The patterns in this dataset are only illustrative, not generalizable, since it is small and synthetic (only 10 rows, reduced to 9 after cleaning). I used n...
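The kind of data-quality check mentioned above might look like the following minimal sketch. The readings and the plausibility bounds (60 and 200) are hypothetical, chosen only to illustrate flagging extreme values; they are not the assignment's data or a clinical standard.

```r
# Hypothetical blood-pressure readings, not the assignment's data
bp <- c(120, 135, 30, 98, 210, 142)

# Assumed plausibility bounds, for illustration only
lower <- 60
upper <- 200

# Flag readings outside the assumed range for manual review
suspect <- bp < lower | bp > upper

print(bp[suspect])     # readings to double-check before analysis
print(which(suspect))  # their positions in the vector
```

Flagging rather than deleting keeps the decision about each extreme value (error, unit mix-up, or genuine reading) with the analyst.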

Module 3 -Introduction to Data Frame

- September 13, 2025

From this chapter and exercise, I took away a number of operations for exploring and understanding a data frame in R. Functions such as str(), head(), and summary() let me see the structure and column names of my dataset, as well as summary statistics for each variable. I also practiced calculating statistics such as the mean, median, and range, which helped me better understand the poll data. The results indicate that Donald leads with the most support in both polls, but his strength varies between ABC and CBS. Ted also does well, while Carly and Hillary remain far behind. The largest gap between the two polls belongs to Donald, who scored much higher on CBS than on ABC. Since the dataset is entirely made up, none of these results can be taken at face value. There is no sample size or margin of error behind the numbers and no demographic breakdown. The data is only useful if you want to practice R skills that involve creati...
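The exploration steps named in this post can be sketched on a small data frame. The candidate labels and poll numbers below are hypothetical placeholders, not the values used in the exercise; only the column names ABC and CBS follow the post.

```r
# Hypothetical poll data frame; labels and numbers are illustrative only
polls <- data.frame(
  Name = c("Candidate A", "Candidate B", "Candidate C"),
  ABC  = c(4, 62, 51),
  CBS  = c(12, 75, 43),
  stringsAsFactors = FALSE
)

str(polls)             # column types and dimensions
print(head(polls))     # first rows
print(summary(polls))  # per-column summary statistics

abc_mean   <- mean(polls$ABC)
abc_median <- median(polls$ABC)
abc_range  <- range(polls$ABC)   # minimum and maximum

# Per-candidate gap between the two polls, and who has the largest one
polls$Diff  <- polls$CBS - polls$ABC
largest_gap <- polls$Name[which.max(abs(polls$Diff))]
print(largest_gap)
```

Adding a Diff column makes the comparison between the two polls explicit instead of reading it off the raw numbers.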
