Advance Statistics Blog

Showing posts from September, 2025

Analyzing Data: Hypothesis Testing, Confidence Intervals, and Correlations

September 25, 2025

> mu <- 70
> xbar <- 69.1
> sigma <- 3.5
> n <- 49
>
> z <- (xbar - mu) / (sigma/sqrt(n))
> p <- pnorm(z) # one-tailed
> z
[1] -1.8
> p
[1] 0.03593032
>
> xbar <- 85
> sigma <- 8
> n <- 64
> z_alpha <- 1.96
>
> se <- sigma/sqrt(n)
> lower <- xbar - z_alpha*se
> upper <- xbar + z_alpha*se
> c(lower, upper)
[1] 83.04 86.96
>
> # Girls data
> girls_goals <- c(4, 5, 6)
> girls_grades <- c(49, 50, 69)
> girls_popular <- c(24, 36, 38)
> girls_time <- c(19, 22, 28)
>
> # Boys data
> boys_goals <- c(4, 5, 6)
> boys_grades <- c(46.1, 54.2, 67.7)
> boys_popular <- c(26.9, 31.6, 39.5)
> boys_time <- c(18.9, 22.2, 27.8)
>
> # Create dataframe
> df <- data.frame(girls_goals, girls_grades, girls_popular, girls_time,
+                  boys_goals, boys_grades, boys_popular, boys_time)
>
> # Corre...

From Forecasts to Healthcare: Understanding Probability Step by Step

September 17, 2025

So, the result of about 0.107 basically means there’s roughly a 10.7% chance that, using the traditional method, a surgeon would operate on 10 patients in a row without any complications. That’s a pretty low probability, which makes the new procedure look promising since it already pulled off 10 straight successes.
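As a rough check on that number, here is a minimal R sketch. The excerpt doesn't state the traditional method's success rate, so the 0.8 per-operation value below is an assumption (it happens to be consistent with 0.107 over 10 operations), and so is the idea that operations are independent. The variable names are just for illustration.

```r
# Assumption: each operation succeeds independently with probability 0.8
# (not stated in the excerpt; chosen because 0.8^10 is about 0.107).
p_success <- 0.8
n_ops <- 10

# Probability of 10 complication-free operations in a row
p_all <- p_success^n_ops
round(p_all, 3)

# Same quantity as a binomial probability: exactly 10 successes in 10 trials
p_binom <- dbinom(n_ops, size = n_ops, prob = p_success)
round(p_binom, 3)
```

Both lines should agree, since "10 successes out of 10" is just the success probability multiplied by itself 10 times.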

Descriptive Statistics: Comparing Two Data Sets in R

September 13, 2025

I worked with two small data sets, each with 7 numbers, and looked at things like mean, median, mode, range, and standard deviation. Both sets had the same range (8), IQR (3), variance (8.33), and standard deviation (2.89), which means they’re equally spread out. But their averages were really different. Set #1 had a lower mean (4) and median (3), and most of its values were on the low end, with one big outlier (10) that pushed the average up. Set #2 had a higher mean (14) and median (13), and its numbers were more balanced around the middle. This activity helped me see how two data sets can have the same variability but totally different patterns based on where the values are centered.
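To see how two sets can share the same spread but sit in different places, here is a small R sketch. The excerpt doesn't list the actual numbers, so set1 and set2 below are hypothetical values picked only so that the mean, median, range, variance, and standard deviation line up with the figures mentioned above. The helper summarise_set is also made up for this example.

```r
# Hypothetical data (NOT the post's actual values): chosen to reproduce the
# reported mean, median, range, variance, and standard deviation.
set1 <- c(2, 2, 2, 3, 4, 5, 10)
set2 <- set1 + 10  # same values shifted up by a constant

# Illustrative helper: center and spread summaries for one vector
summarise_set <- function(x) {
  c(mean     = mean(x),
    median   = median(x),
    range    = diff(range(x)),  # max minus min
    variance = var(x),          # sample variance (n - 1 denominator)
    sd       = sd(x))
}

round(rbind(set1 = summarise_set(set1),
            set2 = summarise_set(set2)), 2)
```

Adding a constant to every value moves the mean and median by that constant but leaves the range, variance, and standard deviation alone, which is one simple way to end up with the pattern described here.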

The Art of Programming Assignment

September 07, 2025

For this assignment, I created a vector in R called assignment2 using the numbers 6, 18, 14, 22, 27, 17, 22, 20, and 22. A vector in R is simply a collection of values stored together under one name, making it easy to perform calculations on the entire set. Next, I used the custom function myMean, which works by adding all the numbers in the vector with sum(assignment) and then dividing that total by the number of elements using length(assignment). Running the function gave me a result of 18.67, which represents the average of the dataset. This shows how we can create our own function in R to calculate something as common as the mean, even though R already has a built-in mean() function.
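Here is a minimal sketch of what that could look like in R. The excerpt doesn't show the function definition itself, so the parameter name x is an assumption, and the body is just one straightforward way to write a sum-divided-by-length mean.

```r
# Vector from the assignment
assignment2 <- c(6, 18, 14, 22, 27, 17, 22, 20, 22)

# Sketch of a custom mean; the parameter name `x` is hypothetical
myMean <- function(x) {
  sum(x) / length(x)  # total of the values divided by how many there are
}

result <- myMean(assignment2)
round(result, 2)

# Cross-check against R's built-in mean()
all.equal(result, mean(assignment2))
```

Using the function's own parameter inside the body (rather than a variable from the workspace) keeps it reusable for any numeric vector.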