Analyzing Data: Hypothesis Testing, Confidence Intervals, and Correlations
> mu <- 70
> xbar <- 69.1
> sigma <- 3.5
> n <- 49
>
> z <- (xbar - mu) / (sigma/sqrt(n))
> p <- pnorm(z) # one-tailed
> z
[1] -1.8
> p
[1] 0.03593032
>
> xbar <- 85
> sigma <- 8
> n <- 64
> z_alpha <- 1.96
>
> se <- sigma/sqrt(n)
> lower <- xbar - z_alpha*se
> upper <- xbar + z_alpha*se
> c(lower, upper)
[1] 83.04 86.96
>
> # Girls data
> girls_goals <- c(4, 5, 6)
> girls_grades <- c(49, 50, 69)
> girls_popular <- c(24, 36, 38)
> girls_time <- c(19, 22, 28)
>
> # Boys data
> boys_goals <- c(4, 5, 6)
> boys_grades <- c(46.1, 54.2, 67.7)
> boys_popular <- c(26.9, 31.6, 39.5)
> boys_time <- c(18.9, 22.2, 27.8)
>
> # Create dataframe
> df <- data.frame(girls_goals, girls_grades, girls_popular, girls_time,
+                  boys_goals, boys_grades, boys_popular, boys_time)
>
> # Correlations
> cor(df)
              girls_goals girls_grades girls_popular girls_time boys_goals boys_grades boys_popular boys_time
girls_goals     1.0000000    0.8873565     0.9244735  0.9819805  1.0000000   0.9897433    0.9894203 0.9890517
girls_grades    0.8873565    1.0000000     0.6445509  0.9585035  0.8873565   0.9441243    0.9448614 0.9456833
girls_popular   0.9244735    0.6445509     1.0000000  0.8357661  0.9244735   0.8605276    0.8593826 0.8580918
girls_time      0.9819805    0.9585035     0.8357661  1.0000000  0.9819805   0.9989061    0.9990085 0.9991175
boys_goals      1.0000000    0.8873565     0.9244735  0.9819805  1.0000000   0.9897433    0.9894203 0.9890517
boys_grades     0.9897433    0.9441243     0.8605276  0.9989061  0.9897433   1.0000000    0.9999975 0.9999887
boys_popular    0.9894203    0.9448614     0.8593826  0.9990085  0.9894203   0.9999975    1.0000000 0.9999968
boys_time       0.9890517    0.9456833     0.8580918  0.9991175  0.9890517   0.9999887    0.9999968 1.0000000
> cor(df, method="pearson")
              girls_goals girls_grades girls_popular girls_time boys_goals boys_grades boys_popular boys_time
girls_goals     1.0000000    0.8873565     0.9244735  0.9819805  1.0000000   0.9897433    0.9894203 0.9890517
girls_grades    0.8873565    1.0000000     0.6445509  0.9585035  0.8873565   0.9441243    0.9448614 0.9456833
girls_popular   0.9244735    0.6445509     1.0000000  0.8357661  0.9244735   0.8605276    0.8593826 0.8580918
girls_time      0.9819805    0.9585035     0.8357661  1.0000000  0.9819805   0.9989061    0.9990085 0.9991175
boys_goals      1.0000000    0.8873565     0.9244735  0.9819805  1.0000000   0.9897433    0.9894203 0.9890517
boys_grades     0.9897433    0.9441243     0.8605276  0.9989061  0.9897433   1.0000000    0.9999975 0.9999887
boys_popular    0.9894203    0.9448614     0.8593826  0.9990085  0.9894203   0.9999975    1.0000000 0.9999968
boys_time       0.9890517    0.9456833     0.8580918  0.9991175  0.9890517   0.9999887    0.9999968 1.0000000
> cor(df, method="spearman")
              girls_goals girls_grades girls_popular girls_time boys_goals boys_grades boys_popular boys_time
girls_goals             1            1             1          1          1           1            1         1
girls_grades            1            1             1          1          1           1            1         1
girls_popular           1            1             1          1          1           1            1         1
girls_time              1            1             1          1          1           1            1         1
boys_goals              1            1             1          1          1           1            1         1
boys_grades             1            1             1          1          1           1            1         1
boys_popular            1            1             1          1          1           1            1         1
boys_time               1            1             1          1          1           1            1         1
>
> # Scatterplots
> pairs(df, main="Scatterplot Matrix")
>
> # Correlogram
> install.packages("corrgram")
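
The z-test and confidence-interval steps in the transcript can be wrapped into two small helpers so the same calculation is easy to rerun with other numbers. This is only a minimal sketch: the function names `z_test_lower` and `z_interval` are my own, the significance level `alpha = 0.05` is an assumption that the transcript does not state, and `qnorm()` stands in for the hard-coded 1.96.

```r
# One-sample z-test with a known population standard deviation (lower tail).
# alpha = 0.05 is an assumed significance level, not taken from the transcript.
z_test_lower <- function(xbar, mu, sigma, n, alpha = 0.05) {
  z <- (xbar - mu) / (sigma / sqrt(n))
  p <- pnorm(z)  # P(Z <= z): lower-tailed p-value
  list(z = z, p = p, reject = p < alpha)
}

# Two-sided z interval; qnorm() supplies the critical value instead of 1.96.
z_interval <- function(xbar, sigma, n, level = 0.95) {
  z_crit <- qnorm(1 - (1 - level) / 2)
  se <- sigma / sqrt(n)
  c(lower = xbar - z_crit * se, upper = xbar + z_crit * se)
}

# Same inputs as the transcript
test <- z_test_lower(xbar = 69.1, mu = 70, sigma = 3.5, n = 49)
ci <- z_interval(xbar = 85, sigma = 8, n = 64)
print(test)
print(round(ci, 2))
```

Because `qnorm(0.975)` is roughly 1.959964 rather than exactly 1.96, the interval bounds differ from the transcript only beyond the second decimal place.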
This assignment uses hypothesis testing and p-values to check whether a new cookie machine meets the manufacturer’s specifications, showing that in some scenarios the machine may fall short: in the test above, a sample mean of 69.1 against a hypothesized mean of 70 gives z = -1.8 and a one-tailed p-value of about 0.036. I also calculated a 95% confidence interval for the population mean of a second dataset (83.04 to 86.96), demonstrating how sample data can be used to estimate population parameters. Finally, correlation analysis compared girls’ and boys’ goals, grades, popularity, and time spent on assignments. With only three observations per variable, the Pearson correlations ranged from about 0.64 to 1.00 and every Spearman correlation was 1, indicating strong positive relationships across all variables, which I illustrated with scatterplots and a correlogram.
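
To compare the two groups directly, the cross-group correlations can be pulled out on their own instead of being read from the full 8x8 matrix. The sketch below is one possible way to do that in base R: the data frame names `girls` and `boys` and the shortened column names are my own choices, while the values are the ones entered in the transcript.

```r
# Same data as in the transcript, rebuilt as one data frame per group.
girls <- data.frame(goals = c(4, 5, 6), grades = c(49, 50, 69),
                    popular = c(24, 36, 38), time = c(19, 22, 28))
boys <- data.frame(goals = c(4, 5, 6), grades = c(46.1, 54.2, 67.7),
                   popular = c(26.9, 31.6, 39.5), time = c(18.9, 22.2, 27.8))

# cor(x, y) correlates each column of x with each column of y,
# so rows are the girls' variables and columns are the boys' variables.
cross <- cor(girls, boys)
print(round(cross, 3))

# Matching variables (goals vs goals, grades vs grades, ...) sit on the diagonal.
same_variable <- diag(cross)
names(same_variable) <- names(girls)
print(round(same_variable, 3))
```

The diagonal of `cross` holds the girls-versus-boys correlation for each matching variable, which is the comparison the summary describes.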


