CRAN
Heart Dataset

We carried out an in-depth analysis of the CRAN heart dataset using three methods: a comparison of means (ANOVA), box-and-whisker plots, and linear regression.
ANOVA
#Hypothesis
#H0 : the mean oldpeak value is equal across all three Thal (thalassemia) groups
#H1 : at least one mean oldpeak value across the three Thal (thalassemia) groups is different
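#A minimal sketch of this comparison of means in R. The real heart data is not loaded here, so a
#synthetic stand-in data frame named heart (a hypothetical name) is simulated; the Thal level labels
#and group means are made up for illustration and are not results from the dataset.

```r
# Synthetic stand-in for the heart data (NOT the real dataset).
# Column names Oldpeak and Thal follow the variables named in the text.
set.seed(42)
n <- 300
heart <- data.frame(
  Thal = factor(sample(c("fixed", "normal", "reversable"), n, replace = TRUE))
)

# Assumed group means, chosen only so the example has a difference to detect
group_means <- c(fixed = 1.5, normal = 0.6, reversable = 1.4)
heart$Oldpeak <- unname(pmax(0, rnorm(n, mean = group_means[as.character(heart$Thal)], sd = 1)))

# Box-and-whisker plot of Oldpeak for each Thal group
boxplot(Oldpeak ~ Thal, data = heart, main = "Oldpeak by Thal", ylab = "Oldpeak")

# One-way ANOVA: H0 says all three group means are equal
fit <- aov(Oldpeak ~ Thal, data = heart)
anova_table <- summary(fit)[[1]]
print(anova_table)

# p-value of the Thal term; H0 is rejected at the 5% level when it is below 0.05
p_value <- anova_table[["Pr(>F)"]][1]
reject_h0 <- p_value < 0.05
print(reject_h0)
```

#With the real data, replace the simulated data frame with the loaded dataset and keep the aov() call unchanged.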


#Result
#As the calculated p-value is very small and less than 0.05,
#we reject the null hypothesis and accept the alternative hypothesis, i.e. at least one mean oldpeak value across the three Thal (thalassemia) groups is different
LINEAR REGRESSION
#We are working at the 5% significance level (95% confidence)
#The response variable is Age
#The explanatory variables are MaxHR, RestBP, Oldpeak and Chol
#We chose Age as the response variable because maximum heart rate, resting blood pressure, etc. can be estimated from a person's age
#We created and ran our first (unrestricted) model to find out which explanatory variables are significant
model1 <- lm(Age ~ MaxHR + Oldpeak + Chol + RestBP)

# From the summary of model1 we find that two explanatory variables, Chol and Oldpeak, are not significant
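#A sketch of how the significant terms can be read off the model summary. The data frame heart and
#the coefficients used to simulate Age are assumptions made for this example only; the sketch also
#passes data = heart explicitly instead of relying on attached columns.

```r
# Synthetic stand-in for the heart data (NOT the real dataset)
set.seed(1)
n <- 300
heart <- data.frame(
  MaxHR   = rnorm(n, 150, 23),
  RestBP  = rnorm(n, 132, 18),
  Oldpeak = pmax(0, rnorm(n, 1, 1.2)),
  Chol    = rnorm(n, 247, 52)
)
# Assumption of this sketch: simulated Age depends on MaxHR and RestBP only
heart$Age <- 55 - 0.2 * (heart$MaxHR - 150) + 0.2 * (heart$RestBP - 132) + rnorm(n, 0, 8)

# Unrestricted model with all four explanatory variables
model1 <- lm(Age ~ MaxHR + Oldpeak + Chol + RestBP, data = heart)

# Coefficient table: estimate, standard error, t value and p-value per term
coef_table <- coef(summary(model1))
print(coef_table)

# Keep the slope terms whose t-test p-value is below 0.05
p_values <- coef_table[, "Pr(>|t|)"]
slopes <- setdiff(names(p_values), "(Intercept)")
significant <- slopes[p_values[slopes] < 0.05]
print(significant)
```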


#After plotting a regression line of Age against each of the two selected explanatory variables separately, we find that
#there is a negative correlation between Age and MaxHR
#there is a positive correlation between Age and RestBP
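#A sketch of the two regression line plots and the matching correlation coefficients. The data are
#simulated (data frame name heart and all numeric settings are assumptions), so the signs printed
#here come from the simulation, not from the heart dataset.

```r
# Synthetic stand-in for the heart data (NOT the real dataset)
set.seed(2)
n <- 300
heart <- data.frame(
  MaxHR  = rnorm(n, 150, 23),
  RestBP = rnorm(n, 132, 18)
)
heart$Age <- 55 - 0.2 * (heart$MaxHR - 150) + 0.2 * (heart$RestBP - 132) + rnorm(n, 0, 8)

# Scatter plots with a fitted simple regression line, one per explanatory variable
par(mfrow = c(1, 2))
plot(Age ~ MaxHR, data = heart, main = "Age vs MaxHR")
abline(lm(Age ~ MaxHR, data = heart), col = "red")
plot(Age ~ RestBP, data = heart, main = "Age vs RestBP")
abline(lm(Age ~ RestBP, data = heart), col = "blue")
par(mfrow = c(1, 1))

# Pearson correlations: the sign matches the slope direction of each line
r_maxhr  <- cor(heart$Age, heart$MaxHR)
r_restbp <- cor(heart$Age, heart$RestBP)
print(c(MaxHR = r_maxhr, RestBP = r_restbp))
```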
# H0: B1 (beta 1) = B2 (beta 2) = 0
# H1: at least one of the coefficients is not equal to zero
#We created and ran our second (restricted) model with the explanatory variables that were significant in model1, i.e. MaxHR and RestBP
model2 <- lm(Age ~ MaxHR + RestBP)

#From the summary table of model2 we observe that B1 and B2 are both different from zero and significant at the 95% confidence level
#Therefore we reject the null hypothesis and accept the alternative hypothesis
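#One way to check this, sketched on simulated data: a slope is significant at the 5% level when its
#95% confidence interval excludes zero. The data frame heart and its values are assumptions of the example.

```r
# Synthetic stand-in for the heart data (NOT the real dataset)
set.seed(3)
n <- 300
heart <- data.frame(
  MaxHR  = rnorm(n, 150, 23),
  RestBP = rnorm(n, 132, 18)
)
heart$Age <- 55 - 0.2 * (heart$MaxHR - 150) + 0.2 * (heart$RestBP - 132) + rnorm(n, 0, 8)

# Restricted model with the two retained explanatory variables
model2 <- lm(Age ~ MaxHR + RestBP, data = heart)
print(coef(summary(model2)))

# 95% confidence intervals for the intercept and both slopes
ci <- confint(model2, level = 0.95)
print(ci)

# TRUE where the interval lies entirely above or entirely below zero
excludes_zero <- ci[, 1] > 0 | ci[, 2] < 0
print(excludes_zero[c("MaxHR", "RestBP")])
```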
R SQUARED
#H0: R square is equal to zero
#H1: R square is not equal to zero

#Since R square is greater than zero, we reject the null hypothesis and accept the alternative hypothesis
#As R square is greater than zero and, although its value is low, several independent variables are significant,
#we can say that some relationship exists between the response and the explanatory variables
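#A sketch of where R square comes from, using simulated data (the data frame heart and its values are
#assumptions). It reads R square from the model summary and recomputes it as 1 - RSS/TSS to show what the number measures.

```r
# Synthetic stand-in for the heart data (NOT the real dataset)
set.seed(4)
n <- 300
heart <- data.frame(
  MaxHR  = rnorm(n, 150, 23),
  RestBP = rnorm(n, 132, 18)
)
heart$Age <- 55 - 0.2 * (heart$MaxHR - 150) + 0.2 * (heart$RestBP - 132) + rnorm(n, 0, 8)

model2 <- lm(Age ~ MaxHR + RestBP, data = heart)
fit_summary <- summary(model2)

# R square and adjusted R square as reported by summary()
r_squared     <- fit_summary$r.squared
adj_r_squared <- fit_summary$adj.r.squared

# Manual check: share of the variation in Age explained by the model
rss <- sum(residuals(model2)^2)                 # residual sum of squares
tss <- sum((heart$Age - mean(heart$Age))^2)     # total sum of squares
r_squared_manual <- 1 - rss / tss

print(c(r_squared = r_squared, adjusted = adj_r_squared, manual = r_squared_manual))
```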
F VALUE
# H0: B1 (beta 1) = B2 (beta 2) = 0
# H1: at least one of the coefficients is not equal to zero
#Finding the F value using the car library (Companion to Applied Regression)

#The 5% critical F-statistic with (2, 298) degrees of freedom is approximately 3 (from the F table at the 95% confidence level)
#As the calculated F-value (6.1733) is greater than the critical F-value from the table, we reject the null hypothesis and accept the alternative hypothesis
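#A sketch of the same decision rule in base R, with the critical value taken from qf() instead of a
#printed F table. The data are simulated with 301 rows (an assumption, chosen so the residual degrees
#of freedom are 298), so the F statistic printed here is not the 6.1733 reported above. The car call
#runs only if that package is installed.

```r
# Synthetic stand-in for the heart data (NOT the real dataset)
set.seed(5)
n <- 301   # 301 rows minus 3 estimated coefficients leaves 298 residual df
heart <- data.frame(
  MaxHR  = rnorm(n, 150, 23),
  RestBP = rnorm(n, 132, 18)
)
heart$Age <- 55 - 0.2 * (heart$MaxHR - 150) + 0.2 * (heart$RestBP - 132) + rnorm(n, 0, 8)

model2 <- lm(Age ~ MaxHR + RestBP, data = heart)

# 5% critical value of the F distribution with (2, 298) degrees of freedom
critical_f <- qf(0.95, df1 = 2, df2 = 298)

# Overall F statistic of the model: named vector with value, numdf, dendf
f_stat <- summary(model2)$fstatistic
print(f_stat)

# Same test as a comparison against the intercept-only model
null_model <- lm(Age ~ 1, data = heart)
print(anova(null_model, model2))

# Optional: joint test of both slopes with car, if the package is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::linearHypothesis(model2, c("MaxHR = 0", "RestBP = 0")))
}

# Reject H0 when the calculated F exceeds the critical F
reject_h0 <- f_stat[["value"]] > critical_f
print(reject_h0)
```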
PLOTS
#Here we see that linearity seems to hold only in some parts of the curve
#A linear regression model is therefore not the right fit: the assumed linear relationship between the response and the explanatory variables is violated
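#One common way to inspect linearity is a residuals-versus-fitted plot; whether this is the plot
#referred to above is an assumption. The sketch uses simulated data (data frame heart, hypothetical),
#so the shape it shows says nothing about the real dataset.

```r
# Synthetic stand-in for the heart data (NOT the real dataset)
set.seed(6)
n <- 300
heart <- data.frame(
  MaxHR  = rnorm(n, 150, 23),
  RestBP = rnorm(n, 132, 18)
)
heart$Age <- 55 - 0.2 * (heart$MaxHR - 150) + 0.2 * (heart$RestBP - 132) + rnorm(n, 0, 8)

model2 <- lm(Age ~ MaxHR + RestBP, data = heart)

# Residuals vs fitted values: a flat smoothing line supports linearity,
# a clearly curved line suggests the linear form is misspecified
plot(model2, which = 1)

# The same quantities extracted by hand for further checks
fitted_vals <- fitted(model2)
resid_vals  <- residuals(model2)
print(summary(resid_vals))
```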



