What is the purpose of the coefficient of determination (R-squared) in linear regression?
To measure the proportion of variation in the dependent variable explained by the independent variable(s).
To identify the presence of outliers in the data.
To determine the statistical significance of the model.
To assess the linearity assumption of the model.
Why is normality of errors an important assumption in linear regression?
It is necessary for the calculation of the regression coefficients
It guarantees the homoscedasticity of the errors
It ensures the linearity of the relationship between variables
It validates the use of hypothesis testing for the model's coefficients
If the coefficient of determination (R-squared) for a linear regression model is 0.8, what does this indicate?
20% of the variation in the dependent variable is explained by the independent variable.
There is a weak relationship between the independent and dependent variables.
The model is a poor fit for the data.
80% of the variation in the dependent variable is explained by the independent variable.
Which of the following is the general equation for a simple linear regression model?
y = b0 * x^b1
y = b0 + b1*x + e
y = b0 + b1x1 + b2x2 + ... + bn*xn
y = e^(b0 + b1*x)
Which of these methods can be used to address heteroscedasticity?
Removing outliers
All of the above
Adding more independent variables
Transforming the dependent variable
Why is a residual plot useful in evaluating a linear regression model?
To determine the slope of the regression line.
To check for non-linearity and other violations of the linear regression assumptions.
To calculate the R-squared value.
To predict future values of the dependent variable.
Who is credited with developing the foundational principles of linear regression?
Sir Francis Galton
Marie Curie
Albert Einstein
Isaac Newton
What distinguishes simple linear regression from multiple linear regression?
Simple linear regression has one independent variable, while multiple linear regression has two or more.
Simple linear regression analyzes categorical data, while multiple linear regression analyzes numerical data.
Simple linear regression uses a curved line, while multiple linear regression uses a straight line.
There is no difference; the terms are interchangeable.
Which assumption of linear regression ensures that the relationship between the independent and dependent variables is linear?
Normality of errors
Linearity
Independence
Homoscedasticity
What does a residual represent in linear regression?
The predicted value of the dependent variable.
The intercept of the regression line.
The difference between the actual and predicted values of the dependent variable.
The slope of the regression line.