Which of these is a common visual tool for diagnosing heteroscedasticity?
Histogram
Normal probability plot
Scatter plot of residuals vs. predicted values
Box plot
Feature selection in linear regression primarily aims to:
Ensure that all features have a statistically significant p-value
Improve model performance and generalization by focusing on the most relevant predictors
Make the model more complex and harder to interpret
Increase the number of features used for prediction
What is the purpose of splitting the dataset into training and testing sets in Linear Regression?
To evaluate the model's performance on unseen data.
To reduce the dimensionality of the data.
To visualize the relationship between variables.
To handle missing values in the dataset.
What does a correlation coefficient of 0 indicate?
A perfect linear relationship
A strong positive linear relationship
No linear relationship
A strong negative linear relationship
Can the R-squared value be negative?
Yes, if there is a perfect negative correlation between the variables.
No, it is always positive.
No, it always ranges between 0 and 1.
Yes, if the model fits the data worse than a horizontal line.
In the context of linear regression, what is an error term?
The difference between the slope and the intercept of the regression line.
A mistake made in collecting or entering data.
The variation in the independent variable.
The difference between the observed value of the dependent variable and the predicted value.
Which of the following is the general equation for a simple linear regression model?
y = e^(b0 + b1*x)
y = b0 + b1*x + e
y = b0 * x^b1
y = b0 + b1x1 + b2x2 + ... + bn*xn
What does a pattern in the residual plot suggest?
The linear model is not a good fit for the data, and a non-linear model may be more appropriate.
The residuals are normally distributed.
The linear model is a good fit for the data.
There is no relationship between the independent and dependent variables.
Why is a residual plot useful in evaluating a linear regression model?
To calculate the R-squared value.
To check for non-linearity and other violations of the linear regression assumptions.
To determine the slope of the regression line.
To predict future values of the dependent variable.
What does the assumption of independence in linear regression refer to?
Independence between the errors and the dependent variable
Independence between the observations
Independence between the independent and dependent variables
Independence between the coefficients of the regression model