What is the ideal shape of a residual plot for a well-fitted linear regression model?
A straight line.
An inverted U-shape.
Random scatter with no discernible pattern.
A U-shape.
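To make the residual shapes in the options above concrete, here is a minimal sketch that fits a straight line to deliberately curved data and inspects the residuals. The helper name `fit_line` and the toy data are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Toy data that follows y = x^2, so a straight line is the wrong model.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Residuals are positive at both ends and negative in the middle:
# a U-shape, which signals that the linear fit is missing curvature.
print(residuals)
```

With a well-specified model the same list would show no systematic ordering of signs along x.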
What does the linearity assumption in linear regression imply?
The data points are evenly distributed around the regression line.
The dependent variable must have a normal distribution.
The independent variables are unrelated to each other.
The relationship between the dependent and independent variables can be best represented by a straight line.
How does the Mean Squared Error (MSE) penalize larger errors compared to smaller errors?
It uses a logarithmic scale to compress larger errors.
It doesn't; all errors are penalized equally.
It squares the errors, giving more weight to larger deviations.
It takes the absolute value of the errors, ignoring the sign.
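As a small illustration of how squaring changes the penalty, the sketch below compares MSE with mean absolute error (MAE) on two hypothetical error lists that have the same total absolute error. The function names and numbers are assumptions chosen for the example.

```python
def mse(errors):
    """Mean of squared errors."""
    return sum(e * e for e in errors) / len(errors)

def mae(errors):
    """Mean of absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)

spread_out = [1.0, 1.0, 1.0, 1.0]   # four small misses
one_big = [0.0, 0.0, 0.0, 4.0]      # same total absolute error, one large miss

mae_spread, mae_big = mae(spread_out), mae(one_big)
mse_spread, mse_big = mse(spread_out), mse(one_big)

# MAE treats both cases the same; MSE is four times larger for the single big miss.
print(mae_spread, mae_big)   # 1.0 1.0
print(mse_spread, mse_big)   # 1.0 4.0
```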
Backward elimination in linear regression involves removing features based on what criterion?
The feature with the lowest p-value
The feature that results in the smallest decrease in model performance
The feature that contributes the least to multicollinearity
The feature with the highest correlation with the target variable
Which of the following is NOT an assumption of linear regression?
Linearity
Normality of residuals
Multicollinearity
Homoscedasticity
Which of these is a common visual tool for diagnosing heteroscedasticity?
Scatter plot of residuals vs. predicted values
Histogram
Normal probability plot
Box plot
Which of the following situations might make feature selection particularly important?
Having a very large dataset with only a few features
When computational resources are unlimited
Having a small dataset with a very large number of features
When all features are highly correlated with the target variable
What distinguishes simple linear regression from multiple linear regression?
Simple linear regression uses a curved line, while multiple linear regression uses a straight line.
Simple linear regression has one independent variable, while multiple linear regression has two or more.
There is no difference; the terms are interchangeable.
Simple linear regression analyzes categorical data, while multiple linear regression analyzes numerical data.
Which of the following is the general equation for a simple linear regression model?
y = b0 + b1*x + e
y = b0 + b1*x1 + b2*x2 + ... + bn*xn
y = e^(b0 + b1*x)
y = b0 * x^b1
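For reference, the single-predictor form with an intercept, a slope, and an error term can be estimated by ordinary least squares in a few lines. This is a minimal sketch under stated assumptions: the data are made up and noise-free, and the variable names are illustrative.

```python
# Hypothetical noise-free data generated from y = 2 + 3*x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 3.0 * x for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Closed-form least-squares estimates for one predictor:
# slope = covariance(x, y) / variance(x), intercept = y_bar - slope * x_bar
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

print(intercept, slope)   # recovers b0 = 2 and b1 = 3 for this exact data
```

With real data the error term e is nonzero, so the estimates only approximate the underlying coefficients.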
If a Durbin-Watson test statistic is close to 2, what does it suggest about the residuals?
They are homoscedastic
They are normally distributed
They exhibit a linear pattern
They are independent (no first-order autocorrelation)
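The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The sketch below computes it by hand on three made-up residual series; the function name `durbin_watson` and the simulated data are assumptions for illustration, not a replacement for a statistics library.

```python
import random

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den

random.seed(0)
independent = [random.gauss(0.0, 1.0) for _ in range(5000)]  # no serial correlation
trending = [float(t) for t in range(1, 51)]                  # strong positive autocorrelation
alternating = [(-1.0) ** t for t in range(50)]               # strong negative autocorrelation

dw_independent = durbin_watson(independent)
dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)

# Values near 2 suggest no first-order autocorrelation; values toward 0
# suggest positive and values toward 4 suggest negative autocorrelation.
print(dw_independent, dw_trending, dw_alternating)
```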