In forward selection, which criterion is typically used to decide which feature to add at each step?
The feature with the highest p-value
The feature that is least correlated with the other features
The feature that results in the smallest increase in R-squared
The feature that results in the largest improvement in model performance
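For readers who want to see the procedure in action, here is a minimal sketch of greedy forward selection. It assumes in-sample R-squared as the performance measure and a fixed `max_features` stopping rule; both are illustrative choices, and in practice adjusted R-squared, AIC, p-values, or cross-validated error are often used instead. The helper names `r_squared` and `forward_selection` and the synthetic data are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R-squared of an ordinary least squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def forward_selection(X, y, max_features):
    """Start with no features; at each step add the one that scores best."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        # Score every candidate when added to the features chosen so far.
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: y depends strongly on column 2, weakly on column 0,
# and not at all on column 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5.0 * X[:, 2] + 1.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

order = forward_selection(X, y, max_features=2)
print(order)
```

Because in-sample R-squared never decreases when a feature is added, this sketch needs an external stopping rule; a penalized or cross-validated score would let the loop stop on its own.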
How does the Mean Squared Error (MSE) penalize larger errors compared to smaller errors?
It uses a logarithmic scale to compress larger errors.
It doesn't; all errors are penalized equally.
It takes the absolute value of the errors, ignoring the sign.
It squares the errors, giving more weight to larger deviations.
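A small worked example can make the behavior of MSE concrete. The sketch below compares it with mean absolute error on two hypothetical sets of predictions whose total absolute error is the same; the function names and data are illustrative only.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute residuals, shown for comparison."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: both prediction sets have a total absolute error of 4.
y_true = [0.0, 0.0, 0.0, 0.0]
even_errors = [1.0, 1.0, 1.0, 1.0]    # four errors of size 1
one_big_error = [0.0, 0.0, 0.0, 4.0]  # a single error of size 4

mae_even = mean_absolute_error(y_true, even_errors)
mae_big = mean_absolute_error(y_true, one_big_error)
mse_even = mean_squared_error(y_true, even_errors)
mse_big = mean_squared_error(y_true, one_big_error)

print(mae_even, mae_big)  # 1.0 1.0
print(mse_even, mse_big)  # 1.0 4.0
```

The absolute-error average cannot tell the two cases apart, while the squared-error average is four times larger when the same total error is concentrated in one prediction.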
What is the primary goal of feature selection in linear regression?
Increase the complexity of the model
Introduce bias into the model
Maximize the number of features used in the model
Improve the model's interpretability and reduce overfitting
Which of the following is NOT an assumption of linear regression?
Normality of residuals
Multicollinearity
Homoscedasticity
Linearity
Which of these is a common visual tool for diagnosing heteroscedasticity?
Normal probability plot
Box plot
Scatter plot of residuals vs. predicted values
Histogram
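As a numeric companion to visual diagnostics, the sketch below fits a line to hypothetical data whose noise grows with the predictor, then compares the residual spread in the lower and upper halves of the predicted values. The data, the seed, and the half-split comparison are assumptions made for illustration; this is a rough check, not a formal test for heteroscedasticity.

```python
import numpy as np

# Hypothetical data: noise standard deviation grows in proportion to x.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 200)
y = 2.0 + 3.0 * x + rng.normal(size=200) * 0.5 * x

# Fit a straight line and compute residuals against the predictions.
slope, intercept = np.polyfit(x, y, 1)
predicted = slope * x + intercept
residuals = y - predicted

# Compare residual spread for low versus high predicted values.
order = np.argsort(predicted)
half = len(order) // 2
spread_low = residuals[order[:half]].std()
spread_high = residuals[order[half:]].std()

print(spread_low, spread_high)
```

On a plot of these residuals against the predicted values, the same pattern would appear as a funnel that widens to the right; roughly equal spreads would instead suggest constant variance.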
Which visualization is commonly used for an initial assessment of the relationship between two continuous variables in linear regression?
Pie chart
Bar chart
Scatter plot
What is the main difference between forward selection and backward elimination in linear regression?
Forward selection starts with no features and adds one by one, while backward elimination starts with all features and removes one by one.
Forward selection starts with all features and removes one by one, while backward elimination starts with no features and adds one by one.
Forward selection is used for classification, while backward elimination is used for regression.
There is no difference; both techniques achieve the same outcome.
What does the assumption of independence in linear regression refer to?
Independence between the coefficients of the regression model
Independence between the independent and dependent variables
Independence between the errors and the dependent variable
Independence between the observations
Who is credited as a pioneer in developing the method of least squares, a foundational element of linear regression?
Carl Friedrich Gauss
Blaise Pascal
Alan Turing
Ada Lovelace
Why is normality of errors an important assumption in linear regression?
It guarantees the homoscedasticity of the errors
It ensures the linearity of the relationship between variables
It validates the use of hypothesis testing for the model's coefficients
It is necessary for the calculation of the regression coefficients