What happens to the bias and variance of a linear regression model as the regularization parameter (lambda) increases?
Bias increases, Variance increases
Bias decreases, Variance increases
Bias decreases, Variance decreases
Bias increases, Variance decreases
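A minimal sketch of the mechanism behind this question, assuming scikit-learn: as the Ridge penalty alpha (the lambda here) grows, coefficients are shrunk harder toward zero, which raises bias and lowers variance. The dataset and alpha grid are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

for alpha in [0.01, 1.0, 100.0]:
    model = Ridge(alpha=alpha).fit(X, y)
    # Heavier L2 shrinkage -> smaller coefficient norm: a more biased,
    # lower-variance fit.
    print(f"alpha={alpha:>6}: ||coef|| = {np.linalg.norm(model.coef_):.2f}")
```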
You are comparing two linear regression models for predicting house prices. Model A has a lower RMSE than Model B. What does this imply about their predictive performance?
Model A has a higher R-squared value than Model B.
Model A is guaranteed to make better predictions on all new data points.
Model B is definitely overfitting the data.
Model A, on average, has smaller prediction errors than Model B.
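A minimal sketch of what a lower RMSE does and does not imply, assuming NumPy and scikit-learn; the targets and the two models' predictions are made-up numbers:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical test-set targets and predictions from Models A and B.
y_true = np.array([200.0, 250.0, 300.0])
pred_a = np.array([210.0, 240.0, 310.0])
pred_b = np.array([230.0, 220.0, 340.0])

rmse_a = np.sqrt(mean_squared_error(y_true, pred_a))
rmse_b = np.sqrt(mean_squared_error(y_true, pred_b))
# A lower RMSE means smaller prediction errors on average; it does not
# guarantee Model A beats Model B on every individual data point.
print(rmse_a, rmse_b)
```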
In which scenario might you prefer Huber regression over RANSAC for robust regression?
When the proportion of outliers is relatively small
When the outliers are expected to be clustered together
When it's important to completely discard the outliers from the analysis
When dealing with high-dimensional data with a large number of features
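A minimal sketch contrasting the two estimators, assuming scikit-learn; the synthetic data with a small fraction of outliers is illustrative. Huber downweights outliers within a single fit, while RANSAC fits on inlier subsets and effectively discards the rest:

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, RANSACRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X.ravel() + rng.normal(scale=1.0, size=100)
y[:5] += 50.0  # a small proportion of outliers

huber = HuberRegressor().fit(X, y)                  # downweights outliers
ransac = RANSACRegressor(random_state=0).fit(X, y)  # discards non-inliers
print(huber.coef_, ransac.estimator_.coef_)
```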
The performance of the Theil-Sen estimator can be sensitive to which characteristic of the data?
The presence of heteroscedasticity (unequal variances of errors)
The non-normality of the residuals
The presence of categorical variables
The presence of multicollinearity (high correlation between independent variables)
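A minimal sketch, assuming scikit-learn, of why nearly collinear predictors destabilize Theil-Sen: its slope estimates come from small subsamples, and the split of the coefficient between two near-duplicate columns is poorly determined. Data are illustrative:

```python
import numpy as np
from sklearn.linear_model import TheilSenRegressor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 2.0 * x1 + 3.0 * x2 + rng.normal(scale=0.5, size=200)

model = TheilSenRegressor(random_state=0).fit(X, y)
# The individual coefficients are unstable under multicollinearity,
# even when the fitted values remain reasonable.
print(model.coef_)
```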
How does Lasso Regression differ from Ridge Regression in terms of feature selection?
Ridge Regression tends to shrink all coefficients towards zero but rarely sets them exactly to zero.
Neither Lasso nor Ridge Regression performs feature selection; they only shrink coefficients.
Both Lasso and Ridge Regression can shrink coefficients to zero, but Lasso does it more aggressively.
Lasso Regression can shrink coefficients to exactly zero, effectively performing feature selection.
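A minimal sketch of the contrast, assuming scikit-learn; the synthetic problem (20 features, only 5 informative) and the alpha value are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)
# Lasso's L1 penalty drives many coefficients exactly to zero
# (feature selection); Ridge's L2 penalty only shrinks them.
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0.0)))
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0.0)))
```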
What does a high Cook's distance value indicate?
The observation is not an outlier
The observation has both high leverage and high influence
The observation has high leverage but low influence
The observation has low leverage but high influence
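A minimal sketch of computing Cook's distance, assuming statsmodels; the data and the deliberately perturbed first observation are illustrative:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(50, 2)))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=50)
y[0] += 10.0  # make one observation influential

fit = sm.OLS(y, X).fit()
cooks_d = fit.get_influence().cooks_distance[0]
# A high Cook's distance flags observations whose removal would
# substantially change the fitted coefficients.
print(int(np.argmax(cooks_d)), float(cooks_d.max()))
```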
What is a key limitation of relying solely on Adjusted R-squared for model evaluation in linear regression?
It is highly sensitive to outliers.
It doesn't provide information about the magnitude of prediction errors.
It is difficult to interpret.
It can be misleading when comparing models with different numbers of predictors.
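A minimal sketch of the adjusted R-squared formula, with n the sample size and p the number of predictors; the inputs are made-up numbers. Because it is a unitless goodness-of-fit measure, it says nothing about how large the errors are in the outcome's units:

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Identical adjusted R-squared values can hide very different error
# magnitudes (e.g. RMSE), which is the limitation in question.
print(adjusted_r2(0.85, n=100, p=5))
```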
Why might centering or scaling independent variables be insufficient to completely resolve multicollinearity?
It can make the model more complex and harder to interpret.
It only works for linear relationships between variables.
It requires a large sample size to be effective.
It doesn't address the fundamental issue of high correlations between the variables.
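A minimal sketch, assuming scikit-learn, of why standardizing does not help here: Pearson correlation is invariant to centering and scaling each variable, so the predictor-predictor correlation survives untouched. Data are illustrative:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + rng.normal(scale=0.1, size=100)  # strongly correlated pair
X = np.column_stack([x1, x2])

X_scaled = StandardScaler().fit_transform(X)
# Same correlation before and after standardization.
print(np.corrcoef(X.T)[0, 1], np.corrcoef(X_scaled.T)[0, 1])
```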
Which of the following is a common method for addressing multicollinearity in multiple linear regression?
Ignoring the issue, as it has no impact on the model.
Transforming the outcome variable.
Removing one or more of the correlated predictor variables.
Increasing the sample size.
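A minimal sketch of a common diagnose-then-remove workflow using variance inflation factors (VIF), assuming statsmodels and pandas; the data and the choice of which column to drop are illustrative:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=100), "x3": rng.normal(size=100)})
df["x2"] = df["x1"] + rng.normal(scale=0.05, size=100)  # nearly collinear

exog = sm.add_constant(df)
for i, col in enumerate(exog.columns):
    if col != "const":  # VIF of the intercept is not meaningful
        print(col, variance_inflation_factor(exog.values, i))

# Dropping one of the high-VIF predictors (x1 or x2) addresses the issue.
reduced = df.drop(columns=["x2"])
```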
What does multicollinearity refer to in the context of multiple linear regression?
A high correlation between the outcome variable and a predictor variable.
Non-linearity in the relationship between predictors and outcome.
The presence of outliers in the data.
A high correlation between two or more predictor variables.
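A minimal sketch of the distinction this question probes: multicollinearity concerns correlation among the predictors themselves, not between a predictor and the outcome. Data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=100)  # correlated predictors
y = 2.0 * x1 + rng.normal(size=100)              # outcome

print("predictor vs predictor:", np.corrcoef(x1, x2)[0, 1])  # multicollinearity
print("predictor vs outcome:  ", np.corrcoef(x1, y)[0, 1])   # not multicollinearity
```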