What does the adjusted R-squared value tell you in multiple linear regression?
The accuracy of the model's predictions.
The statistical significance of the overall model.
The proportion of variance in the outcome explained by the predictors, adjusted for the number of predictors in the model.
The presence of outliers in the data.
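The adjusted R-squared can be computed directly from its definition. A minimal numpy sketch on hypothetical data (the predictors, sample size, and noise level here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 2 informative predictors plus noise.
n, p = 50, 2
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Ordinary least squares fit with an intercept column.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta

# R-squared: proportion of variance in y explained by the predictors.
ss_res = np.sum(resid ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Adjusted R-squared penalizes for the number of predictors p,
# so adding useless predictors no longer inflates the score.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(r2, adj_r2)  # the adjusted value is never larger than plain R-squared
```

Because of the (n - 1)/(n - p - 1) penalty, the adjusted value can even decrease when an uninformative predictor is added, which plain R-squared never does.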
Huber regression modifies the loss function used in OLS regression. How does this modification help in handling outliers?
It assigns lower weights to data points that deviate significantly from the predicted values.
It completely ignores data points identified as outliers.
It increases the learning rate of the regression model for outlier data points.
It transforms all data points to follow a normal distribution.
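The Huber modification is easy to see in code: the loss is quadratic for small residuals (like OLS) but grows only linearly beyond a threshold delta, so a single large residual contributes far less than under squared loss. A minimal sketch (the function name and delta value are illustrative):

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear beyond +/- delta."""
    r = np.abs(residual)
    return np.where(r <= delta,
                    0.5 * r ** 2,                # OLS-like near zero
                    delta * (r - 0.5 * delta))   # linear growth for outliers

# A residual of 10 is penalized far less than under squared loss:
print(huber_loss(10.0))    # 9.5
print(0.5 * 10.0 ** 2)     # 50.0 under OLS squared loss
```

In the fitted model this translates into effectively down-weighting points that sit far from the regression line, rather than discarding them outright.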
Which robust regression technique is particularly well-suited for handling datasets with a high proportion of outliers?
Huber regression
Theil-Sen estimator
RANSAC (Random Sample Consensus)
Ordinary Least Squares (OLS) regression
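RANSAC's tolerance for a high proportion of outliers comes from its sampling strategy: it repeatedly fits a model to a tiny random subset and keeps the model with the largest consensus set of inliers. A minimal numpy sketch for a line fit (the data, thresholds, and iteration count are illustrative assumptions, not a production implementation):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: y = 2x + 1, with roughly 40% gross outliers.
n = 100
x = rng.uniform(0, 10, n)
y = 2 * x + 1 + rng.normal(scale=0.1, size=n)
out = rng.random(n) < 0.4
y[out] = rng.uniform(-10, 30, out.sum())

def ransac_line(x, y, n_iter=200, thresh=0.5):
    best_inliers = np.zeros_like(x, dtype=bool)
    for _ in range(n_iter):
        # Fit a candidate line through a minimal sample of 2 points.
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        # Count points within the inlier threshold of this candidate.
        inliers = np.abs(y - (slope * x + intercept)) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit by least squares on the best consensus set only.
    A = np.column_stack([x[best_inliers], np.ones(best_inliers.sum())])
    slope, intercept = np.linalg.lstsq(A, y[best_inliers], rcond=None)[0]
    return slope, intercept

slope, intercept = ransac_line(x, y)
print(slope, intercept)  # close to the true line y = 2x + 1
```

Because the final fit uses only the consensus set, the 40% of corrupted points have no influence at all, which is why RANSAC copes with outlier fractions that would badly bias Huber regression or OLS.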
Why is evaluating the model on a separate test set crucial in Polynomial Regression?
To calculate the model's complexity and determine the optimal degree of the polynomial.
To fine-tune the model's hyperparameters and improve its fit on the training data.
To visualize the residuals and check for any non-linear patterns.
To estimate the model's performance on unseen data and assess its generalization ability.
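The need for a held-out test set shows up clearly when comparing polynomial degrees: training error always shrinks as the degree grows, so only test error exposes overfitting. A small numpy sketch on hypothetical quadratic data (the split sizes and degrees are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical quadratic ground truth with noise.
x = rng.uniform(-1, 1, 60)
y = 1 + 2 * x - 3 * x ** 2 + rng.normal(scale=0.2, size=60)

# Hold out a separate test set: training error alone is misleading.
x_tr, y_tr = x[:40], y[:40]
x_te, y_te = x[40:], y[40:]

train_mse, test_mse = {}, {}
for degree in (1, 2, 9):
    coefs = np.polyfit(x_tr, y_tr, degree)
    train_mse[degree] = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
    test_mse[degree] = np.mean((np.polyval(coefs, x_te) - y_te) ** 2)
    print(degree, train_mse[degree], test_mse[degree])
```

Training MSE is guaranteed to be non-increasing in the degree (each higher-degree model nests the lower ones), while the test MSE is lowest near the true degree and rises for the underfit linear model, which is exactly the generalization signal the question points at.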
What does a high Cook's distance value indicate?
The observation has both high leverage and high influence.
The observation is not an outlier.
The observation has low leverage but high influence.
The observation has high leverage but low influence.
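Cook's distance combines exactly the two ingredients the question contrasts: the size of a point's residual and its leverage h_ii from the hat matrix. A minimal numpy sketch that computes it from the standard formula, using hypothetical data with one deliberately extreme point:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data plus one point with high leverage AND a large residual.
x = np.concatenate([rng.uniform(0, 10, 30), [30.0]])
y = 2 * x + 1 + rng.normal(scale=0.5, size=31)
y[-1] += 20          # pull the extreme-x point far off the line

X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix
h = np.diag(H)                            # leverages h_ii
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta                          # residuals
p = X.shape[1]
s2 = e @ e / (len(y) - p)                 # residual variance estimate

# Cook's distance: D_i = (e_i^2 / (p * s^2)) * h_ii / (1 - h_ii)^2
cooks_d = (e ** 2 / (p * s2)) * h / (1 - h) ** 2
print(np.argmax(cooks_d))  # 30: the high-leverage, high-influence point
```

A point far from the mean of x (high leverage) that also sits far from the fitted line (large residual) dominates the D values; a high-leverage point that lies on the line, or a large residual at an ordinary x, produces a much smaller distance.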
What distinguishes a random slope model from a random intercept model in hierarchical linear modeling (HLM)?
Random slope models are used for smaller datasets, while random intercept models are used for larger datasets.
Random slope models allow slopes to vary, while random intercept models don't.
Random slope models handle categorical variables, while random intercept models handle continuous variables.
Random slope models allow intercepts to vary, while random intercept models don't.
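The distinction is easiest to see in simulation: in a random-intercept world the groups share one slope and differ only in level, while in a random-slope world the per-group slopes themselves vary. A numpy sketch with hypothetical group structure (not a mixed-model fit, just per-group OLS to expose the difference):

```python
import numpy as np

rng = np.random.default_rng(3)
n_groups, n_per = 12, 40
x = rng.uniform(0, 5, (n_groups, n_per))

# Random-intercept world: groups share one slope, intercepts differ.
intercepts = rng.normal(1.0, 1.0, n_groups)
y_ri = intercepts[:, None] + 2.0 * x + rng.normal(scale=0.1, size=x.shape)

# Random-slope world: the slopes themselves differ across groups.
slopes = rng.normal(2.0, 0.8, n_groups)
y_rs = 1.0 + slopes[:, None] * x + rng.normal(scale=0.1, size=x.shape)

def per_group_slopes(x, y):
    """Fit a separate OLS line in each group; return the slopes."""
    out = []
    for g in range(len(x)):
        A = np.column_stack([np.ones_like(x[g]), x[g]])
        out.append(np.linalg.lstsq(A, y[g], rcond=None)[0][1])
    return np.array(out)

s_ri = np.std(per_group_slopes(x, y_ri))
s_rs = np.std(per_group_slopes(x, y_rs))
print(s_ri)  # near zero: slopes are shared across groups
print(s_rs)  # large: slopes genuinely vary by group
```

A random-intercept HLM would fit the first dataset well; only a random-slope model (which additionally lets the slope vary by group) captures the second.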
How do Generalized Linear Models (GLMs) extend the capabilities of linear regression?
By allowing only categorical predictor variables.
By assuming a strictly linear relationship between the response and predictor variables.
By limiting the analysis to datasets with a small number of observations.
By enabling the response variable to follow different distributions beyond just normal distribution.
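A concrete example of that extension is Poisson regression: the response is a count rather than normal, and a log link connects the linear predictor to its mean. A minimal numpy sketch that fits it by iteratively reweighted least squares (Fisher scoring), the standard GLM fitting loop; the data and coefficients here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical count data: the response is Poisson, not normal.
n = 1000
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ true_beta))

# Fit a Poisson GLM with a log link by Fisher scoring (IRLS).
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)                  # inverse link: mean of y
    W = mu                                 # Poisson variance function
    grad = X.T @ (y - mu)                  # score
    hess = X.T @ (X * W[:, None])          # Fisher information
    beta = beta + np.linalg.solve(hess, grad)

print(beta)  # close to the true coefficients [0.5, 0.3]
```

Swapping the distribution and link (binomial with a logit link, gamma with an inverse link, and so on) changes only `mu` and `W` in this loop, which is precisely the generality GLMs add over linear regression.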
Which evaluation metric is particularly sensitive to outliers in the dependent variable?
MAE
Adjusted R-squared
R-squared
RMSE
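The outlier sensitivity follows from the squaring inside RMSE: a single large error contributes quadratically there but only linearly to MAE. A small deterministic numpy sketch with illustrative error values:

```python
import numpy as np

# Nine unit errors, then the same set with one gross outlier.
errors = np.array([1.0] * 9 + [1.0])
errors_out = np.array([1.0] * 9 + [100.0])

mae = lambda e: np.mean(np.abs(e))
rmse = lambda e: np.sqrt(np.mean(e ** 2))

print(mae(errors), mae(errors_out))    # 1.0 -> 10.9
print(rmse(errors), rmse(errors_out))  # 1.0 -> ~31.6
```

One outlier multiplies MAE by about 11 but RMSE by about 32, which is why RMSE is the metric most inflated by outliers in the dependent variable.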
How does Lasso Regression differ from Ridge Regression in terms of feature selection?
Ridge Regression tends to shrink all coefficients towards zero but rarely sets them exactly to zero.
Both Lasso and Ridge Regression can shrink coefficients to zero, but Lasso does it more aggressively.
Lasso Regression can shrink coefficients to exactly zero, effectively performing feature selection.
Neither Lasso nor Ridge Regression performs feature selection; they only shrink coefficients.
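The contrast has a clean closed form in the special case of an orthonormal design: ridge scales each OLS coefficient by 1/(1 + lambda), while lasso soft-thresholds it, zeroing out anything smaller than lambda. A minimal numpy sketch (the coefficient values and penalty are illustrative assumptions):

```python
import numpy as np

# OLS coefficients for a hypothetical orthonormal design; two are
# genuinely large, two are small "noise" coefficients.
b_ols = np.array([3.0, -2.0, 0.15, -0.05])
lam = 0.5

# Ridge (L2): every coefficient is scaled toward zero, none hits it.
b_ridge = b_ols / (1 + lam)

# Lasso (L1): soft-thresholding sets small coefficients exactly to zero.
b_lasso = np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam, 0.0)

print(b_ridge)  # all four entries remain nonzero
print(b_lasso)  # the two small coefficients are exactly 0.0
```

Exact zeros are what make lasso a feature-selection tool: predictors whose coefficients are thresholded to zero drop out of the model entirely, whereas ridge keeps every predictor with a shrunken weight.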
Which metric is in the same units as the dependent variable, making it easier to interpret directly?