What does a Variance Inflation Factor (VIF) value greater than 10 generally suggest?
No multicollinearity
Heteroscedasticity
Perfect multicollinearity
Severe multicollinearity
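To make the VIF question concrete, here is a minimal numpy sketch that computes VIFs by regressing each predictor on the others (VIF_j = 1/(1 - R_j^2)). The data are simulated for illustration: x2 is built to be nearly collinear with x1, so its VIF far exceeds 10.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)  # nearly collinear with x1
x3 = rng.normal(size=n)                   # independent predictor
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2) from regressing column j on the other columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(f"VIF of x{j + 1}: {vif(X, j):.1f}")
```

With this setup, x1 and x2 show VIFs well above 10 (severe multicollinearity), while the independent x3 stays near 1.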
Which of the following scenarios would benefit from using a hierarchical linear model?
Forecasting stock prices based on historical data
Classifying emails as spam or not spam
Predicting the price of a house based on its size and location
Analyzing the effect of a new drug on patients in different hospitals
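The drug-across-hospitals scenario is the classic hierarchical setting: patients are nested in hospitals, and each hospital contributes its own intercept. The following is a rough numpy sketch of the partial-pooling idea behind hierarchical models; the hospital data are simulated and the shrinkage factor is a simplified empirical-Bayes approximation, not a full mixed-model fit.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hospitals, n_patients = 8, 15
# True hospital-level intercepts (between-hospital variation)
true_intercepts = rng.normal(loc=5.0, scale=2.0, size=n_hospitals)
y = true_intercepts[:, None] + rng.normal(scale=1.0, size=(n_hospitals, n_patients))

grand_mean = y.mean()
hospital_means = y.mean(axis=1)

# Partial pooling: shrink each hospital mean toward the grand mean by a
# factor based on between- vs within-hospital variance (crude approximation)
sigma2_within = y.var(axis=1, ddof=1).mean() / n_patients
sigma2_between = hospital_means.var(ddof=1)
shrink = sigma2_between / (sigma2_between + sigma2_within)
partial_pooled = grand_mean + shrink * (hospital_means - grand_mean)
```

Each pooled estimate sits between the hospital's own mean (no pooling) and the grand mean (complete pooling), which is exactly what a random-intercept hierarchical model does more formally.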
What is the primary motivation for using robust regression over ordinary least squares (OLS) regression?
To mitigate the impact of outliers on the fitted regression line
To improve the interpretability of the regression coefficients
To reduce the computational complexity of the regression analysis
To handle datasets with non-linear relationships between variables more effectively
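A small numpy illustration of the robustness point: a single gross outlier drags the OLS slope away from the truth, while a Huber M-estimator fitted by iteratively reweighted least squares (IRLS) largely ignores it. The data, outlier size, and tuning constant (1.345, a common default) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=50)
y[-1] += 40.0  # a single gross outlier

A = np.column_stack([np.ones_like(x), x])

# Ordinary least squares fit (sensitive to the outlier)
beta_ols, *_ = np.linalg.lstsq(A, y, rcond=None)

def huber_fit(A, y, delta=1.345, n_iter=50):
    """Huber M-estimation via iteratively reweighted least squares."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    for _ in range(n_iter):
        r = y - A @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12   # robust scale via MAD
        u = np.abs(r / scale)
        w = np.where(u <= delta, 1.0, delta / u)        # downweight large residuals
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return beta

beta_huber = huber_fit(A, y)
print("OLS slope:", beta_ols[1], "Huber slope:", beta_huber[1])
```

The robust slope lands much closer to the true value of 2 than the OLS slope does.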
Why is evaluating the model on a separate test set crucial in polynomial regression?
To calculate the model's complexity and determine the optimal degree of the polynomial.
To estimate the model's performance on unseen data and assess its generalization ability.
To fine-tune the model's hyperparameters and improve its fit on the training data.
To visualize the residuals and check for any non-linear patterns.
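A quick numpy sketch of the train/test point: on a held-out set, an underfit linear model and a badly overfit high-degree polynomial both look worse than the right degree, even though training error keeps falling as the degree grows. The simulated quadratic data and the degrees compared are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 120
x = rng.uniform(-3, 3, size=n)
y = x**2 - x + rng.normal(scale=1.0, size=n)  # true relationship is quadratic

# Hold out a test set so performance on unseen data can be estimated
idx = rng.permutation(n)
train, test = idx[:90], idx[90:]

def train_test_mse(deg):
    """Fit a degree-`deg` polynomial on the training split; return both MSEs."""
    coefs = np.polyfit(x[train], y[train], deg)
    mse = lambda s: np.mean((y[s] - np.polyval(coefs, x[s])) ** 2)
    return mse(train), mse(test)

for deg in (1, 2, 10):
    tr, te = train_test_mse(deg)
    print(f"degree {deg}: train MSE {tr:.2f}, test MSE {te:.2f}")
```

Training error alone cannot distinguish degree 2 from degree 10; only the test-set error reveals which generalizes.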
Poisson regression, another type of GLM, is particularly well-suited for analyzing which kind of data?
Proportions or percentages
Ordinal data with a specific order
Continuous measurements
Count data of rare events
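To illustrate the count-data case, here is a minimal numpy sketch of Poisson regression with the canonical log link, fitted by Newton-Raphson rather than a library call; the simulated rates and coefficients are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
# Count outcome whose rate depends log-linearly on x: lambda = exp(0.3 + 0.8 x)
lam = np.exp(0.3 + 0.8 * x)
y = rng.poisson(lam)

X = np.column_stack([np.ones(n), x])

# Newton-Raphson for the Poisson log-likelihood (log link):
# gradient = X'(y - mu), Hessian = X' diag(mu) X
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    grad = X.T @ (y - mu)
    hess = X.T @ (X * mu[:, None])
    beta = beta + np.linalg.solve(hess, grad)

print("estimated coefficients:", beta)
```

The fitted coefficients recover the generating values (0.3, 0.8) closely, which is the behavior one would expect from a GLM matched to count data.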
How do hierarchical linear models help avoid misleading conclusions in nested data analysis?
By assuming all groups have the same effect on the outcome
By ignoring individual-level variations
By treating all observations as independent
By accounting for the correlation between observations within groups
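The following numpy sketch shows why ignoring within-group correlation is misleading: with clustered data, a naive standard error that treats all observations as independent is far too optimistic compared to one based on the independent group means. The group sizes and variances are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n_groups, n_per = 30, 20
group_effect = rng.normal(scale=2.0, size=n_groups)        # shared within a group
noise = rng.normal(scale=1.0, size=(n_groups, n_per))      # individual variation
y = group_effect[:, None] + noise

# Naive SE of the overall mean: treats all 600 observations as independent
se_naive = y.std(ddof=1) / np.sqrt(y.size)

# Cluster-aware SE: only the 30 group means are truly independent
group_means = y.mean(axis=1)
se_cluster = group_means.std(ddof=1) / np.sqrt(n_groups)

print(f"naive SE: {se_naive:.3f}, cluster-aware SE: {se_cluster:.3f}")
```

The naive SE understates the true uncertainty several-fold, which is the kind of misleading conclusion hierarchical models guard against by modeling the within-group correlation.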
What is the primary difference between L1 and L2 regularization in the context of feature selection?
L1 regularization can shrink some feature coefficients to exactly zero, performing feature selection, while L2 regularization generally shrinks coefficients towards zero without making them exactly zero.
L2 regularization is more computationally expensive than L1 regularization.
L2 regularization forces the model to use all available features, while L1 regularization selects a subset of features.
L1 regularization is less effective when dealing with highly correlated features compared to L2 regularization.
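A numpy demonstration of the key L1-vs-L2 distinction: lasso (L1) fitted by coordinate descent with soft-thresholding drives the coefficients of irrelevant features to exactly zero, while ridge (L2), computed in closed form, only shrinks them. The data, penalty strength, and feature layout are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 200, 6
X = rng.normal(size=(n, p))
true_beta = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0])  # only 2 informative features
y = X @ true_beta + rng.normal(scale=0.5, size=n)

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso(X, y, lam, n_iter=200):
    """L1-penalized least squares via cyclic coordinate descent."""
    beta = np.zeros(X.shape[1])
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual for feature j
            beta[j] = soft_threshold(X[:, j] @ r, lam) / col_ss[j]
    return beta

def ridge(X, y, lam):
    """L2-penalized least squares, closed form."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

b_l1 = lasso(X, y, lam=50.0)
b_l2 = ridge(X, y, lam=50.0)
print("lasso:", b_l1)
print("ridge:", b_l2)
```

The lasso solution has exact zeros on the noise features (built-in feature selection); every ridge coefficient stays nonzero, merely small.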
Which of the following is a method for detecting outliers in linear regression?
Cook's distance
Residual plots
Leverage values
All of the above
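Since all three diagnostics are valid, here is a numpy sketch computing two of them directly from the hat matrix: leverage values (its diagonal) and Cook's distance, which combines residual size with leverage. An outlier is injected at a known index for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 30)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=30)
y[10] += 5.0  # inject an outlier at index 10

X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix; its diagonal is leverage
leverage = np.diag(H)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
p = X.shape[1]
mse = resid @ resid / (len(y) - p)

# Cook's distance: influence of each point on the fitted coefficients
cooks_d = (resid**2 / (p * mse)) * leverage / (1 - leverage) ** 2

print("most influential point:", np.argmax(cooks_d))
```

Cook's distance flags the injected point, and the leverage values sum to p (the trace of the hat matrix), a useful sanity check.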
When using Principal Component Analysis (PCA) as a remedy for multicollinearity, what is the primary aim?
To remove all independent variables from the model
To introduce non-linearity into the model
To create new, uncorrelated variables from the original correlated ones
To increase the sample size of the dataset
A model has a high R-squared but a low Adjusted R-squared. What is a likely explanation?
The model has high bias.
The model is overfitting.
The model is too simple.
The model is a perfect fit.
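The overfitting answer can be demonstrated numerically: adding junk predictors can only raise R-squared, but adjusted R-squared penalizes the extra parameters, so the gap between the two widens. This numpy sketch uses one real predictor plus 20 pure-noise predictors; the sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 30
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

def r2_and_adjusted(X, y):
    """R^2 and adjusted R^2 for an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = resid @ resid
    ss_tot = (y - y.mean()) @ (y - y.mean())
    r2 = 1 - ss_res / ss_tot
    p = X.shape[1]
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - p - 1)
    return r2, adj

X_small = x[:, None]                                           # 1 real predictor
X_big = np.column_stack([X_small, rng.normal(size=(n, 20))])   # + 20 junk predictors

r2_s, adj_s = r2_and_adjusted(X_small, y)
r2_b, adj_b = r2_and_adjusted(X_big, y)
print(f"small model: R2={r2_s:.3f}, adj={adj_s:.3f}")
print(f"big model:   R2={r2_b:.3f}, adj={adj_b:.3f}")
```

The bloated model's R-squared is higher, yet its R-squared/adjusted gap is much larger, which is the signature of overfitting the question points to.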