What is a primary risk of using high-degree polynomials in polynomial regression?
It always improves the model's performance on unseen data.
It makes the model too simple and reduces its ability to capture complex patterns.
It can lead to overfitting, where the model learns the training data too well but fails to generalize to new data.
It reduces the computational cost of training the model.
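A minimal sketch of the overfitting risk, assuming scikit-learn and a synthetic noisy sine wave (both illustrative choices, not part of the question):

```python
# High-degree polynomial overfitting on synthetic data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=40)  # noisy sine wave

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (2, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree, model.score(X_train, y_train), model.score(X_test, y_test))
# The degree-15 fit typically scores near 1.0 on the training set but
# much worse on the held-out test set: it has memorized noise, i.e. overfit.
```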
How do polynomial features help in capturing non-linear relationships in data?
They convert categorical variables into numerical variables.
They reduce the impact of outliers on the regression line.
They introduce non-linear terms, allowing the model to fit curved relationships.
They make the model less complex and easier to interpret.
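A small sketch of how polynomial features introduce non-linear terms (scikit-learn assumed; feature names are illustrative):

```python
# PolynomialFeatures expands the inputs with squared and interaction terms.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])                 # one sample with x1=2, x2=3
poly = PolynomialFeatures(degree=2, include_bias=False)
print(poly.fit_transform(X))               # [[2. 3. 4. 6. 9.]]
print(poly.get_feature_names_out(["x1", "x2"]))  # x1, x2, x1^2, x1 x2, x2^2
```

A linear model fit on these expanded columns can trace a curve in the original feature space while remaining linear in its parameters.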
If a predictor has a p-value of 0.02 in a multiple linear regression model, what can you conclude?
The predictor is statistically significant at the 0.05 level.
The predictor explains 2% of the variance in the outcome.
The predictor is not statistically significant.
The predictor has a practically significant effect on the outcome.
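A hedged sketch of reading predictor p-values, assuming statsmodels and synthetic data in which only the first predictor truly matters:

```python
# Inspecting per-predictor p-values from an OLS fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 1.5 * X[:, 0] + rng.normal(size=200)    # only the first predictor matters

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.pvalues)    # compare each p-value against the 0.05 threshold
# A p-value of 0.02 < 0.05 means the coefficient is statistically
# significant at the 0.05 level; it says nothing about effect size
# or the share of variance explained.
```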
How does stepwise selection work in feature selection?
It transforms the original features into a lower-dimensional space while preserving important information.
It uses L1 or L2 regularization to shrink irrelevant feature coefficients to zero.
It ranks features based on their correlation with the target variable and selects the top-k features.
It iteratively adds or removes features based on a statistical criterion, aiming to find the best subset.
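One common implementation is scikit-learn's SequentialFeatureSelector, sketched below; note it uses cross-validated score as the criterion, whereas classic stepwise methods use p-values or AIC/BIC:

```python
# Forward stepwise selection: start empty, greedily add the feature that
# most improves the model, stop at the requested subset size.
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)
selector = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=4, direction="forward")
selector.fit(X, y)
print(selector.get_support())   # boolean mask of the selected feature subset
```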
What is the primary purpose of using hierarchical linear models (HLMs)?
To analyze data with a single level of variability.
To improve the accuracy of predictions in linear regression.
To analyze data with nested or grouped structures.
To handle missing data in a linear regression model.
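A minimal random-intercept sketch with statsmodels MixedLM, assuming synthetic grouped data (the column names are illustrative):

```python
# A random intercept per group captures group-level variability that a
# single-level linear regression would ignore.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
groups = np.repeat(np.arange(10), 20)                # 10 groups, 20 obs each
group_effect = rng.normal(scale=2.0, size=10)[groups]
x = rng.normal(size=200)
y = 1.0 + 0.5 * x + group_effect + rng.normal(size=200)
df = pd.DataFrame({"y": y, "x": x, "group": groups})

result = smf.mixedlm("y ~ x", df, groups=df["group"]).fit()
print(result.summary())
```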
Which evaluation metric is particularly sensitive to outliers in the dependent variable?
R-squared
Adjusted R-squared
MAE (mean absolute error)
RMSE (root mean squared error)
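A quick numeric sketch of why RMSE reacts to outliers more strongly than MAE (values are made up for illustration):

```python
# Squaring the residuals amplifies a single large miss.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([3.0, 5.0, 7.0, 9.0, 100.0])   # last value is an outlier
y_pred = np.array([3.1, 4.8, 7.2, 9.1, 10.0])

mae = mean_absolute_error(y_true, y_pred)         # about 18
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # about 40
print(mae, rmse)   # RMSE is far larger because squaring amplifies the miss
```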
What happens to the bias and variance of a linear regression model as the regularization parameter (lambda) increases?
Bias decreases, Variance decreases
Bias decreases, Variance increases
Bias increases, Variance increases
Bias increases, Variance decreases
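A sketch of the shrinkage behind this trade-off, using ridge regression on synthetic data (alpha is scikit-learn's name for lambda):

```python
# Larger alpha pulls coefficients toward zero: bias rises, variance falls.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

for alpha in (0.01, 1.0, 100.0):
    coefs = Ridge(alpha=alpha).fit(X, y).coef_
    print(alpha, np.round(np.abs(coefs).sum(), 2))  # shrinks as alpha grows
```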
Which of the following scenarios would benefit from using a hierarchical linear model?
Classifying emails as spam or not spam
Forecasting stock prices based on historical data
Analyzing the effect of a new drug on patients in different hospitals
Predicting the price of a house based on its size and location
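Continuing the MixedLM sketch above, the drug/hospital scenario fits the nested structure directly: patients (level 1) sit within hospitals (level 2), so each hospital gets a random intercept. The column names ("outcome", "treated", "hospital") are illustrative:

```python
# Estimating a treatment effect net of hospital-level variation.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
hospital = np.repeat(np.arange(8), 25)             # 8 hospitals, 25 patients
treated = rng.integers(0, 2, size=200)
hospital_effect = rng.normal(scale=1.5, size=8)[hospital]
outcome = 2.0 * treated + hospital_effect + rng.normal(size=200)
df = pd.DataFrame({"outcome": outcome, "treated": treated,
                   "hospital": hospital})

fit = smf.mixedlm("outcome ~ treated", df, groups=df["hospital"]).fit()
print(fit.params["treated"])   # drug effect, net of hospital variation
```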
Which of the following is a potential drawback of using robust regression methods instead of ordinary least squares (OLS)?
They always require data normalization before model fitting
They can be computationally more expensive than OLS regression
They are not applicable to datasets with categorical variables
They always result in models with lower predictive accuracy than OLS regression
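A sketch contrasting OLS with a robust fit on outlier-contaminated synthetic data; robust estimators are typically fit iteratively (e.g., via IRLS), which is one reason they cost more than the closed-form OLS solution:

```python
# Robust regression downweights outliers that drag the OLS slope.
import numpy as np
from sklearn.linear_model import LinearRegression, HuberRegressor

rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X.ravel() + rng.normal(size=100)       # true slope is 3.0

idx = np.argsort(X.ravel())[:5]                  # five smallest-x points
y[idx] += 80.0                                   # inject gross outliers

print(LinearRegression().fit(X, y).coef_)        # slope dragged by outliers
print(HuberRegressor().fit(X, y).coef_)          # stays close to 3.0
```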
How do generalized linear models (GLMs) handle heteroscedasticity, where the variance of the residuals is not constant across the range of predictor values?
They require data transformations to stabilize variance before analysis.
They ignore heteroscedasticity, as it doesn't affect GLM estimation.
They use non-parametric techniques to adjust for heteroscedasticity.
They implicitly account for it by allowing the variance to be a function of the mean.
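A sketch of this mean-variance link using a Poisson GLM in statsmodels, on synthetic count data: under the Poisson family, Var(y) is assumed equal to the mean, so non-constant variance is built into the model rather than treated as a violation:

```python
# The Poisson family ties the variance to the mean, so heteroscedastic
# counts need no pre-transformation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(0, 2, size=300)
y = rng.poisson(np.exp(0.5 + 1.0 * x))   # variance grows with the mean

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(fit.params)   # recovers roughly [0.5, 1.0] without transforming y
```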