Logistic regression, a specific type of GLM, is best suited for modeling which type of response variable?
Binary (two categories)
Time-to-event data
Count data
Continuous
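For the logistic-regression question above, here is a minimal sketch of fitting a binary (two-category) response with scikit-learn. The synthetic data and the single predictor are illustrative assumptions, not part of the question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: one continuous predictor, a binary (0/1) outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Logistic regression models P(y = 1 | X) through the logit link.
model = LogisticRegression().fit(X, y)
print(model.predict_proba(X[:5]))  # predicted class probabilities
```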
Which of the following scenarios would benefit from using a hierarchical linear model?
Forecasting stock prices based on historical data
Analyzing the effect of a new drug on patients in different hospitals
Classifying emails as spam or not spam
Predicting the price of a house based on its size and location
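For the hierarchical-model question above, a hedged sketch of a random-intercept model with statsmodels' `mixedlm`, mirroring the "patients nested within hospitals" scenario. The column names (`response`, `dose`, `hospital`) and the simulated data are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: patients nested within hospitals.
rng = np.random.default_rng(1)
n_hospitals, n_patients = 10, 30
hospital = np.repeat(np.arange(n_hospitals), n_patients)
dose = rng.uniform(0, 10, size=hospital.size)
baseline = rng.normal(scale=2.0, size=n_hospitals)[hospital]  # hospital-level shift
response = 5 + 0.8 * dose + baseline + rng.normal(size=hospital.size)
df = pd.DataFrame({"response": response, "dose": dose, "hospital": hospital})

# Mixed-effects model: fixed effect of dose, random intercept per hospital.
model = smf.mixedlm("response ~ dose", df, groups=df["hospital"]).fit()
print(model.summary())
```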
Which of the following robust regression methods repeatedly fits candidate models to random subsets of the data, labels the remaining points as inliers or outliers, and returns a standard linear regression fit on the inliers only?
None of the above
Theil-Sen estimator
Huber regression
RANSAC (Random Sample Consensus)
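For the robust-regression question above, an illustrative sketch of RANSAC in scikit-learn (its default base estimator is ordinary least squares). The contaminated data below are made up for demonstration.

```python
import numpy as np
from sklearn.linear_model import RANSACRegressor

# Clean linear data plus a block of gross outliers.
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + rng.normal(size=100)
y[:10] += 50  # inject outliers

# RANSAC repeatedly fits on random subsets and keeps the consensus (inlier) set.
ransac = RANSACRegressor(random_state=0).fit(X, y)
print("slope:", ransac.estimator_.coef_[0])
print("inlier fraction:", ransac.inlier_mask_.mean())
```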
How do GLMs handle heteroscedasticity, a situation where the variance of residuals is not constant across the range of predictor values?
They implicitly account for it by allowing the variance to be a function of the mean.
They use non-parametric techniques to adjust for heteroscedasticity.
They require data transformations to stabilize variance before analysis.
They ignore heteroscedasticity as it doesn't impact GLM estimations.
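For the GLM question above, a hedged sketch of a Poisson GLM in statsmodels, where the variance is tied to the mean by the family itself (for Poisson, Var(y) = mean). The count data are synthetic.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic count data: the mean (and hence variance) grows with x.
rng = np.random.default_rng(3)
x = rng.uniform(0, 2, size=300)
mu = np.exp(0.5 + 1.2 * x)   # log link: log(mu) = 0.5 + 1.2 * x
y = rng.poisson(mu)

X = sm.add_constant(x)
# The Poisson family encodes Var(y) = mu, so the mean-variance link is built in.
glm = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(glm.params)
```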
The Theil-Sen estimator is known for its robustness and non-parametric nature. What does 'non-parametric' imply in this context?
It does not have any parameters that need to be estimated from the data
It does not require assumptions about the distribution of the data
It does not require a dependent variable for model fitting
It does not require a linear relationship between the variables
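For the Theil-Sen question above, a minimal sketch using scikit-learn's `TheilSenRegressor`; the heavy-tailed noise below is an assumption chosen to show that no particular error distribution is required.

```python
import numpy as np
from sklearn.linear_model import TheilSenRegressor

rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=(80, 1))
# Heavy-tailed (Student-t) noise: no normality assumption is needed for Theil-Sen.
y = 2.0 * X[:, 0] + 1.0 + rng.standard_t(df=2, size=80)

model = TheilSenRegressor(random_state=0).fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
```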
Which metric is in the same units as the dependent variable, making it easier to interpret directly?
Adjusted R-squared
R-squared
RMSE
MAE
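For the metrics question above, a short sketch computing RMSE and MAE on hypothetical predictions; both come out in the same units as the target (here, thousands of dollars), unlike squared-error metrics such as MSE or unitless ones such as R-squared. The numbers are made up.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical house prices (in thousands of dollars).
y_true = np.array([250.0, 300.0, 410.0, 500.0])
y_pred = np.array([260.0, 290.0, 430.0, 480.0])

mae = mean_absolute_error(y_true, y_pred)            # same units as the prices
rmse = np.sqrt(mean_squared_error(y_true, y_pred))   # same units as the prices
print(f"MAE = {mae:.1f}, RMSE = {rmse:.1f}")
```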
What is the primary difference between L1 and L2 regularization in the context of feature selection?
L1 regularization can shrink some feature coefficients to exactly zero, performing feature selection, while L2 regularization generally shrinks coefficients towards zero without making them exactly zero.
L2 regularization forces the model to use all available features, while L1 regularization selects a subset of features.
L1 regularization is less effective when dealing with highly correlated features compared to L2 regularization.
L2 regularization is more computationally expensive than L1 regularization.
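For the regularization question above, an illustrative comparison of Lasso (L1) and Ridge (L2) on synthetic data where only a few features matter; the exact coefficient counts depend on the made-up data and the chosen regularization strengths.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 20 features, only 3 actually influence y.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 20))
y = 3 * X[:, 0] - 2 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1: many coefficients driven to exactly 0
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: coefficients shrunk, rarely exactly 0

print("Lasso zero coefficients:", np.sum(lasso.coef_ == 0))
print("Ridge zero coefficients:", np.sum(ridge.coef_ == 0))
```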
You are comparing two linear regression models for predicting house prices. Model A has a lower RMSE than Model B. What does this imply about their predictive performance?
Model A, on average, has smaller prediction errors than Model B.
Model A has a higher R-squared value than Model B.
Model A is guaranteed to make better predictions on all new data points.
Model B is definitely overfitting the data.
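For the model-comparison question above, a hedged sketch of comparing two regressors by held-out RMSE; the models and data are placeholders, and a lower RMSE only indicates smaller errors on average for that evaluation set, not better predictions on every new point.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(6)
X = rng.normal(size=(300, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, model in [("Model A", LinearRegression()),
                    ("Model B", DecisionTreeRegressor(random_state=0))]:
    pred = model.fit(X_tr, y_tr).predict(X_te)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    print(f"{name}: RMSE = {rmse:.3f}")  # lower RMSE = smaller errors on average
```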
Which of the following is a common method for addressing multicollinearity in multiple linear regression?
Transforming the outcome variable.
Ignoring the issue, as it has no impact on the model.
Increasing the sample size.
Removing one or more of the correlated predictor variables.
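For the multicollinearity question above, a small sketch of diagnosing the problem with variance inflation factors (VIF) before deciding which of the correlated predictors to drop; the data and column names are synthetic.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Two highly correlated predictors plus one independent predictor.
rng = np.random.default_rng(7)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)  # nearly a copy of x1
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# A large VIF (commonly > 5 or 10) flags a predictor as collinear with the others.
for i, col in enumerate(X.columns):
    print(col, variance_inflation_factor(X.values, i))
```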
How do polynomial features help in capturing non-linear relationships in data?
They introduce non-linear terms, allowing the model to fit curved relationships.
They reduce the impact of outliers on the regression line.
They make the model less complex and easier to interpret.
They convert categorical variables into numerical variables.
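For the polynomial-features question above, a minimal sketch of capturing a curved relationship with `PolynomialFeatures` feeding a linear regression; the quadratic data are an illustrative assumption.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# A quadratic relationship that a plain straight line cannot capture.
rng = np.random.default_rng(8)
X = rng.uniform(-3, 3, size=(150, 1))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.3, size=150)

# degree=2 adds an x^2 column, so the linear model can fit a curve in x.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
print(model.predict([[0.0], [2.0]]))
```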