Which of the following is a potential drawback of using robust regression methods?
They are not applicable to datasets with categorical variables
They always require data normalization before model fitting
They always result in models with lower predictive accuracy than OLS regression
They can be computationally more expensive than OLS regression
Which metric is most directly interpretable in the original units of the dependent variable?
Adjusted R-squared
None of the above
Root Mean Squared Error (RMSE)
RMSE and MAE are equally interpretable
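To illustrate the metrics named in the options above, the following minimal sketch computes RMSE and MAE on made-up values. Both results are expressed in the same units as the dependent variable, whereas adjusted R-squared is unitless. The sample numbers and function names are assumptions chosen for illustration.

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error: average size of the errors, in the units of y.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: square root of the average squared error,
    # which brings the result back to the units of y.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical observed and predicted values (illustrative only).
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
print(mae_value, rmse_value)
```

RMSE is never smaller than MAE on the same errors, because squaring gives large errors more weight.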
Why is evaluating the model on a separate test set crucial in Polynomial Regression?
To visualize the residuals and check for any non-linear patterns.
To estimate the model's performance on unseen data and assess its generalization ability.
To calculate the model's complexity and determine the optimal degree of the polynomial.
To fine-tune the model's hyperparameters and improve its fit on the training data.
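The role of a held-out test set can be sketched with an extreme case: a polynomial of degree n-1 passes exactly through n training points, so its training error is zero even when the data are noisy. The example below uses Lagrange interpolation as a stand-in for such a maximal-degree fit. The data values are invented, and the assumed underlying relationship is y = x.

```python
def lagrange_predict(xs, ys, x):
    # Evaluate the unique polynomial of degree len(xs) - 1 that passes
    # through every (xs[j], ys[j]) pair, at the point x.
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        basis = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                basis *= (x - xm) / (xj - xm)
        total += yj * basis
    return total

def mse(train_x, train_y, eval_x, eval_y):
    # Mean squared error of the interpolating polynomial on (eval_x, eval_y).
    errors = [(lagrange_predict(train_x, train_y, x) - y) ** 2
              for x, y in zip(eval_x, eval_y)]
    return sum(errors) / len(errors)

# Hypothetical noisy training data around the assumed line y = x.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.5, 1.5, 3.5, 3.5]

# Hypothetical unseen points taken from the same assumed line.
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = [0.5, 1.5, 2.5, 3.5]

train_mse = mse(train_x, train_y, train_x, train_y)
test_mse = mse(train_x, train_y, test_x, test_y)
print(train_mse, test_mse)
```

The training error alone would suggest a perfect model. Only the error on the unseen points shows how the fit behaves away from the training data.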
What distinguishes a random slope model from a random intercept model in HLM?
Random slope models handle categorical variables, while random intercept models handle continuous variables.
Random slope models allow intercepts to vary, while random intercept models don't.
Random slope models are used for smaller datasets, while random intercept models are used for larger datasets.
Random slope models allow slopes to vary, while random intercept models don't.
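The two model types can be sketched as prediction equations rather than as a fitted model. In the sketch below, each group has a deviation from the fixed intercept and, in the random slope case, also a deviation from the fixed slope. The group names, coefficient values, and function names are hypothetical.

```python
def predict(x, fixed_intercept, fixed_slope,
            intercept_dev=0.0, slope_dev=0.0):
    # Group-specific line: (b0 + u0_j) + (b1 + u1_j) * x
    return (fixed_intercept + intercept_dev) + (fixed_slope + slope_dev) * x

def group_slope(**params):
    # Change in the prediction when x increases from 0 to 1.
    return predict(1.0, **params) - predict(0.0, **params)

fixed = {"fixed_intercept": 2.0, "fixed_slope": 1.0}

# Random intercept model: groups differ only in their intercepts.
ri_a = group_slope(**fixed, intercept_dev=1.0)
ri_b = group_slope(**fixed, intercept_dev=-1.0)

# Random slope model: groups also differ in their slopes.
rs_a = group_slope(**fixed, intercept_dev=1.0, slope_dev=0.5)
rs_b = group_slope(**fixed, intercept_dev=-1.0, slope_dev=-0.5)

print(ri_a, ri_b, rs_a, rs_b)
```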
What type of data is particularly well-suited for analysis using hierarchical linear models?
Experimental data
Cross-sectional data
Nested data
Time series data
What is the primary motivation for using robust regression over ordinary least squares (OLS) regression?
To reduce the computational complexity of the regression analysis
To improve the interpretability of the regression coefficients
To mitigate the impact of outliers on the fitted regression line
To handle datasets with non-linear relationships between variables more effectively
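One way to see how a robust method limits the influence of outliers is to compare loss functions. The sketch below contrasts the squared loss used by OLS with the Huber loss, which is one common robust choice and is an assumption here, since the question does not name a specific method. The threshold `delta` and the residual values are illustrative.

```python
def squared_loss(residual):
    # Loss minimized by OLS (with the conventional factor of 1/2).
    return 0.5 * residual * residual

def huber_loss(residual, delta=1.0):
    # Quadratic for small residuals, linear beyond the threshold delta,
    # so large residuals contribute far less than under squared loss.
    size = abs(residual)
    if size <= delta:
        return 0.5 * residual * residual
    return delta * (size - 0.5 * delta)

small, large = 0.5, 10.0
print(squared_loss(small), huber_loss(small))   # identical for small residuals
print(squared_loss(large), huber_loss(large))   # Huber grows much more slowly
```

Minimizing a loss like this generally has no closed-form solution and is typically done iteratively, which relates to the computational cost raised in the first question.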
How do polynomial features help in capturing non-linear relationships in data?
They introduce non-linear terms, allowing the model to fit curved relationships.
They convert categorical variables into numerical variables.
They reduce the impact of outliers on the regression line.
They make the model less complex and easier to interpret.
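A minimal sketch of polynomial feature expansion for a single predictor follows. The model stays linear in its coefficients, but the added powers of x let the fitted curve bend. The function names and coefficient values are assumptions for illustration.

```python
def polynomial_features(x, degree):
    # Expand one predictor into [1, x, x^2, ..., x^degree].
    return [x ** k for k in range(degree + 1)]

def predict(x, coefficients):
    # Linear combination of the expanded features.
    features = polynomial_features(x, len(coefficients) - 1)
    return sum(c * f for c, f in zip(coefficients, features))

# Hypothetical coefficients for the curve y = 1 + 2 * x^2.
coefficients = [1.0, 0.0, 2.0]

print(polynomial_features(2.0, 3))
print([predict(x, coefficients) for x in (0.0, 1.0, 2.0, 3.0)])
```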
What does the adjusted R-squared value tell you in multiple linear regression?
The accuracy of the model's predictions.
The proportion of variance in the outcome explained by the predictors, adjusted for the number of predictors in the model.
The statistical significance of the overall model.
The presence of outliers in the data.
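The adjustment can be made concrete with the standard formula, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The sketch below applies it to hypothetical values.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    # Penalizes R-squared for the number of predictors in the model.
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical model: R-squared of 0.80 with 50 observations.
few_predictors = adjusted_r2(0.80, n_obs=50, n_predictors=5)
many_predictors = adjusted_r2(0.80, n_obs=50, n_predictors=20)
print(few_predictors, many_predictors)
```

For the same R-squared, the model with more predictors receives the lower adjusted value.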
What is the primary role of a link function in a Generalized Linear Model?
It establishes a connection between the linear predictor and the mean of the response variable.
It transforms the predictor variables to follow a normal distribution.
It calculates the residuals between the observed and predicted values.
It determines the optimal number of predictor variables to include in the model.
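As an illustration, the sketch below uses the logit link, the usual choice for a binomial response. It is one example of a link function, not the only one. The link maps a mean in (0, 1) to the scale of the linear predictor, and its inverse maps the linear predictor back to a mean. The coefficient values are hypothetical.

```python
import math

def logit(mu):
    # Link function: mean of the response -> linear predictor scale.
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    # Inverse link: linear predictor -> mean of the response.
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and predictor value.
intercept, slope, x = -1.0, 0.5, 2.0
eta = intercept + slope * x          # linear predictor
mu = inverse_logit(eta)              # predicted mean (a probability)
print(eta, mu, logit(mu))
```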
What does heteroscedasticity in a residual plot typically look like?
A straight line with non-zero slope
A funnel shape, widening or narrowing along the x-axis
A random scattering of points
A U-shape or inverted U-shape
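The residual patterns described above can be checked numerically as well as visually. The sketch below simulates residuals whose standard deviation grows with x, which is an assumed data-generating process, and compares their spread in the lower and upper halves of the x range.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated residuals whose standard deviation is proportional to x.
xs = list(range(1, 201))
residuals = [random.gauss(0.0, 0.1 * x) for x in xs]

# Compare the spread of residuals for small x and for large x.
half = len(xs) // 2
low_spread = statistics.pstdev(residuals[:half])
high_spread = statistics.pstdev(residuals[half:])
print(low_spread, high_spread)
```

A clearly larger spread at one end of the x range is the numerical counterpart of residuals fanning out in a plot.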