In which scenario might you prefer Huber regression over RANSAC for robust regression?
When it's important to completely discard the outliers from the analysis
When the proportion of outliers is relatively small
When the outliers are expected to be clustered together
When dealing with high-dimensional data with a large number of features
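For reference, a minimal sketch of the Huber loss that Huber regression minimizes. Small residuals are penalized quadratically and large ones only linearly, so outliers are down-weighted rather than removed. The function name and the threshold name `delta` are illustrative choices, not taken from any particular library.

```python
def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Huber loss for a single residual (delta is the switch-over threshold)."""
    a = abs(residual)
    if a <= delta:
        # Quadratic region: behaves like squared error for small residuals.
        return 0.5 * a * a
    # Linear region: grows more slowly, limiting the influence of outliers.
    return delta * (a - 0.5 * delta)


small = huber_loss(0.5)   # falls in the quadratic region
large = huber_loss(3.0)   # falls in the linear region
```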
What does Adjusted R-squared penalize that R-squared does not?
Non-linearity in the relationship
Presence of outliers
Number of data points
Inclusion of irrelevant predictor variables
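For reference, a small sketch of the usual Adjusted R-squared formula, 1 - (1 - R^2)(n - 1)/(n - p - 1), for a model with an intercept, n observations, and p predictors. The function name and the sample values are illustrative assumptions.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R-squared for a model with an intercept."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Same raw R-squared, different numbers of predictors (made-up values).
few = adjusted_r2(0.80, n_samples=50, n_predictors=2)
many = adjusted_r2(0.80, n_samples=50, n_predictors=10)
```

With the raw R-squared held fixed, the adjusted value drops as the predictor count grows.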
What happens to the bias and variance of a linear regression model as the regularization parameter (lambda) increases?
Bias increases, Variance decreases
Bias increases, Variance increases
Bias decreases, Variance increases
Bias decreases, Variance decreases
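To make the effect of lambda concrete, here is a minimal sketch of ridge regression with a single predictor and no intercept. This is a simplifying assumption that gives the closed-form slope sum(x*y) / (sum(x^2) + lambda). The function name and the toy data are illustrative.

```python
def ridge_slope(x: list[float], y: list[float], lam: float) -> float:
    """Closed-form ridge slope for one predictor, no intercept.

    Minimizes sum((y_i - b * x_i)^2) + lam * b^2 over b.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Toy data lying exactly on y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
slopes = [ridge_slope(x, y, lam) for lam in (0.0, 10.0, 30.0)]
```

The estimate is pulled further toward zero as `lam` grows, which makes it more stable across samples while moving it away from the least-squares fit.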
What does a Variance Inflation Factor (VIF) value greater than 10 generally suggest?
Severe multicollinearity
No multicollinearity
Perfect multicollinearity
Heteroscedasticity
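For reference, the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on the other predictors. The sketch below assumes R_j^2 has already been computed elsewhere; the function name is illustrative.

```python
import math


def vif_from_r2(r2_j: float) -> float:
    """Variance Inflation Factor given the auxiliary R-squared of predictor j."""
    if not 0.0 <= r2_j <= 1.0:
        raise ValueError("R-squared must lie in [0, 1]")
    if r2_j == 1.0:
        # Predictor j is an exact linear combination of the others.
        return math.inf
    return 1.0 / (1.0 - r2_j)


uncorrelated = vif_from_r2(0.0)   # predictor unrelated to the others
threshold = vif_from_r2(0.9)      # auxiliary R-squared that gives a VIF of about 10
```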
The performance of the Theil-Sen estimator can be sensitive to which characteristic of the data?
The presence of categorical variables
The non-normality of the residuals
The presence of multicollinearity (high correlation between independent variables)
The presence of heteroscedasticity (unequal variances of errors)
What is the role of feature selection in Polynomial Regression?
To reduce the model complexity by identifying and selecting the most relevant features.
To increase the number of features used in the model to improve accuracy.
To visualize the relationship between the target variable and independent variables.
To convert categorical variables into numerical variables.
Poisson regression, another type of GLM, is particularly well suited to analyzing which kind of data?
Continuous measurements
Ordinal data with a specific order
Proportions or percentages
Count data of rare events
How do hierarchical linear models help avoid misleading conclusions in nested data analysis?
By accounting for the correlation between observations within groups
By assuming all groups have the same effect on the outcome
By treating all observations as independent
By ignoring individual-level variations
Elastic Net Regression combines the penalties of which two regularization techniques?
Lasso Regression and Polynomial Regression
Ridge Regression and Polynomial Regression
Linear Regression and Ridge Regression
Lasso Regression and Ridge Regression
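For reference, a minimal sketch of an Elastic Net penalty term as a weighted mix of an L1 (absolute value) part and an L2 (squared) part. The parameter names `alpha` and `l1_ratio` follow the parameterization used by scikit-learn's `ElasticNet`, but this standalone function is only an illustration and does not call any library.

```python
def elastic_net_penalty(weights: list[float], alpha: float, l1_ratio: float) -> float:
    """Penalty = alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)."""
    l1 = sum(abs(w) for w in weights)      # absolute-value (L1) part
    l2_sq = sum(w * w for w in weights)    # squared (L2) part
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_sq)


w = [1.0, -2.0, 0.0]
mixed = elastic_net_penalty(w, alpha=1.0, l1_ratio=0.5)
pure_l1 = elastic_net_penalty(w, alpha=1.0, l1_ratio=1.0)
pure_l2 = elastic_net_penalty(w, alpha=1.0, l1_ratio=0.0)
```

Setting `l1_ratio` to 1 or 0 leaves only the L1 or only the L2 term, respectively.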
What is the primary goal of regularization techniques in linear regression?
To speed up the training process of the linear regression model.
To handle missing data points in the dataset more effectively.
To prevent overfitting by adding a penalty to the complexity of the model.
To improve model interpretability by selecting only the most relevant features.