Who is credited with developing the foundational principles of linear regression?
Isaac Newton
Sir Francis Galton
Marie Curie
Albert Einstein
Why is normality of errors an important assumption in linear regression?
It is necessary for the calculation of the regression coefficients
It guarantees the homoscedasticity of the errors
It validates the use of hypothesis testing for the model's coefficients
It ensures the linearity of the relationship between variables
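The role of the normality assumption can be illustrated with a short sketch: under normal errors, the slope's t-statistic has a known distribution, which is what justifies the p-value. The data below are invented for illustration, and `scipy` is assumed to be available.

```python
import numpy as np
from scipy import stats

# Illustrative data (assumed for this sketch): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

n = x.size
sxx = np.sum((x - x.mean()) ** 2)
b = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # OLS slope
a = y.mean() - b * x.mean()                          # OLS intercept
resid = y - (a + b * x)
se_b = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)   # standard error of the slope

# With normal errors, t follows a t-distribution with n-2 df under H0: slope = 0,
# which is what makes this p-value (and the resulting test) valid.
t_stat = b / se_b
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(p_value < 0.05)
```

Without (approximate) normality of the errors, the t-distribution no longer applies and the p-value loses its interpretation.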
Backward elimination in linear regression involves removing features based on what criterion?
The feature with the highest correlation with the target variable
The feature with the lowest p-value
The feature that contributes the least to multicollinearity
The feature that results in the smallest decrease in model performance
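A minimal sketch of the criterion named in the last option: at each step, drop the feature whose removal degrades the fit (here, raises the residual sum of squares) the least, and stop when every candidate removal would hurt too much. The data, the `tol` threshold, and the helper names are assumptions for this illustration; real implementations more often use p-values or an information criterion.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares of an OLS fit (intercept included)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def backward_eliminate(X, y, tol=2.0):
    """Repeatedly drop the feature whose removal increases SSE the least,
    stopping once any removal would raise SSE by more than tol."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        base = sse(X[:, keep], y)
        # SSE increase from dropping each remaining feature
        trials = [(sse(X[:, [k for k in keep if k != j]], y) - base, j) for j in keep]
        smallest_increase, victim = min(trials)
        if smallest_increase > tol:
            break
        keep.remove(victim)
    return keep

# Hypothetical data: only columns 0 and 1 actually drive y.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)
selected = backward_eliminate(X, y)
print(selected)  # typically keeps [0, 1]
```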
What is the purpose of the coefficient of determination (R-squared) in linear regression?
To assess the linearity assumption of the model.
To identify the presence of outliers in the data.
To determine the statistical significance of the model.
To measure the proportion of variation in the dependent variable explained by the independent variable(s).
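The definition in the last option, R² = 1 - SS_res / SS_tot, can be computed in a few lines of numpy (the toy data are assumed for illustration):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot: proportion of variance in y explained by the model."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])
print(r_squared(y, y))                     # 1.0: all variation explained
print(r_squared(y, np.full(4, y.mean())))  # 0.0: no better than predicting the mean
```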
What does a correlation coefficient of 0 indicate?
A strong positive linear relationship
A strong negative linear relationship
A perfect linear relationship
No linear relationship
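A quick numpy check of the endpoints, plus the subtlety behind the correct option: r = 0 means no *linear* relationship, which a strongly nonlinear one can still satisfy. The data here are constructed for illustration.

```python
import numpy as np

x = np.arange(10, dtype=float)
r_pos = np.corrcoef(x, 2 * x + 1)[0, 1]   # exact increasing line: r = 1
r_neg = np.corrcoef(x, -3 * x + 5)[0, 1]  # exact decreasing line: r = -1

# A symmetric quadratic relationship yields r = 0 even though
# the variables are perfectly (nonlinearly) related.
x2 = np.arange(-5, 6, dtype=float)
r_zero = np.corrcoef(x2, x2 ** 2)[0, 1]
print(round(r_pos, 3), round(r_neg, 3), round(r_zero, 3))  # 1.0 -1.0 0.0
```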
What is the primary goal of feature selection in linear regression?
Introduce bias into the model
Maximize the number of features used in the model
Improve the model's interpretability and reduce overfitting
Increase the complexity of the model
Can the R-squared value be negative?
No, it is always positive.
Yes, if there is a perfect negative correlation between the variables.
No, it always ranges between 0 and 1.
Yes, if the model fits the data worse than a horizontal line.
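The last option can be demonstrated directly: when SS_res exceeds SS_tot (the model does worse than always predicting the mean), R² goes below zero. The deliberately backwards predictions below are invented for illustration.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
bad_pred = np.array([5.0, 4.0, 3.0, 2.0, 1.0])  # predicts the trend backwards

ss_res = np.sum((y - bad_pred) ** 2)  # 40.0
ss_tot = np.sum((y - y.mean()) ** 2)  # 10.0
r2 = 1 - ss_res / ss_tot
print(r2)  # -3.0: fits worse than a horizontal line at the mean
```

Note this requires a genuinely bad model; ordinary least squares fitted with an intercept on its own training data cannot produce a negative R².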
What does a high R-squared value indicate?
The model is a perfect fit for the data.
The independent variables are not correlated with the dependent variable.
A large proportion of the variance in the dependent variable is explained by the independent variables.
The model is not a good fit for the data.
How does the Mean Squared Error (MSE) penalize larger errors compared to smaller errors?
It squares the errors, giving more weight to larger deviations.
It doesn't; all errors are penalized equally.
It uses a logarithmic scale to compress larger errors.
It takes the absolute value of the errors, ignoring the sign.
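The squaring effect is easy to see by comparing MSE with mean absolute error on two error sets that have the same total magnitude (toy numbers, chosen for illustration):

```python
import numpy as np

errors_small = np.array([1.0, 1.0, 1.0, 1.0])  # four errors of 1
errors_large = np.array([4.0, 0.0, 0.0, 0.0])  # one error of 4, same total

mae_small = np.mean(np.abs(errors_small))
mae_large = np.mean(np.abs(errors_large))
mse_small = np.mean(errors_small ** 2)
mse_large = np.mean(errors_large ** 2)

print(mae_small, mae_large)  # 1.0 1.0 -> MAE treats both cases the same
print(mse_small, mse_large)  # 1.0 4.0 -> squaring makes the single large error dominate
```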
What type of visualization tool is commonly used to initially assess the relationship between two continuous variables in linear regression?
Bar chart
Scatter plot
Pie chart
Histogram