Which assumption of linear regression ensures that the relationship between the independent and dependent variables is linear?
Independence
Homoscedasticity
Linearity
Normality of errors
Feature selection in linear regression primarily aims to:
Improve model performance and generalization by focusing on the most relevant predictors
Increase the number of features used for prediction
Make the model more complex and harder to interpret
Ensure that all features have a statistically significant p-value
Who is credited with developing the foundational principles of linear regression?
Isaac Newton
Marie Curie
Sir Francis Galton
Albert Einstein
What is the purpose of the coefficient of determination (R-squared) in linear regression?
To determine the statistical significance of the model.
To assess the linearity assumption of the model.
To identify the presence of outliers in the data.
To measure the proportion of variation in the dependent variable explained by the independent variable(s).
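R-squared can be computed directly from the residuals of a fitted line. A minimal sketch with hypothetical data (the numbers below are illustrative, not from any dataset in this quiz):

```python
import numpy as np

# Hypothetical data roughly following y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit a simple least-squares line y_hat = b0 + b1 * x
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ss_res = np.sum((y - y_hat) ** 2)        # variation left unexplained by the fit
ss_tot = np.sum((y - np.mean(y)) ** 2)   # total variation in y
r_squared = 1 - ss_res / ss_tot          # proportion of variation explained
print(round(r_squared, 3))
```

Because the sample data is nearly a straight line, the printed R-squared is close to 1.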
What type of visualization tool is commonly used to initially assess the relationship between two continuous variables in linear regression?
Pie chart
Bar chart
Histogram
Scatter plot
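A scatter plot of the two variables is usually the first diagnostic. A sketch using matplotlib with randomly generated (hypothetical) data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data with a roughly linear relationship plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2 * x + rng.normal(0, 1, 50)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xlabel("independent variable")
ax.set_ylabel("dependent variable")
ax.set_title("Initial check of the relationship")
fig.savefig("scatter.png")
```

A roughly linear cloud of points in the plot supports proceeding with a linear model; curvature or funnels suggest violated assumptions.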
In forward selection, what criterion is typically used to decide which feature to add at each step?
The feature that results in the largest improvement in model performance
The feature with the highest p-value
The feature that is least correlated with the other features
The feature that results in the smallest increase in R-squared
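Forward selection can be sketched as a greedy loop: at each step, fit the model with each remaining candidate feature and keep the one that most improves performance (here measured by R-squared on hypothetical synthetic data, with a simple improvement threshold as the stopping rule):

```python
import numpy as np

# Hypothetical data: only features 0 and 2 actually drive y
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=n)

def r_squared(cols):
    """R-squared of an OLS fit of y on the chosen columns (plus intercept)."""
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

selected, remaining = [], {0, 1, 2, 3}
while remaining:
    # Add the candidate giving the largest improvement in model performance
    best = max(remaining, key=lambda c: r_squared(selected + [c]))
    if r_squared(selected + [best]) - r_squared(selected) < 0.01:
        break  # stop when no candidate improves the fit meaningfully
    selected.append(best)
    remaining.discard(best)

print(selected)  # features in the greedy order they were added
```

Backward elimination would run the same loop in reverse: start from all four features and drop, at each step, the one whose removal hurts performance least.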
If a Durbin-Watson test statistic is close to 2, what does it suggest about the residuals?
They are normally distributed
They exhibit a linear pattern
They are homoscedastic
They are independent
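The Durbin-Watson statistic can be computed directly from a residual series as DW = Σ(e_t − e_{t−1})² / Σe_t². A sketch on hypothetical, independently drawn residuals:

```python
import numpy as np

# Hypothetical residuals drawn independently, so no serial correlation
rng = np.random.default_rng(42)
residuals = rng.normal(size=500)

# Durbin-Watson: squared successive differences over squared residuals
dw = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)
print(round(dw, 2))  # close to 2 for uncorrelated residuals
```

Values well below 2 indicate positive autocorrelation and values well above 2 indicate negative autocorrelation; only a statistic near 2 is consistent with independent residuals.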
What is the main difference between forward selection and backward elimination in linear regression?
Forward selection starts with no features and adds one by one, while backward elimination starts with all features and removes one by one.
There is no difference; both techniques achieve the same outcome.
Forward selection is used for classification, while backward elimination is used for regression.
Forward selection starts with all features and removes one by one, while backward elimination starts with no features and adds one by one.
Why is normality of errors an important assumption in linear regression?
It is necessary for the calculation of the regression coefficients
It ensures the linearity of the relationship between variables
It validates the use of hypothesis testing for the model's coefficients
It guarantees the homoscedasticity of the errors
Which method in pandas is used to read a CSV file containing the dataset for Linear Regression?
load()
loadtxt()
read_csv()
from_csv()
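pandas.read_csv parses CSV input into a DataFrame; it accepts a file path or any file-like object. A self-contained sketch using a small in-memory CSV (the column names here are illustrative):

```python
import io
import pandas as pd

# Small hypothetical CSV; in practice you would pass a path such as
# pd.read_csv("data.csv") instead of a StringIO buffer.
csv_text = "x,y\n1,2.1\n2,3.9\n3,6.2\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # → (3, 2)
print(list(df.columns))  # → ['x', 'y']
```

The resulting DataFrame is the usual starting point for fitting a linear regression: one column becomes the dependent variable, the rest the predictors.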