Which matplotlib function is commonly used to plot the regression line along with the scatter plot of the data?
hist()
show()
scatter()
plot()
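For context, a minimal sketch of how a scatter plot and a fitted line are usually combined in one figure. The data values are made up for illustration, the line is fitted with NumPy's polyfit rather than any method named in this quiz, and the Agg backend is assumed so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up illustrative data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Degree-1 least-squares fit; polyfit returns the highest power first
slope, intercept = np.polyfit(x, y, 1)

fig, ax = plt.subplots()
ax.scatter(x, y, label="data")  # the raw observations as points
ax.plot(x, slope * x + intercept, color="red", label="fitted line")  # the line itself
ax.legend()
```

In an interactive session, plt.show() would then display the figure.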
What does the 'fit_intercept' parameter of scikit-learn's 'LinearRegression()' control?
Whether to calculate the intercept (bias) of the line.
Whether to normalize the data before fitting.
Whether to use gradient descent for optimization.
Whether to calculate the slope of the line.
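A small sketch of the parameter in action, assuming scikit-learn's LinearRegression and made-up data generated exactly from y = 2x + 3. It fits the same data twice so the two settings can be compared.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on the line y = 2x + 3
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X.ravel() + 3.0

# Default behaviour: slope and intercept are both estimated
with_intercept = LinearRegression(fit_intercept=True).fit(X, y)

# The line is forced through the origin; intercept_ is set to 0.0
no_intercept = LinearRegression(fit_intercept=False).fit(X, y)

print("fit_intercept=True :", with_intercept.coef_, with_intercept.intercept_)
print("fit_intercept=False:", no_intercept.coef_, no_intercept.intercept_)
```

With fit_intercept=False the slope has to absorb the missing offset, so it no longer matches the slope used to generate the data.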
Backward elimination in linear regression involves removing features based on what criterion?
The feature with the highest correlation with the target variable
The feature that contributes the least to multicollinearity
The feature with the lowest p-value
The feature that results in the smallest decrease in model performance
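One way to sketch backward elimination, under stated assumptions: performance is measured by in-sample R-squared, and the stopping threshold max_drop and the helper names are hypothetical. Other variants rank features by p-value instead. The data are synthetic, with a third column of pure noise.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an ordinary least-squares fit with an intercept column."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def backward_eliminate(X, y, names, max_drop=0.01):
    """Repeatedly drop the feature whose removal costs the least R-squared.

    Stops when even the cheapest removal would cost more than max_drop
    (a hypothetical threshold chosen for this sketch).
    """
    kept = list(range(X.shape[1]))
    while len(kept) > 1:
        current = r_squared(X[:, kept], y)
        # Score every candidate model that has exactly one feature removed
        scores = {j: r_squared(X[:, [k for k in kept if k != j]], y) for j in kept}
        weakest = max(scores, key=scores.get)
        if current - scores[weakest] > max_drop:
            break
        kept.remove(weakest)
    return [names[k] for k in kept]

# Synthetic data: y depends on the first two columns only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

selected = backward_eliminate(X, y, ["x0", "x1", "noise"])
print(selected)
```

In practice the score would normally come from held-out data or cross-validation rather than the training fit used here.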
Which of the following is NOT a benefit of feature selection in linear regression?
Improved model interpretability
Reduced computational cost
Increased risk of overfitting
Potential for better generalization to new data
If the coefficient of determination (R-squared) for a linear regression model is 0.8, what does this indicate?
80% of the variation in the dependent variable is explained by the independent variable.
The model is a poor fit for the data.
There is a weak relationship between the independent and dependent variables.
20% of the variation in the dependent variable is explained by the independent variable.
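As a worked illustration, R-squared can be computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The values below are made up so the arithmetic is easy to follow. The predictions are not taken from a fitted model.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot

# Made-up observations and predictions
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [2.0, 2.0, 3.0, 4.0, 4.0]

score = r_squared(y_true, y_pred)
print(score)
```

Here SS_tot is 10 and SS_res is 2, so the score is 1 - 2/10.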
What does the linearity assumption in linear regression imply?
The data points are evenly distributed around the regression line.
The independent variables are unrelated to each other.
The relationship between the dependent and independent variables can be best represented by a straight line.
The dependent variable must have a normal distribution.
Which of the following is the general equation for a simple linear regression model?
y = e^(b0 + b1*x)
y = b0 + b1*x1 + b2*x2 + ... + bn*xn
y = b0 + b1*x + e
y = b0 * x^b1
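For reference, a minimal sketch of estimating b0 and b1 for a one-predictor model with the closed-form least-squares formulas. The function name and the data are illustrative assumptions. The data are noise-free so the estimates can be checked by hand.

```python
def fit_simple_linear(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Made-up data generated from y = 1 + 2x with no error term
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_simple_linear(xs, ys)

# The error term e is estimated by the residuals
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(b0, b1, residuals)
```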
What is a potential drawback of using a purely automated feature selection technique (like forward selection or backward elimination) without careful consideration?
It can lead to models that are less accurate than using all available features.
It guarantees the most interpretable model.
It can sometimes overlook features that might be important in combination with others.
It completely eliminates the need for domain expertise in model building.
What does the assumption of independence in linear regression refer to?
Independence between the independent and dependent variables
Independence between the errors and the dependent variable
Independence between the observations
Independence between the coefficients of the regression model
Which pandas function is used to read a CSV file containing the dataset for linear regression?
loadtxt()
read_csv()
load()
from_csv()
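A short sketch of loading a CSV dataset into a DataFrame and splitting it into features and target. The column names and values are made up. An in-memory string stands in for a file on disk so the sketch is self-contained. A file path can be passed in the same position.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a dataset file
csv_text = "hours,score\n1,52\n2,58\n3,65\n"

df = pd.read_csv(io.StringIO(csv_text))

X = df[["hours"]]  # 2-D feature table, the shape most estimators expect
y = df["score"]    # 1-D target column
print(df.head())
```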