R10 Practice: Simple Linear Regression

Syllabus Coverage

  • Describe a simple linear regression model, how the least squares criterion is used to estimate regression coefficients, and the interpretation of these coefficients.
  • Explain the assumptions underlying the simple linear regression model, and describe how residuals and residual plots indicate if these assumptions may have been violated.
  • Calculate and interpret measures of fit and formulate and evaluate tests of fit and of regression coefficients in a simple linear regression.
  • Describe the use of analysis of variance (ANOVA) in regression analysis, interpret ANOVA results, and calculate and interpret the standard error of estimate in a simple linear regression.
  • Calculate and interpret the predicted value for the dependent variable, and a prediction interval for it, given an estimated linear regression model and a value for the independent variable.
  • Describe different functional forms of simple linear regressions.

Q1.

Ordinary least squares (OLS) estimates the intercept and slope coefficients of the regression model by minimizing the sum of squared errors (SSE). SSE refers to:

A. the sum of squared vertical distances between the observations of the dependent variable and the regression line.

B. the sum of squared differences between the predicted value of the dependent variable by the regression and the mean value of the dependent variable.

C. total sum of squares.
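
As a minimal sketch of the quantities named in the options above, the following Python snippet fits an OLS line to a small made-up data set and computes the three sums of squares, so that the decomposition SST = SSR + SSE can be checked numerically. The x and y values are hypothetical and chosen only for illustration; they do not come from this question set.

```python
# Hypothetical toy data, for illustration only (not from the question set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values on the regression line.
y_hat = [b0 + b1 * xi for xi in x]

# SSE: squared vertical distances between observed Y and the fitted line.
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
# SSR: squared distances between fitted values and the mean of Y.
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
# SST: squared distances between observed Y and the mean of Y.
sst = sum((yi - y_bar) ** 2 for yi in y)

print(f"b0={b0:.4f}, b1={b1:.4f}")
print(f"SSE={sse:.4f}, SSR={ssr:.4f}, SST={sst:.4f}")
```

OLS chooses b0 and b1 so that no other straight line gives a smaller SSE for the same data.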


Q2.

An analyst is investigating the relationship between a company’s net profit margin and its research and development (R&D) expenditures. The analyst gathers data for 11 companies within an industry and calculates two ratios for each: the ratio of R&D expenditures to revenues (RDR) and the net profit margin (NPM). The objective is to explain the variation in NPM across companies using the variation in RDR. The covariance between NPM and RDR is -0.00048, and the standard deviations of NPM and RDR are 0.00069 and 0.01613, respectively. What is the slope coefficient for this linear regression model?

A. -1.84.

B. -0.03.

C. -0.7.
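
One way to check the arithmetic is the standard identity that the OLS slope equals the covariance of X and Y divided by the variance of X. The sketch below applies it to the figures quoted in Q2, treating RDR as the independent variable; the function name `ols_slope` is a hypothetical helper introduced for illustration.

```python
def ols_slope(cov_xy: float, std_x: float) -> float:
    """Slope of Y on X: Cov(X, Y) divided by the variance of X."""
    return cov_xy / std_x ** 2


# Inputs as stated in Q2: X is RDR (independent), Y is NPM (dependent).
cov_npm_rdr = -0.00048
std_rdr = 0.01613

slope = ols_slope(cov_npm_rdr, std_rdr)
print(round(slope, 2))
```

Note that the standard deviation of the dependent variable (NPM) is not needed for the slope; it would only enter a correlation calculation.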


Q3.

Gary, a quantitative analyst, is using the following simple linear regression to find the relationship between a firm’s return on equity (ROE) and debt-to-equity (D/E) ratio:

where ROE is stated in percentages and D/E is in decimals.

If D/E is 2, the predicted value of the firm’s ROE is ___; if D/E changes from 1.5 to 2, the change in ROE is ___; and if D/E is 4 and the actual ROE is 8.2%, the residual is ___.

A. 2.3%, 4.25%, 0.2%.

B. 5%, 0.75%, -0.2%.

C. 5%, 0.75%, 0.2%.


Q4.

Erik and Eric are discussing some topics about simple linear regression. Erik states that there are four key assumptions and presents these assumptions as follows:

Assumption I: The relationship between the dependent variable and the independent variable is linear.

Assumption II: The variance of the error term is the same for all observations.

Assumption III: The residuals are correlated across observations.

Assumption IV: Both the error term and the explanatory variable are normally distributed.

Which assumptions are correct?

A. I and III.

B. I and II.

C. II and IV.


Q5.

Which of the following statements is not an assumption of a simple linear regression model?

A. Homoskedasticity: The variance of the regression residuals is the same across the observations.

B. Dependence: The variables, Y and X, are dependent on one another.

C. Normality: The regression residuals are normally distributed.


Q6.

The analysis of variance (ANOVA) of a simple linear regression is presented in the following table:

Source       df   Sum of Squares   Mean Square
Regression    1            0.068         0.068
Residual     25            0.025         0.001
Total        26            0.093

The coefficient of determination is closest to:

A. 68.

B. 0.27.

C. 0.73.
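
The coefficient of determination can be read straight off an ANOVA table as the regression sum of squares divided by the total sum of squares. The following is a small sketch using the sums of squares from the Q6 table; the helper name `coefficient_of_determination` is an assumption made for illustration.

```python
def coefficient_of_determination(ssr: float, sst: float) -> float:
    """R-squared: explained variation (SSR) over total variation (SST)."""
    return ssr / sst


# Sums of squares from the Q6 ANOVA table.
ssr = 0.068          # regression (explained) sum of squares
sse = 0.025          # residual (unexplained) sum of squares
sst = ssr + sse      # total sum of squares

r2 = coefficient_of_determination(ssr, sst)
print(f"R^2 = {r2:.4f}")
```

Equivalently, R² = 1 - SSE/SST, which gives the same value because SST = SSR + SSE.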


Q7.

Which of the following statements is least correct about the coefficient of determination?

A. The coefficient of determination measures the percentage of the variation in the dependent variable that is explained by the independent variable.

B. The minimum and maximum values of the coefficient of determination are 0 and 1, respectively.

C. The lower the coefficient of determination, the better the fit.


Q8.

An analyst runs a simple linear regression to explain the price movements of a manufacturing company by using monthly outputs over the past 60 months. The regression result shows that the regression sum of squares is 0.0562, and the sum of squared errors is 0.0229. The F-statistic is closest to:

A. 142.3406.

B. 144.7948.

C. 147.2489.
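
The F-statistic for overall fit is the mean square regression divided by the mean square error, where each mean square is a sum of squares divided by its degrees of freedom (k for the regression, n - k - 1 for the error). A minimal sketch with the Q8 inputs follows; `f_statistic` is a hypothetical helper, and k = 1 reflects the single independent variable of a simple linear regression.

```python
def f_statistic(ssr: float, sse: float, n: int, k: int = 1) -> float:
    """F = MSR / MSE for a regression with k independent variables."""
    msr = ssr / k              # mean square regression
    mse = sse / (n - k - 1)    # mean square error
    return msr / mse


# Q8 inputs: 60 monthly observations, one independent variable.
f_stat = f_statistic(ssr=0.0562, sse=0.0229, n=60)
print(f"F = {f_stat:.4f}")
```

The same function applies to any ANOVA table in this set once n is recovered from the total degrees of freedom (n - 1).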


Q9.

An economist is analyzing the relationship between household income and household expenditure. The estimation results, based on 400 observations, are provided below.

                    Coefficient   Standard Error   t-Statistic   p-value
Intercept              380.5269         212.3630      1.791870    0.1109
Household incomes      0.484532         0.032382      14.96298    0.0000

Which of the following should the economist conclude?

A. The average household income is 380.5269.

B. The estimated slope coefficient is different from one at the 0.05 level of significance.

C. The household incomes explain 48.45% of the variation in household expenditures.
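
A slope coefficient can be tested against any hypothesized value with t = (estimate - hypothesized value) / standard error, using n - 2 degrees of freedom. The sketch below applies this to the Q9 slope for hypothesized values of 0 and 1. The critical value of roughly 1.97 for 398 degrees of freedom is an assumption supplied here for illustration; it is not given in the question.

```python
def slope_t_stat(estimate: float, hypothesized: float, std_err: float) -> float:
    """t-statistic for H0: slope equals the hypothesized value."""
    return (estimate - hypothesized) / std_err


# Slope estimate and standard error from the Q9 exhibit.
b1_hat = 0.484532
se_b1 = 0.032382

t_vs_zero = slope_t_stat(b1_hat, 0.0, se_b1)
t_vs_one = slope_t_stat(b1_hat, 1.0, se_b1)

# Assumed two-tailed 5% critical value for 400 - 2 = 398 degrees of freedom.
t_crit = 1.97

for label, t in (("H0: b1 = 0", t_vs_zero), ("H0: b1 = 1", t_vs_one)):
    decision = "reject" if abs(t) > t_crit else "fail to reject"
    print(f"{label}: t = {t:.3f} -> {decision}")
```

The t-statistic printed in a regression exhibit is always the test against zero; any other hypothesized value has to be computed by hand as above.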


Q10.

An analyst collects 200 observations and runs a simple linear regression to forecast the return of a beverage company, Shining Star, using monthly CPI. The estimated regression parameters are b₀ = 2.33% and b₁ = 0.45, where b₀ and b₁ are the estimated intercept and slope, respectively. The R² of the regression is 49%. To test whether the correlation between return and CPI is equal to zero, the test statistic is closest to:

A. -13.79.

B. 13.79.

C. 13.83.
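
In a simple linear regression the correlation is the square root of R², carrying the sign of the slope, and the test of zero correlation uses t = r√(n - 2) / √(1 - r²). The following is a sketch under those standard relationships with the Q10 inputs; `correlation_t_stat` is a hypothetical helper name.

```python
import math


def correlation_t_stat(r: float, n: int) -> float:
    """t-statistic for H0: population correlation equals zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Q10 inputs.
r2 = 0.49      # R-squared of the regression
slope = 0.45   # estimated slope; its sign gives the sign of r
n = 200

# In simple linear regression, |r| = sqrt(R^2) and sign(r) = sign(slope).
r = math.copysign(math.sqrt(r2), slope)

t_stat = correlation_t_stat(r, n)
print(f"r = {r:.2f}, t = {t_stat:.2f}")
```

Because the slope is positive, the correlation and therefore the test statistic are positive as well.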


Q11.

For a 0.05 level of significance, the critical F-value for the test of whether the simple linear regression model is a good fit is 7.71. Based on the Exhibit below, is the F-test significant at the 0.05 significance level?

Source       df   Sum of Squares   Mean Square
Regression    1            123.9         123.9
Residual      4             26.2          6.55
Total         5            150.1

A. Yes. With a calculated F-statistic of 18.92, we conclude that the slope of the model is different from zero.

B. No. With a calculated F-statistic of 18.92, we cannot conclude that the slope of the model is different from zero.

C. No. With a calculated F-statistic of 5.34, we cannot conclude that the slope of the model is different from zero.


Q12.

Teddy, a stock researcher, notices that the return on the stock of Ping An Bank is correlated with the bank’s EPS. He gets the following analysis of variance (ANOVA) table through linear regression analysis.

Source       df   Sum of Squares (SS)   Mean Square (MS)
Regression    1               0.04912             0.0492
Error        10               1.0528              0.1053
Total        11               1.1020

Based on the table, the standard error of estimate is closest to:

A. 0.3245.

B. 0.4216.

C. 1.1378.
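
The standard error of estimate is the square root of the mean square error, that is, SSE divided by its degrees of freedom (n - 2 in a simple linear regression). A short sketch with the error row of the Q12 table follows; `standard_error_of_estimate` is a hypothetical helper, and n = 12 is inferred from the total degrees of freedom of 11.

```python
import math


def standard_error_of_estimate(sse: float, n: int, k: int = 1) -> float:
    """SEE = sqrt(SSE / (n - k - 1)), the square root of the mean square error."""
    return math.sqrt(sse / (n - k - 1))


# Q12 table: error SS = 1.0528 with 10 degrees of freedom.
# Total df = n - 1 = 11, so n = 12 (inferred, not stated in the question).
see = standard_error_of_estimate(sse=1.0528, n=12)
print(f"SEE = {see:.4f}")
```

Taking the square root of SSE itself, rather than of SSE divided by its degrees of freedom, is the usual mistake to watch for.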


Q13.

Which of the following statements is the least correct about the standard error of the estimate (SEE)?

A. The standard error of the estimate is the square root of the sum of squared errors (SSE).

B. The smaller the SEE, the better the fit of the model.

C. SEE is an absolute measure of fit.


Q14.

An analyst analyzed the impact of China’s household income on consumer spending. He collected data on per capita disposable income (PCDI) and per capita consumption expenditure (PCCE) for 100 Chinese cities in 2020, ran a linear regression with PCCE as the dependent variable and PCDI as the independent variable, and obtained the following results:

              Coefficients   Standard Error   t-Statistic   p-Value
Intercept       1,353.2316         680.3423        1.9890    0.0001
PCDI (CNY)          0.7856           0.0264       29.7576    0.0000

Based on the regression results, if the per capita disposable income for a Chinese city is CNY 43,834, the predicted per capita consumption expenditure is closest to:

A. 35,789.2220.

B. 32,234.9873.

C. 48,187.2316.


Q15.

Gloria, a quantitative analyst, notices that the gross profit margin of a real estate developer is correlated with the GDP growth rate. Based on 20 observations, she conducts a simple linear regression using the gross profit margin as the dependent variable, and the GDP growth rate as the independent variable. The regression results are presented in the table below:

                  Coefficient   Standard Error
Intercept              0.0212            0.556
GDP Growth Rate        0.253             0.108

Notes:

  1. The absolute value of the critical value for the t-statistic with 18 degrees of freedom is 2.101 at the 5% level of significance.
  2. The standard error of the forecast (s_f) is 0.07324.

If the forecasted value of the GDP growth rate is 2%, the 95% prediction interval for the actual gross profit margin is closest to:

A. -0.1341 to 0.1490

B. -0.1276 to 0.1801

C. -0.0922 to 0.1573
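
A prediction interval is built in two steps: compute the point forecast from the estimated line, then add and subtract the critical t-value times the standard error of the forecast. The sketch below follows those steps with the Q15 inputs, reading the intercept as 0.0212 and the slope as 0.253 from the exhibit and entering the 2% growth rate as 0.02. The function name `prediction_interval` is an assumption made for illustration.

```python
def prediction_interval(b0: float, b1: float, x_f: float,
                        t_crit: float, s_f: float) -> tuple[float, float]:
    """Return (lower, upper) bounds: y_hat +/- t_crit * s_f."""
    y_hat = b0 + b1 * x_f          # point forecast from the fitted line
    half_width = t_crit * s_f      # margin around the point forecast
    return y_hat - half_width, y_hat + half_width


# Q15 inputs: coefficients from the exhibit, notes 1 and 2 for t_crit and s_f.
lower, upper = prediction_interval(
    b0=0.0212, b1=0.253, x_f=0.02, t_crit=2.101, s_f=0.07324
)
print(f"{lower:.4f} to {upper:.4f}")
```

The coefficient standard errors in the exhibit are not used here; the interval depends on the standard error of the forecast, which is supplied in the notes.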


Q16.

When illustrating the relative change in the dependent variable for a relative change in the independent variable, which of the following functional forms is the most appropriate?

A. The log-lin model

B. The lin-log model

C. The log-log model
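
The three functional forms differ only in which variable is log-transformed before running OLS. As a rough illustration, the sketch below fits each form to made-up data generated from Y = 3X², a hypothetical power-law relationship chosen so that the relative change in Y per relative change in X is exactly 2. The helper `ols` and the data are assumptions introduced for this example.

```python
import math


def ols(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical data from Y = 3 * X^2 (illustration only).
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * xi ** 2 for xi in x]

ln_x = [math.log(v) for v in x]
ln_y = [math.log(v) for v in y]

forms = {
    "log-lin": ols(x, ln_y),      # ln Y on X: relative change in Y per unit of X
    "lin-log": ols(ln_x, y),      # Y on ln X: unit change in Y per relative change in X
    "log-log": ols(ln_x, ln_y),   # ln Y on ln X: relative change per relative change
}

for name, (b0, b1) in forms.items():
    print(f"{name}: intercept={b0:.4f}, slope={b1:.4f}")
```

For this data the form that regresses ln Y on ln X recovers the exponent 2 as its slope, which is why that slope is commonly read as an elasticity.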


Q17.

Rui Wen is studying the relationship between the earnings per share (EPS) of companies and their capital expenditure (CAPEX). He collects a sample of 100 listed companies and runs a model as follows:

It is known that the slope coefficient is significantly different from zero at a 0.05 significance level. Which of the following statements is most likely correct?

A. The variation of the natural log of capital expenditure explains the variation of EPS.

B. The variation of EPS explains the variation of the natural log of capital expenditure.

C. The variation of capital expenditure explains the variation of EPS.


Q18.

Xin, an equity analyst, wants to figure out the relationship between the annual sales of ABC company and the annual GDP. He runs two models as follows:

Model 1        Model 2

Which of the following least likely provides evidence to support the conclusion that Model 2 fits the data better than Model 1?

A. The coefficient of determination (R²) of Model 2 is higher than that of Model 1.

B. The slope coefficient of Model 2 is higher than that of Model 1.

C. The F-statistic (for testing the overall model significance) of Model 2 is higher than that of Model 1.