On the backend: STAT-361 development summary

STAT 361 (Fall 2021)
Assignment 3
The assignment is due on Nov. 04 (Thursday) at 23:00 (Kingston, Ontario time). Please submit to
Crowdmark.
Guidelines for Preparing Solutions
For questions that need R coding, please include only the important R output and the necessary results in
the main text of your solutions. Present them in a clear and concise fashion (for example, tabulate models
and output).
Give descriptions and discussions for your important exploration and findings.
Put long code and output in an Appendix, at the end of EACH problem.
These Appendix sections will NOT be marked, but will be checked as evidence of your independent work.
Prepare your assignment solutions so that they are easy for the readers (in this case, TAs) to follow, without
having to search through lengthy code and output for your answers.

Consider the multiple regression model $Y = X\beta + \varepsilon$, where $\varepsilon \sim \mathrm{MVN}_n(0, \sigma^2 I)$. See descriptions of model
forms (1) and (2) in Chapter 4.
(a) Show that the residual vector is $r = (I - P)Y$, where $P = X(X^T X)^{-1}X^T$, and show that $I - P$ is also a
projection matrix.
(b) Let $U = (\hat{\beta}, r)^T$. Find the joint distribution of the random vector $U$. It may be helpful to notice that.
(c) Show that $\hat{\beta}$ and $r$ are independent.
Hint: For (b) and (c), properties of multivariate normal distribution may be useful.
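The algebra behind part (a) can be sketched as follows. This is only an outline, and it assumes that $\hat{\beta} = (X^T X)^{-1}X^T Y$ is the least squares estimator and that $X$ has full column rank so that $X^T X$ is invertible; neither assumption is stated explicitly in the problem.

```latex
\begin{align*}
r &= Y - X\hat{\beta} = Y - X(X^T X)^{-1}X^T Y = (I - P)Y, \\
P^2 &= X(X^T X)^{-1}X^T X(X^T X)^{-1}X^T = X(X^T X)^{-1}X^T = P, \\
(I - P)^2 &= I - 2P + P^2 = I - P, \qquad (I - P)^T = I - P^T = I - P.
\end{align*}
```

Idempotence and symmetry together are what make $I - P$ an orthogonal projection matrix.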
Consider the “Savings.txt” data posted. It is an economic dataset collected in 48 different countries. The
variable “sr” is the savings ratio (aggregate personal saving divided by disposable income). The variables
“pop15” and “pop75” are the percentages of the population under 15 and over 75, respectively. The variable “dpi”
is disposable income (per capita, in dollars), while the variable “ddpi” is the rate (percent) of change in
disposable income (per capita).
(a) Draw a scatter plot matrix for all the variables involved. Comment on the possible relationships between
variables, focusing on those that appear interesting to you.
(b) Fit a simple linear regression model with disposable income (“dpi”) as the response and the percentage of the population
under 15 (“pop15”) as the only covariate. Describe the model clearly. Report and interpret the fitted model:
is there a significant association between the variables, and is this what you would expect?
(c) Fit a regression model with the savings ratio (Y, “sr”) as the response and all other variables as the
covariates. Describe the model clearly, and report and discuss the fit of the model. Interpret the estimated
coefficient for the rate of change in disposable income.
(d) Is it reasonable to drop the covariate disposable income (“dpi”) from the model in (c)? Support your
answer with a test, describe the test procedure and results clearly; also calculate a confidence interval for
the regression coefficient for this covariate.
Added Note: Test at level 0.05, and construct a 95% confidence interval.
(e) Based on the model for (c), obtain a 95% prediction interval for the ratio of savings of a country with
$x = (20, 3.2, 2200, 2.1)^T$ for “pop15”, “pop75”, “dpi”, “ddpi” respectively.
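For reference, the usual form of a 95% prediction interval for a new response at covariate vector $x_0$ is sketched below. It assumes the model in (c) includes an intercept, so that $x_0 = (1, 20, 3.2, 2200, 2.1)^T$ and $k = 5$ columns in $X$; the symbols $x_0$, $\hat{y}_0$ and $\hat{\sigma}$ are notation introduced here, not taken from the assignment.

```latex
\[
\hat{y}_0 \pm t_{n-k,\,0.975}\,\hat{\sigma}\,\sqrt{1 + x_0^T (X^T X)^{-1} x_0},
\qquad \hat{y}_0 = x_0^T\hat{\beta},
\qquad \hat{\sigma}^2 = \frac{r^T r}{n - k}.
\]
```

The extra $1$ under the square root accounts for the error of the new observation itself; dropping it gives a confidence interval for the mean response instead.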
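Four objects are weighed 2 at a time on a spring balance. Denote the 4 unknown weights by $\beta_1, \dots, \beta_4$.
Six observations are made and are expressed in these forms:
$Y_1 = \beta_1 + \beta_2 + \varepsilon_1$,
$Y_2 = \beta_1 + \beta_3 + \varepsilon_2$,
$Y_3 = \beta_1 + \beta_4 + \varepsilon_3$,
$Y_4 = \beta_2 + \beta_3 + \varepsilon_4$,
$Y_5 = \beta_2 + \beta_4 + \varepsilon_5$,
$Y_6 = \beta_3 + \beta_4 + \varepsilon_6$.
Assume that $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, $i = 1, \dots, 6$.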
(a) Find expressions for the least squares estimators of $\beta_1, \dots, \beta_4$ (specify the expressions in terms of $Y_1, \dots, Y_6$).
(b) Find an expression for $\mathrm{Cov}(\hat{\beta})$ (specify the matrix entries, which may involve $\sigma^2$).
(c) Find expressions for the residuals (specify the expressions in terms of $Y_1, \dots, Y_6$).
(d) Create a small data set for this study, with $(Y_1, \dots, Y_6) = (5, 8, 6, 7, 10, 9)$. Use the lm() function in R to fit
the data. Check the results for (a), (b) and (c). Does the output from the lm() fit agree with the corresponding
calculation results for the data set based on the expressions you derived above?
(e) Explain how you will construct a 95% confidence interval for $\beta_1 + \beta_2$. We can still use the $t_{n-k}$ distribution.
Find the confidence interval for the given data.
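As an independent cross-check of the hand derivations (not a substitute for the lm() fit the assignment asks for), the normal equations for this weighing design can be solved exactly. The sketch below uses Python's standard library with exact fractions; the helper `solve` and all variable names are illustrative choices made here, and the model is assumed to have no intercept, matching the six equations as written.

```python
from fractions import Fraction

# Design matrix: row i marks which two objects were weighed together.
X = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
Y = [5, 8, 6, 7, 10, 9]  # data given in part (d)


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]


n, k = len(X), len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(k)]

# Least squares estimates, fitted values and residuals.
beta_hat = solve(XtX, XtY)
fitted = [sum(x * b for x, b in zip(row, beta_hat)) for row in X]
residuals = [y - f for y, f in zip(Y, fitted)]

# Columns of (X^T X)^{-1}, obtained by solving against each unit vector.
identity = [[int(i == j) for j in range(k)] for i in range(k)]
XtX_inv_cols = [solve(XtX, e) for e in identity]

# Variance factor c^T (X^T X)^{-1} c for the contrast beta_1 + beta_2.
c = [1, 1, 0, 0]
var_factor = sum(c[i] * XtX_inv_cols[j][i] * c[j] for i in range(k) for j in range(k))

# Unbiased estimate of sigma^2 on n - k degrees of freedom.
sigma2_hat = sum(r * r for r in residuals) / (n - k)

print("beta_hat   =", [str(b) for b in beta_hat])
print("residuals  =", [str(r) for r in residuals])
print("var_factor =", var_factor, " sigma2_hat =", sigma2_hat)
```

The interval in (e) then takes the form $(\hat{\beta}_1 + \hat{\beta}_2) \pm t_{n-k,\,0.975}\sqrt{\hat{\sigma}^2 \cdot \texttt{var\_factor}}$; the t quantile is not available in Python's standard library, so it is left to R's qt().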
