Full text link: https://tecdat.cn/?p=33178
- The density of a finite mixture distribution has the form
$$p(x) = \sum_{i=1}^{K} \pi_i f_i(x; \theta_i)$$
where fi(·) are the K component densities, and πi are mixing proportions. For fixed K, the EM algorithm (see lecture slides) can be
used to estimate the parameters, θi, πi, for i = 1, ..., K, from an iid
sample. In this question we will restrict to all component densities
being p-dimensional normal, with density
$$f(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\left\{-\tfrac{1}{2}(x-\mu)^{t}\Sigma^{-1}(x-\mu)\right\}$$
(a) Write an R function that uses the EM algorithm to find parameters which maximise the likelihood (or minimise the negative
log-likelihood) for a sample of size n from p(x), for a given choice
of K. The function prototype should be
em.norm(x,means,covariances,mix.prop)
where x is an n × p matrix of data, means, covariances, and
mix.prop are the initial values for the K mean vectors, covariance matrices and mixing proportions. Consider including arguments, with sensible defaults, for the convergence criterion and
the maximum number of iterations.
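A minimal sketch of such a function, using mvtnorm's dmvnorm() for the component densities and taking means and covariances as lists of length K; these representation choices and the defaults are illustrative, not a reference solution:

library(mvtnorm)  # for dmvnorm()

em.norm <- function(x, means, covariances, mix.prop,
                    tol = 1e-6, max.iter = 200) {
  x <- as.matrix(x)
  n <- nrow(x)
  K <- length(mix.prop)
  ll.old <- -Inf
  for (iter in 1:max.iter) {
    # E-step: weighted component densities and posterior responsibilities
    dens <- sapply(1:K, function(k)
      mix.prop[k] * dmvnorm(x, mean = means[[k]], sigma = covariances[[k]]))
    ll <- sum(log(rowSums(dens)))
    post <- dens / rowSums(dens)
    # M-step: update mixing proportions, means, and covariances
    nk <- colSums(post)
    mix.prop <- nk / n
    for (k in 1:K) {
      means[[k]] <- colSums(post[, k] * x) / nk[k]
      xc <- sweep(x, 2, means[[k]])  # centre at the new mean
      covariances[[k]] <- crossprod(xc * post[, k], xc) / nk[k]
    }
    if (abs(ll - ll.old) < tol) break  # converged on the log-likelihood
    ll.old <- ll
  }
  list(means = means, covariances = covariances, mix.prop = mix.prop,
       loglik = ll, iterations = iter)
}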
(b) This question will use the first two columns of the object synth.te
in the MASS library:
x <- synth.te[,-3]
For K = 2, 3, 4, 5, 6, use your function to compute the maximum
likelihood estimates for the finite mixture of normal distributions,
for these data. Select initial parameters either randomly, or by
selecting from a plot of the data.
i. Construct a table that reports, for each choice of K, the
maximised likelihood, and the AIC.
ii. On the basis of this table, which choice of K provides the
best density estimate? For this choice, construct a contour
plot of the estimated density, along with the data.
iii. Briefly discuss any problems you anticipate using the EM
algorithm for computing a mixture model with more components, or in higher dimensions.
#b.r
install.packages("MASS")
library(MASS)
install.packages("EMCluster")
library(EMCluster)
Y = synth.te[, c(1:2)]
plot(Y[,1], Y[,2])  # scatter plot of the two variables of Y
# K = 2: from the plot above, splitting the sample into two clusters suggests initial mean values of c(-0.5, 0.3) and c(0.4, 0.5)
# Create starting values
mustart = rbind(c(-0.5,0.3),c(0.4,0.5)) # must be at least slightly different
covstart = list(cov(Y), cov(Y))
probs = c(.01, .99)
library(ggplot2)
# one way to produce the fitted object `ret` used below: EMCluster's
# emcluster(), with covstart converted to its lower-triangular (vech) LTSigma layout
LTstart = t(sapply(covstart, function(S) S[lower.tri(S, diag = TRUE)]))
ret = emcluster(Y, emobj = list(pi = probs, Mu = mustart, LTSigma = LTstart))
test = assign.class(Y, ret)  # fitted cluster labels, in test$class
qplot(x = xs, y = ys, data = Y)
ggplot(Y, aes(x = xs, y = ys)) +
  geom_point(aes(color = factor(test$class)))
em.aic(x = Y, emobj = list(pi = ret$pi, Mu = ret$Mu, LTSigma = ret$LTSigma))  # AIC of the fitted model
# K = 3
probs = c(.1, .2, .7)
mustart = rbind(c(-0.7,0.3),c(-0.3,0.8),c(0.4,0.5)) # must be at least slightly different
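The same steps repeat for K = 3, ..., 6. One compact way to collect the table asked for in part (b) i., assuming EMCluster's init.EM() for automatic starting values and its llhdval log-likelihood field (an assumption on my part, not the article's own loop):

# log-likelihood and AIC for each K, as a small table
Ks = 2:6
aic.table = t(sapply(Ks, function(k) {
  fit = init.EM(Y, nclass = k)  # initialise and run EM for k components
  c(K = k, logL = fit$llhdval, AIC = em.aic(Y, emobj = fit))
}))
aic.table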
- Consider a two-class bivariate classification problem, with equal prior
probabilities and class-conditional densities given by
$$f(x, y \mid C_i) = 4\theta_i^2\, xy\, \exp\{-\theta_i(x^2 + y^2)\}, \qquad x, y > 0,$$
and θi > 0 for i = 1, 2. Note that this joint density is the product of
Rayleigh distributions.
(a) Write an R function that generates a random sample of size n
from class C1 and a random sample of size n for class C2. The
function should return both the feature vectors and the class
indicator. A function for generating Rayleigh distributed random
variables is available.
(b) Obtain an expression for the decision boundary for minimum error. Suppose we are interested in the situation where the decision
boundary for minimum error intersects with the midpoint of the
line connecting the sample mean vectors. Derive an expression
for θ1 and θ2 to satisfy this situation.
(c) Derive an expression in terms of θ1 and θ2 for the Bayes error rate.
Now, suppose θ2 = 1 and θ1 > θ2. Use the golden ratio search
algorithm developed in question 4 of project 1, to determine the
value of θ1 that gives a Bayes error rate of 15%. The solution
occurs in the interval [3, 10]. (Hint: The target function does
not have to be differentiable at the minimum for the golden ratio
search to work.)
(d) Write down a discriminant function for each class, treating the
parameter θi as unknown.
(e) Let θ1 = 4 and θ2 = 2. Construct a plot of the unconditional density, f(x, y) = p(C1)f(x, y | C1) + p(C2)f(x, y | C2), for the specified
parameter values. Obtain a sample of 50 observations from each
class. Add these data and the Bayes optimal decision boundary
to the plot.
(f) Derive the maximum likelihood estimators for the parameters of
each class, given a sample of size n from each class.
(g) Write two R functions, the first for computing the maximum
likelihood estimates in (f) from a set of data generated by the
function in (a), and the second for evaluating the discriminant
function for each class, using the maximum likelihood estimates
(the estimative discriminant function). Compute the discriminant scores for the data generated in (e) and estimate the error
rate of this classifier on this training data.
(h) Obtain a training sample of size n = 200 and a test sample of
size n = 10000, using the parameter values in part (e). Retain
these training and test samples for use in Questions 3 and 4.
Using these data sets, compute the training and test set error
rates for
i. the estimative version of the true model, using the functions
in part (g),
ii. Linear discriminant analysis,
iii. Quadratic discriminant analysis.
Provide a table of these error rates for the different models. Comment on the results.
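For parts (g) and (h), a compact sketch can be built from the pieces derived elsewhere; the part (f) MLE works out to θ̂ = 2n / Σj(xj² + yj²) (from setting the derivative of the log-likelihood to zero). The names theta.mle, disc, err, train, test below, and the use of MASS's lda()/qda() for ii. and iii., are my own choices rather than the article's:

library(MASS)
# MLE from part (f), for an n x 2 sample matrix s: theta.hat = 2n / sum(x^2 + y^2)
theta.mle = function(s) 2 * nrow(s) / sum(s^2)
# estimative discriminant score (see the part (d) form): plug theta.hat in
disc = function(s, theta.hat) 2 * log(theta.hat) - theta.hat * rowSums(s^2)
# training/test error for an lda() or qda() fit on data frames with
# columns x, y and a factor column class
err = function(fit, d) mean(predict(fit, d)$class != d$class)
# e.g. fit.lda = lda(class ~ x + y, data = train); err(fit.lda, test)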
a)
#rrayleigh.r
# inverse-CDF sampler for the Rayleigh density f(x) = 2*theta*x*exp(-theta*x^2)
rrayleigh = function(n, theta) {
  u = runif(n, 0, 1)
  f = sqrt(-2 * log(u)) / sqrt(2 * theta)  # equals sqrt(-log(u) / theta)
  return(f)
}
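The part (a) generator that returns both the feature vectors and the class indicator can be built on rrayleigh(); the name gen.sample and the data-frame layout are my own choices:

# n observations from each class; column `class` is the class indicator
gen.sample = function(n, theta1, theta2) {
  data.frame(x = c(rrayleigh(n, theta1), rrayleigh(n, theta2)),
             y = c(rrayleigh(n, theta1), rrayleigh(n, theta2)),
             class = factor(rep(c(1, 2), each = n)))
}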
c)
#theta.r
# target for the golden-ratio search over theta1 in [3, 10] (theta2 = 1):
# squared deviation of the Bayes error rate from 15%. This form of the
# target is a reconstruction grounded in the part (c) derivation (see the
# note below); the excerpt's own code is incomplete here.
theta = function(t1, t2 = 1) {
  cc = 2 * log(t1 / t2) / (t1 - t2)  # decision boundary in r^2 = x^2 + y^2
  err = 0.5 * exp(-t1 * cc) * (1 + t1 * cc) +
        0.5 * (1 - exp(-t2 * cc) * (1 + t2 * cc))
  (err - 0.15)^2
}
golden(theta)  # golden-ratio search from question 4 of project 1, to determine theta1
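The quantities in theta() follow from parts (b) and (c): equating the two class-conditional densities (the factor 4xy cancels) gives the minimum-error boundary
$$c = x^2 + y^2 = \frac{2\ln(\theta_1/\theta_2)}{\theta_1 - \theta_2},$$
and since X² and Y² are independent Exponential(θi) under class i, R² = X² + Y² ~ Gamma(2, θi), so that
$$P(\text{err}) = \tfrac{1}{2}\,e^{-\theta_1 c}(1 + \theta_1 c) + \tfrac{1}{2}\left[1 - e^{-\theta_2 c}(1 + \theta_2 c)\right],$$
with the region r² < c assigned to C1 because θ1 > θ2.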
d)
install.packages("VGAM")
library(VGAM)
#f.r
# VGAM's drayleigh() is parameterised by the scale b, where theta = 1/(2*b^2),
# so convert before evaluating the class-conditional densities
f = function(x, theta1, theta2) {
  b1 = 1 / sqrt(2 * theta1); b2 = 1 / sqrt(2 * theta2)
  1/2 * drayleigh(x[1], b1) * drayleigh(x[2], b1) -
  1/2 * drayleigh(x[1], b2) * drayleigh(x[2], b2)
}
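For part (d) itself, taking logs and dropping the class-independent term ln(4xy) (the priors are equal) gives the equivalent discriminant functions
$$g_i(x, y) = 2\ln\theta_i - \theta_i(x^2 + y^2), \qquad i = 1, 2,$$
which is the form used in the disc() sketch earlier.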
e)
#e.r
# helper: joint class-conditional density; VGAM's drayleigh() takes scale b = 1/sqrt(2*theta)
g = function(x, th) drayleigh(x[1], 1/sqrt(2*th)) * drayleigh(x[2], 1/sqrt(2*th))
f = function(x, theta1 = 4, theta2 = 2) 1/2 * g(x, theta1) - 1/2 * g(x, theta2)  # discriminant function
p = function(x, theta1 = 4, theta2 = 2) 1/2 * g(x, theta1) + 1/2 * g(x, theta2)  # unconditional density
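A sketch of the part (e) plot, using p() above and rrayleigh() from (a); the grid range and plotting choices are my own:

# contour of the unconditional density on a grid over the positive quadrant
gx = seq(0.01, 1.6, length.out = 100)
dens.grid = outer(gx, gx, Vectorize(function(a, b) p(c(a, b))))
contour(gx, gx, dens.grid, xlab = "x", ylab = "y")
# 50 observations from each class (theta1 = 4, theta2 = 2)
points(rrayleigh(50, 4), rrayleigh(50, 4), col = "red", pch = 19)
points(rrayleigh(50, 2), rrayleigh(50, 2), col = "blue", pch = 17)
# Bayes boundary: the quarter circle x^2 + y^2 = 2*log(theta1/theta2)/(theta1 - theta2)
r = sqrt(2 * log(4 / 2) / (4 - 2))
curve(sqrt(pmax(r^2 - x^2, 0)), from = 0, to = r, add = TRUE, lty = 2)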