Homework 2
EE425X – Machine Learning: A signal processing perspective
Logistic Regression and Gaussian Discriminant Analysis
In this homework we are going to apply Logistic Regression (LR) and Gaussian Discriminant Analysis
(GDA) for solving a two-class classification problem. The goal will be to implement both correctly and
figure out which one is better.
To do this, you will first "learn" the parameters for each case using the training data (as discussed in
class and available in the handouts). Then, you will apply it to test data and evaluate the performance as
explained below. The only change from the handout is that, for GDA, you need to assume that the
covariance matrix $\Sigma$ is diagonal.
1 Synthetic Data Generation
Generate your own training data first. To do this, we use the GDA model because that is the only one which
provides a generative model.
Generating Training data: Since we want to implement a two-class classification problem, let the class
labels $y^{(i)}$ take two possible values, 0 or 1 (for $i = 1, \dots, m$, i.e., we have $m$ training samples). These
are generated independently according to a Bernoulli model with probability $\phi$. Next, conditioned on
$y^{(i)}$, the features $x^{(i)} \in \mathbb{R}^{n \times 1}$ are generated independently from a Gaussian distribution with mean
$\mu_{y^{(i)}}$ and covariance matrix $\Sigma$. In other words, while generating $x^{(i)}$, use the same covariance matrix
$\Sigma$ for both classes, but pick two different means: $\mu_0$ as the $n$-dimensional mean vector for data from class
0 and $\mu_1$ as the $n$-dimensional mean vector for data from class 1. Do this for all $i = 1, 2, \dots, m$.
Generating Test data: Do the same as above, but now generate $m_{\text{test}} = m/5$ samples instead.
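The generation procedure above can be sketched in NumPy as follows. This is only an illustration: the function name `generate_gda_data`, the choice of a diagonal covariance for generation, and the particular values of the class probability, means, and variances are assumptions made for the example, not values prescribed by the assignment.

```python
import numpy as np

def generate_gda_data(m, phi, mu0, mu1, sigma_diag, rng):
    """Draw m labelled samples from the GDA model.

    Assumption: the shared covariance is diagonal, given by its
    diagonal entries sigma_diag (one variance per feature).
    """
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    std = np.sqrt(np.asarray(sigma_diag, dtype=float))
    # Labels: y = 1 with probability phi, y = 0 otherwise.
    y = (rng.random(m) < phi).astype(int)
    # Row i of `means` is mu1 if y[i] == 1, else mu0.
    means = np.where(y[:, None] == 1, mu1, mu0)
    # Features: class mean plus zero-mean Gaussian noise with the shared variances.
    X = means + rng.standard_normal((m, mu0.size)) * std
    return X, y

rng = np.random.default_rng(0)
n, m = 100, 20                 # sizes taken from the assignment text
mu0 = np.zeros(n)              # illustrative choice
mu1 = np.ones(n)               # illustrative choice
sigma_diag = np.ones(n)        # illustrative choice
X_train, y_train = generate_gda_data(m, 0.5, mu0, mu1, sigma_diag, rng)
X_test, y_test = generate_gda_data(m // 5, 0.5, mu0, mu1, sigma_diag, rng)
```

The training and test sets are drawn by two calls with the same parameters, so they follow the same distribution while remaining independent of each other.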
2 Learning parameters using training data, and then testing the method on test data
• Write code to estimate the parameters for Logistic Regression and for GDA. For how to do it, please
refer to the class handouts. GDA was covered recently in the Generative Learning Algorithms handout.
LR is covered in the first handout (Supervised Learning).
For LR, you need to write Gradient Descent code to estimate θ.
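A minimal sketch of batch gradient descent for LR is given here for orientation. It minimizes the average negative log-likelihood, which is equivalent to the gradient ascent on the log-likelihood usually presented for LR. The function names, the step size `lr`, the iteration count `n_iters`, the division by the number of samples, and the added intercept column are all assumptions of this sketch; follow the handout where it differs.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument to avoid overflow warnings in np.exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """Estimate theta by batch gradient descent on the average negative log-likelihood."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])      # prepend an intercept column
    theta = np.zeros(n + 1)
    for _ in range(n_iters):
        # Gradient of the average negative log-likelihood: X^T (h - y) / m.
        grad = Xb.T @ (sigmoid(Xb @ theta) - y) / m
        theta -= lr * grad
    return theta

def predict_logistic_regression(theta, X):
    """Predict 1 when the estimated P(y = 1 | x) is at least 0.5, else 0."""
    X = np.asarray(X, dtype=float)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return (Xb @ theta >= 0).astype(int)
```

With $n$ larger than $m$ the training data are typically linearly separable, so the entries of $\theta$ keep growing with more iterations; a fixed iteration budget, as used here, is one simple way to stop.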
For GDA, proceed as follows. The ONLY CHANGE from the handout is that we assume that Σ is
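DIAGONAL and thus use the following formulas:
$$\phi = \frac{1}{m} \sum_{i=1}^{m} 1(y^{(i)} = 1)$$
$$\mu_0 = \frac{\sum_{i=1}^{m} 1(y^{(i)} = 0)\, x^{(i)}}{\sum_{i=1}^{m} 1(y^{(i)} = 0)}, \qquad \mu_1 = \frac{\sum_{i=1}^{m} 1(y^{(i)} = 1)\, x^{(i)}}{\sum_{i=1}^{m} 1(y^{(i)} = 1)}$$
$$\Sigma_{jj} = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)}_j - (\mu_{y^{(i)}})_j \right)^2, \qquad j = 1, \dots, n,$$
while setting all non-diagonal entries of $\Sigma$ to be zero. Here, $1(w = c)$ is the indicator function that
evaluates to 1 when $w = c$ and 0 otherwise.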
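One way to implement the diagonal-covariance GDA estimates is sketched below. It uses the usual maximum-likelihood estimates, which are assumed here to match the handout. The class probability is the fraction of class-1 labels, each mean is the per-class sample average, and each diagonal variance is the average squared deviation of a feature from the mean of the sample's own class. The function name `fit_gda_diagonal` is hypothetical, and the sketch assumes both classes appear in the training set.

```python
import numpy as np

def fit_gda_diagonal(X, y):
    """Estimate phi, mu0, mu1 and the diagonal of Sigma for GDA.

    Assumes both classes occur at least once in y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    phi = float(np.mean(y == 1))              # fraction of class-1 samples
    mu0 = X[y == 0].mean(axis=0)              # mean of class-0 samples
    mu1 = X[y == 1].mean(axis=0)              # mean of class-1 samples
    # Subtract from each sample the mean of its own class.
    centered = X - np.where(y[:, None] == 1, mu1, mu0)
    # Per-feature average of squared deviations: the diagonal of Sigma.
    sigma_diag = np.mean(centered ** 2, axis=0)
    return phi, mu0, mu1, sigma_diag
```

Storing only the $n$ diagonal entries, rather than a full $n \times n$ matrix with zeros off the diagonal, keeps later computations cheap and avoids an explicit matrix inverse.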
Write code that uses the estimated parameters for each method, and then classifies the test data as
explained in the handout and in class. For GDA, we use Bayes rule for classification. For each input
query $x$, compute the output $\hat{y}(x)$ as
$$\hat{y}(x) = \arg\max_{y \in \{0, 1\}} p(y \mid x) = \arg\max_{y \in \{0, 1\}} p(x \mid y)\, p(y).$$
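A possible implementation of the Bayes-rule classifier for GDA compares the two class scores $\log p(x \mid y) + \log p(y)$. Because both classes share the same covariance, the Gaussian normalizing constant is identical for the two classes and can be dropped. The function name `predict_gda_diagonal`, the argument layout, and the tie-breaking toward class 0 are assumptions of this sketch.

```python
import numpy as np

def predict_gda_diagonal(X, phi, mu0, mu1, sigma_diag):
    """Classify each row of X with Bayes rule under diagonal-covariance GDA."""
    X = np.asarray(X, dtype=float)
    sigma_diag = np.asarray(sigma_diag, dtype=float)

    def log_likelihood(mu):
        # log p(x | y) up to an additive constant shared by both classes.
        return -0.5 * np.sum((X - mu) ** 2 / sigma_diag, axis=1)

    score0 = log_likelihood(np.asarray(mu0, dtype=float)) + np.log(1.0 - phi)
    score1 = log_likelihood(np.asarray(mu1, dtype=float)) + np.log(phi)
    # Pick the class with the larger posterior score; ties go to class 0.
    return (score1 > score0).astype(int)
```

Working with log scores instead of raw probabilities avoids numerical underflow, which matters when $n$ is large and the Gaussian density becomes a product of many small terms.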
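Evaluate accuracy: let us denote the test data as $D_{\text{test}}$. Report the accuracy of each method as
$$\text{Accuracy} = \frac{1}{|D_{\text{test}}|} \sum_{(x, y) \in D_{\text{test}}} 1(\hat{y}(x) = y),$$
where $\hat{y}(x)$ is the output of the classifier for input $x$. Also, $|D_{\text{test}}| = m_{\text{test}}$ is the number of test samples.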
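The accuracy measure, read as the fraction of test samples whose predicted label equals the true label, can be computed as in the short sketch below. The function name `accuracy` is a hypothetical helper.

```python
import numpy as np

def accuracy(y_pred, y_true):
    """Fraction of samples whose predicted label equals the true label."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    return float(np.mean(y_pred == y_true))

# Three of the four predictions match the true labels.
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```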
Use $n = 100$ and $m = 20$. This means that for estimating each entry of $\mu$ or $\Sigma$ you have 20 samples.
Generally speaking, we need on the order of $n^2$ samples to estimate all entries of $\Sigma$. However, since
in this homework we assume that $\Sigma$ is a diagonal matrix, on the order of $n$ samples suffice.
3 Real Data
Next, use the MNIST dataset to evaluate both approaches on real data. MNIST is a good database for people
who want to try learning techniques and pattern recognition methods on real-world data while spending
minimal effort on preprocessing and formatting. The MNIST database of handwritten digits has a training
set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from
NIST. The digits have been size-normalized and centered in a fixed-size image. The entire dataset can be
downloaded from here, but in this problem we only use samples corresponding to the two digits 0 and 9.
Use the code written in the previous part to classify the two digits 0 and 9 in MNIST using Logistic
Regression and Gaussian Discriminant Analysis. You have already written the code for part 2, so you should
not need to rewrite anything except what you provide as training and test data. This is what we want
to learn in this course: use simulated (synthetic) data to write and test code, make sure everything works
as expected, then use the same code on real data.
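Preparing the two-digit subset might look like the sketch below. It assumes the MNIST images and labels have already been loaded into NumPy arrays by whatever loader you use (loading is not shown). The helper name `select_two_digits`, the mapping of digit 0 to class 0 and digit 9 to class 1, and the scaling of pixel values from the 0 to 255 range down to the 0 to 1 range are assumptions of this example.

```python
import numpy as np

def select_two_digits(images, labels, neg_digit=0, pos_digit=9):
    """Keep only two digits and return flattened features with 0/1 labels.

    images: array of shape (num_samples, height, width) or (num_samples, num_pixels)
    labels: array of shape (num_samples,) holding the digit of each image
    Assumes at least one sample of the selected digits is present.
    """
    images = np.asarray(images)
    labels = np.asarray(labels)
    keep = (labels == neg_digit) | (labels == pos_digit)
    selected = images[keep]
    # Flatten each image into one feature vector and scale pixels to [0, 1].
    X = selected.reshape(selected.shape[0], -1).astype(float) / 255.0
    # Map neg_digit to class 0 and pos_digit to class 1.
    y = (labels[keep] == pos_digit).astype(int)
    return X, y
```

One practical caveat: many MNIST pixels (for example, near the image border) take the same value in every image, so their estimated variance is exactly zero and the GDA classifier would divide by zero. Adding a small positive constant to the diagonal variances is one common workaround, although the assignment does not specify it.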
Please report the final classification accuracy and discuss how the accuracy obtained on the real data
differs from that on the synthetic data.
4 What to turn in?
Submit a short report that discusses all of the above questions. Also submit your code with clear documentation.
Grading will be based on the quality of the report and the accuracy of the implemented code.