School of Computer Science
The University of Adelaide
Artificial Intelligence
Assignment 3
- Introduction
This assignment has two parts: Probability Graph Model (Section 2) and Part-of-Speech Tagging (Section 3). Both
UG and PG students need to finish both parts. In part one, you are expected to perform approximate inference on
a given PGM using C++/Java/Python and submit the codes to the Web Submission system. In part two, you are
expected to provide your work for the required computation in a pdf and submit the pdf in MyUni, Assignment 3. - Probability Graph Model (10 marks)
Your task is to perform approximate inference on a probabilistic graphical model (PGM) of Boolean random
variables using likelihood weighted sampling, as described in lecture. You will work on two networks (Burglar and
Fire). The networks and example queries are provided in the textual files attached on MyUni and will be described
in the later section.
2.1 The task
You need to write a program (C++/Java/Python) to parse the input file of network and to create and populate an
internal data structure of the PGM. The program also need to parse the input query file correctly and evaluate the
conditional probability distribution of the query variable, given the evidence. The result must be written as two
decimal values corresponding to the values of P(QueryVar=true|…) and P(QueryVar=false|…) onto the standard
output stream, separated by white space. For example:
0.872 0.128
Note that if you write anything else before these two numbers, automark will interpret that output as the answer,
almost certainly resulting in a test failure.
2.2 File Format
The format of the network file is:
N
rv0 rv1 … rvN-1 - 0 1 … 0
- 0 0 … 1
… - 1 1 … 0
mat0
mat1
…
matN-1
N is the number of random variables in the network;
rv* are the random variable names (arbitrary alphanumeric strings);
The matrix of zeros and ones specifies the directed arcs in the graph; a‘1’in the (i, j) entry indicates there
is an edge from i to j, so the i-th variable is a parent of the j-th variable.
mat are two dimensional arrays of real numbers (in ASCII) that specify the conditional probability table of
each random variable conditioned on its parents; If a node has m parents, then the matrix needs to specify
the probability of each outcome (true, false) conditioned on 2m different combinations of parent values, so
the matrix will be 2
m x 2 (rows x columns). Treating true as 1, and false as 0, concatenate the values of the
parents in their numerical order from most significant bit to least significant bit (left to right) to create a
row index r. The entry in the first column, r-th row is then the probability that the variable is true given the
values of the other variables (the entry in the corresponding 2nd column is the probability that the variable
is false). Thus, the first row of the matrix corresponds to all conditioning variables taken the value false (r
= 000… 0), and the last row has all conditioning variables true (r = 111 …1).
For example if the variable A has parents C and F where C is the 3rd variable specified and F is the 6th, then
C,F = 00, 01, 10, 11 correspnd to row r = 0, 1, 2, 3 of the table. The CPT entries for P(A|C,F) entries will be:
The format of the query file is:
P(rvQ | rvE1=val, rvE2=val, …)
where rvQ is the name of the query variable, and rvEx are the names of the evidence variables with their respective
true/false values specified.
2.3 Deliverables
Implement random forest for wine quality prediction in either C/C++, Java or Python.
C++. In the case of C/C++, you must supply a makefile (Makefile) with a rule called inference to compile your
program into a Linux executable named inference.bin. Your program must be able to be compiled and run as follows:
$ make inference
$ ./inference.bin [graphfile] [queryfile]
JAVA. In the case of JAVA, you must write your program in the file inference.java. Your program must be able to be
compiled and run as follows:
$ javac inference.java
$ java inference [graphfile] [queryfile]
Python. In the case of Python, write you program in the file inference.py. Your program must be able to be run as
follows:
$ python inference.py [graphfile] [queryfile]
At the moment, only an older version of Python (version 2.7.5) is supported on the school servers. If you are not
familiar enough with this version of Python, PLEASE DO NOT USE PYTHON for the assignment.
1.3 Expected run time
Your program must be able to terminate within 1 minutes on the server.
1.4 Web Submission Instructions
You must submit your program on the Computer Science Web Submission System. This means you must create the
assignment under your own SVN repository to store the submission files.
The SVN key for this submission is
2021/s1/ai/assignment3
The link to the Web Submission System used for this assignment is:
https://cs.adelaide.edu.au/se…
After submit your codes through SVN, navigate to 2021, Semester 1, Artificial Intelligence, Assignment 3 (for COMP
SCI 3007 7059 students). Then, click Tab“Make Submission” for this assignment and indicate that you agree to the
declaration. The automark script will then evaluate your codes.
1.5 Assessment
Your code will be compiled and run, testing its outputs for the two networks with several queries each. Two example
queries have been provided to you. Since the results are stochastic, the query answers will not be exact and will
vary from run to run. The answers will therefore be tested against a tolerance of ±1 (i.e. your answers must be
within 1% of the true values), so you must ensure convergence to this level of precision (by giving more samples).
If it passes all tests you will get 10% of the overall course mark. The objective of the tests is to check for the correct
operation of your implementation. Hence, the basis of the assessment is to compare your results against the
expected results.
Although it is an auto-assessment, I will randomly pick up some codes to check manually. - Part-of-Speech Tagging – Viterbi (5 marks)
In this part, you will do POS tagging via Viterbi algorithm for the sentence Janet will back the bill. This is the example
in our lecture and also in the textbook here (https://web.stanford.edu/~jur…
The transition probabilities computed from the WSJ corpus without smoothing are in the table below. Rows are
labelled with the conditioning event; e.g., P(VB |MD) is 0.7968.
The Observation likelihoods computed from the WSJ corpus without smoothing are in the following table:
The computation for the three columns are shown in the following figure:
3.1 The task
In this task, you need to calculate v3(3). Note that the computation of v1(1) to v1(7) and v2(2) are provided in the
lecture slides and the textbook. To compute v3(3) you first need to compute v2(1), v2(3), v2(4) ,v2(5), v2(6), v2(7).
For PG students, you need to further calculate v4(7). Note that to compute v4(7), you need to know v3(1) to v3(7).
3.2 Deliverable
You will provide your exact working for the computation of v2(1), v2(3), v2(4) ,v2(5), v2(6), v2(7) and v3(3) into a
report in pdf format. You need to write the detailed process and the results. Using equations provided by your text
editor. Hand-written equations (e.g., in image format) will not be marked.
Please submit the pdf to MyUni, Assignment 3
3.3 Assessment
Use correct equation, correct value to compute, and get correct results.
UG: 5 marks for v3(3), include intermediate results
PG: 2 marks for v3(3), 3 marks for v4(7), include intermediate results.
WX:codehelp