APS106 Final Assessment
Due: April 28, 2021 at 23:59 Toronto time
As engineers around the world, including at U of T, are developing treatments, designing ventilators,
redesigning supply chains, and repurposing production lines, you, the students of APS106, will be
provided a simple dataset of information that might be collected by a COVID-19 contact tracer, and
will develop data structures that allow quick access to and manipulation of that data. We hope that
this task will give you a sense of the power of coding to gather and display data, and help in
understanding important worldwide events.
You have ten days to complete this project, but we do NOT expect this to take ten days, because we
know you’re taking other courses, and have other final assessments to complete.
Good luck!
Academic Integrity
The project is “open book”, meaning that students are allowed to use a Python IDE (e.g. Wing 101), all
course material (lecture notes and videos, labs, textbook), and other offline and online resources.
A student may NOT:
• collaborate with any other student;
• consult any person, except the course instructors via the APS106 piazza site;
• submit work not wholly created by the student.
Doing any of the above is an academic offense. We will use tools to detect such offenses.
Submission of your project deliverables constitutes agreement with the following statement.
In submitting this assessment, I confirm that my conduct adheres to the Code of Behaviour on
Academic Matters. I confirm that I did NOT act in such a way that would constitute cheating,
misrepresentation, or unfairness, including but not limited to, using unauthorized aids and
assistance, impersonating another person, and committing plagiarism. I pledge upon my
honour that I have not violated the Faculty of Applied Science & Engineering’s Honour Code
during this assessment.
Any academic offence will be pursued to the full extent of University regulations.
Instructions
In addition to the instructions you’re reading now, please download the datafile tracking.csv.
There are 5 parts to this project; each is described later in this document. Some are easier than
others, but each is worth the same number of marks.
For each of the five project parts, you will submit two files:
• a Python (.py) file that is only the code for that part;
• and a .pdf file that describes how you developed and would test the code for that part.
The 5 Python files
You will likely implement this project as a single Python file. But to help us autograde your code via
MarkUS, you’ll submit it as five separate Python files, where each file “part#.py” will contain ONLY the
code you wrote for part #, where # is a number from 1 to 5.
Part 1 asks you to write a class definition, including its data and method attributes; only submit that in
part1.py. Each of parts 2 to 5 asks you to write one or two functions; for each part only submit the
requested function(s) in the corresponding part#.py file.
The only module you may import is csv. You should only import it into the .py files where it is used. If
you don’t use it, don’t import it.
Do NOT include any input() or print() statements in any of the part#.py files.
As in the labs, you may submit your part#.py files to MarkUS multiple times and we will mark the last
one submitted before the deadline.
The MarkUS tests that we are providing do not test your code’s functionality. We are providing
some very basic MarkUS tests that are not even close to sufficient for assessing if your code works.
Our tests provide a “sanity check” to help you identify simple syntax problems and typos.
See the “Description of the Provided MarkUS tests” section at the end of this document for details.
The 5 .pdf files
Each .pdf file will contain:
• page 1 – an algorithm plan
• page 2 – a programming plan
• page 3 (and more, if you like) – a testing plan
The Algorithm and Programming Plans
Write your algorithm and programming plans before you start coding, based on examples you’ve
seen presented throughout the course, and in particular during the Friday Design Problem
sessions. The algorithm plan must fit on one page; the programming plan on another page. Write
as little or much as you think is necessary to explain how you planned to write the code for that
part.
The Testing Plan
This part of each .pdf file will document tests of the code for that part. NOTE that the testing plan
may extend to multiple pages if necessary.
For each test you propose, show the Python code used for testing, the input and expected output,
and include a description of what the test does and why you proposed the test (i.e., what is it
testing?).
We will not run these tests via MarkUS, and please do NOT include these tests in your part#.py
files. Rather, your testing plan is meant to allow another programmer (like someone grading your
project) to appreciate that you’ve thought of all possible inputs and outputs to your various
functions.
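One possible layout for a testing-plan entry is sketched below: each test states its input, its expected output, what it does, and why it was proposed. The function count_unique_ids is a hypothetical stand-in, invented only so that the example runs by itself; it is not part of the project, and the layout is a suggestion rather than a required format.

```python
# Hypothetical stand-in so the example runs by itself; replace it with the
# function or method you are actually testing.
def count_unique_ids(ids):
    """Return the number of distinct ids in the list ids."""
    return len(set(ids))


# Test 1: duplicate ids
# What it does: passes a list in which one id appears twice.
# Why: checks that a repeated id is counted only once.
test1_input = ["a1", "b2", "a1"]
test1_expected = 2
test1_actual = count_unique_ids(test1_input)

# Test 2: empty input
# What it does: passes an empty list.
# Why: empty inputs are an easy edge case to overlook.
test2_input = []
test2_expected = 0
test2_actual = count_unique_ids(test2_input)
```

Comparing each actual value with its expected value (for example in the IDE's shell) then shows whether the test passes.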
How to submit your 5 Python files and your 5 .pdf files
All files (part#.py and part#.pdf) should be submitted to MarkUS. As with the labs, MarkUS will only
accept files that are named correctly. You do not need to submit all the files at the same time. You
can submit each individual file as you complete it and run the “sanity check” MarkUS tests on each
part individually. You will be able to run these “sanity check” tests 5 times per hour.
Note that if you are including images (especially pictures of sketches taken with your phone), your
document may be larger than 5 MB. If you wish to include pictures of sketches, you may need to
compress the image before inserting it in your document. For information on how to compress an
image inserted into a Word document, go to https://support.microsoft.com…
When you’re ready to submit, you will upload your Python and .pdf files to MarkUS. The submission
process is the same as for the labs. Go to https://markus.engineering.ut…, sign in with
your UtorID, and navigate to the ‘final project’ assignment from the MarkUS home page. From there,
you can navigate to the ‘Submissions’ tab where you will be able to upload your files. Remember,
MarkUS will only accept files with the correct naming conventions. Once your files are successfully
uploaded, you should be able to see them listed in the submitted files table.
Each part#.pdf file must be less than 5 MB. MarkUS will not accept
larger documents. APS106 is not responsible for documents
exceeding the 5 MB limit.
Evaluation
Each part of this project is worth 20 marks, distributed as follows:
8 MarkUS autograding of your part#.py file
2 Algorithm plan
2 Programming plan
3 Testing Plan
5 Code quality
NOTES:
• MarkUS will evaluate your code with tests that are unknown to you, and different than the
basic tests available to you to test your part#.py files.
• The other components will be evaluated by a human being:
o When grading your algorithm and programming plans, we will assess the extent to
which you demonstrate thought about the requirements of that part of the project,
and articulate a clear approach to attacking it, even if that approach differs from
what you actually coded.
o Your testing plan will be evaluated by assessing the comprehensiveness of your tests,
and the logic of your explanations.
o The quality of your code will be assessed by considering clarity, simplicity, meaningful
variable names, comments, documentation, etc.
Part 1 – Represent the Contact Tracing Data
In this part, you will design and implement a data structure that will be used to represent information
about an infected person, and others infected by contact with that person.
TO DO:
Create an Infected_person class that contains the following data and method attributes:
Data Attribute Name | Type | Format | Description
id | str | | unique identifier of an infected person
infected_on | str | YYYY-MM-DD | date that person was infected
infected | list | list of Infected_person objects | others infected by this person
Method Name | Input Parameters | Output | Description
__init__ | self (type: Infected_person); pid (type: str); date (type: str, default value: “unknown”) | | Constructor of class Infected_person: initializes id to pid; initializes infected_on to date; initializes infected to an empty list
__str__ | self (type: Infected_person) | str | Returns a string representation of the object in the form “(A,B,C)” where A is the object’s id, B is the infected_on date, and C is the size of the infected list
infects | self (type: Infected_person); person (type: Infected_person) | | Adds a person to the infected list. The order in which the objects are stored in the infected list does not matter. The list should not contain duplicates.
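As a rough illustration, the attribute and method tables above could translate into a class along the following lines. This is a minimal sketch, not a reference solution; in particular, it assumes that a "duplicate" means the same object being added twice, which is one reading of the requirement.

```python
class Infected_person:
    """An infected person and the people they directly infected."""

    def __init__(self, pid, date="unknown"):
        self.id = pid             # unique identifier (str)
        self.infected_on = date   # "YYYY-MM-DD", or "unknown"
        self.infected = []        # Infected_person objects infected by this person

    def __str__(self):
        # Form "(A,B,C)": id, infection date, size of the infected list.
        return "({},{},{})".format(self.id, self.infected_on, len(self.infected))

    def infects(self, person):
        # Assumption: a duplicate is the same object appearing twice.
        if person not in self.infected:
            self.infected.append(person)


# Small usage example with made-up ids.
a = Infected_person("A", "2021-01-16")
b = Infected_person("B")
a.infects(b)
a.infects(b)          # second call has no effect
summary = str(a)      # "(A,2021-01-16,1)"
```

If duplicates should instead be detected by comparing id strings, the membership test in infects would need to compare person.id against the ids already stored.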
Part 2 – Parse the Contact Tracing Data
In this part, you will design and implement a function to process the content of a file containing data
on persons and infections. The data file is in csv format and has the following structure:
id,infected-by,date
a3959d249c3444dbb36310bb16e2a689,d03ba6ffa3b5475989ce477973d1617d,2021-02-09
ae66eefa796a4a59aa375d0514620fcf,209864c2f0f543089f8466e7bcff3acd,2021-01-27
435e781627db4635b966c3d4fd493728,d161fa8d17f244d186ef9a47b40e931c,2021-03-21
0583968bf83442858853ef7c2d0f75cb,c921ae9021ab440aa440601e20fdb45f,2021-01-16
Notes about the data file:
• There is a header row as shown above.
• Each row of the file after the header contains the id of someone infected by COVID, the id of
the person who infected them, and the date of the infection. Someone can only be infected
once, by one other person. However, one person may infect many others on the same or
different dates.
• There may be persons who appear in the infected-by column who do not appear in the first
(id) column. For these persons, the date of infection, and person that infected them, are
unknown.
• The date is a string of format “YYYY-MM-DD”.
• Make no assumptions about the order of the data in the file (i.e., it is not necessarily in
alphabetical or chronological order).
TO DO:
Write a function parse_contacts whose only argument is a string filename, which is the name
of the data file the function will process. parse_contacts then returns a dictionary of the
following format:
{id : Infected_person-object, …}
where:
• each key, id, is a string representing a person’s id;
• each value is an object of type Infected_person
Notes:
• all persons mentioned in the data file (in either the id or infected-by columns) should
have an entry in the dictionary
• each row in the data file lists an infected person and who they were infected by. An
Infected_person object contains information about an infected person and who they
infected. Be mindful of these details in order to successfully translate the data file into the
required dictionary representation.
• if the data file is empty (i.e., has no rows), parse_contacts should return an empty
dictionary.
• You may assume that all test data files will have the format described above. You may also
assume that any non-empty test data file will contain a header row (and only one header row)
as its first line and at least one data row.
• DO NOT hardcode the name of the file. We will test your code with different data files.
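The notes above can be sketched as follows. This is one possible approach under the stated file format, not a reference solution. The class is a minimal stand-in for the Part 1 class, the variable names are illustrative, and tempfile and os are used only to build a throwaway demo file so the example runs by itself (they would not belong in part2.py, where only csv may be imported).

```python
import csv
import os
import tempfile


class Infected_person:
    """Minimal stand-in for the Part 1 class."""

    def __init__(self, pid, date="unknown"):
        self.id = pid
        self.infected_on = date
        self.infected = []

    def infects(self, person):
        if person not in self.infected:
            self.infected.append(person)


def parse_contacts(filename):
    """Return a dict {id: Infected_person} built from the csv file."""
    people = {}
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header; harmless if the file is empty
        for row in reader:
            if not row:
                continue  # tolerate blank lines
            pid, infector_id, date = row
            if pid not in people:
                people[pid] = Infected_person(pid, date)
            else:
                # Already created from the infected-by column of an earlier row.
                people[pid].infected_on = date
            if infector_id not in people:
                people[infector_id] = Infected_person(infector_id)
            # The file says who infected pid; the object stores who they infected.
            people[infector_id].infects(people[pid])
    return people


# Demo with a throwaway file and made-up ids.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("id,infected-by,date\nB,A,2021-01-16\nC,B,2021-01-20\n")
    demo_path = tmp.name
contacts = parse_contacts(demo_path)
os.remove(demo_path)
```

In the demo, A appears only in the infected-by column, so A keeps the default date "unknown" while still getting an entry in the dictionary.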
Part 3 – Identify the Top Direct Superspreaders
A “direct superspreader” is a person who directly infects many others. In this part, you will design and
implement a function to identify the top direct superspreader(s): the person(s) that directly infected
the most other persons.
Notes:
• there may be one top direct superspreader, or more than one in case of a tie
• the data file lists direct infections. We will look at indirect infections in Part 5.
TO DO:
Write a function find_top_direct_superspreader:
• its only argument is a dictionary with format {id : Infected_person-object,
…}
o this is the output of function parse_contacts in Part 2
• it returns a list of person id’s corresponding to top direct superspreader(s), [id1, id2,
…]. Again, remember that there may only be one, or more than one in the case of a tie. The
order in which the id’s are stored in the list is not important. The list should not contain
duplicates.
o If the input is an empty dictionary, find_top_direct_superspreader should return
an empty list.
o You can assume that find_top_direct_superspreader will be called with a
dictionary as its only argument. You can also assume that the dictionary, if not empty, will
have the format {id : Infected_person-object, …}.
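One way to approach this is to find the largest infected-list size and then collect every id that matches it. The sketch below assumes a minimal stand-in for the Part 1 class and builds the dictionary by hand with made-up ids; it is an illustration, not a reference solution.

```python
class Infected_person:
    """Minimal stand-in for the Part 1 class."""

    def __init__(self, pid, date="unknown"):
        self.id = pid
        self.infected_on = date
        self.infected = []

    def infects(self, person):
        if person not in self.infected:
            self.infected.append(person)


def find_top_direct_superspreader(contacts):
    """Return the ids of the person(s) with the most direct infections."""
    if not contacts:
        return []
    max_count = max(len(p.infected) for p in contacts.values())
    # Keep every id that ties for the maximum.
    return [pid for pid, p in contacts.items() if len(p.infected) == max_count]


# Demo: A and D each directly infect two people, so they tie.
people = {pid: Infected_person(pid) for pid in "ABCDEF"}
for infector, infected in [("A", "B"), ("A", "C"), ("D", "E"), ("D", "F")]:
    people[infector].infects(people[infected])
top = find_top_direct_superspreader(people)
```

Because dictionary keys are unique, the returned list cannot contain duplicates.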
Part 4 – Find Superspreading Dates
We define a superspreading date as a date on which the most persons were infected. Ties are
possible, so there may be multiple superspreading dates.
TO DO:
Write a function find_superspreading_date:
• its only argument is a dictionary with format {id : Infected_person-object,
…}
o this is the output of function parse_contacts in Part 2
• it returns a tuple of the form ([date1,…], max_infected) where:
o [date1,…] is a list of strings representing the dates that had the maximum
number of infections. The order in which the dates are stored in the list is not
important. The list should not contain duplicates.
o max_infected is the maximum number of direct infections on any of those dates.
For example, if max_infected is 10, each of the dates in the list [date1,…]
had 10 direct infections.
o If the input dictionary is empty, find_superspreading_date should return the
tuple ([], 0)
o You can assume that find_superspreading_date will be called with a
dictionary as its only argument. You can also assume that the dictionary, if not empty,
will have the format {id : Infected_person-object, …}.
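Since each person is infected only once, counting persons per infected_on date gives the number of infections on that date. The sketch below follows that idea. It assumes that persons whose date is "unknown" are not counted toward any date, which is an interpretation rather than something the specification states, and it uses a minimal stand-in for the Part 1 class.

```python
class Infected_person:
    """Minimal stand-in for the Part 1 class."""

    def __init__(self, pid, date="unknown"):
        self.id = pid
        self.infected_on = date
        self.infected = []


def find_superspreading_date(contacts):
    """Return ([dates with the most infections], that number of infections)."""
    counts = {}
    for person in contacts.values():
        date = person.infected_on
        if date != "unknown":  # assumption: unknown dates are ignored
            counts[date] = counts.get(date, 0) + 1
    if not counts:
        return ([], 0)
    max_infected = max(counts.values())
    dates = [d for d, n in counts.items() if n == max_infected]
    return (dates, max_infected)


# Demo with made-up ids: two infections on 2021-01-16, one on 2021-01-20.
people = {
    "A": Infected_person("A"),
    "B": Infected_person("B", "2021-01-16"),
    "C": Infected_person("C", "2021-01-16"),
    "D": Infected_person("D", "2021-01-20"),
}
result = find_superspreading_date(people)   # (["2021-01-16"], 2)
```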
Part 5 – Identify the Top (Direct + Indirect) Superspreaders
In Part 3 we looked at direct infections: when one person infects one or more others. Indirect
infections are the ones that follow from a direct infection. For example, if person A infects person B
and person B infects persons C and D, A is said to have indirectly infected C and D. If C and D do not
infect anyone else, A is said to have (directly + indirectly) infected three persons: B, C, and D.
In this part, you will design and implement two functions to identify all people directly or indirectly
infected by a person, and to identify the top (direct + indirect) superspreader(s).
TO DO:
5A – First write a function get_all_infected, with two arguments:
• a dictionary with format {id : Infected_person-object, …}
o this is the output of function parse_contacts in Part 2
• a string that represents the id of a person of interest
The function returns a list of the ids of all persons (directly + indirectly) infected by the person of
interest.
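A possible sketch of 5A walks outward from the person of interest, following each infected list in turn with an explicit to-visit list (a recursive version would work equally well). It uses a minimal stand-in for the Part 1 class and assumes that an id missing from the dictionary yields an empty list, a case the specification does not cover.

```python
class Infected_person:
    """Minimal stand-in for the Part 1 class."""

    def __init__(self, pid, date="unknown"):
        self.id = pid
        self.infected_on = date
        self.infected = []

    def infects(self, person):
        if person not in self.infected:
            self.infected.append(person)


def get_all_infected(contacts, pid):
    """Return the ids of everyone directly or indirectly infected by pid."""
    if pid not in contacts:  # assumption: unknown id gives an empty list
        return []
    result = []
    to_visit = list(contacts[pid].infected)
    while to_visit:
        person = to_visit.pop()
        if person.id not in result:
            result.append(person.id)
            # Everyone this person infected is indirectly infected by pid.
            to_visit.extend(person.infected)
    return result


# Demo: A infects B, and B infects C and D (the example from the text above).
people = {pid: Infected_person(pid) for pid in "ABCD"}
for infector, infected in [("A", "B"), ("B", "C"), ("B", "D")]:
    people[infector].infects(people[infected])
all_from_a = get_all_infected(people, "A")
```

Here all_from_a contains B, C, and D in some order, matching the three (direct + indirect) infections described above.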
5B – Then write a function find_top_direct_plus_indirect_superspreader:
• its only argument is a dictionary with format {id : Infected_person-object,
…}
o this is the output of function parse_contacts in Part 2
• it returns a tuple of the form ([id1, id2, …], max_infection_count) where:
o [id1, id2, …] is a list of person id’s corresponding to the top (direct +
indirect) superspreader(s). Again, remember that there may only be one, or more than
one in the case of a tie. The order in which the id’s are stored in the list is not
important. The list should not contain duplicates.
o max_infection_count represents the maximum number of (direct + indirect)
infections.
o If find_top_direct_plus_indirect_superspreader is called with an
empty dictionary as argument, it should return the tuple ([], 0)
o You can assume that find_top_direct_plus_indirect_superspreader
will be called with a dictionary as its only argument. You can also assume that the
dictionary, if not empty, will have the format {id:Infected_person-object,
…}.
Notes:
• there may be one top (direct + indirect) superspreader, or more than one in case of a tie
• you may use the first function, get_all_infected, to implement the second function,
find_top_direct_plus_indirect_superspreader
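Following the last note, 5B can be sketched by calling get_all_infected once per person and keeping the ids with the largest count. The block below repeats a compact get_all_infected and a stand-in class so that it runs by itself; both are illustrative assumptions, not reference solutions.

```python
class Infected_person:
    """Minimal stand-in for the Part 1 class."""

    def __init__(self, pid, date="unknown"):
        self.id = pid
        self.infected_on = date
        self.infected = []

    def infects(self, person):
        if person not in self.infected:
            self.infected.append(person)


def get_all_infected(contacts, pid):
    """Return the ids of everyone directly or indirectly infected by pid."""
    result = []
    to_visit = list(contacts[pid].infected) if pid in contacts else []
    while to_visit:
        person = to_visit.pop()
        if person.id not in result:
            result.append(person.id)
            to_visit.extend(person.infected)
    return result


def find_top_direct_plus_indirect_superspreader(contacts):
    """Return ([ids with the most direct + indirect infections], that count)."""
    if not contacts:
        return ([], 0)
    counts = {pid: len(get_all_infected(contacts, pid)) for pid in contacts}
    max_infection_count = max(counts.values())
    top = [pid for pid, n in counts.items() if n == max_infection_count]
    return (top, max_infection_count)


# Demo: A -> B -> {C, D} and E -> F, so A leads with three infections.
people = {pid: Infected_person(pid) for pid in "ABCDEF"}
for infector, infected in [("A", "B"), ("B", "C"), ("B", "D"), ("E", "F")]:
    people[infector].infects(people[infected])
result = find_top_direct_plus_indirect_superspreader(people)   # (["A"], 3)
```

Recomputing the full set for every person is simple but repeats work on large files; caching counts per person is a possible refinement.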
Description of the Provided MarkUS tests
Unlike in your labs, we are only providing MarkUS tests that will tell you if there are basic problems
(e.g. syntax errors, typos in function names, etc.) with your submission that prevent MarkUS from
evaluating your code. These tests will not check any outputs from your functions, they will only check
that the functions and classes are correctly defined and can be run by MarkUS when grading starts.
You will be able to run these “sanity check” tests 5 times per hour.
It is your responsibility to create sufficient tests to be confident in the correctness of your code.
Below is a summary of what our simple tests will evaluate for each part of the assignment.
Part 1:
• the file part1.py contains a class named Infected_person; AND
• the Infected_person class includes the following methods: __str__ and infects;
AND
• the file contains no syntax errors that prevent MarkUS from running the code in the file
part1.py
Part 2:
• the file part2.py contains a function named parse_contacts; AND
• the file contains no syntax errors that prevent MarkUS from running the code in the file
part2.py
Part 3:
• the file part3.py contains a function named find_top_direct_superspreader; AND
• the file contains no syntax errors that prevent MarkUS from running the code in the file
part3.py
Part 4:
• the file part4.py contains a function named find_superspreading_date; AND
• the file contains no syntax errors that prevent MarkUS from running the code in the file
part4.py
Part 5:
• the file part5.py contains a function named get_all_infected; AND
• the file part5.py contains a function named
find_top_direct_plus_indirect_superspreader; AND
• the file contains no syntax errors that prevent MarkUS from running the code in the file
part5.py
Our tests provide a “sanity check” to help give you confidence that you will not lose MarkUS grades
for simple typos or syntax errors. You will be able to run these tests in the same way you ran MarkUS
tests for labs 2-9. That is, submit your code to MarkUS and then click on “Run Tests” under the
automated testing tab. After a few minutes, refresh the page and the results of the tests will be
displayed.
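A similar check can be approximated locally before submitting. The sketch below is not the actual MarkUS test code; the helper names looks_like_part1 and defines_functions are hypothetical, and the class and function defined in the block are empty stand-ins for a student's submission.

```python
def looks_like_part1(namespace):
    """Return True if namespace (a dict of names) defines the Part 1 class."""
    cls = namespace.get("Infected_person")
    if not isinstance(cls, type):
        return False
    # __str__ must be defined on the class itself, not just inherited from object.
    return all(callable(vars(cls).get(name)) for name in ("__str__", "infects"))


def defines_functions(namespace, names):
    """Return True if every name in names is a callable in namespace."""
    return all(callable(namespace.get(name)) for name in names)


# Empty stand-ins for a submission, used only to exercise the checks.
class Infected_person:
    def __init__(self, pid, date="unknown"):
        self.id = pid

    def __str__(self):
        return self.id

    def infects(self, person):
        pass


def parse_contacts(filename):
    return {}


part1_ok = looks_like_part1({"Infected_person": Infected_person})
part2_ok = defines_functions({"parse_contacts": parse_contacts}, ["parse_contacts"])
missing_ok = defines_functions({}, ["find_superspreading_date"])
```

For a real file, the namespace could be obtained with runpy.run_path("part1.py") from the standard library, which returns the module's globals as a dictionary. Like the provided tests, these checks say nothing about whether the code produces correct output.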