关于后端:EECS-281问题求解

65次阅读

共计 24654 个字符,预计需要花费 62 分钟才能阅读完成。

EECS 281 – Winter 2021
Programming Project 4
Drone Delivery
Due Monday, April 19 before midnight
Overview
You’ve decided to start a company to handle on-campus deliveries of items via drone. Not just at UM, you’re
developing software to sell so that people at other campuses can set up their own drone delivery service!
There are two types of drones under development, used for three different purposes:
Drone Type I is a drone that moves ten times as fast as other drones. However, these drones require drone
director beacon installations along each path so that their rapid flight can be properly controlled. The cost of
installing these beacons are directly proportional to the distance of the desired path, and clients will want to
minimize the cost of beacons.
Drone Type II is a regular drone similar to competing products. The drone’s energy usage is directly
proportional to the distance it travels.
There are currently three types of clients (A, B, and C) that are interested in your drones, and your task will
vary based on the client type. This is described further in detail below.
To be clear: these scenarios are separate; the program will create a plan for one or the other, but not
both in the same run (though you may find that algorithms from one mode help with another mode).
In this project, your output does not need to be exactly the same as ours to be correct. As long as it
provides a valid answer, and follows the rules for formatting output, it can be in a different order or use
different edges, and still be correct.
Project Goals
● Understand and implement MST algorithms. Be able to determine whether Prim’s or Kruskal’s is more
efficient for a particular scenario.
● Understand and implement a Branch and Bound algorithm. Develop a fast and effective bounding
algorithm.
● Explore various heuristic approaches to achieving a nearly-optimal solution as fast as possible.
○ Do some web searching for“TSP heuristics”, don’t choose one worse than O(n
2
).
● Use a visualization tool to help with debugging.
Version 03-28-21
Written by many 281 staff
© 2021 Regents of the University of Michigan
2
Map Input
On startup, your program, drone, reads input from standard input (cin) describing the locations where pickups
and/or deliveries might occur (there is no actual difference between pickup and delivery locations, these are
just part of the description). The campus is mapped on a grid with a restricted area (medical campus) in the
bottom left quadrant (see‘Path Rules’). You will be given a list of M locations on campus with associated
integer coordinates (x, y). Locations are identified by integer indices which correspond to the order in which
they are read in (the first location coordinate you read corresponds to location 0, the second coordinate to
location 1, etc.). For parts B and C you always start at location 0.
Formally, the input will be formatted in this manner: The first line will contain a single number denoting the
number of locations on campus. That will be followed by a list of integer x/y, coordinates in the form: x y. You
may assume that the input will always be well-formed (it will always conform to this format and you do not need
to error check). There may be additional blank lines at the end of the file (but nowhere else).
Sample:
5
6 1
2 3
-5 -4
-1 6
0 -1
The above sample can be visualized as in the figure below, where the numbers shown are the location indices
that starts at the 0-th location, and the blue line indicates the border (for more details, see the ‘Path Rules’
section) which separates the medical campus from everything else: location 2 is in the medical campus, 4 is on
the border, the rest are the“normal”part of campus.
There is more than one way to represent this configuration internally in your program, and this will
affect your runtime. Choose your data structures wisely!
3
For Part A, there will always be at least two points. For Parts B and C, there will always be at least 3.
Path Rules
We represent the paths you take as a set of pairs of locations or as a sequence of locations. Your calculations
1
should use Euclidean distance, and you should represent your distances as doubles.
In Part A, the faster drones must respect air-traffic safety (due to medical helicopters), and can only enter or
leave the medical campus at certain locations. In order to reach locations inside (or outside, if leaving) the
medical campus, you must first reach a location on the border.
In Parts B and C, the standard-speed drones can fly directly from location to location. You move along the line
segments that connect locations. Please be very careful with rounding errors in your distance calculations; the
autograder will allow you a 0.01 margin of error to account for rounding, but you should avoid rounding at
intermediate steps.
Medical Campus
There is a medical campus that is located in the bottom left quadrant of the map. When first embarking on
your delivery (in Part A), the medical campus is bounded by its border, which consists of the negative portions
of the x and y axes, including the origin (0, 0). In Part A, you may only enter and exit the medical campus by
passing through a location on the border. For example, you are not allowed to travel directly from (-5, -4) to (6,
1). You must first travel from (-5, -4) to (0, -1), and then from (0, -1) to (6, 1).
For the sake of simplicity, assume two locations that “cross” the medical campus can be connected by a direct
path. The example below shows two validly connected“normal”campus locations, at A and B.
In this example, there is a direct path from A to B.
1
“Path,”in this project, can be understood as a route between two locations.
4
When you are using standard speed drones (in Parts B and C), you can fly directly from“normal”campus
locations to medical campus locations, and vice versa (the special rules only exist in Part A).
Other
Important: For parts B and C, you always start at the 0-th campus location.
You are also not allowed to‘wrap around the edges of the world’(you cannot go above the top of the map to
arrive at the bottom).
Command Line Input
Your program, drone, should take the following case-sensitive command line options:
● -m, –mode <MODE>
This command line option must be specified, if it is not, print a useful error message to standard error
(cerr) and exit(1). MODE is a required argument. Set the program mode to MODE. MODE must be one of
MST, FASTTSP, or OPTTSP. The MODE corresponds to the algorithm drone runs (and therefore what it
outputs).
● -h, –help
Print a short description of this program and its arguments and exit(0).
Valid examples of how to execute the program:
./drone –mode MST (OK, but you must type the input by hand)
./drone -h < inputFile.txt (OK, -h happens before we realize there’s no -m)
./drone -m OPTTSP < inputFile.txt (OK, reads from a file on disk)
./drone -m BLAH (BAD mode following -m)
Remember that when we redirect input, it does not affect the command line. Redirecting input just sends the
file contents to cin. You should not have to modify your code to allow this to work; the operating system will
handle it.
We will not be specifically error-checking your command-line handling, however we expect that your
program conforms with the default behavior of getopt_long(). Incorrect command-line handling may
lead to a variety of difficult-to-diagnose problems.
5
Algorithms
Your program should run one and only one of the following modes at runtime depending on the mode option
for that particular program call. We divide it into‘parts’for your convenience, though you may find that
elements and algorithms of some parts can help with others.
Part A – MST mode
Description
This mode will devise a path that determines where to place high-speed drone beacons, while minimizing the
total distance of that path (and thus the total cost of the beacons).
When the program is run in the MST mode, it should calculate and print out an MST connecting all of the
campus locations. You may use any MST algorithm to connect all the locations. Hint: Unless you want to
implement both and compare, think about the nature of the graph (how many vertices and edges does it
have?). You are free to adapt code from the lecture slides to fit this project, but you will want to carefully think
about the data structures necessary to do each part (storing unnecessary data can go over memory limits).
Your program must always generate one valid MST for each input.
Remember that in this part, you must respect the boundary between the medical campus and other locations.
Your MST cannot connect a medical campus location directly to one in the rest of campus, without passing
through a location on the border.
Output Format (for Part A only)
For the MST mode, you should print the total weight of the MST you generate by itself on a line; this weight is
the sum of the weights of all edges in your MST (in terms of Euclidean distance). You should then print all
edges in the MST. All output should be printed to standard output (cout).
The output should be of the format:
weight
node node
node node
……
The nodes are the campus location numbers corresponding to the vertices of the MST and a pair of nodes on
a given line of the output describes an edge in the MST from the first node to the second. To be clear, the
weight should be formatted as a double (2 decimal point precision is enough – see Appendix A), and the node
numbers should all be integer values when printed. For example, given the example input file above, your MST
mode output might be:
19.02
0 1
2 4
1 3
1 4
6
You should also always print the pairs of vertices that describe an edge such that the index on the left has a
smaller integer value than the index on the right. In other words:
1 2
is a possible valid edge of output, but
2 1
is not.
If it is not possible to construct an MST for all locations (i.e. if there are locations on both the normal and
medical campuses, with none on the border), your program should print the message “Cannot construct
MST” to cerr and exit(1).
Parts B & C (FASTTSP & OPTTSP mode)
Description (for both Parts B and C)
In this mode, you will figure out how to travel to every campus location and then return to your starting location.
The route will always start at location index 0, visit every other location exactly once, and return to the starting
locations. Your job will thus be to solve the TSP (Traveling Salesperson Problem) and choose paths to
locations so as to minimize the total distance travelled. Euclidean (straight-line) distance is used here again to
compute distances between locations. Because you are now using normal-speed drones, there is no longer
any restriction on medical campus interaction; you can fly from any location to any other.
For FASTTSP mode, you do not need to produce an optimal TSP tour, but your solution should be close to
optimal. Because your FASTTSP algorithm does not need to be perfectly optimal, we expect it to run much
faster than finding a perfect optimal solution. Do some online searching for“TSP heuristics”. There are
several types, some easier to implement, some with better path lengths, some both.
For OPTTSP mode, you must find an optimal solution to the TSP (the actual minimum Euclidean distance
necessary). More on the differences between OPTTSP and FASTTSP modes will be discussed later.
For both methods:
● You must start the tour from the 0-th campus location.
● You must visit every location exactly once, and at the end of the tour, you must return back to the 0-th
location (you do not print that you’ve returned, but must count the length of that edge as part of
the total path length).
● The length of a tour is defined as the sum of all pairwise distances travelled – that is, the sum of the
lengths of all of the paths taken (using Euclidean distance).
● Your program must print the indices of the locations in an order such that the length of this tour is as
small as possible. More details about how to accomplish this are listed below.
Output Format (for both Parts B and C)
You should begin your output by printing the overall length of your tour on a line. On the next line, output the
nodes in the order in which you visit them. The initial node should be the starting location index and the last
7
node should be the location number directly before returning back to the 0-th location. The nodes in your tour
should be printed such that they are separated by a single space. There can be a space after the last location
listed. All output should be printed to standard output (cout).
For example, if given the input file above, your program could produce:
31.64
0 4 2 3 1
or
31.64
0 1 3 2 4
Part B Only Description
In the case that we have a large number of possible locations to visit, finding an optimal method for visiting all
of them may be too slow, and you may grow old and die before completing your task. You can use heuristics
instead to find nearly-optimal tours. A heuristic is a problem-solving method (an algorithm) that can produce a
good answer that is not necessarily the best answer. For example, you can skip a branch speculatively rather
than waiting to know for a fact that it can be skipped. There are many other simple heuristic techniques, such
as starting with a random tour and trying to improve it by small changes.
You should be able to produce a solution to the Traveling Salesperson Problem (TSP) that is as close as
possible to the optimal tour length (though it does not need to be optimal). In the best case, your produced
tour length will equal the optimal tour length. Sometimes, especially with a low number of locations, your
heuristic might get lucky and produce the optimal tour.
You are allowed to use any combination of algorithms for this section that we have covered in class, including
the MST algorithm you wrote for Part A. You should however not exceed O(n
2
) complexity.
You will need to be creative when designing your algorithms for this section. You are free to implement any
other algorithm you choose, so long as it meets the time and memory constraints. However, you should not
use any advanced algorithms or formulas (such as Simulated Annealing, Genetic Algorithms and Tabu search –
they are too slow) that are significantly different from what has been covered in class. Instead, creatively
combine the algorithms that you already know and come up with concise optimizations. Your heuristic will very
likely be greedy in some way, but there are different ways to be greedy!
Part C Only Description
To find an optimal tour, you could start with the brute force method of exhaustive enumeration that evaluates
every tour and picks a smallest tour. By structuring this enumeration in a clever way, you could determine that
some branches of the search cannot lead to optimal solutions. For example, you could compute lower bounds
on the length of any full tour that can be found in a given branch. If such a lower bound exceeds the cost of a
full solution you have found previously, you can skip this branch as hopeless. If implemented correctly, such a
branch-and-bound method should always produce an optimal solution. It will not scale as well as sorting or
searching algorithms do for other problems, but it should be usable with a reasonable number of locations.
Clever optimizations (identifying hopeless branches of search early) can make your algorithm a hundred times
faster. Drawing TSP tours on paper and solving small location configurations to optimality by hand should be
8
very useful. Remember that there is a tradeoff between the time it takes to run your bounding function
and the number of branches that bound lets you prune.
MAKE SURE that you use the version of genPerms() presented in either the“Project 4 Tutorial”or the
“Backtracking BB TSP”lecture slides. The one presented earlier in the semester, in the“Stacks and
Queues”lecture slides is much slower.
Given an input set of N locations defined by integer coordinates, your job is to produce an optimal tour using
branch-and-bound algorithms. Your program should always produce the shortest possible tour as a solution,
even if computing that solution is time-consuming. You will be given a 35-second cpu time limit to generate
your solution. If your program does not produce a valid solution, it will fail the test case. Your solution will also
be judged by time and space budgets as per previous projects.
Libraries and Restrictions
We highly encourage the use of the STL for this project, with the exception of several prohibited features. Do
not use:
● The C++11 regular expressions library
● The STL thread/atomics libraries (which spoil runtime measurements)
● Shared or unique pointers.
● Other libraries (e.g., boost, pthreads, etc).
Testing and Debugging
Part of this project is to prepare several test files that will expose defects in buggy solutions – your own or
someone else’s. As this should be useful for testing and debugging your programs, we recommend that you
first try to catch a few of our intentionally-buggy solutions with your test files, before completing your solution.
The autograder will also tell you if one of your own test files exposes bugs in your solution.
Each test that you submit should consist of an input file. When we run your test files on one of our
intentionally-buggy project solutions, we compare the output to that of a correct project solution. If the outputs
differ, the test file is said to expose that bug.
Test files should be named test-n-MODE.txt where 1 <= n <= 10. The autograder’s buggy solutions will run
your test files in the specified MODE. The mode must be MST, FASTTSP, or OPTTSP.
Your test files may have no more than 10 locations in any one file. You may submit up to 10 test files (though it
is possible to get full credit with fewer test files). The tests the autograder runs on your solution are NOT limited
to 10 coordinates in a file; your solution should not impose any size limits (as long as sufficient system memory
is available). Test files for part A must have at least two locations; parts B and C must have at least three.
9
Submitting to the Autograder
Do all of your work (with all needed source code files, as well as test files) in some directory other than your
home directory. This will be your “submit directory”. Before you turn in your code, be sure that:
● Every source code and header file contains the following project identifier in a comment at the top of
the file:
● // Project Identifier: 1761414855B69983BD8035097EFBD312EB0527F0
● The Makefile must also have this identifier (in the first TODO block)
● You have deleted all .o files and your executable(s). Typing‘make clean’shall accomplish this.
● Your makefile is called Makefile. Typing‘make -R -r’builds your code without errors and generates
an executable file called drone. The command-line options -R and -r disable automatic build rules,
which will not work on the autograder.
● Your Makefile specifies that you are compiling with the gcc optimization option -O3. This is extremely
important for getting all of the performance points, as -O3 can speed up code by an order of
magnitude.
● Your test files are named test-n-MODE.txt and no other project file names begin with test. Up to 10
test files may be submitted. The“mode”portion of the filename must be MST, OPTTSP or FASTTSP.
● The total size of your program and test files does not exceed 2MB.
● You don’t have any unnecessary files (including temporary files created by your text editor and
compiler, etc) or subdirectories in your submit directory (i.e. the .git folder used by git source code
management).
● Your code compiles and runs correctly using the g++ compiler. This is available on the CAEN Linux
systems (that you can access via login.engin.umich.edu). Even if everything seems to work on another
operating system or with different versions of GCC, the course staff will not support anything other than
GCC running on CAEN Linux. At the moment, the default version installed on CAEN is 4.8.5, however
we want you to use version 6.2.0 (available on CAEN with a command and/or Makefile); this version is
also installed on the autograder machines.
Turn in all of the following files:
● All your .h and/or .cpp files for the project
● Your Makefile
● Your test files
You must prepare a compressed tar archive (.tar.gz file) of all of your files to submit to the autograder. One
way to do this is to have all of your files for submission (and nothing else) in one directory. Our Makefile
provides the command make fullsubmit. Alternately you can go into this directory and run this command:
dos2unix ; tar czf ./submit.tar.gz .cpp .h .hpp Makefile test*.txt
This will prepare a suitable file in your working directory.
Submit your project files directly to either of the two autograders at: https://g281-1.eecs.umich.edu or
https://g281-2.eecs.umich.edu. Note that when the autograders are turned on and accepting
submissions, there will be an announcement on Piazza. The auto graders are identical and your daily
submission limit will be shared (and kept track of) between them. You may submit up to three times per
calendar day with autograder feedback (double that during Spring semester). For this purpose, days begin
and end at midnight (Ann Arbor local time). We will count only your best submission for your grade. If
10
you would instead like us to use your LAST submission, see the autograder FAQ page, or use this form. We
strongly recommend that you use some form of revision control (ie: SVN, GIT, etc) and that you ‘commit’ your
files every time you upload to the autograder so that you can always retrieve an older version of the code as
needed. If you use an online revision control system, make sure that your projects and files are
PRIVATE; many sites make them public by default! If someone searches and finds your code and uses
it, this could trigger Honor Code proceedings for you..
Please make sure that you read all messages shown at the top section of your autograder results!
These messages often help explain some of the issues you are having (such as losing points for
having a bad Makefile or why you are segfaulting). Also be sure to note if the autograder shows that
one of your own test cases exposes a bug in your solution (at the bottom). Search for the words “hint”
or “exceeded” (without quotes).
Grading
Point Scale
80 points — Your grade will be derived from correctness and performance (runtime). This will be determined by
the autograder. On this project we expect a much broader spread of runtimes than on previous projects. As
with all projects, the test cases used for the final grading are likely to be different.
10 points — Your program does not leak memory. Make sure to run your code under valgrind before each
submit. This is also a good idea because it will let you know if you have undefined behavior (such as reading
an uninitialized variable), which may cause your code to crash on the autograder.
10 points — Student test file coverage (effectiveness at exposing buggy solutions).
In your autograder output, the section named “Scoring student test files” will tell you how many bugs exist, how
many are needed to start earning points, earn full points, and earn an extra submit per day.
Runtime Quality Tradeoffs
In this project there is no single correct answer (unlike previous projects). Accordingly, the grading of your
problem will not be as simple as a‘diff’, but will instead be a result of evaluating your output. For example, if
we gave you a square for Part A, you might choose any 3 of the 4 edges, and print them in any order, meaning
there are 24 possible correct output files!
Particularly for Part B, we expect to see greater variation in student output. Part B asks you to solve a hard
problem, and with the given time constraints, we don’t actually expect your output to be optimal for all cases.
The quality of your solutions may even vary from case to case. We want you to quickly produce solutions that
are close to optimal. This inevitably creates tradeoffs between solution optimality and runtime.
You may find it useful to implement more than one algorithm or heuristic that you use in Part B, and do some
testing plus use the autograder to determine which works the best..
Your grade for Part B will be determined based on how close you are to the best solution, computed as
a percentage. If your tour is over length, you can lose points (similar to time and memory). If this
11
happens, the autograder will not color the test case “blue”; that only works for time or memory; search your
feedback for the words “Your tour cost is too large” (without quotes).
Hints and Advice
It will be difficult to get this project correct without visualizing your MSTs and TSP tours. We have provided a
visualization tool on the Autograder that you can use; follow this link: https://g281-2.eecs.umich.edu…
Running your code (on CAEN or locally) using valgrind can help you find and remove undefined (buggy)
behavior and memory leaks from your code. This can save you from losing points in the final run when you
mistakenly believe your code to be correct.
It is extremely helpful to compile your code with the following gcc options: -Wall -Wextra -Wvla
-Wconversion -pedantic. This way the compiler can warn you about poor style and parts of your code
that may result in unintended/undefined behavior.
Make sure that you are using getopt_long() for handling command-line options.
There’s a video for the project; be sure to watch it. Read the description, and one of the student comments
that provides the times for some interesting places in the video.
For Part C, think about when computing the estimate would save time, and when it would be a waste of time.
If 𝑘 is the number of unvisited vertices, then Θ(𝑘!) is the time to complete the TSP, while Θ(𝑘 is the time
2

  • 2𝑘)
    to compute the estimate:
    𝑘 𝑘! 𝑘
    2
  • 2𝑘
  • 1 3
  • 2 8
  • 6 15
  • 24 24
  • 120 35
  • 720 48
  • ~3.6M 120
    Think about what this means about what promising() should return when 𝑘 is “small enough”.
    12
    Appendix A – Formatting Output
    In order to ensure that your output is within the tolerable margins of error for the autograder to correctly grade
    your program you must run the following lines of code before you output anything to cout. We highly
    recommend that you simply put them at the top of main() so that you don’t forget about them.
    cout << std::setprecision(2); //Always show 2 decimal places
    cout << std::fixed; //Disable scientific notation for large numbers
    You will need to #include <iomanip> to use this code.
    Appendix B – Example TSP Heuristic
    One possible heuristic (NOT the only one) is to construct an initial TSP tour by following MST edges and
    skipping previously visited vertices. By using this technique correctly, you can find a tour that is guaranteed to
    be no more than twice the optimal solution’s length (and use this“2x”check for debugging). You can then
    make adjustments to refine and improve upon the initial tour. This is not necessarily the best approach! It
    is one approach used to illustrate how a heuristic can be used, but there are better ones out there. This
    example method is VERY hard to code. Do some research on other heuristics!
    “Corner Cutting”algorithm illustrated
    Suppose locations are distributed in an input map as shown below:
    Below is an MST that would be formed from the above locations.
    13
    Below is a path taken by strictly following the edges of the MST.
    However, by“cutting corners,”as described in the picture below, an effective TSP solution can be generated.
    This is possible because once a vertex is visited, it will not be visited again. In the above path that strictly
    follows the edges of the MST, the middle vertex is visited four times (visited after an outer vertex is visited). If
    the middle vertex is skipped after the first visit, a TSP tour shown below is achieved.
    The reason we bring up this‘twice-around-the-tree’heuristic is to state the theorem that the resulting tours are
    no worse than 2x the MST cost.
    The following is an explanation/proof sketch as to why the 2x bound is valid:
    If you perform a DFS traversal of a tree and return to the initial node, you have visited every edge exactly twice
    (“going in” and “returning”), so this gives you exactly double the cost/length of the MST (the fact that the tree is
    minimal is not important for the logic of the proof – this works for any tree). Since the tree is spanning, you have
    visited all locations. The only problem with this twice-around-the-tree self-intersecting path is that it’s not a tour.
    It can be turned into a tour by taking shortcuts (i.e., taking shortcuts is important not only to reduce the length).
    When considering other heuristics:
  • The theorem allows you to upper-bound the cost of optimal TSP tours based on MST length.
  • The theorem has a constructive proof – a heuristic that always produces TSP tours that satisfy this
    upper bound .
    2
  • Such heuristics are also called approximation algorithms.
    14
    As a consequence, your heuristic should also satisfy the 2x upper bound. This heuristic is very hard to code;
    there are others that are much easier and produce better results.
    Appendix C – Test Case Legend
    Autograder test names are broken down by the mode used with each test:
    ● Test file names starting with“A”are used with the MST mode.
    ● Test file names starting with“B”are used with the FASTTSP mode.
    ● Test file names starting with“C”are used with the OPTTSP mode.
    Test files that start with“C”are much smaller in size than those that start with“A”and“B”. The maximum
    number of locations for such files is smaller than 40 as compared to tens of thousands of locations for the
    others. The reason is that TSP is NP-complete and finding an optimal solution in a reasonable amount of time
    is only possible for small-sized problems.
正文完
 0