Lab 7: Cache Design in PyRTL
Assigned: Wednesday, February 21st, 2024
Due: Wednesday, February 28th, 2024
Points: 100
• MAY ONLY BE TURNED IN ON GRADESCOPE as PYTHON files (see below for details).
• There is NO MAKEUP for missed assignments.
• We strictly enforce the LATE POLICY for all assignments (see syllabus).
Goals for This Lab
By completing this lab, you should be able to utilize PyRTL to design and simulate a 4-way set-associative cache that follows a round-robin replacement policy.
Figure 1: Textbook figure 5.18 of a set-associative cache. Note that this has different parameters than your cache for this lab.
Task
Your PyRTL code for this lab will implement a 4-way, set-associative cache that supports 32-bit addresses and is word addressable. Your cache should have 16 rows. Because this is a 4-way cache, there are 4 cache blocks in each row. Each data block has 4 words (16 bytes total). We provide several MemBlocks for these components; your task is to implement the rest of the logic for a 4-way cache. Your cache should follow a round-robin replacement policy when inserting entries into the cache (explained later).
Provided Files
We have provided the skeleton file ucsbcs154lab7_4waycache.py, where you will need to implement your cache.
Your cache will have 4 inputs and 2 outputs. The names and bit widths must match exactly.
● Input req_new (1 bit)
● Input req_type (1 bit)
● Input req_addr (32 bits)
● Input req_data (32 bits)
● Output resp_hit (1 bit)
● Output resp_data (32 bits)
We have also provided several MemBlocks in the skeleton file to help guide you in your implementation. You must use all of the provided MemBlocks in your cache implementation.
Instructions
The following figure shows a high-level design of what your 4-way cache should look like.
Figure 2: High-level Cache Design
Each Way entry consists of a Valid Field, Tag Field, and Data Block, as shown in Figure 4. As previously mentioned, each Data Block is 4 words in length. The Valid Field should be 1 bit. The size of the Tag Field is determined by how many bits a referenced memory address requires for its Offset Field and Index Field (see Figure 3). Because our cache has 2^4 = 16 rows, the Index Field should be 4 bits. Because there are 2^4 = 16 bytes in each Data Block, the Offset Field should be 4 bits. Therefore, the Tag Field should be 24 bits (32 – 4 – 4 = 24 bits).
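The field arithmetic above can be checked with a few lines of plain Python. This is only an illustrative sketch, not PyRTL and not part of the skeleton file; the helper name split_addr is made up for this example.

```python
# Field widths from the lab spec: 4 offset bits, 4 index bits, 24 tag bits.
OFFSET_BITS = 4
INDEX_BITS = 4
TAG_BITS = 32 - INDEX_BITS - OFFSET_BITS

def split_addr(addr):
    """Return (tag, index, offset) for a 32-bit address (hypothetical helper)."""
    addr &= 0xFFFFFFFF
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

tag, index, offset = split_addr(0x12345678)
print(hex(tag), index, offset)  # 0x123456 7 8
```

In your PyRTL design the same split would normally be expressed by slicing req_addr, keeping in mind that bit 0 of a wire is its least significant bit.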
Figure 3: Composition of a requested memory address
Figure 4: Composition of cache rows within some way n
Expected Cache Behaviour
When req_new = 1, that means your cache is receiving a new request for the memory address provided by the req_addr Input. When req_new = 0, your cache should output 0 for both resp_hit and resp_data and remain unchanged (i.e., no writes should occur).
Handling Read Hits
When req_new = 1 and req_type = 0, that means that this is a read request. If req_addr hits in your cache, you should return the requested data as your resp_data. Output “1” for resp_hit.
Handling Write Hits
When req_new = 1 and req_type = 1, that means that this is a write request. If req_addr hits in your cache, replace the appropriate word with req_data and return “1” as your resp_hit. Output “0” as your resp_data.
Handling Read Misses
On a cache miss, our cache will not access a larger memory as would occur in a regular memory hierarchy. You may instead assume the new block’s contents are “0”. If you miss on req_addr, you should return “0” as your resp_data and output “0” as your resp_hit. The replaced data block should be set to “0” and marked valid for future accesses.
Handling Write Misses
On a write miss, resp_data and resp_hit should output “0” as with a read miss. A new entry should be added to the cache for req_addr, whose data block should consist of req_data at the appropriate word location and the rest of the words in the block set to “0”. As with a read miss, the new block should be made valid.
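One way to sanity-check your understanding of the four cases is a small software model that you can compare against your simulation traces. The sketch below is plain Python, not PyRTL, and is not the required implementation. It assumes each data block is held as one 128-bit integer with Word 0 in the least significant bits, and that every row's replacement pointer starts at Way 0; the class and attribute names are invented for this example.

```python
NUM_ROWS, NUM_WAYS = 16, 4
WORD_MASK = 0xFFFFFFFF

class CacheModel:
    """Behavioral model of the lab's cache (hypothetical helper, not PyRTL)."""

    def __init__(self):
        self.valid = [[0] * NUM_ROWS for _ in range(NUM_WAYS)]
        self.tag = [[0] * NUM_ROWS for _ in range(NUM_WAYS)]
        self.data = [[0] * NUM_ROWS for _ in range(NUM_WAYS)]  # 128-bit ints
        self.repl_way = [0] * NUM_ROWS  # assumed to start at Way 0

    def access(self, req_new, req_type, req_addr, req_data):
        """Return (resp_hit, resp_data) and update the cache state."""
        if not req_new:
            return 0, 0  # no request: outputs are 0 and nothing is written
        shift = 32 * ((req_addr >> 2) & 0b11)  # bit position of the word
        row = (req_addr >> 4) & 0xF
        tag = (req_addr >> 8) & 0xFFFFFF
        hit_way = next((w for w in range(NUM_WAYS)
                        if self.valid[w][row] and self.tag[w][row] == tag), None)
        if hit_way is not None:
            block = self.data[hit_way][row]
            if req_type == 0:  # read hit: return the selected word
                return 1, (block >> shift) & WORD_MASK
            # write hit: replace only the selected word, resp_data is 0
            block &= ~(WORD_MASK << shift)
            self.data[hit_way][row] = block | ((req_data & WORD_MASK) << shift)
            return 1, 0
        # miss: fill the round-robin victim in the same cycle
        way = self.repl_way[row]
        self.repl_way[row] = (way + 1) % NUM_WAYS
        self.valid[way][row] = 1
        self.tag[way][row] = tag
        # read miss: all-zero block; write miss: req_data in its word, rest 0
        self.data[way][row] = ((req_data & WORD_MASK) << shift) if req_type else 0
        return 0, 0

cache = CacheModel()
print(cache.access(1, 0, 0x100, 0))       # read miss
print(cache.access(1, 1, 0x104, 0xABCD))  # write hit, Word 1 of the same block
print(cache.access(1, 0, 0x104, 0))       # read hit on the word just written
```

The round-robin pointer used on a miss is the policy described in the replacement section of this handout.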
Round-robin Cache Replacement Policy
For this assignment, we want you to implement your cache such that it inserts new entries into the cache following a round-robin policy per row. When you miss on req_addr, you have to replace an existing entry in the cache row to make room for the data at req_addr. req_addr maps to some Row i, and there are 4 slots in that row where the new entry can be placed. You will need to use the repl_way MemBlock structure to help you keep track of which slot you should replace next for each row. For example, suppose you had to replace entries in some Row i repeatedly. First, you would replace the slot in Way 0, then Way 1, then Way 2, then Way 3, and then back to Way 0.
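The per-row pointer behavior can be sketched in plain Python as follows. This is only a model of the idea: the list named repl_way stands in for the repl_way MemBlock, the function name pick_victim is hypothetical, and starting every pointer at Way 0 is an assumption consistent with the example order above.

```python
NUM_ROWS = 16
NUM_WAYS = 4

# One replacement pointer per row; all start at Way 0 (assumed initial state).
repl_way = [0] * NUM_ROWS

def pick_victim(row):
    """Return the way to replace in `row`, then advance that row's pointer."""
    victim = repl_way[row]
    repl_way[row] = (victim + 1) % NUM_WAYS  # wrap from Way 3 back to Way 0
    return victim

# Six consecutive misses in Row 5 cycle through the ways in order.
victims = [pick_victim(5) for _ in range(6)]
print(victims)  # [0, 1, 2, 3, 0, 1]
```

Note that the pointer for a row advances only when that row takes a miss; hits and misses in other rows leave it unchanged.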
Design Note
Note about byte order in Data Block: For the purposes of this lab, Byte 0 in the Data Block is the least significant byte, meaning that Word 0 is also the least significant word (as shown in Figure 5). You would otherwise know this as little-endian order.
Figure 5: Byte Ordering Within the Data Block
You may assume for the purposes of this lab that your access addresses (req_addr) always have the two LSBs as 0. All accesses would correspond with load word or store word instructions.
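To make the byte ordering concrete, here is a short plain-Python sketch of how a word is located inside a block. It assumes the block is treated as a single 128-bit value, which may or may not match how the provided MemBlocks are laid out; the helper names are made up for this example.

```python
def pack_block(words):
    """Pack [word0, word1, word2, word3] into one 128-bit int, Word 0 in the LSBs."""
    block = 0
    for i, word in enumerate(words):
        block |= (word & 0xFFFFFFFF) << (32 * i)
    return block

def word_index(offset):
    """The two LSBs of the offset are always 0, so bits [3:2] select the word."""
    return (offset >> 2) & 0b11

def read_word(block, offset):
    """Extract the 32-bit word selected by a 4-bit byte offset."""
    return (block >> (32 * word_index(offset))) & 0xFFFFFFFF

block = pack_block([0x11111111, 0x22222222, 0x33333333, 0x44444444])
print(hex(block))                  # 0x44444444333333332222222211111111
print(hex(read_word(block, 0x0)))  # 0x11111111 (Word 0)
print(hex(read_word(block, 0xC)))  # 0x44444444 (Word 3)
```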
Test your Design!
We have provided you with an environment in which you may test your cache. We have provided you with some barebones test cases. Within the main function of the skeleton files, you will see code that tests that your cache correctly misses, hits, performs reads/writes, and maps the address to the correct index (i.e., row) in the cache. Note: these tests are NOT comprehensive, and you should write additional tests independently to check for edge cases.
Additional test cases to consider:
● Reading/writing to a word that is NOT Word 0
● Miss on a write request (the existing WriteTest only tests when you write to an existing entry in your cache)
● Filling the entire cache
● Replacement
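When writing your own tests, it helps to build addresses from the fields you want to exercise rather than typing hex values by hand. The helper below is a plain-Python sketch with a hypothetical name (make_addr); it only constructs addresses and leaves the choice of trace and expected outputs to you.

```python
def make_addr(tag, index, word):
    """Build a 32-bit address from a 24-bit tag, 4-bit row index, and word number 0-3."""
    return ((tag & 0xFFFFFF) << 8) | ((index & 0xF) << 4) | ((word & 0b11) << 2)

# Five different tags that all map to Row 3. Reading them in order causes five
# misses in a 4-way row, so the fifth one must replace an earlier entry.
trace = [make_addr(tag, 3, 0) for tag in range(1, 6)]
print([hex(addr) for addr in trace])  # ['0x130', '0x230', '0x330', '0x430', '0x530']
```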
Files to Submit
For this lab, you simply need to submit ucsbcs154lab7_4waycache.py. Please double-check that you are not including Python imports at the top of your file unrelated to this lab (e.g., tkinter). These imports may prevent the autograder from running your submission.
Please keep your eyes open for any Piazza announcements in case we make updates to the submission requirements.
Autograder
The autograder will reveal the name of the test, but we will NOT be releasing the actual test cases themselves. Part of this assignment is to learn how to test your code! For this lab, you may not share the test code with other students in the course. You may discuss testing strategy in terms of behaviors but not the specifics of the trace of memory access addresses. We want you to think through and develop an understanding of cache indexing.
TLDR (but please read the instructions)
● What are you implementing?
○ A 4-way, set-associative cache
● What are the cache specs?
○ 32 bit addresses
○ 4 ways
○ 16 rows
○ 16 bytes (4 words) of data per block
● What replacement policy should you follow?
○ Round-Robin
● When do you replace an entry in the cache?
○ In the same cycle you had a miss
● How should you split up “req_addr”?
○ tag (24 bits) | index (4 bits) | offset (4 bits)
● What should your cache output?
○ Refer to the table below. Note that “x” values are “don’t care”.