Monday, December 16, 2013

Address Subdivision

·         32-bit byte address
·         A direct-mapped cache
·         The cache size is 2^n blocks, so n bits are used for the index
·         The block size is 2^m words (2^(m+2) bytes), so m bits select the word within the block and 2 bits select the byte within the word
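
The field layout above can be sketched in a few lines of Python. This is only an illustration: the function name split_address and the argument order are assumptions made for this example, and it assumes 4-byte words as in the list above.

```python
def split_address(addr, n, m):
    """Split a 32-bit byte address into (tag, index, word_offset, byte_offset).

    n: number of index bits (the cache has 2**n blocks)
    m: number of word-offset bits (each block has 2**m words of 4 bytes)
    """
    if not 0 <= addr < 2**32:
        raise ValueError("addr must fit in 32 bits")
    byte_offset = addr & 0b11                     # lowest 2 bits: byte within word
    word_offset = (addr >> 2) & ((1 << m) - 1)    # next m bits: word within block
    index = (addr >> (m + 2)) & ((1 << n) - 1)    # next n bits: cache index
    tag = addr >> (n + m + 2)                     # remaining 32-(n+m+2) bits
    return tag, index, word_offset, byte_offset


# 64 blocks (n = 6) of 4 words each (m = 2), byte address 1200
print(split_address(1200, 6, 2))  # (1, 11, 0, 0)
```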

Accessing Cache

▪ Total number of bits needed for a cache
1)    The size of the tag field is
o   32 - (n + m + 2)
2)    The total number of bits in a direct-mapped cache is
o   2^n x (block size + tag size + valid field size)
▪ Each block holds 2^m words of 32 bits and the valid field is 1 bit, so the total number of bits in the cache is
-       2^n x (2^m x 32 + (32 - n - m - 2) + 1) = 2^n x (2^m x 32 + 31 - n - m)
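
As a quick check of the formula, here is a minimal sketch. The name cache_total_bits is made up for this example; it assumes 32-bit addresses, 32-bit words and a 1-bit valid field, as stated above.

```python
def cache_total_bits(n, m):
    """Total storage bits of a direct-mapped cache with 2**n blocks of 2**m words."""
    tag_bits = 32 - (n + m + 2)       # address bits left after index and offsets
    data_bits = (2**m) * 32           # 2**m words of 32 bits per block
    valid_bits = 1
    return 2**n * (data_bits + tag_bits + valid_bits)


# 1024 blocks (n = 10) of 4 words (m = 2): 16 KiB of data
total = cache_total_bits(10, 2)
print(total, total / 1024)  # 150528 147.0
```

In this case the cache stores 147 Kibit to hold 128 Kibit of data, because each block also carries an 18-bit tag and a valid bit.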


Example: Large Block Size
▪ 64 blocks, 16 bytes/block
o   To what block number does address 1200 map?
▪ Block address = floor(1200 / 16) = 75
▪ Block number = 75 modulo 64 = 11
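
The same arithmetic as a small sketch (the function name map_to_block is an assumption for illustration):

```python
def map_to_block(byte_address, num_blocks, bytes_per_block):
    """Return (block_address, block_number) for a direct-mapped cache."""
    block_address = byte_address // bytes_per_block   # integer division drops the offset
    block_number = block_address % num_blocks         # cache index
    return block_address, block_number


print(map_to_block(1200, 64, 16))  # (75, 11)
```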

 

Block Size Considerations
▪ Larger blocks should reduce miss rate
o   Due to spatial locality
▪ But in a fixed-size cache
o   Larger blocks -> fewer of them
§  More competition -> increased miss rate
o   Larger blocks -> pollution
▪ Larger miss penalty
o   Can override benefit of reduced miss rate
o   Early restart and critical-word-first can help

Cache Misses
▪ On cache hit, CPU proceeds normally
▪ On cache miss
o   Stall the CPU pipeline
o   Fetch block from next level of hierarchy
o   Instruction cache miss
-       Restart Instruction fetch
o   Data cache miss
-       Complete data access

Write-Through
▪ On data-write hit, could just update the block in cache
-       But then cache and memory would be inconsistent
▪ Write through: also update memory
▪ But makes writes take longer
-       e.g., if base CPI = 1, 10% of instructions are stores, and a write to memory takes 100 cycles
§   Effective CPI = 1 + 0.1 x 100 = 11
▪ Solution: write buffer
-       Holds data waiting to be written to memory
-       CPU continues immediately
§  Only stalls on write if write buffer is already full
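
The CPI example above, with no write buffer, can be reproduced with a one-line model. The function name effective_cpi is an assumption for this sketch, and it assumes every store stalls the CPU for the full memory write time.

```python
def effective_cpi(base_cpi, store_fraction, write_cycles):
    """CPI when every store stalls for write_cycles (write-through, no write buffer)."""
    return base_cpi + store_fraction * write_cycles


# base CPI = 1, 10% stores, 100 cycles per memory write
print(effective_cpi(1, 0.1, 100))
```

With a write buffer that never fills, the stall term disappears and the CPI stays close to the base value.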

Write-Back
▪ Alternative: On data-write hit, just update the block in cache
-       keep track of whether each block is dirty
▪ When a dirty block is replaced
-       Write it back to memory
-       Can use a write buffer to allow the replacing block to be read first
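
A toy model of the dirty-bit bookkeeping, as a rough sketch only. The class name WriteBackCache and its fields are invented for this example; it uses one value per block, fetches the block on a write miss, and leaves out the write buffer.

```python
class WriteBackCache:
    """Toy direct-mapped write-back cache with one value per block."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.lines = [None] * num_blocks   # each line: [tag, value, dirty] or None
        self.memory = {}                   # backing store: block address -> value
        self.writebacks = 0

    def _lookup(self, block_addr):
        index = block_addr % self.num_blocks
        tag = block_addr // self.num_blocks
        line = self.lines[index]
        if line is None or line[0] != tag:             # miss: replace the line
            if line is not None and line[2]:           # victim is dirty
                victim_addr = line[0] * self.num_blocks + index
                self.memory[victim_addr] = line[1]     # write it back to memory
                self.writebacks += 1
            line = [tag, self.memory.get(block_addr, 0), False]
            self.lines[index] = line
        return line

    def read(self, block_addr):
        return self._lookup(block_addr)[1]

    def write(self, block_addr, value):
        line = self._lookup(block_addr)
        line[1] = value
        line[2] = True                                 # mark the block dirty


cache = WriteBackCache(4)
cache.write(1, 10)
cache.write(1, 20)            # hits in cache; memory is not touched
print(cache.memory)           # {}
cache.read(5)                 # block 5 maps to the same line and evicts block 1
print(cache.memory)           # {1: 20}
```

Two writes to the same block cost a single memory write, and only when the block is evicted.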

Write Allocation
▪ What should happen on a write miss?
▪ Alternatives for write-through
-       Allocate on miss: fetch the block
-       Write around: don’t fetch the block
o   Since programs often write a whole block before reading it (e.g., initialization)
▪ For write-back
-       Usually fetch the block
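
To make the two write-through alternatives concrete, the sketch below counts how many blocks are fetched from memory for a sequence of writes under each policy. The function count_block_fetches and its parameters are assumptions for illustration; it models only write misses on a direct-mapped cache and ignores reads.

```python
def count_block_fetches(write_addresses, num_blocks, bytes_per_block, allocate_on_miss):
    """Count blocks fetched from memory while servicing a sequence of writes."""
    resident = [None] * num_blocks     # block address currently held by each line
    fetches = 0
    for addr in write_addresses:
        block_addr = addr // bytes_per_block
        index = block_addr % num_blocks
        if resident[index] != block_addr:      # write miss
            if allocate_on_miss:
                resident[index] = block_addr   # fetch the block into the cache
                fetches += 1
            # write around: update memory only, leave the cache unchanged
    return fetches


# Initializing 64 bytes word by word, 64 blocks of 16 bytes each
writes = list(range(0, 64, 4))
print(count_block_fetches(writes, 64, 16, allocate_on_miss=True))   # 4
print(count_block_fetches(writes, 64, 16, allocate_on_miss=False))  # 0
```

In an initialization loop like this one, every fetched block is overwritten in full before it is read, so the fetches under allocate-on-miss bring in data that is never used.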












