
Lecture 14

Direct-mapped cache

Adapted from Computer Organization and Design, 4th edition, Patterson and Hennessy

A Typical Memory Hierarchy

Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology

[Figure: memory hierarchy, from closest to the processor to farthest]
On-Chip Components: Control, Datapath, RegFile, ITLB, DTLB, Instr Cache, Data Cache
Second Level Cache (SRAM)
Main Memory (DRAM)
Secondary Memory (Disk)

Speed (% cycles): 1s, 10s, 100s, 10,000s (fastest to slowest)
Size (bytes): 100s, 10Ks, Ms, Gs, Ts (smallest to largest)
Cost: highest to lowest

Characteristics of the Memory Hierarchy

Unit of transfer between adjacent levels:
Processor <-> L1$: 4-8 bytes (word)
L1$ <-> L2$: 8-32 bytes (block)
L2$ <-> Main Memory: 1 to 4 blocks
Main Memory <-> Secondary Memory: 1,024+ bytes (disk sector = page)

Increasing distance from the processor in access time

Inclusive: what is in L1$ is a subset of what is in L2$, which is a subset of what is in MM, which is a subset of what is in SM

[Figure: (relative) size of the memory at each level]

Cache Basics
Two questions to answer (in hardware):
  Q1: How do we know if a data item is in the cache?
  Q2: If it is, how do we find it?

Direct mapped
  Each memory block is mapped to exactly one block in the cache
    lots of lower level blocks must share blocks in the cache

  Address mapping (to answer Q2):
    (block address) modulo (# of blocks in the cache)
  Have a tag associated with each cache block that contains the address information (the upper portion of the address) required to identify the block (to answer Q1)
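
The mapping rule above can be sketched in a few lines of Python. This is only an illustration: the function name `direct_mapped_slot` and its parameters are made up for this example, and the tag is taken to be the block address bits left over after the modulo.

```python
def direct_mapped_slot(block_address, num_cache_blocks):
    """Return (index, tag) for a block address in a direct-mapped cache."""
    # Q2: the index is the block address modulo the number of cache blocks.
    index = block_address % num_cache_blocks
    # Q1: the remaining upper portion of the address is stored as the tag.
    tag = block_address // num_cache_blocks
    return index, tag


# Blocks 5 and 13 share cache block 1 in a 4-block cache;
# only their tags tell them apart.
print(direct_mapped_slot(5, 4))
print(direct_mapped_slot(13, 4))
```

Because many memory blocks map to the same index, the tag comparison is what answers Q1.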

Caching: A Simple First Example

Cache (4 blocks)
Index   Valid   Tag   Data
00
01
10
11

Main Memory (16 one-word blocks)
0000xx  0001xx  0010xx  0011xx
0100xx  0101xx  0110xx  0111xx
1000xx  1001xx  1010xx  1011xx
1100xx  1101xx  1110xx  1111xx

One word blocks
Two low order bits define the byte in the word (32b words)

Q1: Is it there?
Compare the cache tag to the high order 2 memory address bits to tell if the memory block is in the cache

Q2: How do we find it?
Use next 2 low order memory address bits, the index, to determine which cache block (i.e., modulo the number of blocks in the cache)

(block address) modulo (# of blocks in the cache)
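
As a rough sketch of the field split in this example, the 6-bit byte address can be cut into tag, index, and byte offset with shifts and masks. The function name `split_address` is hypothetical; the field widths (2, 2, 2) are the ones described above.

```python
def split_address(addr):
    """Split a 6-bit byte address into (tag, index, byte_offset)."""
    byte_offset = addr & 0b11      # two low order bits: byte within the word
    index = (addr >> 2) & 0b11     # next two bits: which cache block
    tag = (addr >> 4) & 0b11       # high order two bits: compared with the cache tag
    return tag, index, byte_offset


tag, index, byte_offset = split_address(0b110110)
# Taking the index bits is the same as (block address) modulo 4.
assert index == (0b110110 >> 2) % 4
```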


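MIPS Direct Mapped Cache Example
One word blocks, cache size = 1K words (or 4KB)

[Figure: the 32-bit address (bits 31 30 ... 13 12 11 ... 2 1 0) is split into a 20-bit Tag, a 10-bit Index, and a 2-bit Byte offset. The Index selects one of 1024 entries (0, 1, 2, ..., 1021, 1022, 1023), each with Valid, Tag (20 bits), and Data (32 bits) fields. Outputs: Hit and Data (32 bits).]

What kind of locality are we taking advantage of?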
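
A minimal software model of this lookup, assuming a hit means the selected entry is valid and its stored tag equals the address tag. The class and method names (`DirectMappedCache`, `lookup`, `fill`) are invented for this sketch, not taken from the lecture.

```python
NUM_ENTRIES = 1024  # 1K one-word blocks


class DirectMappedCache:
    def __init__(self):
        self.valid = [False] * NUM_ENTRIES
        self.tag = [0] * NUM_ENTRIES
        self.data = [0] * NUM_ENTRIES

    @staticmethod
    def fields(addr):
        index = (addr >> 2) & 0x3FF    # bits 11..2: 10-bit index
        tag = (addr >> 12) & 0xFFFFF   # bits 31..12: 20-bit tag
        return tag, index

    def lookup(self, addr):
        """Return (hit, data); data is None on a miss."""
        tag, index = self.fields(addr)
        hit = self.valid[index] and self.tag[index] == tag
        return hit, (self.data[index] if hit else None)

    def fill(self, addr, word):
        """Install a word fetched from the next memory level."""
        tag, index = self.fields(addr)
        self.valid[index], self.tag[index], self.data[index] = True, tag, word
```

Two addresses that differ only in their upper 20 bits select the same entry, so filling one evicts the other.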

Direct Mapped Cache
Consider the main memory word reference string 0 1 2 3 4 3 4 15
Start with an empty cache, all blocks initially marked as not valid

Cache contents after each access, shown as tag Mem(word) per cache index:

Access  Result  Index 00    Index 01    Index 10    Index 11
0       miss    00 Mem(0)
1       miss    00 Mem(0)   00 Mem(1)
2       miss    00 Mem(0)   00 Mem(1)   00 Mem(2)
3       miss    00 Mem(0)   00 Mem(1)   00 Mem(2)   00 Mem(3)
4       miss    01 Mem(4)   00 Mem(1)   00 Mem(2)   00 Mem(3)
3       hit     01 Mem(4)   00 Mem(1)   00 Mem(2)   00 Mem(3)
4       hit     01 Mem(4)   00 Mem(1)   00 Mem(2)   00 Mem(3)
15      miss    01 Mem(4)   00 Mem(1)   00 Mem(2)   11 Mem(15)

8 requests, 6 misses
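
The trace can be replayed with a short simulation. This is a sketch under the slide's assumptions (four one-word blocks, cache initially empty); the function name `simulate_direct_mapped` is made up for illustration.

```python
def simulate_direct_mapped(refs, num_blocks=4):
    """Return a list of (word_address, 'hit' or 'miss') for one-word blocks."""
    tags = [None] * num_blocks          # None marks a block that is not valid
    results = []
    for word in refs:
        index = word % num_blocks       # which cache block
        tag = word // num_blocks        # upper address bits
        if tags[index] == tag:
            results.append((word, "hit"))
        else:
            tags[index] = tag           # replace whatever was there
            results.append((word, "miss"))
    return results


trace = simulate_direct_mapped([0, 1, 2, 3, 4, 3, 4, 15])
misses = sum(1 for _, outcome in trace if outcome == "miss")
print(misses, "misses out of", len(trace), "requests")  # 6 misses out of 8 requests
```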

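Multiword Block Direct Mapped Cache
Four words/block, cache size = 1K words

[Figure: the 32-bit address (bits 31 30 ... 13 12 11 ... 4 3 2 1 0) is split into a 20-bit Tag, an 8-bit Index, a Block offset (bits 3-2), and a Byte offset (bits 1-0). The Index selects one of 256 entries (0, 1, 2, ..., 253, 254, 255), each with Valid, Tag (20 bits), and Data fields; the Block offset selects one 32-bit word within the block. Outputs: Hit and Data (32 bits).]

What kind of locality are we taking advantage of?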
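
For comparison with the one-word-block case, here is a hedged sketch of the address split for this organization. The function name `decode_multiword_address` is hypothetical; the widths follow from 256 blocks of four 4-byte words (20 + 8 + 2 + 2 = 32 bits).

```python
def decode_multiword_address(addr):
    """Split a 32-bit address into (tag, index, block_offset, byte_offset)."""
    byte_offset = addr & 0x3           # bits 1..0: byte within the word
    block_offset = (addr >> 2) & 0x3   # bits 3..2: word within the 4-word block
    index = (addr >> 4) & 0xFF         # bits 11..4: one of 256 blocks
    tag = (addr >> 12) & 0xFFFFF       # bits 31..12: 20-bit tag
    return tag, index, block_offset, byte_offset
```

The tag width is unchanged from the one-word-block cache; two index bits have simply become the block offset.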

Taking Advantage of Spatial Locality
Let cache block hold more than one word
Reference string: 0 1 2 3 4 3 4 15
Start with an empty cache, all blocks initially marked as not valid

Cache contents after each access, shown as tag Mem(word) Mem(word) per cache block:

Access  Result  Block 0               Block 1
0       miss    00 Mem(1) Mem(0)
1       hit     00 Mem(1) Mem(0)
2       miss    00 Mem(1) Mem(0)      00 Mem(3) Mem(2)
3       hit     00 Mem(1) Mem(0)      00 Mem(3) Mem(2)
4       miss    01 Mem(5) Mem(4)      00 Mem(3) Mem(2)
3       hit     01 Mem(5) Mem(4)      00 Mem(3) Mem(2)
4       hit     01 Mem(5) Mem(4)      00 Mem(3) Mem(2)
15      miss    01 Mem(5) Mem(4)      11 Mem(15) Mem(14)

8 requests, 4 misses
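
The same reference string can be replayed with two-word blocks. The sketch below assumes the organization implied by the trace (two blocks of two words each); `simulate_multiword` is an illustrative name.

```python
def simulate_multiword(refs, num_blocks=2, words_per_block=2):
    """Return a list of (word_address, 'hit' or 'miss') for multiword blocks."""
    tags = [None] * num_blocks                 # None marks a block that is not valid
    results = []
    for word in refs:
        block = word // words_per_block        # block address
        index = block % num_blocks
        tag = block // num_blocks
        if tags[index] == tag:
            results.append((word, "hit"))
        else:
            tags[index] = tag                  # the whole block is brought in on a miss
            results.append((word, "miss"))
    return results


trace = simulate_multiword([0, 1, 2, 3, 4, 3, 4, 15])
misses = sum(1 for _, outcome in trace if outcome == "miss")
print(misses, "misses out of", len(trace), "requests")  # 4 misses out of 8 requests
```

The accesses to words 1 and 3 now hit because a neighboring word already brought their block in.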

Miss Rate vs Block Size vs Cache Size

[Figure: miss rate (%), on a scale from 0 to 10, plotted against block size (bytes: 16, 32, 64, 128, 256), with one curve per cache size: 8 KB, 16 KB, 64 KB, 256 KB]

Miss rate goes up if the block size becomes a significant fraction of the cache size because the number of blocks that can be held in the same size cache is smaller (increasing capacity misses)
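
The arithmetic behind that statement is simple enough to check directly. The helper `num_blocks` below is a made-up name; the sizes are the ones on the plot's axes and legend.

```python
def num_blocks(cache_bytes, block_bytes):
    """Number of blocks a cache of the given size can hold."""
    return cache_bytes // block_bytes


for cache_kb in (8, 16, 64, 256):
    counts = [num_blocks(cache_kb * 1024, b) for b in (16, 32, 64, 128, 256)]
    print(f"{cache_kb:>3} KB cache:", counts)
```

For the smallest cache, going from 16-byte to 256-byte blocks cuts the block count from 512 to 32, which is where the added misses come from.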

Handling Cache Misses (Single Word Blocks)

Read misses (I$ and D$)
  stall the pipeline, fetch the block from the next level in the memory hierarchy, install it in the cache and send the requested word to the processor, then let the pipeline resume

Write misses (D$ only)
  1. stall the pipeline, fetch the block from next level in the memory hierarchy, install it in the cache (which may involve having to evict a dirty block if using a write-back cache), write the word from the processor to the cache, then let the pipeline resume
  or
  2. Write allocate: just write the word into the cache updating both the tag and data, no need to check for cache hit, no need to stall
  or
  3. No-write allocate: skip the cache write (but must invalidate that cache block since it will now hold stale data) and just write the word to the write buffer (and eventually to the next memory level), no need to stall if the write buffer isn't full
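
Options 2 and 3 can be contrasted with a small model. This is only a sketch of the behavior as described above for one-word blocks; the cache size, the write buffer depth, and all function names are assumptions made for the example.

```python
from collections import deque

NUM_BLOCKS = 4           # hypothetical cache size, in one-word blocks
WRITE_BUFFER_SLOTS = 2   # hypothetical write buffer depth


def make_cache():
    return [{"valid": False, "tag": None, "data": None} for _ in range(NUM_BLOCKS)]


def write_allocate(cache, word_addr, value):
    """Option 2: overwrite tag and data without checking for a hit."""
    index, tag = word_addr % NUM_BLOCKS, word_addr // NUM_BLOCKS
    cache[index] = {"valid": True, "tag": tag, "data": value}


def no_write_allocate(cache, write_buffer, word_addr, value):
    """Option 3: invalidate the cache block and queue the word in the write buffer.

    Returns True if the processor would have to stall (write buffer full).
    """
    if len(write_buffer) >= WRITE_BUFFER_SLOTS:
        return True
    cache[word_addr % NUM_BLOCKS]["valid"] = False
    write_buffer.append((word_addr, value))
    return False


cache, buffer = make_cache(), deque()
write_allocate(cache, 6, 99)                             # word 6 now lives in cache block 2
stalled = no_write_allocate(cache, buffer, 6, 7)         # block 2 invalidated, write queued
```

Draining the write buffer to the next memory level is left out of the sketch.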

Multiword Block Considerations

Read misses (I$ and D$)
  Processed the same as for single word blocks: a miss returns the entire block from memory
  Miss penalty grows as block size grows
    Early restart: processor resumes execution as soon as the requested word of the block is returned
    Requested word first: requested word is transferred from the memory to the cache (and processor) first
  Nonblocking cache: allows the processor to continue to access the cache while the cache is handling an earlier miss

Write misses (D$)
  If using write allocate must first fetch the block from memory and then write the word to the block (or could end up with a garbled block in the cache, e.g., for 4 word blocks, a new tag, one word of data from the new block, and three words of data from the old block)
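
The garbled-block case can be made concrete with a toy cache entry. The function `write_miss` and the placeholder word values are invented for this sketch; passing `fetched_block=None` models the incorrect shortcut of skipping the fetch.

```python
def write_miss(entry, new_tag, offset, value, fetched_block=None):
    """Apply a write-allocate write miss to one 4-word cache entry.

    fetched_block: the new block's words from memory, or None to skip the fetch.
    """
    if fetched_block is not None:
        entry["data"] = list(fetched_block)   # install the new block first
    entry["tag"] = new_tag
    entry["data"][offset] = value             # then write the processor's word
    return entry


old = {"tag": 0, "data": ["old0", "old1", "old2", "old3"]}

garbled = write_miss(dict(old, data=list(old["data"])), 1, 2, "NEW")
correct = write_miss(dict(old, data=list(old["data"])), 1, 2, "NEW",
                     fetched_block=["new0", "new1", "new2", "new3"])
print(garbled["data"])   # ['old0', 'old1', 'NEW', 'old3']
print(correct["data"])   # ['new0', 'new1', 'NEW', 'new3']
```

Without the fetch, the entry carries the new tag but three of its four words still belong to the old block.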
