Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
525 views
in Technique[技术] by (71.8m points)

cpu architecture - How does Load Store Queue work in the presence of MSHR?

I understand the basic working of load-store queue, which is

  1. when loads compute their address, they check the store queue for any prior stores to the same address and if there is one then they gets the data from the most recent store else from write buffer or data cache.
  2. When stores compute their address, they check load queue for any load violations

My doubt is what happens when

  1. In the first case when the load access data cache due to some unresolved store addresses in the store queue and the access is miss in L1 data cache and before the data can be retrieved from the cache, the store address resolves. Now, the store does load queue checking for any violations. The dependent load has already accessed the data cache prior but didn't receive the value from cache yet due to long latency miss. Does the store post load violation or does it do store-to-load forwarding and cancel the data from cache?

  2. When load access miss in the l1 data cache, then the loads are placed in MSHR so as to not block the execute stage. When the miss resolves, the MSHR entry for that load has information regarding destination register and physical address. So the value can be updated in the physical register but how does the MSHR communicate with load queue that the value is available? when does this happen in the pipeline stage? Because I have read somewhere that MSHR store physical addresses and Load-store queue store virtual addresses. So how does MSHR communicate with LSQ?

I haven't found any resources regarding these doubts.

question from:https://stackoverflow.com/questions/65851556/how-does-load-store-queue-work-in-the-presence-of-mshr

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
  1. This is speculative execution where loads bypass older stores. When the older store is resolved, we can throw a load violation. If the probability of address aliasing is low then speculative execution is profitable (more throughput) - typically should be true for programs. On detecting a load violation, we can take appropriate step - (a) store-to-load forward, or (b) rollback pipeline to the resolved store.

  2. Same as when loads are served via cache hits (that can take 1-3 cycles for a L1 hit). For example in a reservation station with a CDB (common data bus), the result will be shared with all HW structures that need it.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...