Core Architecture
Fetch
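In the Core, the Fetch module connects mainly to the I-Cache (instruction cache), the Decode module, and the Scheduler module.
Its main function is instruction fetch: it reads one instruction from the instruction cache and passes it to the next stage, the Decode module.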
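The fetch step can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the class names `ICache` and `Fetch`, the 4-byte instruction width, and the `redirect` hook (standing in for a PC update coming from the Scheduler) are hypothetical and not taken from the actual design.

```python
from collections import deque


class ICache:
    """Toy instruction cache: maps a byte address to one instruction word."""

    def __init__(self, words, base=0):
        self.base = base
        self.words = list(words)

    def read(self, pc):
        # Assumption: fixed 4-byte instructions, always a cache hit.
        return self.words[(pc - self.base) // 4]


class Fetch:
    """Reads one instruction per step and hands it to the decode stage."""

    def __init__(self, icache, start_pc=0):
        self.icache = icache
        self.pc = start_pc
        self.to_decode = deque()  # stands in for the Fetch -> Decode interface

    def step(self):
        inst = self.icache.read(self.pc)
        self.to_decode.append((self.pc, inst))
        self.pc += 4

    def redirect(self, new_pc):
        # Hypothetical hook: a control-flow decision elsewhere changes the PC.
        self.pc = new_pc
```

Calling `step()` repeatedly queues `(pc, instruction)` pairs for the decode stage in program order; a real implementation would also handle cache misses and back-pressure, which this sketch leaves out.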
Decode
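In the Core, the Decode module connects mainly to the Fetch, Scheduler, and Scoreboard modules. Its main function is instruction decoding: it decodes the instruction binary obtained by Fetch and places the result in the corresponding Ibuffer slot for use by the Scheduler and Operand Collector (OC) modules.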
Scheduler
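In the Core, the Scheduler module connects mainly to the Decode module, the Operand Collector module, and the SIMT Stack module. Its main functions are to take one valid instruction from the Decode module's Ibuffer and issue it to the Operand Collector module, and to handle control-flow instructions.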
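The issue step can be illustrated with a minimal sketch. The round-robin selection policy, the `IBufferEntry` layout, and the method names are assumptions made for illustration; the text above only states that one valid instruction is taken from the Ibuffer. Control-flow handling via the SIMT Stack is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IBufferEntry:
    valid: bool = False
    inst: Optional[str] = None  # decoded instruction (a string in this toy model)


class Scheduler:
    """Picks one valid Ibuffer entry per call, scanning round-robin (assumed policy)."""

    def __init__(self, ibuffer: List[IBufferEntry]):
        self.ibuffer = ibuffer
        self.next_slot = 0

    def issue(self) -> Optional[Tuple[int, str]]:
        n = len(self.ibuffer)
        for offset in range(n):
            slot = (self.next_slot + offset) % n
            entry = self.ibuffer[slot]
            if entry.valid:
                entry.valid = False               # slot is free for Decode again
                self.next_slot = (slot + 1) % n   # resume after the issued slot
                return slot, entry.inst
        return None                               # nothing ready this cycle
```

Starting the scan after the last issued slot keeps any one slot from being starved, which is one common reason to prefer round-robin over a fixed priority order.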
Operand Collector
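The Operand Collector (OC) module gathers operands and connects mainly to the Scheduler and Execution modules. It takes pending instructions from the Scheduler module's per-type FIFOs, reads data from the register file module and static memory to supply the operands each instruction needs, and sends the instruction to the Execution module once all of its operands are ready.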
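A rough behavioral sketch of operand collection follows. It assumes a single FIFO, a register file modeled as a dictionary, and a limited number of read ports per step; all names (`OperandCollector`, `read_ports`, `to_execute`) are hypothetical, and the static memory path is omitted.

```python
from collections import deque


class OperandCollector:
    """Gathers source operands for one instruction at a time."""

    def __init__(self, register_file, read_ports=1):
        self.rf = register_file        # assumed: dict of register index -> value
        self.read_ports = read_ports   # register reads allowed per step
        self.fifo = deque()            # (op, source_registers) from the Scheduler
        self.current = None            # instruction currently collecting operands
        self.to_execute = []           # stands in for the OC -> Execution interface

    def step(self):
        if self.current is None and self.fifo:
            op, srcs = self.fifo.popleft()
            self.current = {"op": op, "srcs": list(srcs), "values": []}
        if self.current is None:
            return
        cu = self.current
        for _ in range(self.read_ports):
            if len(cu["values"]) < len(cu["srcs"]):
                reg = cu["srcs"][len(cu["values"])]
                cu["values"].append(self.rf[reg])
        # Dispatch only once every source operand has been read.
        if len(cu["values"]) == len(cu["srcs"]):
            self.to_execute.append((cu["op"], tuple(cu["values"])))
            self.current = None
```

With one read port, a two-operand instruction needs two steps before it reaches `to_execute`, which mirrors the "wait until all operands are ready" behavior described above.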
Load Store Unit
As the memory access module, the Load Store Unit is part of the execution unit and connects to the OC module and the WriteBack module. Its main function is to compute the actual memory address and then read or write memory at that address. In keeping with the throughput-first design principle of GPGPUs, the Load Store Unit also merges memory access requests issued by threads, which reduces global memory traffic, eases pressure on memory bandwidth, and improves overall memory throughput.
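The request merging described above can be sketched as grouping per-thread addresses by the memory line they fall in. This is an illustrative model only: the function name `coalesce`, the 128-byte line size, and the use of `None` for inactive threads are assumptions, not details of the actual Load Store Unit.

```python
def coalesce(addresses, line_bytes=128):
    """Group per-thread byte addresses into one request per memory line.

    Returns a dict mapping each line's base address to the list of thread
    lanes served by that single request.
    """
    requests = {}
    for lane, addr in enumerate(addresses):
        if addr is None:  # assumed encoding for an inactive thread
            continue
        line_base = addr - (addr % line_bytes)
        requests.setdefault(line_base, []).append(lane)
    return requests
```

For example, four threads reading consecutive 4-byte words at addresses 0, 4, 8, and 12 fall in the same line, so they produce one memory request instead of four.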