Simultaneous Multithreading (SMT) is a technique for executing multiple threads in parallel within a single processor pipeline. An SMT processor shares its instruction queues and functional units among threads, which keeps these resources well utilized. Because the resources are shared, it is very important to decide which threads to fetch instructions from in each cycle.
This paper investigates 2-level fetch policies and other techniques with a view to improving both throughput and fairness. To measure the potential of the 2-level fetch policies, simulations are conducted on four different benchmark combinations with two SMT configurations, and the results are compared with those of two existing fetch policies, ICOUNT and LC-BPCOUNT. Our detailed experimental evaluation confirms that the 2-level fetch policies outperform both ICOUNT and LC-BPCOUNT in throughput as well as fairness.
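For readers unfamiliar with the ICOUNT baseline, the following is a minimal sketch of its selection step, not the simulator used in this work. ICOUNT gives fetch priority to the threads with the fewest instructions in the front-end stages (decode, rename, and the instruction queues). The function name `icount_select`, the `fetch_threads` parameter, and the tie-break on thread id are assumptions made for illustration.

```python
def icount_select(front_end_counts, fetch_threads=2):
    """Pick the threads to fetch from this cycle, ICOUNT style.

    front_end_counts maps a thread id to the number of that thread's
    instructions currently in decode, rename, and the instruction queues.
    Threads with the fewest such instructions get priority; ties are
    broken by the lower thread id (an assumption of this sketch).
    """
    ranked = sorted(front_end_counts,
                    key=lambda tid: (front_end_counts[tid], tid))
    return ranked[:fetch_threads]


if __name__ == "__main__":
    # Hypothetical per-thread counts for a 4-thread configuration.
    counts = {0: 12, 1: 3, 2: 7, 3: 3}
    print(icount_select(counts))  # [1, 3]
```

Threads 1 and 3 are chosen because they occupy the fewest front-end entries, which is how ICOUNT steers fetch bandwidth away from threads that are clogging the shared queues.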
As a way to improve fairness, we also investigate partially partitioning the instruction queues among the threads. In particular, we vary the partition size to see how throughput and fairness are affected. This experiment shows that fairness can be improved at the cost of throughput. We expect the techniques presented in this paper to play a major role in future SMT designs.
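One way to picture partial partitioning is sketched below. The abstract does not specify the exact scheme, so this is an assumed interpretation: each thread owns a fixed number of reserved queue entries, and the remaining entries form a pool shared by all threads. The class name `PartiallyPartitionedQueue` and its parameters are hypothetical.

```python
class PartiallyPartitionedQueue:
    """Occupancy model of an instruction queue with per-thread reserved
    entries plus a shared pool (an assumed scheme, for illustration)."""

    def __init__(self, total_entries, num_threads, reserved_per_thread):
        if reserved_per_thread * num_threads > total_entries:
            raise ValueError("reserved entries exceed queue size")
        self.reserved = reserved_per_thread
        self.shared_capacity = total_entries - reserved_per_thread * num_threads
        self.occupancy = [0] * num_threads

    def _shared_in_use(self):
        # Entries a thread holds beyond its reservation come from the pool.
        return sum(max(0, n - self.reserved) for n in self.occupancy)

    def try_insert(self, tid):
        """Admit one instruction for thread tid if an entry is available."""
        has_reserved = self.occupancy[tid] < self.reserved
        has_shared = self._shared_in_use() < self.shared_capacity
        if has_reserved or has_shared:
            self.occupancy[tid] += 1
            return True
        return False

    def remove(self, tid):
        """Release one entry when an instruction of thread tid issues."""
        if self.occupancy[tid] == 0:
            raise ValueError("thread holds no entries")
        self.occupancy[tid] -= 1


if __name__ == "__main__":
    q = PartiallyPartitionedQueue(total_entries=8, num_threads=2,
                                  reserved_per_thread=2)
    print([q.try_insert(0) for _ in range(7)])  # six succeed, the seventh fails
    print(q.try_insert(1))                      # True: thread 1 keeps its reservation
```

In this model, setting `reserved_per_thread` to zero gives a fully shared queue, while larger values guarantee each thread some entries but shrink the pool that a fast thread could otherwise use, which is consistent with the fairness versus throughput trade-off reported above.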
Source: University of Maryland
Author: Lim, Chungsoo