The P3RMA (Programmable, Parallel, and Predictable Random Memory Access) processor, currently being developed at Linköping University, Sweden, is an attempt to solve the problems of parallel computing by using a parallel memory subsystem and by separating the complexity of address computations from the complexity of data computations.
It is targeted at embedded low-power, low-cost computing for mobile phones, handsets and base stations, among many other applications. By studying the radix-2 FFT using the P3RMA concept, we have shown that even algorithms with a complex addressing pattern can be adapted to fully utilize a parallel datapath while requiring only simple additional addressing hardware. By supporting this algorithm with a SIMT instruction, almost 100% utilization of the datapath can be achieved.
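To illustrate why the radix-2 FFT has a demanding addressing pattern, the following minimal sketch enumerates the operand index pairs of each butterfly stage and checks a simple two-bank memory mapping for access conflicts. The function names and the parity-based bank mapping are assumptions made for illustration only; they are not taken from the P3RMA design, whose actual addressing hardware may differ.

```python
def butterfly_pairs(n, stage):
    """Return the operand index pairs of one in-place radix-2 FFT stage.

    In stage `stage`, the two operands of a butterfly differ only in
    bit `stage` of their index, so their distance is 2**stage.
    """
    half = 1 << stage
    return [(i, i | half) for i in range(n) if i & half == 0]


def bank(addr):
    """Hypothetical 2-bank mapping: parity of the address bits."""
    return bin(addr).count("1") & 1


def conflict_free(n):
    """Check that both operands of every butterfly lie in different banks."""
    stages = n.bit_length() - 1  # log2(n) for a power-of-two n
    return all(
        bank(a) != bank(b)
        for s in range(stages)
        for a, b in butterfly_pairs(n, s)
    )


if __name__ == "__main__":
    for s in range(3):
        print(s, butterfly_pairs(8, s))
    print(conflict_free(8))
```

Because the two operands of a butterfly differ in exactly one address bit, their bit parities always differ, so under this assumed mapping both operands can be fetched in the same cycle from separate banks. The access stride changes from stage to stage, which is what makes a naive linear mapping insufficient.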
A simulator framework for this processor has been proposed and implemented. The simulator has a flexible structure: new instructions can be added as modules, and hardware parameters are configurable. It may be used by hardware developers and firmware developers in the future.
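One way such a structure could be organized is sketched below: instructions register themselves in a table, and hardware parameters live in a configuration object. All names here (`HardwareConfig`, `instruction`, `Simulator`, `vadd`) are hypothetical and chosen for illustration; the sketch does not describe the actual simulator's interface.

```python
from dataclasses import dataclass


@dataclass
class HardwareConfig:
    """Hypothetical configurable hardware parameters."""
    lanes: int = 4          # width of the parallel datapath
    memory_words: int = 64  # size of the data memory


# Table of registered instructions, keyed by mnemonic.
INSTRUCTIONS = {}


def instruction(name):
    """Decorator that adds a new instruction without touching the core."""
    def register(fn):
        INSTRUCTIONS[name] = fn
        return fn
    return register


class Simulator:
    def __init__(self, config):
        self.config = config
        self.memory = [0] * config.memory_words

    def execute(self, name, *operands):
        """Look up the instruction by mnemonic and run it."""
        INSTRUCTIONS[name](self, *operands)


@instruction("vadd")
def vadd(sim, dst, a, b):
    """Illustrative vector add over as many lanes as the config provides."""
    for lane in range(sim.config.lanes):
        sim.memory[dst + lane] = sim.memory[a + lane] + sim.memory[b + lane]
```

With this layout, adding an instruction means writing one decorated function, and changing the datapath width means changing one field of the configuration object.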
Author: Kraigher, Olof | Olsson, Johan
Source: Linköping University