With the importance of data security at its peak today, many reconfigurable systems are used to provide security. This protection is often provided by FPGA-based encrypt/decrypt cores secured with secret keys. Physical unclonable functions (PUFs) use random manufacturing variations to generate outputs that can be used in keys. These outputs are specific to a chip and can be used to create device-tied secret keys. Due to reliability issues with PUFs, key generation with PUFs typically requires error correction techniques. This can result in substantial hardware costs. Thus, the total cost of an -bit key far exceeds just the cost of producing n bits of PUF output.
To tackle this problem, we propose the use of variation aware intra-FPGA PUF placement to reduce the area cost of PUF-based keys on FPGAs. We show that placing PUF instances according to the random variations of each chip instance reduces he bit error rate of the PUFs and the overall resources required to generate the key. Our approach has been demonstrated on a Xilinx Zynq-7000 programmable SoC using FPGA specific PUFs with code-offset error correction based on BCH codes. The approach is applicable to any PUF-based system implemented in reconfigurable logic.
To evaluate our approach, we first analyze the key metrics of a PUF – reliability and uniqueness. Reliability is related to bit error rate, an important parameter with respect to error correction. In order to generate reliable results from the PUFs, a total of four ZedBoards containing FPGAs are used in our approach. We quantify the effectiveness of our approach by implementing the same key generation scheme using variation-aware and default placement, and show the resources saved by our approach.
A PHYSICAL UNCLONABLE FUNCTION NATIVE TO THE XILINX ARCHITECTURE
In this chapter, a detailed operational description of the Anderson PUF used in this work is provided. This PUF has been shown to work effectively in Xilinx FPGAs. The PUF overcomes the drawbacks of the delay-based PUFs described in Chapter 2 by avoiding the need for careful symmetric routing.
The PUF uses internal component connections with fixed delays inside logic clusters. The primary advantage of using Anderson PUFs for our work is the lack of design dependence on delay variations due to programmable interconnect.
Xilinx Virtex 7 Architecture:
A Configuration Logic Block (CLB) in the Xilinx Virtex 7 architecture is comprised of two logic slices and each slice consists of a combination of lookup tables (LUTs) and registers (flip flops). CLBs are arranged in a two-dimensional array on the FPGA chip and are connected to each other through a programmable interconnection matrix.
Anderson PUF Design:
As mentioned in Chapter 2, an Arbiter PUF or a Butterfly PUF can be difficult to implement in FPGAs. The PUF design proposed by Anderson produces an output which is more clearly based on random interconnect delay variations and does not suffer from the problems faced by the other PUFs which require the use of programmable interconnect in FPGAs.
Anderson PUF Operation:
As mentioned in the previous subsection, two LUTs, D and C, are used in a 16-bit shift register mode. The shift registers need to be pre-initialized as follows:
LUT D: 0101010101010101
LUT C: 1010101010101010
We need to make sure that the initialization bitstrings of the two shift registers are complementary to each other. The shift register output drives the select line of the multiplexers in the carry chain. The ”0” data inputs of all design multiplexers are tied to logic 0 while the bottom carry chain multiplexer has its ”1” input tied to logic 1.
A research goal is to determine if there are locations on the FPGA chip which are more favorable to PUF performance than others. PUF performance is defined by two primary factors, uniqueness and reliability.
For our analysis, we instantiated our PUF design at all possible locations in a target FPGA. We evaluated the design using four ZedBoards which include a Xil-inx XC7Z020-1CLG484C Zynq-7000 AP SoC. Each board has approximately 4,200 SLICEM’s and each PUF circuit uses two slices to generate a bit.
In this research, we propose the idea of per-device placement of PUFs. In this chapter, we provide evidence to support our approach by calculating the reliability of the PUFs on all the locations of a chip and observing the spatial correlation of the unreliable PUFs with respect to other chips.
This chapter gives a detailed description about the error correcting codes used in our work along with a full system design to generate a key by utilizing encryption cores along with process variation dependent bits generated by the PUF corrected by error correcting codes.
Device Specific Location of Unreliable PUF Instances:
The work in Chapter 3 quantified the uniqueness and reliability of our implementation of the Anderson PUF. In this section, we quantify key generation using the bit error rate (BER) metric which signifies the probability that a PUF will produce an incorrect output.
We observe the BER of every PUF instantiated on the chip to select the most reliable PUFs or the PUFs with the lowest BER values. PUF selection based on BER directly relates to the size of the required error correcting code. This metric is related to hardware cost.
In this section, we explain the importance of error correction codes and their use with PUF-based keys. The process of key enrollment and generation with the help of error correction codes is described along with a detailed analysis of system area is affected by unreliable PUFs.
Our complete system implements several encryption schemes using PUF-basedkeys 1. We implemented authenticated encryption using AES in Galois Counter Mode for 256-bit and 128-bit keys and DES in Electronic Code Book mode. The three primary components of the system are the PUF instances, the BCH decoder for error correction and the encryption blocks. Publicly available Verilog code is used to implement the BCH decoder, and the encryption blocks.
TWO PARAMETER MODEL FOR ERROR CORRECTION
The security benefits of PUF usage are accompanied by post processing. In addition to reliability and uniqueness, efficiency also plays are important part in PUF based system. To understand the PUF behavior in terms of reliability, an accurate model that closely fits the statistics of a PUF is required to better predict the number of errors to be corrected. This chapter primarily focuses on predicting an accurate BCH code depending on our PUF behavior using a more robust model.
Fixed Error Rate:
The commonly used PUF reliability model used in previous sections considers a fixed error rate. In this approach, the response bit of each PUF is assumed to be equally prone to errors. Hence the approach suggests computing the average BER of all PUFs used in the application to determine the BCH code size. A comment on such an approach in is: ’Many details are lost by reducing the reliability behavior to a single average-case parameter’. Specifically, there is a possibility that the average PUF BER increases due to a small number of highly unreliable PUFs or decreases due to a small number of highly reliable PUFs.
PUF SELECTION USING MULTIPLEXERS
In Chapter 4, it was shown that variation aware placement reduces area consumption by 21% for AES cores and about 50% for DES cores compared to variation agnostic placement. In the former case, PUFs are constrained to locations which are highly reliable. In the latter case PUF locations are randomly selected without considering their reliability. The reliability difference impacts BCH code hardware size. Although variation aware placement is effective, it requires an FPGA to be recompiled for every chip. In this chapter, we propose a new placement method which strikes a balance between the two schemes discussed so far, while eliminating the need for per-device recompilation.
Our work examines PUF placement in FPGAs based on PUF reliability (variation-aware placement). Our results are compared with random PUF placement (variation agnostic placement). We also examine a PUF selection approach which allows for post-configuration selection of PUFs. Although this latter approach reduces BCH error correcting hardware, the overhead of the PUFs and selection logic is significant.
We have implemented three cryptosystems using different PUF-based key sizes and have demonstrated that our variation-aware approach can save between 21% and 49% of area in these systems compared to a variation-agnostic approach. Future work could consider improving the multiplexer scheme by using bit masking of PUFs. New approaches to PUF implementation in FPGAs could also be considered.
Source: University of Massachusetts Amherst
Author: Shrikant S. Vyas