Background

Machine learning (ML) is widely used in applications today, such as ChatGPT and ICU monitoring. However, ML requires significant energy when computed on general-purpose processors such as CPUs and GPUs. To meet the need for low energy consumption, hardware accelerators have been proposed: they use efficient, specialised hardware architectures and consume significantly less power while delivering the same performance.

However, designing such hardware accelerators requires a significant amount of hardware knowledge, which makes them inaccessible to general users. Recently, there has been interest in developing tools that automatically generate efficient hardware accelerators from high-level abstractions, hiding the hardware details from the user. Examples of such projects are:

fpgaConvNet from Imperial College London:

fpgaConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs

AMD Xilinx FINN:

FINN

The goal of this project is to build the next generation of hardware compilers for generating efficient ML accelerators. We build our work on top of MLIR, a state-of-the-art compiler infrastructure:

MLIR
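As an illustrative example of the kind of operation such a compiler would ingest, the sketch below is a plain-Python golden model of matrix multiplication, the loop nest that MLIR's `linalg.matmul` represents and that a hardware compiler would tile and map onto parallel multiply-accumulate units. The function is our own sketch for exposition, not project code.

```python
def matmul(a, b):
    """Golden model: C[i][j] = sum_k A[i][k] * B[k][j].

    This is the reference semantics a compiler starts from before
    tiling the loops and mapping them onto hardware blocks.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "shape mismatch"
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one multiply-accumulate
            c[i][j] = acc
    return c
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`; a golden model like this is also what simulation-based tests (e.g. in cocotb) compare hardware outputs against.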

Project aim

Base elements

The hardware compiler requires a set of constrained hardware blocks as part of its internal library, and explores efficient hardware designs by varying the combinations of these blocks under different constraints. The goals of this project include:

  1. Design and implement a set of hardware blocks for linear algebra operations:

'linalg' Dialect

  2. Test and verify the functionality of these blocks using cocotb/Verilator:

cocotb

Veripool

  3. Optimise these hardware blocks for high performance.
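To make the exploration step concrete, the sketch below shows how a compiler might enumerate configurations of a parameterised hardware block and pick the lowest-latency one within a resource budget. The parameters (parallel MAC count, pipeline depth) and the cost model are invented for illustration; a real library would use calibrated area and timing estimates.

```python
from itertools import product

def explore(macs, depths, workload_macs, dsp_budget):
    """Enumerate (parallel MACs, pipeline depth) configurations, discard
    those over the DSP budget, and return the lowest-latency survivor
    as (latency_cycles, n_mac, depth)."""
    best = None
    for n_mac, depth in product(macs, depths):
        dsps = n_mac  # assumption: one DSP per multiply-accumulate unit
        if dsps > dsp_budget:
            continue  # violates the resource constraint
        # Cycles: work divided across parallel MACs, plus pipeline fill time.
        latency = workload_macs // n_mac + depth
        if best is None or latency < best[0]:
            best = (latency, n_mac, depth)
    return best

best = explore(macs=[1, 2, 4, 8], depths=[2, 4],
               workload_macs=1024, dsp_budget=4)
```

Here the search settles on four parallel MACs with the shallower pipeline; in the real compiler this loop would range over combinations of many blocks and constraints, which is why an efficient exploration strategy matters.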

Extensions

Extensions of this project include: