Over the last six weeks I have been working in the electronics department here at Trinity College. My research focused on exploring the viability of using FPGAs for object detection in autonomous vehicles, specifically in high-speed vehicles like racecars or drones. In practice, this meant building a perception module, a cone detector in my case, that could be deployed on an FPGA: a special piece of hardware that can "do AI" in a much more power-efficient manner than its more commonly used counterparts. Prior to this research project, I had little to no experience in deep learning, my computer vision knowledge was limited to half a YouTube tutorial, and I couldn't have even told you what the abbreviation FPGA stood for. Yet my proposal centred around these three constantly evolving areas of engineering. As you can imagine, my proposal had a lot of drafts. In all seriousness though, these last few weeks have presented some of the hardest engineering challenges I've faced in my career to date. I dealt with issues I had never come across before, and overcoming them was difficult but rewarding.
Like most Laidlaw scholars I've talked to, the beginning of my project can be summarised as "what am I doing and why am I here?" My first 15 minutes in the lab consisted of frantically opening and closing tabs in Google Chrome, or picking things up off my desk and immediately placing them back down. Once the panic of imposter syndrome finally cooled down, however, I could start the work I had so looked forward to all summer. I began by creating the convolutional neural network that would later be deployed onto the FPGA. Convolutional neural networks are models in which convolutions are the main operations performed. These operations extract what are called features from the images the network is processing. The features extracted early in the network may include things like minute colour gradients, edges and corners, whereas later layers might identify textures or the whole shapes of objects. For my project, I used a CGNet, or Context Guided Network, architecture. I chose this architecture for how lightweight it is (roughly 500 to 700 thousand parameters) and for its high accuracy. 500 thousand might seem like a lot, but compare that to other models, which may have upwards of 5 to 10 million parameters and only fractionally better accuracy. My first few days of research were spent exploring the architecture of this model and experimenting with its performance, both in loss and in latency.
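To make the idea of feature extraction a bit more concrete, here is a minimal sketch (plain NumPy, not my actual CGNet code) of a single convolution picking out a vertical edge in a toy image:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a kernel over a 2D image (valid padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel: responds where brightness changes left to right
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])

# Toy image: dark left half, bright right half (one vertical edge)
image = np.zeros((5, 5))
image[:, 3:] = 1.0

feature_map = conv2d(image, edge_kernel)
# Flat regions give 0; the columns straddling the edge give -3
```

A trained network learns thousands of kernels like this one, with later layers convolving over earlier feature maps rather than raw pixels, which is how textures and whole shapes emerge.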
The next and longest portion of the project consisted of developing a workflow: going from a trained floating-point model to a deployable model that can be put on the board. To do this, I used Xilinx's new Vitis AI tool. This is a library recently developed by Xilinx to take a neural network, quantise it and then compile it, turning it into a set of instructions that the FPGA can read and understand. Prior to the project, I estimated this section would take roughly two weeks, mixed amongst other tasks such as testing and optimising this or that. Little did I know that this part of the project would involve roughly 100 million error messages, broken components and mistaken assumptions.
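For anyone wondering what "quantise" actually means here: a hypothetical, stripped-down sketch of the symmetric INT8 scheme that post-training quantisers typically use (plain NumPy, not the Vitis AI API itself) looks something like this. Each 32-bit float weight is mapped onto an 8-bit integer via a single scale factor, trading a sliver of precision for far cheaper arithmetic on the board:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantisation: map floats into [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integers."""
    return q.astype(np.float32) * scale

weights = np.array([0.8, -1.27, 0.02, 0.5], dtype=np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Rounding error is bounded by half the scale factor per weight
```

The real Vitis AI flow adds a calibration pass over sample images to pick good scales per layer, then compiles the quantised graph into instructions for the FPGA's DPU, but the core idea is the one above.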
To expand on that a bit more, the innocent-sounding "make a workflow" period is what I think I will remember these last few weeks by. Working with such new technology meant there weren't really any resources I could rely on or message boards I could rummage around in. Vitis AI has only been around since 2020. Similarly, the FPGA I used, Xilinx's Kria SoM, has only been in circulation for about a year. When I ran into an issue or broke something, there was nobody to fix it but myself. There were countless times I googled an issue only to receive zero search results, or an unrelated article written entirely in Chinese. These few weeks really developed both my patience and my debugging skills. Looking back on it now, I do feel a sense of pride in having completed it with only a minimal number of frustrated and/or teary-eyed lunchtime phone calls.
The final stage of the project was the benchmarking and testing period. After the ups and downs of the weeks before, this stretch of somewhat challenging but ultimately straightforward work was very much welcomed. During this time I tested the performance of a few variations of my model, measuring things like the model's IoU (intersection over union) accuracy, as well as its latency at different stages and its power consumption. These tests highlighted a lot to me, mainly the flaws. More than anything, they showed me the potential next steps of the project. Every door I closed spontaneously opened two more. Near the end of the final week I felt an unexpected sadness that the project was coming to its close. I was left with so much I wanted to explore before closing the book on this experiment. However, all good things must come to an end. And so must this blog post.
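For the curious, IoU is a simple ratio: how much of the predicted region and the true region overlap, divided by how much area they cover together. A generic sketch for two segmentation masks (again NumPy, not my benchmarking scripts) looks like this:

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two boolean masks (1.0 if both empty)."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union else 1.0

# Two overlapping 2x2 squares on a 4x4 grid
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:2] = True        # predicted cone region
target = np.zeros((4, 4), dtype=bool)
target[1:3, 1:3] = True      # ground-truth cone region

score = iou(pred, target)    # 1 pixel of overlap, 7 in the union -> 1/7
```

A score of 1.0 means a perfect match, 0.0 means no overlap at all, which makes IoU a convenient single number for comparing model variants.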
Thanks for reading!