Claudia Bruhn
A Flexible FPGA-based Accelerator for Double Q-Learning

Abstract
Q-learning is a well-known reinforcement learning algorithm. Numerous publications extend or modify its basic idea; one such variant is the Double Q-learning algorithm proposed by van Hasselt in 2010. This thesis presents an implementation of the Double Q-learning algorithm on a Field-Programmable Gate Array (FPGA). It builds on an existing FPGA implementation of the standard Q-learning algorithm developed by the Computer Engineering Group at Osnabrück University. The resulting architecture is evaluated in terms of the learning results it attains, its resource utilization, and its achieved throughput.