ICRA 2023

Deep Learning on Home Drone: Searching for the Optimal Architecture

Alaa Maalouf, Yotam Gurfinkel, Barak Diker, Oren Gal, Daniela Rus, Dan Feldman

Abstract

We suggest the first system that runs real-time semantic segmentation via deep learning on the weak micro-computer Raspberry Pi Zero v2 (priced at $15) attached to a toy drone. In particular, since the Raspberry Pi weighs less than 16 grams and is half the size of a credit card, we could easily attach it to the common commercial DJI Tello toy drone (<$100, <90 grams, 98 × 92.5 × 41 mm). The result is an autonomous drone (no laptop or human in the loop) that can detect and classify objects in real time from the video stream of an onboard monocular RGB camera (no GPS or LIDAR sensors). The companion videos demonstrate how this Tello drone scans the lab for people (e.g., for use by firefighters or security forces) and for an empty parking slot outside the lab. Existing deep learning solutions are either much too slow for real-time computation on such IoT devices or provide results of impractical quality. Our main challenge was to design a system that takes the best of all worlds among numerous combinations of networks, deep learning platforms/frameworks, compression techniques, and compression ratios. To this end, we provide an efficient search algorithm that aims to find the combination with the best tradeoff between the network's running time and its accuracy/performance.

I. BACKGROUND

Deep learning advancements in the previous decade have sparked a flood of research into the use of deep artificial neural networks in robotic systems. The main drawback is that existing deep learning solutions usually require powerful machines, such as strong servers and graphics (GPU) cards. This is a problem for small robots due to the following challenges:
(i) The weight of the computation machine might be too much, especially for “flying robots” such as nano-drones.
(ii) The large amount of energy required to operate powerful computers and graphics cards translates into large (heavy) batteries and a shorter lifetime between charges, especially for moving robots.
(iii) The monetary price of these computation machines is relatively high, especially for low-cost or Do-It-Yourself (DIY) robots.

Fig. 1. A DJI Tello drone equipped with our Raspberry Pi Zero Ver 2 micro-computer, connected to an RGB camera. Our deep learning-based systems are executed on this hardware.

1 Robotics and Big Data Labs, Department of Computer Science, University of Haifa.
2 CSAIL, MIT.
∗ Equal contributions. Corresponding author email: alaamalouf12@gmail.com

It seems that most robotics applications that need to run deep learning under such constraints use NVIDIA's powerful Nano-Jetson machine [Cass, 2020], [Süzen et al., 2020], which has been used in many robotic systems in recent years, such as [Jeon et al., 2021], [Süzen et al., 2020], [An et al., 2021], [Rudolph et al., 2022]. However, while much more powerful, NVIDIA's Nano-Jetson (nJetson) is far behind micro-computers such as the recent Raspberry Pi Zero Ver 2 (RPI0) with respect to constraints (i)–(iii) above:
(i) The nJetson weighs about 100 grams and thus cannot be carried by a small low-cost drone such as the one in this paper. Such a drone can easily carry the RPI0, which weighs 16 grams, and the whole system (the RPI0 and the drone) weighs less than 100 grams; see Fig. 1.
(ii) The energy consumption of the nJetson is ∼5 watts at idle and ∼10 watts under stress, which is about 18 times more than the RPI0, which consumes ∼0.28 watts at idle and ∼0.58 watts under stress.
(iii) Our nJetson costs ∼$50, compared to ∼$15 for the RPI0 that we used in this paper.

For comparison, the Tello drone (as in Fig. 1) could not even carry the nJetson, not to mention its huge battery.
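The comparison on constraints (i)–(iii) can be checked with a few lines of arithmetic. The following Python sketch uses only the approximate figures quoted in the text; the `Board` class and the `ratios` helper are illustrative names introduced here and are not part of the paper's system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    """Approximate figures as quoted in the text (all values are rough)."""
    name: str
    weight_g: float   # weight in grams
    idle_w: float     # power draw at idle, in watts
    stress_w: float   # power draw under stress, in watts
    price_usd: float  # price in US dollars


# Values taken from the comparison in the text; names are hypothetical.
NJETSON = Board("nJetson", weight_g=100.0, idle_w=5.0, stress_w=10.0, price_usd=50.0)
RPI0 = Board("RPI0", weight_g=16.0, idle_w=0.28, stress_w=0.58, price_usd=15.0)


def ratios(big: Board, small: Board) -> dict[str, float]:
    """Return how many times larger `big` is than `small` on each constraint."""
    return {
        "weight": big.weight_g / small.weight_g,
        "idle_power": big.idle_w / small.idle_w,
        "stress_power": big.stress_w / small.stress_w,
        "price": big.price_usd / small.price_usd,
    }


if __name__ == "__main__":
    for constraint, factor in ratios(NJETSON, RPI0).items():
        print(f"{constraint}: {factor:.2f}x")
```

With these figures, both power ratios fall between 17 and 18, which is consistent with the "about 18" factor stated for constraint (ii).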
A useful feature that we discovered regarding the RPI0 is that, due to its low voltage, it can use the same battery as its carrying drone (the Tello in this paper). In contrast, even a more powerful version of the RPI (such as the RPI3) needs an additional battery, which the drone can barely lift stably and which drains the drone's own battery more quickly.

a) Vision: The motivation for this paper was to turn a remote-controlled commercial toy drone into an autonomous drone that is able to run deep neural networks efficiently, and thus gain all of their benefits for indoor applications. All of this should be done using only a single RGB camera and an onboard micro-computer, without a laptop or human in the loop. Firefighters may use it to search for people, such as survivors in a burning building, the police can use it to look for suspicious objects (e.g., in a subway), and the army may

2023 IEEE International Conference on Robotics and Automation (ICRA 2023), May 29 - June 2, 2023, London, UK. 979-8-3503-2365-8/23/$31.00 ©2023 IEEE

Index terms

AI-Enabled Robotics; AI-Based Methods; Semantic Scene Understanding