What Is Tiny Machine Learning (tinyML)?

Matthew Stewart (PhD candidate, Harvard University)

A subcategory of artificial intelligence, machine learning (ML) has applications in a wide range of fields, including atmospheric science and computer vision. As Harvard PhD candidate Matthew Stewart explains, tinyML is an emerging discipline for developing "fast, low-resource, and power-efficient implementations of machine learning algorithms that can be operationalized on resource-constrained microcontrollers."

Tiny Machine Learning in Science

C. J. Abate: Let’s start with your background. When did you first become interested in machine learning? Did you come to the field due to a background in programming or hardware design?

Matthew Stewart: For my undergraduate degree, I studied mechanical engineering, so this gave me some experience in programming and mechatronics. However, I was not introduced to machine learning until starting at Harvard University. My interest in machine learning was piqued by taking the introductory data science course at Harvard during the first year of my PhD work, which was when I realized the huge potential of machine learning both generally and also specifically for atmospheric research.

Abate: What led you to Harvard University?

Stewart: Well, the obvious response is that Harvard is one of the top research institutions in the world and studying here is the goal of many passionate and hard-working students. I was also drawn by the research interests of my supervisor, who studies the tropical Amazonian rainforest using drones. I became interested in pivoting to environmental science through the course of my mechanical engineering degree as it became more clear to me that most of the defining engineering problems of the modern age will be environmental problems, namely climate change, energy security, and sustainability. This work on drones in the Amazon rainforest seemed ideal based on my interests and engineering background and was the main stimulus for coming to Harvard.

Abate: As an environmental scientist, how do you keep yourself educated about embedded systems and programming? It must be difficult to stay on top of all the new developments in the AI field, as well as innovations in sensor technology, embedded systems, and so on. What’s your approach to staying informed about these various subjects?

Stewart: This is a very real issue for many graduate students and academics due to the relentless and rapid progress in these fields. Personally, there are several resources I use to keep relatively up-to-date. Firstly, Twitter can be a good resource for discovering new research posted by other academics in the field. I am also part of several Slack channels wherein colleagues periodically share news and research articles about related topics. I also periodically review new papers published in relevant journals to look for anything particularly eye-catching and worth reading in more detail. Fortunately, most published work is of little relevance to my own research, and broader trends are often the subject of seminar talks given by various departments and interest groups within the university.

Tiny Machine Learning as a Proto-Engineering Discipline

Abate: Although I touched on tinyML during an interview with Daniel Situnayake a few months ago, it remains a new subject for many of the engineers in Elektor’s global community. How do you define tinyML? Is it basically an approach for running machine learning applications on edge microcontrollers?

Stewart: Yes, that is essentially the goal. tinyML is not exactly a specific technology or set of principles; it is more of a proto-engineering discipline that combines the fields of computer architecture, performance engineering, and machine learning. The overarching goal is to develop fast, low-resource, and power-efficient implementations of machine learning algorithms that can be operationalized on resource-constrained microcontrollers. This can also involve the development of bespoke hardware for specific tasks, the development of new algorithms specifically designed for resource-constrained applications, or new tools to port algorithms or optimize their performance across a wide range of hardware architectures. One proposed guideline describes tinyML as the application of machine learning to microcontrollers with less than 1 MB of random access memory and power consumption below 1 mW, but this is by no means a rigorous or exhaustive definition.

Abate: And just to be clear: we aren’t talking about devices like the NVIDIA Jetson and Raspberry Pi. The focus is on much more resource-constrained devices (i.e., less than 1 mW and kilobytes rather than megabytes), right?

Stewart: Correct. Devices like the Raspberry Pi and NVIDIA Jetson are not the focus of tinyML, nor are technologies related to applications such as self-driving cars, which often have access to considerable computational resources. The key word is “resource-constrained,” which almost suggests we are playing a zero-sum game. In tinyML, we must make informed decisions on how best to optimize the performance of our algorithm in terms of application- and hardware-specific constraints.

For example, in some applications it may be imperative to have both fast inference and high accuracy. To improve inference speed, we could use 8-bit arithmetic instead of floating-point arithmetic, but this will have an impact on the accuracy of our algorithm, and will also influence the memory and compute resources required for the algorithm. This example helps to highlight why I view tinyML as a proto-engineering discipline, since we are starting to think more about functional requirements that must be satisfied but are often in direct competition and must be balanced.
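The trade-off between 8-bit and floating-point arithmetic can be illustrated with a minimal sketch. It assumes a simple symmetric quantization scheme and made-up weights and inputs; real toolchains use more elaborate schemes, and none of the names below come from the interview. The integer version stores each value in one byte rather than the four bytes of a 32-bit float, at the cost of a small numerical error.

```python
def quantize(values, num_bits=8):
    """Symmetric linear quantization of floats to signed integers (illustrative scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


# Hypothetical layer weights and input activations.
weights = [0.82, -0.41, 0.05, -1.27, 0.33]
inputs = [0.5, 1.0, -0.25, 0.75, -1.0]

q_weights, weight_scale = quantize(weights)
q_inputs, input_scale = quantize(inputs)

# Reference result in floating point.
float_dot = sum(w * x for w, x in zip(weights, inputs))

# Integer-only multiply-accumulate, rescaled once at the end.
int_accumulator = sum(qw * qx for qw, qx in zip(q_weights, q_inputs))
quant_dot = int_accumulator * weight_scale * input_scale

error = abs(float_dot - quant_dot)
print(f"float: {float_dot:.4f}  int8: {quant_dot:.4f}  abs error: {error:.4f}")
```

The error printed at the end is the accuracy cost referred to above; whether it is acceptable depends on the application.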

Abate: Can you provide a few examples of practical use cases?

Stewart: Actually, there are already some quite widespread examples of tinyML in smartphones. An important example is keyword spotting, which involves the detection of words such as “Hey Siri” and “Hey Google.” If smartphones used the CPU to continuously monitor the microphone and detect these words, your phone battery would only last a few hours. Instead, a lightweight digital signal processor continuously monitors for these words and, in the event that someone says the keyword, wakes up the CPU, verifies that it was said by a known speaker, and then waits for additional voice input.
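The two-stage wake-up pattern described here can be sketched as follows. This is only an illustration under stated assumptions: the keyword scores, thresholds, and speaker names are hypothetical, and the two functions stand in for the always-on low-power detector and the more expensive CPU-side verification.

```python
def low_power_detector(keyword_score, threshold=0.6):
    """Stage 1: cheap always-on check (stand-in for the DSP keyword model)."""
    return keyword_score >= threshold


def cpu_verify(keyword_score, speaker, known_speakers, threshold=0.8):
    """Stage 2: costlier check, run only after stage 1 has fired."""
    return keyword_score >= threshold and speaker in known_speakers


def process_stream(frames, known_speakers):
    """Return how often the CPU was woken and which frames were accepted."""
    cpu_wakeups = 0
    accepted = []
    for index, (keyword_score, speaker) in enumerate(frames):
        if not low_power_detector(keyword_score):
            continue  # The CPU stays asleep for this frame.
        cpu_wakeups += 1
        if cpu_verify(keyword_score, speaker, known_speakers):
            accepted.append(index)
    return cpu_wakeups, accepted


# Hypothetical (score, speaker) pairs for six audio frames.
frames = [
    (0.10, "alice"),
    (0.20, "bob"),
    (0.70, "alice"),
    (0.90, "mallory"),
    (0.95, "alice"),
    (0.30, "alice"),
]
cpu_wakeups, accepted = process_stream(frames, known_speakers={"alice"})
print(cpu_wakeups, accepted)
```

The point of the design is that the expensive stage runs only for the few frames that pass the cheap stage, rather than for every frame.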

Another smartphone example is detecting when a user picks up the phone. Data from the onboard inertial measurement unit and gyroscope are continuously monitored, and when the signals indicate that the phone has been picked up, the CPU is woken up.

Another useful example is person detection, where a microcontroller connected to a camera can detect the presence of an individual. This can be adapted to, for example, detect whether a user is wearing a mask, which is particularly useful during the current pandemic. Anomaly detection will likely become an important use-case in industry, where signals from heavy machines can be continuously monitored to detect abnormalities for predictive maintenance purposes.

ML in Research

Abate: In 2019 you published a fascinating article, “The Machine Learning Crisis in Scientific Research,” which addressed the issue of whether machine learning is contributing to a “reproducibility crisis” in science. For instance, if a scientist uses a “poorly understood” ML algorithm in an experiment, it might mean that other scientists cannot reproduce the original research results. Even non-scientists can see the problem there. I assume the debate — machine learning vs traditional statistics — has only intensified over the past year. What are your thoughts now?

Stewart: I think this is still an important issue in academia. My article on this subject was in response to the reproducibility crisis which first surfaced over the controversy of some work done on the topic of power poses by Amy Cuddy, a former Harvard Business School professor. Andrew Gelman wrote an influential paper decrying poor research practices in the field of psychology that involved disingenuous data analysis using techniques such as p-hacking, post-hoc rationalization, and cherry picking of data to produce statistically significant results. This led to a series of experiments aiming to reproduce some important results in the psychological literature, many of which were not reproducible. This exposed a flaw in the research process, wherein reproducibility studies were often not funded as they were seen as unnecessary and a waste of resources. Since this time, the reproducibility crisis has also been found to have impacted other fields, including literature and economics.

Naturally, this corruption of the integrity of the research process leads to concerns about the use of large data sets and machine learning. Given a large enough number of variables in a dataset, it is ultimately inevitable that some statistically significant results will be present. This suggests that spurious correlations will be easier to find, but will only be valid if the experiment was designed to specifically test this hypothesis, not a plurality of hypotheses simultaneously. So, big data makes it easier to “cheat” with data, but what about machine learning? The use of machine learning makes it easier to “hide” the cheating. The reduced interpretability, nuanced behavior of many machine learning algorithms, and lack of machine learning education in many research communities will make it more difficult to uncover these issues in published research. Fortunately, the solution to this problem is quite simple — fund reproducibility studies, and educate researchers about the proper design of experiments and use of machine learning for research purposes.
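The claim that many variables make some "significant" result nearly inevitable can be checked with a short calculation. The sketch below assumes independent tests in which no real effect exists, so each p-value is uniformly distributed between 0 and 1; the function names and the 0.05 significance level are illustrative choices, not taken from the interview.

```python
import random


def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def count_false_positives(num_tests, alpha=0.05, seed=0):
    """Simulate null p-values (uniform on [0, 1]) and count those below alpha."""
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(num_tests)]
    return sum(p < alpha for p in p_values)


for num_tests in (1, 10, 100):
    rate = familywise_error_rate(num_tests)
    print(f"{num_tests:>3} hypotheses: P(at least one false positive) = {rate:.3f}")

print("false positives in 1000 simulated null tests:", count_false_positives(1000))
```

With a single pre-specified hypothesis the false positive rate stays at the chosen significance level; testing many hypotheses at once pushes the chance of at least one spurious finding toward certainty.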

Abate: You made an interesting point in your article: “One of the other problems of machine learning algorithms is that the algorithm must make a prediction. The algorithm cannot say ‘I didn’t find anything’.” It sounds like there are times when machine learning is not the right fit.

Stewart: While I would agree that machine learning is not the right fit for some tasks, I do not think it is for this reason. For example, one of the issues presented by tasks cast as binary classification problems is that they may in fact not be best summarized as such, resulting in a false dichotomy. In some circumstances, it may be more suitable for data lying close to a decision boundary to be assessed in more detail by a human instead of letting the algorithm make a definitive decision. This type of decision-making is sometimes referred to as human-in-the-loop decision-making, and would be most useful in circumstances where the decision being made has important ramifications, such as decisions related to the offer of a loan or whether someone has cancer.
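One way to picture human-in-the-loop decision-making is a classifier that refuses to decide when its output falls close to the decision boundary. The sketch below is an assumption-laden illustration: the threshold, the width of the review margin, and the decision labels are all hypothetical.

```python
def triage(probability, threshold=0.5, margin=0.15):
    """Route a classifier output to an automatic decision or to human review."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # Outputs close to the decision boundary are deferred to a person.
    if abs(probability - threshold) < margin:
        return "human_review"
    return "approve" if probability >= threshold else "decline"


# Hypothetical model outputs for three loan applications.
for probability in (0.95, 0.55, 0.10):
    print(probability, "->", triage(probability))
```

Widening the margin sends more cases to a human reviewer, which trades throughput for caution in high-stakes decisions.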

Innovation with tinyML

Abate: In which industries do you see the biggest opportunities for innovation with tinyML?

Stewart: Generally speaking, I think many people working in this area are anticipating the advent of tinyML in one form or another to kickstart a new industrial revolution. For this reason, some have taken to referring to this newly envisioned stage of industry as “Industry 4.0.” Any industry working with large numbers of IoT devices will see large benefits from using tinyML by virtue of the reduced power consumption and network loads associated with tinyML.

More specifically, there are certain industries that are likely to obtain greater benefits from the new capabilities offered by tinyML if leveraged correctly. Agriculture is a good example. The use of tinyML in agriculture may allow for intelligent sensing capabilities without requiring connection to a power grid, which could help to determine when certain crops should be harvested or require additional fertilizer or water.

Another good example is heavy industry, as alluded to previously, whereby performing predictive maintenance using anomaly detection could result in cost savings and an increase in efficiency. Pre-empting issues in large machinery is likely to be less expensive and result in a smaller loss of productivity than the aftermath of a catastrophic failure.

Abate: What about companies interested in developing energy-efficient computing solutions?

Stewart: Apple and ARM are probably the biggest companies focusing on energy-efficient computing at the moment. The development of high-performance and power-efficient architectures has been crucial in the smartphone industry for improving battery life while also providing increased functionality and speed. In recent years, we have seen mobile architectures improve substantially in terms of performance and power efficiency, whereas more traditional architectures from rivals such as Intel have been comparatively stagnant. Consequently, mobile architectures now rival more traditional architectures in performance while offering several additional advantages, including high power efficiency. This came to a head recently with Apple's announcement of the new ARM-based M1 chip, which it says will provide the “longest battery life ever in a Mac.” This move by Apple is seen by some as a watershed moment in the computing industry that will have ripple effects across the community for years to come.

Looking Ahead

Abate: Tell us about your work with drones and chemical-monitoring systems. What role does tinyML play in your research?

Stewart: Work using tinyML for several microdrone applications has already been published. The focus of this work is to create lightweight drones that can intelligently navigate an environment using embedded reinforcement learning methods. In the future, this could be very useful for tasks such as detecting gas leaks or locating pollutant emission sources, in both indoor and outdoor applications.

For chemical monitoring systems more broadly, tinyML may provide the ability to create remotely located sensor networks that are disconnected from the power grid, as well as more intelligent use of chemical sensor information. For example, instead of continuously transmitting data to a cloud server, the system could be designed to focus only on anomalous data. This would reduce loads on the communication network and also the power consumption associated with performing continuous monitoring. These aspects will become increasingly important in years to come as the number of deployed IoT devices continues to increase exponentially.
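The idea of transmitting only anomalous data can be sketched with a simple on-device filter. This example assumes a rolling z-score test, which is just one of many possible anomaly detectors; the class name, window size, threshold, and sensor readings are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class AnomalyGate:
    """Decide on-device whether a sensor reading is worth transmitting."""

    def __init__(self, window=20, threshold=3.0):
        self.history = deque(maxlen=window)  # recent normal readings
        self.threshold = threshold           # z-score above which we transmit

    def should_transmit(self, reading):
        is_anomaly = False
        if len(self.history) == self.history.maxlen:
            baseline = mean(self.history)
            spread = pstdev(self.history)
            if spread > 0 and abs(reading - baseline) / spread > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Only normal readings update the baseline.
            self.history.append(reading)
        return is_anomaly


# Hypothetical gas-sensor readings hovering around 10.0, with one spike.
readings = [10.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(40)]
readings[30] = 14.0

gate = AnomalyGate()
sent = [i for i, value in enumerate(readings) if gate.should_transmit(value)]
print(f"transmitted {len(sent)} of {len(readings)} readings: {sent}")
```

In this sketch only the spike is sent to the server, so radio use scales with the number of unusual events rather than with the sampling rate.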

Abate: Your articles and research will likely inspire many members of our community to take a closer look at tinyML. Do you recommend any resources — besides a book like Pete Warden and Daniel Situnayake's TinyML — for professional engineers and serious electronics enthusiasts who are interested in learning more about the subject?

Stewart: Unfortunately, one of the downsides of cutting-edge technology is that there are often only a handful of resources available. That being said, we are starting to see a steady release of peer-reviewed literature on the subject of tinyML (although often under a different name). A sizable portion of this literature is published on the preprint server arXiv in the category of hardware architecture, but I suspect we will soon see several journals focused on the topic that will supersede this. Another resource is the TinyML Research Symposium hosted by the TinyML Foundation, scheduled for March 2021, where we will likely see some new and exciting developments for the field.