
Edge AI: Making smart choices for smarter devices

When it comes to AI, bigger is not always better.

Scientists estimate that training OpenAI’s giant GPT-3 text-generating model used enough energy to drive a car to the Moon and back.

Reducing AI’s carbon footprint is a priority. But the issues with big AI are not just about energy efficiency and sustainability. Other requirements – such as privacy and latency – also create compelling reasons to move AI to the Edge. Indeed, the viability of many use cases and applications depends on using Edge AI.

What’s Driving AI to the Edge?

Privacy and security

Securing personal data is a key priority for service providers – breaches bring customer dissatisfaction and financial penalties. Smart homes, consumer devices, retail and smart city systems use voice interactions and video footage to develop ever-smarter capabilities for facial recognition, intent prediction and emotion detection. As these systems learn more about people, they generate more personal data.

Over the years, poor data security practices have led to many security breaches. This has led to new codes of practice becoming legal requirements across the world. For example, the UK is planning to be the first country to enshrine Secure by Design principles in law. Other countries are expected to follow.

Using proportionate security measures and keeping data local can simplify compliance with regulations and reduce the opportunities for privacy breaches while data is in transit or stored centrally.

Data transfer

Many IoT deployments use a narrow bandwidth technology, such as NB-IoT or LTE-M, for connectivity. The more devices and the more data they send, the greater the potential for bottlenecks. Even with a wider bandwidth technology, sending vast amounts of data from devices is inefficient and expensive – especially if much of the data is irrelevant (for example, security footage when nothing is happening).

Reducing the amount of data that needs to be sent to the cloud can save cost and improve performance.
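As a rough illustration of this idea (the function names and threshold below are hypothetical, chosen for the sketch rather than taken from any real system), an Edge device might compare consecutive frames and only forward those that have changed significantly – dropping, for example, security footage in which nothing is happening:

```python
# Hypothetical sketch: forward only frames whose change exceeds a threshold.
# A "frame" here is simply a list of pixel intensities.

def frame_delta(prev, curr):
    """Mean absolute difference between two equal-length frames."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)

def frames_to_send(frames, threshold=10.0):
    """Yield only frames that changed enough since the last forwarded frame."""
    last_sent = None
    for frame in frames:
        if last_sent is None or frame_delta(last_sent, frame) > threshold:
            last_sent = frame
            yield frame

# Example: three identical frames followed by one changed frame.
frames = [[0] * 8, [0] * 8, [0] * 8, [50] * 8]
sent = list(frames_to_send(frames))
# Only the first frame and the changed frame would be uploaded.
```

In a real deployment the comparison would run on compressed or downsampled data and the threshold would be tuned per use case, but the principle – spend a little local computation to avoid a lot of transmission – is the same.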

Latency, reliability and availability

Performance can be critical to success. Lives may depend on a fast response, for example, in autonomous vehicles, robotic surgery or monitoring factory machinery. Even in less critical situations, good user experience relies on responsive devices, such as when speaking the wake word to a voice assistant.

Autonomous applications can respond faster and more reliably than applications that rely on central cloud systems for instructions.

Power consumption

Many IoT devices can’t use mains power: they may be mobile, sited where no electricity supply exists, or deployed in numbers too large to wire up. Battery-powered devices must use energy extremely carefully, particularly if their batteries can’t easily be replaced or recharged.

The energy required to transfer data can be an order of magnitude greater than that required for computation, so it’s beneficial to process data locally where possible – even if only to filter out data that doesn’t need to be sent to the cloud.


The number of machine learning models in the cloud has grown fast, thanks to improvements in processor speeds and advances in big data. But centralised systems aren’t always best placed to support the growing demands of heavy-workload IoT applications, such as those in healthcare, manufacturing and transportation.

Developing intelligence at the Edge can help meet the growing need for more AI in homes, cities, factories, healthcare, retail, transportation, business and more.

How to Get Started with Edge AI

Until recently, the cloud was the natural choice for developing ever-smarter AI, thanks to powerful processors complemented by the latest accelerator technologies (such as GPUs, NPUs, TPUs and DSPs), virtually unlimited memory and storage, sophisticated systems, and well-developed frameworks and software tools. By contrast, the type of system that can be embedded in an IoT device is more constrained for reasons such as power usage and cost.

The good news is that it’s not all or nothing. The optimum solutions are often a balance between cloud and Edge. The cloud is best for handling big data, training neural network models, orchestration, and running inferences that are complex, depend on off-device data or can be done offline. Edge AI is best for inference on smaller, self-contained models where autonomy, a fast response or low power consumption is required.

Even so, porting machine learning models developed with almost limitless resources to constrained systems with very few resources is – not surprisingly – a challenge. Some of the key options and considerations are outlined below.

Hardware for the Edge

Manufacturers are responding to the growing demand for Edge AI by developing cost-effective and energy efficient hardware that can support a wide range of Edge AI solutions. These range from discrete semiconductor devices to off-the-shelf modular development platforms. Options include:

  • Processors with architectures that use technologies suitable for machine learning models, such as GPUs, NPUs and DSPs.
  • High-speed neural network accelerators that can be integrated with microprocessors.
  • Complete system-on-chip solutions that include a processor, an integrated accelerator, and camera and microphone inputs.

Manufacturers are incorporating optimisation techniques, such as Single Instruction, Multiple Data (SIMD) and Very Long Instruction Word (VLIW), that enable the parallel processing that’s needed for machine learning inference.

Frameworks and tools

There are open-source AI development frameworks available for Edge AI with varying capabilities and characteristics. When choosing a framework, factors to consider include performance, coding language, pretrained models, optimisation techniques, commercial support and licensing terms.

Interoperability solutions such as Open Neural Network Exchange (ONNX) are available for application use cases that require more than one AI framework.

Most solutions enable the use of standard software development tools, as long as the required libraries, extensions and solution optimisations are included. Some device manufacturers also provide direct hardware access via proprietary SDK implementations.


Hardware and software technologies used in the cloud are well understood and relatively easy to manage. In simplifying frameworks and tools for the Edge, any functionality that isn’t essential is removed. This can include facilities such as debugging, visualisation and explainability, making development and deployment harder to manage.

Cloud models are developed in high-level programming languages, such as Python, which produce large runtime executables. These need to be greatly reduced in size for Edge devices, using techniques such as model distillation, quantisation and cross-compiling. It’s important to verify that the optimised model still meets the accuracy requirements.
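To make the quantisation step concrete, the sketch below shows the common affine (scale and zero-point) scheme for mapping 32-bit float weights to 8-bit integers – a roughly 4× size reduction. This is an illustrative, pure-Python outline of the technique, not the API of any particular framework:

```python
# Illustrative affine int8 quantisation: floats -> {0..255} via scale/zero-point.

def quantise(weights, num_bits=8):
    """Map float weights to integers using a scale and zero-point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against constant weights
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantise(q, scale, zero_point):
    """Approximate reconstruction of the original floats."""
    return [(v - zero_point) * scale for v in q]

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantise(weights)
approx = dequantise(q, scale, zp)
# Each reconstructed value lies within one quantisation step of the original.
```

Verifying that reconstruction error stays within one quantisation step, as the last comment notes, is a small-scale version of the accuracy check the article recommends: the optimised model must still meet its requirements.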

Any microprocessor chosen for Edge AI must have enough memory for the operating system and libraries, a neural network interpreter, the model, and the results produced during inference.
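A back-of-envelope budget covering those four items is often a useful first sanity check. The figures below are illustrative assumptions for a hypothetical 512 KiB device, not vendor specifications:

```python
# Back-of-envelope check that an Edge AI workload fits a microcontroller's RAM.
# All figures are illustrative assumptions, not vendor specifications.

def fits_in_ram(ram_kib, os_and_libs_kib, interpreter_kib,
                model_kib, activation_kib):
    """Return (fits, headroom_kib) for a simple additive memory budget."""
    used = os_and_libs_kib + interpreter_kib + model_kib + activation_kib
    return used <= ram_kib, ram_kib - used

# Hypothetical device with 512 KiB of RAM.
fits, headroom = fits_in_ram(
    ram_kib=512,
    os_and_libs_kib=128,   # operating system and support libraries
    interpreter_kib=64,    # neural network interpreter
    model_kib=200,         # quantised model weights
    activation_kib=96,     # intermediate buffers produced during inference
)
# fits is True with 24 KiB of headroom.
```

Real budgets are less tidy – activation memory depends on the model architecture and how buffers are reused – but an additive estimate like this quickly rules out hardware that clearly cannot host the model.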

Finding Expert Help for your Edge AI Project

The path to developing AI in the cloud is well-understood. By contrast, integrating machine learning models into an embedded system environment is a new and complex field.

Specialist skills are required to create more efficient algorithms, data representations and computation methods. Understanding the available hardware and software options and matching these to Edge AI use cases and applications requires a deep understanding of embedded system technologies.

Our next article describes a number of Edge AI options and shows how you can decide what is right for your requirements.

At Consult Red, we have a proven track record in delivering innovative embedded solutions for customers worldwide. Our extensive knowledge of embedded system and AI technologies makes us ideally placed to help you navigate the challenges and deliver a successful Edge AI solution.

Read More About Edge AI
Helping you navigate the technology landscape from chip to cloud