Machine Learning 22 November 2024 10 min read

Edge AI and TinyML: Intelligence at the Sensor Level

Running ML inference on microcontrollers with milliwatt power budgets is now practical. We explore how TinyML is enabling real-time analysis at the data source.

Edge AI · TinyML · IoT · Embedded Systems · Sensors
[Image: Close-up of electronic circuit board and microprocessor — Alexandre Debieve on Unsplash]

The default assumption in ML is that inference happens in the cloud—send data up, get predictions back. But for many geospatial applications, this architecture fails. Remote sensors lack connectivity, latency requirements are sub-second, bandwidth is expensive, or privacy mandates local processing. Edge AI and TinyML flip the model: bring intelligence to the data, not data to the intelligence.

What Is TinyML?

TinyML refers to machine learning inference on microcontrollers with milliwatt power budgets—devices with perhaps 256KB of RAM and 1MB of flash storage. These aren't Raspberry Pis; they're the chips inside sensors, wearables, and IoT devices that run for years on a coin cell battery.

The technical achievement is remarkable. Through aggressive model compression—quantization, pruning, knowledge distillation—researchers have squeezed vision models into 286-536KB deployable footprints. Inference takes 3-15ms per frame with energy consumption measured in microjoules.

TinyML systems do not transfer data to any server for inference, as machine learning functions are executed on the device itself. This makes TinyML very appropriate for real-time applications requiring immediate feedback.
Scientific Reports, 2025

Model Compression Techniques

Key Optimization Approaches

  • 8-bit post-training quantization — 3-4× storage reduction with minimal accuracy loss
  • 4-bit and 2-bit k-means quantization — Up to 90% memory reduction
  • 2:4 sparsity patterns — 50% weight pruning with hardware acceleration
  • Knowledge distillation — Train small student models from large teachers
  • Neural Architecture Search — Automatically find efficient architectures for hardware constraints

The state of the art is impressive: quantized MobileNet variants maintain accuracy ≥0.85 while fitting in embedded flash budgets. For many classification and detection tasks, the compressed models are indistinguishable from full-precision versions.
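To make the quantization numbers concrete, here is a minimal numpy sketch of 8-bit affine (asymmetric) post-training quantization applied to a single weight tensor. The function names are illustrative; real toolchains apply this per layer, with calibration data to pick ranges:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Affine post-training quantization of a float32 tensor to int8."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 or 1.0        # one int8 step in float units
    zero_point = int(round(-128 - w_min / scale))  # maps w_min onto -128
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.randn(256, 128).astype(np.float32)
q, scale, zp = quantize_int8(weights)
recovered = dequantize(q, scale, zp)

print(weights.nbytes / q.nbytes)  # 4.0 — int8 stores 4x less than float32
```

The reconstruction error is bounded by half a quantization step per weight, which is why accuracy loss is typically small: the noise added is tiny relative to typical weight magnitudes.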

Geospatial Applications

For sensor networks in remote locations—exactly the kind of infrastructure we work with—TinyML enables capabilities that were previously impossible:

Edge AI for Geospatial

  • Wildlife monitoring — On-device species classification from camera trap images
  • Seismic detection — Real-time earthquake classification at the sensor
  • Agricultural sensing — Crop disease detection from multispectral cameras
  • Water quality — Anomaly detection from turbidity and chemical sensors
  • Infrastructure monitoring — Crack detection on bridge strain sensors

The bandwidth savings are dramatic. Instead of streaming continuous sensor data to the cloud, edge devices transmit only detections or anomalies. A wildlife camera that would generate terabytes of video annually can report only confirmed sightings, reducing data transfer by orders of magnitude.
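The report-only-detections pattern is simple to sketch. The frame sizes, threshold, and scoring function below are hypothetical stand-ins, but they show how the bandwidth arithmetic works:

```python
import random

FRAME_BYTES = 4096      # raw sensor frame size (hypothetical)
DETECTION_BYTES = 64    # compact detection report (hypothetical)
THRESHOLD = 0.9         # confidence required to transmit

def classify(frame):
    """Stand-in for on-device inference; returns a confidence score."""
    return frame["score"]

def bytes_transmitted(frames):
    sent = 0
    for frame in frames:
        if classify(frame) >= THRESHOLD:
            sent += DETECTION_BYTES  # send only the detection record
    return sent

random.seed(42)
frames = [{"score": random.random()} for _ in range(10_000)]

streamed = len(frames) * FRAME_BYTES   # cloud-streaming baseline
edge = bytes_transmitted(frames)       # edge-filtered alternative
print(f"baseline {streamed} B, edge {edge} B, {streamed / edge:.0f}x reduction")
```

With roughly one frame in ten crossing the threshold and a 64:1 ratio between raw frames and detection records, the reduction lands in the hundreds, which is the "orders of magnitude" regime described above.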

The Hardware Landscape

Purpose-built silicon is accelerating TinyML adoption:

Edge AI Hardware Options

  • NVIDIA Jetson Orin — AI-optimized SoC with 100+ TFLOPS for edge servers
  • Google Edge TPU v3 — INT8 inference under 10ms for small model blocks
  • ARM Ethos-U85 NPU — Designed for Cortex-M microcontrollers
  • Syntiant NDP — Ultra-low-power audio/sensor inference
  • ESP32-S3 — WiFi microcontroller with vector instructions for ML

For our seismic sensor networks, we've evaluated ESP32-based designs running TensorFlow Lite Micro. The combination provides sufficient compute for waveform classification while maintaining the power efficiency needed for solar-powered remote deployments.
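To illustrate the kind of pre-processing such a deployment performs before (or instead of) neural-network inference, here is a small numpy sketch of compact waveform features. The sampling rate, feature choices, and test signals are all illustrative, not our production pipeline:

```python
import numpy as np

def waveform_features(x: np.ndarray, fs: float) -> np.ndarray:
    """Compact features a microcontroller could compute per waveform window."""
    rms = np.sqrt(np.mean(x ** 2))                       # overall energy
    zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2       # zero-crossing rate
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    peak_freq = freqs[np.argmax(spectrum[1:]) + 1]       # dominant frequency, DC excluded
    return np.array([rms, zcr, peak_freq])

fs = 100.0                                   # 100 Hz sampling, common for seismic sensors
t = np.arange(0, 10, 1 / fs)
quiet = 0.01 * np.sin(2 * np.pi * 1.0 * t)   # low-amplitude background
event = 0.5 * np.sin(2 * np.pi * 8.0 * t)    # higher-amplitude, higher-frequency burst

print(waveform_features(quiet, fs))  # small RMS, peak near 1 Hz
print(waveform_features(event, fs))  # larger RMS, peak near 8 Hz
```

Feeding a handful of features like these into a tiny classifier, rather than raw waveforms into a large network, is one way to stay inside a microcontroller's memory and power budget.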

Edge LLMs: The 2025 Frontier

The frontier is moving fast. Small language models (under 9B parameters) with aggressive quantization can now run on edge devices, enabling natural language interaction with sensor data. Imagine querying a remote monitoring station: "Has there been unusual seismic activity in the last 24 hours?"—processed entirely on-device.

Technologies like uTensor with CMSIS-NN kernels are enabling LLM blocks on ARM Cortex-M devices with under 256MB DRAM. This isn't GPT-4, but for constrained inference tasks, edge LLMs are becoming viable.

Frameworks and Tooling

tflite_conversion.py
import tensorflow as tf

# Convert a SavedModel for TensorFlow Lite Micro with full-integer quantization.
# representative_dataset_gen must yield calibration batches shaped like the model input.
converter = tf.lite.TFLiteConverter.from_saved_model(model_path)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.representative_dataset = representative_dataset_gen
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()

# Write the flatbuffer for deployment to the microcontroller
with open("model.tflite", "wb") as f:
    f.write(tflite_model)

TensorFlow Lite Micro remains the dominant framework, but alternatives are emerging. TensorFlores generates platform-agnostic C++ code for embedded systems. Edge Impulse provides end-to-end workflows from data collection to deployment.

Our Perspective

For organizations operating remote sensor networks—and this describes much of what we do at Geoscience Australia and similar agencies—edge AI is transformative. The ability to run inference at the sensor changes the economics of environmental monitoring.

But I'd caution against over-engineering. For many applications, simple threshold-based logic still outperforms ML. The right question isn't "can we run a neural network on this sensor?" but "does ML provide meaningfully better decisions than classical signal processing?" Start simple, add ML complexity only when the value proposition is clear.
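As a concrete example of that classical baseline, a short-term/long-term average (STA/LTA) trigger, the standard non-ML seismic detector, takes only a few lines of numpy. The window lengths and injected test signal here are illustrative:

```python
import numpy as np

def sta_lta(x: np.ndarray, n_sta: int, n_lta: int) -> np.ndarray:
    """Ratio of short-term to long-term average signal power, a classic seismic trigger."""
    power = x ** 2

    def smooth(n: int) -> np.ndarray:
        return np.convolve(power, np.ones(n) / n, mode="same")

    return smooth(n_sta) / np.maximum(smooth(n_lta), 1e-12)

rng = np.random.default_rng(1)
signal = 0.1 * rng.standard_normal(2000)             # background noise
signal[1000:1100] += 2.0 * rng.standard_normal(100)  # injected high-energy "event"

ratio = sta_lta(signal, n_sta=20, n_lta=500)
print(f"peak STA/LTA {ratio.max():.1f} at sample {int(ratio.argmax())}")
```

A fixed threshold on this ratio already catches the injected event, with no model, no quantization, and a few hundred multiply-adds per sample. That is the bar an on-sensor neural network has to clear.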
