Trust Nothing: RTOS Security without Run-Time Software TCB (Extended Version)

This paper presents a novel capability architecture and a corresponding Zephyr-based real-time operating system that achieves comprehensive security for embedded devices by fully disaggregating and isolating all software subsystems and peripherals, thereby eliminating the need for a run-time software Trusted Computing Base (TCB) without requiring hardware modifications.

Eric Ackermann, Sven Bugiel

Published Tue, 10 Ma

Imagine your embedded device (like a smart thermostat, a medical device, or a network router) as a high-security bank vault.

Traditionally, this vault has a Guard (the Operating System Kernel) who holds the master keys. The Guard decides who gets to open which door. The problem? If the Guard gets sick, tricked, or hacked, the whole vault is compromised. Furthermore, the vault has Delivery Trucks (peripheral devices like Wi-Fi chips) that drive right up to the vault. If a truck is hijacked by a criminal, it can smash through the walls and steal everything, even if the Guard is doing their job.

Current security systems usually protect against either a compromised Guard or hijacked Trucks, but rarely both at the same time.

This paper introduces Skadi (the new OS) and Bredi (the new hardware), a system built on a concept called Northcape. Their philosophy is simple: "Trust Nothing."

Here is how it works, using everyday analogies:

1. The "Token" System (No Master Keys)

Instead of a Guard holding a master key, imagine every single item in the bank (a file, a memory block, a driver) has its own unique, unforgeable token.

  • The Old Way: You ask the Guard, "Can I open the safe?" The Guard checks a list and says, "Yes." If the Guard is lying or hacked, you get access.
  • The Skadi Way: You don't ask permission. You simply hold the token. If you don't have the token for that specific safe, the door physically won't open, no matter how much you scream or beg. Even the "Guard" (the OS kernel) doesn't have a master key; it only has the specific tokens it was given for the task at hand.
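To make the token idea concrete, here is a minimal toy model in Python. It is only a sketch: the `Gate` class, its method names, and the use of random values to stand in for "unforgeable" are all assumptions made for illustration, not how the paper's hardware actually represents or protects tokens.

```python
import secrets


class Gate:
    """Toy stand-in for the component that checks tokens (hypothetical)."""

    def __init__(self):
        self._grants = {}  # token -> (resource name, set of rights)

    def mint(self, resource, rights):
        # A random 128-bit value models a token that cannot be guessed.
        token = secrets.token_hex(16)
        self._grants[token] = (resource, frozenset(rights))
        return token

    def access(self, token, resource, right):
        # No caller identity is consulted: only the presented token matters.
        grant = self._grants.get(token)
        if grant is None:
            return False
        granted_resource, rights = grant
        return granted_resource == resource and right in rights


gate = Gate()
safe_token = gate.mint("safe", {"read"})

print(gate.access(safe_token, "safe", "read"))       # True
print(gate.access(safe_token, "safe", "write"))      # False
print(gate.access(safe_token, "vault", "read"))      # False
print(gate.access("guessed-token", "safe", "read"))  # False
```

Note that `access` never asks who is calling. There is no privileged caller that bypasses the check, which is the point of "no master key."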

2. The "Untrusted Guard" (No Trusted Kernel)

In most computers, the OS kernel is the "Trusted Computing Base" (TCB). It's the one thing you must trust. If it has a bug, you are doomed.

  • The Skadi Solution: They break the OS into tiny, isolated rooms called Subsystems.
    • The Scheduler (who decides who runs next) is just a small room.
    • The Memory Allocator (who gives out space) is another small room.
    • The Network Driver is yet another room.
  • The Magic: These rooms are mutually distrustful. The Scheduler cannot peek into the Memory Allocator's room. If the Scheduler gets hacked, the hacker is trapped in that tiny room and can't touch the rest of the bank.
  • The Result: Once the bank opens in the morning, the one staff member who had to be trusted during setup (the loader) locks itself in a tiny closet and throws away the key. From then on, the bank runs entirely on these small, untrusted rooms, so no running software needs to be trusted.
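One way to picture the benefit of the small rooms is to compare what an attacker can reach after compromising a single component. The sketch below assumes a made-up grant table (`GRANTS`) with invented subsystem and resource names; it does not reflect the real layout of Skadi.

```python
# Hypothetical grant table: which resources each subsystem holds tokens for.
GRANTS = {
    "scheduler": {"run_queue", "timer"},
    "allocator": {"heap_metadata"},
    "net_driver": {"nic_registers", "packet_buffers"},
}


def reachable_monolithic(compromised):
    # With one trusted kernel, any compromised part can reach everything,
    # so the argument is deliberately ignored.
    return set().union(*GRANTS.values())


def reachable_disaggregated(compromised):
    # With isolated rooms, the attacker gets only that room's tokens.
    return set(GRANTS[compromised])


print(sorted(reachable_disaggregated("scheduler")))  # ['run_queue', 'timer']
print(len(reachable_monolithic("scheduler")))        # 5
```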

3. The "Hijacked Delivery Truck" (DMA Protection)

Peripherals (like Wi-Fi chips) are like delivery trucks that can drive directly into the vault to drop off or pick up packages (Direct Memory Access). Usually, if a truck is hacked, it can drive anywhere.

  • The Skadi Solution: The system uses a Smart Gatekeeper (the Northcape hardware) sitting at the entrance of the vault.
  • When a truck (device) wants to enter, it must present a Token. The Gatekeeper checks the token.
    • "This token says you can only drop off packages in the Mailroom."
    • The truck tries to drive to the Vault? BAM. The gate slams shut.
  • Even if the truck is a "Trojan Horse" (a malicious device), it can only touch what its token explicitly allows. It cannot read the bank manager's private notes or steal the gold.
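The gatekeeper's check can be sketched as a simple bounds-and-permission test on every device access. This is an illustrative model only: the `DmaToken` fields, the `dma_allowed` function, and the addresses are assumptions, not the actual token format or check performed by the hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmaToken:
    base: int       # first address the device may touch
    length: int     # size of the permitted region in bytes
    writable: bool  # whether the device may write there


def dma_allowed(token, addr, size, write):
    """Return True only if the whole access stays inside the token's region."""
    if size <= 0:
        return False
    in_bounds = token.base <= addr and addr + size <= token.base + token.length
    return in_bounds and (token.writable or not write)


# Hypothetical "Mailroom" buffer: 256 bytes starting at 0x2000.
MAILROOM = DmaToken(base=0x2000, length=0x100, writable=True)

print(dma_allowed(MAILROOM, 0x2000, 64, write=True))   # True
print(dma_allowed(MAILROOM, 0x20F0, 64, write=True))   # False (runs past the end)
print(dma_allowed(MAILROOM, 0x8000, 4, write=False))   # False (outside the region)
```

The second case matters as much as the third: an access that starts inside the Mailroom but spills over its edge is rejected as a whole.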

4. The "Instant Switch" (Subsystem Calls)

How do these isolated rooms talk to each other without the Guard?

  • They use Subsystem Calls. Think of this as a secure pneumatic tube.
  • If the Network Driver needs to send data to the Scheduler, it puts the data in a tube.
  • The tube automatically checks the ID of the sender and the receiver. It swaps the "ID badge" of the person using the tube.
  • Crucially, the system wipes the slate clean (clears the registers) before and after the transfer. It's like putting on a fresh pair of gloves before handling evidence. This ensures no secret data leaks from one room to another.
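The "fresh gloves" step can be modeled in a few lines. The sketch below assumes a toy CPU with eight registers and a `current` field for the ID badge; `Cpu`, `subsystem_call`, and the `enqueue` handler are hypothetical names, and the real mechanism is certainly more involved.

```python
class Cpu:
    def __init__(self):
        self.regs = [0] * 8
        self.current = "net_driver"  # the "ID badge" of the running subsystem


def subsystem_call(cpu, target, handler, args):
    assert len(args) <= len(cpu.regs)
    caller = cpu.current
    # Before the switch: wipe everything, then load only the explicit arguments.
    cpu.regs = [0] * len(cpu.regs)
    cpu.regs[:len(args)] = args
    cpu.current = target
    result = handler(cpu)
    # After the call: wipe again so the callee's scratch values do not leak back.
    cpu.regs = [0] * len(cpu.regs)
    cpu.regs[0] = result
    cpu.current = caller
    return result


seen = {}


def enqueue(cpu):
    # Record what the callee can observe, then leave a scratch value behind.
    seen["regs"] = list(cpu.regs)
    seen["who"] = cpu.current
    cpu.regs[3] = 0xBEEF
    return cpu.regs[0] + 1


cpu = Cpu()
cpu.regs[5] = 0xDEAD  # a secret left in a register by the caller
result = subsystem_call(cpu, "scheduler", enqueue, [7])

print(seen["regs"])  # [7, 0, 0, 0, 0, 0, 0, 0]
print(seen["who"])   # scheduler
print(cpu.regs)      # [8, 0, 0, 0, 0, 0, 0, 0]
print(cpu.current)   # net_driver
```

The callee never sees the caller's `0xDEAD`, and the caller never sees the callee's `0xBEEF`; only the argument and the result cross the boundary.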

5. The "Emergency Brake" (Real-Time Safety)

A common fear is: "If you break the OS into so many tiny pieces, won't it be too slow? What if a hacker freezes one room?"

  • The Solution: The system uses Non-Maskable Interrupts (NMIs).
  • Imagine a fire alarm that cannot be turned off. Even if a hacker tries to disable the alarm (by freezing a room), the hardware forces the system to listen to the alarm.
  • This ensures that critical services (like keeping a heart monitor running or blocking a network attack) always get to run, even if the rest of the system is under attack.
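The difference between an ordinary interrupt and a non-maskable one can be shown with a tiny simulation. This is a deliberately simplified sketch: the `Core` class and the interrupt names are invented, and real interrupt controllers have far more state than a single mask bit.

```python
class Core:
    def __init__(self):
        self.irq_masked = False
        self.pending = []    # ordinary interrupts waiting for the mask to lift
        self.delivered = []  # interrupts whose handlers actually ran

    def raise_irq(self, name):
        # Ordinary interrupts wait for as long as the mask bit is set.
        if self.irq_masked:
            self.pending.append(name)
        else:
            self.delivered.append(name)

    def raise_nmi(self, name):
        # Non-maskable: delivered regardless of the mask bit.
        self.delivered.append(name)


core = Core()
core.irq_masked = True       # a compromised room tries to "disable the alarm"
core.raise_irq("timer")      # stuck behind the mask
core.raise_nmi("deadline")   # gets through anyway

print(core.pending)    # ['timer']
print(core.delivered)  # ['deadline']
```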

The Trade-off: Security vs. Speed

The authors admit that this system is a bit "heavier" than a standard one.

  • Analogy: It's like having a bank with 100 tiny, reinforced doors instead of one big main door. Every time you move between rooms, you have to go through a security check.
  • The Result: It is slightly slower (overhead) and takes up more space (chip area).
  • The Verdict: For critical infrastructure (like 5G towers, medical devices, or military systems), speed is less important than not getting hacked. The paper shows that while it's slower, it's still fast enough for real-world use, and the security gain is massive.

Summary

Skadi and Bredi are like building a fortress where:

  1. No running software is trusted (not the OS, not the drivers), and neither are the peripherals.
  2. Access is granted only by tiny, unforgeable tokens.
  3. If one part is hacked, the damage is contained to that tiny room.
  4. Malicious devices (hacked trucks) are physically blocked from touching anything they aren't explicitly allowed to.

It's a shift from "Trust the Guard" to "Trust the Locks."