Reinforcement learning for closed-loop optimisation of spatiotemporal stimulation in patterned neuronal networks

This paper presents a low-cost, open-source, closed-loop reinforcement learning system for efficient, goal-directed optimization of spatiotemporal stimulation patterns in topologically constrained in vitro neuronal networks. The authors characterize the networks' state-dependent responses and demonstrate that learning agents can identify non-trivial stimulation strategies to evoke specific target activity motifs.

Original authors: Maurer, B., Vasiliauskaite, V., Hengsteler, J., Cathomen, G., Ruff, T., Schmid, C., Vörös, J., Ihle, S. J.

Published 2026-04-16

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you have a tiny, living city made of neurons (brain cells) growing on a special chip. This city has a specific layout, like a roundabout with four exits. Your goal is to send electrical "messages" (stimuli) to this city to make the traffic (spikes) flow in a perfect, clockwise circle around the roundabout.

The problem? This city is chaotic. It's not a simple machine where you push a button and get a predictable result. The neurons react differently depending on what happened a split second ago, and there are millions of possible ways to send those messages. Trying to find the perfect message by guessing randomly would take forever—like trying to find a specific grain of sand on a beach by picking one up every second.

Here is how the researchers solved this using a "Smart Coach" (Reinforcement Learning):

1. The Setup: A Living Circuit Board

The scientists grew brain cells on a grid of electrodes (microelectrode arrays). They used tiny, invisible walls (microfluidic channels) to force the cells to grow in a specific shape—a loop. This makes the "city" easier to understand, but it's still a living, breathing system that changes over time.

2. The Challenge: The "Black Box"

If you zap one part of the city, the neurons might fire in a circle. But if you zap the same spot 10 seconds later, they might not. Why? Because the neurons have a "memory" of what just happened.

  • The Analogy: Imagine trying to teach a dog to sit. If you say "Sit" when the dog is already tired, it might ignore you. If you say it when the dog is excited, it might jump. The dog's reaction depends on its current state. The researchers needed a way to figure out not just what to say, but when to say it based on how the dog (the network) was feeling.

3. The Solution: The Reinforcement Learning Agent

Instead of a human guessing, they built a computer program—an AI Agent—to act as a coach.

  • The Loop: The AI sends a signal (a "stimulus") to the chip.
  • The Reaction: The chip sends back a video of the neurons firing (the "response").
  • The Score: The AI checks the video. Did the neurons fire in a perfect clockwise circle? If yes, the AI gets a high score (reward). If not, it gets a low score.
  • The Learning: The AI tries different combinations of signals and uses the scores to improve its choices. Over time, it learns: "Ah, when I zap electrode A and then wait 2 milliseconds before zapping electrode B, the neurons dance in a circle!"
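The stimulate-score-learn loop above can be sketched as a minimal value-learning agent. Everything here is an illustrative assumption, not the authors' implementation: the electrode names, the epsilon-greedy exploration, and the reward function that counts how many evoked spikes match the clockwise target order.

```python
import random
from collections import defaultdict

ELECTRODES = ["A", "B", "C", "D"]   # hypothetical stimulation sites
EPSILON, ALPHA = 0.1, 0.5           # exploration rate, learning rate

# Expected reward for each candidate stimulus (action), learned from experience.
q_values = defaultdict(float)

def reward(response, target=("A", "B", "C", "D")):
    """Score how well an evoked spike sequence matches the clockwise target.
    `response` is a hypothetical ordered list of electrodes that fired."""
    hits = sum(r == t for r, t in zip(response, target))
    return hits / len(target)

def choose_action():
    """Epsilon-greedy: usually pick the best-known stimulus, sometimes explore."""
    if random.random() < EPSILON or not q_values:
        return random.choice(ELECTRODES)
    return max(q_values, key=q_values.get)

def learn(action, r):
    """Nudge the value estimate for this stimulus toward the observed reward."""
    q_values[action] += ALPHA * (r - q_values[action])
```

In practice the action space is far richer (multi-electrode, multi-timing patterns), but the core cycle of stimulate, score against the target motif, and update is the same.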

4. The Speed: Milliseconds Matter

Previous experiments were slow, like sending a letter and waiting a week for a reply. This new system is a real-time conversation.

  • The AI sends a signal, waits for the neurons to react (about 20 milliseconds), and decides the next move instantly.
  • It's like a game of ping-pong played at lightning speed, where the AI learns the perfect rhythm to keep the ball (the signal) moving in a circle.
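One cycle of that real-time conversation can be sketched as follows. The ~20 ms response window comes from the text above; the `stimulate`, `record`, and `decide` callables are placeholders for the hardware and agent interfaces, which this explainer does not specify.

```python
import time

RESPONSE_WINDOW_S = 0.020  # ~20 ms to wait for the evoked response (from the text)

def closed_loop_step(stimulate, record, decide):
    """One closed-loop cycle: stimulate, record the evoked response within the
    window, then let the agent choose the next stimulus immediately."""
    stimulate()
    deadline = time.monotonic() + RESPONSE_WINDOW_S
    response = record(until=deadline)   # placeholder recording interface
    return decide(response)             # placeholder agent policy
```

The point of the design is that each full cycle completes in tens of milliseconds, so thousands of learning iterations fit into a single experimental session.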

5. The Surprising Discovery: It's Not What You Expect

The researchers thought the AI would eventually learn to zap the electrodes in a perfect clockwise order (A, then B, then C, then D) to match the circle.

  • The Reality: The AI found a much weirder, more complex solution. It discovered that sometimes it needed to zap the electrodes in a chaotic order, or skip some entirely, to get the neurons to cooperate.
  • The Metaphor: Imagine trying to get a group of people to walk in a circle. You might think you need to tell them "Left, Left, Left." But the AI found that sometimes you need to yell "Right, Stop, Jump, Left" to get the group to move in a circle because of how they are reacting to each other. The AI found the "secret handshake" that works, even if it looks random to us.

6. The "State" Factor

The researchers also discovered that the neurons' reaction depends heavily on what happened just before.

  • The Analogy: If you ask a friend a question, their answer depends on whether they were just laughing or just crying. The AI learned to use this "mood" (state) to its advantage. It learned that if the network was in "Mood X," it should try "Signal Y," but if the network was in "Mood Z," it should try "Signal W."
  • While the AI got better at using these moods, the simplest strategy (just finding the one best signal) still worked almost as well as the complex mood-sensing strategy.
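A state-dependent policy like the one described can be sketched by indexing the learned values on the network's current "mood" as well as the stimulus. The state labels and signal names below are hypothetical placeholders, not quantities from the paper.

```python
from collections import defaultdict

STATES = ["quiet", "bursting"]        # hypothetical network "moods"
ACTIONS = ["signal_W", "signal_Y"]    # hypothetical stimulation patterns

# Q[(state, action)] -> expected reward; the state-blind strategy would
# instead keep a single value per action, ignoring the mood entirely.
q = defaultdict(float)

def best_action(state):
    """Pick the stimulus with the highest learned value for this mood."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

def update(state, action, r, alpha=0.5):
    """Move the (state, action) value estimate toward the observed reward."""
    q[(state, action)] += alpha * (r - q[(state, action)])
```

The comparison the authors report maps onto this sketch directly: condition the table on the state, or collapse it to one row, and see how much reward the extra context actually buys.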

Why Does This Matter?

This isn't just about making neurons dance in circles. It's a new tool for understanding the brain.

  • For Science: It gives us a way to systematically map how brain circuits work without needing to know every single connection beforehand.
  • For Medicine: It could help design better brain implants for people with epilepsy or Parkinson's. Instead of a doctor guessing the right electrical pattern to stop a seizure, an AI could learn the perfect pattern for that specific patient's brain in real-time.

In short: The researchers built a fast, smart robot coach that learned to talk to a living brain chip. It figured out the secret code to make the brain cells move in a circle, proving that we can use AI to understand and control the complex, messy world of biology.
