Infinite Words with very Low Factor Complexity: an introduction to Combinatorics on Words

This paper introduces combinatorics on words through the lens of low factor complexity in infinite words, exploring their minimal complexity, formalizing non-triviality, and presenting new algebraic proofs and consequences for classical theorems by Morse, Hedlund, and Tijdeman.

Mélodie Andrieu

Published Tue, 10 Ma
📖 6 min read🧠 Deep dive

Here is an explanation of the paper "Infinite Words with very Low Factor Complexity" using simple language, creative analogies, and metaphors.

The Big Picture: Counting Patterns in an Infinite Song

Imagine you have a song that never ends. It's made of a sequence of notes (or letters). In mathematics, we call this an infinite word.

The paper asks a very specific question: How "complicated" is this song?

To measure complexity, the author uses a method called Factor Complexity. Imagine you are listening to the song and you stop to look at every possible chunk of 3 notes.

  • If the song is just 1-1-1-1-1..., there is only one type of 3-note chunk (111). It's very simple.
  • If the song is 1-2-1-2-1-2..., there are two chunks: 121 and 212. Still simple.
  • If the song is random noise, you might find every possible combination of notes. That is maximum complexity.

The paper is about finding the "Goldilocks" zone: words that are not boring (not just repeating the same pattern forever) but are also as simple as possible.


Chapter 1: The Binary Case (The Two-Note World)

The Old Rule (Morse & Hedlund, 1938):
Imagine a song made of only two notes, say "A" and "B".

  • If the song eventually settles into a repeating loop (like ABABAB...), it is considered "trivial" or boring.
  • The famous theorem says: If a song is NOT boring, it must have at least n+1n+1 different chunks of length nn.
    • Length 1: At least 2 chunks (A and B).
    • Length 2: At least 3 chunks.
    • Length 3: At least 4 chunks.

The "Sturmian" Stars:
The paper introduces a special class of songs called Sturmian words. These are the "perfect" non-boring songs. They hit the minimum limit exactly: they have exactly n+1n+1 chunks of length nn.

The Magic Connection:
The author explains that these perfect songs are deeply connected to Continued Fractions (a way of writing numbers like a recipe of integers).

  • The Analogy: Think of a Sturmian word as a staircase trying to walk up a hill with a specific slope.
  • If the slope is a simple fraction (like 1/2), the staircase repeats itself (boring).
  • If the slope is an irrational number (like the Golden Ratio, ϕ\phi), the staircase never repeats, but it stays as close to a straight line as possible without ever repeating. This "straightest possible non-repeating path" creates the Sturmian word.

Chapter 2: The Harder Problem (The Multi-Note World)

The Problem:
What happens if our song has 3, 4, or 10 different notes (a "d-ary" alphabet)?

  • In the 2-note world, "non-boring" just meant "doesn't repeat."
  • In the 3-note world, you can make a song that doesn't repeat but is still "fake" or "artificial." For example, you could take a 2-note Sturmian song and just insert a "C" every now and then. It doesn't repeat, but it's not a true 3-note song; it's just a 2-note song in disguise.

The New Rule (Tijdeman's Theorem):
To find the "true" non-boring songs with 3+ notes, we need a stricter rule. The notes must have Rationally Independent Frequencies.

  • The Analogy: Imagine a clock with 3 hands.
    • If the hands move in a ratio like 1:2:3, they will eventually all line up and repeat the pattern.
    • If the hands move at speeds that are "mathematically incompatible" (like 2\sqrt{2}, 3\sqrt{3}, and 5\sqrt{5}), they will never perfectly line up again. They are "rationally independent."
  • The Result: Tijdeman proved that for these "true" multi-note songs, the complexity must be at least (d1)n+1(d-1)n + 1.
    • For 3 notes (d=3d=3), you need at least $2n + 1$ chunks.
    • This is the new "minimum complexity" for a rich, non-repeating world.

Chapter 3: The New Proof (The Algebraic Detective Work)

The Old Way vs. The New Way:
Tijdeman proved his theorem in 1999 using heavy combinatorial logic (counting patterns like a detective).
In 2022, the author and a colleague (J. Cassaigne) found a new, algebraic proof.

The "Flow Matrix" Metaphor:
Imagine the song as a city map (a graph).

  • The Streets are the chunks of letters.
  • The Intersections are where the chunks overlap.
  • The Traffic is the frequency of how often you see a chunk.

The authors built a special tool called a Flow Matrix. Think of this as a Kirchhoff's Law for Traffic.

  • In an electrical circuit, the current flowing into a junction must equal the current flowing out.
  • In a word, the number of times a chunk appears must equal the number of times it is "entered" and "exited" by the next letter.

The Breakthrough:
By treating the word as a system of equations (linear algebra), they showed that if the "traffic" (frequencies) is truly independent (irrational), the "map" (the word) must have a very specific, tree-like structure.

The "Dendric" Discovery:
They discovered that all these minimal-complexity words are Dendric.

  • The Analogy: "Dendric" comes from the Greek word for Tree.
  • If you look at how a chunk of letters can be extended (what letter can go before it? what after it?), the possibilities form a Tree.
  • A tree has no loops. If you have a loop, the word is "too complex" or "trapped" in a cycle.
  • The Conclusion: The simplest possible non-repeating words are those where the connections between letters branch out like a tree, never circling back on themselves.

Summary: Why Does This Matter?

  1. Order in Chaos: It tells us the absolute limit of how simple a non-repeating pattern can be.
  2. Universal Language: It connects three different fields:
    • Numbers: (Irrational numbers and continued fractions).
    • Geometry: (Billiard balls bouncing on a table or walking on a torus).
    • Algebra: (Matrices and flow).
  3. The "Tree" Structure: It reveals that the most efficient, non-repeating patterns in nature and math share a hidden "tree-like" structure, ensuring they never get stuck in a loop.

In short, the paper teaches us that to build the most complex-yet-simple infinite song, you must use a "tree" structure and ensure your ingredients (letters) are mathematically incompatible, so they never fall into a predictable rhythm.