HiFiGaze: Improving Eye Tracking Accuracy Using Screen Content Knowledge

HiFiGaze improves gaze estimation accuracy on consumer devices by leveraging known screen content to segment and analyze the screen's reflection in the user's eyes, achieving an approximately 8% reduction in mean tracking error compared to baseline models.

Taejun Kim, Vimal Mollyn, Riku Arakawa, Chris Harrison

Published 2026-03-23

The Big Idea: Turning Your Eyes into a Mirror

Imagine you are looking at your smartphone. Your eyes are like tiny, curved mirrors. Just like a mirror in a hallway reflects the room behind you, your eyes reflect the screen you are staring at.

For a long time, eye-tracking technology on phones has been a bit like trying to guess where someone is looking by just looking at their face. It's okay, but it's often a little fuzzy—like trying to read a sign from 50 feet away. You might be off by a few inches, which is annoying if you want to tap a small button just by looking at it.

HiFiGaze is a new method that says: "Wait a minute! We don't just need to look at the eyes; we need to look at what the eyes are reflecting, and we need to know exactly what is on the screen."

The Problem: The "Glitchy Glint"

In the past, researchers tried to find a "glint" (a tiny spark of light) in the eye to figure out where you are looking.

  • The Analogy: Imagine you are looking at a TV screen. If the screen is just a blank white wall, the reflection in your eye is a simple white dot. Easy to find.
  • The Reality: But screens are full of movies, games, and text. The reflection in your eye is now a tiny, distorted, colorful mess. If you just look for a "dot," you might get confused. Is that reflection from a bright logo? Or a white letter? Or a glare? It's like trying to find a specific needle in a haystack that is constantly changing shape (the sketch after this list shows why).
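
To make that concrete, here is a minimal sketch (in Python with OpenCV) of the classic approach: threshold the eye image and treat the brightest blob as "the" glint. The function name and threshold here are illustrative, not from the paper; the point is to show where the logic falls apart once the screen shows real content.

```python
import cv2
import numpy as np

def naive_glint_detector(eye_gray: np.ndarray, thresh: int = 230):
    """Classic glint finding: assume the screen reflection is the single
    brightest spot in the eye image. Works for a blank white screen;
    breaks when the screen shows real content. (Illustrative sketch.)"""
    # Keep only very bright pixels.
    _, bright = cv2.threshold(eye_gray, thresh, 255, cv2.THRESH_BINARY)
    # Group them into connected blobs.
    n_labels, _, stats, centroids = cv2.connectedComponentsWithStats(bright)
    if n_labels <= 1:          # label 0 is the background
        return None            # no glint found at all
    # With a movie playing, several blobs can pass the threshold at once;
    # picking the largest one is an arbitrary (and unstable) choice.
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return tuple(centroids[largest])   # (x, y) of the chosen "glint"
```

Frame to frame, a different blob can win, so the detected "glint" jumps around even when your gaze hasn't moved.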

The Solution: The "Smart Detective"

HiFiGaze works like a super-smart detective who has two pieces of evidence:

  1. The Crime Scene Photo: A high-quality picture of your eye (taken by your phone's 4K selfie camera).
  2. The Suspect List: The phone knows exactly what is currently displayed on the screen.

How it works:
Instead of guessing, the phone takes the image of what's on the screen (like a tiny, blurry thumbnail) and tries to "match" it against the reflection in your eye.

  • The Analogy: Think of it like a jigsaw puzzle. The phone has the picture of the puzzle piece (the screen content) and the box lid (the reflection in your eye). It slides the piece around until it fits perfectly. Once it finds the perfect fit, it knows exactly where your eye is looking.

Because the phone knows the screen content, it doesn't get confused by the messy colors. It can say, "Ah, that reflection matches the 'Play' button on the video player, so the user is looking at the bottom right."
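
In classical computer vision terms, this "slide until it fits" step is template matching. Below is a minimal sketch using normalized cross-correlation in OpenCV. It assumes we already have a crop around the corneal reflection and a small thumbnail of the current screen frame, and it ignores the distortion the curved cornea adds (which a real pipeline would have to model); none of these function names come from the paper.

```python
import cv2
import numpy as np

def locate_screen_in_reflection(reflection: np.ndarray,
                                screen_thumb: np.ndarray):
    """Slide the known screen content across the eye-reflection crop and
    return the offset where it fits best. Simplified sketch: the cornea
    is curved, so a real pipeline would first unwarp the reflection."""
    # Normalized cross-correlation tolerates the overall brightness gap
    # between the bright screen and its dim reflection.
    scores = cv2.matchTemplate(reflection, screen_thumb, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, best_xy = cv2.minMaxLoc(scores)
    return best_xy, best_score  # top-left offset of the best fit, and its quality
```

Where the screen's reflection lands relative to the pupil is what indicates gaze direction, and a low matching score can also signal that something (like an eyelid) is blocking the reflection.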

Why This is a Big Deal

The researchers tested this on an iPhone 14 Pro Max. Here is what they found:

  • Old Way (Just looking at the eye): About 2 cm of error. (That's like missing a coin on a table by the width of two fingers).
  • New Way (HiFiGaze): About 1.6 cm of error.
  • The Result: Roughly an 18% improvement. In the world of eye tracking, that's huge. It means you can tap small icons just by looking at them, without having to calibrate first. (The quick conversion below puts those centimeters in perspective.)
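
Eye-tracking accuracy is often quoted in degrees of visual angle, so here is a quick back-of-the-envelope conversion. The ~30 cm phone-viewing distance is our assumption for illustration, not a number from the paper:

```python
import math

VIEWING_DISTANCE_CM = 30.0  # assumed arm's-length phone distance

def error_in_degrees(error_cm: float) -> float:
    """Convert an on-screen error (cm) into degrees of visual angle."""
    return math.degrees(math.atan(error_cm / VIEWING_DISTANCE_CM))

print(f"old: {error_in_degrees(2.0):.1f} deg")  # ~3.8 degrees
print(f"new: {error_in_degrees(1.6):.1f} deg")  # ~3.1 degrees
```

For comparison, dedicated infrared eye trackers often advertise under a degree of error, which is why every fraction of a centimeter matters on a camera-only phone setup.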

The "Eyelash" Problem and the Bottom Camera

There was one snag. When you look down at the bottom of your phone, your eyelashes and upper eyelid sometimes block the reflection, like a curtain closing over a window. This made the system less accurate for the bottom of the screen.

The Fix: The researchers ran a simple but telling experiment: they turned the phone upside down so the camera sat at the bottom of the screen.

  • The Analogy: It's like moving a security camera from the ceiling to the floor. Suddenly, the "curtain" (eyelashes) isn't blocking the view anymore.
  • The Result: Accuracy got even better! This suggests that future phones might put cameras at the bottom of the screen (or under the screen) to make eye tracking perfect.

Why Should You Care?

You don't need special glasses or expensive headsets for this. It uses the cameras and screens you already have in your pocket.

  • Accessibility: It could help people with limited hand movement control their phones just by looking.
  • Gaming: Imagine playing a game where you aim a weapon just by looking at the target.
  • Privacy: The system can work with just a blurry "heat map" of the reflection, so it never needs to store a high-definition photo of your actual eye (see the sketch after this list).
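
Here is a sketch of what that privacy-friendly step could look like: blur and aggressively downsample the eye crop so only a coarse brightness map survives, destroying identifying iris detail before anything is stored. The resolution and blur settings are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def to_reflection_heatmap(eye_crop: np.ndarray, size: int = 16) -> np.ndarray:
    """Reduce a high-resolution eye image to a coarse brightness 'heat
    map'. At 16x16, the rough position of the screen reflection survives,
    but fine identifying detail (iris texture) does not."""
    gray = cv2.cvtColor(eye_crop, cv2.COLOR_BGR2GRAY)
    soft = cv2.GaussianBlur(gray, (9, 9), 0)            # wash out fine detail
    coarse = cv2.resize(soft, (size, size),
                        interpolation=cv2.INTER_AREA)   # shrink to a tiny grid
    return coarse.astype(np.float32) / 255.0            # normalize to [0, 1]
```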

In a Nutshell

HiFiGaze is like giving your phone a pair of super-vision glasses. It realizes that your eyes are mirrors reflecting the screen, and by comparing that reflection to what the phone knows is on the screen, it can guess where you are looking with much higher precision than ever before. It turns your phone from a "dumb" screen into a device that truly understands your attention.
