From Perception to Cognition: How Latency Affects Interaction Fluency and Social Presence in VR Conferencing

This paper investigates how end-to-end latency impacts interaction fluency and social presence in VR conferencing compared to traditional video conferencing through subjective experiments, aiming to clarify the underlying perceptual and cognitive mechanisms to guide system optimization.

Jiarun Song, Ninghao Wan, FuZheng Yang, Weisi Lin

Published Wed, 11 Ma

Below is a plain-language explanation of the paper, with a few analogies along the way.

🌐 The Big Picture: The "Glitch" in the Virtual Room

Imagine you are in a video call with a friend. Usually, you talk, they listen, and they reply instantly. It feels natural, like you are in the same room.

Now, imagine there is a delay (latency). You say "Hello," but your friend doesn't hear it for two seconds. They reply, and you don't hear them for another two seconds. Suddenly, the conversation feels awkward, like a bad walkie-talkie connection. You might talk over each other, or you might feel like your friend isn't listening.
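The arithmetic behind that awkwardness can be sketched in a few lines. This is an illustrative calculation rather than anything taken from the paper: the function name `perceived_gap_s`, the assumption that the delay is the same in both directions, and the 0.5-second pause before the friend answers are all made up for the example.

```python
def perceived_gap_s(one_way_delay_s: float, partner_response_s: float) -> float:
    """Silence you sit through between finishing a sentence and hearing the reply.

    Assumes the delay is identical in both directions (a simplification).
    """
    # Your words travel to your friend, your friend takes a moment to respond,
    # and the reply then travels back to you.
    return one_way_delay_s + partner_response_s + one_way_delay_s


# Hypothetical numbers: a 2-second one-way delay and a 0.5-second pause
# before the friend starts answering.
print(perceived_gap_s(2.0, 0.5))  # 4.5
```

The point of the sketch is that the delay is paid twice per exchange, so a 2-second delay turns a half-second pause into four and a half seconds of silence.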

This paper is a scientific study about Virtual Reality (VR) meetings. The researchers wanted to know: How much delay can we handle before the meeting feels broken? And, does a VR meeting (where you see digital avatars) feel different from a normal video call (where you see real people on a screen)?

They looked at two main things:

  1. Interaction Fluency (The "Flow"): How smooth does the conversation feel? (Perception)
  2. Social Presence (The "Connection"): Do you feel like you truly understand your friend's feelings and intentions? (Cognition)

🧪 The Experiment: The "Speed Dating" of Delays

The researchers set up a lab where pairs of friends had conversations under different conditions:

  • Scenario A: A normal video call on a laptop (like Zoom).
  • Scenario B: A VR call where they wore headsets and saw each other as 3D digital avatars.

They introduced delays ranging from barely noticeable (100 ms) to very long and painful (3 seconds). They also gave the pairs three types of tasks to do:

  1. Counting: "One, two, three..." (Fast, no thinking required).
  2. Math: "What is 2+2?" (Medium thinking).
  3. Free Chat: "How was your weekend?" (Slow, lots of thinking).
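The setup above can be pictured as a grid of test conditions. The sketch below is only an illustration of that idea: the summary gives just the 100 ms and 3 second endpoints, so the intermediate delay levels are hypothetical placeholders, and whether the study crossed every platform with every task and delay is also an assumption. The names `PLATFORMS`, `TASKS`, `DELAYS_MS`, and `build_conditions` are invented for the example.

```python
from itertools import product

PLATFORMS = ["video", "vr"]                # Scenario A and Scenario B
TASKS = ["counting", "math", "free_chat"]  # the three task types
# Only 100 ms and 3000 ms come from the summary above; the values in
# between are hypothetical placeholders, not the paper's actual levels.
DELAYS_MS = [100, 500, 1000, 2000, 3000]


def build_conditions(platforms, tasks, delays_ms):
    """Return every platform/task/delay combination as a dict."""
    return [
        {"platform": p, "task": t, "delay_ms": d}
        for p, t, d in product(platforms, tasks, delays_ms)
    ]


conditions = build_conditions(PLATFORMS, TASKS, DELAYS_MS)
print(len(conditions))  # 30
print(conditions[0])    # {'platform': 'video', 'task': 'counting', 'delay_ms': 100}
```

With these placeholder levels, two platforms, three tasks, and five delays give 30 combinations, which shows how quickly such a design grows as delay levels are added.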

🔍 Key Findings: The Surprising Results

1. The "Flow" Breaks Faster in Simple Tasks

Think of Interaction Fluency like a dance.

  • The Counting Task: This is a fast-paced dance. If the music (the connection) has a delay, the dancers trip immediately. The study found that in simple, fast tasks, even a tiny delay made the conversation feel "clunky" and broken.
  • The Free Chat: This is a slow, relaxed stroll. If there is a delay, you can just pause, think, and keep walking. People were much more forgiving of delays during casual chat.

The Twist: When the delay got bad, VR users were more patient than video call users.

  • Why? The paper suggests that because VR users are looking at a digital avatar (a cartoon-like person), their brains are already working harder to imagine the person is "real." Because they are already doing extra mental work, they don't notice the delay as much as they do when looking at a real human face on a screen. It's like wearing noise-canceling headphones; you are so focused on the virtual world that you ignore the glitches.

2. The "Connection" Crumbles in VR

Think of Social Presence like the emotional bond or the "vibe" between two people.

  • At Low Delays: VR wins! When the connection is fast, VR feels amazing. You feel like you are actually sitting next to your friend. The "vibe" is stronger than a 2D screen.
  • At High Delays: VR loses big time. Once the delay gets too long (over 1 second), the "vibe" in VR crashes harder than in a normal video call.
  • Why? In a video call, if there is a delay, you can still see the real person's face and read their expression, even if it's late. In VR, if the avatar's movement is delayed, it looks like a glitchy robot. You stop believing the person is "there," and the emotional connection breaks.

3. The "Attribution" Game

The study found something interesting about who people blame for the delay.

  • In Fast Tasks (Counting): When a delay happens, people blame the system ("This computer is slow!").
  • In Slow Tasks (Chatting): When a delay happens, people attribute it to their friend ("Oh, they are just thinking hard about their answer").
  • Analogy: If you are playing a fast video game and it lags, you blame the internet. If you are having a deep conversation and there is a pause, you assume your friend is just thoughtful.

💡 The Takeaway: What Does This Mean for the Future?

The researchers concluded that VR conferencing is a double-edged sword:

  1. It feels smoother (more fluent) than video calls when things go wrong, because our brains are already busy treating the avatar as a real person and so notice the delay less.
  2. But it feels less "real" (less social) when things go wrong, because the digital avatars can't hide the glitches as well as real human faces can.

The Golden Rule:
To make VR meetings feel great, the delay needs to be under 1 second.

  • If the delay is under 1 second, VR is magical and feels better than video calls.
  • If the delay goes over 1 second, the "magic" disappears, and the connection feels worse than a standard video call.
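The rule above can be written as a tiny decision function. This is a sketch of the summary's rule of thumb for social presence only, not a model from the paper: the name `preferred_platform` is invented, and treating a delay of exactly 1 second as "borderline" is an assumption, since the text only describes what happens under and over the threshold.

```python
THRESHOLD_S = 1.0  # the roughly 1-second threshold from the summary above


def preferred_platform(delay_s: float, threshold_s: float = THRESHOLD_S) -> str:
    """Return which option is expected to feel more socially present."""
    if delay_s < 0:
        raise ValueError("delay_s must be non-negative")
    if delay_s < threshold_s:
        return "vr"        # under the threshold: VR feels better
    if delay_s > threshold_s:
        return "video"     # over the threshold: a plain video call feels better
    return "borderline"    # exactly at the threshold: not covered by the rule


print(preferred_platform(0.1))  # vr
print(preferred_platform(3.0))  # video
```

A real system would not switch on a hard cutoff like this; the function only restates the rule of thumb in a form that is easy to check against measured delay.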

In short: VR is a fantastic tool for the future of work and socializing, but it needs a low-latency connection to keep the "human" feeling alive. When the delay is long, a simple video call might actually feel more human than a glitchy VR world.