Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

This paper proposes an integrated framework combining a node transformer architecture with BERT-based sentiment analysis to model stock market graphs and social media sentiment, demonstrating superior forecasting accuracy (0.80% MAPE) and directional precision compared to traditional ARIMA and LSTM models across 20 S&P 500 stocks from 1982 to 2025.

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman

Published Mon, 09 Ma

Imagine you are trying to predict the weather. You could look at a thermometer (the numbers), but you'd miss the fact that the sky is turning a weird shade of purple (the mood). Most stock market prediction tools are like that thermometer—they only look at the numbers: price, volume, and past trends.

This paper introduces a new "super-forecaster" that combines the thermometer with a mood ring. It's a smart system that tries to predict stock prices by looking at three things at once: the math, the connections between companies, and what people are saying on social media.

Here is how it works, broken down into simple concepts:

1. The "Social Network" of Stocks (The Graph)

Most models treat every stock like a lonely island. They look at Apple's history and guess where it's going, ignoring that Apple is connected to Microsoft, or that they both rely on the same chip manufacturers.

  • The Analogy: Imagine a high school cafeteria. If you want to know who is going to sit at which table, you don't just look at one kid in isolation. You look at who is friends with whom.
  • The Tech: The authors built a "Node Transformer." Think of this as a map where every stock is a node (a dot) and the relationships between them are lines (edges). If two companies are in the same industry (like Apple and Microsoft) or have similar price movements, the line between them gets thicker. The model learns that if one friend in the group sneezes, the others might catch a cold too. This helps it predict how a shock to one company ripples through the whole market.
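To make the "thicker lines" idea concrete, here is a minimal sketch of how edge weights between stocks could be computed. It is not the authors' code: the function names, the `sector_bonus` value, the toy returns, and the choice to add positive return correlation to a same-sector bonus are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_edge_weights(returns, sectors, sector_bonus=0.5):
    """Return {(ticker_a, ticker_b): weight} for every pair of stocks.

    Weight = positive part of the return correlation, plus a bonus when
    both stocks share a sector. Both choices are illustrative assumptions.
    """
    tickers = sorted(returns)
    weights = {}
    for i, a in enumerate(tickers):
        for b in tickers[i + 1:]:
            weight = max(0.0, pearson(returns[a], returns[b]))
            if sectors[a] == sectors[b]:
                weight += sector_bonus
            weights[(a, b)] = weight
    return weights


# Toy daily returns, invented for the example.
returns = {
    "AAPL": [0.010, -0.020, 0.015, 0.000],
    "MSFT": [0.012, -0.018, 0.010, 0.002],
    "WMT": [-0.005, 0.004, -0.001, 0.003],
}
sectors = {"AAPL": "tech", "MSFT": "tech", "WMT": "retail"}

edges = build_edge_weights(returns, sectors)
for pair, weight in edges.items():
    print(pair, round(weight, 3))
```

In this toy data, Apple and Microsoft move together and share a sector, so their line comes out much thicker than the line between Apple and Walmart. A real node transformer would learn how much attention to pay to each neighbor rather than rely on a fixed formula like this one.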

2. The "Mood Ring" (Sentiment Analysis)

Stocks aren't just driven by math; they are driven by human fear and greed. If everyone on Twitter is panicking about a company, the stock might drop even if the company's finances are fine.

  • The Analogy: Imagine a sports team. The stats say they are the best team in the league. But if the fans are screaming "We're going to lose!" and the players are arguing in the locker room, the team might lose anyway. You need to read the room.
  • The Tech: The system uses a tool called BERT (a famous AI for reading text). It scans millions of social media posts every day. It reads posts like "$AAPL is crushing it!" or "This stock is a disaster" and sorts each one into a simple label: Positive, Neutral, or Negative. It's like giving the model a "mood ring" that changes color based on public opinion.
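Once each post has a label, the labels still need to become a number the forecaster can use. The sketch below shows one simple way to do that: average the labels per stock per day. The labels are written by hand here as a stand-in for the output of a BERT classifier, and the -1/0/+1 mapping and the plain average are assumptions, not the paper's stated method.

```python
from collections import defaultdict

# Assumed mapping from classifier label to a numeric score.
LABEL_TO_SCORE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def daily_sentiment(labelled_posts):
    """Average the label scores for each (date, ticker) pair.

    labelled_posts: iterable of (date, ticker, label) tuples, where label
    is the class a sentiment model assigned to one post.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for date, ticker, label in labelled_posts:
        key = (date, ticker)
        totals[key] += LABEL_TO_SCORE[label]
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hand-written labels standing in for model output.
posts = [
    ("2024-01-02", "AAPL", "positive"),
    ("2024-01-02", "AAPL", "positive"),
    ("2024-01-02", "AAPL", "negative"),
    ("2024-01-02", "BA", "negative"),
]

scores = daily_sentiment(posts)
for key, score in scores.items():
    print(key, round(score, 3))
```

A score near +1 means the crowd is upbeat about that stock on that day, a score near -1 means it is gloomy, and a score near 0 means opinions are mixed or neutral.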

3. The "Smart Conductor" (The Fusion)

Now you have two streams of information: the hard numbers (price charts) and the soft numbers (mood). How do you combine them?

  • The Analogy: Imagine a conductor leading an orchestra. Sometimes the violins (the price data) should be loud, and sometimes the drums (the social media hype) should take over. A bad conductor plays them at the same volume all the time. A good conductor listens to the room.
  • The Tech: The model has a "gating mechanism." It acts like a smart conductor.
    • When the market is calm and boring, it listens mostly to the price history.
    • When the market is crazy (high volatility) or there's big news (like an earnings report), it turns up the volume on the social media mood because that's where the real action is happening.
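The "smart conductor" can be sketched as a gate: a number between 0 and 1 that decides how much of the mood signal to mix in. In the sketch below the gate depends only on volatility, and its weight and bias are picked by hand. In a trained model these values would be learned, and the gate would likely look at more than volatility, so treat every name and number here as a hypothetical illustration.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(price_signal, sentiment_signal, volatility, weight=40.0, bias=-2.0):
    """Blend two signals with a volatility-driven gate.

    gate close to 0 -> trust the price signal
    gate close to 1 -> trust the sentiment signal
    weight and bias are hand-picked for illustration, not learned.
    """
    gate = sigmoid(weight * volatility + bias)
    fused = gate * sentiment_signal + (1.0 - gate) * price_signal
    return fused, gate


# Same two signals, two different market moods.
calm_fused, calm_gate = fuse(price_signal=0.5, sentiment_signal=-0.8, volatility=0.01)
stormy_fused, stormy_gate = fuse(price_signal=0.5, sentiment_signal=-0.8, volatility=0.10)

print("calm market:  gate =", round(calm_gate, 2), "fused =", round(calm_fused, 2))
print("wild market:  gate =", round(stormy_gate, 2), "fused =", round(stormy_fused, 2))
```

With low volatility the gate stays small and the fused signal follows the positive price signal. With high volatility the gate opens and the negative sentiment signal dominates.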

The Results: Did it work?

The researchers tested this "Super-Forecaster" on 20 big companies (like Apple, Walmart, and Boeing) using data from 1982 all the way to 2025.

  • The Score: Its forecasts missed the actual price by just 0.80% on average (measured by a metric called MAPE, the mean absolute percentage error).
  • The Competition:
    • Old-school math models (ARIMA) were off by 1.20% on average.
    • Standard AI models (LSTM) were off by 1.00% on average.
  • The "Mood" Bonus: When they turned off the social media part, the model got worse by 10%. When they turned off the "friendship map" (the graph), it got worse by 15%.
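MAPE (mean absolute percentage error) averages how far each forecast lands from the real price, as a percentage of that price. It measures the size of the misses, not how often they happen. A minimal sketch of the calculation, with made-up prices:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Invented prices: every forecast is off by exactly 1% of the real price.
actual_prices = [100.0, 200.0, 50.0]
predicted_prices = [101.0, 198.0, 50.5]

result = mape(actual_prices, predicted_prices)
print(round(result, 2))
```

Every forecast in this example misses by 1% of the true price, so the MAPE comes out at 1% (up to floating-point rounding). A MAPE of 0.80% means the typical miss is smaller than that.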

Why does this matter?

  • It's more robust: When the market crashes or gets scary (like during a pandemic), old models tend to break. This model stayed accurate because it could "feel" the panic in the social media posts and adjust its predictions.
  • It's smarter about direction: It correctly guessed whether a stock would go up or down 65% of the time. Since a coin flip is 50%, that's a significant edge.
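Directional accuracy is simpler than it sounds: count how often the forecast and reality agree on "up" versus "not up". The sketch below uses invented prices and one common definition (compare each day's price with the previous day's). The paper may define direction differently, so this is an assumption.

```python
def directional_accuracy(previous, actual, predicted):
    """Percent of days where forecast and reality moved the same way.

    A move counts as "up" when the price is above the previous day's price.
    Anything else (down or flat) counts as "not up".
    """
    if not (len(previous) == len(actual) == len(predicted)) or not previous:
        raise ValueError("inputs must be non-empty and the same length")
    hits = 0
    for prev, real, guess in zip(previous, actual, predicted):
        if (real > prev) == (guess > prev):
            hits += 1
    return 100.0 * hits / len(previous)


# Invented prices for four days.
previous_close = [100.0, 100.0, 100.0, 100.0]
actual_close = [101.0, 99.0, 102.0, 98.0]
predicted_close = [100.5, 100.2, 101.0, 97.0]

accuracy = directional_accuracy(previous_close, actual_close, predicted_close)
print(accuracy)
```

Here the forecast gets the direction right on three of the four days (it wrongly calls "up" on the second day), which gives 75%.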

The Catch (Limitations)

The authors are honest about the flaws:

  1. Survivorship Bias: They only tested on companies that survived and are still big today. They didn't test on companies that went bankrupt, so the results might look slightly better than reality.
  2. Data Gaps: Social media (Twitter/X) didn't exist before 2007, so the model had to guess the "mood" for the years before that.
  3. Complexity: It's a heavy computer program. It's not something you can run on a cheap laptop in real-time yet.

The Bottom Line

This paper proposes that to predict the stock market, you can't just look at the numbers. You have to look at the network (who is connected to whom) and the noise (what people are saying). By combining a graph network with a mood-reading AI, they built a system that sees the market more clearly than the tools we've been using for decades.