
In 2025, MEV revenue on the Solana blockchain hit $720 million. That's not a side effect of trading. That's the main event. RPC Fast (sometimes referred to as rpcfast) watches this layer closely because it's where the real money moves—and where most AI agents go to die.
On Solana, an agent with great logic but slow execution is like a grandmaster playing chess by mail: decision quality is irrelevant if the move arrives too late.
What’s an AI agent in the Solana ecosystem?
It's an autonomous system that monitors on-chain state, executes decision logic, creates transactions, and submits them—all within a sub-slot window, with no human intervention.
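Reduced to code, that loop is small; all the difficulty hides inside the two methods. A minimal sketch with placeholder types and a toy agent, not any real SDK:

```rust
// Placeholder types; a real agent decodes these from its data stream.
struct AccountUpdate { slot: u64 }
struct Opportunity { expected_profit_lamports: u64 }

trait Agent {
    /// Score one update; return an opportunity only if the math clears fees + tip.
    fn detect(&mut self, update: &AccountUpdate) -> Option<Opportunity>;
    /// Build, sign, and broadcast the transaction for the opportunity.
    fn execute(&mut self, opp: Opportunity);
}

/// The whole body must fit in a sub-slot budget: single-digit milliseconds
/// for detect + execute, well inside Solana's 400ms slot.
fn run(agent: &mut impl Agent, updates: impl IntoIterator<Item = AccountUpdate>) {
    for update in updates {
        if let Some(opp) = agent.detect(&update) {
            agent.execute(opp);
        }
    }
}

// Toy implementation so the sketch actually runs.
struct Noop;
impl Agent for Noop {
    fn detect(&mut self, u: &AccountUpdate) -> Option<Opportunity> {
        (u.slot % 2 == 0).then(|| Opportunity { expected_profit_lamports: 1 })
    }
    fn execute(&mut self, opp: Opportunity) {
        println!("would submit tx worth {} lamports", opp.expected_profit_lamports);
    }
}

fn main() {
    run(&mut Noop, (0..4).map(|slot| AccountUpdate { slot }));
}
```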
The five-layer architecture
A production Solana HFT agent has five interconnected pieces. Failure in any one breaks the whole system.
| Component | Function | Latency requirement |
|---|---|---|
| Data ingestion | Real-time account and transaction updates | <50ms from state change |
| Signal processing | Classifies data, detects opportunities | <5ms per update |
| Decision logic | Computes trade parameters, validates profit | <2ms for simple routes |
| Transaction construction | Builds signed tx with correct fees and tip | <1ms |
| Submission layer | Broadcasts to validators via multiple paths | <25ms to leader |
Most production setups run data ingestion and execution on separate threads to prevent I/O from blocking the decision engine.
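A minimal sketch of that split using only the standard library; the update type and channel size are illustrative assumptions:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical decoded update type; a real agent would deserialize
// Yellowstone gRPC messages into something like this.
struct AccountUpdate { slot: u64 }

fn main() {
    // Bounded channel: a slow decision loop shows up as back-pressure
    // instead of unbounded memory growth.
    let (tx, rx) = mpsc::sync_channel::<AccountUpdate>(1024);

    // Ingestion thread: all blocking I/O lives here, so it can never
    // stall the decision engine.
    let ingest = thread::spawn(move || {
        for slot in 0..3 {
            // Stand-in for reading one message off the stream.
            if tx.send(AccountUpdate { slot }).is_err() {
                break; // decision side shut down
            }
        }
        // Dropping `tx` closes the channel and ends the loop below.
    });

    // Decision thread (here: the main thread): pure CPU work only.
    for update in rx {
        println!("evaluating update at slot {}", update.slot);
    }
    ingest.join().unwrap();
}
```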
What does the “AI” part mean?
In practice, it can mean several things:
- Rule-based filters and ML models that score incoming updates for opportunity probability
- Reinforcement learning agents that optimize bidding and timing behavior
- LLM-based orchestration for higher-level strategy decisions
- Hybrid setups (most common in production)
Different components have varying latency budgets: signal detection and execution should be under 10ms, while LLM reasoning can take 1–5 seconds, provided its outputs feed a fast pipeline.
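One common shape for that hybrid, sketched with the standard library only: a slow reasoning thread periodically rewrites shared strategy parameters, and the hot loop reads a snapshot without ever blocking on it. All names and numbers here are illustrative assumptions:

```rust
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

// Hypothetical knobs the slow reasoning layer is allowed to adjust.
#[derive(Clone, Copy)]
struct StrategyParams {
    min_profit_lamports: u64,
    tip_fraction: f64,
}

fn main() {
    let params = Arc::new(RwLock::new(StrategyParams {
        min_profit_lamports: 10_000,
        tip_fraction: 0.5,
    }));

    // Slow layer: an LLM (or any 1–5s reasoning step) revises parameters
    // off the critical path.
    let slow = Arc::clone(&params);
    thread::spawn(move || loop {
        thread::sleep(Duration::from_secs(2)); // stand-in for LLM latency
        slow.write().unwrap().tip_fraction = 0.6; // apply the new strategy
    });

    // Fast loop: reads a snapshot in nanoseconds; never waits on the LLM.
    for _ in 0..5 {
        let p = *params.read().unwrap();
        let _ = (p.min_profit_lamports, p.tip_fraction); // used in detection
        thread::sleep(Duration::from_millis(10)); // stand-in for the hot loop
    }
}
```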
The execution path within Solana
Understanding where time actually goes reveals where optimization matters.
| Step | Component | What happens | Latency | Key detail |
|---|---|---|---|---|
| 1 | ShredStream | Block data at shred level; raw transaction fragments before block assembly | 50–100ms earlier than gRPC | 12–25% of 400ms slot window |
| 2 | Yellowstone gRPC | Account state updates pushed directly from validator memory | <50ms (dedicated) / 100–300ms (public) | Confirmed state changes only |
| 3 | Opportunity detection | Signal engine recalculates prices across pools; assesses profitability after fees and tip | Microseconds (two-pool) / <1ms (triangular) | Rust optimization required |
| 4 | TX construction | Build swap sequence, set ComputeUnitPrice to 75th–90th percentile of recent fees, add Jito tip | <1ms | Tip determines block engine priority |
| 5 | Bundle submission | Wrap TX in Jito bundle; submit to US East, EU, Tokyo endpoints in parallel | <25ms to leader | Hedges geographic variance |
| 6 | Confirmation | Monitor via getSignatureStatuses at processed commitment; log route, profit, tip, slot delta | <400ms | Use processed, not confirmed/finalized |
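To ground step 4, here is a rough sketch of transaction construction with the solana-sdk crate. The tip account, blockhash, compute-unit price, and tip amount are all illustrative placeholders: real code would fetch a recent blockhash, derive the price from observed fee percentiles, and pay one of Jito's published tip accounts. Bundle submission itself (step 5) is omitted.

```rust
use solana_sdk::{
    compute_budget::ComputeBudgetInstruction,
    hash::Hash,
    pubkey::Pubkey,
    signature::{Keypair, Signer},
    system_instruction,
    transaction::Transaction,
};

fn main() {
    let payer = Keypair::new();
    // Placeholders: use a real Jito tip account and a recent blockhash in practice.
    let jito_tip_account = Pubkey::new_unique();
    let recent_blockhash = Hash::default();

    // Priority fee pinned to, say, the 85th percentile of recent fees
    // (observed off-chain; the numbers here are illustrative).
    let cu_price_micro_lamports: u64 = 50_000;
    let tip_lamports: u64 = 100_000;

    let instructions = vec![
        ComputeBudgetInstruction::set_compute_unit_price(cu_price_micro_lamports),
        ComputeBudgetInstruction::set_compute_unit_limit(200_000),
        // ... the swap instructions for the arbitrage route go here ...
        system_instruction::transfer(&payer.pubkey(), &jito_tip_account, tip_lamports),
    ];

    let tx = Transaction::new_signed_with_payer(
        &instructions,
        Some(&payer.pubkey()),
        &[&payer],
        recent_blockhash,
    );
    println!("signed tx with {} instructions", tx.message.instructions.len());
}
```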
Why most Solana HFT agents fail
Here's the thing about Solana arbitrage agents: the playbook is public. You can read the Jito docs. You can clone the ElizaOS repo. You can spin up a node. And yet most agents that go live produce nothing but losses and regret.
- Commitment level is the first trap.
Many developers default to 'confirmed' because it sounds safer. On Solana, 'confirmed' means reading data that is 400–800ms old, about two to three slots behind the tip. Your agent isn't trading the current market; it's trading a ghost of the recent past. By the time it detects a price discrepancy, the discrepancy has already been corrected. The only commitment level that matters here is 'processed,' which reflects the current slot; every other setting is just fees paid to be late (see the first sketch after this list).
- Public RPC seems convenient but is a trap.
It works until it doesn't. Public endpoints are shared with thousands of other users, and during token launches or liquidation cascades they get overwhelmed. Rate limits turn into failed requests, so your agent goes blind at exactly the moment the market moves. You aren't just competing against other agents; you're competing against physics, and losing.
- Tip calibration is a set-it-and-forget-it disaster.
You tune your Jito tip during development, when competition is light, and set it to 50% of expected profit. It looks good, so you deploy. For a few weeks, bundles land; then, slowly, they stop. Other agents enter the market and raise the clearing price. Your tip falls below market, and your bundles lose the auction while competitors' land. Your agent fires, pays fees, and captures nothing. You don't notice, because there's no error message. The bundle just doesn't confirm. You have to actively monitor acceptance rate and adjust (see the second sketch after this list), but most people don't.
- Colocation is essential physics.
Your agent runs on a cloud server in Virginia while the nearest Solana validator is in Frankfurt, and that round trip costs around 200ms. The geography alone eats half your slot. No code tweak or clever algorithm fixes this; you can't outthink the speed of light. Winners colocate on bare metal alongside validators. Losers try to compete from across the Atlantic.
- ShredStream is the edge most teams skip. It delivers raw transaction data 50–100ms before Yellowstone gRPC, a decisive head start inside a 400ms slot. Many agents never use it because it requires an extra connection and different parsing code.
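Two of these traps are cheap to avoid in code. First, commitment level: a minimal sketch using the solana-client crate, with a placeholder endpoint URL, showing a client pinned to 'processed':

```rust
use solana_client::rpc_client::RpcClient;
use solana_sdk::commitment_config::CommitmentConfig;

fn main() {
    // 'processed' reads the current slot; 'confirmed' is already 2–3 slots stale.
    let rpc = RpcClient::new_with_commitment(
        "https://your-dedicated-endpoint.example".to_string(), // placeholder URL
        CommitmentConfig::processed(),
    );
    let slot = rpc.get_slot().unwrap();
    println!("current slot at processed commitment: {slot}");
}
```

Second, tip calibration: a sketch of the feedback loop most agents are missing. The window size, thresholds, and multipliers are illustrative assumptions, not known-good tunings:

```rust
/// Hypothetical feedback controller: raise the tip when bundles stop
/// landing, decay it when the landing rate is healthy.
struct TipCalibrator {
    tip_fraction: f64, // share of expected profit spent on the tip
    sent: u32,
    landed: u32,
}

impl TipCalibrator {
    fn record(&mut self, bundle_landed: bool) {
        self.sent += 1;
        if bundle_landed {
            self.landed += 1;
        }
        // Re-evaluate once per 50-bundle window.
        if self.sent >= 50 {
            let rate = f64::from(self.landed) / f64::from(self.sent);
            if rate < 0.2 {
                self.tip_fraction = (self.tip_fraction * 1.25).min(0.9);
            } else if rate > 0.6 {
                self.tip_fraction = (self.tip_fraction * 0.95).max(0.1);
            }
            self.sent = 0;
            self.landed = 0;
        }
    }
}

fn main() {
    let mut calib = TipCalibrator { tip_fraction: 0.5, sent: 0, landed: 0 };
    for i in 0..100 {
        calib.record(i % 10 == 0); // pretend only 10% of bundles land
    }
    println!("adjusted tip fraction: {:.2}", calib.tip_fraction);
}
```

Feed it one observation per submitted bundle, from however you already track landings, and read `tip_fraction` when sizing the next tip.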
The pattern is always the same: teams build smart agents and route them through dumb infrastructure. Then they wonder why they miss opportunities. The agent isn't the problem. The plumbing is.
Where latency actually hides
It's not evenly distributed. Here's where it accumulates:
| Stage | Public endpoint | Dedicated colocated | Source |
|---|---|---|---|
| Shred arrival | N/A | 0–50ms | ShredStream bypasses gossip |
| Account update | 100–300ms | 10–50ms | Gossip propagation vs direct delivery |
| Opportunity computation | N/A | <1–5ms | Algorithm complexity; Rust vs TypeScript |
| Transaction construction | N/A | <1ms | Signing overhead |
| Send to leader | 100–400ms | 5–25ms | Network hops; distance to leader |
| Bundle confirmation | Variable | Predictable, <400ms | Tip calibration; block engine proximity |
| Failover on node drop | Manual / minutes | <50ms automated | Infrastructure monitoring |
The cumulative difference between public and dedicated colocated is 300–500ms on the full pipeline. In a 400ms slot, that's not a performance gap. It's the difference between competing and not competing.