Someone Asked Me How AI Gets Its GPU Time. I Didn't Have a Good Answer.

So I spent a week finding one.

Developing in web3 and AI

Here's a question nobody asked me out loud but that I couldn't stop thinking about after the AWS Summit this year:

Where does the compute actually come from?

Like — when you fine-tune a model, or run an inference endpoint, or batch-process a million embeddings — someone's GPU is doing that work. And if it's not your GPU, it's someone else's. And right now, "someone else" is almost always Amazon, Google, or Microsoft.

But it doesn't have to be.

I went down a rabbit hole learning about DePIN — Decentralised Physical Infrastructure Networks — specifically the part about GPU compute. This post is everything I found, written the way I wish someone had explained it to me: like a developer sharing notes, not a whitepaper trying to impress a committee.

Let's go.


First, The Problem. And It's Bigger Than You Think.

Training GPT-3 in 2020 cost roughly $4.6 million in compute.

Training GPT-4 in 2023? Estimated north of $100 million.

Frontier model runs in 2026? Approaching $1 billion.

That's 200× growth in six years. Meanwhile, your AWS bill is also going up. Coincidence? Not really.

The result is something I'd call a compute aristocracy — maybe ten organisations globally that can actually afford to train frontier models. Everyone else rents time from them, at whatever price they decide to charge, under whatever terms they decide to set.

Here's the part that bothered me when I really sat with it:

While AI researchers are desperately hunting for GPU time, millions of GPUs are sitting almost completely idle right now.

Gaming PCs running at 5% utilisation for 20 hours a day. Old crypto mining rigs gathering dust in garages after Ethereum moved to Proof of Stake. University research clusters running at 30% overnight. Research labs averaging 60% utilisation — meaning 40% of expensive hardware is doing absolutely nothing at any given moment.

DePIN exists to connect that idle supply to the people who desperately need it. Not by building new data centres. By using the ones that already exist — just owned by regular people instead of trillion-dollar corporations.


What DePIN Actually Is — In Plain English

When you spin up a GPU instance on a cloud provider, here's what you're actually paying for:

You (browser / CLI)
        ↓
Cloud Provider Console      ← their marketplace
        ↓
Hypervisor layer            ← virtualises the physical GPU
        ↓
Physical GPU                ← owned by them, in their building,
                               cooled by them, in a region they chose

Three full layers of one company's control sit between you and the silicon. They set the price. They control access. They decide which regions exist.

DePIN asks: what if you removed all three layers?

Instead of one company owning all the hardware, you have thousands of independent people contributing their idle resources to a shared network. A blockchain handles the things the company used to handle — pricing (open auction), payments (automatic smart contracts), and verification (cryptographic proofs).

The mental model I kept coming back to:

DePIN = using blockchain as the trust layer for physical hardware you don't own.

No middleman. The rules are written in code. The code runs the same for everyone.


How It Actually Works — The Full Technical Picture

Let me make this concrete with a character I'll call Klaus. Klaus lives in Berlin. He has a powerful GPU sitting idle most nights. He's registered it on a DePIN compute network. You're in Bengaluru and need to fine-tune a model.

Here's exactly what happens between you and Klaus.

Step 1 — Klaus Broadcasts His Hardware

Klaus runs a lightweight background process that continuously announces his hardware to the network:

{
  "gpu_model": "High-end consumer GPU",
  "vram_gb": 24,
  "cuda_version": "12.2",
  "location": "DE-Berlin",
  "availability": true,
  "price_per_hour": "0.35 network tokens",
  "stake": 1000
}

This heartbeat gets recorded every ~60 seconds, building a live, decentralised inventory of available hardware worldwide. No account approval. No sales call. No contract. Just Klaus and his GPU, announcing themselves to anyone who needs them.
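
Here's a minimal Python sketch of how a registry might keep track of those heartbeats. It assumes a provider only counts as available if it announced itself recently. The `Heartbeat` fields mirror the JSON above; `Registry`, `provider_id`, and the 120-second staleness window are my own illustrative assumptions, not any specific network's API.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    provider_id: str        # hypothetical identifier, not part of the JSON above
    gpu_model: str
    vram_gb: int
    cuda_version: str
    location: str
    available: bool
    price_per_hour: float   # in network tokens
    stake: int


class Registry:
    """Keeps the latest heartbeat per provider and ignores stale ones on read."""

    def __init__(self, stale_after_s: float = 120.0):
        self.stale_after_s = stale_after_s  # assumed: two missed 60s heartbeats
        self._latest: dict[str, tuple[float, Heartbeat]] = {}

    def record(self, hb: Heartbeat, now: float) -> None:
        self._latest[hb.provider_id] = (now, hb)

    def live(self, now: float) -> list[Heartbeat]:
        return [
            hb for seen, hb in self._latest.values()
            if hb.available and now - seen <= self.stale_after_s
        ]


registry = Registry()
klaus = Heartbeat("klaus", "High-end consumer GPU", 24, "12.2", "DE-Berlin", True, 0.35, 1000)
registry.record(klaus, now=0.0)
live_at_60 = [hb.provider_id for hb in registry.live(now=60.0)]
live_at_300 = [hb.provider_id for hb in registry.live(now=300.0)]
```

Klaus shows up in the inventory one minute after his last heartbeat and quietly drops out once he has been silent for five.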

Step 2 — You Submit a Job, The Network Matches You

You submit your training job with requirements — minimum VRAM, CUDA version, maximum price per hour. A smart contract scans the registry and runs a reverse auction: providers bid against each other for your job, and the cheapest qualified bid wins.

The whole matchmaking process: roughly 30–60 seconds. Compare that to minutes for a traditional cloud instance spin-up.
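
The matching step can be sketched in a few lines of Python. This is a simplified, single-round version (filter on requirements, take the cheapest qualified offer), not the multi-round bidding a real network might run. `Offer`, `match`, and the provider names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str
    vram_gb: int
    cuda_version: str
    price_per_hour: float


def parse_version(v: str) -> tuple[int, ...]:
    """Turn '12.2' into (12, 2) so versions compare numerically, not as text."""
    return tuple(int(part) for part in v.split("."))


def match(offers, min_vram_gb, min_cuda, max_price):
    """Return the cheapest offer meeting every requirement, or None."""
    qualified = [
        o for o in offers
        if o.vram_gb >= min_vram_gb
        and parse_version(o.cuda_version) >= parse_version(min_cuda)
        and o.price_per_hour <= max_price
    ]
    return min(qualified, key=lambda o: o.price_per_hour, default=None)


offers = [
    Offer("klaus", 24, "12.2", 0.35),
    Offer("aiko", 16, "12.4", 0.20),   # cheapest overall, but too little VRAM
    Offer("maria", 48, "12.1", 0.50),
]
winner = match(offers, min_vram_gb=24, min_cuda="12.0", max_price=0.60)
no_match = match(offers, min_vram_gb=80, min_cuda="12.0", max_price=1.00)
```

With these made-up offers, Klaus wins: the cheaper provider is filtered out before price is even considered.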

Step 3 — The Hard Part: Proving Klaus Actually Did the Work

This is the most technically interesting problem in all of DePIN. If Klaus claims he ran your training job — how do you know he actually did the compute and didn't just fabricate an output?

The solution most networks use is called challenge-response verification:

Network sends Klaus:
  → Your training job
  → + a hidden "challenge" — a small deterministic computation
      with a known correct answer (generated by the network)

Klaus runs everything on his GPU

Klaus returns:
  → Your training results
  → + the challenge response

Network checks the challenge answer:
  → Correct: Klaus ran the real job. Payment released.
  → Wrong: Klaus is cheating. He loses part of his staked collateral.

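The critical insight: the challenge has exactly one correct answer, and the only practical way to produce it is to run the computation on real hardware. The network already knows that answer, so checking the response takes milliseconds, while faking it costs as much as doing the work. One caveat: a correct answer shows the challenge was computed, which is strong evidence, not proof, that the job bundled with it ran as specified.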
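
To make the flow concrete, here's a toy Python version. Iterated SHA-256 stands in for the small deterministic GPU computation, which is my assumption for illustration; real networks use their own challenge workloads. The function names are hypothetical.

```python
import hashlib
import secrets


def run_challenge(seed: bytes, rounds: int) -> str:
    """Stand-in for a small deterministic GPU computation: iterated SHA-256."""
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def issue_challenge(rounds: int = 10_000):
    """Network side: pick a fresh random seed and precompute the expected answer."""
    seed = secrets.token_bytes(16)
    return seed, rounds, run_challenge(seed, rounds)


def verify(expected: str, response: str) -> bool:
    """Checking is a single constant-time string comparison."""
    return secrets.compare_digest(expected, response)


seed, rounds, expected = issue_challenge()
honest_response = run_challenge(seed, rounds)   # provider actually does the work
fake_response = "00" * 32                       # provider guesses instead
honest_ok = verify(expected, honest_response)
fake_ok = verify(expected, fake_response)
```

Because the seed is random per job, a provider can't cache answers from earlier challenges; the only way to pass is to run the rounds.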

Step 4 — Payment Happens Automatically

You lock payment in a smart contract escrow upfront
        ↓
Job completes + verification passes
        ↓
Smart contract automatically releases payment to Klaus
        ↓
If Klaus fails verification → he loses staked collateral proportionally

No invoice. No net-30 payment terms. No customer support ticket. The code runs, the math checks out, Klaus gets paid.
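
Here's the same escrow flow as a tiny Python state machine. It's a sketch of the logic, not contract code: refunding the client on failure and the 10% slash (`slash_bps`) are assumptions I've added to make the example complete.

```python
from enum import Enum


class State(Enum):
    LOCKED = "locked"
    PAID = "paid"
    REFUNDED = "refunded"


class Escrow:
    """Toy escrow: funds lock up front and move exactly once on settlement."""

    def __init__(self, amount: int, provider_stake: int, slash_bps: int = 1000):
        self.amount = amount
        self.provider_stake = provider_stake
        self.slash_bps = slash_bps  # hypothetical: 1000 basis points = 10% of stake
        self.state = State.LOCKED

    def settle(self, verification_passed: bool) -> dict:
        if self.state is not State.LOCKED:
            raise RuntimeError("escrow already settled")
        if verification_passed:
            self.state = State.PAID
            return {"to_provider": self.amount, "to_client": 0, "slashed": 0}
        # Integer maths avoids rounding surprises with token amounts.
        slashed = self.provider_stake * self.slash_bps // 10_000
        self.provider_stake -= slashed
        self.state = State.REFUNDED
        return {"to_provider": 0, "to_client": self.amount, "slashed": slashed}


ok = Escrow(amount=7, provider_stake=1000).settle(verification_passed=True)
bad = Escrow(amount=7, provider_stake=1000).settle(verification_passed=False)
```

The "already settled" guard is the important bit: whatever happens, the locked funds can only leave the escrow once.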


Why Most of These Networks Run on Solana

When I first learned this, it seemed like a random technical choice. It isn't.

DePIN has a specific requirement most blockchain applications don't: microtransactions at extremely high frequency. If hundreds of thousands of GPUs are completing jobs all day, that's potentially millions of small payments per day.

Chain       Transaction Fee    Block Time
Ethereum    $5–50              ~12 seconds
Solana      ~$0.00025          ~400ms

Paying someone a small amount for a 20-minute compute job doesn't work if the transaction fee costs more than the job itself. Solana makes the economics work. That's why most DePIN compute networks chose it — not for ideological reasons, but because the numbers made sense.
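
A quick back-of-the-envelope check in Python, using the fees from the table. The job itself is hypothetical: 20 minutes at an assumed $0.35/hour.

```python
def fee_share(price_per_hour_usd: float, minutes: float, tx_fee_usd: float) -> float:
    """Transaction fee expressed as a fraction of the job payment."""
    payment = price_per_hour_usd * minutes / 60
    return tx_fee_usd / payment


# Hypothetical job: 20 minutes at $0.35/hour, so the payment is about $0.117.
job = dict(price_per_hour_usd=0.35, minutes=20)

low_fee_share = fee_share(**job, tx_fee_usd=0.00025)   # about 0.2% of the payment
high_fee_share = fee_share(**job, tx_fee_usd=5.00)     # about 43 times the payment
```

At a $5 fee, settling the payment costs roughly 43 times what the job earned. At $0.00025, the fee is a rounding error.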


The Networks That Are Actually Running

There are four major players in production right now. Each made fundamentally different design bets.

The General-Purpose One

The most mature and general-purpose option deploys workloads as standard Kubernetes containers. If you know Docker, you can use it today — no blockchain expertise required. You write something close to a standard YAML deployment file and submit it.

Real cost comparison:

  • A high-end data centre GPU on traditional cloud: ~$3.50/hour

  • Same spec on this network: ~$1.20–1.80/hour

That's roughly 50–65% cheaper. For a training job that would otherwise cost $800, you're paying roughly $275–410. Real money.

Best for: Hosting APIs, running inference endpoints, general cloud workloads you'd normally put on a VM.

The One That Started With 3D Rendering

One network started life as decentralised GPU rendering for 3D artists and film studios — think Blender, Cinema 4D, visual effects pipelines. It built up a massive node count (we're talking hundreds of thousands of registered GPUs) from the rendering market, and has since pivoted aggressively into AI inference.

Their verification approach is clever for deterministic workloads: the same job gets run by multiple nodes, outputs compared by checksum. If one node's output doesn't match, it gets flagged. Simple and effective — though it has limits for non-deterministic training runs where two legitimate executions can produce different outputs.
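
The redundancy check is simple enough to sketch in Python. This is my own minimal version of the idea (hash each output, take the majority, flag the rest), with made-up node names; it isn't the network's actual implementation.

```python
import hashlib
from collections import Counter


def checksum(output: bytes) -> str:
    return hashlib.sha256(output).hexdigest()


def cross_check(outputs: dict[str, bytes]):
    """Return (majority checksum, sorted list of nodes that disagree with it)."""
    sums = {node: checksum(out) for node, out in outputs.items()}
    majority, votes = Counter(sums.values()).most_common(1)[0]
    if votes <= len(sums) // 2:
        raise ValueError("no majority: cannot tell which output is correct")
    flagged = sorted(node for node, s in sums.items() if s != majority)
    return majority, flagged


outputs = {
    "node-a": b"frame-0001-pixels",
    "node-b": b"frame-0001-pixels",
    "node-c": b"frame-0001-PIXELS",  # differs by a few bytes
}
majority, flagged = cross_check(outputs)
```

This also shows why the approach struggles with non-deterministic training: two honest runs that differ by a single bit would hash differently and one of them would get flagged.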

Best for: AI inference, image generation, workloads where you're running the same job at scale.

The ML Training Cluster One

The most technically ambitious of the group, designed specifically for distributed GPU clusters, the kind you need for serious model training across multiple cards. Their standout trick: aggregating GPUs from different physical providers scattered around the world into a single virtual cluster that looks unified to your PyTorch or JAX code.

Real cost comparison:

  • 8 high-end data centre GPUs networked together on traditional cloud: ~$32/hour

  • Equivalent configuration here: ~$10–19/hour

That's potentially 40–70% cheaper. For a training run that would otherwise cost $5,000, you might pay $1,500–3,000.
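
If you want to rerun this comparison with your own quotes, the arithmetic is short. A small Python helper, assuming total job cost scales linearly with the hourly rate (which ignores any slowdown from running on a geographically scattered cluster):

```python
def savings_range(cloud_rate, depin_low, depin_high, cloud_job_cost):
    """Percent saved and projected job cost, assuming cost scales with hourly rate."""
    best = 1 - depin_low / cloud_rate
    worst = 1 - depin_high / cloud_rate
    return {
        "saving_pct": (round(worst * 100), round(best * 100)),
        "job_cost": (cloud_job_cost * depin_low / cloud_rate,
                     cloud_job_cost * depin_high / cloud_rate),
    }


# Figures from the comparison above: $32/hour vs $10-19/hour, $5,000 job.
result = savings_range(cloud_rate=32, depin_low=10, depin_high=19, cloud_job_cost=5000)
```

With the numbers above this gives savings of 41% to 69% and a job cost of about $1,560 to $2,970.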

Best for: Distributed ML training, large batch inference, anything that needs multiple GPUs working together.

The Consumer GPU One

Strategically smart positioning: focus entirely on consumer-grade GPUs (gaming cards, prosumer hardware) rather than data centre equipment. There are vastly more of these in the world than enterprise GPUs — more supply means lower prices and better availability. They started with CI/CD pipelines and moved into AI inference.

Best for: AI inference endpoints, lightweight fine-tuning, situations where you need cheap compute and aren't running production-critical workloads.


Tokenomics — The Surprisingly Elegant Part

Here's a problem I hadn't thought about before going deep on this: how do you keep 500,000 strangers honest?

Klaus registered his GPU. Great. But Klaus is a rational human. If he realises he gets paid whether his GPU is fast or slow, reliable or flaky — why bother maintaining uptime? Why not just register, half-heartedly participate, and pocket whatever comes in?

This is called the lazy provider problem. It kills networks.

The solution is mechanism design: structuring rules so that self-interested actors naturally behave the way you want them to. Not because they're virtuous — but because it's in their financial interest.

Every DePIN network uses two levers:

The Carrot — Rewards

Providers earn in two ways: payment for completed jobs (direct, proportional to work done) and staking rewards for uptime (Klaus earns just for being consistently online and available, even during quiet periods). Higher stake often means higher job priority — which means more earnings. It's a compounding advantage for committed providers.

The Stick — Slashing

Klaus locks up collateral when he joins. If he fails a challenge verification, the smart contract automatically burns a portion of that collateral. No human judge. No appeal process. Just math, running automatically, with real financial consequences.

The result is a complete incentive matrix where every row pushes Klaus toward honest behaviour:

What Klaus Does                 Financial Result
Stays online consistently       Earns staking bonus
Completes jobs well             Gets paid for work
Fakes computation               Loses collateral
Goes offline mid-job            Loses potential earnings + mild penalty
Builds reputation over time     Gets priority jobs

It's not trust. It's not goodwill. It's economic alignment. Klaus behaves honestly because dishonesty is expensive.
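
You can put rough numbers on "dishonesty is expensive" with an expected-value calculation. Every figure in this Python sketch is hypothetical (job price, running cost, stake, slash size, detection rate); the point is the shape of the comparison, not the values.

```python
def expected_profit(payment, compute_cost, stake, slash_fraction, p_caught, cheat):
    """Expected profit per job for an honest or a cheating provider."""
    if not cheat:
        return payment - compute_cost
    # A cheater skips the compute cost, gets paid only if undetected,
    # and loses part of the stake if caught.
    return (1 - p_caught) * payment - p_caught * slash_fraction * stake


# Hypothetical: 10-token job, 4 tokens of electricity and wear,
# 1000-token stake, 10% slash, 90% chance a fake result is caught.
params = dict(payment=10, compute_cost=4, stake=1000, slash_fraction=0.1, p_caught=0.9)
honest = expected_profit(**params, cheat=False)
cheating = expected_profit(**params, cheat=True)
```

Honest Klaus expects to make 6 tokens per job; cheating Klaus expects to lose about 89. With this stake and slash size, cheating only comes out ahead if the chance of getting caught drops below roughly 3.6%.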


The Limitations Nobody In The Space Wants To Talk About

This is the section I almost didn't write, because it's uncomfortable. But it's the most important one.

If DePIN were good enough to replace traditional cloud providers, enterprises would have already switched. They haven't. Here's why.

Latency Is A Physics Problem

Your job: Bengaluru
Klaus's GPU: Berlin
Speed-of-light floor (in fibre): ~35ms one way, before routing and queueing add more

Traditional cloud (nearest region): ~8ms
DePIN random provider: 20ms to 300ms+ and unpredictable

Traditional cloud providers win on latency every time because they built data centres strategically close to population centres. You can tell DePIN networks to only use providers within a certain distance — but that immediately shrinks the available pool, reduces competition, and raises prices. The latency advantage narrows exactly when you try to fix it.
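
For a sense of where the physical floor sits, here's a rough Python estimate from distance alone. It assumes a straight great-circle fibre path and light travelling at about two-thirds of its vacuum speed in glass; the city coordinates are approximate. Real routes are longer and add routing and queueing delay on top.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBRE_SPEED_KM_S = 200_000.0  # assumption: light in glass moves at roughly 2/3 of c


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_one_way_ms(distance_km):
    """Lower bound on one-way delay over an ideal fibre path."""
    return distance_km / FIBRE_SPEED_KM_S * 1000


bengaluru = (12.97, 77.59)   # approximate latitude, longitude
berlin = (52.52, 13.40)
distance = great_circle_km(*bengaluru, *berlin)
floor_ms = min_one_way_ms(distance)
```

That works out to a bit over 7,000 km and a floor in the mid-30s of milliseconds each way. No amount of clever engineering gets below it, which is why distance filters are the only real fix.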

For batch training where latency doesn't matter: DePIN is great. For real-time inference where milliseconds matter: traditional cloud still wins.

The Gamer Problem

Klaus starts your 6-hour training job at 10pm
        ↓
Klaus decides to play games at midnight
        ↓
Job dies 4 hours in
        ↓
Your compute: wasted

Professional cloud hardware has uptime guarantees measured in nines (99.99%). Consumer hardware in someone's bedroom does not. Networks mitigate this with checkpointing and redundancy, but none fully close the gap — and each mitigation adds overhead.
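
Checkpointing is the mitigation you can apply yourself today. Here's a toy Python loop that saves progress every 100 steps and resumes from the last save. The `die_at` parameter is just a hypothetical way to simulate Klaus going offline; a real job would save model and optimiser state, not a step counter.

```python
import json
import os
import tempfile


def train(total_steps, ckpt_path, checkpoint_every=100, die_at=None):
    """Run a toy training loop that resumes from the last checkpoint on disk."""
    step = 0
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            step = json.load(f)["step"]
    while step < total_steps:
        if die_at is not None and step == die_at:
            raise RuntimeError("provider went offline")
        step += 1  # stand-in for one real optimisation step
        if step % checkpoint_every == 0:
            tmp = ckpt_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"step": step}, f)
            os.replace(tmp, ckpt_path)  # atomic swap: a crash never leaves a torn file
    return step


ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
try:
    train(1000, ckpt, die_at=650)      # first provider dies after 650 steps
except RuntimeError:
    pass
with open(ckpt) as f:
    resumed_from = json.load(f)["step"]
final_step = train(1000, ckpt)         # second provider picks up the checkpoint
```

The job loses the 50 steps since the last save instead of all 650. The overhead is the trade-off: checkpoint more often and you lose less work but spend more time writing to disk.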

The Verification Gap For Long Jobs

Challenge-response verification works great for short, deterministic jobs. But what about a 72-hour model fine-tuning run?

You send Klaus 100GB of training data. He returns 50GB of model weights after three days. How do you know he trained on your full dataset? Used your exact configuration? Didn't introduce subtle errors — intentional or not?

You can't verify any of this without re-running the entire job yourself. Which defeats the purpose.

The theoretical solution is zero-knowledge proofs — mathematical proofs that a computation happened correctly without revealing the inputs. Theoretically perfect. In practice, generating ZK proofs for GPU workloads is currently far more expensive than just running the computation again. The research is advancing fast, but it's not solved yet.

The Compliance Wall

This one is underrated and kills enterprise adoption faster than any technical limitation.

When you send training data to a traditional cloud provider, your data stays within known, audited infrastructure. Clear jurisdiction. SOC 2 audit reports, HIPAA-eligible services, GDPR data processing agreements. Audit trails.

When you send training data to a DePIN network, your data goes to Klaus's PC in Berlin. Klaus can technically see your raw training data. There's no data processing agreement. Unclear jurisdiction. No compliance certifications.

For healthcare, finance, or legal sectors, this is an immediate disqualifier — and potentially illegal under GDPR for EU citizen data. No compliance officer at a regulated company signs off on "we sent patient data to a stranger's gaming PC."

This is arguably the single biggest barrier to enterprise adoption. Bigger than latency. Bigger than reliability.

The Honest Summary

Traditional cloud providers sell certainty. DePIN sells savings. Most enterprises pay a premium for certainty.

DePIN doesn't need to beat traditional cloud to matter. It needs to serve the massive underserved market of AI builders who can't afford traditional cloud pricing — and that market is growing faster than any single provider can build data centres to serve it.


Why 2026 Specifically — The Timing Is Not An Accident

DePIN concepts have existed since 2017. So why is it suddenly relevant now? Two things happened simultaneously.

On the demand side: The AI compute explosion created a massive, urgent, underserved market for GPU compute below the hyperscaler tier. The numbers are staggering — the global AI compute market is measured in the tens of billions of dollars and growing fast. Most of that demand comes from builders who can't sign enterprise cloud contracts.

On the supply side: In September 2022, Ethereum moved to Proof of Stake. Almost overnight, hundreds of thousands of GPUs previously used for crypto mining lost their main workload. Many of these weren't cards in someone's part-time gaming PC. They sat in dedicated rigs with serious cooling and 24/7 uptime infrastructure already built and paid for.

Two months later, ChatGPT launched and AI compute demand exploded.

Sept 2022: GPU mining industry collapses → massive idle supply
Nov 2022:  ChatGPT launches → massive new demand
2023–2026: DePIN networks absorb idle supply, serve exploding demand

That timing wasn't engineered. It just happened. DePIN was in the right place.

And here's the structural tailwind nobody talks about enough: the leading GPU manufacturers have had 6–12 month waiting lists for their most advanced chips. The hyperscalers — the big three cloud providers — are buying them in bulk, which means everyone else either waits, pays massive premiums, or looks for alternatives.

DePIN's value proposition gets stronger every time GPU supply is constrained. Which in 2026 is essentially always.


The Real-World Numbers For An AI Startup

Here's a scenario playing out across hundreds of AI startups right now:

Building a vertical AI product (e.g. legal document analysis)

Traditional cloud approach:
  Fine-tune a large open-source model:     ~$8,000
  Inference endpoint per month:            ~$3,000
  Total Year 1:                            ~$44,000

DePIN approach:
  Fine-tune on a DePIN compute network:   ~$2,800
  Inference endpoint per month:            ~$1,100
  Total Year 1:                            ~$16,000

Saving: ~$28,000 in Year 1

For a bootstrapped AI startup, that $28,000 difference is runway. Potentially 3–6 extra months of operation. At seed stage, that's existential.
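
The arithmetic behind those totals, in Python, so you can plug in your own quotes. The $6,000/month burn rate is a hypothetical I picked for illustration; the 3–6 month range corresponds to burning somewhere between about $4,700 and $9,300 a month.

```python
def year_one_cost(fine_tune: int, inference_per_month: int, months: int = 12) -> int:
    """One-off fine-tuning cost plus a year of monthly inference spend."""
    return fine_tune + inference_per_month * months


cloud = year_one_cost(fine_tune=8_000, inference_per_month=3_000)
depin = year_one_cost(fine_tune=2_800, inference_per_month=1_100)
saving = cloud - depin
extra_runway_months = saving / 6_000  # hypothetical burn rate of $6,000/month
```

Note where the saving comes from: about $5,200 is the one-off fine-tune, and the remaining $22,800 is the monthly inference bill, so the recurring cost is what really moves the total.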


Where This Is All Going

DePIN is not going to host the next frontier model training run. It's not going to replace enterprise cloud for production workloads tomorrow. The compliance gap, reliability gap, and verification gap are real and won't close overnight.

But it doesn't need to beat the big cloud providers to be transformative. It needs to do something more specific:

Democratise access to GPU compute for the 99% of AI builders the current system ignores.

The solo researcher fine-tuning an open-source model on a $200 budget. The startup that needs inference at scale but can't sign a $50,000/month contract. The developer in a market where cloud billing approval is difficult but has a powerful GPU sitting right there.

These are DePIN's people. And there are orders of magnitude more of them than there are hyperscaler customers.

The supply is real. The demand is exploding. The timing — a once-in-a-generation collision of idle hardware, exploding AI compute needs, and mature blockchain infrastructure — is something you don't engineer. It just happens.

DePIN happened to be there when it did.


I'm still learning. But I think the question I started with — where does the compute actually come from? — has a more interesting answer than I expected when I started digging.

It can come from anywhere. That's kind of the whole point.


I'm Niranjan — Full Stack & Web3 Developer at Alkimi Exchange, M.Sc. student in Data Science & Generative AI. Writing about what I'm actually learning, not what sounds impressive.

Find me on X / Twitter | Hashnode
