How I Ship Sub-200ms TTFB on Sanity-Powered Pages with PPR and Edge

Apr 27, 2026 · 5 min read

nextjs · sanity · performance · ppr · edge

Why TTFB still tanks on supposedly cached pages

I see this pattern constantly: a team builds a Next.js site with ISR, points it at Sanity, and still sees 400–800ms TTFB on "cached" pages. The culprit is usually revalidation blocking the response, or route handlers hitting the Sanity CDN synchronously during render. Even with revalidate: 3600, Next.js has to check the cache, possibly re-fetch, and stream. If you're querying Sanity from a serverless function in us-east-1 while your visitor's nearest CDN edge is in Sydney, you've lost 150ms to geography alone.
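Before optimizing, it helps to measure. Here's a rough TTFB probe, timing the gap between issuing a request and receiving the first streamed body chunk; timeToFirstByte and the injectable fetchFn are illustrative names (not from any library), and on a cold connection the number folds in DNS and TLS as well:

```typescript
// Rough TTFB probe: elapsed ms until the first body chunk arrives.
// Accepts a fetch implementation so it can be tested without a network.
export async function timeToFirstByte(
  fetchFn: typeof fetch,
  url: string,
): Promise<number> {
  const start = performance.now();
  const res = await fetchFn(url);
  const reader = res.body?.getReader();
  if (reader) {
    await reader.read(); // resolves when the first streamed chunk lands
    reader.releaseLock();
  }
  return performance.now() - start;
}
```

Run it against the same page from a few regions and the geography penalty shows up immediately.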

I wanted sub-200ms TTFB globally for a marketing site with 40+ localized pages, each pulling hero images, nav, footer, and a testimonial carousel from Sanity. Here's the pattern that worked.

Partial Prerendering for the shell, defer the dynamic bits

Next.js 15 ships Partial Prerendering (PPR) as an experimental feature, enabled per route via an experimental_ppr flag. The idea: prerender the static shell at build time, stream dynamic holes on request. For Sanity-backed pages, I prerender the layout, hero, and any above-fold content that doesn't change per user. The testimonial carousel and a "latest posts" sidebar remain dynamic—fetched at request time but outside the initial HTML payload.

In app/[locale]/page.tsx:

// app/[locale]/page.tsx
import { Suspense } from 'react';
import { getHeroData } from '@/sanity/queries/hero';
import { TestimonialCarousel } from '@/components/TestimonialCarousel';
import { LatestPosts } from '@/components/LatestPosts';
 
export const experimental_ppr = true;
 
interface PageProps {
  // Next.js 15 delivers `params` as a Promise in async route segments
  params: Promise<{ locale: string }>;
}

export default async function Home({ params }: PageProps) {
  const { locale } = await params;

  // Static fetch — runs at build, cached indefinitely
  const hero = await getHeroData(locale);

  return (
    <main>
      <section className="hero">
        <h1>{hero.title}</h1>
        <p>{hero.subtitle}</p>
      </section>

      <Suspense fallback={<div className="skeleton h-64" />}>
        <TestimonialCarousel locale={locale} />
      </Suspense>

      <Suspense fallback={<div className="skeleton h-96" />}>
        <LatestPosts locale={locale} />
      </Suspense>
    </main>
  );
}

The hero is prerendered. The two Suspense boundaries stream after the shell. TTFB reflects only the prerendered HTML—typically 80–120ms from Vercel's edge.
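PPR also has to be switched on in the Next.js config. A minimal sketch, assuming the incremental opt-in mode that pairs with the per-route experimental_ppr flag:

```typescript
// next.config.ts: a sketch. 'incremental' lets individual routes opt in
// via `export const experimental_ppr = true` instead of enabling PPR globally.
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  experimental: {
    ppr: 'incremental',
  },
};

export default nextConfig;
```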

Edge runtime for dynamic fragments

For the dynamic Suspense blocks, I use the edge runtime to minimize Sanity query latency. Vercel's edge network colocates with Sanity's CDN (Fastly). A GROQ query from an edge function averages 40–70ms vs. 150–250ms from a Node.js serverless function.

In components/TestimonialCarousel.tsx:

// components/TestimonialCarousel.tsx
import { client } from '@/sanity/client';

// Note: `runtime` is a route segment config. Next.js reads it from the
// segment's page.tsx, layout.tsx, or route.ts, so in practice this export
// lives in the route file that renders this component.
export const runtime = 'edge';

interface Props {
  locale: string;
}

interface Testimonial {
  _id: string;
  quote: string;
  author: string;
  avatar: string | null;
}

export async function TestimonialCarousel({ locale }: Props) {
  const testimonials = await client.fetch<Testimonial[]>(
    // Order first, then slice: slicing before ordering would return an
    // arbitrary three documents instead of the newest three
    `*[_type == "testimonial" && language == $locale] | order(_createdAt desc) [0..2] {
      _id,
      quote,
      author,
      "avatar": avatar.asset->url
    }`,
    { locale },
    { next: { revalidate: 1800 } }
  );

  return (
    <div className="carousel">
      {testimonials.map((t) => (
        <blockquote key={t._id}>
          <p>{t.quote}</p>
          <cite>{t.author}</cite>
        </blockquote>
      ))}
    </div>
  );
}

Key: export const runtime = 'edge';. One caveat: runtime is a route segment config, so Next.js only honors it when it's exported from a page, layout, or route file; in practice the export lives in the segment's page.tsx, and the whole segment, including these streamed components, compiles to an edge function. The revalidate: 1800 keeps Sanity queries cached in Vercel's edge cache for 30 minutes. Subsequent requests hit the edge cache with no Sanity roundtrip.
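The @/sanity/client module imported above isn't shown in the post. A minimal sketch with next-sanity's createClient; the env var names and API version date here are assumptions, not the post's actual config:

```typescript
// sanity/client.ts: a sketch, not the post's actual configuration
import { createClient } from 'next-sanity';

export const client = createClient({
  projectId: process.env.NEXT_PUBLIC_SANITY_PROJECT_ID!, // assumed env var name
  dataset: process.env.NEXT_PUBLIC_SANITY_DATASET ?? 'production',
  apiVersion: '2025-01-01', // any recent date-based API version
  useCdn: true, // serve queries from Sanity's Fastly-backed CDN
});
```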

Selective projection in GROQ

Even on the edge, a 200 kB Sanity response tanks your budget. I project only the fields I render. The above query omits _rev, _updatedAt, nested reference expansions I don't need. For portable text fields, I check if the component actually renders blocks; if not, I omit them:

*[_type == "post" && slug.current == $slug][0] {
  _id,
  title,
  "excerpt": pt::text(excerpt),  // Extract plain text instead of full blocks
  publishedAt,
  "image": mainImage.asset->url
}

This cuts response size from 180 kB to 12 kB for a typical post preview. Multiply by 10 posts in a grid, and you've saved 1.6 MB of transfer.
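One way to keep a trimmed projection honest as the UI evolves is to pin the query string next to the TypeScript type it's expected to produce. PostPreview and postPreviewQuery here are illustrative names, not from the post:

```typescript
// Hypothetical typed shape for the trimmed post-preview projection above
export interface PostPreview {
  _id: string;
  title: string;
  excerpt: string; // plain text via pt::text(), not portable text blocks
  publishedAt: string;
  image: string | null;
}

// Keeping the GROQ projection beside the type makes field drift easy to
// catch in review: every projected field should appear in PostPreview
export const postPreviewQuery = `*[_type == "post" && slug.current == $slug][0] {
  _id,
  title,
  "excerpt": pt::text(excerpt),
  publishedAt,
  "image": mainImage.asset->url
}`;
```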

Benchmarks

Before PPR + edge:

  • TTFB (p50): 420ms (Node.js serverless, ISR revalidation every 3600s)
  • TTFB (p95): 780ms
  • LCP: 1.8s

After PPR + edge:

  • TTFB (p50): 140ms (prerendered shell)
  • TTFB (p95): 190ms
  • LCP: 1.1s (hero image preloaded, above-fold content in initial HTML)

The dynamic fragments stream in 60–90ms after the shell. Users see hero and layout instantly. Testimonials and posts pop in without layout shift because I reserve skeleton space.

Trade-offs

PPR requires Next.js 15+. If you're on 14, you can approximate this with loading.tsx and manual streaming, but it's clunkier. Edge runtime doesn't support all Node.js APIs—no fs, no child_process. If you're running image transforms or PDF generation server-side, keep those in Node.js functions and call them from edge via fetch if needed.

The 30-minute revalidate window means content updates lag by up to 30 minutes. For a marketing site, that's fine. For a news site, drop it to 60–300 seconds or use on-demand revalidation via Sanity webhooks.
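For the webhook route, the handler logic can be factored into a plain function over Web-standard Request/Response, which is the shape Next.js route handlers use. handleSanityWebhook, the secret query parameter, and the revalidate callback (e.g. Next's revalidateTag) are all illustrative, not from the post:

```typescript
// Sketch of a Sanity webhook handler. The revalidate callback is injected
// so the core logic stays framework-agnostic and testable.
type Revalidate = (tag: string) => void;

export async function handleSanityWebhook(
  req: Request,
  secret: string,
  revalidate: Revalidate,
): Promise<Response> {
  const url = new URL(req.url);
  // Reject requests that don't carry the shared secret
  if (url.searchParams.get('secret') !== secret) {
    return new Response('unauthorized', { status: 401 });
  }
  const body = (await req.json()) as { _type?: string };
  if (!body._type) {
    return new Response('bad payload', { status: 400 });
  }
  // Purge everything tagged with this document type, e.g. 'testimonial'
  revalidate(body._type);
  return Response.json({ revalidated: body._type });
}
```

In an app/api/revalidate/route.ts, the POST export would call this with process.env secret and Next's revalidateTag, assuming the fetches elsewhere are tagged by document type.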

When to reach for this

If your Sanity-powered pages serve global traffic and you're hitting 300ms+ TTFB, this pattern pays off immediately. If your audience is regional and you're already sub-200ms, the complexity isn't worth it. I use this on client sites with SLA requirements around Core Web Vitals—anything over 200ms TTFB kills your LCP budget before the browser even starts parsing HTML.
