Routing: descriptors and specificity

Routing maps a request path to the single best page descriptor. It replaced the old hardcoded rule list with a small, order-independent matcher driven entirely by the descriptors in the site's settings.

Source: src/engine/router.ts, src/engine/settings.ts.

The two functions

  • matchPattern(pathname, pattern) matches one path against one pattern such as /{category}/{slug}, returning the captured params and a specificity score, or null on no match.
  • matchDescriptor(pathname, pageTypes) runs every descriptor's patterns and returns the best DescriptorMatch, or null when nothing matches.

Matching a single pattern

matchPattern splits both the path and the pattern into segments, dropping empty ones (so leading, trailing, and repeated slashes are ignored), and compares them position by position:

export function matchPattern(
  pathname: string,
  pattern: string,
): { params: Record<string, string>; specificity: number } | null {
  const patternParts = pattern.split("/").filter(Boolean);
  const pathParts = pathname.split("/").filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;

  const params: Record<string, string> = {};
  let specificity = 0;
  for (let i = 0; i < patternParts.length; i++) {
    const seg = patternParts[i]!;
    const val = pathParts[i]!;
    if (seg.startsWith("{") && seg.endsWith("}")) {
      params[seg.slice(1, -1)] = decodeURIComponent(val);
    } else if (seg === val) {
      specificity++;
    } else {
      return null;
    }
  }
  return { params, specificity };
}

Three rules fall out of this:

  • Segment count must match. /a/b never matches /a/b/c.
  • {param} segments capture. {slug} captures the path segment under the name slug (URL-decoded). They do not add to specificity.
  • Literal segments must match exactly and each one adds 1 to the specificity score. A pattern with more fixed segments is more specific.
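
The three rules can be checked directly by calling the function. The snippet below repeats matchPattern from the excerpt above so that it runs on its own; the paths and patterns are illustrative and do not come from any real settings payload.

```typescript
// matchPattern, repeated from the excerpt above so the snippet is self-contained.
function matchPattern(
  pathname: string,
  pattern: string,
): { params: Record<string, string>; specificity: number } | null {
  const patternParts = pattern.split("/").filter(Boolean);
  const pathParts = pathname.split("/").filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;

  const params: Record<string, string> = {};
  let specificity = 0;
  for (let i = 0; i < patternParts.length; i++) {
    const seg = patternParts[i]!;
    const val = pathParts[i]!;
    if (seg.startsWith("{") && seg.endsWith("}")) {
      params[seg.slice(1, -1)] = decodeURIComponent(val);
    } else if (seg === val) {
      specificity++;
    } else {
      return null;
    }
  }
  return { params, specificity };
}

// One literal segment plus one captured param.
const literalAndParam = matchPattern("/category/foo", "/category/{slug}");
// → { params: { slug: "foo" }, specificity: 1 }

// Params only: captures are URL-decoded, and the trailing slash is ignored.
const allParams = matchPattern("/news/hello%20world/", "/{category}/{slug}");
// → { params: { category: "news", slug: "hello world" }, specificity: 0 }

// Segment counts differ, so there is no match.
const wrongLength = matchPattern("/a/b", "/a/b/c");
// → null

// A literal segment that differs rejects the whole pattern.
const literalMismatch = matchPattern("/tags/foo", "/category/{slug}");
// → null

console.log(literalAndParam, allParams, wrongLength, literalMismatch);
```

One edge case to be aware of: decodeURIComponent throws a URIError on a malformed escape sequence (for example a lone %), so a caller that receives untrusted paths may want to catch that error.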

Choosing the best descriptor

matchDescriptor evaluates every pattern of every descriptor and keeps the winner under a strict ordering:

if (
  best === null ||
  candidate.specificity > best.specificity ||
  (candidate.specificity === best.specificity && candidate.position < best.position)
) {
  best = candidate;
}

The tie-break chain is:

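  1. Higher specificity wins -- a pattern with more literal segments beats one that relies on more {param} segments.
  2. On equal specificity, lower position wins. A descriptor with no position is treated as 1_000_000 + index, where index is its place in the descriptor list, so it sorts after explicitly positioned descriptors and then in list order.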

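Because specificity comes first, the result is robust to how the descriptors happen to be listed in the settings payload. Only an exact tie on both specificity and position falls back to evaluation order, because the strict comparison keeps the candidate that was seen first.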
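
The ordering can be exercised on its own, without any path matching. This is a minimal sketch under stated assumptions: the Candidate shape and the helper names effectivePosition, beats, and pickBest are hypothetical and are not taken from router.ts. Only the comparison itself and the 1_000_000 + index default come from the text above.

```typescript
// Hypothetical candidate shape: one entry per pattern that matched.
interface Candidate {
  type: string;
  specificity: number;
  position: number; // effective position, already defaulted
}

// Default described above: a missing position becomes 1_000_000 + index.
function effectivePosition(position: number | undefined, index: number): number {
  return position ?? 1_000_000 + index;
}

// The same comparison as the excerpt above, factored into a predicate.
function beats(candidate: Candidate, best: Candidate | null): boolean {
  return (
    best === null ||
    candidate.specificity > best.specificity ||
    (candidate.specificity === best.specificity && candidate.position < best.position)
  );
}

// Keep the winner while scanning candidates in the order given.
function pickBest(candidates: Candidate[]): Candidate | null {
  let best: Candidate | null = null;
  for (const candidate of candidates) {
    if (beats(candidate, best)) best = candidate;
  }
  return best;
}

// Illustrative candidates; "unpositioned" is invented to show the default.
const candidates: Candidate[] = [
  { type: "article", specificity: 0, position: effectivePosition(40, 0) },
  { type: "category", specificity: 1, position: effectivePosition(50, 1) },
  { type: "unpositioned", specificity: 1, position: effectivePosition(undefined, 2) },
];

const forward = pickBest(candidates);
const reversed = pickBest([...candidates].reverse());
console.log(forward?.type, reversed?.type); // prints "category category"
```

Scanning the list forwards or backwards yields the same winner here, which is the order independence the section describes: specificity decides first, and the explicit position 50 beats the defaulted 1_000_002.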

Catch-all exclusion

Any pattern containing a * anywhere (for example the 404 descriptor's /*) is skipped during matching:

if (pattern.includes("*")) continue; // catch-all -> engine fallback only

Catch-alls never win a match. Instead, an unmatched path returns null and the engine falls back to the not-found descriptor explicitly (below). This keeps the 404 from accidentally outranking a real route.
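
Since the check is a plain substring test, it removes any pattern with an asterisk in it, not only /*. A small sketch of the effect follows; the helper name isCatchAll and the pattern /files/*.pdf are invented for illustration.

```typescript
// Same substring test as the excerpt above, wrapped in a hypothetical helper.
function isCatchAll(pattern: string): boolean {
  return pattern.includes("*");
}

// Illustrative patterns; only "/*" appears in the text above.
const patterns = ["/*", "/category/{slug}", "/files/*.pdf", "/{slug}"];

// Patterns that remain eligible for matching.
const matchable = patterns.filter((p) => !isCatchAll(p));
console.log(matchable); // → ["/category/{slug}", "/{slug}"]
```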

The not-found fallback

src/index.ts turns a null match into the explicit not-found descriptor:

const match: DescriptorMatch =
  matchDescriptor(pathname, pageTypes) ??
  { descriptor: notFoundDescriptor(pageTypes), params: {} };

notFoundDescriptor (src/engine/settings.ts) finds the descriptor whose builder is notfound or whose type is 404, falling back to a built-in 404 descriptor if the settings carry none:

export function notFoundDescriptor(pageTypes: PageDescriptor[]): PageDescriptor {
  return (
    pageTypes.find((d) => d.builder === "notfound" || d.type === "404") ?? {
      type: "404",
      routes: ["/*"],
      layout: ["404"],
      ttl: 300,
      builder: "notfound",
    }
  );
}

So the engine always has a descriptor to render, even for a path that matches nothing.
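
The fallback can be tried in isolation. In this sketch the PageDescriptor and DescriptorMatch shapes are reduced to the fields visible in the excerpts above, and withNotFoundFallback is a hypothetical wrapper around the ?? expression from src/index.ts; the branded-404 layout name is made up.

```typescript
// Minimal shapes assumed for this sketch; the real types live in the engine.
interface PageDescriptor {
  type: string;
  routes: string[];
  layout: string[];
  ttl: number;
  builder: string;
  position?: number;
}
interface DescriptorMatch {
  descriptor: PageDescriptor;
  params: Record<string, string>;
}

// Repeated from the excerpt above.
function notFoundDescriptor(pageTypes: PageDescriptor[]): PageDescriptor {
  return (
    pageTypes.find((d) => d.builder === "notfound" || d.type === "404") ?? {
      type: "404",
      routes: ["/*"],
      layout: ["404"],
      ttl: 300,
      builder: "notfound",
    }
  );
}

// Hypothetical wrapper: a null match becomes the not-found descriptor.
function withNotFoundFallback(
  matched: DescriptorMatch | null,
  pageTypes: PageDescriptor[],
): DescriptorMatch {
  return matched ?? { descriptor: notFoundDescriptor(pageTypes), params: {} };
}

// A settings-supplied 404 descriptor (illustrative values).
const custom404: PageDescriptor = {
  type: "404",
  routes: ["/*"],
  layout: ["branded-404"],
  ttl: 60,
  builder: "notfound",
};

const builtIn = withNotFoundFallback(null, []);
const fromSettings = withNotFoundFallback(null, [custom404]);
console.log(builtIn.descriptor.layout, fromSettings.descriptor.layout);
```

With an empty descriptor list the built-in 404 (layout 404, ttl 300) is used; when the settings carry their own 404 descriptor, that one is returned instead.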

Where descriptors come from

getPageTypes (src/engine/settings.ts) prefers the API-served descriptors and falls back to a built-in set only when the settings carry none:

export function getPageTypes(settings: SiteSettings): PageDescriptor[] {
  const fromApi = settings.site_settings.page_types;
  if (Array.isArray(fromApi) && fromApi.length > 0) return fromApi;
  return DEFAULT_PAGE_TYPES;
}

DEFAULT_PAGE_TYPES mirrors the seed descriptors and is used only when the CMS has not yet served any (an old worker, or a stale KV fallback snapshot). Once the API serves page_types, the API copy wins -- so a site's routing can evolve without a worker redeploy.
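
A short sketch of the three cases follows. The SiteSettings and PageDescriptor shapes are trimmed-down assumptions, the DEFAULT_PAGE_TYPES array is a two-entry stand-in for the real built-in set, and the landing descriptor is invented for illustration.

```typescript
// Trimmed-down shapes assumed for this sketch.
interface PageDescriptor {
  type: string;
  routes: string[];
  position?: number;
}
interface SiteSettings {
  site_settings: { page_types?: PageDescriptor[] | null };
}

// Stand-in for the built-in set; the real one has more fields and entries.
const DEFAULT_PAGE_TYPES: PageDescriptor[] = [
  { type: "article", routes: ["/{category}/{slug}"], position: 40 },
  { type: "category", routes: ["/category/{slug}", "/{slug}"], position: 50 },
];

// Repeated from the excerpt above.
function getPageTypes(settings: SiteSettings): PageDescriptor[] {
  const fromApi = settings.site_settings.page_types;
  if (Array.isArray(fromApi) && fromApi.length > 0) return fromApi;
  return DEFAULT_PAGE_TYPES;
}

// Settings that predate page_types, settings with an empty list, and a served list.
const stale: SiteSettings = { site_settings: {} };
const empty: SiteSettings = { site_settings: { page_types: [] } };
const served: SiteSettings = {
  site_settings: {
    page_types: [{ type: "landing", routes: ["/promo/{slug}"], position: 10 }],
  },
};

console.log(getPageTypes(stale) === DEFAULT_PAGE_TYPES); // prints true
console.log(getPageTypes(empty) === DEFAULT_PAGE_TYPES); // prints true
console.log(getPageTypes(served).map((d) => d.type)); // → ["landing"]
```

Two consequences of the code as written: an empty page_types array is treated the same as a missing one, and a non-empty API list replaces the defaults wholesale rather than being merged with them.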

Worked example: /category/foo resolves to category, not article

Two DEFAULT_PAGE_TYPES descriptors both have a pattern that matches a two-segment path:

{ type: "article",  routes: ["/{category}/{slug}"], ... position: 40 },
{ type: "category", routes: ["/category/{slug}", "/{slug}"], ... position: 50 },

For the path /category/foo:

Descriptor  Pattern             Match                              Specificity
article     /{category}/{slug}  yes (category=category, slug=foo)  0 literals
category    /category/{slug}    yes (slug=foo)                     1 literal (category)

Both match with two segments, but category's pattern has one literal segment (category) versus the article pattern's zero. Higher specificity wins, so /category/foo resolves to the category descriptor -- even though category has the higher (less preferred) position, because specificity is compared first. The position tie-break only matters when two patterns have the same specificity.

A single-segment path like /foo matches category's second pattern /{slug} (0 literals). As long as no other descriptor matches a one-segment path with higher specificity, or with equal specificity and a lower position, it resolves to category, which makes category the fallback content listing for single-segment paths.
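
The whole example can be run end to end. matchPattern below is repeated from the excerpt earlier in this page; matchDescriptor is a sketch assembled from the rules described here (catch-all skip, specificity, then position), not a copy of router.ts, and the reduced PageDescriptor and DescriptorMatch shapes are assumptions. The descriptors are deliberately listed in a different order from the worked example to show that the outcome does not depend on it.

```typescript
interface PageDescriptor {
  type: string;
  routes: string[];
  position?: number;
}
interface DescriptorMatch {
  descriptor: PageDescriptor;
  params: Record<string, string>;
}

// Repeated from the excerpt above.
function matchPattern(
  pathname: string,
  pattern: string,
): { params: Record<string, string>; specificity: number } | null {
  const patternParts = pattern.split("/").filter(Boolean);
  const pathParts = pathname.split("/").filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;

  const params: Record<string, string> = {};
  let specificity = 0;
  for (let i = 0; i < patternParts.length; i++) {
    const seg = patternParts[i]!;
    const val = pathParts[i]!;
    if (seg.startsWith("{") && seg.endsWith("}")) {
      params[seg.slice(1, -1)] = decodeURIComponent(val);
    } else if (seg === val) {
      specificity++;
    } else {
      return null;
    }
  }
  return { params, specificity };
}

// Sketch of matchDescriptor built from the rules in this page (an assumption,
// not the actual source).
function matchDescriptor(
  pathname: string,
  pageTypes: PageDescriptor[],
): DescriptorMatch | null {
  let best: (DescriptorMatch & { specificity: number; position: number }) | null = null;
  for (let index = 0; index < pageTypes.length; index++) {
    const descriptor = pageTypes[index]!;
    // A missing position sorts after explicit ones, then by list order.
    const position = descriptor.position ?? 1_000_000 + index;
    for (const pattern of descriptor.routes) {
      if (pattern.includes("*")) continue; // catch-all -> engine fallback only
      const hit = matchPattern(pathname, pattern);
      if (hit === null) continue;
      const candidate = { descriptor, params: hit.params, specificity: hit.specificity, position };
      if (
        best === null ||
        candidate.specificity > best.specificity ||
        (candidate.specificity === best.specificity && candidate.position < best.position)
      ) {
        best = candidate;
      }
    }
  }
  return best === null ? null : { descriptor: best.descriptor, params: best.params };
}

// The two descriptors from the worked example plus a catch-all 404.
const pageTypes: PageDescriptor[] = [
  { type: "category", routes: ["/category/{slug}", "/{slug}"], position: 50 },
  { type: "article", routes: ["/{category}/{slug}"], position: 40 },
  { type: "404", routes: ["/*"] },
];

const resolved = ["/category/foo", "/news/hello", "/foo", "/a/b/c"].map(
  (p) => matchDescriptor(p, pageTypes)?.descriptor.type ?? null,
);
console.log(resolved); // → ["category", "article", "category", null]
```

The final null is the case that the engine turns into the not-found descriptor; the /* pattern itself never takes part in the comparison.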