You can't cancel synchronous JavaScript: Web Workers as the browser's only kill switch

AbortController is cooperative. It can't stop a runaway regex or a multi-megabyte JSON.stringify, and a setTimeout on the main thread can't even fire while one is running. The only true preemption primitive in the browser is Worker.terminate(). Two patterns from TaskKit.

Paste this into a regex tester that runs on the main thread:

pattern:  (a+)+$
input:    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!

Thirty-two a’s and an exclamation mark. The regex engine tries to match, fails, backtracks, tries again, and the number of paths it explores roughly doubles with every additional a. The tab freezes. Not “spinner” freezes. The whole page stops: the input you’re typing into, the buttons, the close-the-tab animation. You get the browser’s “this page is slow” dialog, and the only button that helps is the one that kills the page.
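
To see where the doubling comes from, here is a small model. It is not how any real engine is implemented, and countGroupings is a hypothetical helper; it only counts the ways (a+)+ can carve a run of n a’s into non-empty groups. Each grouping is a path a backtracking matcher may have to try and reject once the trailing ! rules out a match.

```typescript
// Model only: counts the ways (a+)+ can split a run of n 'a's into
// one or more non-empty groups. Real engines differ in detail, but each
// grouping is a candidate path a backtracking matcher may have to reject.
function countGroupings(n: number): number {
  if (n === 0) return 1; // nothing left to split: one complete grouping
  let total = 0;
  // Let the first group take 1..n characters, then split the remainder.
  for (let first = 1; first <= n; first++) {
    total += countGroupings(n - first);
  }
  return total;
}

for (const n of [4, 8, 16]) {
  console.log(n, countGroupings(n)); // 4 -> 8, 8 -> 128, 16 -> 32768
}
```

The count is 2^(n-1), so each extra a doubles it, and thirty-two of them give about 2.1 billion groupings.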

We had to handle this in TaskKit, because the regex tester takes a pattern and the test string from the same untrusted person, and “don’t paste a pathological pattern” is not a security model. Solving it properly forced a fact into the open that I had half-misunderstood for years: you cannot cancel synchronous JavaScript. Not with AbortController, not with a timeout, not with a clever promise. The only thing in the browser that can stop a function that has decided not to return is Worker.terminate().

This post is about why that is, and the two different shapes the fix takes depending on whether you’re defending against slow or just trying to cancel stale.

The thing nobody tells you about cancellation

AbortController reads like a cancel button. You make one, you pass signal to the thing you want to be able to stop, you call abort(), and the thing stops. That mental model is correct for exactly one category of work: the work that has agreed, in its own source code, to check the signal.

fetch honors the signal. That’s why aborting a fetch works. The browser’s implementation watches the signal (it checks signal.aborted before starting and reacts to the abort event afterwards) and tears the request down when the flag goes up. Cancellation here is cooperative. The caller raises a flag; the callee, because it was written to, looks at the flag and quits.
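
A minimal sketch of what “cooperative” means in code, using a hypothetical processAll helper (not part of any API). The callee checks signal.aborted between units of work and stops on its own; the abort only has an effect because the loop chooses to look.

```typescript
// Hypothetical cooperative callee: it stops only because it checks the flag.
function processAll<T>(
  items: readonly T[],
  signal: AbortSignal,
  handle: (item: T) => void,
): number {
  let processed = 0;
  for (const item of items) {
    if (signal.aborted) break; // the callee volunteers to stop
    handle(item);
    processed++;
  }
  return processed;
}

const controller = new AbortController();
const seen: number[] = [];
const processedCount = processAll([1, 2, 3, 4, 5], controller.signal, (n) => {
  seen.push(n);
  if (n === 3) controller.abort(); // raise the flag partway through
});
console.log(processedCount, seen); // logs 3, then [1, 2, 3]
```

In this sketch the flag is raised from inside the loop’s own callback. On a page the abort would normally come from a separate task such as a click handler, and that task can only run if the callee hands the thread back between chunks.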

Now consider the three functions at the center of any “dev tool in the browser” project:

  • JSON.parse / JSON.stringify
  • RegExp.prototype.exec
  • anything you wrote that loops over a big array

None of them take a signal. None of them check one. There is no argument, no global, no option bag that makes JSON.stringify glance up mid-serialization and notice you’d like it to stop. Once you call it, it owns the thread until it returns or throws. AbortController is not “cancellation.” It is a convention, honored by APIs that opt in, and CPU-bound synchronous code does not opt in because it can’t: while it runs, nothing else on the thread runs, so nothing can raise the flag, and a check would never see anything new.

So the first correction: a cancel button that calls controller.abort() does nothing to a JSON.stringify running on the same thread. The click is not even delivered. The string keeps building. The handler that would call abort() is sitting in the task queue, waiting for the thread, which it will not get until the work it’s trying to cancel has finished on its own.
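
A small sketch of that ordering, with a zero-delay timer standing in for the click on a hypothetical Cancel button. The payload and its size are arbitrary, chosen only so that serializing it takes measurable time.

```typescript
const cancelController = new AbortController();

// Stand-in for the user clicking "Cancel": a task queued on the same thread.
setTimeout(() => cancelController.abort(), 0);

// Arbitrary payload, large enough that serializing it is not instant.
const payload = Array.from({ length: 200_000 }, (_, i) => ({ id: i, label: `row ${i}` }));

const serialized = JSON.stringify(payload); // owns the thread until it returns

// The queued "click" has not run yet, so nothing was aborted.
const abortedBeforeReturn = cancelController.signal.aborted;
console.log(serialized.length > 0, abortedBeforeReturn); // true false
```

The abort does happen eventually, but only once the synchronous stretch ends and the queued task gets the thread.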

Why a timeout on the main thread is a lie

The natural next idea is a deadline. “If the regex hasn’t finished in two seconds, give up.” Something like:

let done = false;
const timer = setTimeout(() => {
  if (!done) abortSomehow();  // <- there is no "abortSomehow"
}, 2000);

const result = bigRegex.exec(hugeString);  // blocks
done = true;
clearTimeout(timer);

This does not work, and the reason it doesn’t work is the most useful thing in this post.

setTimeout does not run code in 2,000 milliseconds. It runs code no sooner than 2,000 milliseconds, and only when the thread is free. The callback is a task. Tasks run on the one thread. While bigRegex.exec(hugeString) is grinding, the thread is not free, so the timer callback can’t run. It waits in the queue. If the regex takes forty seconds, the earliest your “two second timeout” could fire is forty seconds and one tick, after the thing it was supposed to interrupt has already finished interrupting your users. In the snippet above it never fires at all, because clearTimeout runs first, in the same synchronous stretch as the exec call.

You cannot time out main-thread work from the main thread. The watchdog and the thing it’s watching are the same thread, and the thing it’s watching never lets the watchdog speak.

That sentence is the whole design. The watchdog has to live on a thread that is free, while the dangerous work runs on a thread you are willing to lose. In the browser the general-purpose way to get a second thread that runs your code is a Web Worker. And there is exactly one way to stop code on that thread when it won’t stop itself: terminate().

The only real primitive: Worker.terminate()

Worker.terminate() is not cooperative. It does not ask. It does not run a finally block, flush a buffer, or reject a pending promise inside the worker. It stops the worker thread wherever it is, mid-instruction, and frees it. From the worker’s perspective there is no “after.” That brutality is exactly the property you need, because the entire problem is that the code in there has stopped responding to polite requests.

So the shape of every solution is the same:

  1. Run the untrusted or unbounded work on a worker.
  2. Keep a reference to that worker on the main thread, where the event loop is still turning.
  3. When you decide to stop (a deadline passes, or a newer request arrives), call terminate() and resolve the waiting promise yourself with a failure or cancellation result.

The main thread stays responsive the whole time, because the only thing it’s doing is waiting for a postMessage that may never come, which is free.
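
The three steps can be sketched as one small helper. Everything named here (KillableHandle, Spawn, Outcome, startKillable) is made up for the sketch and is not TaskKit’s code. The spawn callback stands in for creating a worker, which also lets the helper be exercised with a fake.

```typescript
// Hypothetical names throughout; a sketch of the shape, not TaskKit's code.
interface KillableHandle {
  postMessage(message: unknown): void;
  terminate(): void;
}
type Spawn = (onMessage: (data: unknown) => void) => KillableHandle;
type Outcome = { ok: true; value: unknown } | { ok: false; code: "KILLED" };

function startKillable(
  spawn: Spawn,
  message: unknown,
): { promise: Promise<Outcome>; kill: () => void } {
  let kill: () => void = () => {};
  const promise = new Promise<Outcome>((resolve) => {
    let settled = false;
    let handle: KillableHandle | null = null;
    const finish = (outcome: Outcome) => {
      if (settled) return; // first outcome wins
      settled = true;
      handle?.terminate(); // the worker is disposable either way
      resolve(outcome);
    };
    // Step 1: the work runs on the worker.
    handle = spawn((data) => finish({ ok: true, value: data }));
    // Steps 2 and 3: the main thread keeps a way to kill it and settles the promise itself.
    kill = () => finish({ ok: false, code: "KILLED" });
    handle.postMessage(message);
  });
  return { promise, kill };
}

// Usage with a stand-in for a worker stuck in a loop: it never replies.
let stuckTerminated = false;
const stuck: Spawn = () => ({
  postMessage: () => {},
  terminate: () => { stuckTerminated = true; },
});
const task = startKillable(stuck, { pattern: "(a+)+$" });
task.kill(); // e.g. called by a deadline timer or when a newer request arrives
task.promise.then((outcome) => console.log(outcome.ok, stuckTerminated)); // false true

// In a page, spawn would wrap a real worker, for example (url is hypothetical):
//   (onMessage) => { const w = new Worker(url); w.onmessage = (e) => onMessage(e.data); return w; }
```

The two patterns below differ only in who calls kill, and when.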

There are two situations where you reach for this, and they look different enough to be worth separating.

Pattern 1: timeout-kill, for work that might be hostile

This is the regex case. The input might be a catastrophic-backtracking pattern, and “catastrophic” means the running time can grow exponentially with the input: you cannot predict from the pattern’s length how long it will run, so you can’t pre-screen it by size. You have to be willing to start it and then stop it.

Here is the core of TaskKit’s regex executor, trimmed to the moving parts:

export class RegexExecutor {
  private worker: Worker | null = null;
  private pending = new Map<number, { resolve: (v: RegexExecution) => void; timer: ReturnType<typeof setTimeout> }>();
  private currentId = 0;

  constructor(private readonly timeoutMs = 2000) {}

  run(request: RegexRequest): Promise<RegexExecution> {
    const id = ++this.currentId;
    return new Promise((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        this.terminate();   // <- kills the worker mid-backtrack
        resolve({ ok: false, code: "TIMEOUT", error: "Pattern took too long (likely catastrophic backtracking)." });
      }, this.timeoutMs);

      this.pending.set(id, { resolve, timer });
      this.ensureWorker().postMessage({ id, ...request });
    });
  }

  private terminate() {
    this.worker?.terminate();
    this.worker = null;
    for (const [, entry] of this.pending) {
      clearTimeout(entry.timer);
      entry.resolve({ ok: false, code: "TIMEOUT", error: "Pattern took too long (likely catastrophic backtracking)." });
    }
    this.pending.clear();
  }
}

Look at where the setTimeout lives. It’s on the main thread, inside run. The regex runs on the worker thread, after postMessage. That’s the entire trick from the previous section made concrete: because the dangerous exec is off the main thread, the main thread is free, so the 2,000 ms timer actually fires on time, at about 2,000 ms. It then calls terminate(), the worker dies mid-backtrack, and the promise resolves with a TIMEOUT result the UI can render as a friendly error instead of a frozen tab.

Two details that aren’t obvious:

The worker is built from a string, not a file. The regex worker has no imports. It’s a RegExp and a loop. So instead of a module worker pointing at a .ts file, it’s inlined:

const blob = new Blob([WORKER_SOURCE], { type: "application/javascript" });
this.workerUrl = URL.createObjectURL(blob);
const worker = new Worker(this.workerUrl);

WORKER_SOURCE is a template string containing the whole handler. This is worth knowing as a technique: when the worker body is small and self-contained, a Blob URL gets you a second thread with zero build-tooling ceremony. (When the worker needs to share real code with the app, you want a proper module worker instead. That’s the next pattern.)

The worker is disposable, and that’s fine. After a timeout we throw the whole worker away and lazily spawn a fresh one on the next run. Spawning a worker isn’t free, but it happens only after a pathological pattern, which is rare, and the alternative (a hung worker we can never trust to be idle again) is worse. There’s no “reset” message you could send, because, as we’re about to see, a busy worker can’t read its mail.

The worker also rejects input over 1 MB before it builds the regex at all, and stops collecting once it hits 10,000 matches, so the timeout is the backstop for the pathological case, not the everyday limiter. Defense in depth: cheap limits first, the expensive kill switch only when those don’t catch it.
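
A sketch of what those cheap limits could look like inside the worker body. This is not TaskKit’s WORKER_SOURCE, which isn’t shown here; runRegex, the job and result shapes, and the constant names are assumptions, and 1 MB is approximated as 1,000,000 UTF-16 code units.

```typescript
// Hypothetical worker-side handler; names and shapes are assumptions.
const MAX_INPUT_LENGTH = 1_000_000; // ~1 MB, counted in UTF-16 code units
const MAX_MATCHES = 10_000;

interface RegexJob { pattern: string; flags: string; input: string }
type RegexJobResult =
  | { ok: true; matches: { index: number; text: string }[]; truncated: boolean }
  | { ok: false; code: "TOO_LARGE" | "INVALID_PATTERN"; error: string };

function runRegex(job: RegexJob): RegexJobResult {
  // Cheap limit first: refuse oversized input before compiling anything.
  if (job.input.length > MAX_INPUT_LENGTH) {
    return { ok: false, code: "TOO_LARGE", error: "Input exceeds the size limit." };
  }

  let regex: RegExp;
  try {
    // Force the global flag so each exec() call resumes from lastIndex.
    const flags = job.flags.includes("g") ? job.flags : job.flags + "g";
    regex = new RegExp(job.pattern, flags);
  } catch (err) {
    return { ok: false, code: "INVALID_PATTERN", error: String(err) };
  }

  const matches: { index: number; text: string }[] = [];
  let match: RegExpExecArray | null;
  while ((match = regex.exec(job.input)) !== null) {
    matches.push({ index: match.index, text: match[0] });
    // truncated: true means collection stopped at the cap.
    if (matches.length >= MAX_MATCHES) return { ok: true, matches, truncated: true };
    // An empty match does not advance lastIndex; step past it to avoid looping forever.
    if (match[0] === "") regex.lastIndex++;
  }
  return { ok: true, matches, truncated: false };
}

const demo = runRegex({ pattern: "an", flags: "", input: "banana" });
console.log(demo.ok && demo.matches.map((m) => m.index)); // logs [1, 3]

// Inside a worker this would be wired up roughly as:
//   self.onmessage = (e) => self.postMessage({ id: e.data.id, ...runRegex(e.data) });
```

None of these checks bound the time a single exec call can take, which is why the main-thread timer and terminate() remain the backstop.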

Pattern 2: cancel-on-supersede, for work that’s merely stale

The JSON formatter has a different problem. The input isn’t hostile, it’s just big, and the user is typing. Every keystroke produces a new “format this” request, and a 4 MB document takes long enough to format that three or four requests can pile up while the first is still running. You don’t want a timeout here. You want the old work cancelled the instant a newer keystroke makes it irrelevant.

You might think: just send the worker a “never mind” message. You can’t. Here’s the same run-to-completion rule one level down: a worker processes one onmessage handler at a time, to completion, before it reads the next message. While the worker is inside JSON.stringify, its message queue is frozen for exactly the same reason the main thread’s was. Your “cancel” message sits unread behind the work it’s trying to cancel. The only way to make a busy worker stop is, again, to kill it.

So the JSON host terminates and respawns on every supersede:

run(input, action, indent, sortKeys): Promise<JsonOutcome> {
  // A newer request supersedes the in-flight one, whichever path it takes.
  // Doing this first means a small edit also cancels a stale large format.
  if (this.pending) { this.pending.reject(new Error("CANCELLED")); this.pending = null; }
  if (this.worker) { this.worker.terminate(); this.worker = null; }

  // Small inputs skip the worker entirely (see threshold below).
  // input.length counts UTF-16 code units, used here as a cheap stand-in for bytes.
  if (input.length < INLINE_THRESHOLD_BYTES) {
    return Promise.resolve(formatJson(input, indent, sortKeys));
  }

  const worker = new Worker(new URL("./json.worker.ts", import.meta.url), { type: "module" });
  this.worker = worker;

  return new Promise((resolve, reject) => {
    this.pending = { resolve, reject };
    worker.onmessage = (event) => {
      if (this.worker !== worker) return;   // a stale worker's late reply
      this.pending = null;
      resolve(event.data);
    };
    worker.postMessage({ input, action, indent, sortKeys });
  });
}

The comment in the original source says it plainly: the V8 main loop can’t be interrupted mid-JSON.stringify, so termination is the only way to free the thread. Same primitive, different trigger. Pattern 1 pulls the trigger on a clock. Pattern 2 pulls it when a newer request arrives.

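Note the if (this.worker !== worker) return guard. There’s a race: you terminate worker A and spawn worker B, but a message event from A may already have been in the main thread’s queue before you killed it. A’s own promise was rejected at supersede, so its late resolve is a no-op, but without the guard A’s handler would still set this.pending to null, and by then that slot belongs to B. If a still-newer request then supersedes B, there is nothing left to reject, and B’s promise hangs forever. The guard says “only accept a reply from the worker I currently consider current.” Small line, real bug if it’s missing.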
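
The guard is easy to exercise without a browser by swapping the worker for a fake. A sketch, with SupersedingRunner, ReplyingWorker, FakeWorker, and the injected spawn factory all invented for illustration (the real host constructs a module worker instead):

```typescript
// Hypothetical, simplified host; the injected factory replaces `new Worker(...)`.
interface ReplyingWorker {
  onmessage: ((event: { data: string }) => void) | null;
  postMessage(message: string): void;
  terminate(): void;
}

class SupersedingRunner {
  private worker: ReplyingWorker | null = null;
  private pending: { resolve: (v: string) => void; reject: (e: Error) => void } | null = null;

  constructor(private readonly spawn: () => ReplyingWorker) {}

  get busy(): boolean { return this.pending !== null; }

  run(input: string): Promise<string> {
    // Supersede: cancel the waiting promise and kill the old worker.
    this.pending?.reject(new Error("CANCELLED"));
    this.pending = null;
    this.worker?.terminate();

    const worker = this.spawn();
    this.worker = worker;
    return new Promise((resolve, reject) => {
      this.pending = { resolve, reject };
      worker.onmessage = (event) => {
        if (this.worker !== worker) return; // late reply from a superseded worker
        this.pending = null;
        resolve(event.data);
      };
      worker.postMessage(input);
    });
  }
}

class FakeWorker implements ReplyingWorker {
  onmessage: ((event: { data: string }) => void) | null = null;
  terminated = false;
  postMessage(_message: string): void {}
  terminate(): void { this.terminated = true; }
}

const workerA = new FakeWorker();
const workerB = new FakeWorker();
const queue = [workerA, workerB];
const runner = new SupersedingRunner(() => queue.shift()!);

const first = runner.run("old input");
first.catch(() => {}); // superseded requests reject with CANCELLED
const second = runner.run("new input");

// Simulate A's reply arriving after A was terminated.
workerA.onmessage?.({ data: "stale" });
const busyAfterStaleReply = runner.busy;
console.log(workerA.terminated, busyAfterStaleReply); // true true: B is still pending

workerB.onmessage?.({ data: "fresh" });
second.then((value) => console.log(value)); // fresh
```

Deleting the guard line in this sketch makes busyAfterStaleReply come out false, which is the clobbered-slot bug in miniature.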

This one is a module worker (new URL("./json.worker.ts", import.meta.url) with { type: "module" }), not a Blob, because it imports the same formatJson the main thread uses. You do not want two copies of your formatter, one for the fast path and one for the worker, drifting apart. Sharing the module keeps them honest.

The trade-offs

Moving work to a worker is not free, and pretending otherwise leads to putting everything on a worker, which is its own performance bug.

  • Serialization cost. postMessage copies its payload via the structured clone algorithm. For a multi-megabyte string that’s a real, measurable copy on the way in and another on the way out. For small inputs this copy plus the worker spawn costs more than just doing the work inline. That’s why the JSON path has INLINE_THRESHOLD_BYTES = 100_000: below 100 KB, the round trip is the slow part, so we skip the worker and format on the main thread synchronously. The worker earns its keep only when the work is big enough to dwarf the overhead.
  • Workers can’t touch the DOM. This pattern is for pure compute: parse, format, match, transform. Anything that needs the DOM has to come back to the main thread first. If your “slow operation” is actually slow because it’s thrashing layout, a worker won’t help, and the real fix is elsewhere.
  • You lose partial progress. terminate() is total. The half-built string, the matches found so far, all gone. For our cases that’s correct (a superseded format is worthless; a backtracking regex never produced anything useful). If your work is incremental and you’d want to keep partial results, termination is the wrong tool and you need genuinely cooperative chunking instead.
  • Cold start. The first new Worker(...) pays for fetching and initializing the worker module. We spawn lazily, on first real use, so it’s hidden behind the user’s own typing rather than added to page load.
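
The threshold check from the first bullet can be sketched as a small decision function. One caveat worth making explicit: a string’s length counts UTF-16 code units, not bytes, so a version that wants a true byte count has to measure, for instance with TextEncoder. choosePath and utf8ByteLength are hypothetical names; only the 100,000 figure comes from the text above.

```typescript
const INLINE_THRESHOLD_BYTES = 100_000; // figure from the post

// Hypothetical helper: UTF-8 size of a string, as opposed to its UTF-16 length.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: decide where the work should run.
function choosePath(input: string): "inline" | "worker" {
  // UTF-8 never uses fewer bytes than there are UTF-16 code units, and never
  // more than three bytes per code unit, so most inputs can be classified
  // without paying to encode them.
  if (input.length >= INLINE_THRESHOLD_BYTES) return "worker";
  if (input.length * 3 < INLINE_THRESHOLD_BYTES) return "inline";
  return utf8ByteLength(input) < INLINE_THRESHOLD_BYTES ? "inline" : "worker";
}

console.log(choosePath('{"a":1}'));               // inline
console.log(choosePath("x".repeat(100_000)));     // worker
console.log(choosePath("\u00e9".repeat(60_000))); // worker (120,000 bytes in UTF-8)
```

Whether the extra precision matters is a judgment call. A threshold like this is a heuristic, and comparing input.length directly is cheaper and usually close enough.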

When you don’t need any of this

If the input is bounded and you produced it yourself, do the work inline and move on. The whole apparatus above exists for two specific conditions: the work might not finish in a sane time (untrusted regex, untrusted document size), or a newer request makes an in-flight one pointless (live editing of large input). If neither is true, a worker is just latency and serialization cost with extra files.

The tell that you’ve crossed the line is the one we started with: a setTimeout-based deadline that never fires, because the thing you’re trying to time out is starving the timer. The moment you catch yourself writing a watchdog on the same thread as the thing it watches, stop. The watchdog needs a free thread, and the work needs a thread you can kill, and in the browser that means a worker and terminate().

Why this lives in a browser tool at all

There’s a server-shaped version of all of this. Ship the regex and the test string to a backend, run it there with a real timeout (or a sandboxed process you can SIGKILL), return the result. It works, and it moves the freeze off the user’s tab.

It also means the user’s regex, and more to the point the data they’re testing it against, left their machine. For a developer pasting a production log line to check a pattern against it, that’s the whole question. Doing it in a worker keeps the hostile-input problem and its solution on the user’s device: the tab stays responsive, the pathological pattern gets killed at two seconds, and nothing was uploaded to make that happen. The kill switch is local, which is the only kind of kill switch that doesn’t require trusting us.

The tools this came out of are the regex tester (timeout-kill) and the JSON formatter (cancel-on-supersede). Same primitive, two triggers.