Making impossible states impossible

Beyond a programming language's type system

Early in my software development journey I came across a talk by Richard Feldman titled “Making Impossible States Impossible.” The talk explored how a programming language’s type system can be leveraged so that impossible states cannot even be represented.

That idea stuck with me and, over the years, I’ve been applying it way beyond the type system. It has helped me build and maintain resilient systems, preventing bugs that would otherwise be much harder to find.

Impossible states

An impossible state is a state that, according to your system’s requirements, should have no way of being represented.

Take these two basic requirements, for example:

All users in my system have an email
An order cannot be shipped if no payment has been confirmed

These requirements imply the following impossible states:

A user exists in the system without an email
An order has been shipped without payment confirmation

How can we design a system so that these theoretically impossible states are indeed impossible?
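At the type level, which is where Feldman’s talk starts, the two requirements above could be sketched roughly like this. The names User, Order, and shipOrder are hypothetical and exist only for illustration.

```typescript
// Hypothetical types for illustration only.

// Requirement 1: every user has an email, so the field is required, never optional.
interface User {
  id: number;
  email: string;
}

// Requirement 2: an order cannot be shipped before payment is confirmed.
// Each stage is its own shape, and a shipped order must carry the payment timestamp.
type PendingOrder = { status: "pending"; id: number };
type PaidOrder = { status: "paid"; id: number; paymentConfirmedAt: Date };
type ShippedOrder = { status: "shipped"; id: number; paymentConfirmedAt: Date; shippedAt: Date };
type Order = PendingOrder | PaidOrder | ShippedOrder;

// Only a PaidOrder is accepted, so shipping an unpaid order does not type-check.
function shipOrder(order: PaidOrder, shippedAt: Date): ShippedOrder {
  return { status: "shipped", id: order.id, paymentConfirmedAt: order.paymentConfirmedAt, shippedAt };
}

const user: User = { id: 1, email: "ada@example.com" };
const paid: PaidOrder = { status: "paid", id: 42, paymentConfirmedAt: new Date() };
const orders: Order[] = [paid, shipOrder(paid, new Date())];
console.log(user.email, orders.map((order) => order.status));

// Both of these are rejected by the compiler:
// const noEmail: User = { id: 2 };
// shipOrder({ status: "pending", id: 43 }, new Date());
```

Types only protect code that is compiled against them. Data that arrives from a database, an API payload, or another service needs its own guards.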

Database schema design

When defining a relational database table schema, we need to choose the types of the columns and define primary keys, foreign keys, column nullability, and constraints. These are all tools we can use to make our model conform to the requirements. If all users of our application are expected to have an email but the users table has a nullable email column, then the requirement is not properly enforced. A theoretically impossible state is, in practice, possible.

What if the “Sign Up” flow already validates the email, ensuring that no user can be created without one?

There are a few assumptions in that line of thinking:

There are no bugs in that flow that could allow users to register without an email
There is no bug in dealing with the payload the client sends to the application
No one will delete the email in a different application flow after it is created
No one will access the database and manually delete it

If the email column is not nullable, all of the above can still happen, but each will result in an error. We’ll be alerted and can fix whatever hole allowed the error to happen. We’ll catch bugs earlier, and it becomes impossible to have users without emails in our system.
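To make that concrete without needing a database, here is a small in-memory stand-in for a users table. The UsersTable class and its insert, update, and find methods are hypothetical. A real database would enforce the same rule itself through a NOT NULL constraint on the email column, and this sketch only mimics that behavior.

```typescript
// Hypothetical in-memory stand-in for a `users` table whose `email` column is NOT NULL.
type UserRow = { id: number; email: string };

class UsersTable {
  private rows = new Map<number, UserRow>();

  // Input is deliberately loose, like a payload coming from a buggy flow.
  insert(row: { id: number; email?: string | null }): void {
    this.rows.set(row.id, { id: row.id, email: this.requireEmail(row.email) });
  }

  update(id: number, fields: { email?: string | null }): void {
    const existing = this.rows.get(id);
    if (!existing) throw new Error(`No user with id ${id}`);
    if ("email" in fields) existing.email = this.requireEmail(fields.email);
  }

  find(id: number): UserRow | undefined {
    return this.rows.get(id);
  }

  // Mimics the constraint: the write is rejected before any data changes.
  private requireEmail(email: string | null | undefined): string {
    if (email === null || email === undefined) {
      throw new Error("Constraint violation: users.email cannot be null");
    }
    return email;
  }
}

const users = new UsersTable();
users.insert({ id: 1, email: "ada@example.com" });

// A flow that tries to clear the email fails loudly at write time.
try {
  users.update(1, { email: null });
} catch (error) {
  console.log((error as Error).message);
}
```

Like NOT NULL itself, this check only rejects missing values. An empty string would still pass, so a rule such as “must look like an email” would need a separate constraint.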

In the alternative scenario, where we don’t catch these issues immediately, we now have an impossible state in our system that we’re not yet aware of. We’ll have all kinds of flows built on the assumption that an email is always present, and they will break when they run into that user.

But when will that happen? If we’re lucky, it’ll happen soon. Otherwise, it can happen months down the line, and we’ll be left wondering how it came about. Was there a fault in the sign-up flow at the time the user was created? Is the fault still present or has it been fixed in the meantime? How do we remediate this? Can we delete the user, force them to fill in an email, or will the system need to handle users without emails from now on?

Application code

Here’s a very simplified implementation of an attemptDelivery function that moves a package’s status from preparing to in_transit.

type PackageStatus = "preparing" | "in_transit" | "delivered";

interface Package {
  status: PackageStatus;
  update(fields: { status: PackageStatus }): void;
}

function attemptDelivery(pkg: Package): void {
  switch (pkg.status) {
    case "preparing":
      pkg.update({ status: "in_transit" });
      break;
    case "in_transit":
      // No-op
      break;
    case "delivered":
      throw new Error("Cannot attempt delivery of already delivered package");
  }
}

This implementation does a good job of declaring its assumptions: instead of assuming the package’s status will be preparing, it exhausts all the possible status values and deals with each accordingly. It even documents an impossible state: the attemptDelivery action should never be called for packages that have already been delivered. We could argue the same should apply to in_transit, but making the operation idempotent seems reasonable. By raising an error, this state gets flagged as an issue that can be investigated and fixed.

While this approach handles all the possible statuses as of now, what would happen if a waiting_payment status were added in the future? An impossible state would be allowed through: none of the cases match, so the operation that attempts a delivery silently does nothing instead of failing for a package that has not been paid for yet, which is problematic. A good way to defend against such a scenario is to always exhaust the switch statement options and use the default branch to raise awareness of impossible states:

function attemptDelivery(pkg: Package): void {
  switch (pkg.status) {
    case "preparing":
      pkg.update({ status: "in_transit" });
      break;
    case "in_transit":
      // No-op
      break;
    case "delivered":
      throw new Error("Cannot attempt delivery of already delivered package");
    default:
      // Reached only by a status this function does not know about
      throw new Error(`Impossible state: status ${pkg.status} not recognized`);
  }
}
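In TypeScript specifically, the default branch can also be checked at compile time. The sketch below uses a hypothetical assertNever helper and a hypothetical nextStatus function, neither of which is part of the code above. Because the helper’s parameter has type never, adding a member such as waiting_payment to PackageStatus without handling it becomes a type error, while the runtime throw still covers values that bypass the compiler, such as a status read from a database.

```typescript
type PackageStatus = "preparing" | "in_transit" | "delivered";

// Hypothetical helper: it only type-checks when every status was handled above,
// because no remaining value is assignable to `never`.
function assertNever(value: never): never {
  throw new Error(`Impossible state: status ${String(value)} not recognized`);
}

// Hypothetical function: returns the status a package should have after a delivery attempt.
function nextStatus(status: PackageStatus): PackageStatus {
  switch (status) {
    case "preparing":
      return "in_transit";
    case "in_transit":
      return "in_transit"; // idempotent no-op
    case "delivered":
      throw new Error("Cannot attempt delivery of already delivered package");
    default:
      return assertNever(status);
  }
}

// Simulates a value that slipped past the compiler, for example a raw database column.
const rawStatus: string = "waiting_payment";
try {
  nextStatus(rawStatus as PackageStatus);
} catch (error) {
  console.log((error as Error).message);
}
```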

The same can be said for if-else scenarios. Say we have a package that has either a tracking_id or an external_tracking_url.

// getTrackingUrl and externalTrackingUrl are assumed to be defined elsewhere
function trackingUrl(pkg: { trackingId?: string }): string {
  if (pkg.trackingId) {
    return getTrackingUrl(pkg.trackingId);
  } else {
    return externalTrackingUrl();
  }
}

What happens when both tracking_id and external_tracking_url are not present? The assumption is that one will exist, but that might not always be true. Checking each option explicitly with an else if, and raising an error in the final else, is the more defensive approach and protects against impossible states.

// getTrackingUrl and externalTrackingUrl are assumed to be defined elsewhere
function trackingUrl(pkg: { trackingId?: string }): string {
  const externalUrl = externalTrackingUrl();
  if (pkg.trackingId) {
    return getTrackingUrl(pkg.trackingId);
  } else if (externalUrl) {
    return externalUrl;
  } else {
    throw new Error("Impossible state: both tracking_id and external_tracking_url are missing");
  }
}
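Another option, closer to the original type-system idea, is to model the two tracking sources so that “neither is present” cannot be constructed in the first place. This is a sketch under assumed names: the Tracking type, its kind field, and the URL format inside getTrackingUrl are all hypothetical.

```typescript
// Hypothetical model: a package has exactly one tracking source.
type Tracking =
  | { kind: "internal"; trackingId: string }
  | { kind: "external"; url: string };

// Placeholder URL format, assumed purely for illustration.
function getTrackingUrl(trackingId: string): string {
  return `https://example.com/track/${encodeURIComponent(trackingId)}`;
}

// Every variant is handled, and there is no variant with both fields missing.
function trackingUrl(tracking: Tracking): string {
  switch (tracking.kind) {
    case "internal":
      return getTrackingUrl(tracking.trackingId);
    case "external":
      return tracking.url;
  }
}

console.log(trackingUrl({ kind: "internal", trackingId: "abc-123" }));
console.log(trackingUrl({ kind: "external", url: "https://carrier.example/t/987" }));
```

Data loaded from outside the program still needs a runtime check before it becomes a Tracking value, so a throwing else branch remains useful at that boundary.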

Closing thoughts

These examples are some of the patterns I regularly use to catch impossible states as early as possible.
Still, they are as basic as it gets. Complex software systems have lots of intertwined requirements, which makes it much more difficult to avoid impossible states.

I usually ask these questions when designing new flows:

“What invalid states could this flow be allowing?”
“Which assumptions do I currently have that aren’t properly enforced?”

I find them good prompts to help me spot edge cases. Addressing those edge cases makes the software I build far more resilient.

Bugs still happen all the time. I just find them much sooner. And it always makes me smile when I see an alert with a message starting with “Impossible state”, because it means I just prevented a bunch of trouble down the line.