Reason #714 that I’m loving F#: Discriminated Unions

The more experience I gain with F#, the more I like it. So, when thinking about how I might convince someone to give it a try, I briefly considered which feature of the language might be most compelling, and quickly decided on Discriminated Unions.

I’ll try to explain the value of Discriminated Unions by walking you through an example, rather than trying to define them in a paragraph. I will walk through the example from a C#-centric perspective, because that is where most of my experience, and that of my peers, tends to be.

Exceptions

Consider the following C# code snippet which is designed to throw various exceptions if it is unable to perform its task:

public decimal CalculateSalesTax(Invoice invoice)
{
    if (invoice == null) throw new ArgumentNullException("invoice");

    if (invoice.Customer.Address.State == null)
    {
        throw new InvalidOperationException("State is required for sales tax calculation!");
    }

    // Pretend sales tax calculation
    return 1.23m;
}

Teams typically struggle to design code that consistently handles exceptions in a way that both gives the user of the software a sensible response and preserves the details about the cause and origin of the exception for future maintainers. Designing code to sensibly handle exceptions is both an art and a science, and few teams manage to come close to getting it right.

Additionally, when using someone else’s code (such as a 3rd party library) it can be difficult or impossible to know or anticipate every type of exception that the code might throw. Therefore it can be difficult or impossible to design your code with certainty that some unknown exception type will not make it crash in the future. Extensive testing (both manual and automated) can help you discover errors that your code might encounter, but you can never be completely certain that you’ve handled all scenarios until you’ve handed your code off to customers and your customers have been using your software forever.

Null References

Null references can become the bane of programming in C#. Until software developers have mastered the right combinations of Null Object Patterns, Invariant Protections, and contextual constraints, any object reference can be a potentially application-crashing null reference.

Many software development teams fail to master these techniques, and instead resort to what has been cleverly termed the Fly Swallower Anti-Pattern. That is to say, nearly every object reference is checked for null, and unfortunately, the context leading to the object’s nullness in the first place is not fixed.

The code sample above can be modified for a nicely typical example:

public decimal CalculateSalesTax(Invoice invoice)
{
    if (invoice != null
        && invoice.Customer != null
        && invoice.Customer.Address != null
        && invoice.Customer.Address.State != null)
    {
        // Pretend sales tax calculation
        return 1.23m;
    }

    return 0m;
}

It might look ugly, you might know better than to ever do anything like this, but man, I see this sort of thing all the time. This tendency can be even more prevalent in void methods because the compiler does not enforce that the method returns any value or even performs any operation at all.

Both of these problems, Exceptions and Null references, can be virtually eliminated, or at least handled far, far more gracefully and at compile time, using F# and Discriminated Unions.

Consider this F# code snippet:

type Option<'a> =       
   | Some of 'a         
   | None   

… and replace the <'a> with <T> in your head if it helps, to translate the generic type parameter into C#-ese. We can use this discriminated union to write code as follows:

type CustomerAddress = {
    HouseNumber : int
    StreetName : string
    State : string    
}

type Invoice = {
      Address : CustomerAddress Option 
}

let calculateSalesTax invoice =
    match invoice.Address with
    | Some x -> 10
    | None -> failwith "Need a customer address to calculate sales tax!"

(Ignore for the moment that failwith is basically throwing an exception. We will first address the issue of the possible nullness of the Address property of the invoice record, and then will provide a more elegant solution to the exception throwing in a bit.)

So far, having declared our Address as a CustomerAddress Option prevents us (and future developers) from failing to consider that the Address portion of the Invoice record is optional. The compiler prevents us from doing this:

let calculateSalesTax invoice =
    match invoice.Address.State with
    | "WA" -> 10
    | _ -> 0

This code snippet fails at compile time with the error: Type constraint mismatch. The type ‘Option<CustomerAddress>’ is not compatible with the type ‘CustomerAddress’.

So in other words, the compiler makes it absolutely impossible for you, or any other developer, to neglect to consider that the Invoice’s CustomerAddress might or might not exist.
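For contrast, here is a minimal sketch of a version that does compile, because it unwraps the option before touching State. It assumes F#'s built-in option type and re-declares the two record types so that it stands on its own; the rate of 10 for "WA" is the same placeholder value used above, not a real tax rate.

```fsharp
type CustomerAddress = {
    HouseNumber : int
    StreetName : string
    State : string
}

type Invoice = {
    Address : CustomerAddress option
}

let calculateSalesTax invoice =
    match invoice.Address with
    // An address exists and its State is "WA"
    | Some { State = "WA" } -> 10
    // An address exists, but for some other state
    | Some _ -> 0
    // No address at all; the compiler warns if this case is left out
    | None -> failwith "Need a customer address to calculate sales tax!"
```

The State field can only be reached from inside a Some case, so there is no way to read it without first deciding what happens when the address is missing.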

You can still encounter null references just as readily when interacting with any other .NET code, but at the very least you can limit your code’s awareness of null references to the F# boundary by converting every reference you receive into an Option type.

And oh yeah, although creating a discriminated union for optional references is no more difficult than the sample declaration above, this particular type is already built right into the F# language for you.
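As a rough sketch of that boundary conversion, the F# core library provides Option.ofObj, which maps a null reference to None and anything else to Some. The names stateFromCSharp and describe below are hypothetical, made up purely for illustration.

```fsharp
// Hypothetical stand-in for a value handed over by C# code; it may be null.
let stateFromCSharp : string = null

// Option.ofObj turns null into None and any other reference into Some.
let state = Option.ofObj stateFromCSharp

// From here on, the compiler insists that both cases are considered.
let describe (state : string option) =
    match state with
    | Some s -> sprintf "State is %s" s
    | None -> "State is missing"
```

Doing this once, at the point where the value enters the F# code, means nothing further downstream ever has to check for null.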

So now we want to tackle how to improve the failwith above to be something a bit less error prone. For this, I will use a success/failure discriminated union which I have ganked from the site fsharpforfunandprofit.com, which itself is a really fantastic resource:

type Result<'TSuccess,'TFailure> = 
    | Success of 'TSuccess
    | Failure of 'TFailure  

The sales tax calculation can then be modified to use the success/failure discriminated union as follows:

let calculateSalesTax invoice =
    match invoice.Address with
    | Some x -> Success 10
    | None -> Failure "Need a customer address to calculate sales tax!"

Now, it is impossible to neglect to account for the fact that calculateSalesTax can fail in some situations. The following code snippet:

let processInvoice invoice =
    let salesTax = calculateSalesTax invoice
    let productsTotal = 100 // todo: leverage F#'s unit of measure feature
    let invoiceTotal = productsTotal + salesTax // problem occurs here
    invoiceTotal

… results in the compilation error: The type ‘Result<int,string>’ does not match the type ‘int’

To get the code to compile, both Success and Failure cases must be accounted for, such as follows:

let processInvoice invoice =
    let salesTax = calculateSalesTax invoice
    let productsTotal = 100 // todo: leverage F#'s unit of measure feature
    match salesTax with
    | Success x -> Success (productsTotal + x)
    | Failure x -> Failure x

… which will require the next caller in the call stack to account for the potential failure of the operation, and so on.

There are far more elegant ways to reap these exact same benefits without having a match … with | Success | Failure explosion in your code, but I’ll save that for a future blog post, or perhaps just refer you once again to the excellent material on fsharpforfunandprofit.com.
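As a small taste of what that can look like, here is one possible sketch: a map helper that applies a function to the Success value and passes any Failure straight through. The map function is my own illustration rather than something taken from that site, and calculateSalesTax is simplified here to take an optional state string so that the snippet is self-contained.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// Apply f to the value inside Success; pass a Failure through untouched.
let map f result =
    match result with
    | Success x -> Success (f x)
    | Failure e -> Failure e

// Simplified, hypothetical stand-in for the calculateSalesTax shown earlier.
let calculateSalesTax (state : string option) =
    match state with
    | Some _ -> Success 10
    | None -> Failure "Need a customer address to calculate sales tax!"

let processInvoice state =
    let productsTotal = 100
    calculateSalesTax state
    |> map (fun salesTax -> productsTotal + salesTax)
```

The match on Success and Failure now lives in one place, inside map, and processInvoice reads as a straight pipeline while still returning a Result that its caller has to deal with.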

Hopefully I’ve given at least a good enough overview to illustrate to an experienced C# developer how some of these F# techniques can be used to create less error-prone code. Huge classes of errors that plague C# code can be eliminated at compile time in F#.