Wrapping My Head Around Rust Enums

I've already written some about my first impressions of Rust, but there's one feature of the language I use all the time but still don't fully understand: enums.

Traditionally (maybe better put: in languages I've used in the past), the enum keyword designated a type with a fixed set of values, and that type often behaves as a set of constants. Since I talked about pre-commit last time, I'll keep the discussion to git hooks for my examples.

This post is maybe less about how to effectively use Rust enums and more about how I've struggled to untangle my previous ideas of enums.


Java Style Enums

In Java, one might describe the different hooks available as an enum:

public enum Hook {
    COMMIT_MESSAGE,
    POST_CHECKOUT,
    POST_COMMIT,
    POST_MERGE,
    PREPARE_COMMIT_MESSAGE,
    PRE_COMMIT,
    PRE_MERGE_COMMIT,
    PRE_PUSH,
    ;
}

From this collection of constants, I can do predictable things. I can use switch statements or expressions to match the instance being passed into a function without the complexity of managing hardcoded strings across the codebase. But more importantly, it collects all similar, constant behaviors into one namespace and location: the Hook class.

I can use this to leverage the compiler to tell me when I need to implement new functionality in the places I use the enum. For example, this function:

public static String name(Hook hook) {
    return switch (hook) {
        case COMMIT_MESSAGE -> "commit-msg";
        case POST_CHECKOUT -> "post-checkout";
        case POST_COMMIT -> "post-commit";
        case POST_MERGE -> "post-merge";
        case PREPARE_COMMIT_MESSAGE -> "prepare-commit-msg";
        case PRE_COMMIT -> "pre-commit";
        case PRE_MERGE_COMMIT -> "pre-merge-commit";
        case PRE_PUSH -> "pre-push";
    };
}

If someone were to remove a git hook from the enum, compilation would fail because that enum element would no longer exist. If someone added a new hook, compilation would fail because my switch expression wouldn't have a case for the new hook.

Now, that's not really how you'd do this in Java. In Java you'd know that each enum element is an instance of the enum class, and it is guaranteed to be a singleton. It is a constant that can carry arbitrary functionality. Just for example, I could take that name function above and turn it into a property on the enum. I could then use that name to turn a String into an enum entry:

import java.util.Optional;
import java.util.stream.Stream;

public enum Hook {
    COMMIT_MESSAGE("commit-msg"),
    POST_CHECKOUT("post-checkout"),
    POST_COMMIT("post-commit"),
    POST_MERGE("post-merge"),
    PREPARE_COMMIT_MESSAGE("prepare-commit-msg"),
    PRE_COMMIT("pre-commit"),
    PRE_MERGE_COMMIT("pre-merge-commit"),
    PRE_PUSH("pre-push"),
    ;

    public final String name;

    Hook(String name) {
        this.name = name;
    }

    public static Optional<Hook> fromName(String name) {
        // Search every constant for one whose name matches, ignoring case.
        return Stream.of(Hook.values())
            .filter(hook -> hook.name.equalsIgnoreCase(name))
            .findFirst();
    }
}

There are a few very important things to note here (it will sound very repetitive):

  1. Enum elements are constant, so their fields will always have the same value.
  2. Because enum fields and functions provide static functionality, I can load all the elements of an enum into an array with Hook.values().
  3. Because I can load all elements of an enum into an array, I can iterate over all those elements. And because they all provide static functionality, I don't have to specify the "pre-commit" string in more than one place: it will always be available on the enum element.

There are many other things to explore with Java enums, but that's not what this post is about. This post is about Rust and its enum type, and I think this example is already complex enough for me to show how I struggled with Rust enums.

Rust enums are tagged union types. They are not the traditional enum despite sharing a name. The elements of a Rust enum do not have to adhere to the same interface; they are only brought together into the same namespace.
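To make "tagged union" a bit more concrete, here is a minimal sketch. HookOutcome and summarize are hypothetical names invented for illustration; they are not part of pre-commit or of the code in this post. The point is only that each variant can carry a different shape of data, and a match has to deal with each shape separately.

```rust
// Hypothetical example type: the result of running a single git hook.
// Each variant carries a different shape of data (or none at all).
#[derive(Debug, PartialEq)]
pub enum HookOutcome {
    // Unit variant: no data, much like a classic enum constant.
    Passed,
    // Tuple variant: carries the reason the hook was skipped.
    Skipped(String),
    // Struct variant: carries named fields.
    Failed { exit_code: i32, stderr: String },
}

// The match must handle every variant, and each arm only sees the data
// that belongs to its own variant.
pub fn summarize(outcome: &HookOutcome) -> String {
    match outcome {
        HookOutcome::Passed => String::from("passed"),
        HookOutcome::Skipped(reason) => format!("skipped: {}", reason),
        HookOutcome::Failed { exit_code, stderr } => {
            format!("failed with exit code {}: {}", exit_code, stderr)
        }
    }
}
```

There is no way to ask a Passed value for its exit_code: the field simply does not exist on that variant, which is the "no shared interface" part.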

If you didn't need to attach data to a Rust enum, you could use it in the same way you would a Java enum:

#[derive(Debug, PartialEq)]
pub enum Hook {
    CommitMessage,
    PostCheckout,
    PostCommit,
    PostMerge,
    PreCommit,
    PreMergeCommit,
    PrePush,
    PrepareCommitMessage,
}

impl Hook {
    fn name(&self) -> &'static str {
        match self {
            Hook::CommitMessage => "commit-msg",
            Hook::PostCheckout => "post-checkout",
            Hook::PostCommit => "post-commit",
            Hook::PostMerge => "post-merge",
            Hook::PreCommit => "pre-commit",
            Hook::PreMergeCommit => "pre-merge-commit",
            Hook::PrePush => "pre-push",
            Hook::PrepareCommitMessage => "prepare-commit-msg",
        }
    }
}

Pretty straightforward: none of the functionality is attached to the enum entry itself, but a method can match on the entry to provide consistent behavior if needed.

The problem comes when I need to both read and write the same string for an enum. In Java, I could attach a string to the enum entry itself, but in Rust I cannot do the same thing. Even more problematic, because enums in Rust are not constants, I cannot iterate over all the elements of the enum! Some of them might have different fields, and if those fields don't have defaults, you have no way to construct that type. So I bit the bullet and just wrote a separate function:

use std::str::FromStr;

impl FromStr for Hook {
    type Err = String;

    fn from_str(input: &str) -> Result<Hook, Self::Err> {
        match input.to_lowercase().as_str() {
            "commit-msg" => Ok(Hook::CommitMessage),
            "post-checkout" => Ok(Hook::PostCheckout),
            "post-commit" => Ok(Hook::PostCommit),
            "post-merge" => Ok(Hook::PostMerge),
            "pre-commit" => Ok(Hook::PreCommit),
            "pre-merge-commit" => Ok(Hook::PreMergeCommit),
            "pre-push" => Ok(Hook::PrePush),
            "prepare-commit-msg" => Ok(Hook::PrepareCommitMessage),
            _ => Err(format!("Unrecognized hook {}", input)),
        }
    }
}

This all worked, but it was fragile. While I did have the compiler to make sure I could read from every element in the enum, it wouldn't help me make sure I had a way to construct every element in the enum. I ended up writing tests to make sure I could do a string round trip:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn pre_commit_string_and_back() {
        let pre_commit = Hook::PreCommit;
        assert_eq!(pre_commit, Hook::from_str(pre_commit.name()).unwrap());
    }
}

These weren't the best tests, and they were fragile too, because I had no way to make sure I was testing every enum variant. What would happen if I wanted to add extra functions to my enum? Without inheritance, it's not possible, at least not in the manner that Java allows.
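One possible way to reduce the fragility, sketched under some assumptions: when no variant carries data, a hand-maintained constant array can stand in for Java's Hook.values(). The ALL constant and the all_names_round_trip helper below are hypothetical additions, not something the language provides, and the compiler will not notice a variant that is missing from the array. Crates such as strum offer derive macros for this kind of iteration, but the sketch sticks to the standard library. It also swaps to_lowercase for an ASCII-only case-insensitive comparison.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Hook {
    CommitMessage,
    PostCheckout,
    PostCommit,
    PostMerge,
    PreCommit,
    PreMergeCommit,
    PrePush,
    PrepareCommitMessage,
}

impl Hook {
    // Hypothetical, hand-maintained list of every variant. This only works
    // because no variant carries data, and nothing forces it to stay complete.
    pub const ALL: [Hook; 8] = [
        Hook::CommitMessage,
        Hook::PostCheckout,
        Hook::PostCommit,
        Hook::PostMerge,
        Hook::PreCommit,
        Hook::PreMergeCommit,
        Hook::PrePush,
        Hook::PrepareCommitMessage,
    ];

    // The only place the hook name strings are written down.
    pub fn name(&self) -> &'static str {
        match self {
            Hook::CommitMessage => "commit-msg",
            Hook::PostCheckout => "post-checkout",
            Hook::PostCommit => "post-commit",
            Hook::PostMerge => "post-merge",
            Hook::PreCommit => "pre-commit",
            Hook::PreMergeCommit => "pre-merge-commit",
            Hook::PrePush => "pre-push",
            Hook::PrepareCommitMessage => "prepare-commit-msg",
        }
    }
}

impl FromStr for Hook {
    type Err = String;

    // Parse by searching ALL for a variant whose name matches, so the
    // strings are not repeated in a second match.
    fn from_str(input: &str) -> Result<Hook, Self::Err> {
        Hook::ALL
            .iter()
            .copied()
            .find(|hook| hook.name().eq_ignore_ascii_case(input))
            .ok_or_else(|| format!("Unrecognized hook {}", input))
    }
}

// Round-trip check over every variant listed in ALL.
pub fn all_names_round_trip() -> bool {
    Hook::ALL
        .iter()
        .all(|hook| Hook::from_str(hook.name()) == Ok(*hook))
}
```

With this shape, a test can assert all_names_round_trip() once instead of naming variants one at a time. The trade-off is that the array is now the thing to remember when a variant is added.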


Change of Thinking

I spent a disturbing amount of time fighting this concept in my head. At every turn, every solution I came up with still failed to deliver one of the most basic features of Java enums: the ability to enumerate all the different variants.

And then some time last week it clicked. This wasn't really what Rust enums are for. An enum in Rust is more similar to a sealed class in Java, and yet despite knowing this going in I never could make the leap. I'm still no master of the enum, but I'm starting to appreciate how it allows you to group abstract concepts instead of only constant behaviors.

In the pre-commit world, a hook can be from a plugin or one you wrote yourself. These hooks are enumerated in roughly the same location of the configuration file and definitely represent the same concept, but their actual configuration differs wildly: a plugin only requires an identifier, while a local hook requires the shell command to execute. The simplest form of this is perfectly modeled by a union type, which is where Rust enums shine (apologies for the name conflict):

pub enum Hook {
    Remote {
        id: String,
    },
    Local {
        id: String,
        name: String,
        entry: String,
        args: Vec<String>,
    },
}

This is a very basic example, but it's what started me on the path to using Rust enums in a way that makes more sense. I can think about groups of objects (in the form of enums) as separate from groups of behaviors (in the form of traits). The problem is that I've spent most of my programming life forced to think about these as the same thing, and it'll just take time before I clear the next hurdle and really "grok" the Rust type system.
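As a rough sketch of that split between objects and behaviors: the enum below is the one from above, while the Describe trait, the Repository struct, and the output strings are hypothetical names made up for illustration. The enum says which kinds of hook exist; the trait says what can be done, and it applies just as well to a type that is not a hook at all.

```rust
// The group of objects: the kinds of hook that can appear in the config.
pub enum Hook {
    Remote {
        id: String,
    },
    Local {
        id: String,
        name: String,
        entry: String,
        args: Vec<String>,
    },
}

// Hypothetical unrelated type, to show the trait is not tied to Hook.
pub struct Repository {
    pub url: String,
}

// The group of behaviors: anything that can describe itself.
pub trait Describe {
    fn describe(&self) -> String;
}

impl Describe for Hook {
    fn describe(&self) -> String {
        match self {
            // A remote hook only has an identifier to report.
            Hook::Remote { id } => format!("remote hook {}", id),
            // A local hook reports the command line it would run.
            Hook::Local { name, entry, args, .. } => {
                let mut command = vec![entry.as_str()];
                command.extend(args.iter().map(String::as_str));
                format!("local hook {} runs `{}`", name, command.join(" "))
            }
        }
    }
}

impl Describe for Repository {
    fn describe(&self) -> String {
        format!("repository {}", self.url)
    }
}
```

Adding a third kind of hook means adding a variant and letting the compiler point at every match that needs a new arm. Adding a new behavior means adding a trait, without touching the enum definition.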