Learning Rust: Look Ma, No Null!

Introduction

Rust is a systems programming language (think C-like) that makes it easier to perform memory-safe operations than languages like C or C++. It accomplishes this by making it harder to do memory-unsafe operations–and catching these sorts of issues at compile-time instead of runtime.

In order to accomplish this, Rust imposes some constraints on the engineer through the borrow checker and immutable-by-default types. I’m not going to write about those things here as they have been covered in depth by others.

In this post (and other posts in this potential series), I want to focus on other language features and idioms that may be unfamiliar to managed-language developers.

No Null

Rust does not have the concept of null. Consider the following struct instantiation:

let employee = Employee {
    first_name: "Chris",
    last_name: "McKenzie",
};

In this code sample, an Employee struct is created immediately. There is no way to create a variable of type Employee and use it before a value has been assigned. This might not seem terrible on its face, but what about data we don’t have? What if we add a birth date field to the struct but we don’t know what it is?

struct Employee {
    first_name: &'static str,
    last_name: &'static str,
    birth_date: &'static str,
}

let employee = Employee {
    first_name: "Chris",
    last_name: "McKenzie",
    birth_date: // not sure what to do here?
};

Option

For this scenario, Rust has a built-in enum called Option<T>. Option<T> has two variants, Some and None, both of which are automatically imported for you. To represent the above code, you would write this:

#[derive(Debug)]
struct Employee {
    first_name: String,
    last_name: String,
    birth_date: Option<String>
}

pub fn main() {
    let employee = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None
    };

    println!("{:?}", employee);
}

or

#[derive(Debug)]
struct Employee {
    first_name: String,
    last_name: String,
    birth_date: Option<String>
}

pub fn main() {
    let employee = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::Some(<value>)
    };

    println!("{:?}", employee);
}

Since Rust's prelude automatically imports std::option::Option along with its Some and None variants, the above can be rewritten without the Option:: prefix:

birth_date: None
...
birth_date: Some(<value>)

Am I Cheating?

Wait! Isn’t “None” just null under the covers?

Well–no. To understand the difference, you first need to understand a few things about enums.

The members of an enum are called variants. Variants can carry data, and variants must be matched.

Let’s examine what this means in practice for a moment.

Building an Enum

Consider the requirement that we represent both salaried and non-salaried employee types. The annual salary must be stored for salaried employees. The hourly rate, overtime rate, and normal hours must be stored for the hourly employee.

In managed languages such as C# you might represent that as follows:

public enum EmployeeType {
    Salaried,
    Hourly,
}

public class Employee {
    public string FirstName {get; set;}
    public string LastName {get; set;}
    public DateTime? BirthDate {get; set;}
    public EmployeeType Type {get; set;}

    // should only be stored for Salaried employees
    public double AnnualSalary {get; set;} 

    // should only be stored for Hourly employees
    public double HourlyRate {get; set;} 
    // should only be stored for Hourly employees
    public double OvertimeRate {get; set;} 
    // should only be stored for Hourly employees
    public int NormalHours {get; set;} 
}

This structure is fine but has the drawback that it’s possible to populate the class with data that’s invalid–e.g., setting the HourlyRate for a Salaried employee is a violation of the business rules–but nothing in the structure of the code prevents it. It’s also possible to fail to provide any of the needed information.

Obviously, you can enforce these rules through additional code, but again–nothing in the structure of the code prevents it–or even communicates the requirements.

I could do exactly the same thing in Rust of course and it doesn’t look much different:

#[derive(Debug)]
pub enum EmployeeType {
    Salaried,
    Hourly,
}

#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub last_name: String,
    pub birth_date: Option<String>,
    pub employee_type: EmployeeType,

    // should only be stored for Salaried employees
    pub annual_salary: Option<f32>,

    // should only be stored for Hourly employees
    pub hourly_rate: Option<f32>, 
    // should only be stored for Hourly employees
    pub overtime_rate: Option<f32>, 
    // should only be stored for Hourly employees
    pub normal_hours: Option<u8>, 
}

pub fn main() {
    let employee = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None,
        employee_type: EmployeeType::Salaried,
        annual_salary: None,
        hourly_rate: None,
        overtime_rate: None,
        normal_hours: None,
    };

    println!("{:?}", employee);
}
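To make "additional code" concrete, here is a minimal sketch of a runtime check for the flat layout. The validate function and its error messages are my own illustration, not part of any library, and I've trimmed the struct down to the pay-related fields to keep it short.

```rust
#[derive(Debug)]
pub enum EmployeeType {
    Salaried,
    Hourly,
}

// Trimmed copy of the flat struct: name and birth date fields omitted.
#[derive(Debug)]
pub struct Employee {
    pub employee_type: EmployeeType,
    pub annual_salary: Option<f32>,
    pub hourly_rate: Option<f32>,
    pub overtime_rate: Option<f32>,
    pub normal_hours: Option<u8>,
}

// Hypothetical runtime check for the business rules described in the comments.
pub fn validate(employee: &Employee) -> Result<(), String> {
    let any_hourly = employee.hourly_rate.is_some()
        || employee.overtime_rate.is_some()
        || employee.normal_hours.is_some();
    let all_hourly = employee.hourly_rate.is_some()
        && employee.overtime_rate.is_some()
        && employee.normal_hours.is_some();

    match employee.employee_type {
        EmployeeType::Salaried if employee.annual_salary.is_none() => {
            Err("salaried employee is missing annual_salary".to_string())
        }
        EmployeeType::Salaried if any_hourly => {
            Err("salaried employee must not carry hourly fields".to_string())
        }
        EmployeeType::Hourly if employee.annual_salary.is_some() => {
            Err("hourly employee must not carry annual_salary".to_string())
        }
        EmployeeType::Hourly if !all_hourly => {
            Err("hourly employee is missing hourly fields".to_string())
        }
        // Every rule passed for whichever type this is.
        EmployeeType::Salaried | EmployeeType::Hourly => Ok(()),
    }
}

fn main() {
    // Same shape as the employee above: Salaried, but no salary recorded.
    let salaried_without_salary = Employee {
        employee_type: EmployeeType::Salaried,
        annual_salary: None,
        hourly_rate: None,
        overtime_rate: None,
        normal_hours: None,
    };
    println!("{:?}", validate(&salaried_without_salary));
}
```

Notice that the salaried employee built above, with every pay field set to None, compiles without complaint and is only rejected when someone remembers to call the check.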

Ideally, we’d want to bind the annual salary information to the Salaried employee type, and the hourly information to the Hourly employee type. It turns out–we can!

#[derive(Debug)]
pub enum EmployeeType {
    Salaried { annual_salary: f32 },
    Hourly { rate: f32, overtime_rate: f32, normal_hours: u8 } ,
}

#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub last_name: String,
    pub birth_date: Option<String>,
    pub employee_type: EmployeeType,
}

pub fn main() {
    let salaried = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None,
        employee_type: EmployeeType::Salaried {
            // one, MILLION DOLLARS! Mwa ha ha ha ha ha ha! (I wish!)
            annual_salary: 1_000_000.00
        },
    };

    println!("{:?}", salaried);

    let hourly = Employee {
        first_name: "Fred".to_string(),
        last_name: "Bob".to_string(),
        birth_date: Some("1/1/1970".to_string()),
        employee_type: EmployeeType::Hourly {
            rate: 20.00, 
            overtime_rate: 30.00, 
            normal_hours: 40
        }
    };

    println!("{:?}", hourly);
}

Suddenly things get very interesting because we can express through the structure of the language what fields are required for which enum variant. Since all fields of a struct must be filled out when it is created, the compiler is able to force us to provide the correct data for the correct variant.

Nice!

Reading our enum variant

The match expression in Rust roughly corresponds to switch or an if-else chain in C# or Java. However, it becomes much more powerful in the context of enum variants that carry data.

#[derive(Debug)]
pub enum EmployeeType {
    Salaried { annual_salary: f32 },
    Hourly { rate: f32, overtime_rate: f32, normal_hours: u8 } ,
}

#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub last_name: String,
    pub birth_date: Option<String>,
    pub employee_type: EmployeeType,
}

pub fn main() {
    let salaried = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None,
        employee_type: EmployeeType::Salaried {
            // one, MILLION DOLLARS! Mwa ha ha ha ha ha ha! (I wish!)
            annual_salary: 1_000_000.00
        },
    };


    let hourly = Employee {
        first_name: "Fred".to_string(),
        last_name: "Bob".to_string(),
        birth_date: Some("1/1/1970".to_string()),
        employee_type: EmployeeType::Hourly {
            rate: 20.00, 
            overtime_rate: 30.00, 
            normal_hours: 40
        }
    };

    let employees = vec![salaried, hourly];

    for item in employees {
        // match on employee type
        match item.employee_type {
            EmployeeType::Salaried {annual_salary} => {
                println!("Salaried! {} {}: {}", item.first_name, item.last_name, annual_salary);
            },
            EmployeeType::Hourly {rate, overtime_rate: _, normal_hours: _ } => {
                println!("Hourly! {} {}: {}", item.first_name, item.last_name, rate);
            }
        }

    }

}

I need to point out a couple of things about match.

  1. Each pattern must account for all of the data in the variant. I can’t simply match on EmployeeType::Salaried by itself; I either bind the fields I want or explicitly ignore them, with _ for a single field or the .. rest pattern for all remaining fields.
  2. match requires that every variant of the enum be handled. It is possible to use "_" as a default arm, but ordinarily you would just handle all of the variants.
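Both points can be sketched in a few lines. The helper functions kind, weekly_base_pay, and is_hourly are hypothetical names I made up for illustration; the enum is the same one used above.

```rust
#[derive(Debug)]
pub enum EmployeeType {
    Salaried { annual_salary: f32 },
    Hourly { rate: f32, overtime_rate: f32, normal_hours: u8 },
}

// Hypothetical helper: names the variant without reading any of its data.
pub fn kind(employee_type: &EmployeeType) -> &'static str {
    match employee_type {
        // `..` ignores every field of the variant.
        EmployeeType::Salaried { .. } => "salaried",
        EmployeeType::Hourly { .. } => "hourly",
    }
}

// Hypothetical helper: binds only the fields it needs.
pub fn weekly_base_pay(employee_type: &EmployeeType) -> f32 {
    match employee_type {
        EmployeeType::Salaried { annual_salary } => annual_salary / 52.0,
        // `rate` and `normal_hours` are bound; `..` skips `overtime_rate`.
        EmployeeType::Hourly { rate, normal_hours, .. } => rate * f32::from(*normal_hours),
    }
}

// Hypothetical helper: `_` acts as the default arm for all other variants.
pub fn is_hourly(employee_type: &EmployeeType) -> bool {
    match employee_type {
        EmployeeType::Hourly { .. } => true,
        _ => false,
    }
}

fn main() {
    let hourly = EmployeeType::Hourly {
        rate: 20.0,
        overtime_rate: 30.0,
        normal_hours: 40,
    };
    println!("{} {} {}", kind(&hourly), weekly_base_pay(&hourly), is_hourly(&hourly));
}
```

If a third variant were added to EmployeeType later, kind and weekly_base_pay would stop compiling until the new variant was handled, while is_hourly would silently fall through to the "_" arm. That is the main reason to prefer listing the variants explicitly.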

Back to Option

The standard library defines Option roughly as follows:

pub enum Option<T> {
    /// No value
    None,
    /// Some value `T`
    Some(T),
}

None is a variant with no data. Some is a tuple-style variant carrying a single value of type T.

Anytime we match on an Option for processing, we are required to handle both cases. None is not the same as null. There will be no “Object reference not set to an instance of an object.” errors in Rust.
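Here is a small sketch of what handling both cases looks like for the birth date field. The describe_birth_date helper and its wording are hypothetical; match, if let, as_deref, and unwrap_or are all standard.

```rust
// Hypothetical helper: the compiler rejects this match if either arm is removed.
pub fn describe_birth_date(birth_date: &Option<String>) -> String {
    match birth_date {
        Some(date) => format!("born {}", date),
        None => "birth date unknown".to_string(),
    }
}

fn main() {
    let known: Option<String> = Some("1/1/1970".to_string());
    let unknown: Option<String> = None;

    println!("{}", describe_birth_date(&known));
    println!("{}", describe_birth_date(&unknown));

    // `if let` handles only the Some case when None needs no action.
    if let Some(date) = &known {
        println!("We have a date on file: {}", date);
    }

    // `unwrap_or` supplies a fallback instead of failing on None.
    let label: &str = unknown.as_deref().unwrap_or("n/a");
    println!("Birth date: {}", label);
}
```

In every form, getting at the String inside requires saying what happens when it isn't there.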

You can play with this code and run it in the playground.

Can This Idea Be Useful Elsewhere?

If you’ve not spent a decent amount of time tracking down null reference exceptions in managed languages, this may not be that interesting to you. On the other hand, the Optional value concept can help drive semantic clarity in your code even if you’re not working in Rust. I’m not arguing that you should stop using null in C# and Java. I am arguing that you should add the concept of Optional values to your toolbox.

The Optional library for C# provides an implementation of this type. It’s worth reviewing the README. Java introduced the Optional type in version 8.
