Learning Rust: Look Ma, No Exceptions!

Introduction

Rust is a systems programming language (think C-like) that makes it easier to perform memory-safe operations than languages like C or C++. It accomplishes this by making it harder to do memory-unsafe operations–and catching these sorts of issues at compile-time instead of runtime.

In order to accomplish this, Rust imposes some constraints on the engineer through the borrow checker and immutable-by-default types. I’m not going to write about those things here as they have been covered in depth by others.

This post (and any other posts in this potential series) focuses on other language features and idioms that may be unfamiliar to managed-language developers.

In my first post in this series, I talked about the fact that Rust does not have the concept of null.

No Exceptions!

Rust does not have the concept of exceptions or the associated concept of try-catch blocks. This is because once you get code to compile in Rust you can be sure there are no errors anywhere… just kidding. 😃

Instead, in Rust we use an enum type called std::result::Result<T, E>. The T in the generic signature is the type of the return value. The E represents the type of the error should one occur. The two variants of Result, Ok(value) and Err(error), are always in scope, similarly to the Some(value) and None variants of the Option type.

A Naive Example

Consider the following made-up function:

fn find_data(i: u32) -> Result<u32, String> {
    match i {
        1 => Err("1 is not a valid value".to_string()),
        _ => Ok(i*2)
    }
}

This function accepts an integer and doubles it. For whatever reason, 1 is not considered to be a valid value, so an error message is returned instead. Notice that Ok and Err are used to wrap the return and error values.

Now let’s look at how we would use the Result type in another function:

    let result = find_data(5);
    match result {
        Ok(value) => {
            println!("The result was {}", value);
        },
        Err(message) => {
            println!("{}", message);
        }
    }

The type of result is std::result::Result<u32, String>. We then treat it like any other enum, matching on the variants and doing the correct processing.

Adding Complexity

Things start to get a little complicated if we have a series of potential errors. Consider retrieving some data from a database. We could fail to connect to the database, construct our query correctly, or map the raw data to our intended representation.

fn get_employee_by_id(id: i32) -> Result<Employee, DataRetrievalError> {
    let connection = Database::create_connection();
    match connection {
        Ok(conn) => {

            let raw_data = conn.execute("EmployeeByIdQuery", id);

            match raw_data {
                Ok(data) => {
                    // from_raw_data returns a Result, so a mapping failure surfaces here
                    Employee::from_raw_data(data)
                }
                Err(_) => {
                    Err(DataRetrievalError::QueryFailed)
                }
            }

        },
        Err(_) => {
            Err(DataRetrievalError::ConnectionFailed)
        }
    }
}

Yuck! This is pretty ugly. We could improve readability by removing the nesting:

fn get_employee_by_id(id: i32) -> Result<Employee, DataRetrievalError> {
    let connection_result = Database::create_connection();

    if connection_result.is_err() {
        // the Ok types differ, so the error has to be rebuilt rather than returned as-is
        return Err(DataRetrievalError::ConnectionFailed);
    }

    let connection = connection_result.unwrap();

    let raw_data = connection.execute("EmployeeByIdQuery", id);

    if raw_data.is_err() {
        return Err(DataRetrievalError::QueryFailed);
    }

    let data = raw_data.unwrap();

    Employee::from_raw_data(data)
}

This is better, but still pretty ugly. Fortunately, Rust offers some syntactic sugar to clean this up a lot in the form of the ? operator. The ? operator returns early with the error if the result is an Err and unwraps the value if it is Ok. Here is the function rewritten to use the ? operator.

fn get_employee_by_id(id: i32) -> Result<Employee, DataRetrievalError> {
    let connection = Database::create_connection()?;
    let data = connection.execute("EmployeeByIdQuery", id)?;
    Employee::from_raw_data(data)
}

Much nicer!

If the error returned from an inner function does not match the error type expected by the outer function, the ? operator will convert it for you, provided a From implementation exists from the inner error type to the outer one. Without that implementation, the code will not compile.
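To make the conversion concrete, here is a minimal sketch. The InputError type and the parse_and_double function are made up for illustration; only str::parse and std::num::ParseIntError come from the standard library.

```rust
use std::num::ParseIntError;

// Hypothetical outer error type, invented for this sketch.
#[derive(Debug, PartialEq)]
pub enum InputError {
    NotANumber(String),
}

// The conversion that `?` looks for when the error types differ.
impl From<ParseIntError> for InputError {
    fn from(inner: ParseIntError) -> Self {
        InputError::NotANumber(inner.to_string())
    }
}

// parse() yields Result<u32, ParseIntError>, but this function returns
// InputError. On failure, `?` calls InputError::from and returns early.
pub fn parse_and_double(text: &str) -> Result<u32, InputError> {
    let value: u32 = text.trim().parse()?;
    Ok(value * 2)
}
```

Calling parse_and_double("21") takes the Ok path, while parse_and_double("abc") comes back as Err(InputError::NotANumber(..)) without any explicit match in the function body.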

Comparing to Exception-based Languages

Rust’s error handling strategy does a great job of communicating possible failure modes since the error states are part of the signature of any function you call. This is a clear advantage over exception-based languages in which you (usually) have to read the documentation to know what exceptions can possibly occur.

On the other hand, it’s fairly common in exception-based languages to have some root handler for unhandled exceptions that provides standard processing for most errors.

In Rust, adding error handling can force you to edit much more code than in exception-based languages. Consider the following set of functions:

fn top_level() -> i32 {
    mid_level1() + mid_level2()
}

fn mid_level1() -> i32 {
    low_level1() + low_level2()
}

fn mid_level2() -> i32 {
    low_level1() * low_level2()
}

fn low_level1() -> i32 {
    5
}

fn low_level2() -> i32 {
    10
}

The top_level function depends on the two mid_level functions which in turn depend on the two low_level functions. Consider what happens to our program if low_level2 is modified to potentially return an error:

fn top_level() -> Result<i32, String> { // had to change this signature
    Ok(mid_level1()? + mid_level2()?) // and unwrap each Result before adding
}

fn mid_level1() -> Result<i32, String> { // had to change this signature
    Ok(low_level1() + low_level2()?)
}

fn mid_level2() -> Result<i32, String> { // had to change this signature
    Ok(low_level1() * low_level2()?)
}

fn low_level1() -> i32 {
    5
}

fn low_level2() -> Result<i32, String> {
    Ok(10)
}

This sort of signature change will often bubble through the entire call stack, resulting in a much larger code-change than you would find in exception-based languages. This can be a good thing because it clearly communicates the fact that a low level function now returns an error. On the other hand, if there really is no error handling strategy except returning an InternalServerError at an API endpoint, then requiring that every calling function change its signature to bubble the error is a fairly heavy tax to pay (these signature changes can also have their own similar side-effects in other call-paths).

I’m not making the argument that Rust error handling is therefore bad. I’m just pointing out that this error design has its own challenges.

Error Design Strategies

While the mechanism by which errors are generated and handled in Rust is fairly simple to understand, the principles you should use in designing your errors are not so straightforward.

There are essentially three dominant strategies available for designing the errors for your library or application:

Error Per Crate

Define one error enum per crate. Contains all variants relevant to all functions in the crate.

Pros:

  • There is only one error enum to manage.
  • Very little error conversion code (From implementations) will be required.

Cons:

  • The crate-level enum will have many variants.
  • Individual functions will only potentially return a subset of the crate-level errors but this subset will not be obvious to callers.

Error Per Module

Define one error per module. Contains all variants relevant to functions in that module.

Pros:

  • Much smaller footprint than error per crate.
  • Errors are contextually more relevant than crate-level error variants.

Cons:

  • This strategy still has the same drawbacks as the Error Per Crate strategy.
  • Depending on how deep the module structure is, you could end up with a proliferation of error types.

Error Per Function

Define one error per function. Only contains variants relevant to that function.

Pros:

  • Each function defines its own error variants so it’s obvious what the caller may need to handle.

Cons:

  • Proliferation of error types throughout the system which makes the crate or module more difficult to understand.

Hybrid Strategy

I don’t think I have the right answer yet, but this hybrid strategy is the one I’ve settled on in my personal development. It basically creates an error hierarchy for the crate that gets more specific as you approach a given function.

  1. Define an error enum per function.
  2. Define an error per module, the variants of which “source” the errors per function.
  3. Define an error per crate, the variants of which “source” the errors per module.

pub enum ConfigFileErrors {
    FileNotFound { path: String },
}

fn load_config_file(path: String) -> Result<ConfigFile, ConfigFileErrors> {
    // snipped
}

pub enum ParsingError {
    InvalidFormat
}

fn parse_config(config: ConfigFile) -> Result<ConfigurationItems, ParsingError> {
    // snipped
}

pub enum ValidationError {
    RequiredDataMissing { message: String }
}

fn validate_config(input: ConfigurationItems) -> Result<ConfigurationItems, ValidationError> {
    // snipped
}

pub enum ConfigErrors {
    File { source: ConfigFileErrors },
    Parsing { source: ParsingError },
    Validation { source: ValidationError }
}

fn get_config() -> Result<ConfigurationItems, ConfigErrors> {
    // each ? relies on a From implementation into ConfigErrors
    let file = load_config_file("path/to/config".to_string())?;
    let parsed = parse_config(file)?;
    Ok(validate_config(parsed)?)
}
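For the ? operator to bubble the function-level errors into the containing enum, each inner error needs a From implementation. The sketch below is a trimmed-down stand-in for the listing above, not the real thing: the "key=value" input format, the tuple return type, and the describe helper are assumptions made only to keep the example self-contained.

```rust
#[derive(Debug, PartialEq)]
pub enum ParsingError {
    InvalidFormat,
}

#[derive(Debug, PartialEq)]
pub enum ValidationError {
    RequiredDataMissing { message: String },
}

// Containing error whose variants "source" the function-level errors.
#[derive(Debug, PartialEq)]
pub enum ConfigErrors {
    Parsing { source: ParsingError },
    Validation { source: ValidationError },
}

impl From<ParsingError> for ConfigErrors {
    fn from(source: ParsingError) -> Self {
        ConfigErrors::Parsing { source }
    }
}

impl From<ValidationError> for ConfigErrors {
    fn from(source: ValidationError) -> Self {
        ConfigErrors::Validation { source }
    }
}

// Assumed format for this sketch: a single "key=value" pair.
fn parse_config(raw: &str) -> Result<(String, String), ParsingError> {
    match raw.split_once('=') {
        Some((key, value)) => Ok((key.to_string(), value.to_string())),
        None => Err(ParsingError::InvalidFormat),
    }
}

fn validate_config(item: (String, String)) -> Result<(String, String), ValidationError> {
    if item.1.is_empty() {
        return Err(ValidationError::RequiredDataMissing {
            message: format!("no value for {}", item.0),
        });
    }
    Ok(item)
}

pub fn get_config(raw: &str) -> Result<(String, String), ConfigErrors> {
    let parsed = parse_config(raw)?; // ParsingError -> ConfigErrors
    let valid = validate_config(parsed)?; // ValidationError -> ConfigErrors
    Ok(valid)
}

// A caller can stop at the outer variant or drill into the source error.
pub fn describe(result: &Result<(String, String), ConfigErrors>) -> String {
    match result {
        Ok((key, value)) => format!("{} = {}", key, value),
        Err(ConfigErrors::Parsing { .. }) => "could not parse config".to_string(),
        Err(ConfigErrors::Validation {
            source: ValidationError::RequiredDataMissing { message },
        }) => format!("invalid config: {}", message),
    }
}
```

The From implementations are the conversion code the per-crate strategy mostly avoids, which is part of the price of this hierarchy.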

This approach has many of the pros and cons of the other approaches so it’s not a panacea.

Pros:

  • Each function clearly communicates how it can fail and is not polluted by the failure modes of other functions.
  • No information is lost as you bubble up the call-stack as each low-level error is packaged in a containing error.
  • The caller gets to match at the top-level error and decide for themselves if they wish to take finer-grained control of inner errors.

Cons:

  • Proliferation of error types.
  • New failure modes potentially impact the top-level crate design (e.g., adding a failure mode becomes a breaking change requiring a major revision if you are practicing Semantic Versioning).
  • It’s not obvious how to deal with error variants that may be shared across multiple functions (e.g., parsing errors).

Fear is the (Software) Killer

“I must not fear. Fear is the mind-killer. Fear is the little-death that brings total obliteration. I will face my fear. I will permit it to pass over me and through me. And when it has gone past I will turn the inner eye to see its path. Where the fear has gone there will be nothing. Only I will remain.”

― Frank Herbert, Dune

That iconic passage from Frank Herbert is something I think about when I encounter code that engineers are afraid to touch. When an engineering team is afraid to touch a piece of code then the software is no longer “soft.” Instead, current and future choices become constrained by decisions and mistakes of the past.

This fear will often lead teams to try to rewrite and replace the software rather than modify it. The result of this effort is almost always bad. Some of the problems you encounter during the Great Rewrite are:

  1. The legacy software is still running in production.
    • It still gets bug fixes.
    • It still gets new mission-critical features that have to be replicated in the new software.
  2. No one wants to cut over to the new software until it’s at feature parity with the legacy software.
    • Feature parity is hindered by the inability to modify the old software.
    • If the cutover happens before parity is achieved, I’ve seen people express that they’d rather go back to the old and busted system because at least it did what they needed it to.
  3. The planning and engineering techniques used to produce the first set of software–and the corresponding rigidity that led to the Great Rewrite–have not changed.
    • The Great Rewrite will ultimately go through the same lifecycle and have to be rewritten again.

These problems multiply if the authors of The Great Rewrite are a different team than the one that maintains the existing system.

What’s the Alternative?

The alternative is to save your existing software. If you are afraid to change a piece of code, you need to take steps to remove that fear.

  1. Spend time with the code to understand it.
  2. Stand the system up in a test environment so that you can experiment with it to learn its edge-cases and wrinkles.
  3. Before making any change, cover the test environment with black-box automated tests that can verify behavior.
    • If the tests cannot be fully automated (sadly the case with some old “Smart UI” style applications), then document the test cases and automate what you can.
  4. Analyze the error logs to make sure you understand the existing failures in the system as well as the rate at which they occur.
  5. Make the desired change to the system.
    • At this point you will have to be careful. You will need to do a post-change analysis of the error logs to look for anomalous errors. The first log analysis is your baseline.
  6. Once you are confident in the change, cover it with as many automated tests as you can.
  7. Once you have great test coverage, aggressively refactor the code for clarity.

This process is time-consuming and expensive. For this reason, teams try to find shortcuts around it. Unfortunately, there are no shortcuts. The path of the Great Rewrite is even longer and more expensive. It just has better marketing.

But I Really Have to Rewrite!

There are times when a rewrite is unavoidable. Possible reasons might be:

  1. The technology that was used in the original software is no longer supported.
    • It may also be old enough that it’s difficult to find people willing or able to support it–which amounts to the same thing.
  2. The component is small enough that a rewrite is virtually no risk.
  3. The component cannot be ported to the new infrastructure it is intended to run on (e.g., cloud, mobile, docker, etc).

In these cases, the process is the same as above–stand up a test environment. Wrap the old system in automated black-box acceptance tests. Limit yourself to targeting parity (no new features!) until the replacement is done.

Building Testing into your SDLC

Testing software is critically important to ensuring quality. Automated tests provide a lower Mean Time to Feedback (MTTF) for errors and enable developers to make changes without fear of breaking things. The earlier in the SDLC that errors can be detected and corrected, the better. (See the Test Pyramid). As engineers on the platform we should practice TDD in order to generate a thorough bed of unit tests. Unit tests alone do not ensure that everything works as expected so we will need gradually more sophisticated forms of testing.

There are different approaches to testing software. This document chooses to articulate types of automated testing by the point in the SDLC at which they are executed and by what they cover. There may be different strategies for testing at each of these lifecycle points (e.g., deterministic, fuzz, property-based, load, perf, etc.).

| SDLC Stage | Type | Target | Who Runs Them? | Description |
|---|---|---|---|---|
| Design / Build Time | Unit | Single Application | Engineer, CI | In process, no external resources. Mock at the architectural boundaries but otherwise avoid mocks where possible. |
| Design / Build Time | Integration | Single Application | Engineer, CI | These tests will mostly target the adapters for external systems (e.g., file I/O, databases, 3rd party APIs, 1st party APIs that are not the component under test). Integration tests differ from acceptance tests in that they should never fail due to an issue with an external service. |
| Post Deployment to Test Environment | Acceptance | Entire System or Platform | CI, CD | Largely black box, end-to-end testing. For bonus points, tie failures into telemetry to see if your monitors are alerting you. |
| Post Deployment to Test Environment | Manual UX Testing | Entire System or Platform | Engineer, QA, Users | This testing is qualitative and pertains to the “feel” of the platform with respect to the user experience. |
| Post Production Release | Smoke | Entire System or Platform | Engineer, CD | A small suite of manual tests to validate production configuration. |
| Post Production Release | Synthetic Transactions | Entire System or Platform | System | Black box, end-to-end use-case testing, automated, safe for production. These tests are less about correctness and more about proving the service is running. |
| Other? | | | | This is not an exhaustive list. |

Emphasize Unit Tests

In general, our heaviest investment in testing should be done at the time the code is written. This means that unit tests should far outweigh other testing efforts. Why?

Unit tests are very low-cost to write and have very low Mean Time to Feedback (MTTF). This means they have the greatest ROI of any kind of test.

The other kinds of testing are important but they get more complex as you move through the SDLC. This makes covering finicky edge-cases challenging from both an implementation and maintenance perspective. Unit Tests don’t have these drawbacks provided you follow good TDD guidance.

TDD

TDD is the strongly preferred manner of writing unit tests as it ensures that all code written is necessary (required by a test) and correct. Engineers who are not used to writing code in a TDD style often struggle with the practice in the early stages. If this describes your experience, be satisfied with writing tests for the code you’ve written in the same commit until it starts to feel natural.

The activity of TDD consists of three steps:

  1. (RED) Write a failing unit test.
  2. (GREEN) Write enough production code to make it pass.
  3. (REFACTOR) Now make the code pretty.

The unit tests you write should strive to obey the three laws of TDD:

  1. Don’t write any production code unless it is to make a failing unit test pass.
  2. Don’t write any more of a unit test than is sufficient to fail; and compilation failures are failures.
  3. Don’t write any more production code than is sufficient to pass the one failing unit test.

Good unit tests have the following attributes:

  1. The test must fail reliably for the reason intended.
  2. The test must never fail for any other reason.
  3. There must be no other test that fails for this reason.
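As a small illustration of where a few red-green-refactor cycles might end up, here is a sketch in Rust. The is_leap_year function is a stand-in chosen for this example and is not taken from any real codebase. Each test pins one rule, so a regression in that rule fails exactly one test for the intended reason.

```rust
/// Production code, grown one failing test at a time.
pub fn is_leap_year(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

#[cfg(test)]
mod tests {
    use super::is_leap_year;

    #[test]
    fn year_divisible_by_4_is_a_leap_year() {
        assert!(is_leap_year(2024));
    }

    #[test]
    fn year_not_divisible_by_4_is_not_a_leap_year() {
        assert!(!is_leap_year(2023));
    }

    #[test]
    fn century_year_is_not_a_leap_year() {
        assert!(!is_leap_year(1900));
    }

    #[test]
    fn year_divisible_by_400_is_a_leap_year() {
        assert!(is_leap_year(2000));
    }
}
```

Running cargo test executes the module; in a strict TDD flow each test would have been written, and seen to fail, before the clause that makes it pass.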

Further Reading

It’s impossible to fully convey the scope of what you should know about test automation in this document. Below are some resources you may be interested in as you move through your career.

  1. Test Driven Development: By Example by Kent Beck
  2. The Art of Unit Testing: 2nd Edition by Roy Osherove
  3. Working Effectively With Legacy Code by Michael Feathers
  4. Refactoring: Improving the Design of Existing Code (2nd Edition) by Martin Fowler

Learning Rust: Look Ma, No Null!

Introduction

Rust is a systems programming language (think C-like) that makes it easier to perform memory-safe operations than languages like C or C++. It accomplishes this by making it harder to do memory-unsafe operations–and catching these sorts of issues at compile-time instead of runtime.

In order to accomplish this, Rust imposes some constraints on the engineer through the borrow checker and immutable-by-default types. I’m not going to write about those things here as they have been covered in depth by others.

This post (and any other posts in this potential series) focuses on other language features and idioms that may be unfamiliar to managed-language developers.

No Null

Rust does not have the concept of null. Consider the following declaration of a struct:

let employee = Employee {
    first_name: "Chris",
    last_name: "McKenzie",
};

In this code sample, an Employee struct is immediately created in memory. There is no option to create a variable of type Employee and leave it unassigned. This might not seem terrible on its face, but what about data we don’t have? What if we add a birth date field to the struct but we don’t know what it is?

struct Employee {
    first_name: &'static str,
    last_name: &'static str,
    birth_date: &'static str,
}

let employee = Employee {
    first_name: "Chris",
    last_name: "McKenzie",
    birth_date: // not sure what to do here?
};

Option

For this scenario, Rust has a built-in enum called Option<T>. An option has two variants: Some and None which are automatically imported for you. To represent the above code, you would write this:

#[derive(Debug)]
struct Employee {
    first_name: String,
    last_name: String,
    birth_date: Option<String>
}

pub fn main() {
    let employee = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None
    };

    println!("{:?}", employee);
}

or

#[derive(Debug)]
struct Employee {
    first_name: String,
    last_name: String,
    birth_date: Option<String>
}

pub fn main() {
    let employee = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::Some(<value>)
    };

    println!("{:?}", employee);
}

Since Rust imports std::option::Option and its Some and None variants for you automatically, the above can be rewritten without the Option:: prefix:

birth_date: None
...
birth_date: Some(<value>)

Am I Cheating?

Wait! Isn’t “None” just null under the covers?

Well–no. To understand the difference, you first need to understand a few things about enums.

The members of an enum are called variants. Variants can carry data, and variants must be matched.

Let’s examine what this means in practice for a moment.

Building an Enum

Consider the requirement that we represent both salaried and non-salaried employee types. The annual salary must be stored for salaried employees. The hourly rate, overtime rate, and normal hours must be stored for the hourly employee.

In managed languages such as C# you might represent that as follows:

public enum EmployeeType {
    Salaried,
    Hourly,
}

public class Employee {
    public string FirstName {get; set;}
    public string LastName {get; set;}
    public DateTime? BirthDate {get; set;}
    public EmployeeType Type {get; set;}

    // should only be stored for Salaried employees
    public double AnnualSalary {get; set;} 

    // should only be stored for Hourly employees
    public double HourlyRate {get; set;} 
    // should only be stored for Hourly employees
    public double OvertimeRate {get; set;} 
    // should only be stored for Hourly employees
    public int NormalHours {get; set;} 
}

This structure is fine but has the drawback that it’s possible to populate the class with data that’s invalid–e.g., setting the HourlyRate for a Salaried employee is a violation of the business rules–but nothing in the structure of the code prevents it. It’s also possible to fail to provide any of the needed information.

Obviously, you can enforce these rules through additional code, but again–nothing in the structure of the code prevents it–or even communicates the requirements.

I could do exactly the same thing in Rust of course and it doesn’t look much different:

#[derive(Debug)]
pub enum EmployeeType {
    Salaried,
    Hourly,
}

#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub last_name: String,
    pub birth_date: Option<String>,
    pub employee_type: EmployeeType,

    // should only be stored for Salaried employees
    pub annual_salary: Option<f32>,

    // should only be stored for Hourly employees
    pub hourly_rate: Option<f32>, 
    // should only be stored for Hourly employees
    pub overtime_rate: Option<f32>, 
    // should only be stored for Hourly employees
    pub normal_hours: Option<u8>, 
}

pub fn main() {
    let employee = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None,
        employee_type: EmployeeType::Salaried,
        annual_salary: None,
        hourly_rate: None,
        overtime_rate: None,
        normal_hours: None,
    };

    println!("{:?}", employee);
}

Ideally, we’d want to bind the annual salary information to the Salaried employee type, and the hourly information to the Hourly employee type. It turns out–we can!

#[derive(Debug)]
pub enum EmployeeType {
    Salaried { annual_salary: f32 },
    Hourly { rate: f32, overtime_rate: f32, normal_hours: u8 } ,
}

#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub last_name: String,
    pub birth_date: Option<String>,
    pub employee_type: EmployeeType,
}

pub fn main() {
    let salaried = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None,
        employee_type: EmployeeType::Salaried {
            // one, MILLION DOLLARS! Mwa ha ha ha ha ha ha! (I wish!)
            annual_salary: 1_000_000.00
        },
    };

    println!("{:?}", salaried);

    let hourly = Employee {
        first_name: "Fred".to_string(),
        last_name: "Bob".to_string(),
        birth_date: Some("1/1/1970".to_string()),
        employee_type: EmployeeType::Hourly {
            rate: 20.00, 
            overtime_rate: 30.00, 
            normal_hours: 40
        }
    };

    println!("{:?}", hourly);
}

Suddenly things get very interesting because we can express through the structure of the language what fields are required for which enum variant. Since all fields of a struct must be filled out when it is created, the compiler is able to force us to provide the correct data for the correct variant.

Nice!

Reading our enum variant

The match statement in Rust roughly corresponds to switch or if-else in C# or Java. However, it becomes much more powerful in the context of enum variants that carry data.

#[derive(Debug)]
pub enum EmployeeType {
    Salaried { annual_salary: f32 },
    Hourly { rate: f32, overtime_rate: f32, normal_hours: u8 } ,
}

#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub last_name: String,
    pub birth_date: Option<String>,
    pub employee_type: EmployeeType,
}

pub fn main() {
    let salaried = Employee {
        first_name: "Chris".to_string(),
        last_name: "McKenzie".to_string(),
        birth_date: Option::None,
        employee_type: EmployeeType::Salaried {
            // one, MILLION DOLLARS! Mwa ha ha ha ha ha ha! (I wish!)
            annual_salary: 1_000_000.00
        },
    };


    let hourly = Employee {
        first_name: "Fred".to_string(),
        last_name: "Bob".to_string(),
        birth_date: Some("1/1/1970".to_string()),
        employee_type: EmployeeType::Hourly {
            rate: 20.00, 
            overtime_rate: 30.00, 
            normal_hours: 40
        }
    };

    let employees = vec![salaried, hourly];

    for item in employees {
        // match on employee type
        match item.employee_type {
            EmployeeType::Salaried {annual_salary} => {
                println!("Salaried! {} {}: {}", item.first_name, item.last_name, annual_salary);
            },
            EmployeeType::Hourly {rate, overtime_rate: _, normal_hours: _ } => {
                println!("Hourly! {} {}: {}", item.first_name, item.last_name, rate);
            }
        }

    }

}

I need to point out a couple of things about the match statement.

  1. Each pattern must account for the data in the variant. I can’t simply match on EmployeeType::Salaried by itself; I either bind the fields or explicitly ignore them (with “_” per field, or “..” for the rest).
  2. match requires that every variant of the enum be handled. It is possible to use “_” as a default handling block, but ordinarily you would just handle all of the variants.
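Both points can be seen in a short sketch. The label and weekly_base_pay helpers are names I made up for illustration; the enum is the same EmployeeType used above.

```rust
#[derive(Debug)]
pub enum EmployeeType {
    Salaried { annual_salary: f32 },
    Hourly { rate: f32, overtime_rate: f32, normal_hours: u8 },
}

// Every variant is handled; `..` says "I know there is data, ignore it".
pub fn label(employee_type: &EmployeeType) -> &'static str {
    match employee_type {
        EmployeeType::Salaried { .. } => "salaried",
        EmployeeType::Hourly { .. } => "hourly",
    }
}

// Bind only the fields that are needed; `_` is the default arm for
// every variant not listed explicitly.
pub fn weekly_base_pay(employee_type: &EmployeeType) -> Option<f32> {
    match employee_type {
        EmployeeType::Hourly { rate, normal_hours, .. } => {
            Some(rate * f32::from(*normal_hours))
        }
        _ => None,
    }
}
```

Removing either arm from label is a compile error, whereas the “_” arm in weekly_base_pay would silently absorb any variant added later. That trade-off is why handling every variant explicitly is usually the better default.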

Back to Option

The language definition for Option is as follows:

pub enum Option<T> {
    /// No value
    None,
    /// Some value `T`
    Some(T),
}

None is a variant with no data. Some is a variant carrying a tuple with a single value of type T.

Anytime we match on an Option for processing, we are required to handle both cases. None is not the same as null. There will be no “Object reference not set to instance of object.” errors in Rust.
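Here is a minimal sketch of what that looks like in practice. The birthday_line function is a hypothetical helper, and the Employee struct is cut down to the two fields it needs.

```rust
#[derive(Debug)]
pub struct Employee {
    pub first_name: String,
    pub birth_date: Option<String>,
}

// The Option<String> cannot be used as a String directly; the match has
// to say what happens in both the Some and the None case.
pub fn birthday_line(employee: &Employee) -> String {
    match &employee.birth_date {
        Some(date) => format!("{} was born on {}", employee.first_name, date),
        None => format!("{} has no birth date on file", employee.first_name),
    }
}
```

Leaving out the None arm is a compile error rather than a surprise at runtime.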

You can play with this code and run it in the playground.

Can This Idea Be Useful Elsewhere?

If you’ve not spent a decent amount of time tracking down null reference exceptions in managed languages, this may not be that interesting to you. On the other hand, the Optional value concept can help drive semantic clarity in your code even if you’re not working in Rust. I’m not arguing that you should stop using null in C# & Java. I am arguing that you should add the concept of Optional values to your toolbox.

The Optional library for C# provides an implementation of this type. It’s worth reviewing the README. Java introduced the Optional type in version 8.

NBuilder 6.0.1 Released

RELEASE NOTES 6.0.1

This is a bug fix release.

  • Bug: Guid.Empty was being incremented. The solution was to disable NBuilder’s property namer for static or read-only fields. Thanks to Dominic Hemken for the PR.
  • Bug: CreateListOfSize had undefined behavior when called from the static Builder and executed on multiple threads at the same time. While the future of NBuilder will be to remove the static builder, it’s a defect in the current implementation. Thanks to Ovidiu Rădoi for the PR.

Testable Component Design in Rust

I consider myself an advanced beginner in Rust. There is still much I’m wrapping my head around–and I still get caught off guard by the “move” and “mutability” rules Rust enforces. However, in keeping with my personal emphasis, I’ve devoted my efforts to learning how to create automated tests in Rust. The below guidelines are not exhaustive, but represent my learning so far. Feedback is welcome!

Engineering Values

  • Code should be clean.
  • Code should be covered by automated tests.
    • Tests should be relatively easy to write.
  • Dependencies should be configurable by the components that use them (see Dependency Inversion Principle and Ports & Adapters)

Achieving These Values in Rust Component Design

These are great engineering values, but how do we achieve them practically in Rust? Here are my thoughts so far.

Required for Unit Testing

  • The component should provide a stable contract composed of traits, structs, and enums.
  • Structs exposed in the contract layer should be easy to construct in a test.
  • All types exposed in the contract layer should derive Clone and Debug (#[derive(Clone, Debug)]) so that they can be easily mocked in tests.
    • This means that types like failure::Error should be converted to something that is cloneable.
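To show why these requirements matter, here is a sketch of a tiny contract layer with a hand-rolled fake. Every name in it (EmployeeError, display_name, FakeEmployeeService, fake_with) is hypothetical and chosen only for this example. The fake relies on Clone to hand out canned values, and the assertions rely on Debug and PartialEq.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
pub struct Employee {
    pub id: String,
    pub first_name: String,
    pub last_name: String,
}

#[derive(Clone, Debug, PartialEq)]
pub enum EmployeeError {
    NotFound(String),
}

// The contract: a trait plus plain structs and enums.
pub trait EmployeeService {
    fn get(&self, id: &str) -> Result<Employee, EmployeeError>;
}

// Code under test depends on the trait, not on a concrete implementation.
pub fn display_name(service: &dyn EmployeeService, id: &str) -> Result<String, EmployeeError> {
    let employee = service.get(id)?;
    Ok(format!("{} {}", employee.first_name, employee.last_name))
}

// Test double backed by an in-memory map.
pub struct FakeEmployeeService {
    pub employees: HashMap<String, Employee>,
}

impl EmployeeService for FakeEmployeeService {
    fn get(&self, id: &str) -> Result<Employee, EmployeeError> {
        self.employees
            .get(id)
            .cloned() // needs Employee: Clone
            .ok_or_else(|| EmployeeError::NotFound(id.to_string()))
    }
}

// Convenience constructor: a fake that knows about a single employee.
pub fn fake_with(employee: Employee) -> FakeEmployeeService {
    let mut employees = HashMap::new();
    employees.insert(employee.id.clone(), employee);
    FakeEmployeeService { employees }
}
```

Because Employee is a flat struct with public fields, a test can build one inline without a builder or a database.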

Required for Configurable Dependencies

  • The contract layer should not reference any technology or framework unless it is specifically an extension for that technology or framework.

Empathy

  • Every effort should be made to make the public api surface of your component as easy to use and understand as possible.
  • The contract layer should minimize the use of generics.
    • Obvious exceptions are Result<T> and Option<T>.
    • Concepts like PagedResult<T> that are ubiquitous can also be excepted.
    • Using type aliases to hide the generics does not qualify since the generic constraints still have to be understood and honored in a test.
    • In general this advice amounts to “generics are nice, but harder to understand than flat types. Use with care in public facing contracts.”
  • If a trait exposes a Future as a return result, it should offer a synchronous version of the same operation. This allows the client to opt-in to futures if they need them and ignore that complexity if they don’t.
    • I understand that the client can add the .wait() call to the end of a Future. My point is that an “opt-in” model is friendlier than an “opt-out” model.

Example Hypothetical Contract Surface

use chrono::{DateTime, Utc}; // assumption: UTC in the original sketch means chrono's UTC date-time type
use futures::Future; // assumption: futures 0.1, which provides Item, Error, and wait()

#[derive(Clone, Debug)]
struct Employee {
    id: String,
    r#type: String, // `type` is a keyword, so the raw identifier is required
    status: String,
    first_name: String,
    last_name: String,
    address: String,
    city: String,
    birth_date: DateTime<Utc>,
    // snipped for brevity
}

#[derive(Clone, Debug)]
struct PagedResponse<T> { // exposes a generic, but the reason is warranted.
    page_number: i32,
    page_size: i32,
    items: Vec<T>,
}

#[derive(Debug, Clone)]
enum MyComponentError {
    Error1(String), // If the context parameter is another struct, it must also derive Clone & Debug
    Error2(i32),
}

#[derive(Clone, Debug)]
struct EmployeesQuery {
    r#type: Option<String>,
    name: String,
    types: Vec<String>, // matches any of the specified types
    cities: Vec<String>, // matches any of the specified cities
}

type Result<T> = std::result::Result<T, MyComponentError>; // Component level Result. Type aliasing expected here.

// Declared outside the trait because associated type defaults are not available on stable Rust.
type Employees = PagedResponse<Employee>;

trait EmployeeService {
    // sync version of async_get()
    fn get(&self, id: String) -> Result<Employee> {
        self.async_get(id).wait()
    }

    fn async_get(&self, id: String) -> Box<dyn Future<Item = Employee, Error = MyComponentError>>;

    // sync version of async_query()
    fn query(&self, query: Option<EmployeesQuery>) -> Result<Employees> {
        self.async_query(query).wait()
    }

    fn async_query(&self, query: Option<EmployeesQuery>) -> Box<dyn Future<Item = Employees, Error = MyComponentError>>;

    // etc...
}
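
To show why the Clone and Debug requirement pays off, here is a small sketch of a hand-rolled test stub for a contract like the one above. The types are trimmed-down stand-ins (synchronous only, std only), and StubEmployeeService and display_name are hypothetical names introduced just for this example.

```rust
// Simplified stand-ins for the contract types above (fields trimmed for brevity).
#[derive(Clone, Debug, PartialEq)]
pub struct Employee {
    pub id: String,
    pub first_name: String,
}

#[derive(Clone, Debug, PartialEq)]
pub enum MyComponentError {
    Error1(String),
}

pub trait EmployeeService {
    fn get(&self, id: &str) -> Result<Employee, MyComponentError>;
}

// Hypothetical stub: returns a clone of whatever response the test configured.
pub struct StubEmployeeService {
    pub canned: Result<Employee, MyComponentError>,
}

impl EmployeeService for StubEmployeeService {
    fn get(&self, _id: &str) -> Result<Employee, MyComponentError> {
        // This only compiles because both the value and the error type are Clone.
        self.canned.clone()
    }
}

// Hypothetical client code under test.
pub fn display_name(service: &dyn EmployeeService, id: &str) -> String {
    match service.get(id) {
        Ok(employee) => employee.first_name,
        Err(_) => "unknown".to_string(),
    }
}

fn main() {
    let ok = StubEmployeeService {
        canned: Ok(Employee { id: "e-1".to_string(), first_name: "Ada".to_string() }),
    };
    let failing = StubEmployeeService {
        canned: Err(MyComponentError::Error1("backend down".to_string())),
    };
    println!("{}", display_name(&ok, "e-1"));
    println!("{}", display_name(&failing, "e-1"));
}
```

Because every contract type is a flat, cloneable struct or enum, the stub needs no mocking framework at all.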

Non-Technical Engineering Quality Indicators

A PM I work with asked me the following question:
“How can someone who is not close to the engineering read the tea leaves about the engineering quality of a given project?”

I love this question because it shows the PM cares about engineering quality despite not having engineering expertise. I’m much more accustomed to having to argue that non-technical folks should care about engineering quality. Now I have someone who does care and wants help knowing what to look for. How do I help this person? I have some starting ideas. Some of my favorite engineering manager metrics are easily adaptable for non-technical PMs.

  1. Does the team rely on manual testing? In my opinion manual testing adds little value. It’s impossible to do a complete regression. There is no future benefit to the effort beyond the immediate release. It’s slow. If you want to remove waste from the delivery pipeline, invest in test automation.
  2. What is the defect/user story ratio in the backlog?
  3. How often are new defects discovered and added to the backlog?
  4. How does the team react to the idea of asking for a near-zero cycle time for defects? To achieve this the team would need:
    a. To have relatively few defects
    b. To receive new defects on an infrequent basis
    c. To have confidence that correcting any defect would take hours instead of days
    d. To have deep knowledge of a well-engineered system so that the exact nature of the problem can be identified quickly
    e. To have confidence that they can pass the system through their quality gates and get into production in less than an hour

The idea that you can have zero defects is sometimes shocking for both PMs and engineers to consider. It can be an uphill battle to convince them that this is achievable in reality. If your team can’t accept this as a reality, see if they can accept it as a goal and move in this direction.

  1. Is Lead Time increasing? That could indicate that the team cannot respond to work as quickly as it arrives. You will also need to know what Lead Time you want. If you want to plan your roadmap in three-month increments, your Lead Time should be between 45 and 180 days–depending on how often you update your roadmap.
  2. Is Cycle Time increasing? If so, it probably means the system is difficult to work in.
  3. Is Cycle Time close to Lead Time? If so, it could indicate a lack of planning, or an inability to work on the plan due to emergent issues.
  4. How long does it take to correct a defect in production? Ideally, this would be ~1hr from discovery.
  5. How much do we spend on support for this service through all channels? (E.g., service desk, engineering time, etc.)
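
For anyone who wants to track these numbers, here is a small sketch of the arithmetic. It assumes the common definitions (Lead Time runs from request to delivery, Cycle Time from start of work to delivery), and WorkItem and its fields are hypothetical; real data would come from your work-tracking tool.

```rust
// Hypothetical work item; dates are day numbers counted from an arbitrary project start.
pub struct WorkItem {
    pub requested_day: u32,
    pub started_day: u32,
    pub delivered_day: u32,
}

impl WorkItem {
    // Assumed definition: days from the request arriving to delivery.
    pub fn lead_time(&self) -> u32 {
        self.delivered_day - self.requested_day
    }

    // Assumed definition: days from work starting to delivery.
    pub fn cycle_time(&self) -> u32 {
        self.delivered_day - self.started_day
    }
}

// Arithmetic mean; returns 0.0 for an empty slice to avoid dividing by zero.
pub fn average(values: &[u32]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    values.iter().sum::<u32>() as f64 / values.len() as f64
}

fn main() {
    let items = vec![
        WorkItem { requested_day: 0, started_day: 30, delivered_day: 40 },
        WorkItem { requested_day: 10, started_day: 50, delivered_day: 70 },
    ];
    let lead: Vec<u32> = items.iter().map(|item| item.lead_time()).collect();
    let cycle: Vec<u32> = items.iter().map(|item| item.cycle_time()).collect();
    println!("average lead time: {} days", average(&lead));
    println!("average cycle time: {} days", average(&cycle));
}
```

Comparing these averages from one period to the next is what reveals the trends asked about in the list above; a single snapshot says very little.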

This list is not exhaustive, and none of these measurements would be conclusive on their own. As diagnostics, however, they could be quite useful for drawing your attention to potential problems. Most would have to be tracked over a longer period of time to be meaningful.

If you’ve got ideas of your own, please leave them in the comments. This is an important piece of the communication between product management and engineering.

NBuilder 6.0.0 Released

Thank you to the contributors who submitted pull requests for the issues that were important to them. A summary of the changes for NBuilder 6 is as follows:

  • Breaking Change: WithConstructor
    • No longer takes an Expression<Func<T>>.
    • Takes a Func<T>.
    • Marked [Obsolete] in favor of WithFactory
    • This change was to address an issue in which the constructor expression was not being reevaluated for each item in a list.
  • Feature: @AdemCatamak Added support for IndexOf as part of the ListBuilder implementation.
var products = new Builder()
    .CreateListOfSize<Product>(10)
    .IndexOf(0, 2, 5)
    .With(x => x.Title = "A special title")
    .Build();
  • Feature: @PureKrome Added support for DateTimeKind to RandomGenerator
var result = randomGenerator.Next(DateTime.MinValue, DateTime.MaxValue, DateTimeKind.Utc);
  • Feature: Added DisablePropertyNamingFor(PropertyInfo) overload to BuilderSettings.
  • Feature: Added TheRest as an extension to the ListBuilder.
var results = new Builder()
    .CreateListOfSize<SimpleClass>(10)
    .TheFirst(2)
    .Do(row => row.String1 = "One")
    .TheRest()
    .Do(row => row.String1 = "Ten")
    .Build();
  • Bug: Last item in enum is never generated when generating property values randomly.
  • Bug: Lost strong name when porting to .NET Standard.
  • Bug: Non-deterministic behavior when calling TheLast multiple times for the same range.

A New Chapter at Microsoft

I am leaving Redacted Financial Services* in November to manage an IT team at Microsoft. I am changing the focus of my career from the day-to-day tech toward management–strategy over tactics. I’ll be bringing what I know about software engineering into the IT space as well as learning an entirely new set of disciplines.

I’ve worked at Redacted for 6 years. In that time I’ve enjoyed working with a motivated, dedicated group of Software Craftsmen. Other than getting my start writing software, it’s the best time of my professional life. I grew professionally in that time in no small part due to a manager who made room for me to explore my interests and found ways to capitalize on them for the benefit of the company. It is my goal to match his example.

One of the things I accomplished there was founding an internship program which became a feeder program into our development organization for up-and-coming developers. It had the unintended side effect of creating a mechanism that people within the company who had shown an interest in writing software could use to explore a career change. I’ve worked with close to 30 interns. Some have stayed and worked with us. Others have gone on to companies like Visa, Google, Nordstrom, and Tableau. I’m proud to have played a part in their career development.

I took control of the hiring process for interns, which expanded to include running the hiring process for our entire development organization. I learned that the largest impact I could have on my organization is through who I choose to hire. My wife works as an agency recruiter for accounting and finance professionals, and with her help I learned how to work with agency recruiters to find the candidates I needed quickly. Hiring is hard and people are seldom properly trained to do it. The end result was that we spent less time sorting through resumes and interviewing dud candidates. Instead, nearly every candidate we talked to was brought on-site. For the most part we were able to hire quickly with only a few cycles through the process.

A couple of years ago our DevOps initiative was going sideways. Known to be a passionate advocate for Software Craftsmanship, I was asked to ride-along with the DevOps group and make recommendations that would get us back on track. I ended up leading that group for the last year and a half. The improvements we made include tracking work in one place, identifying and eradicating root causes of common problems, clearly identifying our customers, identifying standard practices for common work, establishing a customer-centric mindset for the team, and practicing what we preach with respect to quality software. It’s a DevOps team, but we write tests for our scripts and services. In that time the stability and reliability of our production deployments increased dramatically.

In addition to being a technical leader on my team, I began managing other people. I always thought of this responsibility in servant-leadership terms. My role was to collaborate with the employee to make sure s/he is feeling challenged and growing. I learned to be free with my praise and politely direct with my critical feedback. I learned never to give critical feedback without also giving concrete examples of different behavior. I was able to coach my reports through some challenging scenarios and save them the effort of learning everything the hard way.

While I’m the one who did the work to learn these things, I was enabled by a phenomenal manager who gave me room to grow and challenge myself. He listened to my interests and made room for me to explore them–ever confident that it would pay off for the team. It did.

I was also challenged by a group of quality-focused engineers who accepted my ideas when they thought they were good, and who had the courage to speak up when they thought I was off the deep end. Some of my favorite people are my worst critics–and good friends.

Finally, I was aided by a wonderful wife with the highest emotional intelligence of any person I’ve ever encountered. I learned from her how critically important successful communication is and endeavored to apply that learning to my career. I’ve learned that I need to adapt my communication style to my audience–although putting that into practice is still a challenge!

I feel a swell of pride at having these people in my life and in the work we’ve accomplished together. To all of these people I feel a great debt of gratitude.

Thank you All.

 

* One of the interesting “perks” of working for a finance company is that some of them don’t want you to name your employer on social media. The rationale is that if you were to broadcast a stock purchase or otherwise comment on the markets it may be construed by someone else as Financial Advice which would in turn make the company potentially liable for the quality of that advice.

Stay Focused on the Goal, Not the Metrics

The goal is the thing you are trying to do.
The metric is how you are measuring your progress toward the thing you are trying to do. Metrics are only as good as their ability to measure progress toward the goal.

Don’t confuse them.

An Example

Imagine a sales team for an organization selling widgets has a goal to increase sales of a particular product line by 10%. The Sales Manager decides that the best way to achieve the goal is for the sales staff to make a certain number of cold-calls over the following months. After a few weeks, one of her sales staff is falling way behind in cold-calls. The wrinkle is that this salesman is the top biller in the department. Should the sales manager berate her top performer for not doing enough cold-calls?

Absolutely not. No manager should ever punish their reports for doing well (provided the means used are legal and ethical of course).

Punishing the salesman would send the message that billing isn’t the goal, but cold-calls are. Do you want a salesman who makes lots of cold-calls but can’t bill? Since sales staff are compensated via commission, punishing the salesman would introduce a division between his performance and his pay. In the best case, the salesman simply ignores the manager and continues to bill and get paid–benefiting the company in the process. In the worst case, the salesman leaves the company for greener pastures, depriving the company of its top biller.

The mistake here is that cold-calls are simply a form of measuring progress toward the goal–increased sales. Cold calls themselves are not the actual goal–they are a proxy for the goal. Further, they may not be the only possible proxy. Their value as a proxy is proportional to the relationship between cold-calls and increased sales. If a salesman is generating increased sales without cold-calls then there is either another possible metric or cold-calls are a poor metric.

It’s one thing to say that cold-calls are a proven way to generate increased sales. It’s quite another to ignore that there are other possible ways to do the same thing. It’s flat wrong to take the position that cold-calls are the only way to increase sales.

What is the appropriate response? Find out how the top biller is selling so well without cold-calls. Is the top biller doing something that no one else is doing? Is there something for the other sales staff to learn? Are there new, better metrics that can be introduced? Of course, it’s also possible that the top biller could bill even more if he did more cold-calls. Finding out will require collaboration between the salesman and the manager–but this is a process of active investigation instead of passive authoritarianism.

If the manager focuses on the metric instead of the goal, she is taking on the responsibility of having all the answers and dictating them to others. The proper approach is to adopt a learning stance toward the team’s work. If the team is doing well but the metrics aren’t being met, what can the manager learn from this? If the team meets the metrics, will they do better? If not–what good are they?

Choosing Metrics

When choosing metrics it’s important to consider that people will game the system. If you’re a software engineering manager and you make Lines of Code or Test Coverage the metric, people will write verbose code and create meaningless tests. A good metric will encourage people to game the system by focusing on the thing you want to achieve. I recently heard of an example in which the Product Owner threw out story sizing as part of their Scrum process. The only thing developers got credit for was the number of stories they completed. It didn’t matter how large or small–credit was only given when the story was completed and in production. The developers began gaming the system by reducing the story size to the smallest thing they could deliver.

Perfect.
