Strong Type Checking with Semantic Types in C#

Introduction

One drawback of commonly used typed languages is the loss of semantic information.  For example:

string zipcode = "12565";
string state = "NY";
WhatAmI(zipcode);
WhatAmI(state);

void WhatAmI(string str)
{
  Console.WriteLine("I have no idea what " + str + " is.");
}

This illustrates that the semantic meaning of zipcode and state is not known to the program.  The variable names are merely convenient labels that the programmer hopes to populate with the correct values.

The goal of this article is to create a means of implementing immutable semantic types that wrap (usually) native types, providing stronger type checking of parameters, and to do so in a way that makes a semantic type both easy to define and easy to use.

Source Code

Implementing semantic types is trivial — just copy and paste the interfaces and base classes described in the section “Behind the Scenes” and start implementing your own concrete semantic types.

“Stronger Typing” with Semantic Types

The implementation described below lets us write this instead:

Zipcode z = Zipcode.SetValue("12565");
State s = State.SetValue("NY");
WhatAmI(z);
WhatAmI(s);

// Now we can pass a semantic type rather than "string"

static void WhatAmI(Zipcode z)
{
  Console.WriteLine("Zipcode = " + z.Value);
}

static void WhatAmI(State z)
{
  Console.WriteLine("State = " + z.Value);
}

Now we have “stronger” type checking as a result of using semantic types in place of native types.  Another benefit of this approach is that the semantic type is immutable: every call to SetValue instantiates a new semantic instance, and callers cannot change the value afterwards.  This makes semantic types well suited to multi-threaded applications, because an instance can be shared between threads without its value changing underneath a reader.  In this respect, semantic types resemble the immutable values of functional programming.  Of course, this immutability can be easily defeated, but it’s not recommended!
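To make the benefit concrete, here is a minimal sketch of a method that takes both values.  It is self-contained, so it repeats a compact version of the supporting types from “Behind the Scenes”; AddressDemo, FormatAddress and Run are hypothetical names used only for this illustration.

```csharp
using System;

// Compact copy of the supporting types so this sample compiles on its own.
public interface ISemanticType { }
public interface ISemanticType<T> { T Value { get; } }

public abstract class SemanticType<T> : ISemanticType, ISemanticType<T>
{
  public virtual T Value { get; protected set; }
}

public abstract class NativeSemanticType<R, T> : SemanticType<T>
  where R : NativeSemanticType<R, T>, new()
{
  public static R SetValue(T val)
  {
    R ret = new R();
    ret.Value = val;
    return ret;
  }
}

public class Zipcode : NativeSemanticType<Zipcode, string> { }
public class State : NativeSemanticType<State, string> { }

public static class AddressDemo
{
  // Hypothetical helper: each parameter accepts only its own semantic type.
  public static string FormatAddress(State state, Zipcode zip)
  {
    return state.Value + " " + zip.Value;
  }

  public static string Run()
  {
    Zipcode z = Zipcode.SetValue("12565");
    State s = State.SetValue("NY");

    // FormatAddress(z, s);  // rejected by the compiler: the arguments are swapped
    return FormatAddress(s, z);
  }
}
```

With two plain string parameters, the swapped call would compile and fail only at run time, if at all.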

Behind The Scenes

The implementation behind the declaration of a semantic type involves a couple of interfaces and an abstract base class:

/// <summary>
/// Topmost abstraction.
/// </summary>
public interface ISemanticType
{
}

public interface ISemanticType<T>
{
  T Value { get; }
}

/// <summary>
/// Enforces a semantic type of type T with a setter.
/// </summary>
/// <typeparam name="T">The native type.</typeparam>
public abstract class SemanticType<T> : ISemanticType, ISemanticType<T>
{
  public virtual T Value { get; protected set; }
}

/// <summary>
/// Abstract native semantic type. Implements the native type T and the setter/getter.
/// This abstraction implements an immutable native type due to the fact that the setter
/// always returns a new concrete instance.
/// </summary>
/// <typeparam name="R">The concrete instance.</typeparam>
/// <typeparam name="T">The native type backing the concrete instance.</typeparam>
public abstract class NativeSemanticType<R, T> : SemanticType<T>
  where R : NativeSemanticType<R, T>, new()
{
  public static R SetValue(T val)
  {
    // Create the concrete type, then assign through the inherited protected setter.
    R ret = new R();
    ret.Value = val;

    return ret;
  }
}

The interface ISemanticType is merely one of convenience when the type information is not available.

The interface ISemanticType<T> is another convenience — this allows us to pass instances of a semantic type without knowing, well, the semantic type.  In other words, it allows us to break the whole point of this article by passing a non-semantic interface instance, but sometimes that’s necessary.
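As a rough illustration of that convenience, the sketch below passes a semantic type through ISemanticType<string> to a method that does not know the concrete type.  It assumes SemanticType<T> implements both interfaces, and InterfaceDemo and Describe are hypothetical names; the supporting types are repeated in compact form so the sample stands alone.

```csharp
using System;

// Compact copy of the supporting types so this sample compiles on its own.
public interface ISemanticType { }
public interface ISemanticType<T> { T Value { get; } }

public abstract class SemanticType<T> : ISemanticType, ISemanticType<T>
{
  public virtual T Value { get; protected set; }
}

public abstract class NativeSemanticType<R, T> : SemanticType<T>
  where R : NativeSemanticType<R, T>, new()
{
  public static R SetValue(T val)
  {
    R ret = new R();
    ret.Value = val;
    return ret;
  }
}

public class Zipcode : NativeSemanticType<Zipcode, string> { }
public class State : NativeSemanticType<State, string> { }

public static class InterfaceDemo
{
  // Hypothetical helper: accepts any semantic type backed by a string.
  // The semantic meaning is recovered only at run time, via GetType().
  public static string Describe(ISemanticType<string> item)
  {
    return item.GetType().Name + " = " + item.Value;
  }
}
```

Calling InterfaceDemo.Describe(Zipcode.SetValue("12565")) yields "Zipcode = 12565".  The compiler no longer distinguishes a Zipcode from a State here, which is exactly the trade-off described above.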

The abstract class SemanticType<T> implements an immutable Value property.  We need a protected setter so that the concrete semantic type can be instantiated with a static factory method, but we don’t want the programmer to change the value once it’s been set.

The abstract class NativeSemanticType<R, T> is where the magic happens.

  1. This class derives from SemanticType<T>, allowing it access to the base class’ protected Value setter.
  2. The class takes R, the generic parameter of the concrete semantic type that is itself derived from NativeSemanticType<R, T>. That’s really the fun part: a class that takes a generic type that is itself derived from that class.

Regarding that last point, the compiler is very picky about what type R can be.  In order for:

ret.Value = val;

to work, ret (being of type R) must be able to access the protected Value setter.  For this to work, R must be of type NativeSemanticType<R, T> — it cannot be (though it would seem reasonable that it ought to be) of type SemanticType<T>.

Implementing Concrete Semantic Types

We can implement the concrete semantic types very easily.  In the example used earlier, the implementation is:

public class Zipcode : NativeSemanticType<Zipcode, string> { }
public class State : NativeSemanticType<State, string> { }

The only nuance to this is that the base class must specify the concrete class as the generic parameter R so that the base class’ SetValue function knows what type to instantiate.  Because this is a static “factory” method, we really can’t avoid this small awkwardness (at least I haven’t figured out how to avoid it.)

The second generic parameter is the backing native type.  Of course, this doesn’t actually have to be a native type — it could be any other class as well.
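For instance, a semantic type can be backed by a framework class rather than a primitive.  The sketch below uses System.Uri as the backing type; HomePage and UriDemo are hypothetical names chosen for the example, and the supporting types are repeated in compact form so the sample stands alone.

```csharp
using System;

// Compact copy of the supporting types so this sample compiles on its own.
public interface ISemanticType { }
public interface ISemanticType<T> { T Value { get; } }

public abstract class SemanticType<T> : ISemanticType, ISemanticType<T>
{
  public virtual T Value { get; protected set; }
}

public abstract class NativeSemanticType<R, T> : SemanticType<T>
  where R : NativeSemanticType<R, T>, new()
{
  public static R SetValue(T val)
  {
    R ret = new R();
    ret.Value = val;
    return ret;
  }
}

// Hypothetical semantic type whose backing type is a class, not a primitive.
public class HomePage : NativeSemanticType<HomePage, Uri> { }

public static class UriDemo
{
  // Only a HomePage is accepted here; an arbitrary Uri is not.
  public static string HostOf(HomePage page)
  {
    return page.Value.Host;
  }
}
```

One caveat: the semantic type is only as immutable as its backing type.  Uri is immutable, but wrapping a mutable class would let callers change its state through Value.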

Additional Benefits of Semantic Types

Here are a few reasons to consider semantic types:

Validation

Another neat thing about concrete semantic types is that the type implementation can override the value setter and perform checks.  For example:

public class Zipcode : NativeSemanticType<Zipcode, string> 
{
  public override string Value
  {
    get
    {
      return base.Value;
    }
    protected set
    {
      if (value == null || value.Length != 5)
      {
        throw new ArgumentException("Zipcode must have length of 5.");
      }

      base.Value = value;
    }
  }
}

This:

Zipcode fail = Zipcode.SetValue("123");

will now throw an exception.

Security

By overriding the value setter in the same way, you can ensure that the underlying value is secured: it can be encrypted or hashed, or, in the case of a credit card number, have its digits masked.
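As one possible sketch of the masking case, the type below masks the number in its setter so that only the last four digits are retained.  CreditCardNumber and its Mask helper are hypothetical, the rule of keeping four digits is an assumption made for the example, and the supporting types are repeated in compact form so the sample stands alone.

```csharp
using System;

// Compact copy of the supporting types so this sample compiles on its own.
public interface ISemanticType { }
public interface ISemanticType<T> { T Value { get; } }

public abstract class SemanticType<T> : ISemanticType, ISemanticType<T>
{
  public virtual T Value { get; protected set; }
}

public abstract class NativeSemanticType<R, T> : SemanticType<T>
  where R : NativeSemanticType<R, T>, new()
{
  public static R SetValue(T val)
  {
    R ret = new R();
    ret.Value = val;
    return ret;
  }
}

// Hypothetical semantic type that never stores the full card number.
public class CreditCardNumber : NativeSemanticType<CreditCardNumber, string>
{
  public override string Value
  {
    get { return base.Value; }
    protected set { base.Value = Mask(value); }
  }

  // Replaces every character except the last four with '*'.
  private static string Mask(string number)
  {
    if (number == null || number.Length < 4)
    {
      throw new ArgumentException("Card number must have at least 4 characters.");
    }

    int hidden = number.Length - 4;
    return new string('*', hidden) + number.Substring(hidden);
  }
}
```

Because SetValue assigns through the virtual setter, the masking runs before anything is stored, so code that later reads Value sees only the masked form.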

Semantic Computing / Distributed Semantic Computing

And of course, if you want to go whole hog, semantic types are also very amenable to multithreaded and distributed computing, as I’ve written about here.

Conclusion

While it may seem strange to code semantically, you may find this a useful technique to provide even stronger type checking of parameters, especially when dealing with function calls or class properties that have many of the same native types in common.  The ability to also provide value checking and other behaviors during the setting of a semantic value is an added bonus.

Also check out Matt Perdeck’s excellent article Introducing Semantic Types in .NET.