Strict type-safety framework

C++ is a strongly typed language by many definitions, but I reckoned I could take it further. As a starting point, I tried to implement a system of physical measurements, imagining what could come in handy if I were writing programs for, say, Mars probes.

Demands from this system

Every operation that needs to be prevented should fail at compile time. Run time is too late to find an error that could have been fixed before deployment.
It should be impossible to create a quantity value without supplying a unit of measurement. For example, a length can never be just 3; it can be 3 m or 3 km or some other length unit.
Uncontrolled conversions between different quantities are prevented. A length cannot be assigned 3 seconds, and if a function takes a length as an argument, a call that passes it an area should fail to compile.
There should be no ambiguity about which units of measurement you need to pass to a function. If a function takes a length, it should work correctly with meters, miles, all of them. You shouldn't need to rely on a variable name or documentation to use the function properly.

Basic idea for implementation

Every physical quantity has its own class:


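class length {
  protected:
    long double value; // always stored in the base unit, meters
    explicit constexpr length(long double v) :
      value(v) {}
    // The base unit literal is the only way in from outside
    friend constexpr length
      operator"" _m (long double);
};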

The class has a protected constructor, so it can't be initialised from normal code. Every quantity class also has one friend function: the user-defined literal operator for its base unit. An example for length:


constexpr length operator"" _m (long double v) {
  return length ( v );
}

See the point? You can only create such a value through a user-defined literal, which defines both the quantity and the unit. There are also no conversions between primitive types (double) and such a quantity, unless an operator is explicitly provided.

Other user-defined literals can be added to define other units too, like _km, _miles, _light_years, whatever. These additional literal operators need not be friends of the quantity class, as they can be defined in terms of the base unit:


constexpr length operator"" _km (long double v) {
  return operator "" _m ( 1000*v );
}
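
One detail worth noting: a literal operator taking long double matches only floating-point literals such as 10.0_m. For an integer literal such as 10_m to compile, an overload taking unsigned long long is needed as well. A minimal sketch of that, assuming the length class from above plus a hypothetical public accessor in_m() that is added here only so the results can be checked:

```cpp
#include <cassert>

class length {
  protected:
    long double value;
    explicit constexpr length(long double v) : value(v) {}
    friend constexpr length operator""_m(long double);
  public:
    // Hypothetical read-only accessor, not part of the original class.
    constexpr long double in_m() const { return value; }
};

// Base unit literal, matches floating-point literals such as 1.5_m.
constexpr length operator""_m(long double v) { return length(v); }

// Overload for integer literals such as 10_m, defined via the one above.
constexpr length operator""_m(unsigned long long v) {
  return operator""_m(static_cast<long double>(v));
}

// Derived unit, again in both flavours.
constexpr length operator""_km(long double v) {
  return operator""_m(1000 * v);
}
constexpr length operator""_km(unsigned long long v) {
  return operator""_m(1000 * static_cast<long double>(v));
}

// Both kinds of literal end up as the same number of meters.
static_assert((10_m).in_m() == 10.0L, "integer literal");
static_assert((2.5_km).in_m() == 2500.0L, "floating-point literal");
```

Only the long double version of _m has to be a friend; every other literal operator goes through it.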

Overloading the appropriate operators for each quantity class (as friends, since value is protected) then enables writing mathematical operations:


length& operator+= (length& l1, length l2); // 1
bool operator== (length l1, length l2); // 2
length& operator*= (length& l1, double d); // 3
area operator* (length l1, length l2); // 4

As you can see, this system now accomplishes most of the goals I set out for my type-safety system:


// Initialising with a user-defined literal
// (floating-point, to match the long double overload)
length l = 10.0_m; // OK

// 10 of what? Can't add unitless value.
l += 10; // compilation error

// Using operator 1
l += 10.0_m; // OK

// Using operator 2
l == 20.0_m; // OK, is true

// 20 of what? Can't compare when missing unit.
l == 20; // compilation error

// Using operator 3
l *= 10; // OK, will retain the same quantity

// Using operator 4, formula for area
area a = l * l; // OK, nice, isn't it?

// What did you mean by this? Was it a mistake?
time t = l * l; // compilation error
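
For reference, the pieces above can be put together into something that compiles. This is only a sketch under a few assumptions: the area class and its _m2 literal are hypothetical (the text never shows them), the operators are written as friends, and the "should not compile" cases are checked with type traits rather than by actually failing the build.

```cpp
#include <cassert>
#include <type_traits>

class area;

class length {
  protected:
    long double value;
    explicit constexpr length(long double v) : value(v) {}
    friend constexpr length operator""_m(long double);
    // Operators 1 to 3 from the text, defined inline as friends.
    friend length& operator+=(length& a, length b) { a.value += b.value; return a; }
    friend constexpr bool operator==(length a, length b) { return a.value == b.value; }
    friend length& operator*=(length& a, double d) { a.value *= d; return a; }
    // Operator 4 needs access to both classes.
    friend constexpr area operator*(length a, length b);
};

class area {  // hypothetical, modelled on length
  protected:
    long double value;
    explicit constexpr area(long double v) : value(v) {}
    friend constexpr area operator""_m2(long double);
    friend constexpr area operator*(length a, length b);
    friend constexpr bool operator==(area a, area b) { return a.value == b.value; }
};

constexpr length operator""_m(long double v) { return length(v); }
constexpr area operator""_m2(long double v) { return area(v); }
constexpr area operator*(length a, length b) { return area(a.value * b.value); }

// The demands, expressed as compile-time checks.
static_assert(!std::is_constructible<length, double>::value, "no unitless construction");
static_assert(!std::is_convertible<double, length>::value, "no implicit conversion from double");
static_assert(!std::is_assignable<length&, area>::value, "an area is not a length");

// Runs the same steps as the listing above and reports whether they agree.
inline bool arithmetic_demo() {
  length l = 10.0_m;
  l += 10.0_m;                        // 20 m
  bool twenty = (l == 20.0_m);
  l *= 10;                            // 200 m
  area a = l * l;                     // 40000 m^2
  return twenty && l == 200.0_m && a == 40000.0_m2;
}
```

Calling arithmetic_demo() from any test or main function exercises operators 1 to 4; the static_assert lines cover the cases that are meant to be rejected.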

Issues

Unfortunately, this brings with it some problems I was not able to solve.

For example, I got into trouble when writing a templated mathematical vector that sets all values to 0 at initialisation. I couldn't assign primitive-type numbers to my quantity types. I could assign 0.0_m, but that is not generic anymore, because it wouldn't work if the vector was used with primitive types.
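
One possible way around this particular case is to let the container value-initialise its elements with T() instead of assigning 0. The sketch below assumes that a public default constructor meaning "zero length" is an acceptable relaxation of the demands (zero meters is also zero kilometers, so no unit information is lost); it does not help for quantities where zero depends on the unit. The names vec and in_m are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

class length {
  protected:
    long double value;
    explicit constexpr length(long double v) : value(v) {}
    friend constexpr length operator""_m(long double);
  public:
    // Assumption: a public default constructor that means "zero length".
    constexpr length() : value(0) {}
    // Hypothetical read-only accessor, used here only to check results.
    constexpr long double in_m() const { return value; }
};

constexpr length operator""_m(long double v) { return length(v); }

// Hypothetical generic vector that never mentions the number 0.
template <typename T, std::size_t N>
struct vec {
  T data[N];
  // data() value-initialises every element:
  // 0.0 for double, length() for length.
  vec() : data() {}
};

// Checks that both a primitive and a quantity vector start out at zero.
inline bool zero_demo() {
  vec<double, 3> d;
  vec<length, 3> l;
  return d.data[2] == 0.0 && l.data[2].in_m() == 0.0L;
}
```

The same vec works with double and with length because T() is spelled identically for both.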

I tried to make a public constructor that uses the shiny new static_assert to reject, at compile time, any argument that is not zero, but unsurprisingly it doesn't work that way: a function parameter is not a constant expression, so static_assert cannot test it.

Side note: if C++ let you specialise function templates on a value whenever that value could be determined at compile time (constexpr), you could make the constructor that receives 0 public and keep the other one protected. But you can't, because this is not a feature of C++.

You could argue that it is not a container's place to initialise values. 0 is not always the neutral value that is universally wanted as an initial number. 0 degrees Celsius is 273.15 kelvins, for example, and the kelvin is a base unit. But just as I was prevented from using my own vector class, I couldn't use any of the existing mathematical functions from the standard library, because they use double. Furthermore, I can't write a better library either, because I can't make its functions compatible with primitive types. It would have to be a library written just for these strong types.

You'd think that maybe the approach is wrong, but I don't think that is the cause. Whatever the approach, assigning a primitive type to such a variable breaks the demands from the beginning, because it does not provide a unit, so the most you can do is make strongly typed values read-only when accessed from outside the system.

It seems to me that strongly typed values have to live in their own world, and I could not reconcile them with existing code.

The second problem is that physical formulas make sense once everything has been calculated, but the intermediate steps do not necessarily do so. For instance, acceleration includes time squared, but "square second" is a rather contrived physical quantity. To express physical formulas in a rigid system like this one, the number of such contrived quantities would exceed the number of real ones.

Consider the equation for the force of gravity between two bodies:


f_g = G * ( (mass1 * mass2) / sqr(distance) )

It needs the following quantities: mass_squared, length_squared (use area instead?), mass_squared_per_area, and whatever the quantity of the gravitational constant is. All this, when the only quantities you really need are force, mass, and length. Too much overhead, too inelegant, impractical, and semantically incorrect!

I thought about having a base class for all quantities, with operators defined in terms of that base class, so that calculations still happen within the boundaries of the type-safe system but do not have a specific quantity attached until the calculation is completed. You could determine when that is, because it would be the only time operator= is needed. But that doesn't detect errors at compile time, so it's neither here nor there. What is there to prevent me from writing "time t = l * l", which correctly failed to compile before?

Is it even possible to make a system elegant, convenient, and super type-safe at the same time?
