Why We Need Transfloats

From my viewpoint, transreals provide the basis for a very welcome clean-up of floating point arithmetic. Existing floating point arithmetic (IEEE 754 & its updates) has analogous values for ∞, -∞ and ϕ called INFINITY, -INFINITY and NAN. This standard has a lot of obscure corner cases where behaviour is not tightly defined and the knock-on effect is that programmers have to be very careful to guard against these when programming.
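
To give a flavour of those corner cases, here's a short illustrative sketch. I'll use Python - whose floats are IEEE 754 doubles on any mainstream platform - but the same surprises lurk behind C++ and Java too:

inf = float( 'inf' )
nan = float( 'nan' )

print( inf + 1 )                            # inf   - infinity absorbs ordinary arithmetic
print( inf - inf )                          # nan   - except when it doesn't
print( nan == nan )                         # False - NaN is not even equal to itself
print( nan < 0.0, nan > 0.0 )               # False False - every comparison with NaN is false
print( max( nan, 1.0 ), max( 1.0, nan ) )   # nan 1.0 - so results quietly become order sensitive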

In practice, this means that developers currently have to put a disproportionate amount of effort into details that have very little business value, which is what we mean when we talk about 'low-level' rather than 'high-level' programming. And this is pervasive.

To illustrate how troublesome this is, here's a simple problem. Write a program that takes several series of numbers, takes the arithmetic mean of each series and returns the biggest mean that we find. So if we have two series (1,2,3) and (4,5,6) we find the mean of the first series (2) and the mean of the second series (5) and return the biggest (5).

I will write the program in the obvious way in all three languages and the results will speak for themselves. For simplicity, I'll pass the numbers on the command line and separate each list with a comma. You can skip the code if you like, of course, but I provide it so you can check what's going on.

Here's the program in C++ ….

#include <iostream>
#include <cmath>
#include <cstdlib>
#include <vector>
#include <string>
#include <sstream>
#include <algorithm>    // for std::max
 
using namespace std;
 
//  Can return any number.
double sumVector( vector< double > & data ) {
    auto sum = 0.0;
    for ( auto d : data ) {
        sum += d;
    }
    return sum;
}
 
//  Note that we may be dividing by zero.
double arithmeticMean( vector< double > & data ) {
    return sumVector( data ) / data.size();
}
 
//  Read in several lists of double, separated by a ",". So 
//  1 2 3 , 4 5 6 turns into 2 lists of 3 elements.
vector< vector< double > > parseArgs( int argc, char ** argv ) {
    vector< vector< double > > data_arrays( 1 );
    int n = 0;
    for ( int i = 1; i < argc; i++ ) {
        if ( string( "," ) == argv[ i ] ) {
            n += 1;
            data_arrays.push_back( vector< double >() );
        } else {
            double d;
            stringstream str( argv[ i ] );
            str >> d;
            data_arrays[ n ].push_back( d );
        }
    }
    return data_arrays;
}
 
int main( int argc, char ** argv ) {
    vector< vector< double > > data_arrays = parseArgs( argc, argv );
 
    double max_sofar = -1.0/0.0;
    for ( auto & data : data_arrays ) {
        const double m = arithmeticMean( data );
        max_sofar = max( m, max_sofar );
    }
 
    cout << max_sofar << endl;
 
    return EXIT_SUCCESS;
}

What does this program do? Here's an example …

steve% ./a.out 1 2 3 , 4 5 6 , -10 100 -100
5

And that's what we expect. What happens if we supply an empty list?

steve% ./a.out , 1 2 3
2

OK, that's nice. We don't have to worry about empty lists. Or do we …

steve% ./a.out 1 2 3 ,
nan

Ah, that's a disaster! Even though the arithmetic mean is completely insensitive to the order of its inputs, and so is taking the maximum, somehow we have ended up with a program that is extremely order sensitive. Let's see how bad it is …

steve% ./a.out inf
inf
[~/tmp]
steve% ./a.out inf , nan
nan
[~/tmp]
steve% ./a.out inf , nan , inf
inf

Indeed, the situation is hopeless. In order to convey how supremely frustrating this is for the average engineer, I am not going to explain what the problem is (at least not in this message). I don't see why other people shouldn't suffer when I do :D

Here's the Program in Python

Well, perhaps the problem is that C++ is a 'bad' language? It certainly has a tarnished reputation in some corners. How about Python? That's a pretty friendly language:

import sys

# Read in several lists of floats, separated by a ",". So
# 1 2 3 , 4 5 6 turns into 2 lists of 3 elements.
def get_data():
    data = [[]]
    for i in sys.argv[1:]:
        if i == ',':
            data.append( [] )
        else:
            data[-1].append( float( i ) )
    return data

# Note that we may be dividing by zero.
def mean( list ):
    return sum( list ) / len( list )

# Print the biggest of the means.
def main():
    print( max( [ mean( i ) for i in get_data() ] ) )

if __name__ == "__main__":
    main()

It's a lot more compact. How does it work out?

steve% python3 mean.py 1 2 3 , 4 5 6
5.0

So far so good. Let's throw it some NaNs and infs.

steve% python3 mean.py 1 2 3 , nan
2.0
[~/tmp]
steve% python3 mean.py 1 2 3 , nan , inf
inf
[~/tmp]
steve% python3 mean.py 1 2 3 , nan , inf , nan
inf

Awesome, the NaNs get ignored and the results look stable! Let's try an empty list.

steve% python3 mean.py 1 2 3 , 
Traceback (most recent call last):
  File "mean.py", line 21, in <module>
    main()
  File "mean.py", line 18, in main
    print( max( [ mean( i ) for i in get_data() ] ) )
  File "mean.py", line 18, in <listcomp>
    print( max( [ mean( i ) for i in get_data() ] ) )
  File "mean.py", line 15, in mean
    return sum( list ) / len( list )
ZeroDivisionError: division by zero
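
The offending division is easy to reproduce on its own; here's a quick check, assuming nothing more than a stock Python 3 interpreter:

print( float( 'inf' ) )          # inf - Python happily represents infinity ...
print( float( 'inf' ) / 2.0 )    # inf
try:
    print( 1.0 / 0.0 )           # ... but float division by zero raises rather than returning it
except ZeroDivisionError as e:
    print( e )                   # float division by zero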

Python doesn't like dividing by zero - even though it supports ±inf and NaN, which is a little strange. So we need to decide whether to guard against empty lists getting through. What if we modified our program to filter out empty lists? In Python that's a cinch …

def get_data():
    data = [[]]
    for i in sys.argv[1:]:
        if i == ',':
            data.append( [] )
        else:
            data[-1].append( float( i ) )
    return [ i for i in data if i ]     # just change this

Let's try it out …

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in main
ValueError: max() arg is an empty sequence

Ah, in attempting to provide a guard against supplying a partial function (/) with data that doesn't meet its precondition, we violated the precondition of another partial function (max)! We had better guard against that …

def main():
    try:
        print( max( [ mean( i ) for i in get_data() ] ) )
    except ValueError:
        print( 'nan' )

Now this gives us a stable and meaningful answer (in fact exactly the answer we would have got for free with transreal arithmetic). But there's something quite disturbing about this code: the blanket trap of ValueError. A professional programmer should not be satisfied with this because:

  • The reader does not know which of max, mean, get_data might be the cause of a ValueError (if any!).
  • The author cannot be confident that ValueError won't be raised for reasons s/he did not expect.

In fact two rounds of refactoring are required to get this into a reasonable state: one to separate the printing from the computation of the result, and the other to narrow the try block so that it guards only the expression we expect to raise ValueError and nothing else. I won't torture you with every intermediate step, but the point is made: professional programming with exceptions is very demanding. And with such pervasive operations as these, we would really appreciate them being total.
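
For the record, here's roughly where those two rounds of refactoring leave us. This is only a sketch, reusing get_data and mean from above; the helper name safe_max_mean is mine:

# Sketch only: separate the computation (and its fallback) from the printing,
# and keep the try block so narrow that only max() can raise ValueError inside it.
def safe_max_mean( data ):
    means = [ mean( i ) for i in data ]
    try:
        return max( means )
    except ValueError:
        # Only possible cause here: every input list was empty, so `means` is empty.
        return float( 'nan' )

def main():
    print( safe_max_mean( get_data() ) )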

The other point to take away is that our program naturally converged on transreal arithmetic as soon as we started applying real-world concerns to it. This is not a coincidence - good housekeeping is a major real-world programming issue and that's transreal arithmetic's USP.

Here's the program in Java …

Wait, we're not finished.

import java.util.List;
import java.util.ArrayList;
 
class MaxMean {
 
    private static List< List< Double > > getData( String[] args ) {
        List< List< Double > > data = new ArrayList< List< Double > >();
        data.add( new ArrayList< Double >() );
        for ( String s : args ) {
            if ( ",".equals( s ) ) {
                data.add( new ArrayList< Double >() );
            } else if ( "inf".equals( s ) ) {
                data.get( data.size() - 1 ).add( Double.POSITIVE_INFINITY );
            } else {
                Double d = Double.parseDouble( s );
                data.get( data.size() - 1 ).add( d );
            }
        }
        return data;    
    }
 
    static double sum( List< Double > ds ) {
        double sofar = 0.0;
        for ( Double d : ds ) {
            sofar += d;
        }
        return sofar;
    }
 
    static double mean( List< Double > ds ) {
        return sum( ds ) / ds.size();
    }
 
    public static final void main( String[] args ) {
        List< List< Double > > data = getData( args );
        double max_sofar = Double.NEGATIVE_INFINITY;
        for ( List< Double > list : data ) {
            max_sofar = Math.max( max_sofar, mean( list ) );
        }
        System.out.println( max_sofar );
    }
}

Does it work?

steve% java MaxMean 1 2 3 , 4 5 6
5.0

And what does it do with empty sets?

steve% java MaxMean 
NaN

OK, what if we put an infinity in it?

steve% java MaxMean inf
Infinity

OK. What about combining an infinity with an empty list, whose mean is NaN (and the other way around)?

steve% java MaxMean inf ,
NaN
steve% java MaxMean , inf
NaN

The good news is that the Java program can be written more or less naturally and it generates stable results. But it also generates useless results if any of the lists is empty (which is why, in the 2008 revision, IEEE felt the need to introduce new max and min functions, maxNum and minNum, that discard NaNs).
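
For comparison, the 2008-style behaviour is simple to state: a NaN operand is treated as missing data. Here's a sketch of such a NaN-discarding max (in Python for brevity; the name max_num is mine):

import math

def max_num( a, b ):
    # NaN-discarding max in the spirit of IEEE 754-2008's maxNum:
    # a NaN operand is ignored, so NaN only comes back when both operands are NaN.
    if math.isnan( a ):
        return b
    if math.isnan( b ):
        return a
    return max( a, b )

Java's Math.max, by contrast, propagates the NaN, so a single empty list still poisons the final answer.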

That means we'll have to go around the loop, just the same as we did in Python - but for different reasons!

Putting It Together

So we have looked at the corner cases of a relatively simple arithmetic task in three of the world's most popular programming languages and found niggling and different issues in all three cases. Yet the perceived incremental business value of getting the code to work for these corner cases is (typically) negligible - and so programmers are motivated to make excuses for why they should ignore these problems.

Of course, computers pay no attention to these protestations! And sooner or later these corner cases combine to cause serious issues. And I tried to illustrate this with the example of feeding the output of one instance of a Python program into another. The corner case was marginal, ignored, and caused downstream costs.

I suggest that there are two possible morals the professional programmer can take away from this story. The standard interpretation is to learn the mathematics of floating point arithmetic, and all the library functions built on it, thoroughly, and to always prove that the preconditions are satisfied, conditioning the inputs as appropriate and raising exceptions where not. You can go onto StackOverflow and hear this repeated over and over again. Indeed that's exactly what I tell my team to do.
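
To make that advice concrete, following it to the letter turns even our little mean into something like the defensive sketch below (again in Python; checked_mean is an illustrative name of mine):

import math

def checked_mean( values ):
    # Defensive style: establish every precondition before touching the partial operations.
    if not values:
        raise ValueError( 'mean of an empty series is undefined' )
    if any( math.isnan( v ) for v in values ):
        raise ValueError( 'series contains NaN' )
    total = 0.0
    for v in values:
        total += v
    if math.isnan( total ) or math.isinf( total ):
        # The sum is not finite: it overflowed, or the series itself contained infinities.
        raise OverflowError( 'sum of series is not finite' )
    return total / len( values )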

There are several snags with this. The maths behind floating point arithmetic is not easy and the majority of programmers are not numerical analysts. And the library functions built on top are not standardised across languages - so it is a mountain of learning to climb. And programming with exceptions is not at all easy, because of clumsy syntax and semantics, poor multi-processing facilities and the in-your-face problem of maintaining runtime integrity in the face of interruptions.

Combine these with the pervasive nature of arithmetic, and being so relentlessly careful simply isn't a practical option. And with the crushing pressure on software businesses to bring costs down, this is only getting worse.

The alternative moral, of course, is that we should adopt a total model of arithmetic that doesn't have these ugly corner cases and never throws exceptions. The good news is that transreal arithmetic delivers exactly that, and without the kind of impractical performance impact that otherwise sensible approaches (such as interval arithmetic) bring with them. I think most people would objectively regard this as a worthwhile long term solution.
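
As a closing illustration of what 'total' means here: in transreal arithmetic 1/0 = ∞, -1/0 = -∞ and 0/0 = ϕ, so division never needs to raise. A minimal sketch for finite arguments, using a NULLITY stand-in of my own naming rather than a proper transfloat type:

NULLITY = float( 'nan' )   # stand-in for ϕ; a real transfloat type would keep it distinct from NaN

def trans_div( a, b ):
    # Total, transreal-style division of finite arguments: every input gets an answer,
    # so there is nothing for the caller to guard against.
    if b != 0.0:
        return a / b
    if a > 0.0:
        return float( 'inf' )
    if a < 0.0:
        return float( '-inf' )
    return NULLITY             # 0/0 = nullity

With division like that, mean and arithmeticMean become total, and the guards and exception handlers above simply fall away.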