Proof-of-concept for validating laws via transient variable lifetime
#1
Suppose we have the following program in C*:

Code:
#include <stdint.h>
#include <stdio.h>

typedef uint32_t myint;

/* Must be less than 100 and cannot ever equal 17 */
law : myint
{
   _ < 100;
   _ != 17;
};

/* Fibonacci sequence will satisfy both of those constraints, but how do we know? */
void fibonacci( void )
{
   uint32_t i;
   myint t0, t1;
   uint32_t tn;

   t0 = 0;
   t1 = 1;

   /* print the first two terms */
   fprintf(stdout, "Fibonacci series: %d, %d", t0, t1);

   /* print 3rd to 12th terms */
   for(i = 2; i < 12; ++i)
   {
      tn = t0 + t1;
      fprintf(stdout, ", %d", tn);
      t0 = t1;
      t1 = tn;
   }
}

Walking through the Fibonacci sequence, we know that if we limit ourselves to the first 12 terms, we never reach 100 (the 12th term is 89) and never hit 17. But how does the C* compiler work this out?

It evaluates the possible values of every law-governed variable at each point where it is modified, in an exhaustive, recursive fashion. This means that the cost of verification is proportional to the algorithmic complexity of the program being verified. The verifier first minimises the possible program space by folding in all constant values, which in the function above helps a great deal.
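As a rough sketch of the idea (plain C, and only an illustration of the check, not the compiler's internals), the verifier effectively walks every state the loop above can reach and tests both laws at each point where a myint is written:

Code:
#include <stdint.h>
#include <stdio.h>

/* The two laws attached to myint in the example above */
static int law_holds(uint32_t v)
{
   return (v < 100) && (v != 17);
}

int main(void)
{
   uint32_t t0 = 0, t1 = 1, tn, i;
   int ok = law_holds(t0) && law_holds(t1);   /* initial assignments */

   for (i = 2; i < 12 && ok; ++i)
   {
      tn = t0 + t1;
      t0 = t1;
      t1 = tn;
      ok = law_holds(t0) && law_holds(t1);    /* every write to a myint */
   }

   fprintf(stdout, ok ? "laws hold\n" : "laws violated\n");
   return 0;
}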

In cases where the output of the function depends on outside variables, the laws applied to the incoming parameters are assumed to hold, either directly or by marshalling, but beyond that the compiler assumes worst-case values for the type's size. With complex algorithms it often happens that the validity of a given combination of laws is not trivial to guarantee; for example, if a foreign n of type uint32_t is passed in, it may require a brute-force search to ensure that some other variable dependent on n never equals 17.
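As a hypothetical illustration in the same notation (the function and variable names here are mine), consider a foreign value arriving with no law attached:

Code:
law : myint
{
   _ < 100;
   _ != 17;
};

/* n arrives from outside with no law attached, so the compiler must
   assume any value a uint32_t can hold */
void process( uint32_t n )
{
   myint x;

   /* _ < 100 follows directly from the modulo, but without a search
      the compiler cannot rule out x == 17 here */
   x = n % 100;
}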

The default behaviour of the C* compiler in situations like these is to error out, asking the programmer to give it more certainty about the data it is dealing with. Practically speaking, this means creating narrower types with stricter laws. For instance, if you want to be sure a 40-bit integer never overflows via multiplication, you need to make sure the types multiplied to produce it have bit sizes that, summed together, do not exceed 40 bits. Like so:

Code:
#include <stdint.h>

typedef uint64_t outint;
typedef uint64_t term0;
typedef uint64_t term1;

law : outint
{
   _ <= 0xFFFFFFFFFF;   /* 40-bit maximum */
};

law : term0
{
   _ <= 0xFFFFFF;   /* 24-bit maximum */
};

law : term1
{
   _ <= 0xFFFF;   /* 16-bit maximum: 24 + 16 = 40 */
};

void mysubroutine( void )
{
   outint a;
   term0 b = /* ... */;
   term1 c = /* ... */;

   /* This is OK */
   a = b * c;
}

If the above code were modified to have laws that permit any valid addition or subtraction but not multiplication (that is, laws with only enough headroom for linear mixing, not quadratic), then a = b + c would still be valid, but the compiler would error out if it found a = b * c. The precautionary principle is in play.
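A hypothetical variant in the same notation, with the law on outint sized only for linear mixing:

Code:
law : outint
{
   _ <= 0x1FFFE;   /* just enough headroom for term0 + term1 */
};

law : term0
{
   _ <= 0xFFFF;
};

law : term1
{
   _ <= 0xFFFF;
};

void mysubroutine( void )
{
   outint a;
   term0 b = /* ... */;
   term1 c = /* ... */;

   a = b + c;   /* OK: 0xFFFF + 0xFFFF = 0x1FFFE still satisfies the law */
   a = b * c;   /* error: 0xFFFF * 0xFFFF can reach 0xFFFE0001 */
}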

However, it will be possible to put the compiler into a brute-force mode, potentially at great computational cost, in order to arrive definitively at an answer to that question. This is accomplished using a framework of satisfiability-solver programs, which produce a bitcode proof; once a solution is found, the programmer can save that proof, and its satisfiability can then be re-verified trivially.
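To give a feel for what that exhaustive mode has to establish, here is a plain-C brute-force check over a deliberately small 16-bit foreign input (the derived expression is an arbitrary stand-in of mine; a full uint32_t input would need 2^32 iterations, which is exactly where the solver framework earns its keep):

Code:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
   uint32_t n;

   /* Exhaustively test every value the foreign 16-bit input could take */
   for (n = 0; n <= UINT16_MAX; ++n)
   {
      uint32_t x = (n % 16u) * 6u;   /* a law-governed derived value */

      if (x >= 100u || x == 17u)
      {
         fprintf(stdout, "law violated for n = %u\n", n);
         return 1;
      }
   }

   fprintf(stdout, "laws hold for every 16-bit input\n");
   return 0;
}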
#2
Transient variable lifetime refers to the exhaustive graphing of data as it flows through its various names across all possible call graphs of a program. In a nutshell, we imagine a "variable" as a kind of ethereal object that "travels" around the program, being modified and passed on.

Above we discussed how a variable is validated by simulating the subroutine and all of its possible branches, and what is done when there isn't enough information to definitively enforce a law. Introducing the transient variable lifetime to this approach means that we transcend callsite boundaries within the total system to thoroughly simulate all subroutines in a program as one big meta-routine.

This means that we can get more information about possible states than is possible when marshalling without attached formal proofs. Data confined within a total system has a far smaller number of possible states; more precisely, the number of possible states is directly proportional to the number of points at which the data is modified. The larger the program, the longer it takes to validate, but validation does not scale exponentially in its own right; it merely follows the algorithmic complexity of the program being validated.
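As a hypothetical example in the same notation (the subroutine names are mine), whole-program analysis can validate an assignment that per-subroutine analysis cannot:

Code:
law : myint
{
   _ < 100;
   _ != 17;
};

static uint32_t source( void )
{
   return 42;   /* the only value this subroutine ever returns */
}

void consumer( void )
{
   myint x;

   /* Viewed in isolation, source() is just a foreign uint32_t and this
      assignment cannot be validated.  Following the transient variable
      lifetime across the callsite, the compiler sees that only the
      value 42 can ever arrive here, so both laws hold. */
   x = source();
}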