Friday, December 18, 2009

Revised proposal

I. Problem statement


Think back to every time you've been knee deep in C with no way to test your code but to (forbid the thought) write more C! We've all been there: stub testing requires, at the very least, a cocktail of copy-and-paste and a pile of printf calls. For even tighter control, you may even reach for GDB. But why should we go through the same routine every time we create a new C module? While not particularly hard to write, a new test program for each module creates a lot of redundancy. We propose a streamlined way of reproducing the following code:


#include <stdio.h>

/* fn is the function under test, defined elsewhere in the module. */
int main(int argc, char **argv) {
    printf("Testing fn\n");
    printf("Input: 1 Output should be: 3 Output is: %d\n", fn(1));
    printf("Input: 4 Output should be: 5 Output is: %d\n", fn(4));
    printf("Input: -34234 Output should be: 0 Output is: %d\n", fn(-34234));
    ...
    return 0;
}


in a syntax that is both simple and intuitive. Whether you are testing your own C functions or making sure that someone else's functions embedded in your code work as expected, being able to run such tests on the fly is very handy.



II. Existing Solution


A similar solution exists in the Python Doctest module. With it, we can specify inputs to a given module's functions, along with the expected outputs. This is done by providing a multiline string (the docstring) directly after the function's def line, in which we give inputs and the corresponding outputs, line by line. A small example would be:


def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000L
    >>> factorial(30L)
    265252859812191058636308480000000L
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0
    """
    (function def...)


Here, the tests are included in the function's docstring. The triple chevron (>>>) marks a line of input in Python syntax; the first test is a list comprehension using the factorial function, and the fourth is a simple call to the factorial function with the argument 30 given as a long integer. Each test input is followed by the expected output; even raised exceptions can be tested. When we execute the module, no output is given if the tests were successful (i.e. they match the expected output on the following line), unless the -v option is given by the caller.


The Doctest module is responsible for parsing the docstring, and the Python interpreter executes what is found. Each test runs against a shallow copy of the tested module's global variables, so as not to influence the other tests.
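The split between parsing and execution can be seen directly in the doctest API. The sketch below is only an illustration: it is written for Python 3 (unlike the Python 2 listing above), uses a trimmed-down factorial of our own, and the helper name run_docstring is hypothetical. DocTestParser and DocTestRunner are the standard library classes that do the two halves of the work.

```python
import doctest


def factorial(n):
    """Return n! for an integer n >= 0 (illustrative stand-in).

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0
    """
    if n < 0:
        raise ValueError("n must be >= 0")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def run_docstring(func):
    """Parse func's docstring into examples, then execute them."""
    parser = doctest.DocTestParser()
    # The DocTest object keeps its own copy of this dictionary,
    # so the examples cannot modify the caller's globals.
    globs = {func.__name__: func}
    test = parser.get_doctest(func.__doc__, globs, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring(factorial)
print(results.failed, results.attempted)  # prints "0 2"
```

With verbose=False nothing is printed for passing examples, which mirrors the silent-by-default behaviour described above.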

An issue with Doctest is that it does not currently recognize doctests from Python extension modules. In its inner workings, the inspect.isfunction() check that Doctest relies on returns False for an actual function that is defined in an extension module, while inspect.isbuiltin() returns True, even though the function is not a builtin. There are also odd corner cases, such as expected output that begins with an ellipsis: Doctest reads a leading ... as a continuation prompt rather than as output. For example, the following test would incorrectly pass:


>>> print 'Hello'

... print 'World!'

Hello


Aside from these minor shortcomings, this module is a very nice built-in feature whose approach we would like to adapt for the C language.


There also exist unit testing frameworks for C, such as CUnit, that are a little more involved than what we have planned; we aim to provide a quick and simple way to test our C code without the hassle of writing more.


III. Features


  • Abstraction

      • Tests/Testing commands stored as variables

      • Range Values – The ability to have a single command expand into any number of commands

      • Sets of values

  • Test Logs – A log created containing the results of the testing commands

  • Testing Options

      • Verbose versus silent output

  • Testing Constraints – Flags thrown if testing conditions are met


The testing language is modeled after common scripting languages, where operations are performed with as few keystrokes as possible. The first example of C code in this document could possibly be rewritten as:


fn(1) -> 3;

fn(4) -> 5;

fn(-34234) -> 0;

fn("hello")->"world!";


where the specified output is checked against the actual output. Or perhaps they can all be combined into one line using sets of inputs:


fn({1,4,-34234}) -> {3.67,5,0};


Maybe we would like to test a range of integers with a single testing command:


fn(-10:10);


where, with no expected output specified, the results are simply printed. The objects being operated on in the language are the C functions that we are testing, while the operators are the inputs we specify and the options for how the functions are tested. The above tests are expressions, so they can be used as such:


import "emps.c";

include "emps.h";


if ( register_emp("Tom", 986543210, "Park Place")->NULL ) {

    print "Error, failed to register employee.";

} else {

    retrieve_emp(986543210)->"Name: Tom Office: Park Place";

}


register_emp("Test", 986543211 : 986543220, "Test");

print_emps_from(98654324);



Abstractions will be somewhat like defining a function, except that in this case it is a test. A test can match some function call to the output and also execute certain actions in the target language, such as creating an object or assigning a variable some value to use as a test argument. Test logs are specified for the benefit of the programmer. Constraints let the programmer easily see whether certain conditions on the output hold whenever given conditions on the input are met, such as whether the output falls within a certain range, or whether an impossible value is retrieved:


import "emps.c";

use char* register_emp(char* emp_name, int emp_id, char* office);

use int fact(int n);


assume n in fact >= 0 then output > 1;

assume emp_name in register_emp == "" then output == NULL;


fact(0:10); #fails constraint

register_emp("", 0,""); # passes constraint


Now, each input to the fact function that satisfies the given condition will have its output tested against the given constraint. If the first constraint is not met (it requires that any input to fact greater than or equal to 0 yield a value greater than 1), the programmer will simply be notified; here fact(0) is 1, so the range test trips it. Similarly, the second constraint fails when register_emp is given an empty string as emp_name but returns something other than NULL.
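One way the constraint check could work is sketched below in Python, the language planned for the interpreter. Everything here is an assumption made for illustration: the CONSTRAINTS table layout, the check_call helper, and the pure-Python fact that stands in for the wrapped C function (a real wrapped function may not expose its parameter names the way a Python function does).

```python
import inspect
import operator

# Comparison operators the "assume" statement may use.
OPS = {">": operator.gt, ">=": operator.ge, "==": operator.eq,
       "<": operator.lt, "<=": operator.le, "!=": operator.ne}

# assume n in fact >= 0 then output > 1;
# stored as (parameter, input op, input value, output op, output value).
CONSTRAINTS = {
    "fact": [("n", ">=", 0, ">", 1)],
}


def fact(n):
    """Pure-Python stand-in for the C function under test."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def check_call(func, *args):
    """Call func and return (output, list of violated constraints)."""
    output = func(*args)
    bound = inspect.signature(func).bind(*args).arguments
    violated = []
    for param, in_op, in_val, out_op, out_val in CONSTRAINTS.get(func.__name__, []):
        applies = OPS[in_op](bound[param], in_val)
        if applies and not OPS[out_op](output, out_val):
            violated.append("output %s %s" % (out_op, out_val))
    return output, violated


# fact(0:10) from the script above: which inputs trip the constraint?
failing_inputs = [n for n in range(0, 11) if check_call(fact, n)[1]]
print(failing_inputs)  # prints "[0, 1]"
```

Keying the table by function name keeps the lookup for each call to a single dictionary access.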


IV. Implementation


We plan to use our Python parser generator, given the specified grammar of our language, to parse the test script. Alternatively, we could start over and write a new parser in the language we will be examining (C). A C parser would be faster, and not too complex given our simple grammar, but we would lose portability. Besides, we have already developed the tools for the job, and keeping each module of our code (parser, interpreter, debugger) in the same language is preferable.


Python makes several features of the testing language easy to implement. Ranges and groups of inputs, along with groups of expected outputs, can be stored and checked with Python's mapping functions and dictionaries. Constraints can be quickly looked up from dictionaries as well.
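A minimal sketch of that expansion follows. The Range class, expand, and run_test are hypothetical names, fact is a Python stand-in for the wrapped C function, and two points are assumptions: ranges are treated as inclusive (as fact(3:5) -> {6, 24, 120} in the sample programs suggests), and several ranged arguments are combined as a cross product.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Range:
    """The script's lo:hi range, assumed inclusive on both ends."""
    lo: int
    hi: int


def expand(arg):
    """Turn one script argument into the list of values it denotes."""
    if isinstance(arg, Range):
        return list(range(arg.lo, arg.hi + 1))
    if isinstance(arg, (list, tuple)):   # a {a, b, c} group
        return list(arg)
    return [arg]                         # a plain value


def run_test(func, args, expected=None):
    """Run func over every expanded call; compare if outputs were given."""
    calls = list(product(*(expand(a) for a in args)))
    results = {call: func(*call) for call in calls}
    if expected is None:
        return results, None             # print-only test
    wanted = expand(expected)
    if len(wanted) != len(calls):
        raise ValueError("expected %d outputs, got %d" % (len(calls), len(wanted)))
    failures = {call: (results[call], want)
                for call, want in zip(calls, wanted)
                if results[call] != want}
    return results, failures


def fact(n):
    """Pure-Python stand-in for the C function under test."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


# fact(3:5) -> {6, 24, 120};
results, failures = run_test(fact, [Range(3, 5)], [6, 24, 120])
print(results)   # prints "{(3,): 6, (4,): 24, (5,): 120}"
print(failures)  # prints "{}"
```

Storing results in a dictionary keyed by the argument tuple makes it straightforward to report exactly which expanded call failed.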


After we parse the input, we will use an interpreter written in Python to execute the generated AST. We could alternatively emit bytecode as the representation of the parsed input, but since our grammar is simple and unambiguous, it lends itself to interpreting the AST directly.
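To make the idea of interpreting the AST concrete, here is a small tree-walking sketch. The node classes (Call, Expect, Print, If) and the Interpreter class are hypothetical and are not the output of our parser generator; fact is again a Python stand-in for a wrapped C function.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Call:            # fn(args)
    name: str
    args: list


@dataclass
class Expect:          # call -> expected
    call: Call
    expected: Any


@dataclass
class Print:           # print "text";
    text: str


@dataclass
class If:              # if (cond) { then } else { orelse }
    cond: Any
    then: list
    orelse: list


class Interpreter:
    def __init__(self, functions):
        self.functions = functions   # name -> callable under test
        self.log = []                # one entry per evaluated test

    def run(self, program):
        for node in program:
            self.eval(node)

    def eval(self, node):
        # Dispatch on the node's class name, e.g. eval_Expect.
        return getattr(self, "eval_" + type(node).__name__)(node)

    def eval_Call(self, node):
        return self.functions[node.name](*node.args)

    def eval_Expect(self, node):
        got = self.eval(node.call)
        ok = got == node.expected
        self.log.append((node.call.name, tuple(node.call.args), got, ok))
        return ok

    def eval_Print(self, node):
        print(node.text)

    def eval_If(self, node):
        branch = node.then if self.eval(node.cond) else node.orelse
        for child in branch:
            self.eval(child)


def fact(n):
    """Pure-Python stand-in for the C function under test."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


# if (fact(3)->6) { fact(4)->24; } else { print "fact is broken"; }
program = [If(Expect(Call("fact", [3]), 6),
              [Expect(Call("fact", [4]), 24)],
              [Print("fact is broken")])]
interp = Interpreter({"fact": fact})
interp.run(program)
print(interp.log)  # prints "[('fact', (3,), 6, True), ('fact', (4,), 24, True)]"
```

Because a test evaluates to a truth value, the same eval_Expect method serves both standalone tests and tests used as if conditions.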


We will use SWIG to create an interface for a given C file to run in our Python environment. The user will specify the C file using the import statement, followed by a declaration of any C functions she chooses to use:


import "database.c";

use int* init_bloom_filter();

use void* get_field(char* table, char* field);

...


Alternatively, the user may specify a header file where these functions are already declared:


import "database.c";

include "database.h";

...

We will automatically create the interface and setup files necessary to use SWIG from these declarations. Once we run the setup files, we are free to import the wrapped C file as a module into our Python interpreter and call its functions as if they were Python functions. This way, our entire existing tool set and all new additions stay in the same language, and all of our debugging can be done in PDB.
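As a rough sketch of the first of those generated files, the function below builds the text of a SWIG interface file from the use declarations. The helper name make_interface is hypothetical, and it assumes the declarations have already been stripped of the use keyword by the parser; the layout (a %module line, a %{ ... %} block copied verbatim into the wrapper, then the declarations to wrap) follows the basic form of a SWIG interface file. Generating the setup file is not shown.

```python
import os


def make_interface(c_file, declarations):
    """Return SWIG interface text for the functions declared with 'use'."""
    # Module name is taken from the C file name: "database.c" -> "database".
    module = os.path.splitext(os.path.basename(c_file))[0]
    decls = "\n".join("extern " + d.strip().rstrip(";") + ";"
                      for d in declarations)
    return "%module {m}\n\n%{{\n{d}\n%}}\n\n{d}\n".format(m=module, d=decls)


interface_text = make_interface(
    "database.c",
    ["int* init_bloom_filter()",
     "void* get_field(char* table, char* field)"],
)
print(interface_text)
```

The declarations appear twice on purpose: once inside %{ ... %} so the generated wrapper compiles against them, and once outside so SWIG knows which functions to expose.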


SWIG affords us the ability to declare arbitrary C code for use in testing; essential data types, such as pointers, structs, and arrays, could be created inside the script for the sole purpose of testing. If we wanted to write a helper that walks a linked list and prints its values, it could be done; we will not, however, delve into this aspect of testing. Given more time we would like to support the aforementioned C types, but for now simple data types such as ints, floats, and char* (SWIG takes care of passing string values between Python and C) will have to do.

Friday, December 11, 2009

Sample program 3

import "test5.o";

%constraint listSize != 0 %exec %print "The size of the list is zero";
%constraint listSize >= 0 %exec %print "IMPOSSIBLE VALUE:The size of the list is negative!";
%constraint listSize != LIST_MAX %exec %print "List max reached.";

listAdd("Tom");
listAdd("Dick");
listAdd("Harry");

listRemove("Peter");
listRemove("Harry");
listRemove("Tom");
listRemove("Harry");

Sample program 2

import "test2.o";

#store a test file in a variable
t0 = file "test0.o";
#store a test block in a variable
t1 = test(x) {
    #some other test
    %print "Testing "+$x+" function.";
    $x(1)->1;
    $x(2)->3;
    $x(5)->7;
}
#specify the verbose and logfile switches
//v,llogfile

if(foo(5)->3) {
#eval t0
$t0;
}
else {
#eval $t1
$t1(x);
$t1(y);
}

//-v,-l

Sample program 1

import "fact.o";

######This is a comment


#This will call the fact function
#and test the output against 6

fact(3)->6;

fact(4) -> 24;

# : is the range operation
# {} is a group of expected outputs

fact(3:5) -> {6, 24, 120};

# no expected output just prints the result
fact(3:5);

# print messages to stdout
print "A message to stdout!";

# Grouped inputs along with grouped outputs
fact({2,4}) -> {2, 24};

Friday, November 20, 2009

I. Problem statement

Think back to every time you've been knee deep in C with no way to test your code but to (forbid the thought) write more C! We've all been there: stub testing requires, at the very least, a cocktail of copy-and-paste and a pile of printf calls. For even tighter control, you may even reach for GDB. But why should we go through the same routine every time we create a new C module? While not particularly hard to write, a new test program for each module creates a lot of redundancy. We propose a streamlined way of reproducing the following code:

#include <stdio.h>

/* fn is the function under test, defined elsewhere in the module. */
int main(int argc, char **argv) {
    printf("Testing fn\n");
    printf("Input: 1 Output should be: 3 Output is: %d\n", fn(1));
    printf("Input: 4 Output should be: 5 Output is: %d\n", fn(4));
    printf("Input: -34234 Output should be: 0 Output is: %d\n", fn(-34234));
    ...
    return 0;
}

in a syntax that is both simple and intuitive. Whether you are testing your own C functions or making sure that someone else's functions embedded in your code work as expected, being able to run such tests on the fly is very handy.


II. Existing Solution

A similar solution exists in the Python Doctest module. With it, we can specify inputs to a given module's functions, along with the expected outputs. This is done by providing a multiline string (the docstring) directly after the function's def line, in which we give inputs and the corresponding outputs, line by line. A small example would be:

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000L
    >>> factorial(30L)
    265252859812191058636308480000000L
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0
    """
    (function def...)

Here, the tests are included in the function's docstring. The triple chevron (>>>) marks a line of input in Python syntax; the first test is a list comprehension using the factorial function, and the fourth is a simple call to the factorial function with the argument 30 given as a long integer. Each test input is followed by the expected output; even raised exceptions can be tested. When we execute the module, no output is given if the tests were successful (i.e. they match the expected output on the following line), unless the -v option is given by the caller.

The Doctest module is responsible for parsing the docstring, and the Python interpreter executes what is found. Each test runs against a shallow copy of the tested module's global variables, so as not to influence the other tests.

An issue with Doctest is that it does not currently recognize doctests from Python extension modules. In its inner workings, the inspect.isfunction() check that Doctest relies on returns False for an actual function that is defined in an extension module, while inspect.isbuiltin() returns True, even though the function is not a builtin. There are also odd corner cases, such as expected output that begins with an ellipsis: Doctest reads a leading ... as a continuation prompt rather than as output. For example, the following test would incorrectly pass:

>>> print 'Hello'

... print 'World!'

Hello

Aside from these minor shortcomings, this module is a very nice built-in feature whose approach we would like to adapt for the C language.


III. Features

  • Abstraction

      • Tests/Testing commands stored as variables

      • Range Values – The ability to have a single command expand into any number of commands

  • Test Logs – A log created containing the results of the testing commands

  • Testing Options

      • Repetition – The ability to run tests a given number of times

      • Log Options – Verbose versus silent output

  • Testing Constraints – Flags thrown if testing conditions are met

The testing language is modeled after common scripting languages, where operations are performed with as few keystrokes as possible. The first example of C code in this document could possibly be rewritten as:

fn(1) -> 3;

fn(4) -> 5;

fn(-34234) -> 0;

where the specified output is checked against the actual output. Or perhaps they can all be combined into one line:

fn({1,4,-34234}) -> {3,5,0};

Maybe we would like to test a range of values with a single testing command:

fn(-10:10);

where, with no expected output specified, the results are simply printed. The objects being operated on in the language are the C functions that we are testing, while the operators are the inputs we specify and the options for how the functions are tested.

Abstractions will be somewhat like defining a function, except that in this case it is a test. A test can match some function call to the output and also execute certain actions in the target language, such as creating an object or assigning a variable some value to use as a test argument. Both the ability to specify repetitions for tests and the testing logs are for the benefit of the programmer. Constraints let the programmer easily see whether certain conditions on the output are met, such as whether the output falls within a certain range, or whether an impossible value is retrieved:

%constraint fn(3) < 56;

%constraint fn > 0 %exec $test0;

Now, each output of the fn function will be tested against the preceding constraints. If the first constraint is not met (it requires that fn, given an input of 3, stay below 56), the programmer will simply be notified. If the second constraint fails (any of fn's return values is not greater than 0), test0 will be executed.


IV. Implementation

We plan to use our Python parser generator, given the specified grammar of our language, to parse the test script. Alternatively, we could start over and write a new parser in the language we will be examining (C). A C parser would be faster, and not too complex given our simple grammar, but we would lose portability. Besides, we have already developed the tools for the job, and keeping each module of our code (parser, interpreter, debugger) in the same language is preferable.

Python makes several features of the testing language easy to implement. Ranges and groups of inputs, along with groups of expected outputs, can be stored and checked with Python's mapping functions and dictionaries. Constraints can be quickly looked up from dictionaries as well.

After we parse the input, we will use an interpreter written in Python to execute the generated AST. We could alternatively emit bytecode as the representation of the parsed input, but since our grammar is simple and unambiguous, it lends itself to interpreting the AST directly.

We can take advantage of an interface generator, like SWIG, to create an interface that lets the given C code run in our Python environment. This way, our entire existing tool set and all new additions stay in the same language, and all of our debugging can be done in PDB.


V. Resources

http://docs.python.org/library/doctest.html

http://bugs.python.org/issue3158

http://bemusement.org/diary/2008/October/24/more-doctest-problems

http://www.swig.org/