Parallel problem-solving

I’ve just put up Rexel (Remote Execution Environment for Linux) for download. Download the source tarball or the thesis from this website.

In short, Rexel allows the developer to write a single native-code C program that can execute on multiple computers concurrently. This is not the same as writing multiple programs which, via the clever passing of messages, allow all the programs to work on the same problem. Rexel allows a single C program to execute different parts of itself on remote computers.

One of the examples of Rexel usage is to run an unmodified Sqlite3 program within the Rexel environment. This allowed the Sqlite3 binary program to unknowingly perform replication when Rexel redistributed all the filesystem calls across a cluster. Whenever the program modified the file, those modifications were instantly replicated across the cluster.

Another example is the naive Lisp implementation within Rexel. This simple Lisp interpreter written to take advantage of Rexel facilities automatically offloads computations on to remote nodes (without the programmer specifying as such).

Anyway, download and play with it.

Remote Function Invocation

Full MSc thesis


Reference management in HTML, Part II

A while back I proposed a reference management scheme for HTML blog posts (incurring the wrath of the semantic web people, no doubt :)). I’ve updated it somewhat, making it easier to use. Now it’s easier than ever to ensure that your HTML content is well-referenced.

Usage is simple:

<script src="jsnote.js"></script>
<script src="references.js"></script>
<p>This is how we cite the first reference<cit>FirstRef</cit>.
We can cite the second reference like this<cit>SecondRef</cit>
and the third like so<cit>ThirdRef</cit> and the fourth reference
like this<cit>FourthRef</cit></p>
... more content goes here ... pages and pages if you want ...



The first javascript file that is included contains all the routines for making references work. The second javascript file contains a list of all the references used in the example article, namely  FirstRef, SecondRef, ThirdRef and FourthRef. All the text within <cite> tags are used as a lookup into the references.js file to find the reference and the generated list of references are placed within all the <bibliography> tags found (although why an article would need more than a single bibliography is beyond me).

The resulting article has all of the <cite>tags replaced with the reference number in square brackets, which is a link to the actual reference. The above example HTML code looks like this:

This is how we cite the first referenceFirstRef. We can cite the second reference like thisSecondRef and the third like soThirdRef and the fourth reference like thisFourthRef… more content goes here … pages and pages if you want …

The actual references appear at the end of the article (click on one, see what happens).

The function jsNote() starts all the javascript to make the reference-magic happen. Authors can now easily reference all their material in a thorough and complete manner without resorting to manually inserting links to their references.

You can download jsnote.js, the example references file references.js and the example article jsnote-example.js from the links below.

  • jsnote.js (Include this file in your article)
  • references.js (Use this file as a template and example for all your references)
  • jsnote-example.js (Small example of usage)
  • jsnote.pdf (The as-yet-unpublished paper. This link will be enabled once I publish the paper)





Array Bounds Checking in C

Most languages (Java, C#, Python, Javascript) prevent the programmer from going past the end of an array. This process, performed at runtime by the language implementation, is frequently called Bounds Checking.

Unfortunately in C there is no way for the programmer to determine the size of an array at runtime (not quite true, as I’ll show below) and so it becomes necessary for the programmer to keep track of the length of the array. In other languages, one can declare an array and then at runtime get the length of that array. For example, in Javascript the programmer can get the length of an array using the length property of the array:

for (var i=0; i<array.length; i++) {
   // Do something with array[i]

In C the programmer is required to keep track of the length of the array. Failure to keep array references to within the array result in segfaults, protection faults, etc. Most programmers define the array length somewhere.

#define ARRAY_LENGTH     (25)
for (size_t i=0; i<ARRAY_LENGTH; i++) {
   // Do something with array[i]

This becomes a problem if the array length changes and the programmer didn’t reflect that change in the #define ARRAY_LENGTH. A quick fix is to use the sizeof operator as follows:

int array[] = {10, 20, 30, 40, 50, 60, 70, 80};
for (size_t i=0; i<sizeof array / sizeof array[0]; i++) {
   // use array[i]

The evaluation of (sizeof array) is the total number of bytes that the array uses. The evaluation of (sizeof array[0]) is the number of bytes that array[0] uses. Dividing the total number of bytes that the array uses by the number of bytes that each element uses gives you the number of elements in the array. It works for all data types, including pointers, function pointers, structs, etc.

struct foo_t {
   int a; float b; char *c;
struct foo_t array[] = {
   {10, 12.12, "First Item"},
   {20, 23.34, "Second Item"},
   {30, 34.12, "Third item"},
   {40, 54.32, "Fourth Item"},
for (size_t i=0; i<sizeof array / sizeof array[0]; i++) {
   // Use array[i].a and array[i].b and array[i].c

The problem comes in with function calls, and passing your array as arguments to other functions. In C all arrays decay to pointers when passed as an argument. Thus the called function only gets a pointer to the first element. For example in the following snippet function func_foo() gets the array array as a pointer a foo_t struct, hence the sizeof trick I used above will not work:

struct foo_t {
   int a; float b; char *c;
struct foo_t array[] = {
   {10, 12.12, "First Item"},
   {20, 23.34, "Second Item"},
   {30, 34.12, "Third item"},
   {40, 54.32, "Fourth Item"},

void func_foo (struct foo_t array[])
   // (sizeof array) = the size of the pointer to a struct foo_t (4)
   // (sizeof array[0]) = the size of a struct foo_t (12)
   // Therefore the length will be calculated as (4 / 12), which
   // is zero using integer division. In this case, the loop won't
   // run at all!
   for (size_t i=0; i<sizeof array / sizeof array[0]; i++) {
      // Use array[i].a and array[i].b and array[i].c

The problem is that a parameter of the form (type name[]) is turned into (type *name). A common workaround is to simply have all your arrays terminated by a special character. If you have an array of integers use INT_MAX (from limits.h) as the special value to signify end of array. For all pointers use NULL (if NULL can legitimately appear in your array then use (NULL -1) instead). Thus your array iteration code will look like this.

int array[] = {10, 12, 13, 14, 15, INT_MAX};
for (size_t i=0; array[i]!=INT_MAX; i++) {
   // Use array[i]

Same with pointers of any type:

char *array[] = {
   "One", "Two", "Three", "Four", NULL,
for (size_t i=0; array[i]; i++) {
   // Use array[i]

It even works across function calls: you can pass this array to functions and not have to worry about passing along a separate length value. The function has to know how to check for end-of-array.

struct foo_t {
   int a; float b; char *c;
struct foo_t array[] = {
   {10, 12.12, "First Item"},
   {20, 23.34, "Second Item"},
   {30, 34.12, "Third item"},
   {40, 54.32, "Fourth Item"},
   {0, 0.0, NULL},

void func_foo (struct foo_t array[])
   // Because each element is a struct, the best convention to
   // represent end-of-array is to use all NULL or zero values
   // for all fields.
   for (size_t i=0; array[i].a && array[i].c; i++) {
      // Use array[i].a and array[i].b and array[i].c

Obviously for that last code snippet you can make your convention be “The last field is NULL”, or “All the fields are NULL or Zero”. Personally, I use NULL wherever possible.

The only drawback is that you cannot have an element of the array that corresponds to your end-of-array terminator.I haven’t encountered this problem yet though … since I started doing this about 15 years ago … so you should be safe using this convention.



































Reference Management in HTML

[EDIT – this article has been superceded by a newer one – click here to view it]

Writing well-referenced material is a chore. If you’re using LaTeX, then you’re in luck because it works with BibTeX to lessen the tedium. If you’re using a word-processor, you’re still in luck as you can use something like EndNote to manage your references. How about if you are writing a blog, or an HTML book, or contributing to a Wiki? Then, it seems, you’re out of luck.

Until now. Enter JSNote (Javascript Note). It works pretty much the same way that BibTeX does with LaTeX. You maintain a list of references in a separate file, and in your material you cite those references by means of calling a javascript function. For example;

// Filename /downloads/references.js
addRef ({name : "john",
year : "2012",
volume : "17",
author : "Charles Dodgeson",
title : "Obtuse Works of Mathematics in 'Alice in WonderLand'",
addRef ({name : "boehm_1",
year : "2006",
month: "May",
conference: "ICSE '06",
author : "Barry Boehm",
publisher: "ACM",
title : "A View of 20th and 21st Century Software Engineering",

And then, in your HTML you pull the jsnote.js script in, as well as your list of references and use them like so:

<script src="/downloads/jsnote.js"></script>
<script src="/downloads/references.js"></script>
<p>I now will reference Boehms article<script>cite ("boehm_1")</script>and then reference Johns article like this<script>cite ("john")</script></p>


The practical result is that the citations are all printed properly. The above example looks like this. Note that the list of references in reference.js works exactly like it would in BibTeX, namely that you can put the arguments in any order, or even leave some of them out if you don’t want them displayed. As long as each reference in the references list has a name, jsnote will be able to cite and display all the information you gave when adding it as a reference.

Literate Programming Is Not Dead Yet.

Get jsnote.js
Get a sample references file

Memory accounting routines for C

Managing memory is Hard. Very Hard. If you are programming in C you have a few options, notably valgrind. However, if you find yourself in the situation that I find myself in now in which valgrind cannot work, then you need something more.

For example, I’m writing code in ECL to call out to a few C libraries (that I’ve written myself). Valgrind is incompatible with ECL. In fact, it might just be incompatible with any GC-language. Whatever. I still need to keep accounting of my memory during testing.

Hence, I wrote this little replacement/wrapper for malloc. It stores all allocated memory as a list of blocks internally. It stores the filename, function and line number and the amount of memory that was allocated by each invocation of xmalloc. Usage is fairly standard:

int *foo = XMALLOC (sizeof *foo);


Easy? You can look through the header for all the options (how to dump the list of blocks allocated, some general statistics on allocation, etc). Also, this can be turned off while the program is running, and normal malloc() is then used.


Download the the xmalloc library

iBurst USB Drivers for Linux

Got iBurst yesterday, along with a USB modem. The salesperson assured me that the modem works under Linux. Well, chalk one up to the forces of sales and marketing, because the modem definitely did not work, and the enclosed CD definitely did not have drivers for Linux. I downloaded the open source drivers (ibdriver-1.3.5) from sourceforge, but the compilation failed.

So, having “fixed” the compilation problems by means of removing the pcmcia functionality, I offer the updated drivers to the world. This iBurst USB modem driver will work under Ubuntu, Debian, Mint, Redhat, CentOS, slackware or anything else that has a linux kernel 2.6 and above (including kernel 3.0). Download it here (choose your kernel).

Note that this is self-extracting script, so before running it make sure that it has execute permissions set. Please let me know if this does not work by emailing me the output of the executable. If it does work, then you can thank me in your will (of course, if you don’t own much, then just leave a comment here thanking me:)

Download iBurst USB Modem Linux driver (kernel v2.6.x)

Download iBurst USB Modem Linux driver (kernel v3.0.0)

Download iBurst USB Modem Linux driver (kernel v3.2.0)

Writing fiction is for Dummies

I’ve been writing short works of fiction (not on this blog, of course, everything here is as fact-based as I can muster), and I’ve now got a full six short stories up on various sites.

You can get a nice overview of the short stories over here at smashwords. Feel free to read, review and rate. They are free, after all, although this state of affairs is not going to last if I can manage to keep up the pace for another four  months.

Why four months? Well, in about four months I figure I’ll have enough to put together a collection into a single book, which will then go on sale. At the moment though, if you like science-fiction, OR, wicked mind-games, OR short humour, OR Zombie apocalypse tales, then head on over to the smashwords site, and read, review, rate.

I’m open to suggestions, by the way, so feel free to email me about any of the short stories.

Baby Sex-selection at Conception

Everyone knows how to make a baby. When a mummy and a daddy love each other very much, they make a baby. Well, thats what we tell the kids anyway. There is more to that, however.

The ovum is released  roughly 14 days into the womens monthly cycle. The sperm cells can survive for between 3 and 5 days. The ovum will be viable for only 24 hours (or less). Hence, intercourse must occur before ovulation, and ovulation must occur before the sperm cells all die.

Hence the fertile period  (the best time to make love in order to conceive a child) is during those 4- 6 days (sperm cells will survive for 3 – 5 days, and you can still make love on the 4th – 6th day when the ovum is descending the fallopian tube). It’s pretty easy to mark these days off on a calendar.

What most woman don’t realise is that not all fertile days are equally likely to result in a boy or a girl baby. Making love on certain days is more likely to result in a boy baby, and making love on other days is more likely to result in a girl baby. This is due to two particular characteristics of the sperm cells. For a more thorough explanation please see this document over here.

So, lets say you want a girl baby – how do you know which day will be the best to conceive a girl? Well, you can either use the Conception Fertility Calendar which will mark off the days that are better for conceiving a girl, and mark off those days that are better for conceiving a boy, or you can do the calculation manually and mark it off on a calendar yourself. Let me demonstrate.

First, figure out when your ovulation is due to occur. This is generally 14 days from the end of your cycle, so measure your cycles to get an accurate number (not all woman have a 28 day cycle), add that to the start date of your period and then subtract 14 days from that.

For example, if your cycle length is 26 days, and your period starts on the 2nd of the month, then add 26 to the 2nd, and you get 28th, then subtract 14 days from that, and you get the 14th of the month.

Now that you know when your ovulation will occur, you need to figure out when the fertile period starts. The start of the fertile period is simply the day of ovulation minus the lifespan of the sperm cells. So, if we assume that sperm cells will last for 3 days, then the start of the fertile period is … “the 14th (the day of ovulation) minus 3 days (the lifespan of the sperm cells)”, which works out to the 11th of the month.

This means that if you made love on the 11th, with a sperm lifespan of 3 days, they’ll all be dead  at the end of the day on the 13th, there will be no more sperm cells left on the 14th when the ovum is viable, therefore we have to add one more day so that the start of the fertile period works out to be the 12th.

To determine which of the days are better for a girl (or a boy) is a little more complicated:

  • Do integer division on the fertile period by 2. In our case it is 4 (the total number of days you are fertile for) divided by 2, which is 2 with no remainder. Hence the first two days of the fertile period are better for a girl and the final two days are better for a boy.
  • The above won’t work if the fertile period is an odd number – for example if the fertile period is 5, the results of the integer division is 2 with a remainder of 0.5. In this case mark the first two days of your calendar as fertile for a girl, the day after those as fertile for both boy or girl and the final two days as fertile for a boy.
  • To determine the sperm lifespan, or to work out more accurately when ovulation will occur, I have some tips that you can follow.

Is all of that too much work to get a baby of the sex you want? I expect it is, but good news is here in the form of an Android application I wrote called (unsurprisingly) “The Conception Helper”.

For the price of a happy meal ($3.99) you can simply purchase an application for your cellphone that will do all this for you (and do it more accurately as well). I’ve written the Conception Helper for Android phones and tablets. Simply use it as directed and it will display a calendar for you each month with the relevant days indicated on it.

Further, even if you do not start it at all (after setting your start of period), when the relevant days come along,  it will warn you. No need to trust your memory to look at it at all. See the Conception Website  for more details.

And, there is also no need to take my word for it – there is a trial version that will work for 30 days – after 30 days you’ll have to pony up the cash to buy a copy, or simply redownload it to your phone again.

Download it, try it before you buy it, and if you are happy with it, then and only then spend your money on it. In the meantime, you can read the manual and see just how effective Conception can be.

Google with egg on their face?

The Google Android Developer Challenge/Africa is well underway. App submission deadline was 1-July-2011, semi-finalists will be announced 15-July-2011. Winners will be announced 12-September-2011.

The point of this, presumably, is to encourage African developers to develop for Android. The biggest motivation to do something is money, so participants are allowed to enter paid apps into the challange (under “Eligibility”). Sadly, it seems that the organisers of ADC spend very little time with the Google Checkout team, because Google checkout does not support many of the countries that are eligible for ADC/Africa, and there is no other way to get paid for apps in Google marketplace.

While I know that, sometimes, in large companies its a case of “the right hand not knowing what the left hand is doing”, in this case it seems even the left hand has only a very hazy notion as well. I’m working on porting my application to iPhone. Of course, the longer it takes for Google to get this working, the larger the number of ADC apps that get ported. It could be a case of Google simply giving app developers motivation to come up with a good concept that then goes to iPhone because the concept could not be sold for Android.

Heads up to Google: those devs that came up with a new, nifty and/or novel app will certainly not make it freely available for Android (not even as trial) if they can make it for money for iPhone. Making a trial version for Android simply means putting a good idea out there that some iPhone dev is going to implement on iPhone. To keep the first-mover advantage of a novel concept, the dev has to be the first one with the app. If it is so novel and nice, people would buy it from elsewhere.

Of course, its always possible that Apple also won’t allow devs from South Africa to sell applications either – with Apple you don’t get this information until after you pay their $99 fee, so if anyone knows if South African developers are allowed paid apps in Apples App Store, let me know.

Dynamic arrays in C

(NOTE: This code has been rolled up into the open-source libxc – see While you can still use this code as is, it’s recommended to use libxc instead which has the same functionality and more).

Dynamic arrays are remarkably easy to implement in plain C with two caveats:
1. No destructors, so the user of the array has to free the array elements one item at a time (not such a problem!)
2. No copy constructors, so the user of the array has to ensure that structures with pointers are copied correctly.

I implemented the dynamic array by simply storing an array of void pointers, and only storing pointers to elements (a pointer to void can point to anything), hence this lets me have a dynamic array that can store elements of anything, even elements of different types (although this is a bad idea – unreadable code is not the goal!). You can download the code that implements this as a library.

The “array” is simply a struct like so:

struct rsarray_t {
   size_t alloc_len;   /* Store the number of allocations */
   size_t used_len;   /* Store the of cells we have thus far used */
   void **data;        /* The actual array of void pointers */

There are no new_array or create_array functions that the user must call. All the user has to call to append a pointer onto the end of the array is rsarray_ins_tail(array, element), where element is the new pointer to be appended to array, which is of type struct rsarray_t*.

If array is NULL, then an array is created and returned with element as the first and only element of the array (which is why there is no need for an “array_create” function). If array is not NULL, then the element is appended to the array using realloc if there is not enough space to store another element. Usage is dead-easy:

struct rsarray_t *array;
int a =0, b = 1, c = 2;
array = rsarray_ins_tail (NULL, &a);
array = rsarray_ins_tail (array, &b);
array = rsarray_ins_tail (array, &c);
/* array now contains pointers to a, b and c */

You may not always want to store many references to the variable, though. For example, at this point in the code we may need to modify the values of a, b and c and we don’t want those modifications to take effect on the values already stored in the array. To this end, I provide a function rsa_dup, which will make a duplicate of any object passed to it (with the size of the object specified by the user). The above code can then be written as:

int a = 0, b = 1, c = 2;
array = rsarray_ins_tail (NULL, rsa_dup (&a, sizeof a));
array = rsarray_ins_tail (array, rsa_dup (&b, sizeof b));
array = rsarray_ins_tail (array, rsa_dup (&c, sizeof c));
/* array now contains copies of a, b and c */
a = 42; /* Element 0 of array is still 0 */
/* Change the value of b */
*(int *)array->array[1] = 43;

The above is still a little dirty. The duplication of the variable in the insertion is bound to lead to problems, and the verbose semantics of accessing an element (seen in the last line) is simply unreadable. They can both be fixed with macros:

/* unsafe macro, 'x' should not have side-effects */
#define RSA_INS_TAIL(a,x) \
   rsarray_ins_tail (a, rsa_dup (&x, sizeof x))
#define RSA_INDEX(a,i,t)   \
   *(t *)a->data[i]
int a = 0, b = 1, c = 2;
array = RSA_INS_TAIL (NULL, a);
array = RSA_INS_TAIL (array, b);
array = RSA_INS_TAIL (array, c);
/* array now contains copies of a, b and c */
a = 42; /* Element 0 of array is still 0 */
/* Access the second element i.e. array[1] */
RSA_INDEX (array, 1, int) = 43;

All well and good, how do we iterate? Easy, again.

/* unsafe macro, 'x' should not have side-effects */
#define RSA_INS_TAIL(a,x) \
   rsarray_ins_tail (a, rsa_dup (&x, sizeof x))
#define RSA_INDEX(a,i,t)   \
   *(t *)a->data[i]
#define RSA_LENGTH(a)    \
struct rsarray_t *array = NULL;
int i;
/* Insert ten elements into array */
for (i=0; i<10; i++) {
   array = RSA_INS_TAIL (array, i);
/* Change all the values stored in the array */
for (i=0; i<RSA_LENGTH; i++) {
    RSA_INDEX (array, i, int) = 42 + i;

Freeing is as simple as freeing each element (because you are certain that there are no other references to it – the memory was allocated when we appended elements using rsa_dup, so no other code should be sharing the pointer), so we can simply iterate over the array and free each of the void*. In fact, since iteration over the array is a common occurrence, we can just go ahead and create a function to do it for us, so that we never have to write the actual loop:

void rsarray_iterate (rsarray_t *a, void (*f) (void *))
   size_t i;
   for (i=0; iused_len; i++) {
      f (a->data[i]);
struct rsarray_t * array = NULL;
[... here we append to the array ...]
/* Now we free each element */
rsarray_iterate (array, free);
/* Free the array that held all the elements */
free (array->array);
/* Free the structure that held the array */
free (array);

Because this isn’t rare either, we can put the above 3 lines into a function call (for example) rsarray_free (). Of course, once you’ve got that, you can use the resizable array to implement a host of other data structures. You can download the code that does this over here.

It was written about 8 years ago, and I’ve been using it every since (dynamic arrays are more useful than you think!). I go a lot further in the download above, as the download is actually written as a library, and uses the dynamic array as a basis to implement growable arrays, stacks and queues. Check the download for example usage, further avenues to explore, for example creating a ring-buffer, or implementing copy-construction so that composite datatypes consisting of pointers are sensibly copied when an insertion is made.