Wednesday, March 31, 2010

Side effects

Even a panacea can have side effects: people come to depend on it too much and grow lazy. C is not untouched by this either.
The best way to explain is with an example. Consider the snippet below:

/*
int a=5;
printf("%d %d %d %d %d",a++,a++,++a,a++,++a);
*/

A superhit snippet, and one of the most common queries posted on forums.
The confusion arises because this innocent-looking snippet prints different outputs under different environments.
Most responses are two-word sentences such as "Undefined Behaviour" or "Sequence Points", leaving the confused person even more confused and ...(undefined verb!!!)

But explaining the innocent-looking code above means covering a few important topics first. Let's take them down!

1> Undefined Behaviour
Undefined behaviour is behaviour on which the C standard imposes no requirements: the program may print garbage, crash, or appear to work. The most common example comes in the following form:

/*
int a;
a++;  // a was never initialised, so it holds garbage
*/

The standard places no requirement on what happens here, so 'a' may end up with any value. Similar instances are division by zero and indexing outside the bounds of an array.

2> Unspecified Behaviour
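This behaviour arises when the standard allows several possibilities and lets each compiler pick one, so the same code behaves differently under different compilers. A close cousin is implementation-defined behaviour, where the compiler must also document its choice: the size of an integer is 2 bytes under Turbo C/C++ and typically 4 bytes under GCC.
It's like you behaving differently in front of your parents and in front of your friends ;)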

3> Side effects
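We now come to the order of evaluation in C. A side effect is any change an expression makes besides producing its value, such as modifying a variable. The C language does not specify the order in which the operands of an operator are evaluated, with a few exceptions: &&, ||, ?: and the comma operator (,).
For example: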

x=f() + g();

f() may be evaluated before g() or vice versa. If either f() or g() alters a variable on which the other depends, x can depend on the order of evaluation.

But if we write something like:
x=f(),g();
then the order of evaluation is defined [f() is called first, then g()], because the comma operator is an exception, as mentioned earlier. Note that assignment binds tighter than the comma, so x receives the value of f().

The ambiguity in the order of evaluation is best shown by the statement:

a[i]=i++;
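The subscript can be the old value of i or the new one. Since i is modified and also read separately with no sequence point in between, the standard goes further and makes the behaviour undefined.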


Now we come back to our good old snippet.

int a=5;
printf("%d %d %d %d %d",a++,a++,++a,a++,++a);

The following reasons can be given for the outcome of the above snippet:
* The commas in a function call are argument separators, not the comma operator, so they do not guarantee any order of evaluation. We can't say which of the a's will be evaluated first.
* This leaves the compiler free to evaluate the arguments in any order, which explains the confusing outputs that might appear.
* Worse, 'a' is modified several times with no sequence point in between, so the behaviour is undefined, not merely unspecified. That is why the same snippet gives different outputs under different compilers and environments.

To summarise,
Compilers interpret these things in different ways and generate different answers depending on their interpretation. The standard intentionally leaves most such matters unspecified. Writing code that depends on the order of evaluation is bad programming practice and should be avoided.

Have fun with C!