Part I: C Macros
And how they can damage your code
Operator Order #1
Here's a nice macro: Let's try to use it:
Operator Order #2
Let's try this one:
#define sqr(x) (x*x)
Multiple Evaluation
Here's the fixed version of the previous macro:
#define sqr(x) ((x)*(x))
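A minimal sketch of the two hazards so far, operator order and multiple evaluation. The names SQR_UNSAFE, SQR_PAREN, sqr_fn and next_value are illustrative and not from the slides; only the macro bodies are. The argument with a side effect is a function call rather than something like i++, because evaluating i++ twice in one expression would be undefined behavior rather than just a surprise.

```c
/* Illustrative names; only the macro bodies come from the slides. */
#define SQR_UNSAFE(x) (x*x)       /* SQR_UNSAFE(1+2) expands to (1+2*1+2) */
#define SQR_PAREN(x)  ((x)*(x))   /* correct grouping, but x is evaluated twice */

/* A function evaluates its argument exactly once. */
static int sqr_fn(int x) { return x * x; }

/* Hypothetical argument with a side effect: it counts its own calls. */
static int calls = 0;
static int next_value(void) { return ++calls; }
```

SQR_UNSAFE(1+2) yields 5 rather than 9. SQR_PAREN(next_value()) calls next_value twice, while sqr_fn(next_value()) calls it once.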
After preprocessing:
Set to 0: OK
Add 0: Bug!
#define move(srcpool, dstpool, num) { \
    int move_num = num; \
    move_the_elements; \
    srcpool->size -= move_num; \
    dstpool->size += move_num; \
}
After preprocessing:
Flow Control #1
Here's a useful macro:
#define dbgprint(msg) if (dbgflag) printf(msg);
if (x==3)
    dbgprint("very bad\n");
else
    dbgprint("very good\n");
After preprocessing:
Result: the else has no if to attach to, so we get a compilation error.
Solution:
Make sure to use braces around the conditional statement, even if there's only one. But as the macro writer, you can't ensure this.
Omit the ; from the macro and leave it to the caller.
Flow Control #2
Look at this debugging macro:
#define dbgprint(msg) if (dbgflag) printf(msg)
if (x==3)
    dbgprint("very bad\n");
else
    dbgprint("very good\n");
After preprocessing:
Result: whose else is it? The else will attach to if (dbgflag), not to if (x==3), giving wrong results.
Solutions:
Use braces with the if. But the macro writer doesn't control that.
Use one of these structures:
#define dbgprint(msg) if (!dbgflag) {} else printf(msg)
#define dbgprint(msg) do {if (dbgflag) printf(msg);} while(0)
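A small sketch of the do/while(0) form in an if/else. To make the effect checkable, the macro stores the message in a hypothetical sink, last_msg, instead of calling printf. DBGPRINT, last_msg and report are illustrative names, not from the slides.

```c
#include <stddef.h>

static int dbgflag = 1;
static const char *last_msg = NULL;  /* hypothetical sink standing in for printf */

/* do/while(0) makes the macro a single statement that needs the caller's ';'. */
#define DBGPRINT(msg) do { if (dbgflag) last_msg = (msg); } while (0)

/* Hypothetical caller: the else pairs with if (x == 3), as the author intended. */
static const char *report(int x)
{
    last_msg = NULL;
    if (x == 3)
        DBGPRINT("very bad");
    else
        DBGPRINT("very good");
    return last_msg;
}
```

With dbgflag set, report(4) records "very good". With the plain if (dbgflag) form of the macro, the else would have been captured by the inner if and nothing would be recorded.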
Preview
Do we really understand integers?
We take mathematical operations for granted. We assume that things work just like we learned in elementary school.
We treat integer arithmetic as something basic that doesn't need any attention.
Question: Which integers satisfy the condition (x == -x)?
In normal math, there's only one: 0.
In fact there are two (with 32-bit ints): 0 and 0x80000000. Check for yourself.
Conclusion: integers are not as simple as you may think.
In this presentation you will find:
Many functions and code segments doing integer arithmetic.
All are mathematically correct: if integers were simple numbers, they would give correct results.
All are buggy: they fail because of how integers work.
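One way to check the (x == -x) claim, sketched with hypothetical helpers negate_bits and is_own_negation. The negation is done in unsigned arithmetic, where wraparound is well defined. Negating the most negative int directly as a signed value overflows, which is undefined behavior in C. The sketch assumes 32-bit two's-complement values.

```c
#include <stdint.h>

/* Two's-complement negation of a 32-bit pattern, done in unsigned
 * arithmetic so that wraparound is well defined. */
static uint32_t negate_bits(uint32_t x)
{
    return 0u - x;
}

/* Returns 1 when the 32-bit pattern equals its own negation. */
static int is_own_negation(uint32_t x)
{
    return negate_bits(x) == x;
}
```

Only 0 and 0x80000000 pass the check. Every other pattern differs from its negation.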
Intermediate Results
When evaluating a complex expression, there are intermediate results.
Normally, we ignore them and look at the big picture.
Intermediate results have their own types and data ranges. They are not stored with arbitrary size and precision.
It's just as if you had declared them explicitly:
int x = a + b * c;
is the same as:
int temp1 = b * c;   /* assuming that b and c are ints */
int x = a + temp1;
What if the intermediate result wraps, but the final result doesn't?
With addition and subtraction, it's usually OK.
In (1-2)+3 with unsigned operands, 1-2 is 0xffffffff, but we eventually get 2, which is correct.
Any problem?
pos >= 0 is a meaningless condition! pos will go down from SIZE-1 to 0, then to 0xffffffff. This is still positive.
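The loop this slide discusses is not preserved here, so the following is only a sketch of one common way to count down with an unsigned index. copy_backwards is a hypothetical helper. Testing pos-- > 0 compares the value before the decrement, so the body never runs with a wrapped index.

```c
#include <stddef.h>

/* Copies src[size-1] .. src[0] into dst[0] .. dst[size-1] using an
 * unsigned index. Returns the number of elements visited. */
static size_t copy_backwards(const int *src, size_t size, int *dst)
{
    size_t pos, n = 0;
    for (pos = size; pos-- > 0; )
        dst[n++] = src[pos];
    return n;
}
```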
We get -2!!!
x,y are unsigned, so (x-y) is unsigned. (x-y) > 0 is the same as (x-y) != 0.
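A short sketch of the difference, with the illustrative names diff_is_positive and is_greater. The first function is the condition described above. The second compares the operands directly.

```c
/* With unsigned operands the difference is never negative,
 * so this is really a test for x != y. */
static int diff_is_positive(unsigned int x, unsigned int y)
{
    return (x - y) > 0;
}

/* Comparing the operands directly says what was meant. */
static int is_greater(unsigned int x, unsigned int y)
{
    return x > y;
}
```

For x = 1 and y = 2, diff_is_positive reports true because 1 - 2 wraps to a large unsigned value, while is_greater reports false.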
2005 Check Point Software Technologies Ltd. Proprietary & Confidential
We get 0!
x is greater than y, but (x-y), when viewed as an integer, is negative.
0x80000000 is negative! It's not greater than SIZE. The function will not catch the error!
Using Constants #1
Consider the following function:
int big_enough(int size) { return (size > sizeof(int)); }
Tells you whether a given size is big enough.
Obviously, a negative size is not big enough. Or is it?
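A sketch of why a negative size can pass, assuming a typical platform where size_t is at least as wide as int. The function names are illustrative. The first function spells out the conversion the compiler applies implicitly in the comparison, and the second is one possible fix.

```c
#include <stddef.h>

/* What the comparison effectively does: size is converted to size_t
 * before comparing, so -1 becomes a huge positive value. */
static int big_enough_as_written(int size)
{
    return (size_t)size > sizeof(int);
}

/* Possible fix: reject non-positive sizes first, then compare as unsigned. */
static int big_enough_checked(int size)
{
    return size > 0 && (size_t)size > sizeof(int);
}
```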
Using Constants #2
Here's another function:
int big_enough2(int size) { return ((size - sizeof(int)) > 100); }
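The slide's explanation is not preserved here. A likely catch, stated as an assumption: the subtraction is carried out in size_t, so any size smaller than sizeof(int) wraps to a huge value and passes the test. A sketch with illustrative names, where the fix moves the subtraction to the other side of the comparison:

```c
#include <stddef.h>

/* The expression with its implicit conversion made explicit:
 * for size 0 the difference wraps around and looks enormous. */
static int big_enough2_as_written(int size)
{
    return ((size_t)size - sizeof(int)) > 100;
}

/* Possible fix: add on the right instead of subtracting on the left,
 * so nothing can wrap. */
static int big_enough2_checked(int size)
{
    return size > 0 && (size_t)size > sizeof(int) + 100;
}
```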
Calculating Average
Here's a simple exercise:
Write a function to calculate the average of two numbers. Actually, two exercises: signed and unsigned.
Solutions:
unsigned int u_avg(unsigned int x, unsigned int y) { return (x + y) / 2; }
int s_avg(int x, int y) { return (x + y) / 2; }
Signed:
We get -1!
x+x equals 0xfffffffe, which is -2. So (x+x)/2 is -1.
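One possible way to write both averages so that the intermediate sum cannot wrap, sketched with fixed-width types. u_avg_safe and s_avg_safe are illustrative names. The signed version assumes a 64-bit type is available for the intermediate sum.

```c
#include <stdint.h>

/* Unsigned: halve each operand, then add back the 1 that is lost
 * when both operands are odd. */
static uint32_t u_avg_safe(uint32_t x, uint32_t y)
{
    return x / 2 + y / 2 + (x & y & 1u);
}

/* Signed: compute the sum in a wider type, then narrow the result.
 * The average of two int32_t values always fits in int32_t. */
static int32_t s_avg_safe(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x + y) / 2);
}
```

Both functions round the same way as (x + y) / 2 would if the sum had unlimited range.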
Percentage #1
How much is 30% out of something?
That's easy. Can you program it?
Percentage #2
The last percentage function was stupid. Let's write a better one:
int f(int x) { return x * 30 / 100; }
Now it works.
Always?
Percentage #3
Writing a percentage function can't be that hard. This time, we'll do it right:
int f(int x) { return x / 100 * 30; }
Let's see how it does with the last example:
f(143165600) returns 42,949,680. I like this one much better.
But how about something easier?
How much is 30% of 10? f(10) returns 0!
10 / 100 is 0.
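One way out of the dilemma, sketched under the assumption that a 64-bit type is available: multiply first, so the division comes last and keeps precision, but do the multiplication in 64 bits so it cannot overflow. percent30 is an illustrative name.

```c
#include <stdint.h>

/* 30% of x. The product is computed in 64 bits, and the result always
 * fits back into 32 bits because it is smaller in magnitude than x. */
static int32_t percent30(int32_t x)
{
    return (int32_t)((int64_t)x * 30 / 100);
}
```

This handles both earlier examples: percent30(143165600) gives 42,949,680 and percent30(10) gives 3.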
Percentage #4
Here's a more general percentage function:
int p(int x, unsigned int p)
{
    if (x>1000 || x<-1000 || p>100)
        return OUT_OF_RANGE;
    return x * p / 100;
}
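The slide's punch line is not preserved here. A likely catch, stated as an assumption: in x * p the int operand is converted to unsigned, so a negative x turns into a huge positive value before the multiplication and division. A sketch of one fix follows. percent_of is an illustrative name, and the value of OUT_OF_RANGE is a hypothetical sentinel, since the slides do not define it.

```c
#include <stdint.h>

#define OUT_OF_RANGE INT32_MIN   /* hypothetical sentinel; not defined in the slides */

static int32_t percent_of(int32_t x, uint32_t p)
{
    if (x > 1000 || x < -1000 || p > 100)
        return OUT_OF_RANGE;
    /* p <= 100 here, so converting it to a signed type is lossless and
     * keeps the whole expression in signed arithmetic. */
    return x * (int32_t)p / 100;
}
```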
Percentage #5
Here's a harder question: what percentage is 30 out of 50?
Or generally, what percentage is x out of y?
(x * 100 / y)
Overflows when x is large (what's 5M out of 8M?)
x / (y / 100)
Crashes when y is small (what's 5 out of 8?). Inaccurate when y is not very large (what's 500 out of 599?)
100 / y * x
Inaccurate when y is less than 100. 0 when y is more than 100.
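A sketch of the first form with a wider intermediate, assuming unsigned 32-bit inputs and a 64-bit type for the product. With a 32-bit signed int, x * 100 overflows once x exceeds about 21 million. percent_ratio is an illustrative name, and returning 0 for y == 0 is a hypothetical convention.

```c
#include <stdint.h>

/* What percentage is x out of y, rounded down.
 * Returns 0 when y is 0 (a hypothetical convention for this sketch). */
static uint64_t percent_ratio(uint32_t x, uint32_t y)
{
    if (y == 0)
        return 0;
    return (uint64_t)x * 100 / y;
}
```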
Signed/Unsigned Division
Here's a nice function:
Bit Fields
Bit fields are very nice: they save memory. Here's a program that uses them:
struct x {
    int flag:1;
    int count:31;
};
const char *flag_set(struct x *s)
{
    const char *n[] = { "FALSE", "TRUE" };
    return n[s->flag];
}
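The slide's explanation is not preserved here. A likely catch, stated as an assumption: where a plain int bit-field is signed (this is implementation-defined), a one-bit field can only hold 0 and -1, so n[s->flag] may index n[-1]. A sketch with unsigned fields, using the illustrative names x_fixed and flag_set_fixed:

```c
struct x_fixed {
    unsigned int flag:1;    /* holds 0 or 1 */
    unsigned int count:31;
};

static const char *flag_set_fixed(const struct x_fixed *s)
{
    static const char *const names[] = { "FALSE", "TRUE" };
    return names[s->flag];
}
```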
Shifting
Check out this function:
void printb(int x, char *buf)
{
    buf[0] = '\0';
    for (; x != 0; x >>= 1)
        strcat(buf, (x & 1) ? "1" : "0");
}
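The slide's explanation is not preserved here. A likely catch, stated as an assumption: right-shifting a negative int is implementation-defined and usually keeps the sign bit, so x never reaches 0 and the loop does not end. A sketch of the same loop on an unsigned value, with the illustrative name printb_u:

```c
#include <stdint.h>
#include <string.h>

/* Same loop on an unsigned value: a right shift always shifts in zeros,
 * so x eventually reaches 0. buf needs room for 33 bytes. Bits are
 * appended least significant first, as in the original. */
static void printb_u(uint32_t x, char *buf)
{
    buf[0] = '\0';
    for (; x != 0; x >>= 1)
        strcat(buf, (x & 1u) ? "1" : "0");
}
```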
Conclusions
Be aware. Remember that:
Your code may not mean what you think it means.
The variable's type and valid range are important.
Intermediate data has a type and valid range.
Especially important with multiplication and division.
Even code that seems simple and correct may surprise you.
Use types explicitly:
When types matter, don't let the compiler convert implicitly. Cast yourself, to make things clear.
Use variables for intermediate results, even when not needed.
This may remind you of the intermediate values' importance.
Alignment
Consider the following function:
char buf[SIZE];
void write_num(int off, int num)
{
    int *p = (int *)&buf[off];
    *p = num;
}
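A sketch of an alignment-safe variant that uses memcpy, which has no alignment requirement. The value of SIZE, the bounds check, the int return value and the helper read_num are assumptions added for illustration and are not from the slides.

```c
#include <string.h>

#define SIZE 64              /* hypothetical; the slides leave SIZE undefined */
static char buf[SIZE];

/* Copies num into buf at any byte offset.
 * Returns 0 on success, -1 if the write would not fit. */
static int write_num_safe(int off, int num)
{
    if (off < 0 || (size_t)off > SIZE - sizeof num)
        return -1;
    memcpy(&buf[off], &num, sizeof num);
    return 0;
}

/* Reads the value back the same way. The caller must pass a valid offset. */
static int read_num(int off)
{
    int num;
    memcpy(&num, &buf[off], sizeof num);
    return num;
}
```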
Operator Precedence
We all know the precedence of some operators:
Multiplication and division before addition and subtraction.
a * b + c is the same as (a * b) + c.
/* char *get_name(int id); No prototype! */
printf("%s\n", get_name(MY_ID));
Will this work?
On 64-bit platforms, the returned value will be assumed to be an int. The upper 32 bits will be ignored. If the string is located above 4GB, it will crash.
Sometimes we get away with it. In the Solaris 64-bit kernel, all global and static variables are located below 4GB.
The problem arises when returning a pointer to dynamic memory.
The index is an unsigned variable. The index is of a type smaller than a pointer.
This should allow access to fwiftab[-1]. But what if the index is unsigned?
It will crash on 64-bit platforms. It will crash if the index is u_char or u_short.
In practice:
It's always called with a signed int. -1 is possible only on Nokia, which isn't 64-bit. We're lucky.
In practice:
It's called many times, always with variables named timeout and ttl. This is the only case where the macro can work.
Possible Overflow
Here's a piece of code from fwatom.c:
u_int fw_hmem_size_new, fw_hmem_maxsize_new;
...
if (fw_hmem_size_new * 2 > fw_hmem_maxsize_new)
    fw_hmem_size_new = fw_hmem_maxsize_new / 2;
Makes sure that the new size doesn't exceed half the new limit.
Both sizes are in bytes. But what if the size is 2GB or more?
fw_hmem_size_new * 2 will wrap around. The size won't be decreased.
In practice:
The size can't be more than 2GB minus something, because we currently can't use more than 2GB.
The bug is just around the corner.
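One way to make the same check without the multiplication is to divide the limit instead of doubling the size. The sketch below uses the hypothetical helper clamp_to_half and fixed-width types. It is not the code from fwatom.c.

```c
#include <stdint.h>

/* Clamps size to at most half of maxsize without computing size * 2,
 * so nothing can wrap around. */
static uint32_t clamp_to_half(uint32_t size, uint32_t maxsize)
{
    if (size > maxsize / 2)
        size = maxsize / 2;
    return size;
}
```

For unsigned integers, size > maxsize / 2 holds in exactly the cases where size * 2 > maxsize would hold if the product had unlimited range.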
char *fw_func_getname(int func_id)
{
    if (func_id < fwfuncs.nfunc)
        return fwfuncs.funcdesc[func_id].funcname;
    return NULL;
}
In Practice:
func_id isn't negative, unless there's another bug. The string is used only if debug is enabled.
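A sketch of the same lookup with the lower bound checked as well, so a negative func_id cannot index before the table. The struct layout, the demo table and the name fw_func_getname_checked are hypothetical stand-ins. The real definitions of fwfuncs are not shown in the slides.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real table. */
struct funcdesc { const char *funcname; };
static struct funcdesc demo_funcs[] = { { "alpha" }, { "beta" } };
static struct { int nfunc; struct funcdesc *funcdesc; } fwfuncs = { 2, demo_funcs };

/* Returns the name for a valid id, or NULL for any id outside the table. */
static const char *fw_func_getname_checked(int func_id)
{
    if (func_id >= 0 && func_id < fwfuncs.nfunc)
        return fwfuncs.funcdesc[func_id].funcname;
    return NULL;
}
```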