Don’t Hand-count Characters
July 15th, 2006
A common mistake many programmers make is to hard-code hand-counted lengths of string literals. For example:
if (!strncmp(str, "MAGIC_PREFIX_", 13)) {
...
}
Even if the string literal "MAGIC_PREFIX_" is never going to change in a lifetime, hand-counting its length and hard-coding it as a separate entity (13 in this case) makes the code more prone to human error and harder to read. A better way is to use the sizeof() operator:
if (!strncmp(str, "MAGIC_PREFIX_", sizeof("MAGIC_PREFIX_") - 1)) {
...
}
To some, it may look like a function call that has to be evaluated at run time; however, sizeof() is an operator, and for a string literal its result is resolved at compile time. For the example above, the compiler simply substitutes 13 for the whole “sizeof() - 1” expression. Note that since "MAGIC_PREFIX_" is a null-terminated character array, sizeof() yields 14, the exact size of the array including the extra null character at the end. The “- 1” following sizeof() subtracts that null character to give the length of the visible text.
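If you want the compiler itself to confirm the arithmetic above, a minimal sketch along these lines should do it, assuming a C11 compiler (for static_assert). The helper name magic_prefix_strlen is made up for this illustration; it only exists so the run-time strlen() result can be compared with the compile-time one.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <string.h>   /* strlen */

/* Checked at compile time: the literal occupies 14 bytes including the null. */
static_assert(sizeof("MAGIC_PREFIX_") == 14, "13 characters plus the null");

/* Checked at compile time: subtracting 1 gives the visible length. */
static_assert(sizeof("MAGIC_PREFIX_") - 1 == 13, "length without the null");

/* Hypothetical helper: the same length, but computed at run time. */
static size_t magic_prefix_strlen(void)
{
    return strlen("MAGIC_PREFIX_");
}
```

If either static_assert were wrong, the file would fail to compile instead of misbehaving at run time.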
The advantages are:
- You don’t have to hand-count the characters while doing the coding and therefore you avoid the risk of miscounting,
- Other people looking at your code can clearly see your intention and don’t have to do additional hand-counting to verify your code.
However, it’s a bit awkward to make a copy of the string to be used inside the sizeof() operator because it requires you to update both copies when a string needs to be modified. To get around this redundancy (while unfortunately adding extra bulk to your code), you could define the string as a constant:
static const char MAGIC_PREFIX_STR[] = "MAGIC_PREFIX_";
static const size_t MAGIC_PREFIX_LEN = sizeof(MAGIC_PREFIX_STR) - 1;
if (!strncmp(str, MAGIC_PREFIX_STR, MAGIC_PREFIX_LEN)) {
...
}
Just change the value of MAGIC_PREFIX_STR and you’re good to go! Doing this obviously has additional advantages if the string in question is going to be used in more than one location.
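When the check is needed in several places, the constants can be wrapped in a small function. This is only a sketch: has_magic_prefix is a hypothetical name, and bool comes from C99's <stdbool.h>.

```c
#include <stdbool.h>  /* bool */
#include <stddef.h>   /* size_t */
#include <string.h>   /* strncmp */

static const char MAGIC_PREFIX_STR[] = "MAGIC_PREFIX_";
static const size_t MAGIC_PREFIX_LEN = sizeof(MAGIC_PREFIX_STR) - 1;

/* Hypothetical helper: true if str begins with MAGIC_PREFIX_STR. */
static bool has_magic_prefix(const char *str)
{
    /* strncmp stops at the first null, so a str shorter than the prefix is safe. */
    return strncmp(str, MAGIC_PREFIX_STR, MAGIC_PREFIX_LEN) == 0;
}
```

With this in place, has_magic_prefix("MAGIC_PREFIX_value") is true, while has_magic_prefix("MAGIC") is false.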
You should also take note that the constant has to be defined as “char MAGIC_PREFIX_STR[]” and not “char *MAGIC_PREFIX_STR”. Both can be passed to any function that expects a pointer to the string, but sizeof() applied to the pointer version yields the size of a char pointer (typically 4 or 8 bytes) instead of the actual length of the string plus the null character.
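The difference between the two declarations is easy to see side by side. In this sketch the names prefix_array, prefix_pointer, array_size and pointer_size are invented for the illustration.

```c
#include <stddef.h>   /* size_t */

static const char prefix_array[] = "MAGIC_PREFIX_";    /* an array of 14 chars */
static const char *prefix_pointer = "MAGIC_PREFIX_";   /* a pointer to such an array */

/* sizeof on the array: 13 characters plus the null. */
static size_t array_size(void)
{
    return sizeof(prefix_array);
}

/* sizeof on the pointer: the size of the pointer itself, whatever the text length. */
static size_t pointer_size(void)
{
    return sizeof(prefix_pointer);
}
```

array_size() is always 14 here, while pointer_size() depends only on the platform, not on the string.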
July 21st, 2008 at 12:25 pm
There is an (insignificant) disadvantage when using the last version of the code compared to using the hand-counted version: MAGIC_PREFIX_LEN may take up some memory, while the constant 13 probably won’t. Compare the generated assembly to see the difference, though I haven’t done it :-). I am just betting that a hard-coded value (13) will be loaded directly into a register as an immediate (which is fast, and uses no RAM), while the ‘static const size_t’ will be loaded into a register via a memory access (RAM is used to store the value of the size_t constant, and accessing that location is surely slower than a register assignment).
Another approach might be:
#define MAGIC_PREFIX_LEN ( sizeof(MAGIC_PREFIX_STR) - 1 )
or, even better:
enum { MAGIC_PREFIX_LEN = sizeof(MAGIC_PREFIX_STR) - 1 };
However, this really counts ONLY when you’re into super-optimizations. In real life (meaning, the kind of programming more than 90% of C++ coders do) the few octets used by a size_t constant and the microseconds spent accessing it just aren’t an issue worth thinking about.
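One practical point in favour of the enum variant in C: an enum constant is a true integer constant expression, so it can size an array at file scope, which standard C does not guarantee for a ‘static const size_t’ object. A minimal sketch, where prefix_copy and copy_magic_prefix are hypothetical names used only here:

```c
#include <string.h>   /* memcpy */

static const char MAGIC_PREFIX_STR[] = "MAGIC_PREFIX_";
enum { MAGIC_PREFIX_LEN = sizeof(MAGIC_PREFIX_STR) - 1 };

/* File-scope buffer sized by the enum constant: room for the text plus the null. */
static char prefix_copy[MAGIC_PREFIX_LEN + 1];

/* Hypothetical helper: copies the prefix, null included, into the buffer. */
static const char *copy_magic_prefix(void)
{
    memcpy(prefix_copy, MAGIC_PREFIX_STR, sizeof(MAGIC_PREFIX_STR));
    return prefix_copy;
}
```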
Your code, in fact, has two interesting points: (1) the use of “magic numbers” (that is, hard-coded constant values that have no apparent meaning, except to the original coder; i.e. 13); and (2) the use of code compilers vs. human counting. You seem to address mainly the second point, while I believe the first one is much more common. But you do hint at the right (IMHO) approaches on both counts (i.e. avoid “magic numbers”; avoid human counting), so I guess I’ll have to agree with the overall statement.