
History

This essay is adapted from an article I posted to comp.lang.c on November 17, 2003, under the subject "ANSI C compliance". I'm placing it on this wiki not as static text, but as a base for anyone to edit, in the wiki way. In particular, feel free to (a) refine it to reflect comp.lang.c consensus, (b) remove the first-person perspective, and (c) knit the disparate subsections together better, or at least give them section headings or something. For reference, here is my static version of the same expanded essay. -- scs 21:09, 4 February 2006 (GMT)

Change of Person and Section Headings

I've taken you up on (b) and the second part of (c). Please check that nothing has been lost in the translation. Some places didn't have a clear mapping, such as the first sentence of the conclusion - I almost rewrote it as "It's unfortunate if the folks in comp.lang.c tend to come across as...", but I don't think that preserves the meaning you intended with "I'm sorry if the folks...". There does seem to be scope to better section the article, but appropriate headings are a first step. The place I debated where a heading should go most was "Correct Code that Works Right" - originally in that spot I didn't have a heading and that heading I had placed where "Of Shibboleths and Herrings" currently is (probably too off-centre as a heading title in a technical article but appropriate for a rhetorical article such as this). There's also some redundancy between that section and the one headed "ANSI C as a Code-word". --Netocrat 13:03, 5 March 2006 (GMT)

The internal debate re-resolved in favour of moving the "Correct Code..." heading to where I originally placed it. --Netocrat 19:50, 5 March 2006 (GMT)

Suggestion re non-portable efficiency code

In the paragraph on portable code, as separated out from non-portable code, you write:

If you don't know about the conforming alternative, that's one thing, but having been presented with it, if you insist on continuing to do it your old way, because it's "more efficient" or it's "what you're used to" or because the resulting alleged nonportability "doesn't matter in this case", you're probably being willfully stubborn.

I'd add that if efficiency is a concern on a particular platform, it's not unreasonable to have a platform-specific alternative that only gets called on that platform. I'd like to make sure that this isn't counter to the spirit of the essay first, and that I'm not missing anything important. The only drawback I can see is dual-maintenance, which is offset against the importance of efficiency and how much extra efficiency can actually be achieved through non-portable code.
--Netocrat 03:13, 6 February 2006 (GMT)

I thought about that when I wrote it. My thinking was that efficiency hacks that you can perpetrate at the source code level -- that aren't algorithmic or function callistic, but rather that involve two or more ways of coding the same algorithm that end up compiling differentially efficiently on two or more different platforms -- are supposed to be rare. Now, of course, in the real world they're less rare than they're supposed to be, but I decided to err on the side of preaching, there (rather than sullying a nice clean argument with a bunch of honest real world excuses). -- scs 05:11, 6 February 2006 (GMT)
Here's an example from my own experience. A file indexing program I wrote a while ago stores integers as variable-size in the binary index file (for space efficiency). The conversion from a regular integral type to the binary file's variable-size representation is possible to write in portable C using bit-twiddling and the ...MAX macros, and I have code that does that (actually, since the code was written whilst my knowledge of the Standard was nascent, it's not really all that portable and iirc assumes that unsigned long long is available and is an unpadded 8 byte type). Knowing that my development platform is two's complement, little-endian and has no padding bits in its integers though, I can instead use char-by-char access (still with some bit-twiddling involved of course). This turns out after benchmarking to be considerably faster, and in that program I used a #define that can bypass the portable code and use this code - important to me because indexing is a potentially huge task when there's a lot of data. This is not so much about code compiling differently as about making use of implementation-specific knowledge. Are similar situations in the 'real world' common? I don't have enough experience to say, but given that many people will be directed to the essay due to c.l.c topicality grudges, honesty about such real-world issues seems to be a good policy. I do take your point that it shouldn't be emphasised, and I recognise the reasonable (apparent) c.l.c consensus that optimisation should be avoided until shown to be necessary. There are cases though, such as this example, where (I contend that) optimisation and non-portable code are acceptable choices, particularly when the 'non-portable' alternative is in fact likely to be usable on the majority of machines the code is targeted at, and the portable code is more or less a fallback. --Netocrat 05:56, 6 February 2006 (GMT)
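For readers who want something concrete to look at, here is a minimal sketch of what a portable variable-size integer encoding might look like. It is not the code from the indexing program described above, whose on-disk format isn't given here; the base-128 layout, the function names varint_encode and varint_decode, and the constant VARINT_MAX_BYTES are all assumptions made for illustration. It relies only on shifts and masks, so it makes no assumption about byte order or padding bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format (an assumption, not the format discussed above):
   7 value bits per byte, least significant group first, with the high
   bit of each byte set when another byte follows. */
#define VARINT_MAX_BYTES 10 /* ceil(64 / 7) bytes cover a 64-bit value */

/* Encode value into out (at least VARINT_MAX_BYTES long).
   Returns the number of bytes written. */
size_t varint_encode(unsigned long long value, unsigned char *out)
{
    size_t n = 0;

    assert(out != NULL);
    /* Only the range the Standard guarantees is supported. */
    assert(value <= 0xFFFFFFFFFFFFFFFFULL);

    while (value >= 0x80) {
        out[n++] = (unsigned char)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (unsigned char)value;
    return n;
}

/* Decode one value from in. Returns the number of bytes consumed,
   or 0 if no terminating byte appears within VARINT_MAX_BYTES. */
size_t varint_decode(const unsigned char *in, unsigned long long *value)
{
    unsigned long long result = 0;
    size_t n;

    for (n = 0; n < VARINT_MAX_BYTES; n++) {
        unsigned char byte = in[n];

        /* The shift is at most 63, below the 64 bits guaranteed. */
        result |= (unsigned long long)(byte & 0x7F) << (7 * n);
        if ((byte & 0x80) == 0) {
            *value = result;
            return n + 1;
        }
    }
    return 0; /* malformed input */
}
```

A platform-specific fast path of the kind described above could sit beside this behind a macro of the program's own choosing, with the portable functions kept as the fallback and as a reference to test the fast path against.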
An unsigned long long is guaranteed to use at least 64 bits to represent its values, due to the minimum range guaranteed by the C standard. Implementations may extend this range, but such extensions aren't guaranteed and shouldn't be habitually relied upon. When choosing an unsigned long long to represent an integer in portable software, the most significant factor should be that "My values will never be outside of the range that the C standard guarantees". With this in mind, it then makes sense that "I can determine exactly how many bytes my values should occupy in a file" (eight, in your case), and hence "The file input/output need not deal with variable-sized representations". For the sake of clarity, your code should express these intentions as simply, as precisely and, where possible, as portably as it can. For example, your code should notify you, as the developer/debugger, before writing values to a file, when those values fall outside of the range that C guarantees: assert(value <= 0xFFFFFFFFFFFFFFFFULL); ... so that you can decide to use a type with guaranteed wider ranges, or a portable multi-precision arithmetic library rather than reinventing the wheel. By writing your guarantees in code, you might just be enabling further optimisations.
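As a rough sketch of the fixed-size approach just described, the following packs a value into exactly eight bytes using shifts and masks, with the range assertion placed before anything is written. The function names pack_u64, unpack_u64 and write_u64 are hypothetical, and the most-significant-byte-first order is an arbitrary choice made for this example; neither comes from the program under discussion.

```c
#include <assert.h>
#include <stdio.h>

/* Store value in buf as eight octets, most significant first.
   The byte order is an assumption made for this sketch. */
void pack_u64(unsigned char buf[8], unsigned long long value)
{
    int i;

    /* Flag values outside the range the Standard guarantees. */
    assert(value <= 0xFFFFFFFFFFFFFFFFULL);

    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char)((value >> (56 - 8 * i)) & 0xFF);
}

/* Rebuild the value from eight octets written by pack_u64. */
unsigned long long unpack_u64(const unsigned char buf[8])
{
    unsigned long long result = 0;
    int i;

    for (i = 0; i < 8; i++)
        result = (result << 8) | (unsigned long long)(buf[i] & 0xFF);
    return result;
}

/* Write one value to fp. Returns 0 on success, -1 on a short write. */
int write_u64(FILE *fp, unsigned long long value)
{
    unsigned char buf[8];

    pack_u64(buf, value);
    return fwrite(buf, 1, sizeof buf, fp) == sizeof buf ? 0 : -1;
}
```

Because the conversion never looks at the in-memory representation of the integer, the same file bytes result regardless of the host's endianness or padding bits.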
I find it difficult to believe that access to secondary storage (the hard drive) is the most significant bottleneck in typical real-world programs. I know that common computers of today cache small pieces of code in very fast memory in their CPUs. The tighter the machine code is, the more likely it is to fit into the code cache. However, this shouldn't impact upon the legibility (simplicity, precision and portability) of your code. When you do happen to find a significant bottleneck that can be fixed with a non-portable optimisation, perhaps it'd be a good idea to send a patch or a description of the optimisation to your compiler maintainers, since that is where these optimisations will be of the most benefit.
Your compiler is most probably extremely intelligent and capable of recognising when it can and can't eliminate unused or repetitive code, optimise tail-recursive functions and hoist strlen calls out of the conditions of your loops. Furthermore, it'll happily perform these optimisations on code written in the past and the future, far quicker than you can manually. Consider that these problems are far more complex than optimising input/output to write directly to/from integers. If your compiler doesn't optimise your simple, portable input/output code to the same as (or better than) your non-portable input/output code, then it won't be a problem to implement such an optimisation. That is, assuming the compiler developers see this as a significant bottleneck in typical real-world programs.
I wonder if it'd be more helpful to introduce this perspective into the article. --Modifiable lvalue (talk) 15:00, 21 April 2013 (UTC)
If you'd like to give it a try, you'd be welcome to. Welcome to the wiki, and sorry for taking so long to respond. --Laird (talk) 09:06, 22 May 2013 (UTC) (formerly going by "Netocrat")