Jump to: navigation, search

The C Programming Language, 2nd Edition, by Kernighan and Ritchie
Exercise 7.09 on page 168

Functions like isupper can be implemented to save space or to save time. Explore both possibilities.



Solution by Gregory Pietsch

This question is best left to an essay rather than code, so here's my take:

The easiest way to implement the eleven is() functions in C90's version of <ctype.h> is via a table lookup. If UCHAR_MAX is 255, then a table would take up around 514 8-bit bytes and still have room for five more is() functions. In modern programs, this is a miniscule expense of both space and time since a mere table lookup doesn't cost a whole lot (although space may be a priority for embedded systems).

Additionally, since the is() functions of <ctype.h> are locale-dependent and therefore subject to locale-specific whims, a table could more easily be modified than modifying hard calculations. Consider the following three implementations of isupper():

Implementation #1:

int isupper(int c)
{
    return (c >= 'A' && c <= 'Z');
}


Implementation #2:

int isupper(int c)
{
    return (strchr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", c) != NULL);
}


Implementation #3:

/* Presumably, _UP is a power of 2 and 
 * _Ctype is a table
 */

int isupper(int c)
{
    return ((_Ctype[(unsigned char)c] & _UP) != 0);
}


Implementation #1 fails in EBCDIC and implementation #2 fails in a locale that adds more upperspace characters than the ones mentioned. Implementation #3, however, suggests that _Ctype[] can be modified to accommodate new uppercase characters.


Solution by codybartfast

Similar to Gregory's Implementation 1, my 'save space' version is:

static int sisupper(int c);

int sisupper(int c)
{
	switch ('A') {
	case 0x41: /* ASCII */
		return 'A' <= c && c <= 'Z';
	case 0xC1: /* EBCDIC */
		return (('A' <= c && c <= 'I') || ('J' <= c && c <= 'R') ||
			('S' <= c && c <= 'Z'));
	default:
		fprintf(stderr, "error: Can't guess encoding. 'A'=0x%X\n", 'A');
		exit(1);
	}
}

Worst case this could result in a switch, 6 integer comparisions and 5 logic operations.

I'm not sure how this affects performance but it's probably not good that there are perhaps 12 execution routes through the code.

While this will take a trivial amount of (modern) processor time, it's easy to imagine this approach being relatively slow compared to a table lookup.

Similar to Gregory's Implementation 3, my 'save time' implementation is:

static int tisupper(int c);
static int map[UCHAR_MAX];
static void tisupper_init(void);

int tisupper(int c)
{
	return *(map + (unsigned char)c);
}

void tisupper_init(void)
{
	int i;

	for (i = 0; i < UCHAR_MAX; i++)
		map[i] = sisupper(i);
}

The speed of this should be consistent as there is only one execution path: *(map + (unsigned char)c).

On a 32/64bit computer I think it could take up more space as each member of the table needs to a separate memory address so it would occupy (UCHAR_MAX * wordsize) bits. I.e. 2KB on a 64 bit machine.

On the other hand Gregory shows how using as mask allows at least 16 different attributes to be stored for each char. This is supported by the truth value returned by the various is() functions:

For my (gcc 8.3) implementation:

iscntrl:       2
ispunct:       4
isalnum:       8
isupper:     256
islower:     512
isalpha:   1,024
isdigit:   2,048
isxdigit:  4,096
isspace:   8,192
isprint:  16,384
isgraph:  32,768

Solution by anonymous

I think this is simpler than what the other two solutions considered. I could be wrong though...

/*
    Exercise 7-9. Functions like isupper can be implemented to save space or to save time. Explore both possibilities.
*/

// saves time (no function call), but uses more space since the macro is expanded every time it is used (function pointer space vs two part if-statement)
#define isupper(c) ((c) >= 'A' && (c) <= 'Z') ? 1 : 0 // Note: don't use because isupper(*++s) would expand to ((*++s) >= 'A' && (*++s) <= 'Z') ? 1 : 0

// saves space, but takes more time due to function call
int isupper(int c)
{
    return (c >= 'A' && c <= 'Z');
}
Personal tools