Chapter 12 - More string functions

Because strings are such a vital part of programming, you will not be surprised to hear that C provides quite a rich set of standard library functions for manipulating them. In this chapter, we will introduce a few more of those functions.

The strcpy function

You can't copy arrays by assignment. To be more precise, the compiler doesn't allow you to place an unindexed array name on the left-hand side of an assignment operator. So you can't do this:

int main(void)
{
  char arrayA[32] = "Hello, world!";
  char arrayB[32];
  arrayB = arrayA; /* ERROR - C won't let you do this */
  return 0;
}

So how can we copy a string?

We are now in a position to write a function to do it for us. We'll need two parameters: a place for the function to copy the string to (or rather, a pointer to the first element of an array large enough to store the copy of the string) and the string to copy from (or rather, a pointer to the first element of that string). We put the target parameter first not because we have to, but because doing so mimics the ordering of the assignment operator, where the destination is on the left:

void copy_string(char *target, char *source)
{
  while(*source != 0)
  {
    *target = *source;
    ++target;
    ++source;
  }
  *target = 0; /* don't forget to terminate the target string */
}

This works, provided that the target pointer is pointing into an array big enough to hold all the data in the array that source is pointing to. It's not very elegant code, though, and we can do better:

void copy_string(char *target, char *source)
{
  /* use post-increment to bump the pointers along */
  while(*source != 0)
  {
    *target++ = *source++;
  }
  *target = 0; /* don't forget to terminate the target string */
}

The last statement is still rather a nuisance. Can we get rid of it? Yes, we can! Remember that assignment yields a value. If we copy the character first, and then compare the value copied to 0, we don't even need to put anything inside the loop body:

void copy_string(char *target, char *source)
{
  while((*target++ = *source++) != 0)
  {
  }
}

Can we make this any tighter? Yes, we can. The != operator yields a value that is either 0 or 1, but while isn't that fussy: it's only interested in whether the condition is 0 or non-0. And the value of the assignment is the character just copied, which is 0 only for the null terminator. That means we can do this:

void copy_string(char *target, char *source)
{
  while(*target++ = *source++)
  {
  }
}

which is about as elegant as it gets. But there is one more refinement that we can make, which is to do without the function altogether. That's possible because C provides a standard library function, strcpy, to do the same job. Here is the prototype:

char *strcpy(char *target, const char *source);

The const keyword simply means that strcpy promises not to change the string that we're copying from.

Unlike our function, strcpy returns a value, which is a pointer to the first character of the target string. (That probably isn't the most useful value it could have returned, but we'll have to live with that choice.)

How do we use this function to solve our original problem?

#include <stdio.h> /* for the printf() prototype */
#include <string.h> /* for the strcpy() prototype */

int main(void)
{
  char arrayA[32] = "Hello, world!";
  char arrayB[32] = "Whatever";
  printf("Before the copy operation, arrayB contains [%s]\n", arrayB);
  strcpy(arrayB, arrayA);
  printf("After the copy, arrayB contains [%s]\n", arrayB);
  return 0;
}

The strlen function

How long is a piece of string? That is, how many characters are there in a C string (not including the null terminator)?

We could write a function to find out:

int length_of_string(char *s)
{
  int length = 0;
  while(*s++)
  {
    ++length;
  }
  return length;
}

but of course C has beaten us to it, with the strlen function. But strlen doesn't return an int. Rather, it returns a special unsigned integer type called size_t, which is actually the type yielded by the sizeof operator. This size_t type is an alias for one of the inbuilt unsigned integer types, so it's at least 16 bits wide (and probably wider on your implementation). Because we don't necessarily know exactly which type it's aliased to, printing a size_t value is a problem. It's a problem we will solve, but not for some time.

The size_t type is not actually built into the language, so to use it you have to include a header where it's defined. Its spiritual home is stddef.h, but it can also be found lurking in places like stdio.h and string.h, so you don't normally have to worry about it (as those headers are so commonly included).

The following program shows how strlen can be used in a practical way:

#include <stdio.h>
#include <string.h>

void reverse(char *s);

int main(void)
{
  char mystring[] = "Able was I, ere I saw Elba!";
  printf("Before reverse: %s\n", mystring);
  reverse(mystring);
  printf("After reverse: %s\n", mystring);
  return 0;
}

void reverse(char *s)
{
  size_t len = strlen(s);
  if(len > 1) /* no point reversing an empty or single-char string */
  {
    char *left = s; /* point to the first character */
    char *right = s + len - 1; /* point to the last non-null character */

    while(left < right)
    {
      char temp = *left;
      *left++ = *right;
      *right-- = temp;
    }
  }
}

This function works by maintaining one pointer to each end of the string, swapping the characters found there (using a spare char as swap space), and then moving the pointers one place towards each other, until they meet.

Did you notice that left, right, and temp are defined in unusual places? C allows us to place definitions at the beginning of any block statement, not just at the beginning of a function. (In 1999, even that restriction was lifted, and it's now possible to place definitions pretty much anywhere you like. But you might be using an old compiler, so I'm sticking to the old rules, because they always work.)

The strcmp function

Although it is possible to use an unindexed array name in a comparison, it doesn't mean what you might hope it would mean: you can't compare strings that way, for example. So we need to use a function instead.

What we need is a function that compares two strings (let's call them left and right) character by character. If the character from left has a lower code point than the character from right, the function should return a negative integer value. If it's higher, the function should return a positive integer value. If they're the same, it should move on to the next character in each string. And if it reaches the end of both strings without encountering a difference, it should return 0.

As an exercise, try to write such a function for yourself. But, except for testing that it works, you'll never have to use that function, because C provides one for you: strcmp. Here is the prototype (which is declared in <string.h>):

int strcmp(const char *left, const char *right);

Let's try it out:

#include <stdio.h>
#include <string.h>

int main(void)
{
  char s[] = "Now is the time for all good men to party.";
  char t[] = "NOW is the time for all good men to party.";
  char u[] = "Nuw is the time for all good men to party.";
  char v[] = "Now is the time for all good men to party.";

  printf("Comparing s and t: %d\n", strcmp(s, t));
  printf("Comparing s and u: %d\n", strcmp(s, u));
  printf("Comparing s and v: %d\n", strcmp(s, v));

  return 0;
}

Because strcmp returns an int, it can be used where an int is expected, such as in a printf argument list in a position that matches the %d conversion specifier. The function is called, and it is the return value of the function that is actually passed to printf. Only the sign of a non-zero result is guaranteed, so the exact numbers printed may differ from one implementation to another.

Summary

In this chapter, you learned how to copy a string using strcpy, and how to find its length using strlen. You also learned how to compare strings using strcmp.

In the next chapter, we will do something else.

Progress

Terminology
  • EOF
  • NULL
  • signed integer type
  • unsigned integer type
  • byte
  • bit
  • variadic function
  • conversion specifier
  • precedence
  • array
Syntax
  • comments
  • const
  • types
  • operators
    • increment and decrement operators
      • ++n n++ --n n--
    • assignment operators
      • = += -= *= /= %=
    • additive operators
      • the + operator
      • the - operator
    • multiplicative operators
      • the * operator
      • the / operator
      • the % operator
    • equality and relational operators
      • == != < > <= >=
    • logical operators
      • the ! operator
      • the && operator
      • the || operator
    • address and indirection operators
      • the unary * operator
      • the unary & operator
      • the array subscripting operator []
    • miscellaneous operators
      • the conditional operator ? :
      • the comma operator ,
      • the sizeof operator
  • control structures
    • if/else
    • while
    • do/while
    • for
Standard library functions, by header