How to (safely) read user input with the getline function

Reading strings in C used to be a very dangerous thing to do. When reading input from the user, programmers might be tempted to use the gets function from the C Standard Library. The usage for gets is simple enough:

char *gets(char *string);

That is, gets reads data from standard input and stores the result in a string variable. Calling gets returns a pointer to the string, or the value NULL if nothing was read.

As a simple example, we might ask the user a question and read the result into a string:

#include <stdio.h>
#include <string.h>

int
main()
{
  char city[10];                       // Such as "Chicago"

  // this is bad .. please don't use gets
  // (gets was removed from the C standard in C11)

  puts("Where do you live?");
  gets(city);

  // strlen returns a size_t, so print it with %zu
  printf("<%s> is length %zu\n", city, strlen(city));

  return 0;
}

Entering a relatively short value with the above program works well enough:

Where do you live?
Chicago
<Chicago> is length 7

However, the gets function is very simple, and will naively read data until it thinks the user is finished. But gets doesn't check that the string is long enough to hold the user's input. Entering a very long value will cause gets to store more data than the string variable can hold, overwriting other parts of memory.

Where do you live?
Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch
<Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch> is length 58
Segmentation fault (core dumped)

At best, overwriting parts of memory simply breaks the program. At worst, this introduces a critical security bug where a malicious user can insert arbitrary data into the computer's memory via your program.

That's why the gets function is dangerous to use in a program. Using gets, you have no control over how much data your program attempts to read from the user. This often leads to a buffer overflow.

The fgets function has historically been the recommended way to read strings safely. This version of gets provides a safety check by only reading up to a certain number of characters, passed as a function argument:

char *fgets(char *string, int size, FILE *stream);

The fgets function reads from the file pointer and stores data into a string variable, but only up to the length indicated by size. We can test this by updating our sample program to use fgets instead of gets:

#include <stdio.h>
#include <string.h>

int
main()
{
  char city[10];                       // Such as "Chicago"

  // fgets is better but not perfect

  puts("Where do you live?");
  fgets(city, 10, stdin);

  // strlen returns a size_t, so print it with %zu
  printf("<%s> is length %zu\n", city, strlen(city));

  return 0;
}

If you compile and run this program, you can enter an arbitrarily long city name at the prompt. However, the program will only read enough data to fit into a string variable of size=10. And because C adds a null ('\0') character to the ends of strings, that means fgets will only read 9 characters into the string:

Where do you live?
Minneapolis
<Minneapol> is length 9

While this is certainly safer than using gets to read user input, it does so at the cost of "cutting off" your user's input if it is too long.

A more flexible solution to reading long data is to allow the string-reading function to allocate more memory to the string if the user entered more data than the variable can hold. By resizing the string variable as necessary, the program always has enough room to store the user's input.

The getline function does exactly that. This function reads input from an input stream, such as the keyboard or a file, and stores the data in a string variable. But unlike fgets and gets, getline resizes the string with realloc to ensure there's enough memory to store the complete input.

ssize_t getline(char **pstring, size_t *size, FILE *stream);

The getline function is actually a wrapper to a similar function called getdelim that reads data up to a special delimiter character. In this case, getline uses a newline ('\n') as the delimiter, because when reading user input either from the keyboard or from a file, lines of data are separated by a newline character.

The result is a much safer method to read arbitrary data, one line at a time. To use getline, define a string pointer and set it to NULL to indicate no memory has been set aside yet. Also define a "string size" variable of type size_t and give it a zero value. When you call getline, you'll pass pointers to both the string and the string size variables, and indicate where to read data. For a sample program, we can read from the standard input:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main()
{
  char *string = NULL;
  size_t size = 0;
  ssize_t chars_read;

  // read a long string with getline

  puts("Enter a really long string:");

  chars_read = getline(&string, &size, stdin);
  printf("getline returned %zd\n", chars_read);

  // check for errors

  if (chars_read < 0) {
    puts("couldn't read the input");
    free(string);
    return 1;
  }

  // print the string

  printf("<%s> is length %zu\n", string, strlen(string));

  // free the memory used by string

  free(string);

  return 0;
}

As getline reads data, it will automatically reallocate more memory for the string variable as needed. When the function has read all the data from one line, it updates the size of the string via the pointer, and returns the number of characters read, including the delimiter.

Enter a really long string:
Supercalifragilisticexpialidocious
getline returned 35
<Supercalifragilisticexpialidocious
> is length 35

Note that the string includes the delimiter character. For getline, the delimiter is the newline, which is why the output has a line feed in there. If you don't want the delimiter in your string value, you can use another function to change the delimiter to a null character in the string.

With getline, programmers can safely avoid one of the common pitfalls of C programming. You can never tell what data your user might try to enter, which is why using gets is unsafe, and fgets is awkward. Instead, getline offers a more flexible way to read user data into your program without breaking the system.
