C standard functions for working with strings. Functions for processing strings in C. Main functions of the standard library string.h


So, strings in C. The language has no separate data type for them, unlike many other programming languages. In C, a string is an array of characters. The end of a string is marked by the null character '\0', which we discussed in the last part of this lesson. It is not displayed on the screen in any way, so you won't be able to see it.

Creating and Initializing a String

Since a string is an array of characters, declaring and initializing a string works much like the same operations on one-dimensional arrays.

The following code illustrates the different ways to initialize strings.

Listing 1.

char str[10];
char str1[10] = {'Y','o','n','g','C','o','d','e','r','\0'};
char str2[10] = "Hello!";
char str3[] = "Hello!";

Fig.1 Declaration and initialization of strings

On the first line we simply declare an array of ten characters. It is not really a string yet: it contains no null character \0, so for now it is just a set of characters.

Second line. The most straightforward way to initialize: we specify each character separately. The main thing here is not to forget to add the null character \0.

The third line is analogous to the second. Pay attention to the picture: because the string literal on the right has fewer characters than the array has elements, the remaining elements are filled with \0.

Fourth line. As you can see, there is no size specified here. The program will calculate it automatically and create an array of characters of the required length. In this case, the null character \0 will be inserted last.
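To make the role of the terminator concrete, here is a small sketch. The helper name is_terminated is hypothetical (it is not a library function); it only checks whether a character array holds a \0 somewhere within its capacity, which is what separates a C string from a plain set of characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the standard library):
   returns 1 if a null character occurs within the first
   `capacity` bytes of buf, and 0 otherwise. */
int is_terminated(const char *buf, size_t capacity)
{
    return memchr(buf, '\0', capacity) != NULL;
}
```

For an array declared as char str2[10] = "Hello!"; the call is_terminated(str2, sizeof str2) returns 1. For an array declared without a size, as in the fourth line, sizeof gives 7: six letters plus the terminator added by the compiler.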

How to output a string

Let's expand the code above into a full-fledged program that will display the created strings on the screen.

Listing 2.

#include <stdio.h>

int main(void)
{
    char str[10];
    char str1[10] = {'Y','o','n','g','C','o','d','e','r','\0'};
    char str2[10] = "Hello!";
    char str3[] = "Hello!";

    /* str is uninitialized, so this prints whatever happens to be in memory */
    for (int i = 0; i < 10; i = i + 1)
        printf("%c\t", str[i]);
    printf("\n");

    puts(str1);
    printf("%s\n", str2);
    fputs(str3, stdout);
    return 0;
}


Fig.2 Various ways to display a string on the screen

As you can see, there are several basic ways to display a string on the screen.

  • use the printf function with the %s specifier
  • use the puts function
  • use the fputs function, specifying the standard stream for output as stdout as the second parameter.

The only nuance concerns puts and fputs: puts appends a newline after the string, while fputs does not.

As you can see, output is quite simple.

Entering strings

String input is a little more complicated than output. The simplest way would be the following:

Listing 3.

#include <stdio.h>

int main(void)
{
    char str[20];

    gets(str);   /* unsafe: no bounds check; removed from the language in C11 */
    puts(str);
    return 0;
}

The gets function pauses the program, reads a string of characters entered from the keyboard, and places it in a character array, the name of which is passed to the function as a parameter.
The gets function stops reading at the newline produced by the Enter key; the newline is not stored, and a null character is written in its place.
Noticed the danger? If not, the compiler will kindly warn you about it. The problem is that gets only stops when the user presses Enter, so it can write past the end of the array: in our case, as soon as more than 19 characters are entered (the twentieth element is needed for the null character).
By the way, buffer overflow errors were previously considered the most common type of vulnerability. They still exist, but using them to hack programs has become much more difficult.

So what do we have? We have a task: write a string to an array of limited size. That is, we must somehow control the number of characters entered by the user. And here the fgets function comes to our aid:

Listing 4.

#include <stdio.h>

int main(void)
{
    char str[10];

    fgets(str, 10, stdin);
    puts(str);
    return 0;
}

The fgets function takes three arguments as input: the variable to write the string to, the size of the string to be written, and the name of the stream from which to get the data to write to the string, in this case stdin. As you already know from Lesson 3, stdin is the standard input stream usually associated with the keyboard. It is not at all necessary that the data must come from the stdin stream; in the future we will also use this function to read data from files.

If we enter a string longer than 9 characters while running this program, only the first 9 characters are written to the array, followed by the null character: fgets "cuts" the string to the required length.

Please note that the fgets function does not read 10 characters, but 9! As we remember, in strings the last character is reserved for the null character.
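The behavior described above can be wrapped in a small helper. This is only a sketch: the name read_line and its return convention are assumptions made for illustration, not part of the standard library. It calls fgets, removes the newline that fgets keeps, and reports whether the line fit into the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size - 1 characters of one line
   from `in` into buf. Returns 1 if the line ended with '\n' (the
   newline is removed from buf), 0 if no newline was read (the line
   was longer than the buffer, or the input ended), and -1 on end of
   file or a read error. */
int read_line(char *buf, int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* drop the newline that fgets keeps */
        return 1;
    }
    return 0;                  /* the rest of the line is still in the stream */
}
```

With a ten-element buffer and the input 1234567890, the first call stores 123456789 and returns 0; a second call then picks up the leftover 0 and returns 1.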

Let's check it out. Let's run the program from the last listing. And enter the line 1234567890. The line 123456789 will be displayed on the screen.


Fig. 3 Example of the fgets function

The question arises. Where did the tenth character go? And I will answer. It hasn't gone anywhere, it remains in the input stream. Run the following program.

Listing 5.

#include <stdio.h>

int main(void)
{
    char str[10];

    fgets(str, 10, stdin);
    puts(str);

    int h = 99;
    printf("before %d\n", h);
    scanf("%d", &h);
    printf("after %d\n", h);
    return 0;
}

Here is the result of running it.


Fig.4 Non-empty stdin buffer

Let me explain what happened. We called fgets. It waited for us to enter data. We typed 1234567890\n on the keyboard (by \n I mean pressing the Enter key). This went into the stdin input stream. The fgets function, as expected, took the first 9 characters 123456789 from the input stream, appended the null character \0, and wrote the result to the string str. The characters 0\n are still left in the input stream.

Next we declare the variable h and display its value on the screen. Then we call scanf. We expect to be able to enter something here, but 0\n is still sitting in the input stream, so scanf takes it as our input and writes 0 to the variable h. Then we display it on the screen.

This is, of course, not the behavior we expect. To deal with this problem, we need to clear the input buffer after we have read the user's input from it. The fflush function is used here for this purpose. It has a single parameter: the stream to be cleared. Keep in mind that the C standard defines fflush only for output streams; clearing stdin with it is an extension that only some C libraries provide, and elsewhere the effect of the call is undefined.

Let's fix the last example so that it works predictably.

Listing 6.

#include <stdio.h>

int main(void)
{
    char str[10];

    fgets(str, 10, stdin);
    fflush(stdin); /* clear the input stream (relies on a library extension) */
    puts(str);

    int h = 99;
    printf("before %d\n", h);
    scanf("%d", &h);
    printf("after %d\n", h);
    return 0;
}

Now the program will work as it should.


Fig.5 Flushing the stdin buffer with the fflush function
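Since the C standard does not define fflush for input streams, here is a sketch of a portable alternative: read and throw away everything up to the end of the current line. The function name discard_line is an assumption made for this example; it is not a library function.

```c
#include <stdio.h>

/* Hypothetical helper: reads and discards characters from `in`
   up to and including the next '\n', or until end of file. */
void discard_line(FILE *in)
{
    int ch;
    while ((ch = getc(in)) != '\n' && ch != EOF)
        ;   /* nothing to do: the character is simply dropped */
}
```

Call it after fgets only when the buffer does not already contain '\n'; otherwise the whole line has been consumed and discard_line would swallow the next one.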

To summarize, two facts can be noted. First: using the gets function is unsafe, so it is recommended to use fgets everywhere instead.

Second: don't forget to clear the input buffer when you use fgets and the entered line did not fit into the array.

This concludes the conversation about entering strings. Let's move on.

Hello, Habr!

Not long ago, a rather interesting incident happened to me, involving a teacher from a computer science college.

The conversation about Linux programming slowly turned into this person arguing that the complexity of systems programming is actually greatly exaggerated: that the C language is as simple as a match, and so, in fact, is the Linux kernel (in his words).

I had with me a laptop with Linux, which contained a gentleman's set of utilities for development in C (gcc, vim, make, valgrind, gdb). I don't remember what goal we set for ourselves then, but after a couple of minutes my opponent found himself at this laptop, completely ready to solve the problem.

And literally on the very first lines he made a serious mistake when allocating memory for... a string.

char *str = (char *)malloc(sizeof(char) * strlen(buffer));
Here buffer is a stack variable into which data from the keyboard was written.

I think there will definitely be people who will ask, "How could there be anything wrong with this?"
Believe me, there can.

What exactly is wrong is explained below.

A little theory: a quick refresher.

If you already know this, skip to the next heading.

A string in C is an array of characters that must always end with '\0', the end-of-string character. Strings on the stack (static ones) are declared like this:

char str[n] = { 0 };
n is the size of the character array: the maximum string length plus one element for the terminator.

The initializer { 0 } zeroes the string (it is optional; you can declare the array without it). The result is the same as calling memset(str, 0, sizeof(str)) or bzero(str, sizeof(str)). It is used to prevent garbage from being left in uninitialized variables.

You can also immediately initialize a string on the stack:

char buf[] = "default buffer text\n";
In addition, a string can be declared as a pointer and memory can be allocated for it on the heap:

char *str = malloc(size);
size is the number of bytes that we allocate for the string. Such strings are called dynamic (because the required size is computed at run time, and the allocated size can be changed at any time with the realloc() function).

In the case of a stack variable, I used the notation n to determine the size of the array; in the case of a heap variable, I used the notation size. And this perfectly reflects the true essence of the difference between a declaration on the stack and a declaration with memory allocation on the heap, because n is usually used when talking about the number of elements. And size is a completely different story...

Valgrind will help us

I mentioned it in my previous article as well. Valgrind is a very useful program that helps the programmer track down memory leaks and context errors: exactly the things that pop up most often when working with strings.

Let's look at a short listing that implements something similar to the program I mentioned and run it through valgrind:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HELLO_STRING "Hello, Habr!\n"

int main(void)
{
    /* The bug under discussion: strlen() does not count the terminating '\0' */
    char *str = malloc(sizeof(char) * strlen(HELLO_STRING));
    strcpy(str, HELLO_STRING);
    printf("->\t%s", str);
    free(str);
    return 0;
}
And here is the result of running the program:

$ gcc main.c
$ ./a.out
-> Hello, Habr!
Nothing unusual yet. Now let's run this program with valgrind!

$ valgrind --tool=memcheck ./a.out
==3892== Memcheck, a memory error detector
==3892== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3892== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3892== Command: ./a.out
==3892==
==3892== Invalid write of size 2
==3892==    at 0x4005B4: main (in /home/indever/prg/C/public/a.out)
==3892==  Address 0x520004c is 12 bytes inside a block of size 13 alloc'd
==3892==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
==3892==    by 0x400597: main (in /home/indever/prg/C/public/a.out)
==3892==
==3892== Invalid read of size 1
==3892==    at 0x4C30BC4: strlen (vg_replace_strmem.c:454)
==3892==    by 0x4E89AD0: vfprintf (in /usr/lib64/libc-2.24.so)
==3892==    by 0x4E90718: printf (in /usr/lib64/libc-2.24.so)
==3892==    by 0x4005CF: main (in /home/indever/prg/C/public/a.out)
==3892==  Address 0x520004d is 0 bytes after a block of size 13 alloc'd
==3892==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
==3892==    by 0x400597: main (in /home/indever/prg/C/public/a.out)
==3892==
-> Hello, Habr!
==3892==
==3892== HEAP SUMMARY:
==3892==     in use at exit: 0 bytes in 0 blocks
==3892==   total heap usage: 2 allocs, 2 frees, 1,037 bytes allocated
==3892==
==3892== All heap blocks were freed -- no leaks are possible
==3892==
==3892== For counts of detected and suppressed errors, rerun with: -v
==3892== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)
==3892== All heap blocks were freed -- no leaks are possible: there are no leaks, which is good news. But look a little further down (although, I should note, this is only the summary; the main information is in a different place):

==3892== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)
3 errors. In 2 contexts. In such a simple program. How!?

It is very simple. The whole "funny thing" is that the strlen function does not count the end-of-string character '\0'. Even if you specify it explicitly in the string (#define HELLO_STRING "Hello, Habr!\n\0"), it will not be counted.

Just above the program's output line -> Hello, Habr! there is a detailed report of what our precious valgrind did not like and where. I suggest you look at these lines yourself and draw your own conclusions.

The correct version of the program looks like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HELLO_STRING "Hello, Habr!\n"

int main(void)
{
    /* + 1 leaves room for the terminating '\0' */
    char *str = malloc(sizeof(char) * (strlen(HELLO_STRING) + 1));
    strcpy(str, HELLO_STRING);
    printf("->\t%s", str);
    free(str);
    return 0;
}
Let's run it through valgrind:

$ valgrind --tool=memcheck ./a.out
-> Hello, Habr!
==3435==
==3435== HEAP SUMMARY:
==3435==     in use at exit: 0 bytes in 0 blocks
==3435==   total heap usage: 2 allocs, 2 frees, 1,038 bytes allocated
==3435==
==3435== All heap blocks were freed -- no leaks are possible
==3435==
==3435== For counts of detected and suppressed errors, rerun with: -v
==3435== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Great. There are no errors: +1 byte of allocated memory solved the problem.

What is interesting is that in most cases the first and second programs will behave the same. But if the memory allocated for the string, in which the terminator did not fit, was not zeroed, then printf(), when outputting such a string, will also output all the garbage after it: everything is printed until a terminating character gets in printf()'s way.
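The strlen + 1 rule can be kept in one place with a small helper. This is a sketch under the assumption that a plain heap copy is all that is needed; the name dup_string is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL if the
   allocation fails. The caller releases the copy with free(). */
char *dup_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* + 1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, src, size);     /* copies the '\0' as well */
    return copy;
}
```

Because the size is computed once and used for both the allocation and the copy, the two can no longer disagree by one byte.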

However, you know, (strlen(str) + 1) is a so-so solution. We face 2 problems:

  1. What if we need to allocate memory for a string generated using, for example, s(n)printf(..)? strlen() does not accept format arguments.
  2. Appearance. The variable declaration line looks just awful. Some people also manage to attach a (char *) cast to malloc, as if they were writing C++. In a program where you regularly need to process strings, it makes sense to find a more elegant solution.
Let's come up with a solution that will satisfy both us and valgrind.

snprintf()

int snprintf(char *str, size_t size, const char *format, ...); is an extension of sprintf that formats a string and writes it to the buffer passed as the first argument. It differs from sprintf() in that it writes no more than size bytes to str, including the terminating null character.

The function has one interesting feature: in any case, it returns the length the complete formatted string would have (not counting the terminating null character), even if the output was truncated. If the string is empty, 0 is returned.

One of the problems I described with using strlen is related to the sprintf() and snprintf() functions. Suppose we need to write something into the string str, and the final string contains the values of other variables. Our code should look something like this:

char *str = /* allocate memory here */;
sprintf(str, "Hello, %s\n", "Habr!");
The question arises: how to determine how much memory should be allocated for the string str?

char *str = malloc(sizeof(char) * (strlen(str, "Hello, %s\n", "Habr!") + 1)); will not work. The strlen() function prototype looks like this:

#include <string.h>
size_t strlen(const char *s);
const char *s does not imply that the string passed in s can be a format string with a variable number of arguments.

The useful property of the snprintf() function, which I mentioned above, will help us here. Let's look at the code for the following program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* snprintf() does not count the terminating '\0', so we add room for it.
       Note: sizeof("\0") is 2 (the written '\0' plus the implicit one),
       so one byte more than strictly necessary is allocated. */
    size_t needed_mem = snprintf(NULL, 0, "Hello, %s!\n", "Habr") + sizeof("\0");
    char *str = malloc(needed_mem);
    snprintf(str, needed_mem, "Hello, %s!\n", "Habr");
    printf("->\t%s", str);
    free(str);
    return 0;
}
Run the program in valgrind:

$ valgrind --tool=memcheck ./a.out
-> Hello, Habr!
==4132==
==4132== HEAP SUMMARY:
==4132==     in use at exit: 0 bytes in 0 blocks
==4132==   total heap usage: 2 allocs, 2 frees, 1,041 bytes allocated
==4132==
==4132== All heap blocks were freed -- no leaks are possible
==4132==
==4132== For counts of detected and suppressed errors, rerun with: -v
==4132== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
$
Great. We have argument support. Because we pass zero as the second argument (the size), snprintf() writes nothing at all, so the null pointer in the first argument never causes a segfault. Despite this, the function still returns the size required for the string.

But on the other hand, we had to introduce an additional variable, and the design

size_t needed_mem = snprintf(NULL, 0, "Hello, %s!\n", "Habr") + sizeof("\0");
looks even worse than in the case of strlen().

Note that appending "\0" to the end of the format string (size_t needed_mem = snprintf(NULL, 0, "Hello, %s!\n\0", "Habr");) does not let us drop + sizeof("\0"): snprintf() stops at the first null character in the format, so the returned length stays the same and the byte for the terminator still has to be added.

We need to do something. I thought a little and decided that now was the time to appeal to the wisdom of the ancients. Let's describe a macro function that will call snprintf() with a null pointer as the first argument and zero as the second. And let's not forget about the end of the string!

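#define strsize(args...) (snprintf(NULL, 0, args) + sizeof("\0"))
Yes, it may be news to some, but C macros support a variable number of arguments: the ellipsis tells the preprocessor that the specified macro argument (in our case, args) corresponds to several real arguments. The named form args... is a GNU extension; standard C99 writes ... in the parameter list and __VA_ARGS__ in the body. The outer parentheses keep the macro safe inside larger expressions.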

Let's check our solution in practice:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define strsize(args...) (snprintf(NULL, 0, args) + sizeof("\0"))

int main(void)
{
    char *str = malloc(strsize("Hello, %s\n", "Habr!"));
    sprintf(str, "Hello, %s\n", "Habr!");
    printf("->\t%s", str);
    free(str);
    return 0;
}
Let's run it under valgrind:

$ valgrind --tool=memcheck ./a.out
-> Hello, Habr!
==6432==
==6432== HEAP SUMMARY:
==6432==     in use at exit: 0 bytes in 0 blocks
==6432==   total heap usage: 2 allocs, 2 frees, 1,041 bytes allocated
==6432==
==6432== All heap blocks were freed -- no leaks are possible
==6432==
==6432== For counts of detected and suppressed errors, rerun with: -v
==6432== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Yes, there are no errors. Everything is correct. Valgrind is happy, and the programmer can finally go to sleep.
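The same measure-then-allocate idea can also be written as an ordinary function instead of a macro, which works in strictly standard C99. The sketch below is an alternative to the macro, not the author's solution; the name format_alloc is hypothetical. It relies on vsnprintf and va_copy, both of which are standard since C99.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats like printf into a freshly allocated
   string. Returns NULL on a formatting or allocation error; the
   caller releases the result with free(). */
char *format_alloc(const char *fmt, ...)
{
    va_list args, args_copy;
    va_start(args, fmt);
    va_copy(args_copy, args);            /* the list is consumed twice */

    /* First pass: size 0 means nothing is written, only measured. */
    int len = vsnprintf(NULL, 0, fmt, args);
    va_end(args);

    char *str = NULL;
    if (len >= 0) {
        str = malloc((size_t)len + 1);   /* + 1 for the terminating '\0' */
        if (str != NULL)
            vsnprintf(str, (size_t)len + 1, fmt, args_copy);
    }
    va_end(args_copy);
    return str;
}
```

A call such as format_alloc("Hello, %s!\n", "Habr") then replaces both the size computation and the sprintf call, and the format string is written only once.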

But finally, I'll say one more thing. In case we need to allocate memory for any string (even one with arguments), there is already a fully working ready-made solution.

We are talking about the asprintf function:

#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <stdio.h>

int asprintf(char **strp, const char *fmt, ...);
It takes a pointer to a string pointer (char **strp) as its first argument, allocates the memory itself, and stores the address of the new string in the dereferenced pointer.

Our program written using asprintf() will look like this:

#define _GNU_SOURCE   /* required for asprintf() in glibc */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *str;
    if (asprintf(&str, "Hello, %s!\n", "Habr") < 0)
        return 1;   /* on failure the contents of str are undefined */
    printf("->\t%s", str);
    free(str);
    return 0;
}
And here it is under valgrind:

$ valgrind --tool=memcheck ./a.out
-> Hello, Habr!
==6674==
==6674== HEAP SUMMARY:
==6674==     in use at exit: 0 bytes in 0 blocks
==6674==   total heap usage: 3 allocs, 3 frees, 1,138 bytes allocated
==6674==
==6674== All heap blocks were freed -- no leaks are possible
==6674==
==6674== For counts of detected and suppressed errors, rerun with: -v
==6674== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Everything is fine, but, as you can see, more memory has been allocated, and there are now three allocs, not two. On weak embedded systems, using this function is undesirable.
In addition, if we write man asprintf in the console, we will see:

CONFORMING TO These functions are GNU extensions, not in C or POSIX. They are also available under *BSD. The FreeBSD implementation sets strp to NULL on error.

From here it is clear that this function is a GNU extension (also available on the BSDs) and is not part of standard C or POSIX.

Conclusion

In conclusion, I want to say that working with strings in C is a very complex topic that has a number of nuances. For example, to write "safe" code with dynamic memory allocation, it is still recommended to use calloc() instead of malloc(): calloc fills the allocated memory with zeros. Alternatively, call memset() after allocating the memory. Otherwise, the garbage that was initially in the allocated memory area may cause problems during debugging, and sometimes when working with the string.
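As a minimal sketch of the calloc() advice (the helper name new_empty_string is an assumption made for this example), a zero-filled buffer is already a valid empty string, and it stays terminated as long as the last byte is never overwritten:

```c
#include <stdlib.h>

/* Hypothetical helper: allocates a zero-filled buffer able to hold a
   string of up to max_len characters plus the terminating '\0'.
   Returns NULL if the allocation fails. */
char *new_empty_string(size_t max_len)
{
    return calloc(max_len + 1, sizeof(char));
}
```

The returned buffer is released with free() like any other heap allocation.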

More than half of the C programmers I know (most of them beginners) who solved the problem of allocating memory for strings at my request did it in a way that ultimately led to context errors. In one case it even led to a memory leak (well, the person forgot to call free(str); it could happen to anyone). As a matter of fact, this is what prompted me to write the piece you have just read.

I hope this article will be useful to someone. Why am I making all this fuss? Because no language is simple. Each has its own subtleties, and the more subtleties of the language you know, the better your code.

I believe that after reading this article your code will become a little better :)
Good luck, Habr!

Strings. String input/output. Formatted I/O. String processing using standard C language functions. Working with memory.

1.1. Declaration and initialization of strings.

A string is an array of characters that ends with the null character '\0'. A string is declared as a regular character array, for example,

char s1[10]; // string nine characters long

char *s2; // pointer to string

The difference between s1 and s2 is that s1 is an array, whose name behaves like a constant pointer to its first element, while s2 is a pointer variable.

String constants are enclosed in double quotes, as opposed to characters, which are enclosed in single quotes. For example,

"This is a string."

The C89 standard guarantees support for string constants of up to 509 characters (C99 raised this minimum to 4095). Many implementations allow longer strings.

When initializing strings, it is better not to specify the array size; the compiler will do this by calculating the length of the string and adding one to it. For example,

char s1[] = "This is a string.";

In the C programming language, there are a large number of functions for working with strings, the prototypes of which are described in the header files stdlib.h and string.h. Working with these functions will be discussed in the following paragraphs.

1.2. String input/output.

To enter a string from the console, use the function

char* gets(char *str);

which writes a string to the address str and returns the address of the entered string. The function stops input if it encounters a '\n' or EOF (end of file) character. The newline character is not copied. A zero byte is placed at the end of the read line. If successful, the function returns a pointer to the line read, and if unsuccessful, NULL.

To output a string to the console, use the standard function

int puts (const char *s);

which, if successful, returns a non-negative number, and if unsuccessful, returns EOF.

The gets and puts function prototypes are described in the stdio.h header file.

#include <stdio.h>

printf("Input String: ");

1.3. Formatted I/O.

For formatted data input from the console, use the function

int scanf (const char *format, ...);

which, if successful, returns the number of units of data read, and if unsuccessful, returns EOF. The format parameter must point to the string to be formatted, which contains the input format specifications. The number and types of arguments that follow the format string must match the number and types of input formats specified in the format string. If this condition is not met, then the result of the function is unpredictable.

A space, '\t' or '\n' character in the format string matches any amount of whitespace in the input stream, including none; the whitespace characters are space, '\t', '\n', '\v' and '\f'. The scanf function skips whitespace in the input stream.

Literal characters in a format string, with the exception of the % character, require that exactly the same characters appear in the input stream. If there is no such character, scanf stops reading. Matching literal characters are read and discarded.

In general, the input format specification looks like this:

%[*] [width] [modifiers] type

- '*' means that the field defined by this specification is read but not stored;

- 'width' defines the maximum number of characters read for this specification;

The type can take the following values:

c – a character (or a character array if a width is given),

s – a string of characters; strings are delimited by whitespace,

d – signed decimal integer (base 10),

i – signed integer; the base is determined by the prefix (0 for octal, 0x for hexadecimal, decimal otherwise),

u – unsigned decimal integer (base 10),

o – unsigned octal integer (base 8),

x, X – unsigned hexadecimal integer (base 16),

e, E, f, g, G – floating-point number,

p – pointer value (the argument is a pointer to a pointer),

n – pointer to an integer that receives the number of characters read so far,

[…] – a set of accepted characters.

In the latter case, only characters listed inside the square brackets are read from the input stream. If the first character inside the brackets is '^', then only characters that are not in the set are read. A range of characters is specified with the '-' symbol. With this specification leading whitespace is not skipped, and a terminating null byte is appended to the stored string.
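As an illustration of the bracket specification, here is a sketch that extracts the leading digits of a string with sscanf. The helper name leading_digits and the 16-byte buffer requirement are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: copies the leading run of decimal digits from
   `input` into `out`, which must have room for at least 16 bytes.
   Returns 1 if at least one digit was found, 0 otherwise. */
int leading_digits(const char *input, char *out)
{
    out[0] = '\0';
    /* %15[0-9] accepts only the characters '0'..'9', at most 15 of them */
    return sscanf(input, "%15[0-9]", out) == 1;
}
```

Note that the input " 42" (with a leading space) yields 0: unlike %d or %s, the bracket specification does not skip leading whitespace.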

Modifiers can take the following values:

h – short integer,

l – long integer or double, L – long double,

and are used only for integer or floating-point numbers.

The following example shows ways of using the scanf function. Note that, starting with the floating-point input, each format specifier is preceded by a space character.

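#include <stdio.h>

int main(void)
{
    int n;
    double d = 0.0;
    char c;
    char s[20];

    printf("Input an integer: ");
    scanf("%d", &n);

    printf("Input a double: ");
    scanf(" %lf", &d);

    printf("Input a char: ");
    scanf(" %c", &c);

    printf("Input a string: ");
    scanf(" %19s", s);   /* s is already an address; the width guards the buffer */

    return 0;
}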

Note that in this program the floating-point variable is initialized. On some older compilers this is what makes the compiler link in the library that supports floating-point numbers; without it, an error occurs at run time when a floating-point number is entered.

For formatted output of data to the console, use the function

int printf (const char *format, ...);

which, if successful, returns the number of characters output, and if unsuccessful, returns a negative value. The format parameter is a format string that contains specifications for output formats. The number and types of arguments that follow the format string must match the number and types of output format specifications specified in the format string. In general, the output format specification looks like this:

%[flags] [width] [.precision] [modifiers] type

- ‘flags’ are various symbols that specify the output format;

- ‘width’ defines the minimum number of characters output according to this specification;

- ‘.precision’ defines the maximum number of characters displayed;

- ‘modifiers’ specify the type of arguments;

- 'type' specifies the type of the argument.

To output signed integers, the following output format is used:

%[-] [+ | space] [width] [l] d

- – left alignment, the default is right alignment;

+ – the '+' sign is displayed for positive numbers; note that for negative numbers the '-' sign is always displayed;

'space' – a space is displayed in the sign position;

d – int data type.

To output unsigned integers, use the following output format:

%[-] [#] [width] [l] [u | o | x | X]

# – a leading 0 is output for octal numbers, or a leading 0x or 0X for hexadecimal numbers,

l – long data type modifier;

u – decimal integer (base 10),

o – octal integer (base 8),

x, X – hexadecimal integer (base 16).

The following output format is used to display floating-point numbers:

%[-] [+ | space] [width] [.precision] [f | e | E | g | G]

'precision' specifies the number of digits after the decimal point for formats f, e and E, or the number of significant digits for formats g and G. Numbers are rounded. The default precision is six digits;

f – fixed-point notation,

e – exponential notation, the exponent is marked with the letter 'e',

E – exponential notation, the exponent is marked with the letter 'E',

g – the shorter of the f and e formats,

G – the shorter of the f and E formats.

printf ("n = %d\n f = %f\n e = %e\n E = %E\n f = %.2f", -123, 12.34, 12.34, 12.34, 12.34);

// prints, one item per line: n = -123 f = 12.340000 e = 1.234000e+01 E = 1.234000E+01 f = 12.34
// (some older C libraries print three exponent digits, e.g. 1.234000e+001)
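To see the flags, width and precision side by side, here is a small sketch that formats a few sample values into a buffer with snprintf. The function name format_samples and the choice of values are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: formats a few sample values into buf to show
   the effect of the '+', '-' and '#' flags, the width and the
   precision. Returns the value returned by snprintf. */
int format_samples(char *buf, size_t size)
{
    return snprintf(buf, size, "[%+d] [%-5d] [%5d] [%#o] [%#x] [%.2f]",
                    5, 42, 42, 8u, 255u, 12.34);
}
```

With a sufficiently large buffer the result is [+5] [42   ] [   42] [010] [0xff] [12.34]: the sign is forced, 42 is padded to five positions on the right and then on the left, and the octal and hexadecimal values get their prefixes.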

1.4. Formatting strings.

There are variants of the scanf and printf functions that are designed to work with strings; they are called sscanf and sprintf, respectively.

int sscanf (const char *str, const char *format, ...);

reads data from the string specified by str, according to the format string specified by format. If successful, returns the number of items read, and if unsuccessful, returns EOF. For example,

#include <stdio.h>

int main(void)
{
    char str[] = "a 10 1.2 String No input";
    char c;
    int n;
    double d;
    char s[20];

    sscanf(str, "%c %d %lf %19s", &c, &n, &d, s);

    printf("%c\n", c); // prints: a
    printf("%d\n", n); // prints: 10
    printf("%f\n", d); // prints: 1.200000
    printf("%s\n", s); // prints: String
    return 0;
}

int sprintf (char *buffer, const char *format, ...);

formats the string in accordance with the format specified by the format parameter and writes the resulting result to the character array buffer. The function returns the number of characters written to the character array buffer, excluding the terminating null byte. For example,

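#include <stdio.h>

int main(void)
{
    char str[] = "c = %c, n = %d, d = %f, s = %s";
    char s[] = "This is a string.";
    char buffer[80];   /* large enough for the formatted result */
    char c = 'c';
    int n = 10;
    double d = 1.2;

    sprintf(buffer, str, c, n, d, s);
    printf("%s\n", buffer); // prints: c = c, n = 10, d = 1.200000, s = This is a string.
    return 0;
}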

1.5. Convert strings to numeric data.

Prototypes of functions for converting strings to numeric data are given in the stdlib.h header file, which must be included in the program.

To convert a string to an integer, use the function

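int atoi (const char *str);

which, if successful, returns the integer to which the string str is converted, and if unsuccessful, returns 0. For example,

char *str = "-123";

n = atoi(str); // n = -123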

To convert a string to a long integer, use the function

long int atol (const char *str);

which, if successful, returns the integer to which the string str is converted, and if unsuccessful, returns 0. For example,

char *str = "-123";

n = atol(str); // n = -123

To convert a string to a double number, use the function

double atof(const char *str);

which, if successful, returns a floating number of type double, into which the string str is converted, and if unsuccessful, returns 0. For example,

char *str = "-123.321";

n = atof(str); // n = -123.321

The following functions perform similar functions to atoi, atol, atof, but provide more advanced functionality.

long int strtol (const char *str, char **endptr, int base);

converts the string str to a long int number, which it returns. The parameters of this function have the following purposes.

If base is 0, then the conversion depends on the first two characters of str:

If the first character is a digit from 1 to 9, the number is taken to be decimal (base 10);

If the first character is the digit 0 and the second character is a digit from 0 to 7, the number is taken to be octal (base 8);

If the first character is 0 and the second is 'X' or 'x', the number is taken to be hexadecimal (base 16).

If base is a number between 2 and 36, then that value is taken as the base of the number system, and any character outside that number system stops the conversion. In bases 11 to 36, the symbols 'A' to 'Z' or 'a' to 'z' are used to represent digits.

The value of the endptr argument is set by the strtol function. It points to the character at which the conversion of str stopped. The strtol function returns the converted number if successful, and 0 if unsuccessful. For example,

char *p;

long n = strtol("12a", &p, 0);

printf("n = %ld, stop = %c\n", n, *p); // n = 12, stop = a

n = strtol("012b", &p, 0);

printf("n = %ld, stop = %c\n", n, *p); // n = 10, stop = b

n = strtol("0x12z", &p, 0);

printf("n = %ld, stop = %c\n", n, *p); // n = 18, stop = z

n = strtol("01117", &p, 2);

printf("n = %ld, stop = %c\n", n, *p); // n = 7, stop = 7

unsigned long int strtoul (const char *str, char **endptr, int base);

works like the strtol function, but converts the symbolic representation of a number to a value of type unsigned long int.

double strtod (const char *str, char **endptr);

Converts the symbolic representation of a number to a double.

All functions listed in this section stop converting at the first character that does not fit the format of the number in question.

In addition, if the value represented by the string is outside the range of the corresponding data type, the functions strtol, strtoul, and strtod set the errno variable to ERANGE (for atof the standard gives no such guarantee). The errno variable and the ERANGE constant are defined in the errno.h header file, and HUGE_VAL is defined in math.h. In this case, strtod returns HUGE_VAL with the appropriate sign, strtol returns LONG_MAX or LONG_MIN, and strtoul returns ULONG_MAX.

The non-standard functions itoa, ltoa, utoa, ecvt, fcvt, and gcvt can be used to convert numeric data to character strings. But it is better to use the standard sprintf function for these purposes.

1.6. Standard functions for working with strings.

This section discusses functions for working with strings, the prototypes of which are described in the header file string.h.

1. String comparison. The functions strcmp and strncmp are used to compare strings.

int strcmp (const char *str1, const char *str2);

lexicographically compares the strings str1 and str2 and returns a negative value, zero, or a positive value if str1 is respectively less than, equal to, or greater than str2. The exact nonzero values are not specified, so only the sign of the result should be tested.

int strncmp (const char *str1, const char *str2, size_t n);

lexicographically compares at most the first n characters of the strings str1 and str2. The function returns a negative value, zero, or a positive value if the first n characters of str1 are respectively less than, equal to, or greater than the first n characters of str2.

// example of string comparison
#include <stdio.h>
#include <string.h>

int main(void)
{
    char str1[] = "aa bb";
    char str2[] = "aa aa";
    char str3[] = "aa bb cc";

    printf("%d\n", strcmp(str1, str3));     // prints a negative value
    printf("%d\n", strcmp(str1, str1));     // prints: 0
    printf("%d\n", strcmp(str1, str2));     // prints a positive value
    printf("%d\n", strncmp(str1, str3, 5)); // prints: 0
    return 0;
}

2. Copying strings. The strcpy and strncpy functions are used to copy strings.

char *strcpy (char *str1, const char *str2);

copies the string str2 to the string str1. The entire string str2 is copied, including the terminating null byte. The function returns a pointer to str1. If the strings overlap, the behavior is undefined.

char *strncpy (char *str1, const char *str2, size_t n);

copies at most n characters from string str2 to string str1. If str2 contains fewer than n characters, the rest of str1 is filled with null bytes until n characters in total have been written. If str2 contains n or more characters, no terminating null byte is written to str1. The function returns a pointer to the string str1.

char str1[20];

char str2[] = "Copy string.";

strcpy(str1, str2);

printf("%s\n", str1); // prints: Copy string.

4. Concatenating strings. The functions strcat and strncat are used to concatenate strings into one string.

char* strcat (char *str1, const char *str2);

appends string str2 to string str1, overwriting the terminating null byte of str1. The array str1 must be large enough to hold the result. The function returns a pointer to the string str1.

char* strncat (char *str1, const char *str2, size_t n);

appends at most n characters from string str2 to string str1, overwriting the terminating null byte of str1. If the length of str2 is less than n, only the characters contained in str2 are appended. After concatenation, a null byte is always added to str1. The function returns a pointer to the string str1.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str1[32] = "String ";    // large enough for the final result
    char str2[] = "catenation ";
    char str3[] = "Yes No";

    strcat(str1, str2);
    printf("%s\n", str1); // prints: String catenation

    strncat(str1, str3, 3);
    printf("%s\n", str1); // prints: String catenation Yes
    return 0;
}

5. Search for a character in a string. To search for a character in a string, use the functions strchr, strrchr, strspn, strcspn and strpbrk.

char* strchr (const char *str, int c);

searches for the first occurrence of the character specified by c in the string str. If successful, the function returns a pointer to the first character found, and if unsuccessful, NULL.

char* strrchr (const char *str, int c);

searches for the last occurrence of the character specified by c in the string str. If successful, the function returns a pointer to the last character found, and if unsuccessful, NULL.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[] = "Char search";

    printf("%s\n", strchr(str, 'r'));  // prints: r search
    printf("%s\n", strrchr(str, 'r')); // prints: rch
    return 0;
}

size_t strspn (const char *str1, const char *str2);

returns the index of the first character from str1 that is not in str2.

size_t strcspn (const char *str1, const char *str2);

returns the index of the first character from str1 that appears in str2.

char str[] = "123 abc";

printf("n = %zu\n", strspn(str, "321"));  // prints: n = 3

printf("n = %zu\n", strcspn(str, "cba")); // prints: n = 4

char* strpbrk (const char *str1, const char *str2);

finds the first character in the string str1 that is equal to one of the characters in the string str2. If successful, the function returns a pointer to this character, and if unsuccessful, NULL.

char str[] = "123 abc";

printf("%s\n", strpbrk(str, "bca")); // prints: abc

6. Searching for a substring. The strstr function is used to find one string inside another.

char* strstr (const char *str1, const char *str2);

finds the first occurrence of str2 (without the terminating null byte) in str1. If successful, the function returns a pointer to the found substring, and if unsuccessful, NULL. If str2 points to a zero-length string, the function returns the str1 pointer.

char str[] = "123 abc 456";

printf("%s\n", strstr(str, "abc")); // prints: abc 456

7. Parsing a string into tokens. The strtok function is used to parse a string into tokens.

char* strtok (char *str1, const char *str2);

returns a pointer to the next token (word) in the string str1, in which the token delimiters are characters from the string str2. If there are no more tokens, the function returns NULL. On the first call to the strtok function, the str1 parameter must point to a string that is tokenized, and on subsequent calls this parameter must be set to NULL. After finding a token, the strtok function writes a null byte after this token in place of the delimiter.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[] = "12 34 ab cd";
    char *p = strtok(str, " ");

    while (p != NULL) {
        printf("%s\n", p); // prints the values in a column: 12 34 ab cd
        p = strtok(NULL, " ");
    }
    return 0;
}

8. Determining the length of a string. The strlen function is used to determine the length of a string.

size_t strlen (const char *str);

returns the length of the string, ignoring the last null byte. For example,

char str[] = "123";

printf("len = %zu\n", strlen(str)); // prints: len = 3

1.7. Functions for working with memory.

The header file string.h also declares functions for working with memory blocks, which are similar to the corresponding functions for working with strings. Unlike the string functions, they do not stop at a null byte: the number of bytes to process is always given by n.

void* memchr (const void *str, int c, size_t n);

searches for the first occurrence of the byte specified by c in the first n bytes of the memory block str.

int memcmp (const void *str1, const void *str2, size_t n);

compares the first n bytes of the memory blocks str1 and str2.

void* memcpy (void *str1, const void *str2, size_t n);

copies the first n bytes from the block str2 to the block str1. The blocks must not overlap.

void* memmove (void *str1, const void *str2, size_t n);

copies the first n bytes from str2 to str1, ensuring that overlapping blocks are handled correctly.

void* memset (void *str, int c, size_t n);

writes the byte specified by c into the first n bytes of str.


--- C# Guide --- Strings

From the point of view of everyday programming, the string data type is one of the most important in C#. This type defines and supports character strings. In a number of other programming languages, a string is an array of characters. In C#, strings are objects, so string is a reference type.

Building strings

The simplest way to construct a character string is to use a string literal. For example, in the following line of code, the string reference variable str is assigned a reference to a string literal:

string str = "Example string";

In this case, the variable str is initialized with the sequence of characters "Example string". An object of type string can also be created from an array of type char. For example:

char[] chararray = { 'e', 'x', 'a', 'm', 'p', 'l', 'e' };
string str = new string(chararray);

Once a string object is created, it can be used anywhere you need a string of text enclosed in quotes.

String immutability

Oddly enough, the contents of an object of type string cannot be changed. This means that once a character sequence has been created, it cannot be changed. But this limitation contributes to a more efficient implementation of character strings. Therefore, this seemingly obvious disadvantage actually turns into an advantage. Thus, if a string is required as a variation of an existing string, then for this purpose a new string should be created containing all the necessary changes. And since unused string objects are automatically collected as garbage, you don’t even have to worry about the fate of unnecessary strings.

It should be emphasized, however, that variable references to strings (that is, objects of type string) are subject to change, and therefore they can refer to another object. But the contents of the string object itself do not change after it is created.

Let's look at an example:

static void addNewString()
{
    string s = "This is my stroke";
    s = "This is new stroke";
}

Let's compile the application and load the resulting assembly into the ildasm.exe utility. The figure shows the CIL code that will be generated for the void addNewString() method:

Note that there are numerous calls to the ldstr (string load) opcode. This CIL ldstr opcode performs loading of a new string object onto the managed heap. As a result, the previous object that contained the value "This is my stroke" will eventually be garbage collected.

Working with Strings

The System.String class provides a set of methods for determining the length of character data, searching for a substring in the current string, converting characters from uppercase to lowercase and vice versa, and so on. Next we will look at this class in more detail.

Field, Indexer, and Property of the String Class

The String class defines a single field:

public static readonly string Empty;

The Empty field denotes an empty string, i.e. a string that does not contain any characters. This is different from a null String reference, which does not refer to any object.

In addition, the String class defines a single read-only indexer:

public char this[int index] { get; }

This indexer allows you to get the character at a specified index. Indexing of strings, like arrays, starts from zero. String objects are immutable, so it makes sense that the String class supports only a read-only indexer.

Finally, the String class defines a single read-only property:

public int Length { get; }

The Length property returns the number of characters in the string. The example below shows the use of the indexer and the Length property:

using System;

class Example
{
    static void Main()
    {
        string str = "Simple string";

        // Get the length of the string and its 6th character using the indexer
        Console.WriteLine("Length of the string is {0}, 6th character is '{1}'",
                          str.Length, str[5]);
    }
}

String Class Operators

The String class overloads the following two operators: == and !=. The == operator tests two character strings for equality. When == is applied to object references, it normally tests whether both references point to the same object. But when == is applied to references of type String, the contents of the strings themselves are compared for equality. The same applies to the != operator: for String references, the contents of the strings are compared for inequality. The other relational operators, such as < and >=, are not overloaded for String. To check whether one string is greater than another, call the Compare() method defined in the String class.

As will become clear, many kinds of character string comparison rely on culture information. But this does not apply to the == and != operators. They simply compare the ordinal values of the characters in the strings. (In other words, they compare the binary values of characters that have not been modified by cultural norms, that is, by locale settings.) Therefore, these operators perform string comparisons in a case-sensitive and culture-insensitive manner.

String class methods

The following table lists some of the most interesting methods in this class, grouped by purpose:

Methods for working with strings
Method Structure and overloads Purpose
String comparison
Compare() public static int Compare(string strA, string strB)

public static int Compare(string strA, string strB, bool ignoreCase)

public static int Compare(string strA, string strB, StringComparison comparisonType)

public static int Compare(string strA, string strB, bool ignoreCase, CultureInfo culture)

Static method that compares the string strA with the string strB. Returns a positive value if strA is greater than strB, a negative value if strA is less than strB, and zero if the strings are equal. By default, the comparison is case-sensitive and culture-sensitive.

If ignoreCase evaluates to true, the comparison does not take into account differences between uppercase and lowercase letters. Otherwise, these differences are taken into account.

The comparisonType parameter specifies the specific way strings are compared. The CultureInfo class is defined in the System.Globalization namespace.

public static int Compare(string strA, int indexA, string strB, int indexB, int length)

public static int Compare(string strA, int indexA, string strB, int indexB, int length, bool ignoreCase)

public static int Compare(string strA, int indexA, string strB, int indexB, int length, StringComparison comparisonType)

public static int Compare(string strA, int indexA, string strB, int indexB, int length, bool ignoreCase, CultureInfo culture)

Compares parts of the strings strA and strB. The comparison starts at the elements strA[indexA] and strB[indexB] and covers the number of characters specified by the length parameter. The method returns a positive value if the part of strA is greater than the part of strB, a negative value if it is less, and zero if the compared parts are equal. By default, the comparison is case-sensitive and culture-sensitive.

CompareOrdinal() public static int CompareOrdinal(string strA, string strB)

public static int CompareOrdinal(string strA, int indexA, string strB, int indexB, int count)

Does the same thing as the Compare() method, but without taking culture settings into account

CompareTo() public int CompareTo(object value)

Compares the calling string with the string representation of the value object. Returns a positive value if the calling string is greater than value; negative if the calling string is less than value; and zero if the compared strings are equal

public int CompareTo(string strB)

Compares the calling string with the string strB

Equals() public override bool Equals(object obj)

Returns the boolean true if the calling string contains the same sequence of characters as the string representation of obj. Performs case-sensitive but culturally insensitive ordinal comparison

public bool Equals(string value)

public bool Equals(string value, StringComparison comparisonType)

Returns the boolean value true if the calling string contains the same sequence of characters as the string value. An ordinal comparison is performed that is case sensitive but not culturally sensitive. The comparisonType parameter specifies the specific way strings are compared

public static bool Equals(string a, string b)

public static bool Equals(string a, string b, StringComparison comparisonType)

Returns the boolean value true if string a contains the same sequence of characters as string b . An ordinal comparison is performed that is case sensitive but not culturally sensitive. The comparisonType parameter specifies the specific way strings are compared

Concatenation (connection) of strings
Concat() public static string Concat(string str0, string str1);

public static string Concat(params string[] values);

Combines individual string instances into a single string (concatenation)
Search in a string
Contains() public bool Contains(string value) A method that allows you to determine whether a string contains a certain substring (value)
StartsWith() public bool StartsWith(string value)

public bool StartsWith(string value, StringComparison comparisonType)

Returns the boolean value true if the calling string begins with the substring value. Otherwise, the boolean value false is returned. The comparisonType parameter specifies the specific way to perform the search

EndsWith() public bool EndsWith(string value)

public bool EndsWith(string value, StringComparison comparisonType)

Returns the boolean value true if the calling string ends with the substring value. Otherwise, returns the boolean value false. The comparisonType parameter specifies the specific search method

IndexOf() public int IndexOf(char value)

public int IndexOf(string value)

Finds the first occurrence of a given substring or character in a string. If the searched character or substring is not found, then the value -1 is returned.

public int IndexOf(char value, int startIndex)

public int IndexOf(string value, int startIndex)

public int IndexOf(char value, int startIndex, int count)

public int IndexOf(string value, int startIndex, int count)

Returns the index of the first occurrence of the character or substring value in the calling string. The search begins at the element specified by startIndex and spans the number of elements specified by count (if specified). The method returns -1 if the searched character or substring is not found

LastIndexOf() The overloaded versions are similar to the IndexOf() method

Same as IndexOf, but finds the last occurrence of a character or substring, not the first

IndexOfAny() public int IndexOfAny(char[] anyOf)

public int IndexOfAny(char[] anyOf, int startIndex)

public int IndexOfAny(char[] anyOf, int startIndex, int count)

Returns the index of the first occurrence of any character from the anyOf array found in the calling string. The search starts at the element specified by startIndex and spans the number of elements specified by count (if specified). The method returns -1 if no character in the anyOf array is matched. The search is carried out in an ordinal manner

LastIndexOfAny The overloaded versions are similar to the IndexOfAny() method

Returns the index of the last occurrence of any character from the anyOf array found in the calling string

Splitting and joining strings
Split() public string[] Split(params char[] separator)

public string[] Split(char[] separator, int count)

A method that returns a string array containing the substrings of this instance that are delimited by elements of the specified char or string array.

The first form of the Split() method splits the calling string into its component parts. The result is an array containing the substrings obtained from the calling string. The characters delimiting these substrings are passed in the separator array. If the separator array is null or empty, white-space characters are used as the separators. The second form of this method returns at most the number of substrings specified by the count parameter.

public string[] Split(char[] separator, StringSplitOptions options)

public string[] Split(string[] separator, StringSplitOptions options)

public string[] Split(char[] separator, int count, StringSplitOptions options)

public string[] Split(string[] separator, int count, StringSplitOptions options)

In the first two forms of the Split() method, the calling string is split into parts and an array is returned containing the substrings obtained from the calling string. The characters or strings separating these substrings are passed in the separator array. If the separator array is null or empty, white-space characters are used as the separators. In the third and fourth forms of this method, the number of substrings returned is limited by the count parameter.

In all forms, the options parameter specifies how to handle the empty strings that are produced when two delimiters are adjacent. The StringSplitOptions enumeration described here defines two values: None and RemoveEmptyEntries. If options is None, empty strings are included in the result of splitting the original string. If options is set to RemoveEmptyEntries, empty strings are excluded from the result.

Join() public static string Join(string separator, string[] value)

public static string Join(string separator, string[] value, int startIndex, int count)

Constructs a new string by combining the contents of an array of strings.

The first form of the Join() method returns a string consisting of the concatenated strings passed in the value array. The second form also returns a string built from the strings in the value array, but it uses only count elements, starting with value[startIndex]. In both forms, each string is separated from the previous one by the separator string specified by the separator parameter.

Padding and trimming strings
Trim() public string Trim()

public string Trim(params char[] trimChars)

A method that removes all occurrences of a specific set of characters from the beginning and end of the current string.

The first form of the Trim() method removes leading and trailing white space from the calling string. The second form removes leading and trailing occurrences of the characters in the trimChars array from the calling string. Both forms return the resulting string.

PadLeft() public string PadLeft(int totalWidth)

public string PadLeft(int totalWidth, char paddingChar)

Allows you to pad a string with characters on the left.

The first form of the PadLeft() method introduces spaces on the left side of the calling string so that its total length becomes equal to the value of the totalWidth parameter. And in the second form of this method, the characters denoted by the paddingChar parameter are entered on the left side of the calling string so that its total length becomes equal to the value of the totalWidth parameter. Both forms return the resulting string. If the value of the totalWidth parameter is less than the length of the calling string, then a copy of the unchanged calling string is returned.

PadRight() Same as PadLeft()

Allows you to pad a string with characters on the right.

Inserting, deleting, and replacing strings
Insert() public string Insert(int startIndex, string value)

Used to insert one string into another, where value denotes the string to be inserted into the calling string at startIndex. The method returns the resulting string.

Remove() public string Remove(int startIndex)

public string Remove(int startIndex, int count)

Used to remove part of a string. In the first form of the Remove() method, removal begins at the position indicated by startIndex and continues to the end of the string. In the second form of this method, the number of characters given by the count parameter is removed from the string, starting from the position indicated by startIndex.

Replace() public string Replace(char oldChar, char newChar)

public string Replace(string oldValue, string newValue)

Used to replace part of a string. In the first form of the Replace() method, all occurrences of the character oldChar in the calling string are replaced with the character newChar. In the second form of this method, all occurrences of the string oldValue in the calling string are replaced with the string newValue.

Changing case
ToUpper() public string ToUpper()

Converts all letters in the calling string to uppercase.

ToLower() public string ToLower()

Converts all letters in the calling string to lowercase.

Getting a substring from a string
Substring() public string Substring(int startIndex)

public string Substring(int startIndex, int length)

In the first form of the Substring() method, the substring is retrieved starting from the location indicated by the startIndex parameter and ending at the end of the calling string. And in the second form of this method, a substring consisting of the number of characters determined by the length parameter is extracted, starting from the place indicated by the startIndex parameter.

The following example program uses several of the above methods:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Compare the first two strings
            string s1 = "this is a string";
            string s2 = "this is text, and this is a string";

            if (String.CompareOrdinal(s1, s2) != 0)
                Console.WriteLine("Strings s1 and s2 are not equal");

            // Compare 10 characters of s1 with the part of s2 that starts at index 18
            if (String.Compare(s1, 0, s2, 18, 10, true) == 0)
                Console.WriteLine("However, they contain the same text");

            // Concatenation of strings
            Console.WriteLine(String.Concat("\n" + "One, two ", "three, four"));

            // Search in a string
            // First occurrence of a substring
            if (s2.IndexOf("this") != -1)
                Console.WriteLine("The word \"this\" was found in the string, it " +
                    "is at position {0}", s2.IndexOf("this"));

            // Last occurrence of the substring
            if (s2.LastIndexOf("this") != -1)
                Console.WriteLine("The last occurrence of the word \"this\" is " +
                    "at position {0}", s2.LastIndexOf("this"));

            // Search for any character from an array
            char[] myCh = { 'ы', 'x', 't' };
            if (s2.IndexOfAny(myCh) != -1)
                Console.WriteLine("One of the characters from the array myCh " +
                    "was found in the current string at position {0}", s2.IndexOfAny(myCh));

            // Determine whether the string begins with the given substring
            if (s2.StartsWith("this is text") == true)
                Console.WriteLine("Substring found!");

            // Determine whether the string contains a substring,
            // using the example of determining the user's OS
            string myOS = Environment.OSVersion.ToString();
            if (myOS.Contains("NT 5.1"))
                Console.WriteLine("Your operating system is Windows XP");
            else if (myOS.Contains("NT 6.1"))
                Console.WriteLine("Your operating system is Windows 7");

            Console.ReadLine();
        }
    }
}

A little about string comparison in C#

Probably the most common of all character string operations is comparing one string to another. Before we look at any string comparison methods, it's worth emphasizing the following: String comparisons can be done in the .NET Framework in two main ways:

    First, the comparison may reflect the customs and norms of a particular cultural environment, which are often the culture settings in effect when the program runs. This is the standard behavior for some, but not all, comparison methods.

    Second, the comparison can be made independently of culture settings, using only the ordinal values of the characters that make up the string. Generally speaking, culture-sensitive comparisons use lexicographic order (and linguistic features) to determine whether one string is greater than, less than, or equal to another string. In an ordinal comparison, the strings are simply ordered by the unmodified value of each character.

Because culture-sensitive comparisons and ordinal comparisons differ, and each has its own consequences, we strongly recommend that you follow the best practices currently offered by Microsoft. After all, choosing the wrong method for comparing strings can lead to incorrect operation of the program when it runs in an environment different from the one in which it was developed.

Choosing how to compare character strings is a very important decision. As a general rule, though not without exceptions, you should compare strings in a culture-sensitive manner if the result is to be displayed to the user (for example, to display a series of strings sorted in lexicographic order). But if the strings contain fixed information that is not intended to be modified to accommodate cultural differences, such as a file name, keyword, website address, or security-related value, you should select ordinal string comparison. Of course, the characteristics of the particular application being developed will dictate the choice of an appropriate method for comparing character strings.

The String class provides a wide variety of string comparison methods, which are listed in the table above. The most universal among them is the Compare() method. It allows two strings to be compared in whole or in part, case-sensitively or case-insensitively, in a manner specified by a parameter of type StringComparison, and using culture information supplied by a parameter of type CultureInfo.

Those overloads of the Compare() method that do not include a parameter of type StringComparison perform a case- and culture-sensitive comparison of character strings. And in those overloaded variants that do not contain a CultureInfo type parameter, information about the cultural environment is determined by the current runtime environment.

The StringComparison type is an enumeration that defines the values shown in the table below. Using these values, you can create string comparisons that suit the needs of your specific application. Therefore, adding a parameter of type StringComparison extends the capabilities of the Compare() method and other comparison methods such as Equals(). This also makes it possible to unambiguously indicate how strings are intended to be compared.

Because of the differences between culturally sensitive string comparisons and ordinal comparisons, it is important to be as precise as possible in this regard.

Values defined in the StringComparison enumeration
Value Description
CurrentCulture String comparisons are made using the current culture settings
CurrentCultureIgnoreCase String comparisons are made using the current culture settings, but are not case sensitive
InvariantCulture String comparisons are made using the invariant culture, i.e. universal, culture-independent settings
InvariantCultureIgnoreCase String comparisons are made using the invariant culture, i.e. universal, culture-independent settings, and are not case sensitive
Ordinal String comparisons are made using the ordinal values of the characters in the string. In this case, the lexicographic order may be disrupted, and the conventions of a particular culture are ignored
OrdinalIgnoreCase String comparisons are made using the ordinal values of the characters in the string, but are not case sensitive

In any case, the Compare() method returns a negative value if the first string compared is less than the second; positive if the first string compared is greater than the second; and finally, zero if both strings being compared are equal. Although the Compare() method returns zero if the strings being compared are equal, it is generally better to use the Equals() method or the == operator to determine whether character strings are equal.

The reason is that the Compare() method determines equality from the sort order of the strings. In a culture-sensitive comparison, two strings may therefore be equal in sort order without being identical in content. The Equals() method, by default, determines equality from the ordinal values of the characters, without regard to culture. So by default it tests for exact, character-by-character equality, just as the == operator does.

Despite the great versatility of the Compare() method, for simple ordinal comparisons of character strings it is easier to use the CompareOrdinal() method. Finally, keep in mind that the CompareTo() method only performs culturally sensitive string comparisons.

The following program demonstrates the use of the Compare(), Equals(), CompareOrdinal() methods, and the == and != operators to compare character strings. Note that the first two comparison examples clearly demonstrate the differences between culturally sensitive string comparisons and ordinal comparisons in an English-speaking environment:

using System;

class Example
{
    static void Main()
    {
        string str1 = "alpha";
        string str2 = "Alpha";
        string str3 = "Beta";
        string str4 = "alpha";
        string str5 = "alpha, beta";
        int result;

        // First, demonstrate the difference between a culture-sensitive
        // comparison and an ordinal comparison
        result = String.Compare(str1, str2, StringComparison.CurrentCulture);
        Console.Write("Culture-sensitive string comparison: ");
        if (result < 0)
            Console.WriteLine(str1 + " less than " + str2);
        else if (result > 0)
            Console.WriteLine(str1 + " greater than " + str2);
        else
            Console.WriteLine(str1 + " equal to " + str2);

        result = String.Compare(str1, str2, StringComparison.Ordinal);
        Console.Write("Ordinal string comparison: ");
        if (result < 0)
            Console.WriteLine(str1 + " less than " + str2);
        else if (result > 0)
            Console.WriteLine(str1 + " greater than " + str2);
        else
            Console.WriteLine(str1 + " equal to " + str2);

        // Use the CompareOrdinal() method
        result = String.CompareOrdinal(str1, str2);
        Console.Write("Comparing strings using the CompareOrdinal() method:\n");
        if (result < 0)
            Console.WriteLine(str1 + " less than " + str2);
        else if (result > 0)
            Console.WriteLine(str1 + " greater than " + str2);
        else
            Console.WriteLine(str1 + " equal to " + str2);
        Console.WriteLine();

        // Determine string equality using the == operator.
        // This is an ordinal comparison of character strings
        if (str1 == str4)
            Console.WriteLine(str1 + " == " + str4);

        // Determine string inequality using the != operator
        if (str1 != str3)
            Console.WriteLine(str1 + " != " + str3);
        if (str1 != str2)
            Console.WriteLine(str1 + " != " + str2);
        Console.WriteLine();

        // Perform a case-insensitive ordinal comparison of strings
        // using the Equals() method
        if (String.Equals(str1, str2, StringComparison.OrdinalIgnoreCase))
            Console.WriteLine("Comparison of strings using the Equals() method with the " +
                              "OrdinalIgnoreCase parameter:\n" + str1 + " equals " + str2);
        Console.WriteLine();

        // Compare parts of strings
        if (String.Compare(str2, 0, str5, 0, 3, StringComparison.CurrentCulture) > 0)
        {
            Console.WriteLine("Comparing strings using the current culture:" +
                              "\nthe first 3 characters of " + str2 +
                              " are greater than the first 3 characters of " + str5);
        }
    }
}

Running this program produces the following output:

The modern C++ standard defines a class with functions and properties (variables) for working with strings (classical C has no strings as such, only arrays of char characters):

#include <string>

To work with strings, you also need to bring in the standard namespace:

using namespace std;

Otherwise, you will have to write the qualified name std::string everywhere instead of string.

Below is an example of a program working with string (does not work in older C-compatible compilers!):

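#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "Test";
    s.insert(1, "!");              // insert "!" before index 1: "T!est"
    cout << s.c_str() << endl;

    string *s2 = new string("Hello");
    s2->erase(s2->end() - 1);      // erase the last character; erasing end() itself is undefined behavior
    cout << s2->c_str();
    delete s2;                     // release the dynamically allocated string

    cin.get();                     // wait for Enter before exiting
    return 0;
}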

The main features that the string class has:

  • initialization with an array of characters (a built-in string) or with another object of type string. A built-in string lacks the second capability;
  • copying one string to another. For a built-in string you have to use the strcpy() function;
  • access to individual characters of a string for reading and writing. In a built-in array, this is done with the index operation or indirect addressing through a pointer;
  • comparing two strings for equality. For a built-in string, the functions of the strcmp() family are used;
  • concatenation (joining) of two strings, producing the result either as a third string or in place of one of the originals. For a built-in string the strcat() function is used, but to get the result in a new string you need to call strcpy() and strcat() in sequence and also take care of memory allocation;
  • built-in means of determining the length of a string (the member functions size() and length()). The only way to find the length of a built-in string is to compute it with the strlen() function;
  • the ability to find out whether a string is empty.
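The difference in effort described in the list can be sketched for concatenation into a new string. The function names join_c and join_cpp are invented for this illustration and are not part of any library:

```cpp
#include <cstring>
#include <string>

// Built-in strings: allocate room by hand, then call strcpy() and strcat().
// The caller must release the result with delete[].
char *join_c(const char *a, const char *b) {
    char *result = new char[std::strlen(a) + std::strlen(b) + 1]; // +1 for '\0'
    std::strcpy(result, a);
    std::strcat(result, b);
    return result;
}

// string class: operator+ handles allocation and copying.
std::string join_cpp(const std::string &a, const std::string &b) {
    return a + b;
}
```

With the built-in version, forgetting the extra byte for the terminator or the final delete[] is an easy mistake; the string version has no such bookkeeping.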

Let's look at these basic features in more detail.

Initializing a string at declaration and getting its length (not counting the terminating null character):

string st("My string\n");
cout << "Length of " << st << ": " << st.size() << " characters, including the newline character\n";

The string can also be empty:

string st2;

To check whether a string is empty, you can compare its length with 0:

if (!st.size()) // empty

or use the empty() method, which returns true for an empty string and false for a non-empty one:

if (st.empty()) // empty

The third form of string creation initializes an object of type string with another object of the same type:

string st3(st);

The string st3 is initialized with the string st. How can we make sure the two strings match? Let's use the comparison operator (==):

if (st == st3) // initialization worked

How do we copy one string to another? With the ordinary assignment operator:

st2 = st3; // copy st3 to st2
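The three operations just shown (copy construction, comparison, assignment) can be combined into one small check. The helper name copies_match is hypothetical; it only illustrates that a copy holds its own characters and does not change when another copy is modified:

```cpp
#include <string>

// Returns true if copy construction and assignment both yield strings
// equal to the original, and if the copies are independent of each other.
bool copies_match(const std::string &st) {
    std::string st2;                 // empty string
    std::string st3(st);             // initialized from another string
    bool started_empty = st2.empty();
    st2 = st3;                       // assignment copies the characters
    st3 += "!";                      // changing st3 must not affect st2
    return started_empty && st2 == st && st3 != st;
}
```

For example, copies_match("My string\n") should return true.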

For string concatenation, the addition operator (+) or the compound assignment operator (+=) is used. Suppose we have two strings:

string s1("hello, ");
string s2("world\n");

We can get a third string that is the concatenation of the first two like this:

string s3 = s1 + s2;

If we want to append s2 to the end of s1, we write:

s1 += s2;
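The practical difference between the two operators is which object changes. A minimal sketch, with the function names joined and append_to made up for the illustration:

```cpp
#include <string>

// operator+ builds a new string and leaves both operands untouched.
std::string joined(const std::string &s1, const std::string &s2) {
    return s1 + s2;
}

// operator+= appends to the left operand in place.
void append_to(std::string &s1, const std::string &s2) {
    s1 += s2;
}
```

After joined(s1, s2) the original s1 still holds "hello, ", while after append_to(s1, s2) it holds the whole concatenated text.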

The addition operation can concatenate objects of the string class not only with each other, but also with built-in strings. The example above can be rewritten so that special characters and punctuation are represented by the built-in char * type, and the significant words by objects of the string class:

const char *pc = ", ";
string s1("hello");
string s2("world");
string s3 = s1 + pc + s2 + "\n";
cout << endl << s3;

Such expressions work because the compiler "knows" how to automatically convert objects of the built-in type to objects of the string class. It is also possible to simply assign a built-in string to a string object:

string s1;
const char *pc = "a character array";
s1 = pc; // OK

The inverse transformation in this case does not work. Attempting to perform the following built-in type string initialization will cause a compilation error:

char *str = s1; // compilation error

To perform this conversion, you must explicitly call the member function c_str() ("C string"):

const char *str = s1.c_str();

The c_str() function returns a pointer to a character array holding the contents of the string object in the form of a built-in string. The const keyword here prevents the "dangerous" possibility of modifying the object's contents directly through the pointer.
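A typical use of c_str() is passing a string object to a function that expects a built-in string. The sketch below also shows one way to obtain a modifiable copy when the const pointer is not enough; the helper names c_length and to_buffer are assumptions made for this example:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Length as seen by the C library, through the pointer from c_str().
std::size_t c_length(const std::string &s) {
    return std::strlen(s.c_str());
}

// A modifiable copy of the characters, including the terminating '\0'.
// Writing to this buffer does not touch the original string object.
std::vector<char> to_buffer(const std::string &s) {
    return std::vector<char>(s.c_str(), s.c_str() + s.size() + 1);
}
```

The pointer returned by c_str() should be used before the string is modified or destroyed, since either can invalidate it.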

Individual characters of a string object, like those of a built-in string, can be accessed with the index operation. For example, here is a piece of code that replaces all periods with underscores:

string str("www.disney.com");
int size = str.size();
for (int i = 0; i < size; i++)
    if (str[i] == '.')
        str[i] = '_';
cout << str;

The same replacement can be done with a single call:

replace(str.begin(), str.end(), '.', '_');

True, it is not the replace method of the string class that is used here, but the algorithm of the same name:

#include <algorithm>

Because the string object behaves like a container, other algorithms can be applied to it. This allows you to solve problems that are not directly solved by the functions of the string class.
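As a rough illustration of applying other standard algorithms to a string, here are three small wrappers around std::replace, std::count and std::reverse from the algorithm header. The wrapper names are invented for the example:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Replace every '.' with '_' using the generic replace algorithm.
std::string dots_to_underscores(std::string s) {
    std::replace(s.begin(), s.end(), '.', '_');
    return s;
}

// Count how many times character c occurs in s.
std::size_t count_char(const std::string &s, char c) {
    return static_cast<std::size_t>(std::count(s.begin(), s.end(), c));
}

// Return the characters of s in reverse order.
std::string reversed(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}
```

Each wrapper takes its argument by value where it needs to modify it, so the caller's string is left unchanged.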

Below is a brief description of the main operators and functions of the string class. A more complete list of the capabilities of the string class can be found, for example, on Wikipedia or on the website cplusplus.com.

Specifying characters in a string

  • operator=: assigns values to a string
  • assign: assigns characters to a string

Access to individual characters

  • at: returns the specified character, checking whether the index is out of bounds
  • operator[]: returns the specified character
  • front: returns the first character
  • back: returns the last character
  • data: returns a pointer to the first character of the string
  • c_str: returns a non-modifiable C character array containing the characters of the string

Checking string capacity

  • empty: checks whether a string is empty
  • size, length: return the number of characters in a string
  • max_size: returns the maximum number of characters
  • reserve: reserves storage space

String operations

  • clear: clears the contents of a string
  • insert: inserts characters
  • erase: deletes characters
  • push_back: adds a character to the end of a string
  • pop_back: removes the last character
  • append, operator+=: append characters to the end of a string
  • compare: compares two strings
  • replace: replaces the specified portion of a string
  • substr: returns a substring
  • copy: copies characters
  • resize: changes the number of characters stored
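A few of the member functions listed above can be tried together in a short sketch. The function names safe_char and edit_demo are hypothetical, and pop_back() assumes a compiler with C++11 support:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// at() checks the index and throws std::out_of_range when it is invalid;
// here the exception is turned into a fallback character.
char safe_char(const std::string &s, std::size_t i, char fallback) {
    try {
        return s.at(i);
    } catch (const std::out_of_range &) {
        return fallback;
    }
}

// Applies several editing operations in sequence.
std::string edit_demo() {
    std::string s = "Test";
    s.insert(1, "!");       // "T!est"
    s.erase(1, 1);          // "Test"
    s.push_back('s');       // "Tests"
    s.pop_back();           // "Test"
    s.replace(0, 1, "B");   // "Best"
    s.append("!");          // "Best!"
    return s.substr(0, 4);  // "Best"
}
```

Unlike at(), operator[] performs no bounds check, so an invalid index there is simply an error in the program.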