
can anyone explain const char*, char, strings, char array

This topic is 814 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.


I'm sorely confused.

 

My compiler keeps telling me this function's argument only accepts "const char*". I gave it a char, a string, a char array, and a function that returns a char. I really don't know what I'm dealing with. Can anyone explain what I have to give this function, and compare and contrast these things?

Edited by LAURENT*


char is a single character, like 'a', 'B' or '1'. Internally it can take any of 256 values on most platforms, where it is 8 bits wide; some exotic platforms use a different bit count.

 

char[10] is an array of 10 characters stored sequentially in memory.

 

char* is a pointer which holds the address in memory of a character, for example it can point at a single 'char' variable, or it can point at the address where a sequential array begins.

const char* is such a pointer to const characters: any code holding it, including a function that accepts it, promises not to change the values stored in memory at that address (the pointer itself can still be pointed elsewhere). It usually means read-only access to a string.

 

std::string is a class that internally holds a memory-buffer of many characters, and has methods to manipulate them.

 

 

For string functions that take const char*, it is usually appropriate to pass them string.c_str(). The c_str() method returns a pointer to an address in memory where the string's characters are stored sequentially, and it guarantees that the sequence ends with a null character, which means that a 'char' with value 0 is stored at the end of the sequence. Many string functions use such a null terminator to determine where a string ends.
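To make the original question concrete, here is a small sketch of what can and cannot be handed to a function taking const char*. The function countChars is a made-up stand-in for whatever function the compiler was complaining about, so treat the names as assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical function standing in for the one that only accepts const char*.
std::size_t countChars(const char* text)
{
    return std::strlen(text); // walks memory until it reaches the '\0' terminator
}

std::size_t demoConstCharArguments()
{
    const char* literal = "hello"; // a string literal converts to const char* directly
    char buffer[] = "hello";       // a char array decays to char*, which converts to const char*
    std::string str = "hello";     // a std::string needs c_str()
    char single = 'h';             // a lone char is not a string...
    std::string one(1, single);    // ...so wrap it in a one-character string first

    assert(countChars(literal) == 5);
    assert(countChars(buffer) == 5);
    assert(countChars(str.c_str()) == 5);
    return countChars(one.c_str());
}
```

Passing the lone char itself (or its address) would either not compile or hand the function memory with no terminator, which is why it is wrapped in a string here.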


As a side note.

const is a good subject to explore, and I believe you should get into the habit of making your code const-aware. Part of this is const functions in a class.

In the following, the function CalcSomeValue is tagged const, which means that it does not change the state of MyClass; it is a read-only function. Been a while since I worked in C++ but I believe this is compiler checked.

The effect of this is that if you have a const MyClass*, you are only able to call const functions on the object, as you have a const object.

class MyClass
{
public:
	int CalcSomeValue() const
	{
		// const member function: may read m_value but not modify it
		return m_value * 2;
	}

private:
	int m_value = 21;
};

Const in C++, unlike in C# and some other languages, is a subtle keyword that is worth learning. It carries lots of semantic implications in function contracts that you need to understand as you pull in 3rd party libraries etc., as you will encounter it a lot :)
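A quick sketch of the "only const functions through a const pointer" rule. The Counter class and readThroughConstPointer are names invented for illustration and are not from the post above.

```cpp
#include <cassert>

class Counter
{
public:
    int value() const { return count_; } // read-only: callable on const objects
    void increment() { ++count_; }       // modifies state: not callable on const objects

private:
    int count_ = 0;
};

int readThroughConstPointer(const Counter* c)
{
    // c->increment(); // would not compile: increment() is not a const member function
    return c->value();
}

void demoConstMember()
{
    Counter c;
    c.increment();                            // fine: c itself is not const
    assert(readThroughConstPointer(&c) == 1); // the callee only gets a read-only view
}
```

Uncommenting the increment() call should give a compile error, which is the compiler check mentioned above.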



For the other 1%, where the function is expecting char* and forgot to const-qualify it, you'll have to vomit up a const_cast.

 

I only ever had to use const_cast, when I used some badly designed or outdated libraries (looking at you, FreeImage!). I think replacing the library, or if it is your own code, fixing it, is the right answer. Beginners should not even be told about const_cast in my opinion, it'll just cause trouble somewhere down the road.

If I'm forced to pass a char* due to a badly designed library, I prefer to do this rather than fall back on a const_cast, unless there is a good reason to avoid the additional overhead:

#include <cstring>
#include <vector>

void badMethod(char *s);

void f(const char *s)
{
    std::vector<char> v(s, s + std::strlen(s) + 1); // +1 so the null terminator is copied too
    badMethod(v.data());
}

Using const_cast and then writing through the result is undefined behaviour if the entity was originally declared as const, since the compiler may have decided to locate it in read-only memory.

Edited by Aardvajk
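For contrast, a minimal sketch of the const_cast route, assuming the legacy function only reads its argument. legacyLength and callLegacy are invented names; the point is only that the cast compiles and is harmless as long as nothing writes through the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical legacy function: takes char* but, by assumption, only reads from it.
std::size_t legacyLength(char* s)
{
    return std::strlen(s);
}

std::size_t callLegacy(const char* s)
{
    // The cast itself is allowed. Writing through the resulting pointer would be
    // undefined behaviour if the pointed-to object was originally declared const.
    return legacyLength(const_cast<char*>(s));
}

void demoConstCast()
{
    char writable[] = "abc";
    assert(callLegacy(writable) == 3);
}
```

If there is any chance the callee writes to the buffer, the copying approach above is the safer choice.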


I only ever had to use const_cast, when I used some badly designed or outdated libraries (looking at you, FreeImage!).


That's what I'm talking about: the 1% is badly designed/outdated libraries. Unfortunately, I think one of the improperly non-const char* functions is inherited from the C standard library, though I can't remember what function.

I think replacing the library, or if it is your own code, fixing it, is the right answer. Beginners should not even be told about const_cast in my opinion, it'll just cause trouble somewhere down the road.

Absolutely!


Wouldn't this work?

void badMethod(char *s);

void f(const char *s) // but the parameter would probably have to drop const from its declaration for this to be valid?
{
    char* memofstr=s; 
    badMethod(memofstr);
}

I understand only that const at definition makes the compiler warn or error out if the variable appears on the left-hand side of an assignment, or that it promises not to alter something declared as const (a public parameter, for example).


Nope. You can't point a non-const pointer at the address of a const variable or const pointer. That defeats the purpose of const.
The reverse does work though: You can point const at non-const.

You can do this:

const int meow = 357;
int blah = meow;

Because that's copying the value, which still guarantees that 'meow' itself isn't getting modified - it's copying the read-only const value into a different chunk of memory that's not read-only.

But this doesn't work:

const int meow = 357;
int *blah = &meow; //Nope!

Neither does this:

int meow = 357;
const int *meowPtr = &meow; //Fine.
int *blah = meowPtr; //Not fine.

Note, this is perfectly okay:

int meow = 357;
const int *meowPtr = &meow;
// *meowPtr == 357
meow = 12345;
// *meowPtr == 12345

Which may confuse beginners - just because 'meowPtr' is a pointer to const, that doesn't mean the value it points to can't change.

It just means it can't be changed through the const pointer.
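The last snippet, wrapped into something that compiles and checks itself. The function name is made up for the example.

```cpp
#include <cassert>

int observeThroughConstPointer()
{
    int meow = 357;
    const int* meowPtr = &meow; // a read-only view of a non-const int
    assert(*meowPtr == 357);

    meow = 12345;               // allowed: meow itself is not const
    // *meowPtr = 1;            // would not compile: cannot write through a pointer-to-const

    return *meowPtr;            // the view sees the new value
}
```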

 

Copying the data is what Aardvajk's code is doing:

void badMethod(char *s);

void f(const char *s)
{
    std::vector<char> v(s, s + strlen(s) + 1); // +1 to ensure the null terminator is added
    badMethod(v.data());
}

 

It's conceptually the same as:

void f(const char *constStr)
{
    char* memofstr = new char[strlen(constStr) + 1]; // +1 for the null terminator
    strcpy(memofstr, constStr);
    badMethod(memofstr);
    delete[] memofstr;
}

It's safer and easier to just do:

void f(const char *constStr)
{
    std::string myStr(constStr); //Copies to non-const memory.
    badMethod(&myStr[0]); //or &myStr.front()
}

This works because the [] subscript operator returns a non-const reference, std::string is guaranteed to use contiguous memory, and it is guaranteed to be null-terminated even when using [], provided you are using a C++11 compliant compiler. This still assumes that 'badMethod()' doesn't actually modify the data and has just improperly forgotten to be const-qualified. Though even if it did modify the data, this would still be fine, I think. No const violations would occur.
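Here is a self-checking sketch of the std::string copy approach. In this version badMethod is a made-up stand-in that really does modify its argument (it upper-cases it), to show that the caller's const data stays untouched. It assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for badMethod(): takes char* and modifies the buffer in place.
void badMethod(char* s)
{
    for (; *s != '\0'; ++s)
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
}

std::string callBadMethodOnCopy(const char* constStr)
{
    std::string copy(constStr); // copies the characters into memory the string owns
    if (!copy.empty())
        badMethod(&copy[0]);    // relies on contiguous, terminated storage (C++11 onward)
    return copy;
}

void demoCopyApproach()
{
    const char* original = "abc";
    assert(callBadMethodOnCopy(original) == "ABC");
    assert(std::string(original) == "abc"); // the const source was never written to
}
```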

Edited by Servant of the Lord




Nope. You can't point a non-const pointer at the address of a const variable or const pointer. That defeats the purpose of const.
The reverse does work though: You can point const at non-const.

I do not know, though, whether this applies to the entire scope in which the const pointer is assigned, but perhaps it does, and all nested scopes will be unable to write through it. In the end, const is an interesting compiler feature, at least for making run-time behaviour easier to understand and read.

In other words, it is always good to declare your parameters as const if they do not change...

Or not, and just rudely use const libraries in your code, and use your code only in const-unaware code.

Servant - does the standard guarantee that std::string's data is a contiguous, null-terminated array like that? I was (perhaps mistakenly) under the impression it was not, hence the use of vector, which does make such a guarantee, I believe.


I believe since C++11, it is. (Following from draft standard - n3376)

 


21.4.1 basic_string general requirements

5 The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

 

I don't believe it was pre C++11.

Edited by Rattrap


Internal storage doesn't actually matter, the pointer returned by c_str() has that guarantee.

Obviously, a fast implementation would already have such a contiguous array.


While it is guaranteed to be contiguous, it is not guaranteed to be null-terminated.

Also, the .data() method returns the buffer, but it was not guaranteed to be null-terminated until C++11. Since C++11, the .data() function and the .c_str() function behave the same.

If you access the characters through the pre-C++11 .data() function, through the address of the first element, or through any method other than .c_str() or the post-C++11 .data(), then you don't have the guarantee that the string is terminated.

It probably is terminated, and you can check your implementation to see if it happens to be, but it isn't guaranteed everywhere.


 
As Rattrap mentions, only C++11 onward. Contiguous memory is explicitly guaranteed by C++11.

 

Null-terminated isn't quite the guarantee I thought, but still strong enough for me. You have to take several different guarantees together to come to that conclusion.

 

21.4.5 - N3376
const_reference operator[](size_type pos) const;
reference operator[](size_type pos);

  1. Requires: pos <= size().
  2. Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.
  3. Throws: Nothing.
  4. Complexity: constant time

 

operator[] requires that the index is within range, which includes str[str.size()]. If the index is greater than size(), the behaviour is undefined. But if it is exactly equal to size(), it returns a reference to a null character.

 

So it guarantees that:

str[str.size()] == '\0'

Technically, operator[] as a function could just have an if(index == str.size()) return '\0' (but as a reference), and that would satisfy the requirement, making this not guaranteed:

char *cStr = str[0];
cStr[str.size()] == '\0' //Not explicitly guaranteed

However, c_str() still has to return a null-terminated string:

 

const charT* c_str() const noexcept;

const charT* data() const noexcept;

  1. Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()]

 

And c_str() isn't allowed to invalidate iterators, so a call to c_str() can't change the results of future calls to operator[].

 

So if a standard library implementation wanted to make this be invalid:

char *cStr = str[0];
cStr[str.size()] == '\0'

...the implementation would have to bend over backwards to intentionally mess that up, by having two entirely separate but equal strings that it keeps in sync - one null-terminated and one not. I'm not worried about an intentionally malicious implementation.

 

I'd consider this a soft guarantee - not explicit, but given the other hard guarantees, the only sane way to implement it.

I just checked the GCC/MinGW standard library's code, to be sure, and it's null-terminated there as well. 

 

To sum up:

.c_str() or .data() == null-terminated guaranteed
str[str.size()] == '\0', guaranteed

char *cStr = str[0];
cStr[str.size()] == '\0' - Not explicitly guaranteed, but it is the most straightforward way to implement it; an implementation would have to bend over backwards to do anything else.

That's good enough for my code, but your own project may have a higher standard than mine.
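If you want to see what your own implementation does, something like the following can be dropped into a test. It checks the guarantees summed up above through the const interface, assuming a C++11 or later standard library. checkStringGuarantees is an invented helper name.

```cpp
#include <cassert>
#include <string>

bool checkStringGuarantees(const std::string& str)
{
    const char* p = str.c_str();

    bool sameBuffer  = (p == str.data());         // since C++11, data() and c_str() agree
    bool terminated  = (p[str.size()] == '\0');   // c_str() is null-terminated
    bool indexIsNull = (str[str.size()] == '\0'); // operator[] at size() yields charT()
    bool contiguous  = str.empty() ||
                       (&str[0] + (str.size() - 1) == &str[str.size() - 1]);

    return sameBuffer && terminated && indexIsNull && contiguous;
}

void demoGuarantees()
{
    assert(checkStringGuarantees("hello"));
    assert(checkStringGuarantees("")); // the empty string is still terminated
}
```

A passing check on one compiler is an observation about that implementation, not a proof about the standard, so it does not settle the "soft guarantee" question by itself.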

Edited by Servant of the Lord


They could have the space allocated but not place the terminator unless you call either of those functions.  Basically implementing it as:

 

char* c_str() {
  m_buffer[m_length]=0;   // Not normally stored; store it so the user can have a null-terminated string. Possible implementation.
  return m_buffer;
}

 

Termination of strings is something left up to the implementation libraries.  It almost certainly is implemented with normally storing the null terminator, and it probably is implemented that way everywhere, but it is not guaranteed to be that way on every obscure compiler or every obscure system. An implementation has the option to not have a null terminator except in those cases.

 

(Also tiny nitpick in case someone wants to steal the code, it needs to be the address, so char *cStr=&str[0]; rather than char *cStr=str[0]; so you are capturing the address of the first character rather than the value of the first item.) 

One thing that occasionally comes up is that std::string supports embedded null characters, because it stores its length separately. This usually won't trip people up, as almost every string you'll use in C and C++ will not contain embedded nulls (because null-termination is required by several standard C library functions), but it can crop up in Unicode situations, or in certain specific cases where embedded nulls are used in other ways, like separating multiple strings passed in the same char array (usually terminated by two null characters in sequence).

In general I would suggest using std::string when you actually want a string, and avoid using char*/const char*/etc unless you're writing a C API or interfacing with a C API. Not only is std::string easier to work with, it is usually faster for certain tasks (no searching for the terminator) and more explicitly states what you mean in your interface. (Rather than relying on convention, which the compiler cannot enforce)
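A small sketch of the embedded-null point: the string knows its real length, while C functions stop at the first '\0'. The helper name is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

std::string makeEmbeddedNullString()
{
    // The (pointer, length) constructor keeps the embedded '\0' instead of stopping at it.
    return std::string("ab\0cd", 5);
}

void demoEmbeddedNull()
{
    std::string s = makeEmbeddedNullString();
    assert(s.size() == 5);               // std::string tracks its own length
    assert(std::strlen(s.c_str()) == 2); // C functions stop at the first '\0'
}
```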


Any advanced string type, std::string as well, will inform you of the character count when you call a .size()-like property (Advanced::String("Čau bruško!").size()==11).

If you would like to know the actual size of the char[] data that the string has allocated, owns and operates on, you can call sizeof(), but the result can be almost anything, since the storage tends to be a resizable buffer of (Unicode or other) data bytes. Still, the class tends to write a "\0" character at the end of the explicit data bytes to mark where the intended characters end, so my best guess is that every advanced string class allocates space for the "\0" byte right at creation, writes it there, and counts on it being used. Although this is not a compiler standard, since the string type is an advanced type, the trailing "\0" character is allocated and interpreted all the time. So if you look at the raw char* memory of the class, it really will have a "\0" value written somewhere, but there is no guarantee that it is at the end of the allocated memory, or anything like that :)

But generally you can count on a "\0" value being written somewhere in the allocated memory, for interpretation.


Any advanced string type, std::string as well, will inform you of the character count when calling .size()

This is not true if the string is UTF-encoded. UTF-8 and UTF-16 can encode a single character in multiple code units (bytes, for UTF-8). If you store UTF-8 in a std::string, size() is not guaranteed to correspond to the number of characters in the string; it does, however, tell you the number of bytes needed for the buffer. The same goes for storing a UTF-16 string in a std::wstring.


How does it contradict my quote? It is what I said, that size yields amount of characters...


That is just what I am saying: it does not do that for a UTF-8 or UTF-16 encoded string. size() actually gives you the number of elements stored in the internal buffer, not the number of characters in the string. UTF-8 can encode a single character as 3 chars in the string buffer; this happens when you start using characters outside the Latin character set. Japanese and Chinese characters, for example, need more bytes to encode.
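To illustrate the difference, here is a rough sketch that counts UTF-8 code points by skipping continuation bytes. It assumes the input is valid UTF-8, and countCodePoints is an invented helper, not a standard function. Note that code points are still not always the same thing as the characters a user sees.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes the input is valid UTF-8.
std::size_t countCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
        if ((byte & 0xC0) != 0x80)
            ++count;
    return count;
}

void demoUtf8Size()
{
    // U+010C (a capital C with caron) written as its two UTF-8 bytes, followed by "au".
    // The literal is split so the hex escape does not swallow the 'a'.
    std::string s = "\xC4\x8C" "au";
    assert(s.size() == 4);           // bytes stored in the buffer
    assert(countCodePoints(s) == 3); // code points encoded by those bytes
}
```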

Edited by NightCreature83

sizeof on an array also requires care: you need to understand when the array has decayed to a simple pointer, e.g. by being passed as such to a function. Then sizeof will give you the size of the pointer, not the buffer.
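A short sketch of the decay problem, with made-up function names:

```cpp
#include <cassert>
#include <cstddef>

std::size_t sizeSeenByCallee(const char* s)
{
    return sizeof(s); // size of the pointer itself, not of the caller's buffer
}

void demoSizeofDecay()
{
    char buffer[100] = "hi";
    assert(sizeof(buffer) == 100);                     // still an array in this scope
    assert(sizeSeenByCallee(buffer) == sizeof(char*)); // decayed to a pointer in the call
}
```

This is why C-style APIs usually take the buffer length as a separate parameter.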
