Question about strings

Started by
7 comments, last by Trap 18 years ago
Hi. I am trying to write a C++ DLL which is going to be called from a different language, one that I don't know all that much about. In this other language, the format for strings seems to be: a four byte length field, followed by the null-terminated ascii string itself. But the actual pointers to the strings point to the null-terminated string itself - the length field is in the 4 bytes preceding this. Question - What is this type of string called in c++?
This doesn't sound like any standard C++ type.

What is the other language?
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
Quote:Original post by DaBookshah
Question - What is this type of string called in c++?
It's not called anything, it doesn't exist, as Fruny said.

The other language sounds like Pascal to me, that's the only language I know of that prefixes the string with the length (And this was about 10 years ago...)
I've been known to call them Pascal strings, but I don't think you can have that specific representation in C++. You'll have to fudge it with a structure and some pointer arithmetic. The best I can come up with is to allocate your c-string with an extra four bytes, and then offset your pointer by four.

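#include <cstdint>
#include <cstring>
#include <iostream>

// Reads the 4-byte length stored immediately before the characters.
void SomeFunctionThatWantsAPascalString(const char* string)
{
    std::int32_t length;
    std::memcpy(&length, string - 4, sizeof length);
    std::cout << length << " \"" << string << "\"" << std::endl;
}

int main()
{
    const char* CString = "hello world";
    // 4 bytes for the length prefix, plus 1 for the null terminator.
    char* PascalString = new char[std::strlen(CString) + 4 + 1];
    // is the string really still null terminated, by the way?
    std::strcpy(&PascalString[4], CString);
    // The stored length counts the terminator here; drop the +1 if the caller expects otherwise.
    std::int32_t length = static_cast<std::int32_t>(std::strlen(CString)) + 1;
    std::memcpy(PascalString, &length, sizeof length);
    SomeFunctionThatWantsAPascalString(&PascalString[4]);
    delete[] PascalString;
    return 0;
}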

Naturally, you'd do well to wrap all the nasties up in some functions. Possibly even just use std::string with a CreatePascalString function that is called just before passing the memory to your library.
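
For what it's worth, here is a rough sketch of what such a CreatePascalString function might look like when starting from a std::string. The return type and the PascalChars helper are made up for illustration. Unlike the snippet above, it assumes the stored length excludes the terminator and is in host byte order; the thread doesn't say which convention the other language actually expects, so check that first.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: builds [4-byte length][characters]['\0'] in one buffer.
// Assumption: the length excludes the terminator and uses host byte order.
std::vector<char> CreatePascalString(const std::string& text)
{
    const std::int32_t length = static_cast<std::int32_t>(text.size());
    // Value-initialised to '\0', so the terminator is already in place.
    std::vector<char> buffer(sizeof length + text.size() + 1, '\0');
    std::memcpy(buffer.data(), &length, sizeof length);
    std::memcpy(buffer.data() + sizeof length, text.data(), text.size());
    return buffer;
}

// Hypothetical helper: the pointer the other language expects,
// which points just past the length field.
const char* PascalChars(const std::vector<char>& buffer)
{
    return buffer.data() + sizeof(std::int32_t);
}
```

You would keep the returned vector alive for as long as the other side holds the pointer, and hand PascalChars(buffer) across the DLL boundary. Using a vector means there is no delete[] to forget.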

CM
Thanks. Just for anyone who's interested, the language is a piece of crap known as "Visual Dataflex 7". Evil Steve's figure of about 10 years is pretty close.
Weird. Why would it need both a length AND a terminator?
In one of his columns Joel Spolsky called them f***ed strings. And I would tend to agree with him on that one, since it's silly to have both Pascal-style (with the size prefixing the string) and C-style (null-terminated) in one string.

There's no need to have both. The primary reason for using Pascal strings is speed (strlen is O(1), not O(n)), and the primary reason for C strings is that Pascal strings were originally only prefixed with one byte for length, so you couldn't have strings longer than 255 chars. Plus null-terminated strings worked well on the PDP-11, when C was designed :)

With a 4-byte length field, and more modern comps than the PDP-11, the reason for null-terminated strings is gone.
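
To make the O(1) versus O(n) point concrete, here is a small sketch of the two ways to get the length. Both function names are made up, and it assumes the 4-byte prefix is in host byte order and excludes the terminator.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// O(1): read the 4-byte prefix stored just before the characters.
// Assumption: host byte order, length excludes the terminator.
std::size_t PrefixedLength(const char* chars)
{
    std::int32_t length = 0;
    std::memcpy(&length, chars - sizeof length, sizeof length);
    return static_cast<std::size_t>(length);
}

// O(n): walk the characters until the terminator, as strlen does.
std::size_t ScannedLength(const char* chars)
{
    std::size_t count = 0;
    while (chars[count] != '\0')
        ++count;
    return count;
}
```

The first one costs the same for a 5-character string and a 5-megabyte one; the second has to touch every byte.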
Seems like a compatibility hack to me. You could just pass such a pointer to a C function, without conversion. Could be a pain though, if the C function manipulates the string.
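
A sketch of the pain in question, with made-up function names: a C-style routine that shortens the string in place knows nothing about the prefix, so the stored length goes stale until something rewrites it. This assumes the prefix is 4 bytes, in host byte order, and excludes the terminator.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical C-style routine: cuts the string at the first space by
// writing a new terminator. It never touches the length prefix.
void TruncateAtSpace(char* text)
{
    char* space = std::strchr(text, ' ');
    if (space != nullptr)
        *space = '\0';
}

// Hypothetical fix-up: recompute the length from the terminator and
// write it back into the 4 bytes in front of the characters.
void ResyncLength(char* chars)
{
    const std::int32_t length = static_cast<std::int32_t>(std::strlen(chars));
    std::memcpy(chars - sizeof length, &length, sizeof length);
}
```

After TruncateAtSpace on "hello world" the prefix still says 11 while strlen says 5, so you would need to call something like ResyncLength before handing the string back to the other language.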
Quote:Original post by DaBookshah
a four byte length field, followed by the null-terminated ascii string itself.

GCC implements std::string in a very similar way, but has 3 fields preceding the actual string: length, capacity, refcount. It is null-terminated too.

This topic is closed to new replies.
