fir

deinstancing c-strings


Is it possible (in MinGW) to force all string instances with identical content (like "cat", "cat", and "cat" placed in three different places) to be represented in memory as only one instance?

I need it to compare words by address, like

char* str = "cat";

if(str=="cat") ...

I need all identical strings in my whole program to be de-instanced to one instance. When I used the old Borland b55 there was a compiler switch for this.

Also, is it possible when the string copies are located in different .o modules? (I would be happy if I could de-instance all those strings to only one original instance.)

Many thanks if somebody knows and could answer that.


Well, you could do it with a constant:

const char* CAT_STR = "cat";

Also, str == "cat" does not do what you want in C: it compares the two addresses, not the characters. You want strcmp(str, "cat"), which returns 0 on equality.
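
To make the difference concrete, here is a minimal sketch (the helper name is_cat is hypothetical, not from the thread). It compares characters rather than addresses, so it works no matter where the string is stored:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when str holds the characters "cat",
   regardless of where those characters live in memory. */
bool is_cat(const char *str)
{
    assert(str != NULL);
    /* strcmp returns 0 when both strings have the same contents. */
    return strcmp(str, "cat") == 0;
}
```

A buffer filled at runtime with the characters c, a, t passes this check, whereas a == comparison against the literal "cat" would compare two unrelated addresses.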


Hodgman, I really like you, and I often support you, but that code you just posted is filled with ugliness: it's got std::set, std::string and globals. Can't get uglier than that.
 
To answer the OP post, you'll have to use const variables with extern forward declarations, and define the string in one compilation unit:
 

//Definitions.h
extern const char *gCAT;

//Definitions.cpp
#include "Definitions.h"
const char *gCAT = "cat";

//Foo.cpp
#include <string.h>
#include "Definitions.h"
if( strcmp( myVar, gCAT ) == 0 )
{
    //myVar contains "cat"
}

Edit: Just to clarify what's "ugly" in Hodgman's code:

  • Common implementations of std::set are not cache friendly. Any advantage from string interning is obliterated by this (duplicated strings in the final compiled binary are just going to be faster).
  • std::string creates unnecessary heap allocations (plus other minor overhead).
  • It's not thread safe (the code allows adding more strings to the pool at runtime, so there is a problem if R/W access happens from multiple threads).
  • It's got globals.
Edited by Matias Goldberg


 


This is not a good solution for me; I need to compare by pointer:

void DrawPixel(int x, int y, char* color)
{
    if(color=="red")  DrawPixel(x,y,0x00ff0000);
    if(color=="blue")  DrawPixel(x,y,0x000000ff);
}

etc. It is a technique of my own invention called ad-hoc enums (*). It is extremely usable because you do not need to jump around or worry about definitions at all; a very comfortable technique.

In b55 I used "unity builds" (I mean I included 100 source files into one and then compiled it all as one 'translation unit'). I had the switch "-d Merge duplicate strings" and it worked perfectly well (and I used it very heavily, skipping normal enums altogether).

Is it not possible in MinGW to merge strings across different modules? (Maybe it is possible to merge them within one translation unit?) I think theoretically this should be quite possible; the linker just needs to merge many into one at the linking stage.

(I would be very sorry to have to drop my favourite ad-hoc enums in their string implementation.)

(*) Precisely speaking, I invented the concept of an ad-hoc enum as an enum without a definition, and found that I can use strings to implement it without dedicated compiler support.

Is it really impossible in MinGW to do that?

Edited by fir


GCC merges string constants at all optimisation levels except -O0. MinGW should do likewise.

You can also add -fmerge-all-constants so that it merges constant variables in addition to literals, but note that this is not standard conforming.

Edited by King Mir
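
A quick way to see what a particular build does is to compare the addresses of two identical literals. The sketch below is only an illustration with hypothetical function names. The C standard leaves it unspecified whether identical literals share storage, so the result can differ between compilers, optimisation levels and translation units.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Two separate occurrences of the same literal in one translation unit. */
const char *first_cat(void)  { return "cat"; }
const char *second_cat(void) { return "cat"; }

/* Reports whether the compiler merged the two occurrences. */
bool literals_share_address(void)
{
    const char *a = first_cat();
    const char *b = second_cat();
    /* The contents always match, whatever the compiler did. */
    assert(strcmp(a, b) == 0);
    /* The addresses match only if the two literals were merged. */
    return a == b;
}
```

Calling literals_share_address() from a small test program, built once with -O0 and once with optimisation, shows whether merging happened for that build. It says nothing about literals that live in other object files.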


Matias Goldberg wrote: "Hodgman, I really like you, and I often support you, but that code you just posted is filled with ugliness: it's got std::set, std::string and globals. Can't get uglier than that."

Hahah, no problem. It was the smallest demonstration of a string interning system that I could think of (and for the record, at least the global badness was commented).

If you want a general-purpose system (e.g. dealing with strings that aren't known until runtime), then an interning system like that is what you'd want.

I've seen a bunch of such systems inside other engines, except using better hash tables, string storage containers, and lockless thread safety... Such features aren't needed in an example that explains the idea, though.
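
For readers who have not seen one, a string interning table can be sketched in a few lines of C. This is not the code Hodgman posted; intern, pool and POOL_CAPACITY are hypothetical names, the lookup is a plain linear scan, and the table is not thread safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_CAPACITY 256 /* hypothetical fixed limit for this sketch */

static const char *pool[POOL_CAPACITY];
static size_t pool_count = 0;

/* Returns the canonical pointer for the given text, or NULL if the pool
   is full or memory runs out. Equal texts always yield the same pointer,
   so interned results can be compared with ==. */
const char *intern(const char *text)
{
    assert(text != NULL);

    /* Reuse an existing entry with the same contents, if there is one. */
    for (size_t i = 0; i < pool_count; ++i) {
        if (strcmp(pool[i], text) == 0)
            return pool[i];
    }

    if (pool_count == POOL_CAPACITY)
        return NULL;

    /* First time this text is seen: store a private copy. */
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, text, size);
    pool[pool_count++] = copy;
    return copy;
}
```

With this, intern(a) == intern(b) holds whenever a and b contain the same characters, including strings built at runtime. A real system would replace the linear scan with a hash table and add locking or lock-free access.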

fir wrote: "Is it really impossible in MinGW to do that?"

It's probably possible, but I would never trust such an optimisation so much that your code's logic depends on it.
if(color=="red") is just wrong.
if(color==g_red) like in Matias' example is the only standards-compliant way to do this.
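
A minimal sketch of that pointer-comparison variant, assuming hypothetical constants g_red and g_blue that are each defined exactly once (in a multi-file program a shared header would declare them with extern). The function name color_to_rgb is also made up for illustration; the colour values are the ones from fir's DrawPixel.

```c
#include <assert.h>
#include <stddef.h>

/* Each name is defined once, so every use refers to the same object. */
const char g_red[]  = "red";
const char g_blue[] = "blue";

/* Maps one of the named constants to an RGB value by comparing addresses. */
unsigned color_to_rgb(const char *color)
{
    assert(color != NULL);
    if (color == g_red)  return 0x00ff0000u;
    if (color == g_blue) return 0x000000ffu;
    return 0u; /* unknown pointer: treated as black in this sketch */
}
```

Because the comparison is by address, only the named constants match. A different buffer that happens to contain "red" is not recognised, which is exactly the behaviour that makes relying on bare literals fragile.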


King Mir wrote: "gcc does merge string constants at all optimisation levels except -O0. Mingw should do likewise. You can also add -fmerge-all-constants so it merges constant variables in addition to literals, but note that this is not standard conforming."

I don't quite understand what you mean by string constants. They are constants, but I do not write the word const here:

void DrawPixel(int x, int y, char* color)
{
    if(color=="red")  DrawPixel(x,y,0x00ff0000);
    if(color=="blue")  DrawPixel(x,y,0x000000ff);
}

I can change char* above to const char*, but is that a way to inform the compiler that "red" and "blue" in the body are also constants? I need just this kind of code to work. On b55 with unity builds it worked perfectly well when "-d Merge duplicate strings" was turned on (I didn't try it with strings in separate modules there).

PS: I tested it once more (I had tested before but had partly forgotten the results).

When I use string literals in the same compilation unit they are merged (same address), but when they are used in different compilation units they are different (not merged), and "-fmerge-constants -fmerge-all-constants" didn't help here. I don't know why, because I understand that it should, since the documentation says:

-fmerge-constants Attempt to merge identical constants (string constants and floating-point constants) across compilation units.

I'm still not sure what is meant by constants. All string literals are considered constants, if I'm not wrong, so they could safely be merged in my opinion, but this does not work.

  Edited by fir


 

Hodgman wrote: "It's probably possible, but I would never trust such an optimisation so much that your code's logic depends on it. if(color=="red") is just wrong. if(color==g_red) like in Matias' example is the only standards-compliant way to do this."

I think one could rely on that if one knows that all the "sjhsbjhs" literals in the whole C program lie in the same place; then it is perfectly safe and it is not wrong. But I don't know what switch to use to ensure this.

And I had a little breakdown, because I was very accustomed to this way of writing, especially when writing roguelike projects where you use a lot of enums; there I just used my string invention with great success.


Just don't try to use pointers as enumeration constants. It's a bad idea: loss of enumeration type safety, hard to serialize, hard (as you have seen) to enforce the behaviour you need with different compilers, hard for other programmers to understand, etc.

I totally fail to see why "red" is better than Color::red, by the way. I can see reasons why Color::red is better than "red", though:

void f(Color::Type type);

void g(const char *type);

You can't accidentally pass a wrong type to f(). You can pass anything to g(). It's just abuse of the compiler and silly.

Yes, if you can guarantee that all of the identical strings are merged and all reside at the same address, this can work. That doesn't make it something you should do.

Edited by Aardvajk
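
A C sketch of the enum alternative, reusing the two colour values from fir's DrawPixel example. Color, COLOR_RED, COLOR_BLUE and color_rgb are hypothetical names. Note that C checks enums less strictly than C++ checks the Color::Type used above, since C converts integers to enum types implicitly; compilers do still diagnose passing a string literal where a Color is expected.

```c
#include <assert.h>

/* Hypothetical colour enumeration; the values are ordinary integers,
   so they serialize easily and need no special linker behaviour. */
typedef enum {
    COLOR_RED,
    COLOR_BLUE
} Color;

/* Maps an enumerator to the RGB value used in the DrawPixel example. */
unsigned color_rgb(Color color)
{
    switch (color) {
    case COLOR_RED:  return 0x00ff0000u;
    case COLOR_BLUE: return 0x000000ffu;
    default:         break;
    }
    assert(0 && "unknown Color value");
    return 0u;
}
```

The call site reads almost the same as the string version, color_rgb(COLOR_RED) instead of "red", and it does not depend on any string-merging option.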
