fir

deinstancing c-strings

48 posts in this topic

Is it possible (in MinGW) to force all string literals with identical content (like "cat", "cat", and "cat" placed in three different places) to be represented in memory as only one instance?

I need this so I can compare words by address, like:

char* str = "cat";

if(str=="cat") ...

I need all identical strings in my whole program to be de-instanced to one instance. When I used the old Borland b55 there was a compiler switch for this.

Also, is this possible when the string copies are located in different .o modules? (I would be happy if I could de-instance all those strings to only one original instance.)

Many thanks if somebody knows and can answer that.

 

 


Well, you could do it with a constant:

const char* CAT_STR = "cat";

Also, str == "cat" does not do what you expect in C: it compares addresses, not contents. You want strcmp(str, "cat"), and it returns 0 on equality.


 

Hodgman I really like, and I often support you, but that code you just posted is filled with ugliness: It's got std::set, std::string and globals. Can't get uglier than that.
 
To answer the OP post, you'll have to use const variables with extern forward declarations, and define the string in one compilation unit:
 

//Definitions.cpp
const char *gCAT = "cat";
 
//Definitions.h
extern const char *gCAT;
 
//Foo.cpp
if( strcmp( myVar, gCAT ) == 0 )
{
    //myVar contains "cat"
}

 

This is not a good solution for me; I need to compare by pointer:

void DrawPixel(int x, int y, char* color)
{
    if(color=="red")  DrawPixel(x,y,0x00ff0000);
    if(color=="blue")  DrawPixel(x,y,0x000000ff);
 
}

and so on. It is a technique of my own invention that I call ad-hoc enums (*). It is extremely usable because you do not need to jump to definitions or worry about them at all; a very comfortable technique.

In b55 I used "unity builds" (I mean I included 100 source files into one and compiled it all as one translation unit). I had the switch "-d Merge duplicate strings" and it worked perfectly well (and I used it very heavily, skipping normal enums altogether).

Is it not possible in MinGW to merge strings across different modules? (Or maybe it is possible to merge them only within one translation unit?) I think in theory this should be quite possible: the linker just needs to merge the many copies into one at the linking stage.

(I would be very sorry to have to drop my favourite string implementation of ad-hoc enums.)

(*) Precisely speaking, I invented the concept of an ad-hoc enum as an enum without a definition, and found that I can use strings to implement it without dedicated compiler support.

Is it really impossible to do that in MinGW?

Edited by fir

gcc does merge string constants at all optimisation levels except -O0. Mingw should do likewise.

 

You can also add -fmerge-all-constants so it merges constant variables in addition to literals, but note that this is not standard conforming.

Edited by King Mir


I do not quite understand what you mean by string constants. They are constants, but I do not write the word const here:

 

void DrawPixel(int x, int y, char* color)
{
    if(color=="red")  DrawPixel(x,y,0x00ff0000);
    if(color=="blue")  DrawPixel(x,y,0x000000ff);
}

 

I can change char* above to const char*, but is that a way to inform the compiler that "red" and "blue" in the body are also constants? I just need this kind of code to work. On b55 with unity builds it worked perfectly well when "-d Merge duplicate strings" was turned on (I did not try it there with strings in separate modules).

PS: I tested it once more (I had tested it before but forgot the results a bit).

When I use string literals in the same compilation unit they are merged (same address), but when they are used in different compilation units they are different (not merged), and "-fmerge-constants -fmerge-all-constants" did not help here. I do not know why, because as I understand the documentation it should:

-fmerge-constants Attempt to merge identical constants (string constants and floating-point constants) across compilation units.

I am still not sure what is meant by constants. All string literals are considered constants, if I am not wrong, so in my opinion they could safely be merged, but this does not work.

  Edited by fir

 

Is it really impossible to do that in MinGW?

It's probably possible, but I would never trust such an optimisation so much that your code's logic depends on it.
if(color=="red") is just wrong.
if(color==g_red) like in Matias' example is the only standards-compliant way to do this.

 

 

I think one could rely on that if one knows that all the "sjhsbjhs" literals in the whole C program lie in the same place. Then it is perfectly safe and it is not wrong, but I do not know what switch to use to ensure this.

And I am a little broken up about it, because I was very accustomed to this way of writing, especially when writing roguelike projects where you use a lot of enums; there I used my string invention with great success.


Just don't try to use pointers as enumeration constants. It's a bad idea: loss of enumeration type safety, hard to serialize, hard (as you have seen) to enforce the behaviour you need with different compilers, hard for other programmers to understand, and so on.

 

I totally fail to see why "red" is better than Color::red by the way. I can see reasons why Color::red is better than "red" though:

 

void f(Color::Type type);

void g(const char *type);

 

You can't accidentally pass a wrong type to f(). You can pass anything to g(). It's just abuse of the compiler and silly.

 

Yes, if you can guarantee that all of the identical strings are merged and all reside at the same address, this can work. That doesn't make it something you should do.

 

I was very happy with that: you just pass string literals, everything works, and it is top efficient and easily expandable. Errors and mistakes are also very easy to catch here:

void f(char* str)
{
    if(str=="red") ;
    else if(str=="green") ;
    else if(str=="orange") ;
    else ERROR("string %s unrecognized in  ...", str);
}

You can also build many other techniques on top of that (you can fall back to consuming the contents of the string if needed, etc.).

(That is why I am emphasizing that this is my invention, though maybe someone has used it before and I never heard of it. Well, two inventions really: the first is the concept of an "ad-hoc" enum, and the second is its string literal implementation. I was so happy with it.)

The practical reason for its great usability is that you are spared the jumping through code to definitions (which is really tiring), and you also do not need to maintain all these enum definitions.

But apart from this, I am a bit broken up because -fmerge-constants does not work across modules (so I cannot use this technique across modules), and I do not know why it does not work or how to make it work. Could someone still help? (Thanks King Mir for the hint, it was useful.)

Edited by fir

I understand your reasons why you think this is a good idea. But I still maintain it isn't. You are typing all of these properties as const char*, with no way to differentiate them. Your example of the if-else chain is moving a check to runtime that should be done at compile time, so it isn't more efficient at all.

 

If the types were defined as an enumeration, the compiler can check at compile time whether a valid value is being used (unless of course you mess about with static_casts but then all bets are off).

 

There are cases where I can see this being useful. Using strings as identifiers can reduce coupling between modules and is in some circumstances a good approach. But these cases are handled best using interning and maybe hashmapping, not by directly comparing pointers.

 

With respect, you haven't "invented" this. You have just been exploiting an implementation detail of a specific compiler in a standards-undefined manner.

 

It is a very cheap check at runtime, not a problem for me (you could even include it only in debug mode).

I like this kind of old-school simplicity.

We have a difference of opinion here. Well, go your own way if you like.

Edited by fir

 

Hodgman I really like, and I often support you, but that code you just posted is filled with ugliness: It's got std::set, std::string and globals. Can't get uglier than that.
 

 

Hodgman is a good man :) interesting answers often


No problem at all, each to their own. If you find a way to get this to work in a standards-defined manner across every compiler your code may ever possibly be compiled with (including assurances that no future updates to the compiler will suddenly break it), please share.

 

I too would be interested in maybe using this technique if the above was not an issue.

Edited by Aardvajk

All right. I think the assurance should be made by the C standard because this is so usable, but as far as I know it now sadly depends on the linker.

In reality this is not the primary trouble for me, because I can stick to one given compiler (MinGW) and write code that relies on compiler-dependent behaviour, so if it only worked here I would be happy. (I know many will see writing compiler-dependent source, using specific extensions and so on, as a sin, but I am okay with that.)



All right. I think the assurance should be made by the C standard because this is so usable, but as far as I know it now sadly depends on the linker.

Well, the C language has assured this will not work since about 1970.  Resolving symbols and other issues regarding separate compilation units has never been a part of the language, and always a part of the system object linker.

 

If you really want to ensure the starting address of arbitrary constant data sequences in memory will be the same across all compilation units, you're going to need to use language extensions to explicitly put the constants into named sections (__attribute__((section("my_enum")))), then use a linker script to assign those sections to a fixed base address.  You will need to use the named constant address everywhere for comparison (as in if(color==g_red)).

 

Or, you could stick to using strcmp() and move on.


I do not understand why

 

"-fmerge-constants

 

Attempt to merge identical constants (string constants and floating-point constants) across compilation units."

 

does not work. As I understand it, it should work: it is probably easy for the linker to relocate all the duplicated string literal instances into one place in the data (probably constants) section. So why is this option not working, and what does "attempt" mean?

(I would like to write an improved C dialect compiler and linker myself, but that is a matter of years, so for now I would prefer that it just worked here in MinGW.)

((A customized strcmp is probably the way to consider if the primary way does not work, but it is sad.))

Edited by fir

For my guess at why -fmerge-constants may not be working for you: are you sure it's being properly passed as a linker setting in your build system? Post the command to link your program together.

 

But I agree with others that this is a bad idea. If you need an enum, use an enum. It's faster, safer, and clearer. Here's a few reasons why:

1) You shouldn't do at runtime what you can do at compile time. An enum is a compile time constant. A pointer is a runtime constant.

2) char * may confuse your compiler because it could alias with anything. That means it can't reorder accesses to the char pointer with other operations involving pointers. Since reordering is crucial to optimisation, this can be a major blow in non-obvious ways.

3) Is it "turquose" or "turquoise" or what? If you misspell a word, it's a different value, and there's nothing to catch your mistake. Likewise, is cyan a different color or not?

4) The use of standard patterns, like enums, makes your code easier to read by others and by you in the future. Being clever is a bad thing for this.

5) If the value is not a compile time literal, your method will check it against all constant pointers and always fail to match. That's an overhead for nothing. And there's no compile time check to guarantee the pointer passed is a literal. In contrast, an enum can match with a runtime integer, or optionally not try to match a runtime integer because it's a different type.

6) On a 64 bit system, pointers are 64 bits. But an enum can have a smaller memory and register footprint. This can also mean that more function parameters are passed by register.

7) You're relying on behaviour not guaranteed by the standard. That makes your code not portable and technically not C.

Edited by King Mir

I used

 

c:\mingw\bin\g++ -O3 module1.c -c -fno-rtti -fno-exceptions -fmerge-constants -fmerge-all-constants

 

to compile the modules (only the module name changes here), and

c:\mingw\bin\g++ -O3 -Wl,--subsystem,windows -w module1.o module2.o module3.o  -lgdi32 -s -fno-rtti -fno-exceptions -o program.exe -fmerge-constants -fmerge-all-constants

to link them.

While searching Google about this merging I found an example by someone on SO (so there is no need to write it yourself):

 
// s.c
#include <stdio.h>

void f();

int main() {
    printf( "%p\n", "foo" );
    printf( "%p\n", "foo" );
    f();
}

// s2.c
#include <stdio.h>

void f() {
    printf( "%p\n", "foo" );
    printf( "%p\n", "foo" );
}

When compiled as:

gcc s.c s2.c

it produces:

00403024
00403024
0040302C
0040302C

But I did not find out whether it is possible to merge them. I had not yet tried this example; I will check it right now.

Edit: tested. Sadly, with

c:\mingw\bin\gcc test.c test2.c -fmerge-constants -fmerge-all-constants

I still got
 
00403024
00403024
0040302C
0040302C
 
 
Edited by fir

In general I agree. The worst thing here is that there could be a compiler where I could not switch this behaviour on :c (as is still the case here).

The others are trade-offs I could easily pay personally (especially since it really should, hopefully, be only very slightly slower than an enum).

As to 'an enum is compile time and a string literal pointer is not compile time', I am not sure this is true: when you use an enum you have 117, and here you have 00403024; both are 'runtime static' numbers.

As to the second point, you are right (that is a good point), but I am not sure about the details either: if the compiler could easily detect that this char* points to a const section, it should (probably) not encumber it.

Edited by fir

 


 

As to 'an enum is compile time and a string literal pointer is not compile time', I am not sure this is true: when you use an enum you have 117, and here you have 00403024; both are 'runtime static' numbers.

 

 

What you seem to be missing is that the reason you are finding this so difficult is that there is a world of difference between an enum being equal to 117 (a standards-defined behaviour) and the value of a pointer, which is an implementation detail.

 

Nobody minds if you want to write compiler-dependent, non-standard code that can break in the future; that's entirely up to you. But continuing to try to argue that it "should" be possible and "should" be implemented is just wrong.

 

 

You say the value of a pointer is an implementation detail (I do not quite understand this statement), but you use pointers in some way, so you can compare them for equality here too.

(Maybe if there were even a rule that they should be stored in the string literals section in alphabetical order, you could even legally compare them with less-than or greater-than ;/ That is a bit of a joke; it would probably not be good.)

So this is not a good argument in my opinion, but I am not sure I want to convince anybody and take this kind of conversation on my back ;/

I need some advice on why this string merging is not working in my MinGW.

Edited by fir

It is absolutely true that enums get better optimizations than pointers at compile time. Enums allow optimizations that are literally impossible with pointers. For example, take the simple case of a color channel by enum versus string constants (the last comparison is the part that can be optimized away):

 

enum {RED, GREEN, BLUE}

if (enum == RED) do something

else if (enum == GREEN) do something

else if (enum == BLUE) do something

 

It is very easy to tell enum has to be BLUE if it is not GREEN or RED

 

if (str == "red") do something

else if (str == "green") do something

else if (str == "blue") do something

 

It is impossible to know if str is restricting itself to only those 3 possible values. There are 4 billion possible values for str on 32-bit, and 18 quintillion possible values on 64-bit. The optimizer is not even going to try to track values for a pointer given how impossible a task that would be.

Edited by richardurich

Even if you could do this and there were no performance issues, you're giving up compile-time checking for runtime checking. If you misspell a string somewhere in your code, you won't find out about it until that code is executed (hope you have a way to get 100% code coverage) - and even then the result might just be a subtle bug.

 

If you're using enums (or string constants), then if you mistype the name somewhere you'll get a compile error.

 

It sounds like you prefer to be a bit lazy writing your code ("simplicity"), even if it results in less maintainable, less robust code. That's never a good trade-off.

Edited by phil_t