Variable Prefixing in OO Languages.

30 comments, last by Nathan Baum 18 years, 5 months ago
Quote:Original post by Nitage
Quote:Original post by Oluseyi
Prefixes should indicate use, not type. Ironically, this was Simonyi's original intention with what is popularly known as "Hungarian Notation" (or derisively known as Microsoft Notation). Unfortunately, being Hungarian, his facility with English was limited and he used the word "type" where he meant "kind," and we have suffered ever since under the yoke of Petzold.


Use == Type in OO.

I'm not sure I follow you here. But if you're saying that the manner in which an object is used uniquely defines its type, I have to disagree strenuously. The obvious example is any single-inheritance chain: common base types can substitute for the specialized types (aka LSP).
- k2
Quote:Original post by kSquared
Quote:Original post by Nitage
Quote:Original post by Oluseyi
Prefixes should indicate use, not type. Ironically, this was Simonyi's original intention with what is popularly known as "Hungarian Notation" (or derisively known as Microsoft Notation). Unfortunately, being Hungarian, his facility with English was limited and he used the word "type" where he meant "kind," and we have suffered ever since under the yoke of Petzold.


Use == Type in OO.

I'm not sure I follow you here. But if you're saying that the manner in which an object is used uniquely defines its type, I have to disagree strenuously. The obvious example is any single-inheritance chain: common base types can substitute for the specialized types (aka LSP).


The use of the class from the perspective of the user equates to the type which the object is referenced with (the base class).
I didn't say anything about uniqueness.
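
A minimal C++ sketch of the point being argued, using hypothetical names (Shape, Circle, Square, describe) that do not come from the thread. The function only uses its argument through the base interface, so the "use" seen by the caller matches the type of the reference, while more than one concrete type can stand behind it:

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy, invented for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

struct Square : Shape {
    std::string name() const override { return "square"; }
};

// The caller-visible "use" is whatever Shape offers; the concrete type
// behind the reference is not determined by that use.
std::string describe(const Shape& shape) {
    return "a " + shape.name();
}

void demo_substitution() {
    Circle circle;
    Square square;
    assert(describe(circle) == "a circle");
    assert(describe(square) == "a square");
}
```

Both readings in the exchange above fit this sketch: the use does not pin down a unique concrete type, but it does correspond to the base type the object is referenced through.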



I agree that prefixes should be dropped when it's possible to just use the type system instead, since the compiler's much more reliable at checking things than humans.

And anyways, how the hell are you supposed to prefix for templated types? >;]
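
To make the template question concrete, here is a small sketch with a hypothetical helper named largest. Inside the template the element type is only known as T, so there is no fixed type a prefix could encode, yet the compiler still checks every use:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the largest element of a non-empty vector.
// 'values' and 'best' have whatever type T turns out to be at the call site.
template <typename T>
T largest(const std::vector<T>& values) {
    T best = values.front();
    for (const T& value : values) {
        if (best < value) best = value;
    }
    return best;
}

void demo_largest() {
    assert(largest(std::vector<int>{3, 9, 4}) == 9);
    assert(largest(std::vector<std::string>{"pear", "apple"}) == "pear");
}
```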

Oh, and regarding the first post:
unsigned int guiJoeRating; // <-- sure looks like a badly-named GUI widget to me
http://www.joelonsoftware.com/articles/Wrong.html

An interesting article on variable prefixes, with some history and true-story material about Hungarian notation. Excellent read.

edit: this is the article Oluseyi and Nitage are talking about.
Two ironic things.

1) The time I most want to actually use any kind of Hungarian notation is when I'm working in Python. :s

2) The people who seem to defend HN the most strongly are the ones who are also fond of fancy IDEs which are supposed to provide the exact same information. (But then, looking at their code, it seems that redundancy is indeed no problem for these people... sigh...)
Normally I adhere to the standard of the project I'm on.

For my own projects I use a prefix to declare the scope:
g - global
m - member
a - argument
l - local
(n - namespace, I never use this)

This prevents trouble with getters and setters in Java or whatever language you use:

public void setTime (int aTime) {
mTime = aTime;
}

Instead of:

public void setTime (int time) {
this.time = time;
}

I think the one with prefixes makes the code clearer.
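
For comparison, a C++ sketch of the same setter in both styles. The class names are hypothetical, and setTimeBuggy is included only to show the kind of trouble the prefixes are meant to avoid: when the parameter shadows the member, forgetting the this-> qualifier still compiles but assigns the parameter to itself.

```cpp
#include <cassert>

// Prefixed style from the post: m = member, a = argument.
class PrefixedClock {
public:
    void setTime(int aTime) { mTime = aTime; }
    int time() const { return mTime; }
private:
    int mTime = 0;
};

// Unprefixed style: the parameter shadows the member.
class PlainClock {
public:
    void setTime(int time) { this->time = time; }
    // Deliberately wrong: without this->, both names refer to the parameter.
    void setTimeBuggy(int time) { time = time; }
    int getTime() const { return time; }
private:
    int time = 0;
};

void demo_clocks() {
    PrefixedClock prefixed;
    prefixed.setTime(5);
    assert(prefixed.time() == 5);

    PlainClock plain;
    plain.setTime(5);
    plain.setTimeBuggy(9);         // silently does nothing to the member
    assert(plain.getTime() == 5);
}
```

Some compilers can warn about the self-assignment, so the prefix is one safeguard among several rather than the only one.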

[Edited by - EmmetjeGee on November 28, 2005 2:45:26 PM]
Identifier

[<prefix>+_]<name>

Prefix

m -- Class member
g -- Global
s -- Static

Name

The length of a name depends upon how widely visible it is, and how much context is available when it is referenced.

For example, the Customer member variable storing the name of the customer can be m_name. It doesn't need to be m_customerName: it's implied that it's a customer's name. Furthermore, since you're using documentation comments (right?), that can be displayed in a tooltip.

Type information shouldn't be anywhere near the identifier, unless it's the only distinction which you can make between two variables, and then you should be using domain-specific 'kinds' rather than C++ types: e.g. m_textureId, m_textureName.

Additionally, suppose an m_stdstringName contains a string at the moment, but later it needs to contain a Name instance with given() and family() functions? The name is now a lie, even though it might have a conversion from Name to std::string so its clients don't need to be updated unless they want to.
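
A sketch of that scenario, assuming a hypothetical Name class with the given() and family() accessors mentioned above and a conversion to std::string. The Customer class and the sample values are invented for illustration:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical Name type; only given() and family() are taken from the post.
class Name {
public:
    Name(std::string given, std::string family)
        : m_given(std::move(given)), m_family(std::move(family)) {}

    const std::string& given() const { return m_given; }
    const std::string& family() const { return m_family; }

    // Conversion so clients that expect a std::string keep working.
    operator std::string() const { return m_given + " " + m_family; }

private:
    std::string m_given;
    std::string m_family;
};

class Customer {
public:
    explicit Customer(Name name) : m_name(std::move(name)) {}
    // m_name is still an accurate identifier after the member's type changed
    // from std::string to Name; m_stdstringName would not be.
    const Name& name() const { return m_name; }
private:
    Name m_name;
};

void demo_name() {
    Customer customer(Name("Ada", "Lovelace"));
    std::string full = customer.name();   // older client code, unchanged
    assert(full == "Ada Lovelace");
    assert(customer.name().family() == "Lovelace");
}
```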
Quote:Original post by Nitage
Quote:Original post by Oluseyi
Quote:Original post by etothex
I don't prefix variables, except sometimes putting _ in front of class members.

Which, of course, is illegal in Standard C++ (identifiers with leading underscores are reserved for the standard library implementation).


Use what you want, just be consistent, and accept that virtually nobody else will agree with you.


It is not illegal.

The following identifiers are reserved:
Quote:ISO 14882, section 17.4.3.1.3:
- Each name that contains a double underscore or begins with an underscore followed by an upper-case letter.
- Each name that begins with an underscore is reserved to the implementation in the global namespace.


It's not illegal, but it's still unsafe. Most vendors and library writers tend towards the rule given by Oluseyi rather than that actually written. Examples are numerous.
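
As a rough illustration of the rules quoted above (the Counter class is hypothetical), the reserved forms are shown only in comments, and the one form the quoted text permits is used as a class member:

```cpp
#include <cassert>

// Reserved everywhere under the quoted rules (double underscore, or an
// underscore followed by an upper-case letter), so left commented out:
//   int __count;
//   int _Count;

// Reserved in the global namespace (any leading underscore), also left out:
//   int _count;

// Permitted by the quoted rules: underscore followed by a lower-case letter,
// declared as a class member rather than in the global namespace.
class Counter {
public:
    void increment() { ++_count; }
    int count() const { return _count; }
private:
    int _count = 0;
};

void demo_counter() {
    Counter counter;
    counter.increment();
    assert(counter.count() == 1);
}
```

This compiles under the letter of the rule, which is the distinction being drawn here; whether it is wise is the separate question raised in the reply above.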

Quote:Original post by Oluseyi
Prefixes should indicate use, not type. Ironically, this was Simonyi's original intention with what is popularly known as "Hungarian Notation"...


Do we have to go through this again? I posted a response to this exact same comment back in July, showing that Simonyi's intention must have been TYPE, not usage. Here you go, again:


Following the link to the MS site (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs600/html/hunganotat.asp), it appears that you are wrong, Oluseyi. The sample that Simonyi gave very strongly shows HN being used for TYPE information, rather than usage information, as Simonyi does not use the information in a usage context.

Here is Simonyi's original code, from his original article explaining HN, marked up a bit. It shows the extreme atrocity that is Hungarian Notation, and how brittle it quickly becomes on the *second* line of the example!

1   #include "sy.h"
2   extern int *rgwDic;
2a  //I placed this quote in my original reply to Oluseyi.  See 2k for more.
2a  //This example was probably created when words were 16bit, as were ints.
2b  //'rgwDic' indicates that it is used as a WORD, and therefore words were
2c  //intermixable with ints, therefore Simonyi's example sucked in regard
2d  //to showing that the intention of Hungarian Notation was to show the
2e  //'usage' of variables.  It also showed the extreme brittleness that
2f  //Hungarian Notation suffers from.
2k  //A poster indicated that 'w' stood for word (ie - usage), which makes the
2l  //above note invalid.  But it is not entirely invalidated, as Simonyi goes
2m  //on to state an example of HN prefixing of "cwX  Size of instances of X
2n  //in words", clearly indicating 'w' being used as a machine word, rather
2o  // than a dictionary word.
3   extern int bsyMac;
4   struct SY *PsySz(char sz[])
6      {
7      char *pch;
8      int cch;
9      struct SY *psy, *PsyCreate();
10     int *pbsy;
11     int cwSz;
12     unsigned wHash=0;
13     pch=sz;
14     while (*pch!=0
15        wHash=(wHash<>11+*pch++;
16     cch=pch-sz;
17     pbsy=&rgbsyHash[(wHash&077777)%cwHash];
18     for (; *pbsy!=0; pbsy = &psy->bsyNext)  //If 'b' indicates USAGE, bsyNext indicates
18a                                            //it should be used as a byte.  But it is
18b                                            //clearly being assigned to an int pointer!,
18c                                            //so the 'b' CANNOT designate USAGE, but
18d                                            //rather TYPE.
19        {
20        char *szSy;
21        szSy= (psy=(struct SY*)&rgwDic[*pbsy])->sz;
22        pch=sz;
23        while (*pch==*szSy++)  //Note that at this point, 'szSy' is NOT being used for
23a                              //USAGE, but rather for TYPE!
23b                              //If 'szSy' was used for usage, it would NOT have been
23c                              //incremented!
24           {
25           if (*pch++==0)
26              return (psy);
27           }
28        }
29     cwSz=0;
30     if (cch>=2)
31        cwSz=(cch-2/sizeof(int)+1;
32     *pbsy=(int *)(psy=PsyCreate(cwSY+cwSz))-rgwDic;
33     Zero((int *)psy,cwSY);
34     bltbyte(sz, psy->sz, cch+1);
35     return(psy);
36     }


Here is the code cleaned up to my standards (and best guesses). I may be wrong in many of my conclusions, but that is because of the difficulty inherent in HN. Which code would you rather troubleshoot?

1   #include "DictionarySymbol.h"
2   extern int *gNumericDictionary; //Would just leave as 'gDictionary', but int-char
2a                                  //is confusing.  You would expect a dictionary to be
2b                                  //composed of 'char's.  (In the old days, that is.)
3   extern int gNextDictionaryPtrAvailable;
3a  //The following function places a string in the dictionary if it doesn't already exist
4   struct DictionarySymbol * placeStringInDictionary(char string[])
6      {
7      char *ch;
8      int stringLength;
9      struct DictionarySymbol * symbol, *createNewSymbol();
10     int * hashTableLocation;
11     int stringSizeInTable;
12     unsigned hashValue=0;
13     ch = string;
14     while (*ch!=0        //Note: original article missing closing ')'
15        hashValue = (hashValue<>11+*ch++;   //Again, missing closing ')'
16     stringLength = ch-string;
17     hashTableLocation=&hashTable[(hashValue&077777)%numHashEntries];
18     for (; *hashTableLocation!=0; hashTableLocation = &symbol->nextSymbol)
19        {
20        char *ptr;
21        ptr = (symbol=(struct DictionarySymbol*)&gNumericDictionary[*hashTableLocation])->string;
22        ch = string;
23        while (*ch==*ptr++)
24           {
25           if (*ch++==0)
26              return (symbol);
27           }
28        }
28a    //The string has not been found in the table, so insert it.
29     stringSizeInTable=0;
30     if (stringLength>=2)
31        stringSizeInTable=(stringLength-2/sizeof(int)+1;
31a    //The following uses 'cwSY'.  It is probably the size of the additional information
31b    //to place into the table.  As I can't determine this for certain, from the code
31c    //given, I will not name it more appropriately, such as 'symbolInfoSize'.
32     *hashTableLocation=(int *)(symbol =
32a                createNewSymbol(cwSY+stringSizeInTable))-gNumericDictionary;
33     Zero((int *)symbol,cwSY);
34     bltbyte(string, symbol->string, stringLength+1);
35     return (symbol);
36     }


Note that the cleaned up version is also pure C code, and as such, there is no reason that it couldn't have been used in the first place.

The fact that you can actually understand the USAGE better in the cleaned up version makes a strong argument to me that HN was originally created as a way for elitist programmers to obfuscate their code, and proclaim it 'better' than other people's code.

It is pretty clear from the above example that Simonyi meant 'type' in the regular usage of the word when he wrote 'type', and that others coming along behind him argued that he meant 'usage' without really reading the article.
Quote:Original post by Anonymous Poster
Quote:Original post by Oluseyi
Prefixes should indicate use, not type. Ironically, this was Simonyi's original intention with what is popularly known as "Hungarian Notation"...


Do we have to go through this again? I posted a response to this exact same comment back in July, showing that Simonyi's intention must have been TYPE, not usage.

This goes directly contrary to what Simonyi says in the article:

Quote:
As suggested above, the concept of "type" in this context is determined by the set of operations that can be applied to a quantity. The test for type equivalence is simple: could the same set of operations be meaningfully applied to the quantities in questions? If so, the types are thought to be the same. If there are operations that apply to a quantity in exclusion of others, the type of the quantity is different.

The concept of "operation" is considered quite generally here; "being the subscript of array A" or "being the second parameter of procedure Position" are operations on quantity x (and A or Position as well). The point is that "integers" x and y are not of the same type if Position (x,y) is legal but Position (y,x) is nonsensical. Here we can also sense how the concepts of type and name merge: x is so named because it is an x-coordinate, and it seems that its type is also an x-coordinate. Most programmers probably would have named such a quantity x. In this instance, the conventions merely codify and clarify what has been widespread programming practice.

Note that the above definition of type (which, incidentally, is suggested by languages such as SIMULA and Smalltalk) is a superset of the more common definition, which takes only the quantity's representation into account. Naturally, if the representations of x and y are different, there will exist some operations that could be applied to x but not y, or the reverse.


I think this makes Simonyi's intention quite clear: HN codes for the operations which can be performed upon the object. Necessarily, this is a subset of the operations which the compiler will allow to be performed, and the compiler will only allow an operation to be performed upon an object if the object is of a suitable 'compiler type'.
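
Simonyi's Position(x, y) example can also be expressed directly in the compiler's type system. The following is only a sketch with hypothetical wrapper types (XCoord, YCoord, Point); it shows two quantities with the same representation but different sets of legal operations:

```cpp
#include <cassert>

// Hypothetical wrappers: both hold an int, but they are distinct types.
struct XCoord { int value; };
struct YCoord { int value; };

struct Point { int x; int y; };

// Position(x, y) compiles; Position(y, x) does not.
Point Position(XCoord x, YCoord y) {
    return Point{x.value, y.value};
}

void demo_position() {
    XCoord x{3};
    YCoord y{7};
    Point p = Position(x, y);
    assert(p.x == 3 && p.y == 7);
    // Point q = Position(y, x);  // rejected: no conversion between the wrappers
}
```

With plain int parameters both call orders would compile, which is the gap the naming convention was trying to cover by hand.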

Further examination of the code snippet is informative. Although compiler types are coded within the names of many variables, not all of them code for the type of that variable, but of a related variable.

rgwDic Array of words used as a dictionary. There's no way to tell from "int*" if the variable is to be thought of as pointing at an array of objects or a single object, thus 'rg' is not coding for compiler type. 'w' might be coding for compiler type, though.

bsyMac The index of the first-past-the-end element of some array of 'sy'. Whilst inarguably a really stupid name -- it would be useful to know which array it was -- it certainly doesn't code for the compiler type of the variable.

PsySz A function accepting a zero-terminated string and returning a pointer to a sy. Obviously a terrible function name. Clearly codes for the compiler type of the return value. However, "char[]" isn't necessarily zero-terminated, so that codes for a well-defined subset of character arrays which cannot be expressed as a compiler type.

pch Pointer to a character. Codes for the fact that it's a pointer, and that it points to a character. Wrong on both counts.

cch Count of characters. Nothing to do with its compiler type.

psy Pointer to SY. Wrong on both counts.

PsyCreate Codes for return type.

pbsy Pointer to an index into an array of SY. Codes for it being a pointer, but 'b' and 'sy' do not code for its compiler type.

cwSz A count of words in a zero-terminated string. Doesn't code for its compiler type.

wHash A word which stores a hash value? Or a hash of a word?

cwHash A count of the number of words in a hash table.

rgbsyHash An array of indices into an array of SYs, used as a hash table. 'rg' might code for compiler type unless it's declared as a pointer.

szSy A zero-terminated string which in some fashion represents an SY. As above, "char*" alone is not enough to establish that a zero-terminated array is being pointed at.
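
The remark about 'sz' in the PsySz and szSy entries can be sketched as follows. The names countChars, sz and rgch are hypothetical, chosen in the spirit of the prefixes under discussion; both arrays decay to the same pointer type, but only one meets the zero-termination contract:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Counts characters up to the terminator. Only valid for zero-terminated
// input, which the parameter type const char* cannot express.
std::size_t countChars(const char* sz) {
    return std::strlen(sz);
}

void demo_sz() {
    const char sz[] = "abc";               // four bytes: 'a', 'b', 'c', '\0'
    const char rgch[3] = {'a', 'b', 'c'};  // three bytes, no terminator
    assert(sizeof(sz) == 4);
    assert(sizeof(rgch) == 3);
    assert(countChars(sz) == 3);
    // countChars(rgch) would read past the end of the array, although the
    // compiler accepts the call: same pointer type, different contract.
}
```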

By my counting:

8 prefixes which redundantly code for the exact compiler type of the object they name.

17 prefixes which code for the type of some other object, a functionally distinct subset of the compiler type, or just usage.

3 prefixes which might go either way.

Quote:
It is pretty clear from the above example that Simonyi meant 'type' in the regular usage of the word when he wrote 'type', and that others coming along behind him argued that he meant 'usage' without really reading the article.


It is pretty clear from the article that people who read the article would know that he didn't mean 'type' in the regular usage of the word when he wrote type.

There is, however, something I think we can both agree on. The following snippet from the article is purest nonsense:

Quote:
In closing, it is evident that the conventions participated in making the code more correct, easier to write, and easier to read.

This topic is closed to new replies.
