Three real nitpicky C++ questions.

Started by
11 comments, last by iMalc 16 years, 9 months ago
Hello all, I have three real nitpicky C++ questions that have been at the back of my mind for a while.

1) Say I have a class (class A) that is only used by another class (class B). The relationship is that A is stored in an STL container in B. By the rules of data hiding, one should limit the accessibility of A by restricting access to A from sources other than B. This sounds exactly like a friend relationship, but the problem is that because A is stored by the container, a second friendship to the container is needed. Now say you store a pointer to A in a second STL container in B. Suddenly your simple class is no longer simple, because you are racking up the friendship count, and what was once a simple class is now an accurate but somewhat sloppy class to read and decipher. It becomes a double-edged sword: either you stick to the rules of data hiding and provide up to several friendship relationships at the cost of readability, simplicity and portability, or you keep the simple class but ignore the rules of data hiding. I know this is very nitpicky, but I find myself having to create a lot of friendship relationships (and it seems as if I create too many) to fully support the theory of data hiding.

2) When and where exactly are inlined functions a good idea? My take is that if the function is used in many places, but in places that do not require optimization or are not called regularly, then inlining may not be such a good idea because of the larger program size, whereas calls nestled deep in regularly used loops may justify the inline. Basically I am just trying to figure out when one should consider inlining, and when exactly inlining is a good idea or not.

3) I know that it is at the discretion of the compiler whether to honor an inline request or not, but approximately how many statements is the maximum length of a function for it to even be considered for inlining?

Thank you for your help, Jeremy (grill8)
To answer your questions:

1) C++ does allow this construct:
class B
{
private:
  class A
  {
  };
};

which would hide A from the rest of the application. Also, the visibility is not altered if A is stored in an STL container:
class B
{
private:
  class A
  {
  };
  vector <A>
    m_vector_of_a;
};

The member m_vector_of_a can't be made public if A is not public. The only unpleasant bit here is that the definition of A sits inside the definition of B, so programmers using B can look at A. If this is a problem, then perhaps the following:
class A; // forward declare A

class B
{
private:
  vector <A *> // can only have vector of pointers to A
    m_vector_of_a;   // because A is not fully defined
};

and in the CPP file:
// define the class in the CPP file to hide it from other classes
class A
{
};

// class B's code
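The nested-class option above can be filled out into something that compiles. This is only a minimal sketch: the members add, at and total on B and the value field on A are hypothetical names chosen for illustration, not anything from the original question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class B
{
public:
    // Public interface of B; callers never need to name the helper type A.
    void add(int value) { m_vector_of_a.push_back(A(value)); }

    // Read access by index; the index must be in range.
    int at(std::size_t index) const
    {
        assert(index < m_vector_of_a.size());
        return m_vector_of_a[index].value;
    }

    int total() const
    {
        int sum = 0;
        for (std::size_t i = 0; i < m_vector_of_a.size(); ++i)
            sum += m_vector_of_a[i].value;
        return sum;
    }

private:
    // A is a private nested class: code outside B cannot write "B::A".
    // The value field is a hypothetical payload for this sketch.
    class A
    {
    public:
        explicit A(int v) : value(v) {}
        int value;
    };

    // std::vector stores A without any friend declaration.
    std::vector<A> m_vector_of_a;
};
```

Code outside B would use it as "B b; b.add(1); b.add(2);" and read results through at and total, without A appearing anywhere in the calling code.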


2) Don't worry about it, the compiler really does know best. However, inline functions can only be inlined if they are defined in the current translation unit, regardless of the inline keyword. For example:
// B.h
class B
{
public:
  inline int func ();
};

// B.cpp
int B::func ()
{
  return 42;
}

// C.cpp
#include "B.h"
C::C ()
{
  B b;
  int i = b.func (); // won't be inlined, may even cause linker error
}
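The usual way around the problem in the example above is to put the body of the inline function in the header, so that every translation unit that calls it also sees the definition. A minimal sketch of what B.h might look like under that assumption, shown as a single file; call_site is a hypothetical caller standing in for the code in C.cpp:

```cpp
// B.h (sketch): the body of the inline function lives in the header.
class B
{
public:
    int func();
};

// Defined in the header and marked inline, so including B.h from several
// .cpp files does not cause multiple-definition errors at link time.
inline int B::func()
{
    return 42;
}

// Hypothetical caller, standing in for code in C.cpp after #include "B.h".
// The compiler can see the body of func here; whether it actually inlines
// the call is still its own decision.
int call_site()
{
    B b;
    return b.func();
}
```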


3) It really depends on the compiler and the compiler options used.

Skizz
1: No, you don't need to use friendships at all, certainly not for the STL container. If A is a helper class to B and only B, you might want to declare A as a member of B. However, there is no real point in doing so, or limiting access to the class A itself. What encapsulation is about is limiting access to members; A's members should be private as often as possible and B should keep A as a private member if there is no reason to do otherwise.

Friends should only be used as a last (or second to last) resort, because they break encapsulation. Also, if B is a friend of A, and A is instantiated and stored in an STL container in B, that does not affect the friend relation between them. B does not need to be a friend of the STL container template, nor vice versa.

2: Inlining is usually a very small concern. You might want to use it for very small functions because there isn't a reason not to, and you might gain some measure of performance benefit. But usually, it's one of the last things you should worry about.
Quote:Original post by grill8
1) Ok, so say I have a class (class A) that is only used by another class (class B). The relationship is that A is stored in an stl container in B. By the rules of data hiding one should limit the accessibility of A by restricting access to A from other sources besides B. This sounds exactly like a friend relationship, but the problem is is that because it is stored by the container a second friendship to the container is needed. Now say you store a pointer to A in a second stl container in B. Now suddenly your simple class is no longer simple because you are racking up the friendship count and what was once a simple class is now an accurate but somewhat sloppy class to read and decipher.
It becomes a double edged sword, either you stick to the rules of data hiding and provide up to several friendship relationships at the cost of readability, simplicity and portability, or you keep the simple class but ignore the rules of data hiding.

Why would a friendship be appropriate here? In general you should try to avoid friend classes as much as possible, since they break encapsulation. A's public interface should be good enough for B to use; if not, then A's interface isn't well designed.

Anyway, let's say a friendship relationship is appropriate. Why is a friendship relationship to the container needed? As long as the copy constructor and assignment operator are public (and the default constructor too, for container operations that need one) you should have no problem. For example try:
#include <iostream>
#include <vector>

class B;

class A
{
private:
	friend class B; // B is A's only friend; std::vector is not one
	int data;

	explicit A(int i)
		:
		data(i)
	{}
};

class B
{
	std::vector<A> obj;
public:
	B()
	{
		for(unsigned int i = 0; i <= 10; ++i)
		{
			obj.push_back( A(i) ); // B may call A's private constructor
		}
		std::vector<A>::const_iterator iter = obj.begin();
		std::vector<A>::const_iterator end_iter = obj.end();
		for( ; iter != end_iter; ++iter)
		{
			std::cout << iter->data << std::endl; // and read A's private data
		}
	}
};

int main()
{
	B b; // the constructor does all the work
}

Here B accesses A's private content in a vector and B is A's only friend.

Quote:I know this is very nitpicky but I find myself having to create a lot of friendship relationships (and it seems as if I do too many) to fully support the theory of data hiding.

In general friendship relationships to other classes should be a very rare thing.

Quote:2) When and where exactly are inlined functions a good idea to use? My take is that if the function is used in many places but in places that do not require optimization or are not called regularly then inlining functions may not be such a good idea due to larger program size, whereas places nestled deep in regularly used loops may justify the inline. Basically I am just trying to figure out when one should consider inlining, and when exactly inlining is a good idea or not.

Usually you shouldn't consider inlining; that's the compiler's job. The compiler knows page sizes, stack size, data flow and lots of other things that help it determine exactly when inlining would be a good idea. In some cases the compiler can do a poor job, but in general it's much better than any human.

If the compiler didn't do much inlining then you would inline small functions since the overhead of calling the function might be too big, but this might be misleading, since the extra inlined instructions might cause a page fault or cache miss, so it's only a rule of thumb.

In real-world code the best approach is to ignore inlining completely at first; then, when profiling shows that the calling overhead of a function is a performance problem, try inlining it.

Quote:3) I know that it is at the discretion of the compiler whether to honor an inline request or not, but approximately how many statements is the maximum length of a function for it to even be considered for inlining?

It completely depends on the compiler and settings. I don't think the number of statements is what matters, but rather how long the resulting assembly code is and what effect it would have on other concerns like register allocation. You can compare this with Microsoft's .NET JIT compiler, which will only inline functions that are 32 bytes or fewer of CIL (a kind of bytecode used for .NET); usually C++ compilers are smarter than this since they can afford to use more time for compilation.
Quote:Original post by Skizz
However, inline functions can only be inlined if they are defined in the current translation unit, regardless of the inline keyword.

That's a common myth, but not quite true. Inlining can easily happen at link-time too. For a good overview of inlining and where it can happen see Herb Sutter's article Inline Redux.

Quote:Original post by CTar
Quote:Original post by Skizz
However, inline functions can only be inlined if they are defined in the current translation unit, regardless of the inline keyword.

That's a common myth, but not quite true. Inlining can easily happen at link-time too. For a good overview of inlining and where it can happen see Herb Sutter's article Inline Redux.


Good article, that. I was aware of link-time code generation but I wonder how common it is. Does gcc implement it? I'm not familiar with the latest gcc, although a quick search seems to suggest there's some form of support, or at least talk of support, for it. Also, this type of inlining is not defined as part of the language specification (because linking can link object files generated by other languages and is platform dependent), so the implementation is not guaranteed.

Perhaps I should have said: "inline functions can only be inlined by the compiler if they are defined in the current translation unit".

Skizz
Quote:Original post by Skizz
I was aware of the link time code generation but I wonder how common it is? Does gcc implement it?

I'm not sure about GCC, but VC++ has supported it at least since 7.0, so I expect GCC and Intel C++ support it too.

Quote:Also, this type of inlining is not defined as part of the language specification (because linking can link object files generated by other languages and is platform dependent), so the implementation is not guaranteed.

No part of inlining is defined by the language. The standard just says that a function with the inline specifier has to be defined in every translation unit that uses it; it does not say whether such a function will be inlined, or whether non-inline functions will be. The C++ standard defines a kind of function known as an inline function, but it makes no guarantees at all, so you can never depend on the compiler to do or not to do inlining. A compiler could choose to do no inlining at all, or to inline all functions, if it wanted to.

Quote:Perhaps I should have said: "inline functions can only be inlined by the compiler if they are defined in the current translation unit".

By definition inline functions have to be defined in every translation unit they are used in, so it doesn't really matter.
Quote:Original post by CTar
Why would a friendship be appropriate here? In general you should try to avoid friend classes as much as possible since it breaks encapsulation. A's public interface should be good enough for B to use, if not then A's interface isn't well-designed.


C++ FAQ Lite - Do friends violate encapsulation?

I agree there are cases where using friend functions can be wrong, but using them does not necessarily break encapsulation, nor is it necessarily bad design.

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

Quote:Original post by Rattrap
Quote:Original post by CTar
Why would a friendship be appropriate here? In general you should try to avoid friend classes as much as possible since it breaks encapsulation. A's public interface should be good enough for B to use, if not then A's interface isn't well-designed.


C++ FAQ Lite - Do friends violate encapsulation?

I agree there are cases where using friend functions can be wrong, but using them does not necessarily break encapsulation, nor is it necessarily bad design.


While I generally think C++ FAQ Lite is an excellent resource, I somewhat disagree here. They start out saying "You often need to split a class in half when the two halves will have different numbers of instances or different lifetimes."
If a class has a single responsibility (as it should) then I see no reason for that class to be split into two halves, and therefore I consider the statement false. Splitting might be used to enhance encapsulation in a poorly designed application, but in a well-designed application you should never need to split classes in half. If you are refactoring an existing system and you find a class that has n responsibilities, then you split it up into n distinct classes, but you design them such that they don't need access to each other's internals. So personally I disagree that friend classes are a good idea here.

I agree with the second part though, which states "Similarly, if you use friend functions as a syntactic variant of a class's public access functions, they don't violate encapsulation any more than a member function violates encapsulation." That is why I was careful to mention friend classes and not friends in general. I guess I should have been a bit more explicit about it. Friend functions can enhance encapsulation, but I have yet to see a good use of friend classes that enhances encapsulation. Scott Meyers has actually written a good article about how friend functions enhance encapsulation here.
Your inlining questions are good, but unfortunately I don't know the answers ... I do however believe the main benefit of inlining comes when the function is extremely small (say 2-20 machine instructions long), or when the running program has code cache locality problems (i.e. over small units of time the code being executed is located in too many places for all of it to stay in L1 cache most of the time) - in this case, if the inlined version of the function fits inside a single cache line (back to reason 1) it is ALWAYS better than the non-inlined version, before even taking into account function call overhead. But I've never had to optimize so much that I dug into the actual numbers for such things. I imagine if I were writing a modern program for a PSP, PS2, Wii, cell phone or PDA I would probably look into the machine-specific details of this question. The generic C++ answer is mostly useless though, as what counts as "big" is platform specific ... so standard C++ code meant to run on a cell phone has completely different optimization considerations from code meant to run on a desktop computer.
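To make the "syntactic variant" point concrete, here is a minimal sketch using a hypothetical Money class (not something from the FAQ or from this thread), assumed to hold a non-negative number of cents. The two operators are friends, but the class itself declares them and they act as part of its public interface, much as member functions would; to_text is just a hypothetical helper for turning the streamed output into a string.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Money
{
public:
    // Assumes a non-negative amount, given in cents.
    explicit Money(long cents) : m_cents(cents) {}

    // Friend functions declared by the class itself: they are part of its
    // interface and read m_cents the same way a member function would.
    friend bool operator==(const Money& lhs, const Money& rhs)
    {
        return lhs.m_cents == rhs.m_cents;
    }

    friend std::ostream& operator<<(std::ostream& os, const Money& m)
    {
        const long units = m.m_cents / 100;
        const long cents = m.m_cents % 100;
        return os << units << '.' << (cents < 10 ? "0" : "") << cents;
    }

private:
    long m_cents;
};

// Hypothetical helper: formats a Money value through operator<<.
std::string to_text(const Money& m)
{
    std::ostringstream out;
    out << m;
    return out.str();
}
```

Nothing outside Money and its two declared operators touches m_cents, which is the sense in which such friends do not weaken encapsulation.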

Now question 1 .... this really scares me. I think you have a fundamentally skewed / incomplete view of what OO, encapsulation and data hiding are really about.

Encapsulation is an organizational structure about collecting the methods for working on certain "types" (shapes / schemas / logical meanings) of data together with the definition of the "type" (shape) itself. Basically this is the simplest OO idea: it takes the pre-C++ practice in C of putting struct "abc" with functions "abc_diff", "abc_clear" and "abc_add" all inside "ABC.c", and formalizes and cleans it up into class "abc" with member methods "diff", "clear" and "add". Nothing fancy here, purely a concept for organizing program artifacts.
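The before and after might look roughly like this. The names abc, abc_clear, abc_add, clear and add come from the description above; the single total field, and the spelling Abc for the class (so that both versions can sit in one file), are assumptions made to keep the sketch short.

```cpp
// C style: a plain struct plus free functions tied to it by naming convention.
struct abc
{
    int total;
};

void abc_clear(abc* self) { self->total = 0; }
void abc_add(abc* self, int amount) { self->total += amount; }

// C++ style: the same operations collected inside the type as member functions.
// The data is still public here: this is encapsulation without data hiding.
class Abc
{
public:
    Abc() : total(0) {}

    void clear() { total = 0; }
    void add(int amount) { total += amount; }

    int total;
};
```

The behavior is identical in both versions; only the organization changes, which is the point being made about encapsulation on its own.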

Data Hiding - this is the level past Encapsulation, and it only makes sense if you are using encapsulation. Once you have "encapsulated" the code that is "supposed" to operate on a certain shape of data, you can go further and declare that not only is it "the" code to manipulate the data, it is the ONLY code allowed to manipulate the data (directly). This is data hiding. All it is, is a way of saying: here are the methods I trust to modify this data in acceptable ways, and to do everything I require them to do ... so all code anywhere in the program that wants to manipulate this data must do so through these methods (and only these) ... therefore the data is hidden (from DIRECT access).

The reason this exists is not to hide knowledge or information (data), contrary to what it might seem. It is to allow the guaranteed enforcement of rules relating to the data (for instance, if you have a fraction class you can require that the denominator never be 0). Prior to data hiding, it was always possible (and easy, and hard to detect) for other code in the program to erroneously change the values in a struct while forgetting to do something it was supposed to do: for instance calling Update() after a modification, validating the values before writing them, or setting the dirty flag to true.
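The fraction example can be sketched as follows. The class name Fraction, its member names, and the choice to report a violation by throwing std::invalid_argument are all assumptions made for illustration; the only rule taken from the text is that the denominator may never be 0.

```cpp
#include <stdexcept>

class Fraction
{
public:
    Fraction(int numerator, int denominator)
        : m_numerator(numerator), m_denominator(denominator)
    {
        check(denominator);
    }

    int numerator() const { return m_numerator; }
    int denominator() const { return m_denominator; }

    // The only way to change the denominator, so the rule is always checked.
    void set_denominator(int denominator)
    {
        check(denominator);
        m_denominator = denominator;
    }

private:
    // Enforces the class invariant: the denominator is never zero.
    static void check(int denominator)
    {
        if (denominator == 0)
            throw std::invalid_argument("denominator must not be zero");
    }

    // Hidden from direct access, so no outside code can bypass check().
    int m_numerator;
    int m_denominator;
};
```

Because m_denominator is private, a call such as set_denominator(0) is rejected and the object keeps its previous, valid value; with a plain struct nothing would stop another part of the program from writing 0 directly.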

So in your example ... class A is a self-contained class, completely and totally independent of class B (since it has no references to class B), no different than the std::string class or my NPC class. As such, class A should use the principles of object-oriented programming to expose any and all VALID access to its internal members, while not exposing any methods which would violate its core rules (class invariants) ... for the simplest class (like a point) there are no rules to enforce (every range of values is valid, so direct access could be given to internal members) ... so the main use of a point class is its encapsulation, not data hiding. But for, say, a CreditCardAccount class, it might be helpful for the class to verify that any assigned values are at least in proper ranges.

Now then, class B has a list of class A instances ... good. This should be no different than having a list of ints, strings, or TextBoxes ... the writer of those classes doesn't know class B exists ... but class B knows about them. Class B knows not only how they CAN be used, but how it actually WANTS to use them (i.e. just because a float can be almost any number doesn't mean class B's list of floats can hold any number ... they might be grades that should only be between 0 and 100, or maybe 0 and 110 if the teacher offers bonus points). So while class A dictates the rules for dealing with any and all instances of type A in the program, class B has the responsibility to control access to ITS list of type A objects: to expose appropriate methods and enforce appropriate rules.
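The grades example might be sketched like this, with a hypothetical GradeBook class playing the role of class B and float playing the role of class A. The class and member names, the fixed 0 to 100 range, and the choice to signal a rejected value by returning false are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

class GradeBook
{
public:
    // A float can hold almost any number; this container of floats accepts
    // only grades from 0 to 100. Returns false and stores nothing otherwise.
    bool add_grade(float grade)
    {
        if (!(grade >= 0.0f && grade <= 100.0f))
            return false;
        m_grades.push_back(grade);
        return true;
    }

    std::size_t count() const { return m_grades.size(); }

    // Average of the stored grades, or 0 if there are none.
    float average() const
    {
        if (m_grades.empty())
            return 0.0f;
        float sum = 0.0f;
        for (std::size_t i = 0; i < m_grades.size(); ++i)
            sum += m_grades[i];
        return sum / static_cast<float>(m_grades.size());
    }

private:
    // The vector itself is not exposed, so every insertion goes through
    // add_grade and the range rule cannot be bypassed.
    std::vector<float> m_grades;
};
```

The element type knows nothing about GradeBook; the extra rule lives entirely in the class that owns the list, and no friend declarations are involved.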

If class B is trusting, it may give the rest of the program unprotected direct access to its vector<A> objects or whatever. But in more complex programs it would probably go to the trouble of exposing custom functions for everything people could legally do with its A objects ... you know, those kinda annoying, but often necessary, wrapper-type methods. All the decisions at the level of class B (while writing B) are exactly the same as they were while writing A ... but the context is different, so the final outcome is different.

For instance my DateTextBox class requires entries to be valid dates, but my DateRangeControl requires that the first date be earlier than the second date AND interprets empty in the first box to be Min and empty in the second box to be max ... and my StandardLogQueryControl has a DateRangeControl and other internal controls, that it uses as needed to build the corresponding SQL Query string.

Each control is completely ignorant of the control(s) that use it, except in one sense ... requirements. When I wrote the DateTextBox control, it had to be useful enough for the needs of the DateRangeControl (therefore it had the option to support being unset). When I wrote the LogQuery control, I needed events when the date range changed .... so I went back to the DateRangeControl and added a DateRangeChanged event ... based on the internal text boxes ... these features could have been designed right up front ... but every piece of code gets its correctness and value from its purpose (intended use), not some theoretical idea of "right" ... and as such, we are typically constantly changing even little controls to do "more" that we didn't need until just now.

This topic is closed to new replies.
