Isn't encapsulation in C++ just a theory?

Started by
9 comments, last by JD 18 years, 5 months ago
OK, a better topic title would be: why do you have to sacrifice performance to get real encapsulation? Say I made a C++ abstraction over the C dirent.h header. It works like this:

browser.h:

class Folder
{

 public:
    Folder(const char* dirname);
    ~Folder();
    
    bool Open();
    void Close();
    bool Reset();
    const char* NextSubFolder();
    const char* NextFile();
    const char* NextFileOfType(const char* ext);

 
};



Well, yes, but to declare the private members you have to place them in the class declaration itself, even though it lives in a .h file:

class Folder
{
 private:
    DIR* D;
    char dirname[FILENAME_MAX];

 public:
    Folder(const char* dirname);
    ~Folder();
    
    bool Open();
    void Close();
    bool Reset();
    const char* NextSubFolder();
    const char* NextFile();
    const char* NextFileOfType(const char* ext);

 
};



But now we are giving a lot of information to the user. In this case I am writing it for my own use, but when you are writing for other users you wouldn't want to show the private members in a header, mostly because that is the first step towards not having private members at all. Someone can copy and paste the class declaration, make all the members public, give the methods empty bodies, and then typecast a pointer to an instance of the first class into a pointer to the second class, change the attributes, and typecast back.

It is a problem for me in this case because I seriously wanted to encapsulate dirent.h and not have it included in the header file, but I was forced to include it because of the DIR* member. Most likely not a lot of people like to show their private members. There are workarounds, though. The first one I found is in this site's article section, and it is to have two classes:

.h file:

class RFolder; // forward declaration; the full definition lives in the .cpp file

class Folder
{
 private:
    RFolder* rf;

 public:
    Folder(const char* dirname);
    ~Folder();
    
    bool Open();
    void Close();
    bool Reset();
    const char* NextSubFolder();
    const char* NextFile();
    const char* NextFileOfType(const char* ext);

 
};



So Folder's functions would just call RFolder's methods. That works, but doesn't dereferencing all those pointers cost performance? Not to mention that your class becomes just a wrapper. The other way I thought of is to make the class abstract, with virtual functions:

.h file:

class Folder
{
 public:
    static Folder* NewFolder();
    virtual ~Folder()=0;
    
    virtual bool Open()=0;
    virtual void Close()=0;
    virtual bool Reset()=0;
    virtual const char* NextSubFolder()=0;
    virtual const char* NextFile()=0;
    virtual const char* NextFileOfType(const char* ext)=0;

 
};



Then the actual class in the .cpp file inherits from Folder and implements the methods. NewFolder() is required because otherwise you would not be able to construct a Folder, since it is abstract. It would work, but I guess it would cost about the same as the pointer solution, only easier to implement? Are there other ways? Edit: No, "use boost::filesystem" is not the kind of answer I am looking for. [Edited by - Vexorian on December 4, 2005 6:15:11 AM]
------ XYE - A new edition of the classic Kye
I don't agree that writing private members into the header file is a break of encapsulation just because someone can see them when reading the header. In fact, there is always a way to hack your app and look at what it does.

Slightly off topic: I don't recommend reading header files to find out how to use a class anyway. I recommend documentation generation tools instead (e.g. doxygen for Linux). I'm sure such tools also exist for Windows. They give nicely formatted HTML output and can be configured to show only the non-private parts.


If DIR is to be allocated by you (and not by the OS), you could try something like this. In the header file:
class Folder {
   class _ImplPrivate;
   _ImplPrivate* _privateData;
public:
   // your interface as usual
};

And in the implementation file:
class Folder::_ImplPrivate
:   public DIR {
};

So a reader can see that something hidden exists, but s/he doesn't know its exact type. Also, accessing _privateData is as fast as accessing a DIR* directly. Moreover, it may help when compiling for different OSes, since the actual type is defined only in the implementation file. (I use this approach e.g. for encapsulating pthread mutexes.)
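To make the nested-class idea concrete, here is a minimal sketch, assuming C++11. The names Impl and impl_, and the fixed list of file names standing in for a real DIR*, are illustrative assumptions and not the poster's actual code. The header part exposes only a pointer to an incomplete type; everything else stays in the implementation file.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// ---- folder.h (what users see): no dirent.h, one opaque pointer ----
class Folder {
public:
    explicit Folder(const char* dirname);
    ~Folder();                                  // defined where Impl is complete
    Folder(const Folder&) = delete;
    Folder& operator=(const Folder&) = delete;

    const std::string& Name() const;
    const char* NextFile();                     // nullptr when exhausted
    void Reset();

private:
    class Impl;                                 // declared here, defined in the .cpp
    std::unique_ptr<Impl> impl_;
};

// ---- folder.cpp (hidden from users) ----
// Assumption: a real version would hold a DIR* from dirent.h here. A fixed
// list of names stands in for it so the sketch is self-contained.
class Folder::Impl {
public:
    explicit Impl(const char* name) : dirname(name), next(0) {
        files.push_back("a.txt");
        files.push_back("b.txt");
    }
    std::string dirname;
    std::vector<std::string> files;
    std::size_t next;
};

Folder::Folder(const char* dirname) {
    assert(dirname != nullptr);
    impl_.reset(new Impl(dirname));
}

Folder::~Folder() = default;

const std::string& Folder::Name() const { return impl_->dirname; }

const char* Folder::NextFile() {
    if (impl_->next >= impl_->files.size()) return nullptr;
    return impl_->files[impl_->next++].c_str();
}

void Folder::Reset() { impl_->next = 0; }
```

Each call costs one extra pointer dereference compared with storing the members directly, which matches the point above that it is no slower than going through a DIR*.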


If you only want to drop the inclusion of the OS header file containing DIR, you could also use a forward declaration and include the header only in the implementation file (which does not actually hide the type, of course, but reduces overall compilation time).


Other solutions like the bridge pattern are, of course, not as efficient. Plain inheritance is also not as efficient, although I don't think that invoking virtual functions will make a measurable difference, since we're talking about accessing the file system, which is not fast to begin with.

For solutions like the bridge pattern and inheritance you may need some factories (e.g. static methods of the Folder class), but IMHO that is not really a problem either.

EDIT: @Illco "doxygen also for Windows": Good to know :-)
Quote:
I recommend to use documentation generation tools (like e.g. doxygen for linux). I'm sure such stuff also exists for windows.

Like Doxygen. It also works for Windows.

Illco
Hi Vexorian,

RE:
"Ok, well the topic title, I would replace it to why is that to have real encapsulation you have to sacrifice performance?"

Well, the assumption that you sacrifice performance when you have real encapsulation is not really true if the performance does not suffer.

Doesn't that sound terribly obvious? Sorry.

You could go into the fine details of testing performance differences and analysing them (theoretically and/or practically), but ultimately you will find the answers (and more than you expected) when you actually try implementing your preferred solution.

I am here to tell you that using a pure abstract base class similar to that described in your second "virtual" example is the first option you should try.

Here is an example to support this:

I have a 3D file viewer that imports/exports a few 3D model file formats (3ds, obj, ms3d, x). And now I am building a plugin system that uses DLLs to import/export the files. So far I have an ms3d plugin implemented that replaces the equivalent code in the exe. I used a pure virtual abstract base class to hide information from the plugin developer, and provided all the data structures the developer needs to import a file format. To my surprise, I have noticed no performance difference whatsoever between the plugin system and the older hard-coded executable version.

RE:
"Then the actual class in the cpp inherits from Folder and implements the methods, the NewFolder() was required because you wouldn't have been able to construct Folder otherwise because of the virtual stuff."

static Folder* NewFolder();

From my understanding you will not be able to construct Folder anyway: a pure abstract base class cannot be instantiated. You construct its derived class, thus NewFolder() is not required.

So, give it a try!

ps. you can take a look at the fileviewer @

http://members.optusnet.com.au/~skatic/index.htm

The plugin version will not be there yet however.

Steve Katic

The issue is one of protection vs Murphy (accidents) or protection vs Machiavelli (abuse). C++'s standard encapsulation provides the former. The latter is extremely difficult to protect against and increasing levels of security generally cost increasing amounts of performance. Even pImpl (the technique you described) is not full protection against Machiavelli. Fortunately protection against Murphy is usually all that is required. This article may be of interest.

Enigma
Quote:Original post by Enigma
The issue is one of protection vs Murphy (accidents) or protection vs Machiavelli (abuse). C++'s standard encapsulation provides the former.


And a manager who knows when to fire someone provides the latter.

Just about the only other reason to want to "encapsulate" headers would be to prevent global namespace/macro cluttering. This is the point I laugh and say "You're programming in C++ and want a tidy global namespace?!?" and suggest ample application of #undef... or appropriate API-specific #define to disable more in one shot.
I agree that it is annoying that private variables are declared in the interface. Their presence requires you to include all the baggage associated with them even though the user of the class has no need for it. I frequently run into the case where a header file has to include windows.h even though nothing about the class's public interface requires the Win32 API.

Fortunately, there is a partial solution: forward declarations. In the OP's example, using the forward declaration "struct DIR;" (or however it is actually declared) removes the requirement of including dirent.h.
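A minimal sketch of the forward-declaration approach, under stated assumptions: NativeDir is a hypothetical stand-in for the OS type, because whether a plain "struct DIR;" compiles depends on how the platform declares DIR (on some platforms it is a typedef, in which case the declaration may not match). The header needs only the type's name; the full definition is visible only in the implementation file.

```cpp
#include <cassert>

// ---- folder.h: no OS header needed, only the type's name ----
struct NativeDir;   // hypothetical stand-in for DIR; forward declaration only

class Folder {
public:
    Folder();
    ~Folder();
    Folder(const Folder&) = delete;
    Folder& operator=(const Folder&) = delete;

    bool Open();
    void Close();
    bool IsOpen() const;

private:
    NativeDir* d_;   // a pointer to an incomplete type is allowed
};

// ---- folder.cpp: the only place that needs the full definition ----
struct NativeDir { int entries_read; };   // would come from the OS header instead

Folder::Folder() : d_(nullptr) {}
Folder::~Folder() { Close(); }

bool Folder::Open() {
    assert(d_ == nullptr && "Open() called on an already open Folder");
    d_ = new NativeDir{0};                // stands in for opendir()
    return d_ != nullptr;
}

void Folder::Close() {
    delete d_;                            // stands in for closedir()
    d_ = nullptr;
}

bool Folder::IsOpen() const { return d_ != nullptr; }
```

The member is still visible to anyone reading the header, so this removes the include dependency rather than hiding anything.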
John Bolton, Locomotive Games (THQ). Current Project: Destroy All Humans (Wii). IN STORES NOW!
Encapsulation is about accessibility, not visibility. It's OK to know that an object has a particular member, but you have to have an instance of that object to access it.

On a modern system the performance impact is hard to even detect. Intel added addressing modes to the i386 or i486 that perform an indirection and indexing. At that time I think it cost you a few clock cycles, but today, with pipelined architectures, I do not think it costs more than a load. You could look at the Intel ISR and developer notes and check the cycle times on the different addressing modes.


To answer the title question though, encapsulation is definitely not a theory. Computer Science has the fortune of being closely related to mathematics, so we generally have theorems (proven) not theories. Software Engineering addresses some of the less rigorous aspects of development; there theories are rampant.
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara

Quote:
From my understanding you will not be able to construct Folder anyway: a pure abstract base class cannot be instantiated. You construct its derived class, thus NewFolder() is not required.


Exactly, and that's actually the reason I need NewFolder. The derived class would be in the .cpp file and no information about it would be available to the sources that include the header, so you wouldn't be able to call the constructor of the derived class directly.

Quote:
Folder* Folder::NewFolder()
{
return (new RealFolder);
}
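For completeness, a minimal sketch of how the abstract interface and the factory could fit together, assuming C++11. Returning std::unique_ptr instead of a raw Folder* is a choice made for this sketch, the Name() method is added only for illustration, and the fixed file names are an assumption standing in for real directory reads.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// ---- folder.h: the interface only, no data members at all ----
class Folder {
public:
    // Factory: the only way for users to obtain a Folder.
    static std::unique_ptr<Folder> NewFolder(const char* dirname);

    virtual ~Folder() {}
    virtual const char* Name() const = 0;
    virtual bool Open() = 0;
    virtual void Close() = 0;
    virtual const char* NextFile() = 0;   // nullptr when closed or exhausted
};

// ---- folder.cpp: the concrete class is invisible to users ----
namespace {

class RealFolder : public Folder {
public:
    explicit RealFolder(const char* dirname)
        : dirname_(dirname), open_(false), next_(0) {
        // Assumption: fixed names stand in for what readdir() would return.
        files_.push_back("a.txt");
        files_.push_back("b.txt");
    }
    const char* Name() const override { return dirname_.c_str(); }
    bool Open() override { open_ = true; next_ = 0; return true; }
    void Close() override { open_ = false; }
    const char* NextFile() override {
        if (!open_ || next_ >= files_.size()) return nullptr;
        return files_[next_++].c_str();
    }

private:
    std::string dirname_;
    bool open_;
    std::size_t next_;
    std::vector<std::string> files_;
};

}  // namespace

std::unique_ptr<Folder> Folder::NewFolder(const char* dirname) {
    assert(dirname != nullptr);
    return std::unique_ptr<Folder>(new RealFolder(dirname));
}
```

Because RealFolder sits in an unnamed namespace inside the .cpp file, code that includes only the header has no way to name it, which is exactly why the factory is needed.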


Anyway, great post, although you could have used quote tags instead of real quotation marks; it was kind of hard to read.

And in most cases the virtual calls wouldn't matter, but if you do the same for blitting, for example, the difference is kind of noticeable. With the latest technologies it shouldn't be as bad as before, though.

And perhaps there is no need to hide the private members from users, but it is still an annoyance when the private members require you to include another header, which increases your chances of running into include problems with redeclared names and forces you to fix them with #ifdefs and the like.

But oh well
When the compiler sees the declaration of a class, with its member functions and data members, it needs enough information to calculate the class's size. This means you have to put this information into your class's declaration (or a base class of it). Otherwise you could only forward declare the class, but not declare member functions.

If you care about your data being abused by applications you can use an abstract base as was mentioned. This way the application is equipped with an interface of functions to work on a block of bytes. The data has no visible structure. Virtual functions create a level of indirection. For almost any compiler this will use a vtable to lookup the virtual function resulting in a two-level indirection.

Another solution would be to do like an OS does. Don't give the application a pointer to the real data, but give it a handle to it instead. In your example your Folder class would manage memory internally and each instance of a Folder would have just an identifier visible in the header file.
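A minimal sketch of the handle idea, under stated assumptions: the names FolderHandle, OpenFolder, FolderName and CloseFolder are hypothetical, and the table stores only a directory name where a real version would keep its OS state. The header exposes an integer id and free functions; the table that maps ids to data lives in the implementation file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ---- folder.h: users only ever see an integer id ----
typedef int FolderHandle;                  // hypothetical opaque identifier

FolderHandle OpenFolder(const char* dirname);
const char* FolderName(FolderHandle h);    // nullptr for an invalid handle
bool CloseFolder(FolderHandle h);          // false for an invalid handle

// ---- folder.cpp: the real data lives in a private table ----
namespace {

struct FolderRecord {
    std::string dirname;
    bool in_use;
};

std::vector<FolderRecord>& Table() {
    static std::vector<FolderRecord> table;
    return table;
}

bool IsValid(FolderHandle h) {
    return h >= 0 &&
           static_cast<std::size_t>(h) < Table().size() &&
           Table()[static_cast<std::size_t>(h)].in_use;
}

}  // namespace

FolderHandle OpenFolder(const char* dirname) {
    assert(dirname != nullptr);
    std::vector<FolderRecord>& t = Table();
    // Reuse the first free slot, otherwise grow the table.
    for (std::size_t i = 0; i < t.size(); ++i) {
        if (!t[i].in_use) {
            t[i].dirname = dirname;
            t[i].in_use = true;
            return static_cast<FolderHandle>(i);
        }
    }
    t.push_back(FolderRecord{dirname, true});
    return static_cast<FolderHandle>(t.size() - 1);
}

const char* FolderName(FolderHandle h) {
    return IsValid(h) ? Table()[static_cast<std::size_t>(h)].dirname.c_str() : nullptr;
}

bool CloseFolder(FolderHandle h) {
    if (!IsValid(h)) return false;
    Table()[static_cast<std::size_t>(h)].in_use = false;
    return true;
}
```

Every call pays for a validity check and a table lookup. Because slots are reused, a stale handle can end up referring to a newer folder; a fuller design would probably add a generation counter to the id.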

Both methods outlined above add at least a single level of indirection, the first via virtual functions, the second via a handle. In both cases a table lookup is required. Compared to the work the function has to perform this should be negligible in most cases. If you really care about performance and the implementation of your functions is VERY short, you should consider inline functions. But in this case you have to declare the data in your Folder class anyway. (In your RFolder* workaround you can inline RFolder's functions into Folder's functions.)

There is a third method that cannot be recommended. You can declare a buffer in your public interface class (Folder) that is big enough to hold the real data of your implementation. Then your implementation only needs to perform a simple cast (to an internal structure) to read or write that data.
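For illustration only, a sketch of what that buffer approach might look like, assuming C++11. The 64-byte size is a guess, and the FolderData struct is a hypothetical stand-in for the real implementation data. The static_asserts turn a size or alignment mismatch into a compile error instead of memory corruption, which is a large part of why the technique is fragile.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// ---- folder.h: an opaque, suitably aligned byte buffer ----
class Folder {
public:
    explicit Folder(const char* dirname);
    ~Folder();
    Folder(const Folder&) = delete;
    Folder& operator=(const Folder&) = delete;

    const char* Name() const;

private:
    static const std::size_t kStorageSize = 64;   // guess; must cover the hidden struct
    alignas(std::max_align_t) unsigned char storage_[kStorageSize];
};

// ---- folder.cpp: the real layout is known only here ----
namespace {

struct FolderData {          // hypothetical; a real version might hold a DIR*
    std::string dirname;
    std::size_t next;
};

}  // namespace

Folder::Folder(const char* dirname) {
    static_assert(sizeof(FolderData) <= kStorageSize, "storage_ is too small");
    static_assert(alignof(FolderData) <= alignof(std::max_align_t),
                  "storage_ is under-aligned");
    assert(dirname != nullptr);
    new (storage_) FolderData{dirname, 0};        // construct inside the buffer
}

Folder::~Folder() {
    // Destroy the object that was placed in the buffer.
    reinterpret_cast<FolderData*>(storage_)->~FolderData();
}

const char* Folder::Name() const {
    // The "simple cast" described above. C++17 code would typically wrap
    // this in std::launder.
    return reinterpret_cast<const FolderData*>(storage_)->dirname.c_str();
}
```

There is no indirection and no heap allocation, but the header now has to be kept in sync by hand with a struct it cannot see, on every platform and compiler setting.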
