"Prefer writing nonmember nonfriend funcs" - when to stop?

Started by
17 comments, last by mrbastard 18 years, 4 months ago
I've been reading up on various software engineering techniques, especially refactoring. I've also been reading Herb Sutter's GotW articles in an effort to become better at both general design and, more importantly, efficient C++ design. One of Herb's maxims is "Prefer writing nonmember nonfriend functions wherever possible". I can see this is good advice as it encourages component reuse and modularity. Thing is though, most member functions can be written as nonmember nonfriend by moving member variables to parameters. Taken to its extreme, this maxim will produce lots of nonmember nonfriend functions and lots of PoD classes with little functionality (or with member functions just wrapping nonmember nonfriend functions). Is this a bad thing? It seems counterintuitive from both OO design and efficiency points of view. I'd like to ask some of the more experienced coders here if they have any rules of thumb for this kind of thing. Should Herb's advice be ignored until the point you are in danger of producing a monolithic class?
That kind of seems to take away the point of C++. Making functions non-member is just moving back to C. Surely the whole idea of reuse in C++ is to make reusable classes as small components that can be 'plugged in' as needed?
Quote: most member functions can be written as nonmember nonfriend by moving member variables to parameters


If they really can be, then they shouldn't be fields. If you read his detailed explanation of this principle, what he's getting at is that a class should provide only the basic 'first-principle' methods required to manipulate it: these are the methods that absolutely require access to private data and code in order to work. If a method can be expressed in terms of other methods in the class, it is a candidate for being moved to a non-member, non-friend function.

For example:
The C++ ostream class has many << operators for dealing with all the primitive types (int, float, double, etc...). However, given a method to output a single byte, all other primitives can be output in terms of that method. So technically the insertion operators for all types but unsigned char could be converted to non-member, non-friend functions.

The purpose of this principle is to keep the number of methods that are changing private data to a minimum. By doing this it's easier to ensure that there's no way to break the invariants the class is trying to achieve.
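
A rough sketch of the idea, in case it helps. The SortedInts class and the contains and largest helpers are made up for illustration and are not taken from Sutter's article. Only insert needs private access to keep the data sorted, so the other operations are written against the public interface and cannot break the invariant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, for illustration only.
// Invariant: values_ is always sorted in ascending order.
class SortedInts {
public:
    // Primitive operation: it must touch values_ directly to keep it sorted.
    void insert(int v) {
        auto pos = std::lower_bound(values_.begin(), values_.end(), v);
        values_.insert(pos, v);
    }

    std::size_t size() const { return values_.size(); }

    int at(std::size_t i) const {
        assert(i < values_.size());
        return values_[i];
    }

private:
    std::vector<int> values_;
};

// Nonmember, nonfriend: binary search using only size() and at().
bool contains(const SortedInts& s, int v) {
    std::size_t lo = 0;
    std::size_t hi = s.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (s.at(mid) < v) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return lo < s.size() && s.at(lo) == v;
}

// Nonmember, nonfriend: relies on the sorted invariant but cannot violate it.
int largest(const SortedInts& s) {
    assert(s.size() > 0);
    return s.at(s.size() - 1);
}
```

If the storage later changes from a vector to something else, only insert, size and at need to be revisited; contains and largest keep working unchanged.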

Did that help any?
Another, language-agnostic take on it.
Thanks for the replies everyone.

d000hg: Absolutely. My point is that classes can be written as collections of non-class components, which theoretically makes those components slightly more reusable. However, this makes them wrappers for C style code, which as we agree seems counter to good OO design. Seeing as Sutter is regarded as one of the foremost experts on C++, I felt sure I was misinterpreting him somehow.

Morbo: Thanks, that didn't directly help but made me think that perhaps what's confusing me is only partially related. To clarify, I didn't really mean that the member fields should be moved, but that the nm-nf functions could be called within the class with the private fields as parameters. Again, this produces horrible looking wrapper code if overused, but the wrapped code is technically more reusable as it is factored out as far as possible. Whether this has any immediate use is questionable, but it does mean that the nm/nf function can be used in more than one class. The downside is passing more parameters, though if const references can be used it doesn't have to mean a loss of efficiency.

Overall, I think you're trying to tell me that I've over generalised Sutter's guideline.

Zahlman: thanks for that. I should have known there'd be something related on the c2 wiki.

All:

The more I think about it, the more I believe it's possible that the thing that's confusing me is only partially related to Sutter's advice, and the two have just become uselessly intertwined in my mental model because I'm trying to apply new principles in a small area. Maybe I'm being blinkered by their implications in that small area, even though that area is not necessarily representative of OO design in general.

Suppose we have a class that holds a pointer (or smart pointer) to some data. The main purpose of the class is to iterate over that data with a given 'stride', perform some transformation on it, and return the transformed data. It will require functions to ensure the stride value and number of strides don't overrun the data. As this is a simple mathematical operation, it seems a good candidate for nm/nf - even though it operates on member variables, those variables can be passed to the nm/nf function as parameters.
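
To make that concrete, here is a minimal sketch of the arrangement. The names strides_fit, StridedView, can_read and at are all invented for this post, and I'm assuming the i-th element read is data[i * stride]. The bounds check is a free function on plain values, and the class just forwards its private fields to it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical nm/nf helper: true when `count` reads at indices
// 0, stride, 2*stride, ... all land inside a buffer of `length` elements.
bool strides_fit(std::size_t length, std::size_t stride, std::size_t count) {
    if (count == 0) {
        return true;  // nothing is read
    }
    if (length == 0) {
        return false;  // at least index 0 is read
    }
    if (stride == 0) {
        return true;  // every read hits index 0
    }
    // Last index is (count - 1) * stride; divide instead of multiplying
    // so the check itself cannot overflow.
    return (count - 1) <= (length - 1) / stride;
}

// Hypothetical class that passes its members to the free function.
class StridedView {
public:
    StridedView(const float* data, std::size_t length, std::size_t stride)
        : data_(data), length_(length), stride_(stride) {
        assert(data != nullptr || length == 0);
    }

    // Thin wrapper: the logic lives in strides_fit.
    bool can_read(std::size_t count) const {
        return strides_fit(length_, stride_, count);
    }

    // Returns the i-th strided element.
    float at(std::size_t i) const {
        assert(can_read(i + 1));
        return data_[i * stride_];
    }

private:
    const float* data_;
    std::size_t length_;
    std::size_t stride_;
};
```

The free function can be tested on its own and reused by any other class with the same layout, which is the reuse argument; the cost is the wrapper member.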

Now, you are probably thinking that the entire class should actually be a function or functor, but in that case it needs a long parameter list. The standard way to solve that is to make a parameter struct holding all those parameters and pass a single one of those by reference. Now we end up with a nm/nf function / functor and a PoD parameter struct.... which leads us back to C style code.

Is it better to factor it all out into nm/nf and PoDs, or better to put it all in a class? At one extreme we could have silliness like a_float.Sin(). At the other extreme we could have AFunction(PoDtype arg), AnotherFunction(PoDtype arg) which is just as silly.

Obviously neither extreme is correct and there's a huge grey area here. I'm sure which end of that grey area a particular design fits in can be judged with experience (and probably personal taste), which is why I am so interested in hearing experienced advice.
The C++ STL might be a good example of non-member, non-friend functions being used in a way that makes the code much more reliable, and possibly directly applicable to your example of a container. If your container could be written to provide iterators, you could use the STL algorithms package to apply the functor to the elements the iterators iterate over. Correctly writing all of this stuff can be a bit difficult of course.

What is silly about having lots of functions that all take the same type as their first argument? This only seems silly to people who are too used to the language doing it for them.
The argument, I dare say, is that in C++, there's no point making a class for that at all. At the most complicated, make an iterator class that holds a smart pointer to the data and represents some complex iteration process (possibly as an iterator adaptor; see reverse_iterator), and then write nm/nf stuff a la <algorithm> header. C++ is a multi-paradigm language, remember.

I'm not sure I see why you'd need terribly many parameters for that. Post the code maybe and we could offer suggestions?
Quote: Original post by Anonymous Poster
What is silly about having lots of functions that all take the same type as their first argument? This only seems silly to people who are too used to the language doing it for them.


I beg to differ. I've been using Python for months and all the self-business still annoys me to no end, whereas I got used to the indentation policy almost immediately. (My other complaints with the language are generally little things - most recently, I've been bitten by "AttributeError: 'tuple' object has no attribute 'index'", which is silly because lists provide it, and the operation clearly respects constness.)
The difference between a POD and a class is that a class enforces rules (commonly called invariants) on its data, whereas a POD is simply a container for data. A class also emphasizes functionality over implementation: don't care how it's done or stored in memory, just care what you can do with it (which is also closely related to data hiding).

For example, a string is usually stored as just a char*. If you just created a struct with a pointer and passed it around, anyone would be able to break your string by deleting the array, changing the pointer, or replacing the null terminator with a 1. This reveals the fact that a string is NOT just a character array, it's an array with certain constraints: mainly that it's null terminated and always contains valid data.

So the job of a string class is to ensure these constraints are always met and no one can mess with them. It does this by hiding the char* and providing methods used to safely manipulate and access the string. Ideally these methods should be comprehensive yet primitive: setting the string, appending/inserting a substring, removing a substring, etc...

In short, PODs are fragile and hard to modify, while classes hide data and provide methods in order to prevent breaking the data and allow you to vary how it's implemented.
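
Here is a small sketch of what I mean. SimpleString and starts_with are hypothetical names for this post, not a real library, and I've used a std::vector<char> for storage just to keep it short. The class exposes a few primitive operations that preserve the null terminator, and anything that can be built from those stays outside.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical minimal string class.
// Invariant: buf_ is never empty and its last element is '\0'.
class SimpleString {
public:
    SimpleString() : buf_(1, '\0') {}

    // s must point to a null-terminated string.
    explicit SimpleString(const char* s)
        : buf_(s, s + std::strlen(s) + 1) {}

    // Primitive: inserts before the terminator so the invariant holds.
    void append(const char* s) {
        assert(s != nullptr);
        // Copy first so that appending our own c_str() is safe.
        std::vector<char> tmp(s, s + std::strlen(s));
        buf_.insert(buf_.end() - 1, tmp.begin(), tmp.end());
    }

    std::size_t length() const { return buf_.size() - 1; }

    // Read-only access: callers cannot overwrite the terminator.
    const char* c_str() const { return buf_.data(); }

private:
    std::vector<char> buf_;
};

// Nonmember, nonfriend: needs nothing beyond the public interface.
bool starts_with(const SimpleString& s, const char* prefix) {
    return std::strncmp(s.c_str(), prefix, std::strlen(prefix)) == 0;
}
```

With a bare struct holding a char*, any caller could overwrite the terminator; here the only way to change the buffer is through append, which keeps it in place.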
i remember having read a comment by stroustrup once that basically stated he makes this decision of member/non-member according to whether the invariant of the class/struct is being changed or not. i find this very helpful when designing. basically whenever a method is not changing the data that constitutes the very meaning of the structure you could and should think about transforming this method into a non-member function. when i find the source i have this from i'll post it here.

cheers,
simon

This topic is closed to new replies.