Jump to content

  • Log In with Google      Sign In   
  • Create Account


Opinion needed: returning values without null checks everywhere


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
22 replies to this topic

#1 Strewya   Members   -  Reputation: 1353

Like
0Likes
Like

Posted 13 May 2014 - 02:19 AM

Hi all.

 

Here at work, we have a class that stores non-POD objects by raw pointer, and has a get method that returns that raw pointer if the key was found (the storage is an std::map<key, value*> and the get method has a signature value* getThing(const key&); ). The objects are created elsewhere and simply stored in this class for ownership. A note to keep in mind is that c++ exceptions are disabled, and we're not using c++11 yet.

 

My issue with this is that everywhere the getThing method is called, we need to check for null, since if the value is not found, null is returned.

This annoys me to hell, since it bloats the code everywhere with null checks. I have a few ideas about changing it.

 

1. Add a 'bool exists(const key&)' method which does the obvious, and change the getThing to return a reference.

- The issue here is if someone calls getThing without checking if the value exists, then i don't know what to return? The .second value of the end iterator? It should cause a crash as soon as someone uses the reference, right?

- Since exceptions are disabled, i can't throw in case nothing was found, so i'm forced to always return something, and end.second seems like the only choice.

- Another problem with this is that the getThing method is const too, so i'm not sure if returning a plain reference will work (i remember it being a compilation error [in some cases]) or if the reference also has to be const, which makes it impossible to change the thing we get. A solution is to remove the const, but isn't nice.

 

2. Have the getThing method return a bool, and accept a double pointer or reference-to-pointer as an 'out' argument, so if the value is found, true is returned and the out argument is set to point to the found value.

- This seems ok and removes the need for a null check at the call site, but is again using pointers instead of references. I don't have anything against pointers, mind you, but i believe having references in the public interface of a class makes the code look cleaner since it's similar to the rest of it (using references everywhere instead of a mix of pointers and references, i find pointers fine for internals of specific classes).

 

I'd like to get others opinion on this, another perspective on how this could be done better and/or different. For all i know, maybe the current implementation is fine, and i just need to come to terms with it... Any suggestions are welcome.

Thanks!


devstropo.blogspot.com - Random stuff about my gamedev hobby


Sponsor:

#2 ApochPiQ   Moderators   -  Reputation: 15020

Like
7Likes
Like

Posted 13 May 2014 - 02:30 AM

Are assertions an option? That's what I would use: GetFoo() just asserts that the requested item exists, and you can use HasFoo() or whatever to check first if you want to avoid the assert.

#3 Felix Ungman   Members   -  Reputation: 1000

Like
0Likes
Like

Posted 13 May 2014 - 02:44 AM

Using std::map<key, value*> isn't always as convenient as you want to, because when you access it with operator[] you easily create bogus entries that will bite you later when you iterate the map. You can use find() instead, but the code won't be as pretty (but in general, code working with standard containers is not that pretty, you just have to cope with that fact).

 

About null values, they can be null for two reasons. Either (1) because of a bug in your code, and in that case use assert to make sure you catch the bug. Or (2) they actually represent the non-existence of something. If that's the case, using pointers and null checks is the most obvious way to do it.


openwar  - the real-time tactical war-game platform


#4 frob   Moderators   -  Reputation: 20191

Like
9Likes
Like

Posted 13 May 2014 - 02:47 AM

A NULL test is very nearly free on the x86 family. It usually is either one or two instructions, refers to objects likely in a register, and likely uses a local jump already loaded from memory.

Sanity checks are important in code. It seems that NULL is a legitimate value; you want to get something, and if the thing cannot be found you need a sane response. That response is NULL.

Your second option really does nothing different from the first. Instead of testing against a NULL value you are testing against a bool value. Either way a test takes place and the ultimate logic is the same. The object is not found the code needs to handle it.

Use pointers when NULL is a potential legitimate values. Use references when a value cannot ever be NULL and there must always be an object. If you could guarantee that the object always exists, return a reference. But since you cannot (it might not be in the map, and the map itself might store a NULL value) then just return the NULL and be done with it.

If you could build the code so the thing is always found, and you can guarantee that a valid object is always returned, then you can return a reference to the thing that is guaranteed to be there and eliminate the sanity check. But since that isn't the case, you are stuck with it.
Check out my personal indie blog at bryanwagstaff.com.

#5 NightCreature83   Crossbones+   -  Reputation: 2737

Like
0Likes
Like

Posted 13 May 2014 - 03:45 AM

Hi all.

 

Here at work, we have a class that stores non-POD objects by raw pointer, and has a get method that returns that raw pointer if the key was found (the storage is an std::map<key, value*> and the get method has a signature value* getThing(const key&); ). The objects are created elsewhere and simply stored in this class for ownership. A note to keep in mind is that c++ exceptions are disabled, and we're not using c++11 yet.

 

My issue with this is that everywhere the getThing method is called, we need to check for null, since if the value is not found, null is returned.

This annoys me to hell, since it bloats the code everywhere with null checks. I have a few ideas about changing it.

 

1. Add a 'bool exists(const key&)' method which does the obvious, and change the getThing to return a reference.

- The issue here is if someone calls getThing without checking if the value exists, then i don't know what to return? The .second value of the end iterator? It should cause a crash as soon as someone uses the reference, right?

- Since exceptions are disabled, i can't throw in case nothing was found, so i'm forced to always return something, and end.second seems like the only choice.

- Another problem with this is that the getThing method is const too, so i'm not sure if returning a plain reference will work (i remember it being a compilation error [in some cases]) or if the reference also has to be const, which makes it impossible to change the thing we get. A solution is to remove the const, but isn't nice.

 

2. Have the getThing method return a bool, and accept a double pointer or reference-to-pointer as an 'out' argument, so if the value is found, true is returned and the out argument is set to point to the found value.

- This seems ok and removes the need for a null check at the call site, but is again using pointers instead of references. I don't have anything against pointers, mind you, but i believe having references in the public interface of a class makes the code look cleaner since it's similar to the rest of it (using references everywhere instead of a mix of pointers and references, i find pointers fine for internals of specific classes).

 

I'd like to get others opinion on this, another perspective on how this could be done better and/or different. For all i know, maybe the current implementation is fine, and i just need to come to terms with it... Any suggestions are welcome.

Thanks!

 

I prefer getting null pointers back from a cache to be honest, it's not up to the cache to decide how you want to use that object its the calling codes responsibility, it might actually be able to deal with non existing objects. I am currently working on a codebase that asserts when this happens which is even more annoying, libs and upstream code shouldn't care about downstream usage patterns.

 

Current pattern seems fine to me, you could add an exists method but is not needed, returning a pointer kind of indicates whether a succes or fail has happened in one go, get a null pointer your operation failed.

 

The other thing you could do is, if all downstream code deals with default value and that value is the same everywhere, is to store default in the map and return a pointer to that when it fails to grab the object you want.


Edited by NightCreature83, 13 May 2014 - 03:47 AM.

Worked on titles: CMR:DiRT2, DiRT 3, DiRT: Showdown, GRID 2, Mad Max

#6 dmatter   Crossbones+   -  Reputation: 3029

Like
0Likes
Like

Posted 13 May 2014 - 03:56 AM

Returning NULL isn't so bad if the referred-to object may not exist. Either that or use boost::optional, but there's hardly a difference.

 

I try to design around that kind of situation whenever possible, which really comes down to asking questions like:

 

How does client code manage to get hold keys that don't exist in the map?

What does client code do when it receives a NULL and is there any commonality that may be generalised?

 

 

For example, if the client code is not meant to have keys that are not in the map then the NULL response indicates a problem. In which case move the NULL check inside the getThing() function with an assert (or whatever) - you want to die hard and fast.

 

Another example, if every client does the same thing when receiving a NULL, such as they all call loadThing(), then it's possible to move all that behind a higher-level abstraction, one that auto-loads things that aren't in the map.



#7 swiftcoder   Senior Moderators   -  Reputation: 9853

Like
0Likes
Like

Posted 13 May 2014 - 06:24 AM

The alternative used in languages with a richer type system (Scala, Rust, etc) is 'optional types'. In brief, a variable of type Optional<T> can either contain Some<T> or None(), and the language semantics force you to explicitly handle *both* cases.

 

This can be simulated in C++ (well, C++11 at least) by having an accessor function that takes a lambda to be invoked if the data exists. There is an interesting investigation in this direction over on github, which might be worth a look (particularly the semantics of maybe_if and otherwise).


Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#8 BitMaster   Crossbones+   -  Reputation: 3883

Like
0Likes
Like

Posted 13 May 2014 - 06:40 AM

Wouldn't a boost::optional be a valid C++ equivalent even without C++11?

#9 Strewya   Members   -  Reputation: 1353

Like
0Likes
Like

Posted 13 May 2014 - 07:09 AM

Assertions are not an option, we don't have debug/release builds, there's just one build output which goes through 5 levels of testing (unit, component, function, system function, and manual exploratory test).

 

The keys either exist in the map or don't, together with their associated values. There cannot be a key mapped to NULL. The only thing a get does is do a .find() in the map, and returns null or it.second, depending on whether .find() returned end or not. The keys are supplied by command line, so it is entirely possible for someone to request a non existant key.

 

The only thing that is done if a null check fails is a trace log in the else block, which is done for each check in a 1-4 levels deep nested ifs that each check a different type, which leads to a lot of code duplication. But, this is another issue i'm working on.

 

What i was hoping for most is to somehow keep the client code entirely reference based, without having to deal with any pointers in it, but if that's not working, i guess i'll have to compromise.

 

Thanks for all your input!


devstropo.blogspot.com - Random stuff about my gamedev hobby


#10 Felix Ungman   Members   -  Reputation: 1000

Like
0Likes
Like

Posted 13 May 2014 - 07:38 AM

If deeply nested ifs are the problem, one option is to use flat conditional null returns.


openwar  - the real-time tactical war-game platform


#11 Strewya   Members   -  Reputation: 1353

Like
0Likes
Like

Posted 13 May 2014 - 07:56 AM

Nah, the nested ifs aren't a problem, they are a replacement for multiple return statements per function, which is something i'm attempting to avoid.


devstropo.blogspot.com - Random stuff about my gamedev hobby


#12 Aardvajk   Crossbones+   -  Reputation: 5936

Like
1Likes
Like

Posted 13 May 2014 - 08:11 AM

If the only way that client code can call with invalid keys, sanitize that input in a central place before passing it on to the rest of the program. Sounds like after that input has been checked, it would be a program fault if an invalid key was supplied, so then you can handle that with exceptions and return references to your data.

 

Sticking to pointers and checking null everywhere because there may have been an invalid key supplied at the command line seems wrong to me.



#13 swiftcoder   Senior Moderators   -  Reputation: 9853

Like
0Likes
Like

Posted 13 May 2014 - 08:44 AM

Wouldn't a boost::optional be a valid C++ equivalent even without C++11?

That's fair, but it basically boils down to assert-on-invalid. You really need the lambda support to provide sane fallback semantics.

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#14 jHaskell   Members   -  Reputation: 988

Like
1Likes
Like

Posted 13 May 2014 - 02:24 PM

Without knowing what you're doing with the objects in your map, I can't say if this will be a valid resolution for you or not, but what I've done in some cases (for example, with a map of commands), is have one 'empty' object, that can be manipulated the same way any other object in the map can be, it just either does nothing, or does some default 'empty' behavior.  For example, in the case of a map of commands, the empty command would simply send 'command not found' to the appropriate output.  These empty objects don't need to be in the map itself.  If you have some higher level container wrapping your map, that container can contain an instance of the 'empty' object, and return a pointer to it when find() fails to find the entry in the map.

 

This doesn't work if the object returned NEEDS to perform some nontrivial actions to be usable.  Or, at least, it doesn't work as well.  At that point the object is no longer empty.  It may still be an acceptable resolution, it just becomes more of a default object as opposed to an empty object.

 

This would allow you to return references instead of pointers as well, as you are then always returning an actual object



#15 de_mattT   Members   -  Reputation: 308

Like
0Likes
Like

Posted 13 May 2014 - 04:56 PM

If the only way that client code can call with invalid keys, sanitize that input in a central place before passing it on to the rest of the program. Sounds like after that input has been checked, it would be a program fault if an invalid key was supplied, so then you can handle that with exceptions and return references to your data.

 

Sticking to pointers and checking null everywhere because there may have been an invalid key supplied at the command line seems wrong to me.

Agreed.

 

So while you'll probably still need some way of checking that the colleciton contains the key->value mapping, you only need to do it once when the key is received from the user.

 

Speaking in terms of a generic collection object, a return value of NULL from your function value* getThing(const key&); ) could mean that the key is mapped onto NULL or that the key is not in the map. You've already mentioned that keys will not be mapped onto null values, so in this instance a return value of NULL can only mean that the key is not in the collection. However will that always be the case? Are you intending to re-use the collection? Also someone updating the code in the future may not immediatly appreciate that a return value of NULL can only be generated when the key isn't mapped. Therefore, while NULL checks would work, I think adding explicit 'exists' checks could make your code easier to read and make the collection class more re-usable.

 

Both of your suggestions look like fine choices.

 

Another option you might want to consider, (although I'd still go with the 'bool exists(const key&)' suggestion) is to have your collection return a default object when the key isn't stored. The object would be an instance of a default class, extending the class of the values stored in the collection, but stubbing out all procedures so that they do nothing and all functions so that they return neutral values. Then the rest of your code can process the default object as if a key-value map was found, removing the need for any 'exists' or 'is NULL' checks. (The NULL Object pattern, from Robert Martin's book 'Agile Softwaree Development: Principles, Patterns and Practices')



#16 frob   Moderators   -  Reputation: 20191

Like
2Likes
Like

Posted 13 May 2014 - 10:30 PM

... have one 'empty' object ... then always returning an actual object

I've seen it done in game engines. It doesn't solve the problem.

First it just migrates the NULL test. In some ways that is good as it consolidates the logic. But every function call you still end up doing a test for search failure and still end up taking an alternate action. That means that you don't actually save any cycles for testing for a NULL search result, you just handle it inside the function instead of outside the function, the count of NULL tests remains the same. In fact, it is often very slightly worse because now you are not just testing for failure, you are also potentially constructing a useless dummy game object.

Second, very often it just exchanges a test for NULL with a test for the special dummy object. I remember writing hundreds of times variants of "if(result.Type == ObjectType.FailureObject)". It is basically identical to testing against NULL, except it is a special dummy object just in case somebody forgets to check for a bad result. All the operations on a FailureObject just assert in dev and QA builds so the programmers can fix the bug, but we get a silent failure in the final build rather than a null-pointer crash.

It is just a different variant of the identical problem. In all cases you still need to test for the condition of a missing object and code defensively.
Check out my personal indie blog at bryanwagstaff.com.

#17 Felix Ungman   Members   -  Reputation: 1000

Like
1Likes
Like

Posted 14 May 2014 - 12:32 AM

Objective-C has an interesting take on this problem. If you have a nil object reference, calling a method is a no-op, and calling a function returns a default value (i.e. zero, false, or nil). Good coding practice is *not* checking the object for nil, but instead just send the message and checking the return value instead. Unfortunately there's no simple equivalent construction in C++.


openwar  - the real-time tactical war-game platform


#18 Strewya   Members   -  Reputation: 1353

Like
0Likes
Like

Posted 14 May 2014 - 01:01 AM

Using the Null Object pattern would be fine if the logic has to be the same no matter whether the object exists or not, but this is not the case. Non existance has to be logged, and the client code needs to exit without a crash (effectively informing the user via the trace log that the key entered was invalid), and not continue processing something that doesn't actually exist pretending everything is okay.

 

An idea popped into my head about using the Null Object as a logging device, so that method calls actually log the error, but the problem in that case is that client code thinks everything is okay, and continues to the next positive case, when it should have exited instead. And besides, the objects contain other data, and some objects do non-trivial operations, so it's not applicable in all cases.

 

EDIT:

I forgot to comment Aardvajk's suggestion about sanitizing the input. The input is sanitized in one location purely on a syntactic level, making sure that string arguments are strings, integers are really numbers, and the correct sub-command was invoked. When the data from the command arrives to our part of the code, it's all potentially fine until we check something is actually mapped to that key (the key is a hash string).


Edited by Strewya, 14 May 2014 - 01:12 AM.

devstropo.blogspot.com - Random stuff about my gamedev hobby


#19 radioteeth   Prime Members   -  Reputation: 992

Like
0Likes
Like

Posted 14 May 2014 - 04:45 AM

Creating a 'ThingExists' function, and checking its result is the equivalent of checking for a NULL result from the 'GetThing' function, at least in my experience. ThingExists could be implemented by calling GetThing and performing the NULL check there.



#20 frob   Moderators   -  Reputation: 20191

Like
0Likes
Like

Posted 14 May 2014 - 08:13 AM

Creating a 'ThingExists' function, and checking its result is the equivalent of checking for a NULL result from the 'GetThing' function, at least in my experience. ThingExists could be implemented by calling GetThing and performing the NULL check there.


Not so good in practice.

Doing the query to check for existence is usually identical to the code to get the object. For example, ThingExists() could be trivially implemented as { return GetThing() != NULL; }.

Better to do the call only once rather than twice.

If you have to do two sets of queries, the first is existential the second is acquisition, it is usually better to write just a single acquisition query that gets the final result and then tests for failure.

A similar example in C#, don't do both "is" and "as" operations, just do the "as" and check for null.
Check out my personal indie blog at bryanwagstaff.com.




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS