# Opinion needed: returning values without null checks everywhere

This topic is 1345 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Hi all.

Here at work, we have a class that stores non-POD objects by raw pointer, and has a get method that returns that raw pointer if the key was found (the storage is an std::map<key, value*> and the get method has a signature value* getThing(const key&); ). The objects are created elsewhere and simply stored in this class for ownership. A note to keep in mind is that c++ exceptions are disabled, and we're not using c++11 yet.

My issue with this is that everywhere the getThing method is called, we need to check for null, since if the value is not found, null is returned.

This annoys me to hell, since it bloats the code everywhere with null checks. I have a few ideas about changing it.

1. Add a 'bool exists(const key&)' method which does the obvious, and change the getThing to return a reference.

- The issue here is if someone calls getThing without checking if the value exists, then i don't know what to return? The .second value of the end iterator? It should cause a crash as soon as someone uses the reference, right?

- Since exceptions are disabled, i can't throw in case nothing was found, so i'm forced to always return something, and end.second seems like the only choice.

- Another problem with this is that the getThing method is const too, so i'm not sure if returning a plain reference will work (i remember it being a compilation error [in some cases]) or if the reference also has to be const, which makes it impossible to change the thing we get. A solution is to remove the const, but isn't nice.

2. Have the getThing method return a bool, and accept a double pointer or reference-to-pointer as an 'out' argument, so if the value is found, true is returned and the out argument is set to point to the found value.

- This seems ok and removes the need for a null check at the call site, but is again using pointers instead of references. I don't have anything against pointers, mind you, but i believe having references in the public interface of a class makes the code look cleaner since it's similar to the rest of it (using references everywhere instead of a mix of pointers and references, i find pointers fine for internals of specific classes).

I'd like to get others opinion on this, another perspective on how this could be done better and/or different. For all i know, maybe the current implementation is fine, and i just need to come to terms with it... Any suggestions are welcome.

Thanks!

##### Share on other sites

Using std::map<key, value*> isn't always as convenient as you want to, because when you access it with operator[] you easily create bogus entries that will bite you later when you iterate the map. You can use find() instead, but the code won't be as pretty (but in general, code working with standard containers is not that pretty, you just have to cope with that fact).

About null values, they can be null for two reasons. Either (1) because of a bug in your code, and in that case use assert to make sure you catch the bug. Or (2) they actually represent the non-existence of something. If that's the case, using pointers and null checks is the most obvious way to do it.

##### Share on other sites

Hi all.

Here at work, we have a class that stores non-POD objects by raw pointer, and has a get method that returns that raw pointer if the key was found (the storage is an std::map<key, value*> and the get method has a signature value* getThing(const key&); ). The objects are created elsewhere and simply stored in this class for ownership. A note to keep in mind is that c++ exceptions are disabled, and we're not using c++11 yet.

My issue with this is that everywhere the getThing method is called, we need to check for null, since if the value is not found, null is returned.

This annoys me to hell, since it bloats the code everywhere with null checks. I have a few ideas about changing it.

1. Add a 'bool exists(const key&)' method which does the obvious, and change the getThing to return a reference.

- The issue here is if someone calls getThing without checking if the value exists, then i don't know what to return? The .second value of the end iterator? It should cause a crash as soon as someone uses the reference, right?

- Since exceptions are disabled, i can't throw in case nothing was found, so i'm forced to always return something, and end.second seems like the only choice.

- Another problem with this is that the getThing method is const too, so i'm not sure if returning a plain reference will work (i remember it being a compilation error [in some cases]) or if the reference also has to be const, which makes it impossible to change the thing we get. A solution is to remove the const, but isn't nice.

2. Have the getThing method return a bool, and accept a double pointer or reference-to-pointer as an 'out' argument, so if the value is found, true is returned and the out argument is set to point to the found value.

- This seems ok and removes the need for a null check at the call site, but is again using pointers instead of references. I don't have anything against pointers, mind you, but i believe having references in the public interface of a class makes the code look cleaner since it's similar to the rest of it (using references everywhere instead of a mix of pointers and references, i find pointers fine for internals of specific classes).

I'd like to get others opinion on this, another perspective on how this could be done better and/or different. For all i know, maybe the current implementation is fine, and i just need to come to terms with it... Any suggestions are welcome.

Thanks!

I prefer getting null pointers back from a cache to be honest, it's not up to the cache to decide how you want to use that object its the calling codes responsibility, it might actually be able to deal with non existing objects. I am currently working on a codebase that asserts when this happens which is even more annoying, libs and upstream code shouldn't care about downstream usage patterns.

Current pattern seems fine to me, you could add an exists method but is not needed, returning a pointer kind of indicates whether a succes or fail has happened in one go, get a null pointer your operation failed.

The other thing you could do is, if all downstream code deals with default value and that value is the same everywhere, is to store default in the map and return a pointer to that when it fails to grab the object you want.

Edited by NightCreature83

##### Share on other sites

Returning NULL isn't so bad if the referred-to object may not exist. Either that or use boost::optional, but there's hardly a difference.

I try to design around that kind of situation whenever possible, which really comes down to asking questions like:

How does client code manage to get hold keys that don't exist in the map?

What does client code do when it receives a NULL and is there any commonality that may be generalised?

For example, if the client code is not meant to have keys that are not in the map then the NULL response indicates a problem. In which case move the NULL check inside the getThing() function with an assert (or whatever) - you want to die hard and fast.

Another example, if every client does the same thing when receiving a NULL, such as they all call loadThing(), then it's possible to move all that behind a higher-level abstraction, one that auto-loads things that aren't in the map.

##### Share on other sites

The alternative used in languages with a richer type system (Scala, Rust, etc) is 'optional types'. In brief, a variable of type Optional<T> can either contain Some<T> or None(), and the language semantics force you to explicitly handle *both* cases.

This can be simulated in C++ (well, C++11 at least) by having an accessor function that takes a lambda to be invoked if the data exists. There is an interesting investigation in this direction over on github, which might be worth a look (particularly the semantics of [tt]maybe_if[/tt] and [tt]otherwise[/tt]).

##### Share on other sites
Wouldn't a boost::optional be a valid C++ equivalent even without C++11?

##### Share on other sites

Assertions are not an option, we don't have debug/release builds, there's just one build output which goes through 5 levels of testing (unit, component, function, system function, and manual exploratory test).

The keys either exist in the map or don't, together with their associated values. There cannot be a key mapped to NULL. The only thing a get does is do a .find() in the map, and returns null or it.second, depending on whether .find() returned end or not. The keys are supplied by command line, so it is entirely possible for someone to request a non existant key.

The only thing that is done if a null check fails is a trace log in the else block, which is done for each check in a 1-4 levels deep nested ifs that each check a different type, which leads to a lot of code duplication. But, this is another issue i'm working on.

What i was hoping for most is to somehow keep the client code entirely reference based, without having to deal with any pointers in it, but if that's not working, i guess i'll have to compromise.

Thanks for all your input!

##### Share on other sites

If deeply nested ifs are the problem, one option is to use flat conditional null returns.

##### Share on other sites

Nah, the nested ifs aren't a problem, they are a replacement for multiple return statements per function, which is something i'm attempting to avoid.

##### Share on other sites

If the only way that client code can call with invalid keys, sanitize that input in a central place before passing it on to the rest of the program. Sounds like after that input has been checked, it would be a program fault if an invalid key was supplied, so then you can handle that with exceptions and return references to your data.

Sticking to pointers and checking null everywhere because there may have been an invalid key supplied at the command line seems wrong to me.

##### Share on other sites

Wouldn't a boost::optional be a valid C++ equivalent even without C++11?

That's fair, but it basically boils down to assert-on-invalid. You really need the lambda support to provide sane fallback semantics.

##### Share on other sites

Without knowing what you're doing with the objects in your map, I can't say if this will be a valid resolution for you or not, but what I've done in some cases (for example, with a map of commands), is have one 'empty' object, that can be manipulated the same way any other object in the map can be, it just either does nothing, or does some default 'empty' behavior.  For example, in the case of a map of commands, the empty command would simply send 'command not found' to the appropriate output.  These empty objects don't need to be in the map itself.  If you have some higher level container wrapping your map, that container can contain an instance of the 'empty' object, and return a pointer to it when find() fails to find the entry in the map.

This doesn't work if the object returned NEEDS to perform some nontrivial actions to be usable.  Or, at least, it doesn't work as well.  At that point the object is no longer empty.  It may still be an acceptable resolution, it just becomes more of a default object as opposed to an empty object.

This would allow you to return references instead of pointers as well, as you are then always returning an actual object

##### Share on other sites

If the only way that client code can call with invalid keys, sanitize that input in a central place before passing it on to the rest of the program. Sounds like after that input has been checked, it would be a program fault if an invalid key was supplied, so then you can handle that with exceptions and return references to your data.

Sticking to pointers and checking null everywhere because there may have been an invalid key supplied at the command line seems wrong to me.

Agreed.

So while you'll probably still need some way of checking that the colleciton contains the key->value mapping, you only need to do it once when the key is received from the user.

Speaking in terms of a generic collection object, a return value of NULL from your function value* getThing(const key&); ) could mean that the key is mapped onto NULL or that the key is not in the map. You've already mentioned that keys will not be mapped onto null values, so in this instance a return value of NULL can only mean that the key is not in the collection. However will that always be the case? Are you intending to re-use the collection? Also someone updating the code in the future may not immediatly appreciate that a return value of NULL can only be generated when the key isn't mapped. Therefore, while NULL checks would work, I think adding explicit 'exists' checks could make your code easier to read and make the collection class more re-usable.

Both of your suggestions look like fine choices.

Another option you might want to consider, (although I'd still go with the 'bool exists(const key&)' suggestion) is to have your collection return a default object when the key isn't stored. The object would be an instance of a default class, extending the class of the values stored in the collection, but stubbing out all procedures so that they do nothing and all functions so that they return neutral values. Then the rest of your code can process the default object as if a key-value map was found, removing the need for any 'exists' or 'is NULL' checks. (The NULL Object pattern, from Robert Martin's book 'Agile Softwaree Development: Principles, Patterns and Practices')

##### Share on other sites

... have one 'empty' object ... then always returning an actual object

I've seen it done in game engines. It doesn't solve the problem.

First it just migrates the NULL test. In some ways that is good as it consolidates the logic. But every function call you still end up doing a test for search failure and still end up taking an alternate action. That means that you don't actually save any cycles for testing for a NULL search result, you just handle it inside the function instead of outside the function, the count of NULL tests remains the same. In fact, it is often very slightly worse because now you are not just testing for failure, you are also potentially constructing a useless dummy game object.

Second, very often it just exchanges a test for NULL with a test for the special dummy object. I remember writing hundreds of times variants of "if(result.Type == ObjectType.FailureObject)". It is basically identical to testing against NULL, except it is a special dummy object just in case somebody forgets to check for a bad result. All the operations on a FailureObject just assert in dev and QA builds so the programmers can fix the bug, but we get a silent failure in the final build rather than a null-pointer crash.

It is just a different variant of the identical problem. In all cases you still need to test for the condition of a missing object and code defensively.

##### Share on other sites

Objective-C has an interesting take on this problem. If you have a nil object reference, calling a method is a no-op, and calling a function returns a default value (i.e. zero, false, or nil). Good coding practice is *not* checking the object for nil, but instead just send the message and checking the return value instead. Unfortunately there's no simple equivalent construction in C++.

##### Share on other sites

Using the Null Object pattern would be fine if the logic has to be the same no matter whether the object exists or not, but this is not the case. Non existance has to be logged, and the client code needs to exit without a crash (effectively informing the user via the trace log that the key entered was invalid), and not continue processing something that doesn't actually exist pretending everything is okay.

An idea popped into my head about using the Null Object as a logging device, so that method calls actually log the error, but the problem in that case is that client code thinks everything is okay, and continues to the next positive case, when it should have exited instead. And besides, the objects contain other data, and some objects do non-trivial operations, so it's not applicable in all cases.

EDIT:

I forgot to comment Aardvajk's suggestion about sanitizing the input. The input is sanitized in one location purely on a syntactic level, making sure that string arguments are strings, integers are really numbers, and the correct sub-command was invoked. When the data from the command arrives to our part of the code, it's all potentially fine until we check something is actually mapped to that key (the key is a hash string).

Edited by Strewya

##### Share on other sites

Creating a 'ThingExists' function, and checking its result is the equivalent of checking for a NULL result from the 'GetThing' function, at least in my experience. ThingExists could be implemented by calling GetThing and performing the NULL check there.

##### Share on other sites

Creating a 'ThingExists' function, and checking its result is the equivalent of checking for a NULL result from the 'GetThing' function, at least in my experience. ThingExists could be implemented by calling GetThing and performing the NULL check there.

Not so good in practice.

Doing the query to check for existence is usually identical to the code to get the object. For example, ThingExists() could be trivially implemented as { return GetThing() != NULL; }.

Better to do the call only once rather than twice.

If you have to do two sets of queries, the first is existential the second is acquisition, it is usually better to write just a single acquisition query that gets the final result and then tests for failure.

A similar example in C#, don't do both "is" and "as" operations, just do the "as" and check for null.

##### Share on other sites

Maybe I'm missing something, but I don't see how this is all that bad.

If you want terseness (though I don't particularly like this sort of code) you can just do this:

if(value *foo = getThing(someKey))
{
//do stuff
}


As far as I know, your only options for dealing with Hash Tables are NULL results, exception throwing, an out parameter and success return (like TryGetValue in C#'s Dictionary), and optional types.  All of them (sans exceptions, which is just ugly) are pretty much the same thing in the end.

I'm with frob that calling an "Exists" function first is a waste of both your time and the CPU's.  This is also sort of similar to the EAFP principle that python programmers love so much.  Take the Nike approach and "just do it".

Edited by SeraphLance

##### Share on other sites
if(value *foo = getThing(someKey)) { //do stuff }

I can't get that to compile here... I need to declare foo outside of the if clause:

value* foo;
if(foo = getThing(someKey))
{
// do stuff
}

Or do something extremely retarded, such as:

for(value* foo = getThing(someKey);foo;foo=0)
{
// do stuff
}

Other than that, I agree that there's no problem with checking for NULL.

##### Share on other sites

I can't get that to compile here...

Then there is something up with your compiler. Declaration inside if statement of the form:

if(Type *t = whatever())
{
}

is perfectly standard. One rational for it was for things like:

if(Type *t = dynamic_cast<Type*>(p))
{

// t is only scoped where it is valid

}