Imposing some security on the lifetime of raw pointers to recycled objects

Started by
8 comments, last by Alberth 5 years, 8 months ago

 

When recycling direct pointers into a pool of allocated objects whose lifetimes are controlled by well defined periods (eg session, permanent or temporary ("user")), are there any additional clever security measures I can employ to make sure local copies of these pointers are more likely to not be used outside the object's life cycle?

 
That is, I'm not able to ensure anything as soon as I emit a raw pointer in the first place, so it's not like I want to prevent the user from being able to segfault the program if they're reckless or go out of their way to do so, but I would still prefer if there was some sort of mental barrier that would ensure the programmer is aware of the pointer's lifetime. These raw references are not to be given out in bulk, but are rather likely limited to something like 1-5 instances. I do not want to make them smart pointers as the pool must be free-able regardless of any dangling references.
 
Two options I can think of are:
 
1) add a layer of indirection and instead of providing raw pointers directly hand out internally managed weak pointer style wrappers. These could be set to null when a pool is freed or an object is recycled, but would in no way prevent the programmer from making local copies anyway,
2) force the programmer to use specific context-sensitive calls to retrieve the pointer in the first place that spell out "caveat emptor". Eg something like GetSessionPointer(), GetPermanentPointer() and GetUserPointer().
 
Cleanup of pool data is performed when a session is terminated (eg when a level/chunk is unloaded), the program closes or the user manually decides to free up temporary memory. A callback is invoked to notify the programmer when this occurs.
 
In the past I've opted to use individual allocations, but there are a few classes of objects that I wish to manage in bulk (these are mostly related to general speed improvements, serialization of level data, etc).
 
Any thoughts how to add additional security on top of this? What's the best approach in a production environment? 

Give the user a handle (ie some unique number that lets you detect whether he is querying an existing block), and have him fetch the raw pointer through the handle each time, ie he shouldn't store the pointer anywhere. The callback then indicates that the handle has expired (you increment the number of the block, or give it a new globally unique value or whatever), and a new handle must be obtained.

For convenience, you might want to reserve eg handle value 0 for "invalid handle", so there is a simple value to store in the handle to denote "needs a new session".
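A minimal sketch of that handle scheme, assuming a pool that is freed all at once when a session ends. The names (SessionPool, Handle, Enemy, endSession) are made up for illustration, and the session number plays the role of the "unique number", with 0 reserved as the invalid value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Enemy { int health = 100; };   // hypothetical pooled type

// A handle names a slot and the session it was issued in.
// Handle{} is all zeros, and session == 0 is reserved for "invalid".
struct Handle {
    std::uint32_t session;
    std::uint32_t index;
};

class SessionPool {
public:
    explicit SessionPool(std::size_t capacity) : slots_(capacity) {}

    // Hands out the next free slot, or an invalid handle when the pool is full.
    Handle acquire() {
        if (used_ == slots_.size()) return Handle{};
        slots_[used_] = Enemy{};
        return Handle{session_, static_cast<std::uint32_t>(used_++)};
    }

    // Callers resolve the handle at each use instead of storing the pointer.
    // Returns nullptr once the handle's session has ended.
    Enemy* get(Handle h) {
        if (h.session != session_ || h.index >= used_) return nullptr;
        return &slots_[h.index];
    }

    // Frees everything at once; every outstanding handle expires.
    void endSession() {
        used_ = 0;
        ++session_;
        if (session_ == 0) session_ = 1;   // skip the reserved value on wrap-around
    }

private:
    std::vector<Enemy> slots_;
    std::size_t used_ = 0;
    std::uint32_t session_ = 1;
};
```

The cost is one comparison per lookup. Nothing stops a caller from stashing the result of get(), but the API makes the short-lived nature of that pointer hard to miss.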

 

In the end, it's all down to how much do you trust your users. On the other hand, these things can be hairy to figure out, so it might even be beneficial for the user, as stuff will die when you're not following the rules.

I would use a light-weight wrapper object that implements operator bool() and operator->() at the very least, so you get pointer-like semantics and usage with the option to add behavior as necessary. For instance, you could make the copy constructor explicit so users can still copy the handle, while also allowing one to easily locate such copies through static analysis. Or if you're feeling clever, you could even add tracing/logging to copies, access, etc., so if/when something goes wrong, there's a "chain of custody", so to speak, that lets you identify errant usage. Anything like this can also be compiled out in production builds so there's almost no overhead.
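Something along these lines, as a rough sketch only. PoolRef, Widget and the POOLREF_TRACE switch are invented names, and the "tracing" here is just a counter plus a line on stderr.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical switch for this sketch: set to 0 to compile the tracing out.
#ifndef POOLREF_TRACE
#define POOLREF_TRACE 1
#endif

struct Widget { int value; };   // hypothetical pooled type

template <typename T>
class PoolRef {
public:
    explicit PoolRef(T* p = nullptr) : ptr_(p) {}

    // Explicit: `PoolRef<T> b(a);` compiles, while `PoolRef<T> b = a;` and
    // implicit pass-by-value do not, so every copy is spelled out in the source.
    explicit PoolRef(const PoolRef& other) : ptr_(other.ptr_) {
#if POOLREF_TRACE
        ++copies();
        std::fprintf(stderr, "PoolRef copied (%d so far)\n", copies());
#endif
    }

    PoolRef& operator=(const PoolRef&) = delete;   // re-seat through reset() only

    void reset(T* p = nullptr) { ptr_ = p; }

    explicit operator bool() const { return ptr_ != nullptr; }

    T* operator->() const {
        assert(ptr_ && "dereferencing an empty PoolRef");
        return ptr_;
    }

    T& operator*() const {
        assert(ptr_ && "dereferencing an empty PoolRef");
        return *ptr_;
    }

    // Number of traced copies made so far for this T (stays 0 with tracing off).
    static int& copies() {
        static int count = 0;
        return count;
    }

private:
    T* ptr_;
};
```

With tracing off the class is a single pointer with inline accessors, so an optimizing build should treat it much like the raw pointer.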

I prefer integer handles over pointers. The "downside" being that the user needs a pointer to the pool & their handle in order to retrieve an object, but IMHO, this is actually a positive/upside. If you enforce the rule that users don't retain any pointers to pools (so they must constantly be passed around as arguments instead) then the codebase becomes so much easier to understand when trying to reason about data flows, data dependencies and multi-threading opportunities. 

Within an integer handle, you can reserve some bits to store an "ABA counter" (not sure what else these are called? Generation counters?). Basically for each slot in the pool, you have a counter which is incremented every time the object in that slot is destroyed/recycled. When fetching a handle, you copy this counter into the reserved bits in the handle. When dereferencing a handle, you check if the reserved bits match the counter, and if not, it's a "dangling pointer" bug, so you can return a dummy/null object or signal failure to the user. 

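This system can fail if a slot is recycled exactly 2^n times (or a multiple of that) while a stale handle to it is still around, where n is the number of counter bits reserved in your handle: the counter wraps back to the stale value and the check passes. So keep that in mind. 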
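Here is one way the counter-in-the-handle idea could look. The 24/8 bit split and all names (ParticlePool, Particle, create, destroy) are assumptions made for the sketch, not something from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Particle { float x = 0.0f; float y = 0.0f; };   // hypothetical pooled type

class ParticlePool {
public:
    using Handle = std::uint32_t;
    static constexpr std::uint32_t kIndexBits = 24;     // assumed: low 24 bits = slot index
    static constexpr std::uint32_t kIndexMask = (1u << kIndexBits) - 1;
    static constexpr std::uint32_t kGenMask = 0xFFu;    // assumed: high 8 bits = counter

    explicit ParticlePool(std::size_t capacity) : slots_(capacity) {
        assert(capacity <= kIndexMask + 1);
        for (std::size_t i = capacity; i > 0; --i)
            free_.push_back(static_cast<std::uint32_t>(i - 1));
    }

    // Writes a fresh handle to `out`; returns false when the pool is exhausted.
    bool create(Handle& out) {
        if (free_.empty()) return false;
        const std::uint32_t index = free_.back();
        free_.pop_back();
        slots_[index].live = true;
        slots_[index].value = Particle{};
        out = (slots_[index].generation << kIndexBits) | index;
        return true;
    }

    // Recycles the slot and bumps its counter so old handles stop matching.
    void destroy(Handle h) {
        Slot* slot = find(h);
        if (!slot) return;                               // stale handle or double free
        slot->live = false;
        slot->generation = (slot->generation + 1) & kGenMask;
        free_.push_back(h & kIndexMask);
    }

    // Returns nullptr for a dangling handle instead of a recycled object.
    Particle* get(Handle h) {
        Slot* slot = find(h);
        return slot ? &slot->value : nullptr;
    }

private:
    struct Slot {
        Particle value;
        std::uint32_t generation = 0;
        bool live = false;
    };

    Slot* find(Handle h) {
        const std::uint32_t index = h & kIndexMask;
        if (index >= slots_.size()) return nullptr;
        Slot& slot = slots_[index];
        if (!slot.live || slot.generation != (h >> kIndexBits)) return nullptr;
        return &slot;
    }

    std::vector<Slot> slots_;
    std::vector<std::uint32_t> free_;
};
```

With 8 counter bits, a handle that is kept across 256 recycles of the same slot matches again, which is the wrap-around failure described above. More counter bits make that rarer at the cost of index range.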

FWIW, you can also use this system with pointers. If your pool is full of 16-byte aligned objects, then the lower 4 bits of your pointers are useless. If you're on x86-64, then IIRC the upper 16 bits are useless too (double check that). This could give you space to store a 20-bit counter value within the pointer itself! You'd then just need a template class that provides operator*, operator->, etc so that it acts like a regular pointer, but first ANDs with a mask to remove the counter before dereferencing, and also decodes/checks the counter value as part of the dereference. 
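A cautious sketch of the pointer-tagging variant, using only the low 4 bits. The upper bits are left alone here because how many of them are really free depends on the CPU and OS paging mode. Node, TaggedNodePtr and recycle are hypothetical names, and the sketch assumes the per-slot counter lives inside the object and that the pool's memory is still mapped when the check runs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pooled type: 16-byte alignment frees the low 4 address bits.
struct alignas(16) Node {
    std::uint32_t generation = 0;   // bumped by the pool on every recycle
    int payload = 0;
};

class TaggedNodePtr {
public:
    static constexpr std::uintptr_t kTagMask = 0xF;

    TaggedNodePtr() : bits_(0) {}

    // Packs the low 4 bits of the slot's current counter into the address.
    explicit TaggedNodePtr(Node& node)
        : bits_(reinterpret_cast<std::uintptr_t>(&node) | (node.generation & kTagMask)) {
        assert((reinterpret_cast<std::uintptr_t>(&node) & kTagMask) == 0);
    }

    // Masks the tag off, then compares it with the slot's counter.
    // Returns nullptr when empty or when the slot was recycled since tagging.
    Node* get() const {
        Node* node = reinterpret_cast<Node*>(bits_ & ~kTagMask);
        if (!node) return nullptr;
        if ((node->generation & kTagMask) != (bits_ & kTagMask)) return nullptr;
        return node;
    }

    explicit operator bool() const { return get() != nullptr; }

    Node* operator->() const {
        Node* node = get();
        assert(node && "stale or empty TaggedNodePtr");
        return node;
    }

    Node& operator*() const { return *operator->(); }

private:
    std::uintptr_t bits_;
};

// What a pool would do when it recycles a slot (illustrative only).
inline void recycle(Node& node) {
    ++node.generation;
    node.payload = 0;
}
```

Four tag bits only distinguish 16 generations, so this catches recent mistakes rather than all of them. The pointer/integer round trip through std::uintptr_t is implementation-defined, though it behaves as expected on mainstream compilers.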

Thanks for sharing your thoughts!

Here's where I'm mentally at at the moment:

1) I can't prevent the user from making a raw local copy anyway
2) a PPoolObject type proxy seems like a good compromise, but...
3) I'm leaning toward compiling it to an encoded/checked index in debug mode, but a raw pointer wrapper in release mode. If some smart hat decides to dereference it in a loop, it either gets optimized or becomes an unnecessary bottleneck
 

Here are my concerns:

1) the indirection runs a risk of thrashing the cache, although I haven't written a single line of code so far so that's just speculation
2) I'm not entirely sure how to go about locking in the proxy. Technically PPoolObject should lock the pool every time its value is read, which seems like it could add up fast
3) if I don't lock, then the proxy is as unsafe as a raw pointer in the first place, so it kind of defeats at least part of the idea
4) in a way this seems like a hack. The real answer here seems to stem from a grander design paradigm. If I manage to enforce a strict destruction cycle, then I feel like trusting the programmer should be fine. Maybe I'm too naive though...

Are you interested in security, as in, untrusted, user-supplied code should be memory-sandboxed? Or just a development / debug helper to catch bad code? 

If the latter, you can also employ methods used generally for heaps. First, though, you'd probably want to make your pools much bigger than needed in development builds, so that you don't have to recycle objects as often. When releasing an object, memset it with a bit pattern that's easy to spot during debugging and likely to cause a crash if someone accidentally uses the object, which helps catch read-after-free bugs (i.e. not 0! Something like 0xcdcdcdcd, or anything that looks obviously wrong, makes a bad pointer, or encodes a NaN float). When recycling an object, first memcmp it against the known "dead pattern" to see if it was accidentally written to while it was dead. This lets you know that write-after-free bugs exist, at which point you can use memory breakpoints to track down the culprits. 
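The poison-and-verify step can be as small as the two helpers below. This is only a sketch: poisonSlot, slotStillPoisoned and the 0xCD byte are illustrative choices, and the byte loop stands in for a memcmp against a buffer filled with the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed dead pattern; any non-zero byte that is easy to spot would do.
constexpr unsigned char kDeadByte = 0xCD;

// Call when an object is released: overwrite the whole slot with the pattern.
inline void poisonSlot(void* slot, std::size_t size) {
    std::memset(slot, kDeadByte, size);
}

// Call just before recycling: true if nobody wrote to the slot while it was dead.
inline bool slotStillPoisoned(const void* slot, std::size_t size) {
    const unsigned char* bytes = static_cast<const unsigned char*>(slot);
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] != kDeadByte) return false;   // write-after-free detected
    }
    return true;
}
```

A pool would typically wrap both calls in a debug-only macro so release builds skip the extra memory traffic.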

To go all-out, in debug builds, allocate every object in its own 4KB page. When releasing the object, set that page to be non-readable/non-writable. Don't recycle if you don't have to - just burn through address space (x64 gives you a lot)! Any use-after-free bugs will crash immediately on the line that tries to access the (now-)invalid page. 

The C++ standard library provides a pointer proxy object that does pretty much what you're describing, so you might want to pattern your smart pointer on that.  You can even take advantage of the allocator and deleter of the referent pointer object if you're doing your own pool allocation or fortification algorithms.

Stephen M. Webb
Professional Free Software Developer

The simplest solution is using a shared pointer such as std::shared_ptr, otherwise you may roll your own reference counted pointer.


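// sketch of manual reference counting (the int key and the map lookup are illustrative)

#include <cstddef>
#include <unordered_map>

struct MyClass {};

// One pool entry: the object plus the number of outstanding references.
struct MyPointer
{
	MyClass * pointer = nullptr;
	std::size_t count = 0;
};

struct MyPool
{
	// Returns the object stored under someKey and bumps its count, or nullptr if the key is unknown.
	MyClass * requirePointer(int someKey)
	{
		auto it = entries.find(someKey);
		if(it == entries.end()) return nullptr;
		++it->second.count;
		return it->second.pointer;
	}
	
	// Drops one reference; an entry whose count is back at zero is safe to recycle.
	void releasePointer(MyClass * pointer)
	{
		for(auto & entry : entries) {
			if(entry.second.pointer == pointer && entry.second.count > 0) {
				--entry.second.count;
				return;
			}
		}
	}

	std::unordered_map<int, MyPointer> entries; // filled by the pool when objects are created
};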

But I would highly recommend using std::shared_ptr instead of your own reference counting.

 

https://www.kbasm.com -- My personal website

https://github.com/wqking/eventpp  eventpp -- C++ library for event dispatcher and callback list

https://github.com/cpgf/cpgf  cpgf library -- free C++ open source library for reflection, serialization, script binding, callbacks, and meta data for OpenGL Box2D, SFML and Irrlicht.

I like the idea of a proxy object. Since the user has to store the pointer somewhere anyway, wrap a proxy object around it. It thus has 1 data member, the raw pointer. Next, add convenience methods to it: a *() operator to make accessing the pool simpler, a "get the raw pointer" method, and a "remove/invalidate the pointer" method.

Make the proxy copy-able, so a user can make copies if he/she so desires.

 

Now you can play with being more or less paranoid vs less or more speed-conscious.

In release mode, the *() operator simply forwards the pointed-to data reference, "get the pointer" gets the pointer from the pool (ie what you have to do anyway, the additional proxy dereference won't make a dent, as it doesn't happen that often), and "remove/invalidate" does nothing. Compiler optimization then does the rest, giving you near-equal (if not actually equal) performance as with a raw pointer.

In debug mode you can have the proxy objects manage a reference count in the pool, or they register themselves, so you can visit them and check the pointer is invalidated, and pretty much anything else you may want. You can also have additional data in the proxy in debug mode if that helps. A change of size may however cause havoc in alignment / padding, which should be carefully considered.
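As a sketch of the "proxies register themselves" option: ItemProxy, ProxyRegistry, Item and the POOL_PROXY_CHECKS switch are invented for this example, and the registry stands in for bookkeeping that the pool itself would own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical switch: 1 = debug-style checks, 0 = release-style plain pointer.
#ifndef POOL_PROXY_CHECKS
#define POOL_PROXY_CHECKS 1
#endif

struct Item { int value = 0; };   // hypothetical pooled type

class ItemProxy;

// Stand-in for the pool's bookkeeping: a list of the proxies currently alive.
class ProxyRegistry {
public:
    void add(ItemProxy* p) { live_.push_back(p); }
    void remove(ItemProxy* p) {
        live_.erase(std::remove(live_.begin(), live_.end(), p), live_.end());
    }
    void invalidateAll();   // the pool would call this when it is freed
    std::size_t liveCount() const { return live_.size(); }
private:
    std::vector<ItemProxy*> live_;
};

class ItemProxy {
public:
    ItemProxy(Item* item, ProxyRegistry& registry) : ptr_(item) {
#if POOL_PROXY_CHECKS
        registry_ = &registry;
        registry_->add(this);
#else
        (void)registry;
#endif
    }
    ItemProxy(const ItemProxy& other) : ptr_(other.ptr_) {
#if POOL_PROXY_CHECKS
        registry_ = other.registry_;
        registry_->add(this);
#endif
    }
    ItemProxy& operator=(const ItemProxy& other) {
        ptr_ = other.ptr_;   // sketch assumes both proxies share one registry
        return *this;
    }
    ~ItemProxy() {
#if POOL_PROXY_CHECKS
        registry_->remove(this);
#endif
    }

    Item& operator*() const {
        assert(ptr_ && "proxy used after invalidation");
        return *ptr_;
    }
    Item* get() const { return ptr_; }

    void invalidate() {
#if POOL_PROXY_CHECKS
        ptr_ = nullptr;      // a no-op in release-style builds
#endif
    }

private:
    Item* ptr_;
#if POOL_PROXY_CHECKS
    ProxyRegistry* registry_;   // extra member: the proxy changes size between modes
#endif
};

inline void ProxyRegistry::invalidateAll() {
    for (ItemProxy* p : live_) p->invalidate();
}
```

The registry is not thread-safe as written, so the locking question raised earlier in the thread still applies once proxies are copied or destroyed on several threads.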

 

This topic is closed to new replies.
