C++ c

Started by
17 comments, last by ScottZhang 10 years ago

Make elements a T * instead of unsigned char *. Saves you all that pointer casting and sizeof-calls in your pointer arithmetics.

I did that at first, but I adjusted my code according to fastcall22's second response using placement new.

fastcall22 told you to use placement new, and I suggested that you change the pointer type from unsigned char to your template type T to get rid of the manual pointer arithmetics. Those are not mutually exclusive; you want placement new and a proper pointer type.

Instead of:


unsigned char *elements = new unsigned char[count*sizeof(T)];
...
new (elements + i * sizeof(T)) T(*((T*)(oldElements + i * sizeof(T))))

you do something like:


T *elements = operator new(count*sizeof(T));
...
new (elements + i) T(oldElements[i]);

Don't relocate objects with the C memory functions. You need to properly copy construct and release objects, or move the objects with the move constructor if you want to take advantage of the move-mechanics in C++11. For trivial objects, this is equivalent as a memory move, but for objects with side effects when constructing/destructing/moving objects, this is highly important to get right. Also ensure that you don't assign to the new memory; you must construct the object in the new memory since otherwise you'll assign something to an non-constructed object.

I think I am currently not relocating objects with C memory functions. I use placement new to copy construct the old objects. But I just noticed that I forgot to call the destructors of the released objects when relocating.

I apologize, I misread your code; the memcpy function was for the integer pointers and not the actual object pointers. Without looking into the exact details, it does indeed look like the correct approach at least.

It turns out that I don't need this class. If anything I will make a version only for trivial objects. But I learned plenty about placement new, smart pointers, initialization lists and the rule of three/five, so thanks! I appreciate all the responses.smile.png

It was kind of what I figured with my first question. Even if it doesn't apply in this particular case, knowing about object lifetimes and such is a great experience anyway.

Advertisement

fastcall22 told you to use placement new, and I suggested that you change the pointer type from unsigned char to your template type T to get rid of the manual pointer arithmetics. Those are not mutually exclusive; you want placement new and a proper pointer type.

Ah, I didn't know about operator new(). That should make everything a bit cleaner. Learned something again, thanks! smile.png

Even though I probably wont use the class anymore, did I get it right now?

Obviously the class does not yet comply with the rule of three, I won't discuss this further other than to add that a simple "quick fix" is to make the class noncopyable by declaring a private, unimplemented copy constructor and assignment operator. This technique will mean you get a compile error if client code tries to copy/assign the class, and a link error if the class itself performs these operations. In C++11, I believe there is new syntax to suppress automatic generation of such functions, but I don't much practical experience with this so I won't say more.

You appear to have a bug in the last full code you posted, where isUsed is not updated if removeElement() is called.

I'd recommend at least using an assertion that isUsed[index] is true in both getElement() and removeElement().

The freeElementList and isUsed are similar arrays that are almost always accessed together, so one idea might be to have an internal struct containing both pieces of information and allocate a single array of that type. This would reduce the number of allocations required. However, I believe in this specific case you could avoid keeping both pieces of information by choosing a marker value, for example -1, to include in the freeElementList to indicate that the element is in use.

In fact, the "similar arrays" argument is true of the data instances themselves, one alternative implementation is to have an internal structure containing the potentially uninitialised data and . However, leaving it separate may have cache benefits depending on how frequent accesses are versus removals.

In the case where all members are moved to a single structure, there is no reason not to implement your class in terms of std::vector, this would simplify a lot of manual work and reduces your class to the essentials of managing the free list. If the class is to maintain the element array separate from the used/next array(s), then you could still implement in terms of std::vector with some relatively minor memory overhead (each vector would maintain the element count and capacity separately).

The addElement function doesn't gracefully handle exceptions thrown when allocating the storage or copying the elements. It both leaks memory and can leave the object in an inconsistent state. The constructor can leak if an exception throws during allocation. Implementing in terms of a single std::vector should avoid these issues, implementing in terms of multiple std::vectors would still require care.

It is typical for such containers to have a const and non-const accessors. The convention would be the const version returns a const reference, and the non-const version returns a mutable reference - unless by necessity or design the elements should not be mutable (e.g. the key in std::map is not mutable in isolation - allowing this could corrupt the map ordering).

My C++ skills are rusting of late, but I believe this is in the ballpark of an implementation in terms of std::vector:
template <typename T>
class Vector
{
private:
	struct Element {
	private:
		static const int ELEMENT_IN_USE = -1;
		int nextFreeIndex;
		char storage[sizeof(T)];
	public:
		Element(int index) {
			nextFreeIndex = index + 1;
		}

		~Element() {
			if(isUsed()) {
				object().~T();
			}
		}

		Element(const Element &other) {
			if(other.isUsed()) {
				new(storage) T(other.object());
			}
			nextFreeIndex = other.nextFreeIndex;
		}

		int initialise(const T &data) {
			assert(!isUsed());
			int result = nextFreeIndex;
			new(storage) T(data);
			nextFreeIndex = ELEMENT_IN_USE;
			return result;
		}

		void clear(int freeIndex) {
			assert(isUsed());
			object().~T();
			nextFreeIndex = freeIndex;
		}

		T &object() {
			assert(isUsed());
			return *reinterpret_cast<T*>(storage);
		}

		const T &object() const {
			assert(isUsed());
			return *reinterpret_cast<const T*>(storage);
		}

		bool isUsed() const {
			return nextFreeIndex == ELEMENT_IN_USE;
		}
	private:
		// TODO: do I need this guy?
		Element &operator=(Element copy);
	};

	int freeIndex;
	std::vector<Element> elements;
public:
	Vector(int initialCapacity) : freeIndex(0) {
		elements.reserve(initialCapacity);
		for(int i = 0 ; i < initialCapacity ; ++i) {
			elements.emplace_back(i);
		}
	}

	int addElement(const T& element) {
		int currentSize = elements.size();
		if(freeIndex > currentSize) {
			int newSize = currentSize * 2;
			elements.reserve(newSize);
			for(int i = currentSize ; i < newSize ; ++i) {
				elements.emplace_back(i);
			}
		}

		int resultIndex = freeIndex;
		Element &target = elements[freeIndex];
		freeIndex = target.initialise(element);
		return resultIndex;
	}

	void removeElement(int index) {
		assert(index < elements.size());
		elements[index].clear(freeIndex);
		freeIndex = index;
	}

	T getElement(int index) {
		assert(index < elements.size());
		return elements[index].object();
	}

	const T &getElement(int index) const {
		return elements[index].object();
	}

private:
	// Noncopyable: deliberately private & unimplemented
	Vector(const Vector &);
	Vector &operator=(const Vector &);
};

Wow, lots of interesting stuff. I didn't even know you could have internal structs. That alone will be very helpful in some other places. It was a bit overwhelming, but I think I understood most of it now (though I wouldn't be able to reproduce it) except the reinterpret cast. I will look that up.

I will also look at the code some more and try to adapt as much as possible to improve my coding skills. Thanks for taking the time! You guys are awesome!

Glad you found it helpful.

It was late when I posted that (Irish time), so here is a revised version which implements the rule of three, fixes an off-by-one in the addElement() resizing logic and avoids some signed/unsigned warnings:

template <typename T>
class Vector
{
private:
	struct Element {
	private:
		static const int ELEMENT_IN_USE = -1;
		int nextFreeIndex;
		char storage[sizeof(T)];
	public:
		Element(int index) : nextFreeIndex(index + 1) {
		}

		~Element() {
			if(isUsed()) {
				object().~T();
			}
		}

		Element(const Element &other) : nextFreeIndex(other.nextFreeIndex) {
			if(other.isUsed()) {
				new(storage) T(other.object());
			}
		}

		Element &operator=(const Element &other) {
			if(this == &other) {
				return *this;
			}
			if(other.isUsed()) {
				if(isUsed()) {
					object() = other.object();
				} else {
					new(storage) T(other.object());
				}
			} else {
				if(isUsed()) {
					object().~T();
				}
			}
			nextFreeIndex = other.nextFreeIndex;
			return *this;
		}

		int initialise(const T &data) {
			assert(!isUsed());
			int result = nextFreeIndex;
			new(storage) T(data);
			nextFreeIndex = ELEMENT_IN_USE;
			return result;
		}

		void clear(int freeIndex) {
			assert(isUsed());
			object().~T();
			nextFreeIndex = freeIndex;
		}

		T &object() {
			assert(isUsed());
			return *reinterpret_cast<T*>(storage);
		}

		const T &object() const {
			assert(isUsed());
			return *reinterpret_cast<const T*>(storage);
		}

		bool isUsed() const {
			return nextFreeIndex == ELEMENT_IN_USE;
		}
	};

	int freeIndex;
	std::vector<Element> elements;
public:
	Vector(int initialCapacity) : freeIndex(0) {
		elements.reserve(initialCapacity);
		for(int i = 0 ; i < initialCapacity ; ++i) {
			elements.emplace_back(i);
		}
	}

	Vector(const Vector &other) : freeIndex(other.freeIndex), elements(other.elements) {
	}

	Vector &operator=(const Vector &other) {
		if(this != &other) {
			elements = other.elements;
			freeIndex = other.freeIndex;
		}
		return *this;
	}

	int addElement(const T& element) {
		int currentSize = elements.size();
		if(freeIndex >= currentSize) {
			int newSize = currentSize * 2;
			elements.reserve(newSize);
			for(int i = currentSize ; i < newSize ; ++i) {
				elements.emplace_back(i);
			}
		}

		int resultIndex = freeIndex;
		Element &target = elements[freeIndex];
		freeIndex = target.initialise(element);
		return resultIndex;
	}

	void removeElement(unsigned index) {
		assert(index < elements.size());
		elements[index].clear(freeIndex);
		freeIndex = index;
	}

	T getElement(unsigned index) {
		assert(index < elements.size());
		return elements[index].object();
	}

	const T &getElement(unsigned index) const {
		return elements[index].object();
	}
};
I wouldn't trust that this code, as it has not been tested, it is just an illustration.

Something I meant to mention is that I think Vector is a poor name for such a class, as the behaviour is quite different from what is normally expected from a Vector.

Okay, now I know that you can write unsigned instead of unsigned int. Also thanks for showing how to implement the rule of three. I think there might be a little bug if removeElement is called for an element that isn't used. The value of nextFreeIndex of that element would get lost and be overridden by freeIndex. Maybe an assertion that isUsed() is -1 like you suggested earlier would help here?

I agree about the name, though it's hard to think of something new. Maybe just call it Container?

I think there might be a little bug if removeElement is called for an element that isn't used.

There is an assertion to warn you in debug mode if this is accidentally used. It is a design decision whether the class should actually support this use case where the client code attempts to call removeElement() on arbitrary indices/ID.

It would appear to me that the intention is that a client class would use addElement(), store an index/ID, and then use getElement() when necessary, finally using removeElement() when finished. I don't see an obvious use case where client code would be calling removeElement() without having received a valid index/ID from addElement(), nor do I see an obvious use case for the client code to retain the (now invalid) index/ID after having called removeElement(). I cannot see a safe way of handling this, as if element ownership isn't exclusive, how are other clients going to ensure they don't invalidate each other's elements?

But you certainly could design a class to support these kind of use case, if you really wanted to. For example, if this is used by a scripting system to access game engine objects, you probably would want more than a debug-only assertion. Scripts shouldn't be able to crash the engine - full stop. You would want to either expose the ability for the client to test the object exists before interacting with it, or encapsulate that into the class where possible (e.g. getElement() could return a pointer, or nullptr if the index/ID isn't used).

I agree about the name, though it's hard to think of something new. Maybe just call it Container?

I think Container is too generic. To me, it is some kind of handle system, so maybe a name like HandleRepository?

There is an assertion to warn you in debug mode if this is accidentally used. It is a design decision whether the class should actually support this use case where the client code attempts to call removeElement() on arbitrary indices/ID.

Ah, I just overlooked it. I expected it in the removeElement method, but it was in the clear method.

It would appear to me that the intention is that a client class would use addElement(), store an index/ID, and then use getElement() when necessary, finally using removeElement() when finished.

Your correct. A warning when I accidentally do something wrong is all I need, I just didn't see that it's already there.

I think Container is too generic. To me, it is some kind of handle system, so maybe a name like HandleRepository?

Yes, that's better, thanks!

Was du suchst ist im wesentlichen ein sparse vector, oder nicht?

This topic is closed to new replies.

Advertisement