Prune

Member Since 07 Mar 2007
Offline Last Active Dec 12 2014 03:29 PM

Posts I've Made

In Topic: Overhead of GL_DYNAMIC_STORAGE_BIT or GL_MAP_WRITE_BIT when there's no...

04 June 2014 - 01:34 PM

I also suspect that using mismatched vertex formats between the vertex buffer and the shader will cause a few internal recompiles

I don't understand. Why would they be mismatched?

The reason I want to stick with vector is that the mesh class does a lot more than just loading and storage. I have a bunch of mesh processing functions in there. Refactoring to a new datatype will mean changing a class and supporting code that's around a thousand lines.

[Edit:] The driver interleaving the streams doesn't make sense for modern hardware: if that were necessary, I don't see how vertex pulling from multiple SSBOs could work.


In Topic: Overhead of GL_DYNAMIC_STORAGE_BIT or GL_MAP_WRITE_BIT when there's no...

03 June 2014 - 07:23 PM

I'll consider interleaving, but whether I do or not doesn't affect the main question of the thread. To avoid needing dynamic/writeable buffers for static data, I need to have all the data in a contiguous memory region (in the interleaved case, or one contiguous region per attribute in the non-interleaved case; it's really beside the point).

Here's my attempt at a custom allocator. It works in gcc regardless of build options, and in MSVC 2012 in Release mode. In MSVC Debug mode it doesn't (that is, it runs, but the semantics don't work):

template<class T>
class ContigAlloc
{
public:
	typedef T value_type;
	typedef T *pointer;
	typedef const T *const_pointer;
	typedef T &reference;
	typedef T const &const_reference;
	typedef size_t size_type;
	typedef ptrdiff_t difference_type;
	template<class U>
	struct rebind
	{
		typedef ContigAlloc<U> other; // TODO: typedef -> using when VC++2013
	};
	inline ContigAlloc(MemBuff &buff); // TODO: noexcept when VC++2013
	template<class U>
	inline ContigAlloc(ContigAlloc<U> const &other); // TODO: noexcept when VC++2013
	inline pointer address(reference x) const;
	inline const_pointer address(const_reference x) const;
	inline pointer allocate(std::size_t n);
	inline void deallocate(pointer p, std::size_t n); // TODO: noexcept when VC++2013
	inline size_type max_size(void) const;
	// TODO: Variadic templates for construct() when VC++2013
	template<class U>
	inline void construct(U *p);
	template<class U, class A>
	inline void construct(U *p, A &&a);
	template<class U, class A0, class A1>
	inline void construct(U *p, A0 &&a0, A1 &&a1);
	template<class U, class A0, class A1, class A2>
	inline void construct(U *p, A0 &&a0, A1 &&a1, A2 &&a2);
	template<class U, class A0, class A1, class A2, class A3>
	inline void construct(U *p, A0 &&a0, A1 &&a1, A2 &&a2, A3 &&a3);
	template<class U>
	inline void destroy(U *p);
	template<class T0, class U>
	inline friend bool operator==(ContigAlloc<T0> const &x, ContigAlloc<U> const&y); // TODO: noexcept when VC++2013
	template<class T0, class U>
	inline friend bool operator!=(ContigAlloc<T0> const &x, ContigAlloc<U> const&y); // TODO: noexcept when VC++2013
private:
	ContigAlloc(void); // TODO: = delete instead of private when VC++2013
	ContigAlloc &operator=(ContigAlloc const &);
	template<class U>
	friend class ContigAlloc;
	MemBuff &_buff;
};

template<class T>
inline ContigAlloc<T>::ContigAlloc(MemBuff &buff) : _buff(buff)
{
}

template<class T>
template<class U>
inline ContigAlloc<T>::ContigAlloc(ContigAlloc<U> const &other) : _buff(other._buff) // TODO: noexcept when VC++2013
{
}

template<class T>
inline typename ContigAlloc<T>::pointer ContigAlloc<T>::address(reference x) const
{
	return ::std::addressof(x);
}

template<class T>
inline typename ContigAlloc<T>::const_pointer ContigAlloc<T>::address(const_reference x) const
{
	return ::std::addressof(x);
}

template<class T>
inline typename ContigAlloc<T>::pointer ContigAlloc<T>::allocate(std::size_t n)
{
	return reinterpret_cast<T *>(_buff.alloc(sizeof(T) * n));
}

template<class T>
inline void ContigAlloc<T>::deallocate(T *p, std::size_t n) // TODO: noexcept when VC++2013
{
	_buff.deall(p, sizeof(T) * n);
}

template<class T>
inline typename ContigAlloc<T>::size_type ContigAlloc<T>::max_size(void) const
{
	return _buff.remain();
}

template<class T>
template<class U>
inline void ContigAlloc<T>::construct(U *p)
{
	::new(reinterpret_cast<void *>(p)) U;
}

template<class T>
template<class U, class A>
inline void ContigAlloc<T>::construct(U *p, A &&a)
{
	::new(reinterpret_cast<void *>(p)) U(std::forward<A>(a));
}

template<class T>
template<class U, class A0, class A1>
inline void ContigAlloc<T>::construct(U *p, A0 &&a0, A1 &&a1)
{
	::new(reinterpret_cast<void *>(p)) U(std::forward<A0>(a0), std::forward<A1>(a1));
}

template<class T>
template<class U, class A0, class A1, class A2>
inline void ContigAlloc<T>::construct(U *p, A0 &&a0, A1 &&a1, A2 &&a2)
{
	::new(reinterpret_cast<void *>(p)) U(std::forward<A0>(a0), std::forward<A1>(a1), std::forward<A2>(a2));
}

template<class T>
template<class U, class A0, class A1, class A2, class A3>
inline void ContigAlloc<T>::construct(U *p, A0 &&a0, A1 &&a1, A2 &&a2, A3 &&a3)
{
	::new(reinterpret_cast<void *>(p)) U(std::forward<A0>(a0), std::forward<A1>(a1), std::forward<A2>(a2), std::forward<A3>(a3));
}

template<class T>
template<class U>
inline void ContigAlloc<T>::destroy(U *p)
{
	p->~U();
}

template<class T0, class U>
inline bool operator==(ContigAlloc<T0> const &x, ContigAlloc<U> const&y) // TODO: noexcept when VC++2013
{
	return x._buff == y._buff;
}

template<class T0, class U>
inline bool operator!=(ContigAlloc<T0> const &x, ContigAlloc<U> const&y) // TODO: noexcept when VC++2013
{
	return x._buff != y._buff;
}

I then feed this allocator to the std::vector container(s) I use in my mesh class. If I check the types in allocate() with std::is_same, I find that only in MSVC Debug builds does the allocator get rebound to another type. The thing is, I don't know how to handle that. Ideally, I would delegate to the default std::allocator whenever the container has rebound the allocator to another type, as that means the container is using the allocator to store other things, such as members of the class or debugging crap, rather than element storage. I tried changing the rebind declaration to the following, but it doesn't work in MSVC Debug:

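template<class U>
struct rebind
{
	// Intended: keep ContigAlloc only for the element type, std::allocator otherwise
	typedef typename std::conditional<std::is_same<T, U>::value, ContigAlloc<U>, std::allocator<U>>::type other;
};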
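
One possible way to get the "element storage only" behaviour described above is sketched here. It is not the code from this post: Arena is a hypothetical stand-in for MemBuff, and the extra Elem template parameter is an assumption of the sketch. Elem records the element type the container was declared with, rebind preserves it, and allocate() falls back to ::operator new whenever the allocator has been rebound to some other type.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for MemBuff: a fixed-size bump arena.
struct Arena {
    alignas(std::max_align_t) unsigned char bytes[1024];
    std::size_t used = 0;

    void *alloc(std::size_t size, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0); // power of two
        std::size_t start = (used + align - 1) & ~(align - 1);
        if (start + size > sizeof(bytes)) throw std::bad_alloc();
        used = start + size;
        return bytes + start;
    }
    // True if p points into the arena's storage.
    bool owns(const void *p) const {
        const unsigned char *q = static_cast<const unsigned char *>(p);
        std::less<const unsigned char *> lt;
        return !lt(q, bytes) && lt(q, bytes + sizeof(bytes));
    }
};

// Elem is the element type of the owning container. A rebound copy has
// T != Elem and therefore knows it is not serving element storage.
template<class T, class Elem = T>
class ContigAlloc {
public:
    typedef T value_type;
    template<class U>
    struct rebind { typedef ContigAlloc<U, Elem> other; };

    explicit ContigAlloc(Arena &arena) : _arena(&arena) {}
    template<class U>
    ContigAlloc(ContigAlloc<U, Elem> const &other) : _arena(other.arena()) {}

    T *allocate(std::size_t n) {
        if (std::is_same<T, Elem>::value) // element storage: use the arena
            return static_cast<T *>(_arena->alloc(sizeof(T) * n, alignof(T)));
        return static_cast<T *>(::operator new(sizeof(T) * n)); // anything else
    }
    void deallocate(T *p, std::size_t) {
        // Arena blocks are released all at once, so only heap blocks are freed.
        if (!std::is_same<T, Elem>::value) ::operator delete(p);
    }
    Arena *arena() const { return _arena; }

private:
    Arena *_arena;
};

template<class T, class U, class E>
bool operator==(ContigAlloc<T, E> const &a, ContigAlloc<U, E> const &b) {
    return a.arena() == b.arena();
}
template<class T, class U, class E>
bool operator!=(ContigAlloc<T, E> const &a, ContigAlloc<U, E> const &b) {
    return !(a == b);
}
```

A std::vector<float, ContigAlloc<float>> constructed from such an allocator would draw its element array from the arena, while any copy the container rebinds to a different type (which appears to be what the MSVC Debug runtime does for its own bookkeeping) would go to the heap instead.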

In Topic: Overhead of GL_DYNAMIC_STORAGE_BIT or GL_MAP_WRITE_BIT when there's no...

03 June 2014 - 06:48 PM

I don't follow. Why would I interleave positions with other vertex attributes when positions are the only thing used for the early-Z pass and all the shadow map generation passes? Interleaving means that for all passes but the shading one, strided addressing would walk over a far larger chunk of memory than needed.

Then there's tangent data, which is only used for a few shaders, and so that attribute doesn't make sense to be interleaved either as it's not used most of the time.

I only see benefit interleaving normals and texture coordinates, as the two are used together almost all of the time.

Of course, I could have positions both by themselves for early-Z and shadows, and interleaved for shaded calculations, but that's a waste of graphics memory.
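
To put rough numbers on the strided-addressing point, here is a small sketch. The vertex layout (float positions, normals, tangents and texture coordinates) is an assumption made up for illustration, and the result is only the span of addresses touched; real bandwidth cost depends on cache line size and the hardware's fetch behaviour.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fully interleaved layout; attribute sizes are assumptions.
struct InterleavedVertex {
    float position[3];
    float normal[3];
    float tangent[4];
    float uv[2];
};
static_assert(sizeof(InterleavedVertex) == 48, "expected tightly packed floats");

// Bytes from the start of the first vertex's attribute to the end of the
// last vertex's attribute when one attribute is read per vertex.
std::size_t bytesSpanned(std::size_t vertexCount, std::size_t stride,
                         std::size_t attribSize)
{
    assert(vertexCount > 0 && stride >= attribSize);
    return (vertexCount - 1) * stride + attribSize;
}
```

For 10000 vertices, a position-only pass would span bytesSpanned(10000, 48, 12) = 479964 bytes with this interleaved layout, against bytesSpanned(10000, 12, 12) = 120000 bytes with a separate position stream.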

i don't understand what the problem is which is leading you to this question in the first place

The multi-draw calls use the same set of attribute buffers used by the underlying individual draws. I can't have a different buffer per object. I'm not sure what's not to understand. If a buffer is created with glBufferStorage(), and neither the dynamic nor the write bits are set, then all the data has to be provided at creation time. That means all the meshes I need for a given multi-draw need their data in a single contiguous memory region passed to glBufferStorage(). Thus, all data my mesh class instances store has to be allocated contiguously from the same underlying chunk of RAM. The two alternatives (make the buffer objects writeable, or copy each mesh into a larger RAM buffer to pass to glBufferStorage()) both add overhead (well, as you said, "maybe" for the first option listed).
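
To make the layout requirement concrete, here is a sketch of the second alternative (copying each mesh into one larger buffer), which is the one that costs an extra copy. Mesh, DrawRange, PackedMeshes and packMeshes are hypothetical names for this illustration, not part of the mesh class discussed in the thread. No GL calls are made; the comment only notes where glBufferStorage() would come in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal mesh: positions only, plus an index list.
struct Mesh {
    std::vector<float> positions;   // x, y, z per vertex
    std::vector<unsigned> indices;
};

// Per-mesh range inside the shared buffers. The names follow the count,
// firstIndex and baseVertex members of an indirect draw command.
struct DrawRange {
    unsigned count;
    unsigned firstIndex;
    int baseVertex;
};

struct PackedMeshes {
    std::vector<float> positions;
    std::vector<unsigned> indices;
    std::vector<DrawRange> draws;
};

// Concatenate all meshes and record where each one starts.
PackedMeshes packMeshes(std::vector<Mesh> const &meshes)
{
    PackedMeshes out;
    for (Mesh const &m : meshes) {
        assert(m.positions.size() % 3 == 0);
        DrawRange r;
        r.count = static_cast<unsigned>(m.indices.size());
        r.firstIndex = static_cast<unsigned>(out.indices.size());
        r.baseVertex = static_cast<int>(out.positions.size() / 3);
        out.draws.push_back(r);
        out.positions.insert(out.positions.end(),
                             m.positions.begin(), m.positions.end());
        out.indices.insert(out.indices.end(),
                           m.indices.begin(), m.indices.end());
    }
    // With a GL 4.4 context, out.positions.data() could now be passed to
    // glBufferStorage(GL_ARRAY_BUFFER, sizeInBytes, data, 0). Flags of 0
    // means neither GL_DYNAMIC_STORAGE_BIT nor GL_MAP_WRITE_BIT is set.
    return out;
}
```

Allocating the mesh data from one shared region in the first place would yield the same layout without this copy, which is the point of the custom allocator.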


In Topic: Are two OpenGL contexts still necessary for concurrent copy and render?

03 June 2014 - 03:33 PM

Any comment on the last one? Does this only apply to glTexSubImage*(), or are other buffer transfers, especially mapped buffers, handled the same way and require a second context to occur concurrently with rendering? Or can only texture uploads trigger the copy engine?


In Topic: Overhead of subroutines in arrays when multidrawing?

29 May 2014 - 01:08 PM

The thing is, since we're not talking about indexing using an arbitrary value, such as an ID read from a texture, the GLSL runtime _does_ have the information that it needs to optimally allocate registers _per draw_. Indexes dependent solely on gl_DrawIDARB (and gl_InstanceID) are, by the spec, dynamically uniform expressions. Surely there is already some sort of runtime partial specialization of shaders, else why have the defined concept of dynamically uniform expressions (which are constant within a work group) at all? So why can't register allocation that depends on a subroutine selection driven not just by constant expressions, but by dynamically uniform expressions as well, be part of that specialization?

