
tivolo

Member Since 19 Jun 2004

#5166734 Storing position of member variable

Posted by tivolo on 14 July 2014 - 08:17 AM

 

 

EDIT: Unfortunately, pointer-to-member seems to also have a problem with the child-POD-struct:

class Sprite : 
	public ecs::Component<Sprite>
{
public:
	math::Vector2f vPos;
	float z;
	
	void Test(void)
	{
		float Sprite::* pZ = &Sprite::z; // works
		float Sprite::* pX = &Sprite::vPos::z; // doesn't even compile... I don't even know how that should work
	}
};

I don't know... is there even a solution to this? offsetof would work, though I like pointer-to-member better... I don't want to replace the vector with pure floats either... any ideas?

 

 

The problem is that &Sprite::vPos::z is not a pointer-to-member of class Sprite; at best it would be a pointer-to-member of math::Vector2f, which is entirely unrelated to Sprite, so a plain pointer-to-member cannot reach through the nested struct.
One solution to your problem would be to implement a generic (but typesafe!) pointer-to-member, something using inheritance and type erasure should do the trick.




#5141673 C++ Template Class -Creating a static method overload in derived class hides...

Posted by tivolo on 24 March 2014 - 04:49 AM

 

 

Hi, I have some code in my engine similar to the following snippet:

...

So basically, what I am trying to do here: there's a static method in the template base class, and I am trying to create an overload with a string parameter. I want to use both of them.

But the problem is, if I create the static method overload in the derived class, it hides the base class version. I am not sure if I am missing something here or if this is the intended behavior?

Right now as a work around I am using BaseClass<DerivedClass,InitDataType_t>::GetInstance(initData); inside main.

 

Also, if I don't create any overload in the derived class, then DerivedClass::GetInstance(initData); works fine without any warning or error, which I think is how it should be.

 

Creating a static method overload in the derived class hides the base class method, which I feel is a bug, or maybe I am missing something here? So can anyone please provide some insight into what's really going on here?

The behavior is correct and intended. You're not overloading here, but overriding. Overriding a base class function in a derived class hides the base class definition unless explicitly qualified with the base class name, as you discovered. This applies to non-static member functions as well, not just static ones.

 

You can "unhide" the base class symbols in the derived class with the using statement. Something like this should work:

class DerivedClass : public BaseClass< DerivedClass, InitDataType_t >
{
public:
	using BaseClass::GetInstance; // un-hide the base class overloads
	static T1* GetInstance(const cstring& somestring);
};

Overriding for non-static methods makes complete sense to me.

But overriding static methods? I didn't know that was even possible (mind blown).
Can you please provide a little insight into how it works with regard to memory layout, vtables and such? Is it like non-static methods?

 

(This problem reminded me of a quote I read somewhere on the internet - "C++ is too much of a language to learn in a single life time".)
 

 

 

I'm pretty sure Brother Bob wanted to say it the other way around.

You are creating a new overload in the derived class, and that will hide all overloads from all base classes, unless you explicitly "pull" them into the derived class via "using". That is intended, and correct.

 

There is no such thing as overriding static methods.




#5137790 Sorting a std::vector efficiently

Posted by tivolo on 10 March 2014 - 08:01 AM


 

To answer the original question: just sort the indices, and use a radix sort. If your indices are 32-bit, you can use a three-pass radix sort with 11-bit buckets that will nicely fit into the CPU's L1 cache.

This indeed looks like a nice place to apply radix sort. There are so few places where it really fits.

 

I couldn't tell whether it's worth the trouble (I still always use std::sort, which for anything practical surprisingly turns out good enough every time; I've never had enough justification for actually using radix sort in my entire life). But from playing with it a decade or two ago, I remember that radix sort can easily be 3-4 times faster than e.g. quicksort/introsort.

So if the sort is indeed bottlenecking you out (that is, you don't meet your frame time, and profiling points to the sort), that would probably be a viable option.

 

Slight nitpick, though: of course one doesn't sort the indices themselves, which would be somewhat pointless. You most certainly didn't mean to say that, but it kind of sounded like it.

One sorts the keys (moving the indices along). Or rather, one sorts indices by key, or whatever one would call it.

Which means that most likely, 3 passes won't be enough, sadly (or you need bigger buckets), since you will almost certainly have at least 48- or 64-bit numbers to deal with (unless you have very few passes and render states, and are very unkind to your depth info).

Not using a stable radix sort to save temp memory may be viable (you can instead concatenate the indices as described above if needed, even if this means an extra pass... the storage size difference alone likely outweighs the extra pass because of fitting into L1).

 

Thanks for pointing out my mistake; yes, I meant to say that one should sort the keys :). What you get back are the indices, which are then used to access your data.

 

Fully generic rendering-related data might not fit into a 32-bit key, that's true. But there are certain occasions where 32 bits are more than enough (or even 16 might suffice), e.g. for sorting particles.

As a quick note on performance: sorting 50k particles back-to-front using an ordinary std::sort on 16-bit keys takes 3.2ms on my machine, whereas an optimized radix sort needs 0.25ms. If all you need to sort are keys that index other data, it's almost always better to sort just the keys and index the data correctly on access, because this causes far fewer memory accesses during the sort.




#5137739 Sorting a std::vector efficiently

Posted by tivolo on 10 March 2014 - 02:29 AM

To answer the original question: just sort the indices, and use a radix sort. If your indices are 32-bit, you can use a three-pass radix sort with 11-bit buckets that will nicely fit into the CPU's L1 cache.

And you can also make the radix sort benefit from temporal coherence. But even if you don't, it will be much, much faster than a std::sort.




#5135316 Simple Question BYTE->float C++

Posted by tivolo on 28 February 2014 - 03:05 AM

I don't believe the static cast is needed, unless your compiler has a nonstandard warning message.

Dividing the types char / float is going to follow a standards-required implicit conversion to make them the same types, float/float. Since the char value can be exactly represented by a float, no compiler message is required.

 

I'd argue that in the general case it would be better to use a solution with explicit static_casts as presented by ApochPiQ, because you can immediately see the four casts (and can also search for them), also making it easier to spot potentially expensive operations (load-hit-stores on PowerPC architectures).

But that's probably a matter of taste.




#5119579 Strange memory allocation bug with new[]

Posted by tivolo on 27 December 2013 - 04:38 PM

The operators new, delete, new[] and delete[] have two purposes: 1) to allocate memory, and 2) to construct and destruct objects. Operator new[] constructs an array of objects after allocating the memory for those objects. When delete[] is called, it must call the destructor on each of the objects in the array, but delete[] doesn't take an array length parameter so it has no way of knowing how many objects are in the array. So delete[] must figure out, somehow, how many objects it needs to destruct. The C++ standard doesn't specify how delete[] should figure this out (as far as I know) but a common implementation is for new[] to put an int at the front of the memory allocated that contains the length of the array.

 

So, that's what you're seeing. The new[] operator is putting the array length at 0x..e18, and the actual objects you allocated start at 0x..e1c. When delete[] is called, it'll take the pointer passed in, subtract 4 to get the pointer to the array length, destruct the number of objects specified in that int, and then free the memory.

 

As Olof Hedman says, msize is a C function, and it has no idea that any of this array trickery is happening, so it has no chance of succeeding. You MIGHT get msize to work by calling it on ptr-4, but even if that did work, it wouldn't be portable.

 

Solid advice, here's a bit more detailed info: http://molecularmusings.wordpress.com/2011/07/05/memory-system-part-1/

 

As Samith and others have said: never mix C and C++ memory functions! Don't mix new & malloc, delete & free, or any other C function with new/delete. It will not work as soon as you start using non-PODs (it mostly works for PODs because new[] then doesn't add the extra bookkeeping bytes).




#5113266 C++11, Function-Local Static Variables, and Multithreading

Posted by tivolo on 30 November 2013 - 08:51 AM


Luckily, only our IDE is Visual Studio 2012, but the actual compiler is Sony’s for PlayStation Vita, which is a heavily modified GCC compiler.

 

I would still check the disassembly to be sure; I don't know which GCC version the Vita compiler is based on.




#5103090 Practical benefits of the restrict keyword

Posted by tivolo on 21 October 2013 - 05:09 AM


@Tivolo: how do you store entity components in your engine ? Currently I use an std::vector of pointers allocated from a pool. Can I apply the restrict keyword in this case ?

 

The restrict keyword is not transitive, which means that storing a std::vector<MyComponent* restrict> doesn't make a difference, as long as you don't also access them by using a restricted pointer, e.g. MyComponent* restrict comp = componentVector[index];

 

In my engine, the data of individual components is stored in SoA-fashion, using plain arrays that point into a contiguous chunk of memory.




#5098816 new[] is flawed?

Posted by tivolo on 04 October 2013 - 03:04 PM

To answer the original question: yes, new[] is a bit flawed in that regard.

 

As soon as you allocate an array of non-PODs with new[], most compilers will add an extra 4 bytes to store the number of elements in the array. This is needed in order to call the destructors when delete[] is invoked. I've written about the mechanism in more detail on my blog. I've yet to see a compiler which does it differently; do note, though, that this is not required by the standard - the compiler can store this bookkeeping data where and how it sees fit.

 

Note that if you properly declare the corresponding non-POD class type as being aligned (e.g. using __declspec(align(n)) on MSVC), you can use your own overload of operator new, and will get a properly aligned pointer. However, if you need to allocate an array of aligned non-PODs whose declaration you cannot change (e.g. a class type in a 3rd-party library), you're basically out of luck. Because of that, several engines I know of do not use new, delete, new[] or delete[], but rather their own implementations.

 

Hope that helps!




#5095434 Can pointer wrappers increase performance by making compiler analysis easier?

Posted by tivolo on 20 September 2013 - 06:06 AM

 

3. Will passing everything around as const references improve performance? A lot of sources, like this one for instance: ftp://ftp.sgi.com/sgi/audio/audio.apps/dev/aliasing.html, talk about how you can make the compiler's job easier by pretty much restricting and/or constifying inputs everywhere. (Maybe I'm not getting it right.)
 

 

The const-qualifier in C++ doesn't help the compiler in making better optimizations. The compiler cannot assume that:

a) const-qualified variables are really never written to (after all, there's const_cast<>) and

b) that variables marked "const" don't alias.

"const" is there to prevent programmers from accidentally writing to data which is supposed to be read-only. Additionally, it serves as a kind of documentation or contract (if used correctly) to other programmers that use the API.

 

What really helps compiler optimizations is using the restrict keyword (a C99 keyword, understood by most C++ compilers) on pointers and class methods. Note that restrict-ing something is a promise to the compiler, though. If you break your promise, the code won't behave correctly, and can lead to hard-to-find bugs. Check this article for more information.

 

One final thing: with the restrict keyword, it doesn't matter whether the restricted pointers are const-qualified or not. After all, "restrict" promises that the variables don't alias, hence whether access is read-only or read-write doesn't really play a role here.




#5094938 Total Data "Reentrancy": Fool's Errand?

Posted by tivolo on 18 September 2013 - 08:14 AM

I implemented such a system for the Molecule Engine I'm currently working on. Even though the engine is not completely finished yet, it has been such a time saver already. Being able to quickly iterate on things such as models, textures, shaders, etc. is paramount. I've recently added support for runtime-compiled C++ code, so you can even change scripts/code on-the-fly and see the changes immediately. I use this feature on a daily basis to get things done, and move them into the engine core once I'm finished. I never want to go back to an engine that doesn't support hot-swapping of *everything*!

 

But as others already pointed out, it is quite a bit of work to get such a general system up-and-running, and of course it's easier if you planned for it right from the start. The way I do it is to use a completely separate content pipeline (= its own executable) that watches file system changes, compiles assets, and talks to the engine over TCP/IP. The engine only ever reads binary data (optimized for the platform), and all binary data must be prepared by the content pipeline - no parsing or similar is done at runtime!

 

This has several benefits: you can write your tools in any language, the engine executable is smaller (reading binary data is simpler than e.g. parsing XML), loading times are always fast, and you can change any content on-the-fly.

 

Almost all of the assets can be compiled in less than a second, so iteration times are really fast.

 

Be warned though: having such a system is addictive, you don't want to go back afterwards :). Finally, here are two videos that show the system in action: asset hot-swap, code hot-swap.




#5092286 How to save a scene in higher resolution?

Posted by tivolo on 07 September 2013 - 08:17 AM

If you want to support arbitrary resolutions and not be limited by either the size of the viewport or the render target, there are two possibilities I know of:

 

1) Render the scene in tiles, e.g. 2x2, 3x3 or 4x4, altering view and projection matrices accordingly, and stitch together the results of the individual renders.

2) Use subpixel rendering, more or less. There's a bit of information here, and there's an article in Game Programming Gems 4 as well.

 

Note that both methods will likely cause problems with fullscreen effects, which have to be treated separately. Your renderer/engine needs to be aware of the fact that it is rendering a high-resolution screenshot, and e.g. treat all fullscreen effects differently.

 

hth,

-tiv




#5088464 Game engine unicode and multibyte support

Posted by tivolo on 23 August 2013 - 01:29 PM

The engine doesn't need it.  It needs to work with keys that map to displayable stuff.

 

The UI should have a class called LocalizedString. You can create a localized string by passing in keys and looking them up in the localization database. Internally, the LocalizedString object might have a UTF-8, extended ASCII, 16-bit, 32-bit, or whatever other value-to-glyph representation you need; that should be isolated from everybody else in the code, passing around LocalizedString objects instead.

 

+1, very solid advice. I have seen quite a few abominations in the past, where strings of various kinds were passed around between the engine, GUI and other systems. Sooner or later, somebody will break the system by introducing hard-coded text, text that cannot be properly localized, etc.

 

It really is best to have a single class for localized text (such as the LocalizedString proposed by frob), and ban all other strings from the engine completely. 99% of the time you do not need strings at runtime - the only exception being localized text.




#5058572 Row vs Column Major Matrices: Most Useful?

Posted by tivolo on 02 May 2013 - 03:28 AM

Ignoring row-major storage vs column-major storage for a moment, there's matrices that have been constructed to transform mathematical row-vectors and matrices that have been constructed to transform mathematical column-vectors.

 

Whether you're using row-vectors or column-vectors simply depends on whether you write vec * mat (mat1x4 * mat4x4), or mat * vec (mat4x4 * mat4x1). With the former type of mathematics, the forward/right/up axes will be in the first 3 rows, whereas with the latter, these axes will be in the first 3 columns (they're the mathematical transpose of each other).

 

Therefore, if I work with row-vectors, I prefer to store my matrices in row-major order, so that it's easy to extract the basis vectors (this also makes the matrix multiplication code simpler on a lot of hardware).

Likewise, if I work with column-vectors, I use column-major storage order, for the same reason.

 

Personally, I work with column-vectors (and matrices designed to transform column-vectors), because that's the way most maths material I've used teaches it -- it seems to be the prevalent convention. Therefore, I prefer column-major storage.

 

One place where I have to mix storage conventions is when saving space. Sometimes you want to pass matrices around as 3rows x 4columns, with a hard-coded/assumed 4th row of [0,0,0,1]. Often your registers (shader uniforms or CPU SIMD) are 4-wide, so this storage format isn't possible with column-major storage (or, it takes the same amount of space: 4 registers). So when storing a 3x4 matrix, I'd use row-major storage (which only uses 3 registers).

This same problem presents itself with the opposite convention: with row-vectors + row-major storage, you'd want to use a 4x3 matrix with an assumed 4th column of [0,0,0,1], forcing you to use column-major storage for space efficiency.

 

Very well put. The distinction between row- vs. column-vector conventions in the mathematics and row- vs. column-major storage order is often overlooked.

Here's a bit more food for thought on why one should try to go for column vectors and write multiplications as v' = M * v.




#5057881 What's the effect of a member volatile function?

Posted by tivolo on 29 April 2013 - 03:13 PM

But really, volatile is so rarely used, and even then doesn't really cut it for multithreaded stuff (which it is intended for), that I wouldn't worry about it.

 

Volatile was never intended for multi-threaded use, people just abused it. It won't work on PowerPC, or any architecture with a weakly-ordered memory model.





