quarnster

Member Since 14 Jun 2009
Offline Last Active Feb 25 2014 06:46 AM

Posts I've Made

In Topic: Alignment requirements

06 December 2013 - 02:17 PM

For a reference see Intel® Advanced Vector Extensions Programming Reference:

 

Table 2-4. Instructions Requiring Explicitly Aligned Memory
Require 16-byte alignment   Require 32-byte alignment
(V)MOVDQA xmm, m128      VMOVDQA ymm, m256
(V)MOVDQA m128, xmm      VMOVDQA m256, ymm
(V)MOVAPS xmm, m128      VMOVAPS ymm, m256
(V)MOVAPS m128, xmm      VMOVAPS m256, ymm
(V)MOVAPD xmm, m128      VMOVAPD ymm, m256
(V)MOVAPD m128, xmm      VMOVAPD m256, ymm
(V)MOVNTPS m128, xmm     VMOVNTPS m256, ymm
(V)MOVNTPD m128, xmm     VMOVNTPD m256, ymm
(V)MOVNTDQ m128, xmm     VMOVNTDQ m256, ymm
(V)MOVNTDQA xmm, m128    VMOVNTDQA ymm, m256
 
Table 2-5. Instructions Not Requiring Explicit Memory Alignment
(V)MOVDQU xmm, m128
(V)MOVDQU m128, xmm
(V)MOVUPS xmm, m128
(V)MOVUPS m128, xmm
(V)MOVUPD xmm, m128
(V)MOVUPD m128, xmm
VMOVDQU ymm, m256
VMOVDQU m256, ymm
VMOVUPS ymm, m256
VMOVUPS m256, ymm
VMOVUPD ymm, m256
VMOVUPD m256, ymm
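
To make the two tables a bit more concrete, here is a minimal sketch of how you could check a pointer before handing it to one of the aligned moves from Table 2-4 (16 bytes for the xmm forms, 32 bytes for the ymm forms). The helper names is_aligned and align_up are my own for illustration and are not part of any Intel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if ptr is a multiple of `alignment` bytes.
   `alignment` must be a power of two (16 for xmm, 32 for ymm aligned moves). */
static int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Hypothetical helper: rounds ptr up to the next `alignment`-byte boundary.
   The caller must have over-allocated by at least alignment - 1 bytes. */
static void *align_up(void *ptr, size_t alignment)
{
    uintptr_t addr = (uintptr_t)ptr;
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (void *)((addr + alignment - 1) & ~(uintptr_t)(alignment - 1));
}
```

If is_aligned(p, 32) is false, the safe choice is one of the unaligned forms from Table 2-5; the aligned forms would fault on such an address.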

 

Section 3.6.4 of the Intel optimization manual (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf) says:

Misaligned data access can incur significant performance penalties. This is particularly true for cache line 
splits. The size of a cache line is 64 bytes in the Pentium 4 and other recent Intel processors, including 
processors based on Intel Core microarchitecture. 
 
An access to data unaligned on 64-byte boundary leads to two memory accesses and requires several 
µops to be executed (instead of one). Accesses that span 64-byte boundaries are likely to incur a large 
performance penalty, the cost of each stall generally are greater on machines with longer pipelines.
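
As a rough illustration of the cache line split described in the quote, the sketch below tests whether an access of a given size at a given address touches two 64-byte lines. The 64-byte line size comes from the quoted text; the function name splits_cache_line is just something I made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cache line size taken from the quoted manual text; other CPUs may differ. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns 1 if the bytes [addr, addr + size) span
   more than one cache line, 0 if they all fall within a single line. */
static int splits_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0);
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

For example, a 32-byte access at offset 48 within a line covers bytes 48 to 79 and so splits, while the same access at offset 32 does not. Note that a 16- or 32-byte aligned access can never split a 64-byte line, which is one more reason to keep vector data aligned even when using the unaligned instructions.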


In Topic: Premature destruction of object in Android

09 March 2013 - 12:23 AM


@quarnster: Thanks a lot. I would probably not have found that one, at least not anytime soon. Does it matter where in the Android.mk file this line is added? Can I just add it at the end of the file?


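It needs to be between the two include directives, which in a typical Android.mk means after include $(CLEAR_VARS) and before the include that builds the module, such as include $(BUILD_SHARED_LIBRARY).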

In Topic: Premature destruction of object in Android

08 March 2013 - 12:42 PM

About the ARM now, here is the compilation command of one source file from angelscript, compiling the ARM version for the NDK.

"Compile++ thumb : angelscript <= as_scriptengine.cpp
...-mthumb...

Others have touched the code since I contributed the initial native ARM calling conventions, so this might have changed, but they were originally not designed to interoperate with Thumb mode. Your problem might be solved simply by making the NDK compile your code in ARM mode, unless you absolutely need Thumb. Just put this in your Android.mk:

LOCAL_ARM_MODE := arm
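
If you want to confirm from inside the code which instruction set a file was actually compiled for, something like the sketch below should work. It relies on the __thumb__ and __arm__ macros that GCC and Clang predefine for 32-bit ARM targets; the function names are hypothetical, and I have not checked this against every NDK toolchain version.

```c
#include <assert.h>
#include <string.h>

/* Reports the instruction set this translation unit was compiled for,
   based on compiler-predefined macros (GCC/Clang). */
static const char *arm_instruction_set(void)
{
#if defined(__thumb__)
    return "thumb";          /* e.g. built with -mthumb */
#elif defined(__arm__)
    return "arm";            /* 32-bit ARM, ARM mode */
#else
    return "not 32-bit ARM"; /* x86, AArch64, etc. */
#endif
}

/* Hypothetical startup check: trips in debug builds if this file
   was compiled in Thumb mode after all. */
static void assert_not_thumb(void)
{
    assert(strcmp(arm_instruction_set(), "thumb") != 0);
}
```

Calling assert_not_thumb() once at startup from the file that contains the native calling convention code is a cheap way to catch a build that silently fell back to -mthumb.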

In Topic: Angelscript on Raspberry Pi

10 January 2013 - 04:55 AM

http://asarmjit.svn....asarmjit/trunk/  (edit: it seems this link doesn't work anymore. Perhaps it's possible to find the project on google? Or maybe quarnster has it somewhere else?)

It lives at https://github.com/quarnster/asarmjit nowadays. It has been untouched since I last worked on it, so it probably needs changes to work at all. I also never got around to implementing a satisfying solution for the floating point operations, so none of those bytecodes were ever implemented.

 

There's also my AOT compiler at https://github.com/quarnster/asaot which IIRC tested successfully on Android, but again has been untouched for a while so might need changes too.

 

Not entirely related to Angelscript, but I also have a small standalone fork of Mozilla's nanojit at https://github.com/quarnster/nanojit/tree/master/nanojit if you want to write your own platform-independent JIT. At the time of the fork the license was MPL1.1/GPL 2.0/LGPL 2.1, but it looks like the TOT version http://hg.mozilla.org/tamarin-redux/file/f5191c18b0e4/nanojit has been updated to MPL2.0. I don't recall exactly what the obligations under MPL1.1 are, but I know that MPL2.0 is not viral, so you can statically link in MPL2.0 code without having to make your whole program's source code available, and you don't need to make the object code (for re-linking) available either. If you make any changes to the MPL2.0 licensed files themselves, you do have to make those source code changes available though.


In Topic: Angelscript on Raspberry Pi

10 December 2012 - 03:56 AM

Long thread so this might have been mentioned already, but I wanted to make it explicit that passing floating point values in floating point registers (aka hard float) isn't a Raspberry Pi specific thing, and you can in fact get a soft-float (floating point values in int registers) Raspberry Pi OS from http://www.raspberrypi.org/downloads

If you make sure never to pass any floats between your program and a precompiled library, you could also (in theory™) use the hard-float calling convention in your programs on Android/iPhone/WinCE by passing the appropriate float ABI flags to the compiler.
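
As a rough way to see which convention a given build ended up with, the sketch below checks the __ARM_PCS_VFP macro, which GCC and Clang define when the hard-float calling convention is in effect (for GCC that is -mfloat-abi=hard, as opposed to -mfloat-abi=softfp or -mfloat-abi=soft). The function names and the idea of comparing against a library's documented ABI are my own assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Reports the float calling convention of this translation unit,
   based on compiler-predefined macros (GCC/Clang). */
static const char *arm_float_abi(void)
{
#if !defined(__arm__)
    return "not 32-bit ARM";
#elif defined(__ARM_PCS_VFP)
    return "hard";  /* floats passed in floating point registers */
#else
    return "soft";  /* floats passed in integer registers (soft or softfp) */
#endif
}

/* Hypothetical guard: `library_abi` is whatever the precompiled library's
   documentation says it was built with ("hard" or "soft"). */
static int float_abi_matches(const char *library_abi)
{
    assert(library_abi != NULL);
    return strcmp(arm_float_abi(), library_abi) == 0;
}
```

If float_abi_matches() returns 0 for a library you link against, any function in that library that takes or returns a float is off limits unless you rebuild one side with matching flags.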
