Ohforf sake

Members
  • Content count

    380

Community Reputation

2052 Excellent

About Ohforf sake

  • Rank
    Member
  1. I'm amazed no one else has pointed this out yet, but does your girlfriend want to drive manual? I prefer manual (my only automatic experience was a rental in America) and I love to drive, but I wouldn't buy a car that my girlfriend refuses to use.   Heh, if only you could explain that to the insurance company. In my car, the ABS has this issue that once it kicks in, it won't let go unless you release the brake. Which means that if you brake over a pothole or railway track, you lose most of your braking force until you have the presence of mind to let go of the brake and hit it again. Which is exactly what ABS is supposed to prevent. It's a VW btw...
  2. Hi everyone,   I have a question concerning the "proper" way of giving credit for CC-BY licensed work.   I would like to use a couple of 3D models from blendswap.com in academic research and publish those results (including the files, which might require slight modifications). Some of those are licensed under CC-BY, which requires giving proper credit to the author. In addition, all sources in publications must be properly cited anyway, so the same issue also holds for e.g. CC-Zero.   My problem is that the files only contain the internet alias of the respective authors. The CC best practices guide seems to indicate that the nickname in combination with a link to the person's website is sufficient. This feels very wrong to me: The idea of giving credit is to, well, give credit to a person. A nickname usually only gives you anonymity. In addition, it is very uncommon in academia to cite a person's work by referring to the person's internet nickname.   How do you handle this in academic or non-academic projects? Do you use the nicknames? Do you email each and every author and ask them for their preferred handle? Do you have a special CC credits section on your project website that you refer to (this is mentioned very briefly in the best practices guide)?   Any ideas or advice are appreciated.
  3. Thank you all for the feedback. ISPC looks interesting, but sadly the code is part of an elaborate template mechanism right now, so ISPC isn't really an option there. But it looks like a tool worth keeping in your toolbox. I was hoping that auto vectorization had progressed further after seeing some pretty impressive vectorizations for ARM-NEON. But given how fragile it is, also in your experience, I guess I'll go back to intrinsics.
  4. Hi everyone, I'm having a hard time getting the GCC auto vectorizer to auto vectorize. I believe that the problem has to do with its ability to figure out the stride/alignment of pointers. Consider the following minimal (not) working example:

         void func(const float *src, float *dst, const float *factors)
         {
             const float * __restrict__ alignedSrc = (const float *)__builtin_assume_aligned(src, 32);
             float * __restrict__ alignedDst = (float *)__builtin_assume_aligned(dst, 32);
             const float * __restrict__ unaliasedFactors = factors;

             enum { NUM_OUTER = 4, NUM_INNER = 32 };

             for (unsigned k = 0; k < NUM_OUTER; k++) {
                 const float factor = unaliasedFactors[k];
                 const float * __restrict__ srcChunk = alignedSrc + k * NUM_INNER;
                 float * __restrict__ dstChunk = alignedDst + k * NUM_INNER;
                 for (int j = 0; j < NUM_INNER; j++)
                     dstChunk[j] = srcChunk[j] * factor;
             }
         }

     It is two nested loops, sequentially looping over an array of size 32*4. It gets four factors and multiplies the first 32 elements by the first factor, the next 32 elements by the second and so on. Results are stored sequentially in an output array. Now, I use "__builtin_assume_aligned" and "__restrict__" to tell the compiler that the arrays are 32 byte aligned and not aliased. This should be prime meat for a vectorizer.
     Sadly, the output looks like this (compiled with -march=native -ffast-math -std=c++14 -O3 on gcc 4.9.2):

         0000000000000000 <_ZN2ml3mlp4funcEPKfPfS2_>:
         0:  4c 8d 54 24 08    lea 0x8(%rsp),%r10
         5:  48 83 e4 e0    and $0xffffffffffffffe0,%rsp
         9:  49 89 f0    mov %rsi,%r8
         c:  41 ff 72 f8    pushq -0x8(%r10)
         10:  55    push %rbp
         11:  48 89 f9    mov %rdi,%rcx
         14:  45 31 c9    xor %r9d,%r9d
         17:  48 89 e5    mov %rsp,%rbp
         1a:  41 56    push %r14
         1c:  41 55    push %r13
         1e:  41 54    push %r12
         20:  41 52    push %r10
         22:  53    push %rbx
         23:  49 8d 40 20    lea 0x20(%r8),%rax
         27:  c5 fa 10 02    vmovss (%rdx),%xmm0
         2b:  48 39 c1    cmp %rax,%rcx
         2e:  73 0d    jae 3d <_ZN2ml3mlp4funcEPKfPfS2_+0x3d>
         30:  48 8d 41 20    lea 0x20(%rcx),%rax
         34:  49 39 c0    cmp %rax,%r8
         37:  0f 82 2b 02 00 00    jb 268 <_ZN2ml3mlp4funcEPKfPfS2_+0x268>
         3d:  48 89 c8    mov %rcx,%rax
         40:  83 e0 1f    and $0x1f,%eax
         43:  48 c1 e8 02    shr $0x2,%rax
         47:  48 f7 d8    neg %rax
         4a:  83 e0 07    and $0x7,%eax
         4d:  0f 84 ed 01 00 00    je 240 <_ZN2ml3mlp4funcEPKfPfS2_+0x240>
         53:  c5 fa 59 09    vmulss (%rcx),%xmm0,%xmm1
         57:  c4 c1 7a 11 08    vmovss %xmm1,(%r8)
         5c:  83 f8 01    cmp $0x1,%eax
         5f:  0f 84 2b 02 00 00    je 290 <_ZN2ml3mlp4funcEPKfPfS2_+0x290>
         65:  c5 fa 59 49 04    vmulss 0x4(%rcx),%xmm0,%xmm1
         6a:  c4 c1 7a 11 48 04    vmovss %xmm1,0x4(%r8)
         70:  83 f8 02    cmp $0x2,%eax
         73:  0f 84 8f 02 00 00    je 308 <_ZN2ml3mlp4funcEPKfPfS2_+0x308>
         79:  c5 fa 59 49 08    vmulss 0x8(%rcx),%xmm0,%xmm1
         7e:  c4 c1 7a 11 48 08    vmovss %xmm1,0x8(%r8)
         84:  83 f8 03    cmp $0x3,%eax
         87:  0f 84 63 02 00 00    je 2f0 <_ZN2ml3mlp4funcEPKfPfS2_+0x2f0>
         8d:  c5 fa 59 49 0c    vmulss 0xc(%rcx),%xmm0,%xmm1
         92:  c4 c1 7a 11 48 0c    vmovss %xmm1,0xc(%r8)
         98:  83 f8 04    cmp $0x4,%eax
         9b:  0f 84 37 02 00 00    je 2d8 <_ZN2ml3mlp4funcEPKfPfS2_+0x2d8>
         a1:  c5 fa 59 49 10    vmulss 0x10(%rcx),%xmm0,%xmm1
         a6:  c4 c1 7a 11 48 10    vmovss %xmm1,0x10(%r8)
         ac:  83 f8 05    cmp $0x5,%eax
         af:  0f 84 0b 02 00 00    je 2c0 <_ZN2ml3mlp4funcEPKfPfS2_+0x2c0>
         b5:  c5 fa 59 49 14    vmulss 0x14(%rcx),%xmm0,%xmm1
         ba:  c4 c1 7a 11 48 14    vmovss %xmm1,0x14(%r8)
         c0:  83 f8 07    cmp $0x7,%eax
         c3:  0f 85 df 01 00 00    jne 2a8 <_ZN2ml3mlp4funcEPKfPfS2_+0x2a8>
         c9:  c5 fa 59 49 18    vmulss 0x18(%rcx),%xmm0,%xmm1
         ce:  41 bb 19 00 00 00    mov $0x19,%r11d
         d4:  41 ba 07 00 00 00    mov $0x7,%r10d
         da:  c4 c1 7a 11 48 18    vmovss %xmm1,0x18(%r8)
         e0:  bb 20 00 00 00    mov $0x20,%ebx
         e5:  41 89 c5    mov %eax,%r13d
         e8:  41 bc 18 00 00 00    mov $0x18,%r12d
         ee:  29 c3    sub %eax,%ebx
         f0:  41 be 03 00 00 00    mov $0x3,%r14d
         f6:  4b 8d 04 a9    lea (%r9,%r13,4),%rax
         fa:  c4 e2 7d 18 c8    vbroadcastss %xmm0,%ymm1
         ff:  4c 8d 2c 07    lea (%rdi,%rax,1),%r13
         103:  48 01 f0    add %rsi,%rax
         106:  c4 c1 74 59 55 00    vmulps 0x0(%r13),%ymm1,%ymm2
         10c:  c5 fc 11 10    vmovups %ymm2,(%rax)
         110:  c4 c1 74 59 55 20    vmulps 0x20(%r13),%ymm1,%ymm2
         116:  c5 fc 11 50 20    vmovups %ymm2,0x20(%rax)
         11b:  c4 c1 74 59 55 40    vmulps 0x40(%r13),%ymm1,%ymm2
         121:  c5 fc 11 50 40    vmovups %ymm2,0x40(%rax)
         126:  41 83 fe 04    cmp $0x4,%r14d
         12a:  75 0b    jne 137 <_ZN2ml3mlp4funcEPKfPfS2_+0x137>
         12c:  c4 c1 74 59 4d 60    vmulps 0x60(%r13),%ymm1,%ymm1
         132:  c5 fc 11 48 60    vmovups %ymm1,0x60(%rax)
         137:  43 8d 04 22    lea (%r10,%r12,1),%eax
         13b:  45 89 da    mov %r11d,%r10d
         13e:  45 29 e2    sub %r12d,%r10d
         141:  44 39 e3    cmp %r12d,%ebx
         144:  0f 84 c5 00 00 00    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         14a:  4c 63 d8    movslq %eax,%r11
         14d:  4f 8d 1c 99    lea (%r9,%r11,4),%r11
         151:  c4 a1 7a 59 0c 1f    vmulss (%rdi,%r11,1),%xmm0,%xmm1
         157:  c4 a1 7a 11 0c 1e    vmovss %xmm1,(%rsi,%r11,1)
         15d:  44 8d 58 01    lea 0x1(%rax),%r11d
         161:  41 83 fa 01    cmp $0x1,%r10d
         165:  0f 84 a4 00 00 00    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         16b:  4d 63 db    movslq %r11d,%r11
         16e:  4f 8d 1c 99    lea (%r9,%r11,4),%r11
         172:  c4 a1 7a 59 0c 1f    vmulss (%rdi,%r11,1),%xmm0,%xmm1
         178:  c4 a1 7a 11 0c 1e    vmovss %xmm1,(%rsi,%r11,1)
         17e:  44 8d 58 02    lea 0x2(%rax),%r11d
         182:  41 83 fa 02    cmp $0x2,%r10d
         186:  0f 84 83 00 00 00    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         18c:  4d 63 db    movslq %r11d,%r11
         18f:  4f 8d 1c 99    lea (%r9,%r11,4),%r11
         193:  c4 a1 7a 59 0c 1f    vmulss (%rdi,%r11,1),%xmm0,%xmm1
         199:  c4 a1 7a 11 0c 1e    vmovss %xmm1,(%rsi,%r11,1)
         19f:  44 8d 58 03    lea 0x3(%rax),%r11d
         1a3:  41 83 fa 03    cmp $0x3,%r10d
         1a7:  74 66    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         1a9:  4d 63 db    movslq %r11d,%r11
         1ac:  4f 8d 1c 99    lea (%r9,%r11,4),%r11
         1b0:  c4 a1 7a 59 0c 1f    vmulss (%rdi,%r11,1),%xmm0,%xmm1
         1b6:  c4 a1 7a 11 0c 1e    vmovss %xmm1,(%rsi,%r11,1)
         1bc:  44 8d 58 04    lea 0x4(%rax),%r11d
         1c0:  41 83 fa 04    cmp $0x4,%r10d
         1c4:  74 49    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         1c6:  4d 63 db    movslq %r11d,%r11
         1c9:  4f 8d 1c 99    lea (%r9,%r11,4),%r11
         1cd:  c4 a1 7a 59 0c 1f    vmulss (%rdi,%r11,1),%xmm0,%xmm1
         1d3:  c4 a1 7a 11 0c 1e    vmovss %xmm1,(%rsi,%r11,1)
         1d9:  44 8d 58 05    lea 0x5(%rax),%r11d
         1dd:  41 83 fa 05    cmp $0x5,%r10d
         1e1:  74 2c    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         1e3:  4d 63 db    movslq %r11d,%r11
         1e6:  83 c0 06    add $0x6,%eax
         1e9:  4f 8d 1c 99    lea (%r9,%r11,4),%r11
         1ed:  c4 a1 7a 59 0c 1f    vmulss (%rdi,%r11,1),%xmm0,%xmm1
         1f3:  c4 a1 7a 11 0c 1e    vmovss %xmm1,(%rsi,%r11,1)
         1f9:  41 83 fa 06    cmp $0x6,%r10d
         1fd:  74 10    je 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         1ff:  48 98    cltq
         201:  49 8d 04 81    lea (%r9,%rax,4),%rax
         205:  c5 fa 59 04 07    vmulss (%rdi,%rax,1),%xmm0,%xmm0
         20a:  c5 fa 11 04 06    vmovss %xmm0,(%rsi,%rax,1)
         20f:  49 83 e9 80    sub $0xffffffffffffff80,%r9
         213:  48 83 c2 04    add $0x4,%rdx
         217:  49 83 e8 80    sub $0xffffffffffffff80,%r8
         21b:  48 83 e9 80    sub $0xffffffffffffff80,%rcx
         21f:  49 81 f9 00 02 00 00    cmp $0x200,%r9
         226:  0f 85 f7 fd ff ff    jne 23 <_ZN2ml3mlp4funcEPKfPfS2_+0x23>
         22c:  c5 f8 77    vzeroupper
         22f:  5b    pop %rbx
         230:  41 5a    pop %r10
         232:  41 5c    pop %r12
         234:  41 5d    pop %r13
         236:  41 5e    pop %r14
         238:  5d    pop %rbp
         239:  49 8d 62 f8    lea -0x8(%r10),%rsp
         23d:  c3    retq
         23e:  66 90    xchg %ax,%ax
         240:  41 bc 20 00 00 00    mov $0x20,%r12d
         246:  41 be 04 00 00 00    mov $0x4,%r14d
         24c:  bb 20 00 00 00    mov $0x20,%ebx
         251:  45 31 ed    xor %r13d,%r13d
         254:  41 bb 20 00 00 00    mov $0x20,%r11d
         25a:  45 31 d2    xor %r10d,%r10d
         25d:  e9 94 fe ff ff    jmpq f6 <_ZN2ml3mlp4funcEPKfPfS2_+0xf6>
         262:  66 0f 1f 44 00 00    nopw 0x0(%rax,%rax,1)
         268:  31 c0    xor %eax,%eax
         26a:  66 0f 1f 44 00 00    nopw 0x0(%rax,%rax,1)
         270:  c5 fa 59 0c 01    vmulss (%rcx,%rax,1),%xmm0,%xmm1
         275:  c4 c1 7a 11 0c 00    vmovss %xmm1,(%r8,%rax,1)
         27b:  48 83 c0 04    add $0x4,%rax
         27f:  48 3d 80 00 00 00    cmp $0x80,%rax
         285:  75 e9    jne 270 <_ZN2ml3mlp4funcEPKfPfS2_+0x270>
         287:  eb 86    jmp 20f <_ZN2ml3mlp4funcEPKfPfS2_+0x20f>
         289:  0f 1f 80 00 00 00 00    nopl 0x0(%rax)
         290:  41 bb 1f 00 00 00    mov $0x1f,%r11d
         296:  41 ba 01 00 00 00    mov $0x1,%r10d
         29c:  e9 3f fe ff ff    jmpq e0 <_ZN2ml3mlp4funcEPKfPfS2_+0xe0>
         2a1:  0f 1f 80 00 00 00 00    nopl 0x0(%rax)
         2a8:  41 bb 1a 00 00 00    mov $0x1a,%r11d
         2ae:  41 ba 06 00 00 00    mov $0x6,%r10d
         2b4:  e9 27 fe ff ff    jmpq e0 <_ZN2ml3mlp4funcEPKfPfS2_+0xe0>
         2b9:  0f 1f 80 00 00 00 00    nopl 0x0(%rax)
         2c0:  41 bb 1b 00 00 00    mov $0x1b,%r11d
         2c6:  41 ba 05 00 00 00    mov $0x5,%r10d
         2cc:  e9 0f fe ff ff    jmpq e0 <_ZN2ml3mlp4funcEPKfPfS2_+0xe0>
         2d1:  0f 1f 80 00 00 00 00    nopl 0x0(%rax)
         2d8:  41 bb 1c 00 00 00    mov $0x1c,%r11d
         2de:  41 ba 04 00 00 00    mov $0x4,%r10d
         2e4:  e9 f7 fd ff ff    jmpq e0 <_ZN2ml3mlp4funcEPKfPfS2_+0xe0>
         2e9:  0f 1f 80 00 00 00 00    nopl 0x0(%rax)
         2f0:  41 bb 1d 00 00 00    mov $0x1d,%r11d
         2f6:  41 ba 03 00 00 00    mov $0x3,%r10d
         2fc:  e9 df fd ff ff    jmpq e0 <_ZN2ml3mlp4funcEPKfPfS2_+0xe0>
         301:  0f 1f 80 00 00 00 00    nopl 0x0(%rax)
         308:  41 bb 1e 00 00 00    mov $0x1e,%r11d
         30e:  41 ba 02 00 00 00    mov $0x2,%r10d
         314:  e9 c7 fd ff ff    jmpq e0 <_ZN2ml3mlp4funcEPKfPfS2_+0xe0>

     There is some vectorization happening there, but most of the code is scalar and looks like some kind of Duff's device.
     I played around with this and found out that the following "hint" produces the output that I want:

         void func(const float *src, float *dst, const float *factors)
         {
             const float * __restrict__ alignedSrc = (const float *)__builtin_assume_aligned(src, 32);
             float * __restrict__ alignedDst = (float *)__builtin_assume_aligned(dst, 32);
             const float * __restrict__ unaliasedFactors = factors;

             enum { NUM_OUTER = 4, NUM_INNER = 32 };

             for (unsigned k = 0; k < NUM_OUTER; k++) {
                 const float factor = unaliasedFactors[k];
                 const float * __restrict__ srcChunk = alignedSrc + k * NUM_INNER;
                 float * __restrict__ dstChunk = alignedDst + k * NUM_INNER;
                 // <HINT>
                 if (NUM_INNER % 8 == 0) { // the gcc tree vectorizer won't recognize this on its own?!?
                     srcChunk = (const float *)__builtin_assume_aligned(srcChunk, 32);
                     dstChunk = (float *)__builtin_assume_aligned(dstChunk, 32);
                 }
                 // </HINT>
                 for (int j = 0; j < NUM_INNER; j++)
                     dstChunk[j] = srcChunk[j] * factor;
             }
         }

         0000000000000000 <_ZN2ml3mlp4funcEPKfPfS2_>:
         0:  48 8d 8f 00 02 00 00    lea 0x200(%rdi),%rcx
         7:  48 8d 46 20    lea 0x20(%rsi),%rax
         b:  c5 fa 10 02    vmovss (%rdx),%xmm0
         f:  48 39 f8    cmp %rdi,%rax
         12:  76 09    jbe 1d <_ZN2ml3mlp4funcEPKfPfS2_+0x1d>
         14:  48 8d 47 20    lea 0x20(%rdi),%rax
         18:  48 39 f0    cmp %rsi,%rax
         1b:  77 43    ja 60 <_ZN2ml3mlp4funcEPKfPfS2_+0x60>
         1d:  c4 e2 7d 18 c0    vbroadcastss %xmm0,%ymm0
         22:  c5 fc 59 0f    vmulps (%rdi),%ymm0,%ymm1
         26:  c5 fc 29 0e    vmovaps %ymm1,(%rsi)
         2a:  c5 fc 59 4f 20    vmulps 0x20(%rdi),%ymm0,%ymm1
         2f:  c5 fc 29 4e 20    vmovaps %ymm1,0x20(%rsi)
         34:  c5 fc 59 4f 40    vmulps 0x40(%rdi),%ymm0,%ymm1
         39:  c5 fc 29 4e 40    vmovaps %ymm1,0x40(%rsi)
         3e:  c5 fc 59 47 60    vmulps 0x60(%rdi),%ymm0,%ymm0
         43:  c5 fc 29 46 60    vmovaps %ymm0,0x60(%rsi)
         48:  48 83 ef 80    sub $0xffffffffffffff80,%rdi
         4c:  48 83 c2 04    add $0x4,%rdx
         50:  48 83 ee 80    sub $0xffffffffffffff80,%rsi
         54:  48 39 cf    cmp %rcx,%rdi
         57:  75 ae    jne 7 <_ZN2ml3mlp4funcEPKfPfS2_+0x7>
         59:  c5 f8 77    vzeroupper
         5c:  c3    retq
         5d:  0f 1f 00    nopl (%rax)
         60:  31 c0    xor %eax,%eax
         62:  66 0f 1f 44 00 00    nopw 0x0(%rax,%rax,1)
         68:  c5 fa 59 0c 07    vmulss (%rdi,%rax,1),%xmm0,%xmm1
         6d:  c5 fa 11 0c 06    vmovss %xmm1,(%rsi,%rax,1)
         72:  48 83 c0 04    add $0x4,%rax
         76:  48 3d 80 00 00 00    cmp $0x80,%rax
         7c:  75 ea    jne 68 <_ZN2ml3mlp4funcEPKfPfS2_+0x68>
         7e:  eb c8    jmp 48 <_ZN2ml3mlp4funcEPKfPfS2_+0x48>

     This is more in line with what I wanted and it is actually twice as fast. In my real code, the speed difference is even bigger. Both versions produce correct output. Note that for NUM_INNER % 8 == 0, alignedSrc + k * NUM_INNER is always 32 byte aligned iff alignedSrc is 32 byte aligned. This is something the compiler should be able to figure out on its own. Or am I missing something here? Do you have any experience with this, or any advice on how to fix it without resorting to lots of hand crafted "hints" throughout the code? Do I really have to provide such alignment hints for every strided access that's happening? Thanks in advance for any help or advice with this.
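The alignment argument in the post above can be checked with plain pointer arithmetic. This is only a sketch: `isAligned`, `allChunksAligned` and `alignedBuffer` are hypothetical helper names that do not appear in the original code, and the `alignas(32)` static array stands in for whatever allocation guarantees the 32 byte alignment in the real project. It assumes a 4 byte `float`.

```cpp
#include <cstddef>
#include <cstdint>

static_assert(sizeof(float) == 4, "the sketch assumes 4 byte floats");

// Hypothetical helper: true if the address of p is a multiple of `alignment` bytes.
inline bool isAligned(const void *p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Checks that every chunk start (base + k * numInner) is 32 byte aligned.
inline bool allChunksAligned(const float *base, unsigned numOuter, unsigned numInner) {
    for (unsigned k = 0; k < numOuter; k++) {
        if (!isAligned(base + k * numInner, 32))
            return false;
    }
    return true;
}

// Hypothetical stand-in for the real data: 4 * 32 floats with 32 byte alignment.
inline const float *alignedBuffer() {
    alignas(32) static float buffer[4 * 32] = {};
    return buffer;
}
```

With a 32 byte aligned base, `allChunksAligned(alignedBuffer(), 4, 32)` holds because each chunk starts 32 * 4 = 128 bytes after the previous one. A chunk length that is not a multiple of 8 floats, for example 4, breaks the property at the second chunk.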
  5. Two months is not a lot of time, especially since you (presumably?) won't be working full time on it.   Anyway, here are two programming heavy ideas off the top of my head:   Global Illumination is kinda hard, especially given your limited time frame and experience. However, this comes to mind: http://codeflow.org/entries/2012/aug/25/webgl-deferred-irradiance-volumes/ It is a neat approach that might be feasible in the two months if you push yourself a bit. There seems to be code if you get stuck and you might be able to come up with some creative improvement.   Another idea I always wanted to implement that involves light, although in an unusual way, would be a content creation tool that helps with texturing models. Usually, triangle meshes are unwrapped and then the textures are painted directly. You could create a tool where instead of directly painting the final texture(s), you set up a couple of projectors around the object, like spot lights which project an image (hence the need for light and shadows). The artists would then paint the images of those projectors and the final model textures would be baked by your tool. To some degree this is already supported in the major modelling packages, but you could enhance it by allowing projectors to mix and combine colors so that reusable dirt or rust decals can be added on top. You could also allow the projectors to affect not only the color textures, but also the textures for the other material parameters. The downside is that such a tool can be rather GUI heavy. The upside is that you can easily "scale" the project according to your progress. E.g. start with a purely non-GUI application that reads the projector positions and mesh from a Blender export, displays a preview, and performs the bake. Then add features like GUI, different projector types, etc. until the two months are over.
  6. Thanks for sharing your code.   However, I do share the sentiment of the others that this can be a bad idea. While it is true that 99% of all players, artists, and programmers won't be able to tell that the normalmaps are broken, they will be able to tell that it looks bad, or at least not "right". Trust me, I've been there, done that. Also on a commercial project.   The real problem comes later though. Once a significant number of materials have broken normalmaps, the spec/gloss maps are adapted to somehow counteract the effect. Then the lighting. All of a sudden you can no longer change individual assets to "good" normalmaps because it would break the entire setup. And before you know it, "bad looking" becomes your new art style that every new asset has to adhere to, because otherwise the game would not look coherent.   If you don't have the time or resources to make actual normalmaps, then not using normalmaps or using funky normalmaps might be the right choice. There are very good looking games out there that aren't photo realistic. But it should be a conscious choice.
  7. For large amounts of data, there are also SIMD intrinsics that can do this: half -> float: _mm_cvtph_ps and _mm256_cvtph_ps float -> half: _mm_cvtps_ph and _mm256_cvtps_ph see https://software.intel.com/sites/landingpage/IntrinsicsGuide/ Oh, I just noticed you aren't doing this on a PC. But some ARM processors support similar conversion functions. See for example: https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html
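For reference, the half -> float direction can also be written as portable scalar code. Note that this is a swapped-in technique, not the SIMD intrinsics named above: it is a minimal sketch that assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits) and a 32 bit IEEE 754 `float`. `halfToFloat` is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar reference: converts one IEEE 754 binary16 value
// (passed as its raw 16 bit pattern) to a 32 bit float.
inline float halfToFloat(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exponent = (h >> 10) & 0x1Fu;
    std::uint32_t mantissa = h & 0x3FFu;
    std::uint32_t bits;

    if (exponent == 0) {
        if (mantissa == 0) {
            bits = sign; // signed zero
        } else {
            // Subnormal half: shift the mantissa up until the implicit
            // leading one appears, adjusting the exponent as we go.
            exponent = 127 - 15 + 1;
            while ((mantissa & 0x400u) == 0) {
                mantissa <<= 1;
                exponent--;
            }
            mantissa &= 0x3FFu; // drop the now implicit leading one
            bits = sign | (exponent << 23) | (mantissa << 13);
        }
    } else if (exponent == 0x1Fu) {
        bits = sign | 0x7F800000u | (mantissa << 13); // infinity or NaN
    } else {
        // Normal number: rebias the exponent from 15 to 127.
        bits = sign | ((exponent + (127 - 15)) << 23) | (mantissa << 13);
    }

    float result;
    std::memcpy(&result, &bits, sizeof(result)); // bit cast without aliasing issues
    return result;
}
```

A loop over this function is what the vector intrinsics replace for large arrays; the scalar version is mostly useful as a fallback or as a reference to test against.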
  8. I'm guessing here that by "decimal" you mean "as text"? If so, this is your problem: ostream::operator<< always outputs formatted text, even if the stream is opened in binary mode. This is a bit braindead, I know... Try:

         #include <cstdint>
         #include <fstream>

         std::uint8_t F = 0b10111001; // btw, the original literal was missing the 0b prefix
         std::ofstream K("C:/Users/WDR/Desktop/kml.enc", std::ios::binary);
         for (int i = 0; i < 256; i++) {
             // write() expects a const char *, so the pointer has to be cast
             K.write(reinterpret_cast<const char *>(&F), sizeof(F));
         }

     You should get a 256 byte long file where every byte is 0b10111001.
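The difference between formatted and unformatted output can be seen without touching the file system. The following is a sketch only, using `std::ostringstream` in place of the file stream; `writeFormatted` and `writeRaw` are hypothetical helper names.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Formatted output: operator<< converts the integer to its decimal text,
// regardless of the std::ios::binary flag.
inline std::string writeFormatted(int value) {
    std::ostringstream out(std::ios::binary);
    out << value;
    return out.str();
}

// Unformatted output: write() copies the raw byte unchanged.
inline std::string writeRaw(std::uint8_t value) {
    std::ostringstream out(std::ios::binary);
    out.write(reinterpret_cast<const char *>(&value), sizeof(value));
    return out.str();
}
```

`writeFormatted(185)` yields the three characters "185", while `writeRaw(0xB9)` yields a single byte with the value 0xB9 (which is 0b10111001).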
  9. Peak performance (in FLoating point OPerations per Second = FLOPS) is the theoretical upper limit on how many computations a device can sustain per second. If a Titan X were doing nothing other than computing 1 + 2 * 3, then it could do that 3 072 000 000 000 times per second, and since there are two operations in there (an addition and a multiplication) this amounts to 6 144 000 000 000 FLOPS or about 6.144 TFLOPS. But you only get that speed if you never read any data or write back any results or do anything else other than a multiply followed by an addition.   A "thread" (and Krohm rightfully warned of its use as a marketing buzzword) is generally understood to be an execution context. If a device executes a program, this refers to the current state, such as the current position in the program, the current values of the local variables, etc.   Threads and peak performance are two entirely different things!   Some compute devices (some Intel CPUs, some AMD CPUs, Sun Niagara CPUs and most GPUs) can store more than one execution context aka "thread" on the chip so that they can interleave the execution of both/all of them. This sometimes falls under the term "hardware threads", at least for CPUs. And this is done for performance reasons. But it does not affect the theoretical peak performance of the device, only how much of that you can actually use. And the direct relationship between the maximum number of hardware threads, the number of hardware threads in use, and the achieved performance ... is very complicated. It depends on lots of different factors like memory throughput, memory latency, access patterns, the actual algorithm, and so on. So if this is what you are asking about, then you might have to look into how GPUs work and how certain algorithms make use of that.
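The arithmetic behind the peak figure can be written out as a small sketch. `peakFlops` is a hypothetical helper. Splitting the 3 072 000 000 000 multiply-adds per second into 3072 execution units running at 1 GHz is an assumption made for illustration; the post itself only states the product. The factor 2 counts one multiplication plus one addition per unit per cycle.

```cpp
#include <cstdint>

// Theoretical peak = execution units * clock rate (Hz) * floating point
// operations each unit can retire per cycle.
constexpr std::uint64_t peakFlops(std::uint64_t units,
                                  std::uint64_t clockHz,
                                  std::uint64_t opsPerCycle) {
    return units * clockHz * opsPerCycle;
}

// Assumed split of the figure from the post: 3072 units at 1 GHz,
// each doing one multiply and one add per cycle.
constexpr std::uint64_t titanXPeak = peakFlops(3072, 1000000000ULL, 2);
static_assert(titanXPeak == 6144000000000ULL, "6.144 TFLOPS");
```

Nothing in this formula mentions threads, which is the point of the post: the number of hardware threads only influences how close real code gets to this bound.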
  10. I'm not very fluent with Java, but I would be surprised if there wasn't a less "probabilistic" approach to playing videos ;-)   Anyhow, about the alpha channel: Have you considered using a second greyscale video stream for the alpha channel? You might have to "blur" the transparency borders of the color stream, similarly to how it's done with e.g. foliage textures. Bitrate would be slightly inferior to a specialized codec that exploits the coherence between alpha and color channels, but you can use pretty much any codec pair with whatever bitrate you choose. You can even use a different bitrate (or resolution) for the alpha channel.
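Once both streams are decoded, recombining them is a per-pixel interleave. The sketch below is written in C++ rather than the Java of the original thread, and `mergeAlpha` is a hypothetical helper. It assumes both frames are already decoded to raw bytes at the same resolution (3 bytes per pixel for color, 1 byte per pixel for grey); a lower resolution alpha stream would need to be scaled up first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: interleaves a decoded RGB frame with a decoded
// greyscale frame so that the grey level becomes the alpha channel.
// Returns an empty vector if the frames do not describe the same pixel count.
inline std::vector<std::uint8_t> mergeAlpha(const std::vector<std::uint8_t> &rgb,
                                            const std::vector<std::uint8_t> &grey) {
    std::vector<std::uint8_t> rgba;
    if (rgb.size() != grey.size() * 3)
        return rgba;

    rgba.reserve(grey.size() * 4);
    for (std::size_t i = 0; i < grey.size(); i++) {
        rgba.push_back(rgb[3 * i + 0]); // red
        rgba.push_back(rgb[3 * i + 1]); // green
        rgba.push_back(rgb[3 * i + 2]); // blue
        rgba.push_back(grey[i]);        // alpha, taken from the second stream
    }
    return rgba;
}
```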
  11. This is actually quite specific: http://insight-labs.org/?p=1682 Though there is something I don't get: As frob pointed out, one result of this is usually that the offending IP ranges are blocked and remain blocked for some time. Which, presumably, is exactly what they want: No access to those two GitHub projects from within China. But if that was their goal, why would they design the attack in a way that all the requests originate from outside of China?
  12. Is it possible that you instantiate "TextField" before you initialize SDL? SDL might reset the Unicode behavior upon initialization.
  13. That's awesome! My first thought was "Nice render!!" Hmmm, needs a different lighting model though, the Phong shading looks way too much like plastic.
  14. Just to clarify, this is what can happen when the heuristic is (extremely) pessimistic: [attachment=26575:nonAdmissible.png]   Usually a small search space, but sometimes a non-optimal solution.
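The effect can be reproduced on a tiny graph. The following is a minimal sketch of a textbook A* search, not the code from the original thread; `Edge`, `aStarCost` and `exampleGraph` are hypothetical names and the four-node graph is made up for illustration. Node 0 is the start, node 3 the goal; the path 0 -> 1 -> 3 costs 2 and the path 0 -> 2 -> 3 costs 3.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge {
    int to;
    int cost;
};

// Textbook A*: returns the path cost at the moment the goal is first taken
// from the open list, or -1 if the goal is unreachable.
inline int aStarCost(const std::vector<std::vector<Edge>> &graph,
                     const std::vector<int> &h, int start, int goal) {
    const int inf = std::numeric_limits<int>::max();
    std::vector<int> g(graph.size(), inf);
    using Entry = std::pair<int, int>; // (f = g + h, node)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    g[start] = 0;
    open.push(Entry(h[start], start));
    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const int node = top.second;
        if (node == goal)
            return g[node];
        if (top.first > g[node] + h[node])
            continue; // stale entry, a cheaper route to this node was found later
        for (const Edge &e : graph[node]) {
            const int candidate = g[node] + e.cost;
            if (candidate < g[e.to]) {
                g[e.to] = candidate;
                open.push(Entry(candidate + h[e.to], e.to));
            }
        }
    }
    return -1;
}

// Made-up example graph: 0 -> 1 -> 3 costs 1 + 1, 0 -> 2 -> 3 costs 1 + 2.
inline std::vector<std::vector<Edge>> exampleGraph() {
    std::vector<std::vector<Edge>> graph(4);
    graph[0] = {Edge{1, 1}, Edge{2, 1}};
    graph[1] = {Edge{3, 1}};
    graph[2] = {Edge{3, 2}};
    return graph;
}
```

With the admissible heuristic `{0, 0, 0, 0}` the search returns the optimal cost 2. With `{0, 10, 0, 0}`, which overestimates the remaining cost from node 1, node 1 is never expanded before the goal is reached through node 2, so the search expands fewer nodes but returns the non-optimal cost 3.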