I'm not sure why you need that setting with physics, but let me explain what it does:
When you allocate memory, it always lives at some virtual address (which is backed by some physical memory page while you're working with it; the page can be swapped in and out between physical memory and the hard drive). These virtual pages are typically 4 KiB in size (because 4 KiB is the base physical page size on x86), or some multiple of 4 KiB - that is just FYI.
Now, every modern CPU has a so-called FPU and SIMD unit. That is the Floating Point Unit and Single Instruction, Multiple Data: you pack multiple values into a single register and perform one operation on all of them (the well-known case is SSE with 4x float per register ~ a 4D vector).
Let's continue (I will describe only the SIMD details, as I don't remember the FPU specifics by heart). When you're reading data from memory into a register, there is one instruction that quickly loads data from memory into a SIMD register, and one that quickly stores it back - their opcodes are 0F 28 and 0F 29, written in assembly as MOVAPS. These instructions load or store 16 bytes at once from a memory address (or another register), with just one condition: the memory address must be aligned on a 16-byte boundary (physically!).
This could become a bit problematic with virtual memory, but there is one nice property we can rely on: every virtual page is mapped onto a physical page as a whole, and every page begins on a 16-byte boundary (in fact on a 4 KiB boundary). So when a virtual address X is 16-byte aligned (X mod 16 is 0), the physical address used during the actual computation is definitely 16-byte aligned as well.
Now, what you as a programmer need to know: all allocations and deallocations of such data (both stack- and heap-based) must be performed on 16-byte boundaries ~ i.e. each such address must be equal to 0 after a 'mod 16' operation. There are OS-specific functions to handle heap allocation correctly (_aligned_malloc on Windows, posix_memalign on POSIX-based systems, etc.); stack-based allocations must be annotated for the compiler (using __declspec(align(16)) under MSVC or __attribute__((aligned(16))) under GCC).
The same applies to double and long long (although they go on an 8-byte boundary, not a 16-byte one ... FYI, there are also 32-byte registers, and on some CPU architectures even larger ones). The compiler flag you mentioned, -malign-double, forces all double and long long values to be aligned on an 8-byte boundary, so the compiler can use the aligned variants of the load/store instructions, resulting in better performance.
Nothing is free, though - if you have a structure containing an 8-byte double (aligned on an 8-byte boundary) followed by a single byte, the compiler has to add 7 unused bytes of padding (i.e. in general your memory usage can increase).
My apologies if I went a bit too deep into the hardware ~ but I wanted to explain the concepts behind why it is like this.
EDIT: So in general it isn't that bad (it can actually be good and yield better performance), but there can be trouble (and crashes) when you use memory alignment without understanding the concepts behind it.