# new[] is flawed?

Hey guys, I have a strange problem and I don't know how best to word this.

At my work we have a custom memory manager. It works just fine, but the man who wrote it says that new[] cannot be used because of underlying issues regarding alignment, he thinks. It's been a while since he wrote this code and he doesn't quite remember.

But when I was writing code the other day, using new[], I found an issue: our memory manager would return a pointer from the overloaded operator new, but the pointer I received from the actual new[] expression was four bytes off. Let me give you some pseudocode to better describe this.

void* operator new( std::size_t size )
{
    void* memory = m_MemoryManager->AllocateSize( size ); // Let's say the pointer was 4
    return memory; // Pointer is still 4
}

...

x* stuff = new x[ 4 ]; // stuff will now be 8.


The weird thing is that the offset seems to be applied as the call returns, somewhere between my operator new and the line where "stuff" is assigned.

I imagine this mystery code is where the compiler actually invokes the constructor for each element of the array, but why would it alter the address?

Can someone explain like I'm five what is actually going on here?

(This is not in a multi-threaded environment, so nothing should be altering the pointer from underneath me.)

##### Share on other sites

Hmm, I think I remember what you're talking about, something about knowing how many objects to call the destructor for. The strange thing was that it wouldn't do it on some pointers, but would on others. I just wanted to make sure there wasn't a way to solve this issue, because I really like having arrays of objects. It's so nice for cache coherency!
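For illustration, here is a minimal sketch of why the runtime needs that count: `delete[]` only receives a pointer, yet it has to run one destructor per element. `Tracked` and `destructors_run_by_array_delete` are names made up for this example and are not part of anyone's code in this thread.

```cpp
#include <cstddef>

// Hypothetical element type that counts how many times its destructor runs.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }  // non-trivial destructor
};
int Tracked::destroyed = 0;

// Allocates `count` elements with new[], releases them with delete[],
// and returns how many destructors delete[] ran.
int destructors_run_by_array_delete(std::size_t count) {
    Tracked::destroyed = 0;
    Tracked* items = new Tracked[count];
    delete[] items;  // no element count is passed here
    return Tracked::destroyed;
}
```

Calling `destructors_run_by_array_delete(4)` should report four destructor calls, even though `delete[]` was never told the array length. The length has to be remembered somewhere, and a few hidden bytes in front of the array is one common place for it.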

##### Share on other sites

That would make sense. I'd have loved for some consistency so I could have accounted for it though...

##### Share on other sites

If you are allocating 4 ints, and the pointer moves from, say, 0x04 to 0x08, doesn't this mean that your 'size' variable is also getting changed, to add 4 extra bytes?
So there's nothing you really need to take account for, since it is automatically handled.

Here's a quick test to be sure:

#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <new>

// Replaces the global operator new. The default operator new[] forwards here,
// so array allocations are logged as well.
void* operator new( std::size_t size )
{
    std::cout << "The size 'new' is actually asking for: " << size << " bytes." << std::endl;
    void* memory = std::malloc( size ); // Let's say the pointer was 4
    if( memory == nullptr )
        throw std::bad_alloc();

    std::cout << "Original address: " << memory << std::endl;

    return memory; // Pointer is still 4
}

// Matching replacement, so memory obtained from malloc() is released with free().
void operator delete( void* memory ) noexcept
{
    std::free( memory );
}

class Object
{
public:
    Object() = default;
    virtual ~Object() = default; //Virtual, to give it a vtable, to ensure it's non-POD.

    int meow = 357;
};

struct POD
{
    std::uint64_t A;
    std::uint64_t B;
};

int main()
{
    const int NumObjectsToAllocate = 4;
    std::cout << "Sizeof 'Object': " << sizeof(Object) << std::endl;
    std::cout << "Allocating " << NumObjectsToAllocate
              << " Objects should take " << (NumObjectsToAllocate * sizeof(Object))
              << " bytes." << std::endl;

    Object *stuff = new Object[ NumObjectsToAllocate ];

    std::cout << "Resulting address: " << stuff << std::endl;

    delete[] stuff;

    std::cout << "---------------------------------------" << std::endl;

    const int NumOfFloatsToAllocate = 4;
    std::cout << "Sizeof 'float': " << sizeof(float) << std::endl;
    std::cout << "Allocating " << NumOfFloatsToAllocate
              << " floats should take " << (NumOfFloatsToAllocate * sizeof(float))
              << " bytes." << std::endl;

    float *floats = new float[ NumOfFloatsToAllocate ];

    std::cout << "Resulting address: " << floats << std::endl;

    delete[] floats;

    std::cout << "---------------------------------------" << std::endl;

    const int NumOfPodsToAllocate = 4;
    std::cout << "Sizeof 'Pod': " << sizeof(POD) << std::endl;
    std::cout << "Allocating " << NumOfPodsToAllocate
              << " pods should take " << (NumOfPodsToAllocate * sizeof(POD))
              << " bytes." << std::endl;

    POD *pods = new POD[ NumOfPodsToAllocate ];

    std::cout << "Resulting address: " << pods << std::endl;

    delete[] pods;

    return 0;
}


Results (with this compiler):
Sizeof 'Object': 8
Allocating 4 Objects should take 32 bytes.
The size 'new' is actually asking for: 36 bytes.
---------------------------------------
Sizeof 'float': 4
Allocating 4 floats should take 16 bytes.
The size 'new' is actually asking for: 16 bytes.
---------------------------------------
Sizeof 'Pod': 16
Allocating 4 pods should take 64 bytes.
The size 'new' is actually asking for: 64 bytes.

So it's not just changing the pointer address, but it's also adding 4 bytes to the total size allocated. There's nothing you need to take into account.
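If you'd rather check that relationship programmatically than read it off console output, something like the sketch below might do. It records what the replaced `operator new[]` was asked for and compares it with what the new-expression handed back. The C++ standard describes the array overhead as an unspecified, non-negative amount, and says the resulting pointer is offset from the allocation by that same amount, so the two numbers should match on any conforming compiler. The actual values are implementation-specific. All names here (`g_last_size`, `g_last_raw`, `Overhead`, `measure_array_overhead`, `WithDestructor`) are invented for the example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Bookkeeping for the most recent operator new[] call (hypothetical globals).
static std::size_t g_last_size = 0;
static void* g_last_raw = nullptr;

void* operator new[](std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (p == nullptr) throw std::bad_alloc();
    g_last_size = size;  // what the compiler actually asked for
    g_last_raw = p;      // what the allocator actually returned
    return p;
}

void operator delete[](void* p) noexcept { std::free(p); }

struct Overhead {
    std::size_t extra_bytes;  // requested size minus count * sizeof(T)
    std::ptrdiff_t offset;    // array address minus address from operator new[]
};

// Allocates an array of T and reports the hidden overhead for that allocation.
template <typename T>
Overhead measure_array_overhead(std::size_t count) {
    T* array = new T[count];
    Overhead result{
        g_last_size - count * sizeof(T),
        reinterpret_cast<char*>(array) - static_cast<char*>(g_last_raw)};
    delete[] array;
    return result;
}

// Element type with a non-trivial destructor, like 'Object' above.
struct WithDestructor {
    int value = 357;
    ~WithDestructor() {}
};
```

Comparing `measure_array_overhead<WithDestructor>(4)` with `measure_array_overhead<float>(4)` should show the same pattern as the output above: whatever extra bytes are requested, the returned pointer is shifted by exactly that many.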

I'm purely speculating here... but since (with this specific compiler) POD structs also don't require any extra bytes, I wonder if the 4 extra bytes for non-POD types is actually a pointer to the originally allocated type's destructor?

Imagine:

Base *objects = new Derived[10]; //All these are guaranteed to be the same type: Derived.
delete[] objects; //So they should all use the same destructor: Derived's destructor (if Base's destructor was virtual).

The compiler would know that 'objects' can be treated as type 'Base', but for destruction purposes might not realize that they are actually 'Derived', so maybe the first 4 bytes (sizeof a pointer-to-func on a 32 bit machine) are pointing at the correct destructor to call.

[/end-of-amateur-speculation-from-someone-who-doesn't-know-assembly-or-the-inner-workings-of-compilers]

But regardless of what the compiler is using it for, it's taken care of for you, so there is nothing you need to manually take into account. Yes, it's allocating some extra bytes, but at the same time it's offsetting the pointer, and upon destruction it's destroying the correct number of bytes. There's no problem, unless you needed something to be guaranteed to be at a specific address in memory because of some esoteric hardware architecture - and in that really obscure, really unusual circumstance, then that's when you use malloc() directly.

##### Share on other sites

A reasonable speculation, SotL, but you forget one critical detail: if the four bytes were indeed a destructor pointer, they would (A) need to be 8 bytes on 64-bit platforms and (B) obviate the need for virtual destructors when using arrays, which is definitely not the case.

##### Share on other sites

Why don't they store the housekeeping data preceding the allocation then, making sure the returned pointer is aligned? Operator delete[] could find the housekeeping information based on the pointer passed to it, via the magic of subtraction.

Seems a no-brainer to me... (maybe the implementation was implemented before alignment became a major issue though, and it is retained for backwards compatibility).

##### Share on other sites

> Why don't they store the housekeeping data preceding the allocation then, making sure the returned pointer is aligned? Operator delete[] could find the housekeeping information based on the pointer passed to it, via the magic of subtraction.

Because you don't know that the address there is writable.

##### Share on other sites

The OS does, though, unless you use a placement form. Is that the reason, then?

##### Share on other sites

I think you're making some very unsound assumptions about generalities of memory allocation schemes. Why would the C++ implementation be able to assume that the memory directly preceding that returned from an allocator be writable? If the underlying allocator grabs whole pages at a time from the OS that preceding memory could very well be a page with the write permission disabled.

##### Share on other sites

Maybe it's ASLR? But I don't know if its effects would be detectable in a debugger like in your case.

See this compiler option: http://msdn.microsoft.com/en-us/library/bb384887.aspx

And I think in Windows 7 ASLR is always used, even for programs compiled without that option?

##### Share on other sites

The new[] implementation should ask for more memory than required (it already does, it asks for the housekeeping data amount extra), but it should ask for enough to make sure the address minus the housekeeping data size is aligned, then return the address which is correctly aligned, beyond the housekeeping data.

EDIT: I realise this method could require an entire (aligned size - housekeeping size) amount of memory to be wasted. What is probably needed is a general purpose allocator that can handle requests such as "give me x bytes at an address p of your choosing, but I want the address (p+4) to be aligned on a 16 byte boundary".

EDIT2: Corrected the EDIT.

##### Share on other sites

Where does it say that existing C++ implementations aren't already doing that?

##### Share on other sites

Evidence (maybe anecdotal!) from this thread... getting back unaligned requests for arrays from new[].

Aligned malloc usually only takes a size and an alignment restriction as well. It probably needs another constraint as an argument (i.e. the offset from the base address returned which needs to be aligned).
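As a rough sketch of what such an interface could look like, the hypothetical pair below takes a size, an alignment, and the offset that should end up aligned. `offset_aligned_alloc` and `offset_aligned_free` are invented names, not an existing API. The sketch over-allocates with `malloc`, stashes the original pointer just in front of the returned block, and assumes that `alignment` is a power of two and that `std::uintptr_t` maps addresses linearly, which holds on mainstream platforms but is not guaranteed by the standard.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not a standard API. Returns a pointer p such that
// (p + offset) is a multiple of `alignment` (assumed to be a power of two).
// Size overflow is not checked in this sketch.
void* offset_aligned_alloc(std::size_t size, std::size_t alignment, std::size_t offset) {
    const std::size_t header = sizeof(void*);  // room to remember the raw pointer
    char* raw = static_cast<char*>(std::malloc(header + alignment + size));
    if (raw == nullptr) return nullptr;

    // Earliest usable address: just past the header.
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + header;
    // Round (start + offset) up to the next multiple of alignment.
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t target = (start + offset + mask) & ~mask;
    // Padding (always less than alignment) so that (user + offset) == target.
    const std::size_t padding = static_cast<std::size_t>(target - offset - start);

    char* user = raw + header + padding;
    void* raw_void = raw;
    // memcpy avoids assuming that (user - header) is aligned for a void*.
    std::memcpy(user - header, &raw_void, sizeof raw_void);
    return user;
}

// Releases a block obtained from offset_aligned_alloc.
void offset_aligned_free(void* p) {
    if (p == nullptr) return;
    void* raw = nullptr;
    std::memcpy(&raw, static_cast<char*>(p) - sizeof(void*), sizeof raw);
    std::free(raw);
}
```

With this, `offset_aligned_alloc(64, 16, 4)` would return an address p where p+4 sits on a 16 byte boundary, at the cost of up to `alignment` bytes of padding plus one pointer of housekeeping per allocation.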

##### Share on other sites

> Maybe it's ASLR? But I don't know if its effects would be detectable in a debugger like in your case.
>
> See this compiler option: http://msdn.microsoft.com/en-us/library/bb384887.aspx
>
> And I think in Windows 7 ASLR is always used, even for programs compiled without that option?

ASLR has nothing to do with this. It's purely a function of memory obtained from the OS which is going to be a level lower than memory obtained from the language runtime (or a custom allocator).

##### Share on other sites

> Evidence (maybe anecdotal!) from this thread... getting back unaligned requests for arrays from new[].
>
> Aligned malloc usually only takes a size and an alignment restriction as well. It probably needs another constraint as an argument (i.e. the offset from the base address returned which needs to be aligned).

Nothing presented so far in this thread has shown an allocation not suitable for the data it was allocated for.

##### Share on other sites

Yeah, that's why I added "maybe anecdotal" ;)

##### Share on other sites

> Evidence from this thread... getting back unaligned requests for arrays from new[].

The only evidence in this thread is SotL's post and all the memory addresses are properly aligned for the types involved. That's not even anecdotal evidence since what you claim to be seeing hasn't actually shown up.
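To make "properly aligned for the types involved" concrete, a check along the following lines could be run against any allocator. It's only a sketch: `is_suitably_aligned` and `array_new_is_aligned` are invented helper names, and `Object` and `POD` are re-declared here to mirror SotL's test types.

```cpp
#include <cstddef>
#include <cstdint>

// True when p is a multiple of the alignment that T requires.
template <typename T>
bool is_suitably_aligned(const T* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Mirrors the non-POD type from the earlier test program.
struct Object {
    virtual ~Object() = default;
    int meow = 357;
};

// Mirrors the POD type from the earlier test program.
struct POD {
    std::uint64_t A;
    std::uint64_t B;
};

// Allocates an array with new[] and reports whether the pointer handed back
// by the new-expression (after any hidden offset) is aligned for T.
template <typename T>
bool array_new_is_aligned(std::size_t count) {
    T* items = new T[count];
    const bool ok = is_suitably_aligned(items);
    delete[] items;
    return ok;
}
```

If a custom allocator really did break alignment for arrays, a check like `array_new_is_aligned<Object>(4)` is where it would show up. Note that it only tests against `alignof(T)`, not against a stricter requirement such as 16 bytes for SIMD data.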

##### Share on other sites

> At my work we have a custom memory manager. It works just fine, but the man who wrote it says that new[] cannot be used because of underlying issues regarding alignment, he thinks. It's been a while since he wrote this code and he doesn't quite remember.

There ya go... anecdotal evidence right there in the OP! His colleague thinks he had an anecdote but maybe he didn't quite remember!

I'm not trying to start an argument... I just said how I think it should work (in an ideal world), you jumped on it, then I said what it should do implementation wise to ensure it can cope, and you said how do I know it doesn't do it anyway! Time for more beers I think!!!

EDIT: Lulz I've annoyed someone ;) As I said, not trying to start an argument...

##### Share on other sites

> > Maybe it's ASLR? But I don't know if its effects would be detectable in a debugger like in your case.
> >
> > See this compiler option: http://msdn.microsoft.com/en-us/library/bb384887.aspx
> >
> > And I think in Windows 7 ASLR is always used, even for programs compiled without that option?
>
> ASLR has nothing to do with this. It's purely a function of memory obtained from the OS which is going to be a level lower than memory obtained from the language runtime (or a custom allocator).

Also, not completely sure, but some sources say that ASLR does randomization at the virtual address space level of a process, more specifically - it randomizes image, stack and heap addresses. I'm not sure what you mean by "from the OS", but last I checked, the CRT allocation operators in C++ use heap functions for memory allocation (AFAIK, the operator new() allocates memory with HeapAlloc). I also don't understand why you think that "the language runtime" has its own memory-manager implementation.

But I was wrong about not being able to disable ASLR on Windows 7 - Process Explorer has a column that shows the state of ASLR for each process.

##### Share on other sites

The use of HeapAlloc() is entirely implementation-dependent and not mandatory. In fact, it is perfectly legitimate - even common - for C or C++ runtimes to have their own allocators running between the OS and the application.

You also missed the other important factor, which is that we're talking about a custom allocator here, which by definition is a layer of indirection between whatever the runtime offers (whether that goes straight to the OS or not) and the application itself.

##### Share on other sites

> The use of HeapAlloc() is entirely implementation-dependent and not mandatory. In fact, it is perfectly legitimate - even common - for C or C++ runtimes to have their own allocators running between the OS and the application.
>
> You also missed the other important factor, which is that we're talking about a custom allocator here, which by definition is a layer of indirection between whatever the runtime offers (whether that goes straight to the OS or not) and the application itself.

Unless his custom allocator (or the runtime allocator) uses virtual memory directly, or uses non-virtual memory, the memory it hands out will come from a process heap.

Anyway, I have no idea how ASLR works, but Googling it, I did find mentions that the randomized addresses it produces can be seen in WinDbg...

So I think it should also be visible in Visual Studio.

Edited by tonemgub

##### Share on other sites

Yes, you can recognize the presence of ASLR in a debugger. That should be pretty obvious.

What ASLR does **not** do is modify addresses that the runtime/application layer already have access to. That would just be insanity. You will get random addresses **from the OS** but any memory management implemented on top of that is immune.
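A toy illustration of that layering, under the assumption that the custom allocator is a simple bump allocator over one block: the block's base address may differ from run to run, but every address the allocator hands out is just base plus a deterministic offset. `Arena` and its members are hypothetical names, and this is not how the OP's memory manager is known to work.

```cpp
#include <cstddef>

// Minimal bump allocator over a caller-supplied block (hypothetical sketch).
class Arena {
public:
    Arena(void* base, std::size_t capacity)
        : base_(static_cast<char*>(base)), capacity_(capacity) {}

    // Rounds each request up to a multiple of 16 bytes.
    // Returns nullptr when the block is exhausted. Overflow of size is not checked.
    void* allocate(std::size_t size) {
        const std::size_t rounded = (size + 15) & ~static_cast<std::size_t>(15);
        if (rounded > capacity_ - used_) return nullptr;
        void* p = base_ + used_;
        used_ += rounded;
        return p;
    }

    // Distance of p from the start of the block.
    std::ptrdiff_t offset_of(const void* p) const {
        return static_cast<const char*>(p) - base_;
    }

private:
    char* base_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};
```

Two arenas built over blocks at different base addresses should hand out the same sequence of offsets for the same sequence of requests. Only the base can be randomized, and nothing rewrites a pointer after the allocator has returned it, so ASLR cannot explain a four byte shift between `operator new` and the new-expression.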

##### Share on other sites

I thought that's exactly what it does. :)

Edit:
Never mind, and sorry if I caused any confusion... I was confused a bit myself. I read more about it, and it seems that ASLR does not modify virtual addresses at run-time, only at image load-time (or for the heap, at heap-creation time).

So the Wikipedia page seems to be a bit wrong in this case. :)

I mean this: "Entropy is increased by either raising the amount of virtual memory area space over which the randomization occurs or reducing the period over which the randomization occurs. The period is typically implemented as small as possible, so most systems must increase VMA space randomization."