CPU registers, structs and function pointers

First of all.. let's post some source:
/* #pragma pseudocode */
#include <stdlib.h>  /* malloc */
typedef void (*ARRAY_PUSH)(void*);

typedef struct _Node
{
  struct _Node  *next;
  struct _Node  *prev;
  void          *data;
}NODE, *PNODE;

typedef struct _Array
{
  ARRAY_PUSH  push_back;
  ARRAY_PUSH  push_front;
  PNODE       head;
  PNODE       tail;
}ARRAY, *PARRAY;

void array_pushb(void *data)
{
  PARRAY array;
  PNODE  newnode;
  
  /* And here's some inline assembler     */
  _asm("movl  %edi, %array");
  
  array->head->next = malloc(sizeof(NODE));
  /* Yadda, yadda, yadda */
}

PARRAY array_init()
{
  PARRAY array = malloc(sizeof(ARRAY));

  array->push_back = array_pushb;
  /* More init code here... */

  return array;
}
It's easy to write a linked list (this is an indexed doubly linked list with a twist), but I want to be able to call it with C++-like syntax (and NO! I want to do this in C!):
PARRAY mesh_array = array_init();

mesh_array->push_back(ONE_MILLION_POLLIES_MESH_WITH_BIG_BOOBS);
PMESH mesh = (PMESH)mesh_array->get(index);
  
In Lcc-Win32 this is extremely easy, because it keeps the address of the struct whose function pointer is being called in the edi register, so it's a simple thing to extract it and make a fake 'this' pointer. But I've been trying to figure out how to do the same trick with GCC (and more lately the Visual Toolkit), as I guess those compilers might be able to outrun Lcc's optimiz.. optimizations... opti..WAAHHH! Don't know how to spell that, you gotta figure that one out yourself. I've read the GCC online docs, but I might be stupid as hell, because I can't figure out how to do this. Can anyone point me in the right direction?

"For every complex problem there is an answer that is clear, simple, and wrong." H L Mencken

[edited by - Rickmeister on May 31, 2004 7:51:52 PM]

Hmm I am not certain of your motivations.

"optimizations"
Well, if this bothers you, then you'd better not call malloc for every node created. Use a dedicated pool of nodes; that is typically far faster.
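A rough sketch of the pooling idea, assuming a fixed capacity and a free list threaded through each node's next pointer. The names NodePool, pool_init, pool_alloc, pool_free and pool_destroy are hypothetical and do not come from the original post; no speed-up is measured here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct _Node
{
  struct _Node *next;
  struct _Node *prev;
  void         *data;
} NODE, *PNODE;

/* Hypothetical fixed-size pool: one malloc up front, then O(1) alloc/free. */
typedef struct
{
  PNODE storage;   /* the single block holding every node        */
  PNODE free_list; /* unused nodes, chained through their 'next' */
} NodePool;

/* Returns 0 on success, -1 if the block could not be allocated. */
int pool_init(NodePool *pool, size_t capacity)
{
  size_t i;

  assert(pool != NULL && capacity > 0);
  pool->storage = malloc(capacity * sizeof(NODE));
  if (pool->storage == NULL)
    return -1;

  /* Chain every node onto the free list. */
  for (i = 0; i + 1 < capacity; ++i)
    pool->storage[i].next = &pool->storage[i + 1];
  pool->storage[capacity - 1].next = NULL;
  pool->free_list = pool->storage;
  return 0;
}

/* Pops a node off the free list; returns NULL when the pool is exhausted. */
PNODE pool_alloc(NodePool *pool)
{
  PNODE node = pool->free_list;

  if (node != NULL)
  {
    pool->free_list = node->next;
    node->next = node->prev = NULL;
    node->data = NULL;
  }
  return node;
}

/* Pushes a node back onto the free list so it can be reused. */
void pool_free(NodePool *pool, PNODE node)
{
  node->next = pool->free_list;
  pool->free_list = node;
}

/* Releases the single block that backs the pool. */
void pool_destroy(NodePool *pool)
{
  free(pool->storage);
  pool->storage = NULL;
  pool->free_list = NULL;
}
```

The list code would call pool_alloc where it currently calls malloc(sizeof(NODE)). A growable pool would need a list of blocks instead of a single one.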

"pseudo this"
I don't think you have any way of simulating exactly the C++ syntax for virtual members in a portable and stable way. I am almost certain you can't use this edi trick in Visual C++. To do it cleanly, either pass the object pointer explicitly to the function, or hide that extra argument behind a macro:

array_push_back(mesh_array,
ONE_MILLION_POLLIES_MESH_WITH_BIG_BOOBS);
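A minimal sketch of how such a macro could be wired up, assuming the callback signature is changed to take the instance as its first argument. The int return value, the self parameter name and array_free are assumptions for illustration; only array_push_back, array_pushb and array_init are names from the thread.

```c
#include <stddef.h>
#include <stdlib.h>

struct _Array;

/* Unlike the original typedef, the callback receives the instance explicitly. */
typedef int (*ARRAY_PUSH)(struct _Array *self, void *data);

typedef struct _Node
{
  struct _Node *next;
  struct _Node *prev;
  void         *data;
} NODE, *PNODE;

typedef struct _Array
{
  ARRAY_PUSH push_back;
  PNODE      head;
  PNODE      tail;
} ARRAY, *PARRAY;

/* Appends a node at the tail; returns 0 on success, -1 if malloc fails. */
static int array_pushb(PARRAY self, void *data)
{
  PNODE node = malloc(sizeof(NODE));

  if (node == NULL)
    return -1;
  node->data = data;
  node->next = NULL;
  node->prev = self->tail;
  if (self->tail != NULL)
    self->tail->next = node;
  else
    self->head = node;
  self->tail = node;
  return 0;
}

PARRAY array_init(void)
{
  PARRAY array = malloc(sizeof(ARRAY));

  if (array == NULL)
    return NULL;
  array->push_back = array_pushb;
  array->head = array->tail = NULL;
  return array;
}

/* Hypothetical cleanup helper: frees the nodes, not the data they point to. */
void array_free(PARRAY array)
{
  PNODE node = array->head;

  while (node != NULL)
  {
    PNODE next = node->next;
    free(node);
    node = next;
  }
  free(array);
}

/* The macro supplies the 'this' pointer. It evaluates 'a' twice,
   so don't pass an expression with side effects. */
#define array_push_back(a, d) ((a)->push_back((a), (d)))
```

The call site then reads array_push_back(mesh_array, mesh), which is portable across Lcc, GCC and Visual C++ because it relies on nothing but ordinary argument passing.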

Why do you want to do this anyway? It will most likely confuse anyone reading your program. I would guess GCC would also put the structure pointer in the edi register so this may work (well actually it may not and after optimizations, which is what you want, I doubt it'll use the same register every time), but it's an ugly hack, do you really need it?

[edited by - Monder on June 1, 2004 12:22:41 PM]

Guest Anonymous Poster
quote:
Original post by Charles B
...not call malloc...
Really good point. If there's anything you should bear in mind when coding (besides good data structures and algorithms) it's to avoid frequent allocation/deallocation of memory.

I've been assisting a lot of customers with performance problems in various projects (mostly larger server-side applications, quite large installations). I'd say that a third of the performance issues are caused by naive memory allocations.

In case anyone wonders, the other major problems have been poor understanding of multi-threading, taking locks over too large sections of code, hotspots around locks causing serialization of all threads (meaning one burning hot and seven idle CPUs). The last performance issue is related to databases, either bad db design, bad db implementation or stupid db usage causing way too many unnecessary requests to the db.


I have yet to see a bottleneck caused by a compiler not optimizing something into a register.

Guest Anonymous Poster
Oh, by the way... Sometimes, or even quite often, the people having these performance problems also have very "clever" optimizations, such as writing single-line statements doing a thousand things, writing their own clever lists rather than using STL, adding comments about needing to optimize certain routines in assembly, etc. etc...

quote:

Oh, by the way... Sometimes, or even quite often, the people having these performance problems also have very "clever" optimizations, such as writing single-line statements doing a thousand things, writing their own clever lists rather than using STL, adding comments about needing to optimize certain routines in assembly, etc. etc...



Last time I checked, C doesn't support templates, but a stupid question gets stupid answers. My bad... I can't express myself in English as well as I can in my native tongue... Come on.. gimme a break, huh??

I also use a memory pooling system with reference counting for reuse/overwriting of unreferenced nodes, but that doesn't show in the code posted above, as I tried to keep it small. Malloc only gets called in the init function and when the array/list grows out of bounds.

Really, the question wasn't about optimization, as I'm well aware that no matter what compiler I use, it's still not going to optimize away bad code (like 99% of my source files). The asm hack used to get the address of the struct that contains the function pointer that calls the function (phew..) IS an ugly thing, but it works with Lcc. What happens to the struct pointer in GCC?? Is there some safe way to unwind the call stack to see which instance of ARRAY the function got called from, or am I out fishing in way too deep water now?? Is it possible??

This will probably need some re-syntaxing for gcc, but the concept: be the compiler.

PARRAY mesh_array = array_init();

_asm("pushl mesh_array"); // Push 32-bit address of mesh_array.


mesh_array->push_back(ONE_MILLION_POLLIES_MESH_WITH_BIG_BOOBS);
PMESH mesh = (PMESH)mesh_array->get(index);

_asm("add esp, 4"); // Pop mesh_array pointer off stack.


//in your pushb function:

void array_pushb(void *data){
PARRAY array;
PNODE newnode; /* And here's some inline assembler */

_asm("mov eax, [esp+4]"); //Is that the right syntax?

// Copy data held at address in esp + 4, b/c call to

// push_back pushes current eip onto stack.

// Copy pointer data into array.

_asm("mov %array, eax");

array->head->next = malloc(sizeof(NODE)); /* Yadda, yadda, yadda */
}

Well you didn't lose your C++ syntax, lol

edit: fixed code.

[edited by - temp_ie_cant_thinkof_name on June 1, 2004 11:43:00 PM]

Edi is not a good register to use that way with LCC-Win32 - unless you push it first and pop it later - so that it's not clobbered.

LCC-Win32 also comes with a container lib. For what you want to do check out vector.h in lcc\include.


// _asm("mov eax, [esp+4]"); //Is that the right syntax?

No, the src and dest should be reversed (AT&T syntax is instr src, dest), like so:

_asm("mov 4(%esp), %eax;");

here are the others

_asm("add $4, %esp;");
_asm("mov %eax, %array");
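For GCC itself, a hedged sketch of the same source-then-destination ordering using extended inline asm. GCC does not take Lcc's %array notation; C variables are bound through operand constraints instead. The function name copy_via_asm is hypothetical, and the plain C fallback is only there so the sketch builds on non-x86 targets.

```c
#include <stdint.h>

/* Copies src into dst with a single mov on x86, plain C elsewhere. */
uint32_t copy_via_asm(uint32_t src)
{
  uint32_t dst;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  /* AT&T order: source first, destination second.
     %0 is the output operand (dst), %1 is the input operand (src). */
  __asm__("movl %1, %0" : "=r"(dst) : "r"(src));
#else
  dst = src; /* fallback for other compilers and architectures */
#endif
  return dst;
}
```

Because the compiler picks the registers for %0 and %1, nothing here depends on a particular register such as edi holding a particular value.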

Though if you do it temp_ie_cant_thinkof_name's way then you're basically doing parameter passing anyway, so you may as well do it properly without all the asm stuff.

There are a lot of opcode prefix, immediate, displacement, and segment override bytes that go into making a call in assembly, and I don't know how they all go together yet, so I don't know if this will work. Crack open IA-32 Volume 2A and read up on how an opcode and its accompanying bytes are formed; you could probably even disassemble your program to see how gcc uses displacement indexing (if it does) to form struct addresses. Well, if it does do something like this (in nasmw):

call [mesh_array_struc+_Array.push_back]
//next instruction pointer is pushed onto stack by cpu
//eip is set to destination in call [address] opcode

then maybe there's a way to backtrack and find the bytes that make up "mesh_array_struc". The above code just calls the address in memory found at the first 4 bytes of your structure. _Array.push_back is just the assembler's way of making "structs". _Array.push_back is a displacement that, when added to mesh_array, gives the address of push_back, which should be at mesh_array + 0, and this displacement is encoded in the opcode, so it's handled by the processor (i.e., there's no "add" instruction involved). So if the opcode is formed in such a way (with displacement indexing and everything) that mesh_array_struc is in the call opcode, then you can use the EIP pushed onto the stack when ->push_back was called, and copy the memory found at call [mesh_array..., which may hold mesh_array's address.
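The base-plus-displacement idea can be checked from C with offsetof. This is a sketch using the ARRAY layout from the first post; push_back_offset, push_front_offset, call_push_back, record and last_data are hypothetical names added for illustration.

```c
#include <stddef.h>

typedef void (*ARRAY_PUSH)(void *);
typedef struct _Node NODE, *PNODE; /* incomplete type is enough for pointers */

typedef struct _Array
{
  ARRAY_PUSH push_back;
  ARRAY_PUSH push_front;
  PNODE      head;
  PNODE      tail;
} ARRAY, *PARRAY;

/* Displacement of each callback slot from the start of the struct. */
size_t push_back_offset(void)  { return offsetof(ARRAY, push_back); }
size_t push_front_offset(void) { return offsetof(ARRAY, push_front); }

/* Hand-written version of a->push_back(data): add the displacement to
   the base address, load the function pointer stored there, call it. */
void call_push_back(PARRAY a, void *data)
{
  ARRAY_PUSH *slot = (ARRAY_PUSH *)((char *)a + offsetof(ARRAY, push_back));
  (*slot)(data);
}

/* Hypothetical callback that only remembers its argument. */
static void *last_data;
static void record(void *data) { last_data = data; }
```

The first member of a struct always sits at offset 0, which matches the "mesh_array + 0" remark. Note that the callee (record here) still receives only data; the base address a is used to find the slot and is not passed along.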

Or if gcc handles structs all by itself (i.e. doesn't use the CPU's displacement indexing capability), then it should be a lot easier, b/c an opcode encoding this

call [push_back]

would be a lot easier to decode. By the way, the brackets are equivalent to C dereferencing: since push_back is a label (an offset, address, location), the brackets tell the assembler to emit the opcode that gets the value "at that address" rather than the immediate value, i.e. the address of push_back.

So in pseudocode, if you know how the call opcode was encoded (let's pretend we do):

void push_back(void) {
  PARRAY m_array;
  _asm {
    mov eax, [esp]   //esp is the stack pointer;
                     //the stack has the return eip on top
    sub eax, 4       //step back from the next instruction's
                     //address to where the call opcode
                     //held the address of mesh_array
    mov m_array, eax //variables in inline asm
                     //are equivalent to [m_array]
                     //in nasmw
  }
  //You now have the mesh_array in mesh_array->push_back.
  //Do the linked list add node
}

Of course there's a very low chance the above code works, knowing that there's a lot more that probably went into the call opcode than an address.
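One way to see why recovering the instance from the call site is fragile, sketched in plain C: the same function can be reached through several structs, or through a copied pointer with no struct at the call site at all. The names demo_calls, same_target, calls and loose are hypothetical, and this says nothing about what a particular compiler emits.

```c
#include <stddef.h>

typedef void (*ARRAY_PUSH)(void *);

typedef struct _Array
{
  ARRAY_PUSH push_back;
} ARRAY;

static int calls;

/* The callee sees only 'data'; nothing identifies the calling struct. */
static void array_pushb(void *data)
{
  (void)data;
  ++calls;
}

/* Reaches array_pushb three different ways and returns the call count. */
int demo_calls(void)
{
  ARRAY      first  = { array_pushb };
  ARRAY      second = { array_pushb };
  ARRAY_PUSH loose  = first.push_back; /* copied pointer, no struct involved */
  int        dummy  = 0;

  calls = 0;
  first.push_back(&dummy);
  second.push_back(&dummy);
  loose(&dummy);
  return calls;
}

/* Both instances hold the very same function address. */
int same_target(void)
{
  ARRAY a = { array_pushb };
  ARRAY b = { array_pushb };

  return a.push_back == b.push_back;
}
```

All three calls land in the same function with the same argument, so any scheme that decodes the caller's instruction bytes has to cope with the third case, where no struct address exists to be found.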

quote:
Original post by Monder
Though if you do it temp_ie_cant_thinkof_name's way then you're basically doing parameter passing anyway so you may as well do it properly without all the asm stuff.


That came to my mind also.. The goal was not to clutter my code the way temp_ie_cant_thinkof_name suggested, but to have a small chunk of inline assembly at the beginning of the accessor functions to enable the instance.funcptr(arg) syntax instead of instance.funcptr(instance,arg). But my assembly skills ain't that great, and neither is my knowledge of how the CPU really works. Still, one good thing has come out of all this crap: now I know much better how the compiler uses the registers and the stack.

This idea was a complete shot in the dark, as I only happened to notice that Lcc used the EDI register to store the pointer to the instance of the struct when accessing its variables. I just had to see if there was some other technique I could use, as this one was a bit hackish and probably totally unsafe.

God Speed

[edit] I wasn't referring to the above post by temp_ie_cant_thinkof_name.

"For every complex problem there is an answer that is clear, simple, and wrong." H L Mencken

[edited by - Rickmeister on June 2, 2004 3:50:18 PM]
