Sign in to follow this  
Ectara

[C] Temporary Register Allocation

Recommended Posts

In the compiler I'm writing, it uses temporary registers for values such as array subscript offsets, dereferenced values, and results of calculations. My target architecture doesn't have a set amount of registers, and the compiler can't know how many registers there are at compile time; there are at least two general purpose, one flag register, and stack, base, and instruction pointer registers.

Does anyone have an idea how I can use allocate temporary registers if they are there, and otherwise use the stack or another solution if I don't have enough/run out?

[Edited by - Ectara on December 7, 2010 6:35:05 PM]

Share this post


Link to post
Share on other sites
Quote:
My target architecture doesn't have a set amount of registers,
All architectures have fixed known number of registers. Otherwise the instruction set would become prohibitively verbose.

Quote:
Does anyone have an idea how I can use allocate temporary registers
Registers are not allocated, they are a piece of hardware, actual gates in the chip.


Maybe you're asking about a different problem related to register allocation - given local variables and temporaries in source code, how to make most efficient use of CPU registers.

Share this post


Link to post
Share on other sites
Quote:
All architectures have fixed known number of registers. Otherwise the instruction set would become prohibitively verbose.

This architecture is a VM, and has a compile-time constant of how many registers are available.

Quote:
Registers are not allocated, they are a piece of hardware, actual gates in the chip.

Quote:
Maybe you're asking about a different problem related to register allocation - given local variables and temporaries in source code, how to make most efficient use of CPU registers.

I'm referring to the use of registers for temporary values in an algorithm or compound instructions; with an indefinite amount of registers, incrementing which register is occupied with a temporary risks running out, and using only the basic two registers suffers from not using as many registers as it could.

Share this post


Link to post
Share on other sites
Quote:
Original post by Ectara
This architecture is a VM, and has a compile-time constant of how many registers are available.

What does the VM spec say how this should be handled?

Share this post


Link to post
Share on other sites
Nothing yet; I haven't gotten around to writing the specification, because I'm not sure how it should be handled, myself. :D

But it must have from 6 to _64_ registers.

EDIT: Clarification

[Edited by - Ectara on December 7, 2010 8:01:17 PM]

Share this post


Link to post
Share on other sites
Quote:
Original post by Ectara
Nothing yet; I haven't gotten around to writing it. :D
Ah, so this is about finding a solution to a problem that solves the problem that doesn't exist yet.

In words of Keanu: Whoa.

Quote:
But it must have from 6 to 256 registers.


And how will these registers be addressed? What will the not-yet-existent opcodes for a not-yet-existent VM look like?

It's easy, you need to sum up 256 numbers. How would the assembly look like if it were written manually?

Typically, for load-store architecture, it's something like this:
mov ra, mem[0];
mov rb, mem[1]
mov rc, mem[2]
...
mov rz, mem[26]
add ra, rb
add ra, rc
...
add ra, rz
// ra now holds sum of first 26 numbers, repeat until done


But the above are mnemonics - what would opcodes be like: [instr][register_idx][params]?

In that case, what happens if I write [add][137][211] when machine only has 200 registers. Or only 100?


However, given that there are always at least two registers, writing a VM(2) on top of VM(1) would work. VM(2) would then be responsible for generating VM(1) bytecode based on number of registers available.

It might also be fun to implement JVM as VM(3) on top of that and run Ruby as VM(4) on it. Perhaps running some DSL as VM(5) inside it.

But I heard you like VMs, so ...

Share this post


Link to post
Share on other sites
Quote:
Quote:
Original post by Ectara
Nothing yet; I haven't gotten around to writing it. :D

Ah, so this is about finding a solution to a problem that solves the problem that doesn't exist yet.

In words of Keanu: Whoa.


What I meant to say was, the VM is written, and there's an ASM and a C compiler that I've written for it already; just a solution to this particular problem is missing. Currently the register count is 16 and I haven't changed that yet, but changing the define changes the number of available registers.

Also, did I say 256? I meant to say 64. Changes in the design later on reduced the upper limit to 64.

Quote:

It's easy, you need to sum up 256 numbers. How would the assembly look like if it were written manually?

Typically, for load-store architecture, it's something like this:
-snip-


The operation could be explained as such:


typedef struct operand{
int flags: 2;
int value: 6;
};

typedef struct operation{
unsigned char opcode;
operand operands[3];
};




If no bits are set in the flags field of an operand, the value bits specify which register.
If the most significant bit is set in the flags field, it is a constant, and the value bits specify the type of constant, which follows the operation after the operands in memory.
If the least significant bit is set of the flags, it is allocated memory, or a variable, and the value bits specify the type, and a 32bit constant containing a hash key for a special variable.
If both bits of the flags are set, it is dereferencing, and the value bits specify which register; to dereference, one must first move the pointer to a register.

It's a bit RISC-like.

But, adding many numbers is no big deal. One could write:

move reg1, $256
move reg0, $0
move reg2, mem ; A pointer, perhaps?
loop:
add reg0, reg0, [reg2]
add reg2, reg2, $4 ; Assuming 32bit integers in contiguous memory.
dec reg1
jnz loop, reg1


Quote:
In that case, what happens if I write [add][137][211] when machine only has 200 registers. Or only 100?

If those are register indices, my VM would currently overrun the boundaries, and probably corrupt the stack.

Quote:
But I heard you like VMs, so ...

Thanks, Xzibit. :P

Share this post


Link to post
Share on other sites
How do you plan on writing a compiler if the target architecture isn't fully known? There will be so many things unpredictable that emitting efficient code will be all but impossible.

The best thing I could suggest would be to have the compiler support a variable number of registers in the register scheduler, and just recompile your binaries if the VM's register count changes.

Share this post


Link to post
Share on other sites
Eh... The architecture was designed around the same binary running on all incarnations of the VM. I suppose having to recompile would break that. So, basically, I'd have to have a fixed register count, no matter what? One other idea I had, was to query if there were registers left, and if so, use a register, and if not, have to push something on the stack, and use registers, then pop things back, or dereference the stack pointer, whatever method I'd end up using to free up registers for temporary use.

Share this post


Link to post
Share on other sites
You could, of course, write the compiler so that it takes "number of registers" as a command-line argument, implements the register allocation assuming that number of registers, and emits the corresponding bytecode. You would, of course, have to recompile each program in order to run on a differently-compiled VM; and this assumes that the number of registers somehow doesn't have a major effect on the instruction set (e.g. the fields in opcodes that specify register ID are always 6 bits wide, just that accessing nonexistent registers causes an exception or "hardware" interrupt or something). Otherwise, I hope you're prepared to buckle down and write 59 separate compilers.

Share this post


Link to post
Share on other sites
The compiler has the same compile-time constant, and can be changed just as easily; if necessary, the constant could be changed to a simple variable. So, the only solution would be to have a static register count?

Share this post


Link to post
Share on other sites
Quote:
Original post by Ectara
So, the only solution would be to have a static register count?
That is one approach.

The other is proposed above. Use a VM(2) with fixed register count which emits VM(1)-specific instruction set.

This is how JVM and most others work. Bytecode is fixed (in this case it would be 256 registers) but it's translated into actual hardware on the fly, or is just interpreted.

At which point you might just as well define that VM(1) has fixed 256 registers, and it's up to VM(1) implementation to do whatever it needs to do to actually run such code. This is, coincidentally, how today's x86 CPUs work. The "VM" is microcode which emulates various bits of functionality which have long become obsolete or which are not ideal for raw implementation.

Even more, IIRC, most of today's x86 CPUs are implemented in microcode, which allows certain patches to be applied later to avoid issues such as FDIV bug.

The whole point of virtualization is to pretend that hardware has certain features that it really does not, and VM emulates or masks them.

Share this post


Link to post
Share on other sites
In systems architecture, the entire point of a register is that you have a known number of fast-access slots to store data in; if you need a variable number of slots, you use a stack or just access memory directly off the bus. Registers exist for two reasons: speed of access, and known quantities. They're there so that you know exactly how many items of data you can access quickly and how many you need to relegate to the memory system.

What you're doing is really totally counter to that. Why have a variable number of registers? As soon as your compiler can't rely on the register count, it has to do things like ask for available registers, invoke different code paths dependent on the number of registers available, and so on. Essentially you've removed both advantages of having registers in the first place: you don't know how many there are, so you can't schedule your most-often-accessed data into them; and you don't have any speed, because you have to ask the processing system (VM/etc.) what's there.

Worse, you have to store multiple code paths in your compiled binaries so that you can handle various situations; suppose you have a block of code that compiles very optimally to use 10 registers, but you don't know if you might be running on a VM instance with only 8. Now you have to have your compiler store a 10-register version and a less-than-10-register version so that you can use the optimal edition when the VM is available, versus the less optimal edition when the registers aren't there.


So in a nutshell: you're going against the entire purpose of registers here. I really don't see any possible benefit to having a variable register count in your architecture. Just set a static number and be done with it. If your compiler's register scheduling algorithm is robust, and your bytecode design is solid, and your VM is equally robust, you can simply tweak the register count to see what gets the best results, and then lock it down once you have a number to work with.

But this whole "uncertain architecture" thing just doesn't make sense. It's like trying to design a transmission when you don't know if you are driving one wheel (motorcycle) or four (heavy-duty truck).

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this