How was the first assembler made?
Does anyone else remember typing out the Hex listings from 'Your Sinclair' into a basic hex editor? One mistake and boom - two hours down the drain.
Quote:Original post by Anonymous Poster
AFAIK you cannot program in binary. Binary isn't a "language." The 1's and 0's represent the yes and no decisions made by the processor itself. When you hear about binary, you're hearing what decisions the processor is making. Correct me if I'm wrong, but it is impossible to write "111010010101010101001010100101010101001...." and have it do something.
As many others have said, there is no difference between coding in hex and binary. I have seen some hex editors which allow you to see the binary representation too. In that case, you could actually type 1s and 0s to create a working program.
Dwiel
I've programmed in hex with GetRealDebugger, but all my program did was output a single character to the screen by storing a character in a register I can't remember, and calling interrupt 24h. That was a waste of time, so I'll stick with asm =)
Quote:Original post by Codemonger
Like I said before Hex values represent 8 bits of information...
Actually, one hex digit represents 4 bits of information, which has a range of 16 values or states. That's what hexadecimal means, after all: base 16.
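For what it's worth, the digit-to-bits relationship is easy to check. Here is a minimal Python sketch (the helper name hex_digit_to_bits is made up for illustration): each hex digit expands to exactly 4 bits, so it takes two hex digits to cover one 8-bit byte.

```python
def hex_digit_to_bits(digit):
    """Return the 4-bit pattern for a single hex digit, e.g. 'A' -> '1010'."""
    if len(digit) != 1:
        raise ValueError("expected exactly one hex digit")
    return format(int(digit, 16), "04b")

# One digit is one nibble (4 bits, 16 possible values).
for d in "0F":
    print(d, "->", hex_digit_to_bits(d))

# Two digits make one byte (8 bits).
byte_bits = "".join(hex_digit_to_bits(d) for d in "AA")
print("AA ->", byte_bits)
```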
Quote:Original post by Tazzel3D
Quote:Original post by Anonymous Poster
AFAIK you cannot program in binary. Binary isn't a "language." The 1's and 0's represent the yes and no decisions made by the processor itself. When you hear about binary, you're hearing what decisions the processor is making. Correct me if I'm wrong, but it is impossible to write "111010010101010101001010100101010101001...." and have it do something.
As many others have said, there is no difference between coding in hex and binary. I have seen some hex editors which allow you to see the binary representation too. In that case, you could actually type 1s and 0s to create a working program.
Dwiel
Well actually yes, there is a large difference between hex opcodes and binary.
Quote:Original post by Anonymous Poster
Quote:Original post by Tazzel3D
Quote:Original post by Anonymous Poster
AFAIK you cannot program in binary. Binary isn't a "language." The 1's and 0's represent the yes and no decisions made by the processor itself. When you hear about binary, you're hearing what decisions the processor is making. Correct me if I'm wrong, but it is impossible to write "111010010101010101001010100101010101001...." and have it do something.
As many others have said, there is no difference between coding in hex and binary. I have seen some hex editors which allow you to see the binary representation too. In that case, you could actually type 1s and 0s to create a working program.
Dwiel
Well actually yes, there is a large difference between hex opcodes and binary.
Which is?
(10101010)₂ == "opcode" (AA)₁₆, no?
So then the large difference is in the numeric base which really means nothing...
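To make the point concrete, here is a small Python sketch, assuming you have typed the same two bytes once as 1s and 0s and once as hex. The function names are invented for this example. Both texts end up as the identical byte string, which is all the CPU ever sees.

```python
def from_binary_text(text):
    """Pack a string of 1s and 0s (spaces allowed) into bytes."""
    bits = text.replace(" ", "")
    if len(bits) % 8 != 0:
        raise ValueError("need a whole number of bytes")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def from_hex_text(text):
    """Pack a string of hex digits (spaces allowed) into bytes."""
    return bytes.fromhex(text)

# The same two bytes, written in two different bases.
typed_in_binary = from_binary_text("10001011 00000000")
typed_in_hex = from_hex_text("8B 00")
print(typed_in_binary == typed_in_hex)
```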
Quote:Original post by Codemonger
You can't directly access bits on a CPU, you can use bytes to represent a series of bits and that's it. The CPU itself can access the bits of a byte. Like I said before Hex values represent 8 bits of information and this is how bits are manipulated. But I think an ASM freak could probably clear this up and give a good underlying description of what's happening. From ASM to bytecode to CPU etc. Like I said I'm no expert.
The computer doesn't really access single bits of the opcodes either, it just checks the whole byte to see what opcode it is.
There is no reason to use binary when programming, the best representation, IMO, is hex, 2 characters for each opcode (or more for multi-byte instructions, meh). Almost like assembly.
Quote:
Well actually yes, there is a large difference between hex opcodes and binary.
No, they're perfectly equivalent. It's just a number, the base is irrelevant.
In fact, each opcode is represented by a certain bitpattern, often 8 bits, but sometimes more. Additional bitpatterns indicate the opcode's operands, such as registers, values or memory reference offsets. Since it's impossible for a human to remember all those numeric codes, they were mapped to easier names, the so-called mnemonics or instructions. "MOV", for example, is such an instruction.
But contrary to what is often implied, a single mnemonic doesn't have a 1 to 1 equivalence to a certain bitpattern on modern CPUs. Several different versions of "MOV" exist in binary, depending on the exact form of the instruction: register to register, register to memory, operand sizes, hardcoded multipliers, and so on. The assembler selects the correct bitpattern by analyzing the context of the instruction.
In the old days, the instructions were directly mapped to their binary representation. The assembler was merely a translator, from the mnemonics to their bitpattern. Today, it does a little more than that.
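The "merely a translator" idea can be sketched as a plain lookup table. This is a toy in Python, not how any real assembler is written: the table holds only four one-byte x86 instructions, and the names OPCODES and assemble are made up for the example.

```python
# Toy table: each line of text maps directly to one byte.
OPCODES = {
    "nop": 0x90,
    "ret": 0xC3,
    "push ecx": 0x51,
    "push edx": 0x52,
}

def assemble(source):
    """Translate one instruction per line into bytes; ';' starts a comment."""
    out = bytearray()
    for line in source.splitlines():
        text = " ".join(line.split(";")[0].lower().split())
        if not text:
            continue  # blank line or comment only
        if text not in OPCODES:
            raise ValueError(f"unknown instruction: {text!r}")
        out.append(OPCODES[text])
    return bytes(out)

program = """
    push ecx   ; save ecx
    push edx
    nop
    ret
"""
print(" ".join(f"{b:02X}" for b in assemble(program)))  # 51 52 90 C3
```

A table like this stops working as soon as one mnemonic has several encodings, which is where the context analysis mentioned above comes in.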
Here is an example to clarify what happens:
Look at how these instructions convert to their binary patterns:
0x51 push ecx
0x52 push edx
As can be seen, the push instruction has more than one bitpattern, depending on the operand. This is to save memory and bandwidth. Several more patterns exist, one for each of the other registers.
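The two bytes above fit a simple rule: the one-byte push of a 32-bit register is 0x50 plus the register's number. A minimal Python sketch follows. The register numbering comes from the x86 encoding tables rather than from this thread, and the function names are invented for the example.

```python
# x86 32-bit register numbers, in encoding order (0..7).
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def encode_push(reg):
    """Return the one-byte opcode for 'push <reg>'."""
    return 0x50 + REGISTERS.index(reg)

def decode_push(opcode):
    """Return the text for a one-byte push opcode in the range 0x50..0x57."""
    if not 0x50 <= opcode <= 0x57:
        raise ValueError("not a one-byte push opcode")
    return "push " + REGISTERS[opcode & 7]

for reg in ("ecx", "edx"):
    print(f"0x{encode_push(reg):02X} push {reg}")
```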
Binary representations of instructions can have a variable number of binary operands and arguments, depending on the precise instruction form, which indicate the usage context:
0x8B 0x00 mov eax,[eax]
0x8B 0x80 0x6C 0x11 0x00 0x00 mov eax,[eax + 0x116C]
The 0x8B is the code for "MOV" in one of its basic forms. It's extended by the instruction form suffix (the ModRM byte): 0x00 means "load a register from the address pointed to by another register (without an offset)". As in the push example above, there is again no separate listing of the operands, to save memory: both uses of eax, as destination and as address register, are packed into the 0x00. The second example uses an explicit offset. This is reflected by the 0x00 changing to 0x80. It extends the basic MOV to mean "load a register from the address in a register plus an offset". The offset follows the 0x80 in 32-bit little-endian format.
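Here is how the second byte splits up, as a hedged Python sketch. It handles only the two forms shown above (no offset, and 32-bit offset) and deliberately skips the other addressing modes. The 2/3/3 bit field layout is the standard x86 ModRM layout, not something stated in this thread, and decode_mov_load is a made-up name.

```python
import struct

REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov_load(code):
    """Decode 'mov r32,[r32]' or 'mov r32,[r32 + disp32]' (opcode 0x8B only)."""
    if code[0] != 0x8B:
        raise ValueError("not the 0x8B form of MOV")
    modrm = code[1]
    mod = modrm >> 6          # top 2 bits: addressing form
    reg = (modrm >> 3) & 7    # middle 3 bits: destination register
    rm = modrm & 7            # low 3 bits: register holding the address
    if rm == 4 or (mod == 0 and rm == 5):
        raise NotImplementedError("SIB and disp32-only forms are skipped here")
    dest, base = REGISTERS[reg], REGISTERS[rm]
    if mod == 0:              # no offset
        return f"mov {dest},[{base}]"
    if mod == 2:              # 32-bit little-endian signed offset follows
        (disp,) = struct.unpack_from("<i", code, 2)
        sign = "-" if disp < 0 else "+"
        return f"mov {dest},[{base} {sign} 0x{abs(disp):X}]"
    raise NotImplementedError("only mod 0 and mod 2 are handled")

print(decode_mov_load(bytes([0x8B, 0x00])))
print(decode_mov_load(bytes([0x8B, 0x80, 0x6C, 0x11, 0x00, 0x00])))
```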
Here's another interesting example:
0x8B 0x85 0x68 0xFF 0xFF 0xFF mov eax,[ebp - 0x98]
0x8D 0x85 0x68 0xFF 0xFF 0xFF lea eax,[ebp - 0x98]
The mov example is similar to the one above with offset, only with a different register. The basic MOV pattern is the same (0x8B), but the instruction form is different (0x85 instead of 0x80) to reflect the register change, which is now ebp instead of eax. Instruction form suffixes are often equal amongst otherwise different instructions: the 0x85 is the same for both the MOV and the LEA instruction. This is because the form suffix only specifies the usage form of the basic command, which is given by the first number (0x8B for MOV, and 0x8D for LEA).
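A rough Python sketch of that sharing: one decoder for the suffix, with only the first byte choosing between the two mnemonics. It covers just the register-plus-32-bit-offset form used in these two lines, and the names MNEMONICS and decode are assumptions for the example. It also shows why 0x68 0xFF 0xFF 0xFF reads as minus 0x98: the offset is a signed little-endian 32-bit number.

```python
import struct

MNEMONICS = {0x8B: "mov", 0x8D: "lea"}   # first byte picks the basic command
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode(code):
    """Decode '<mov|lea> r32,[r32 +/- disp32]' from six bytes."""
    mnemonic = MNEMONICS[code[0]]
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:
        raise NotImplementedError("only the register + disp32 form is handled")
    (disp,) = struct.unpack_from("<i", code, 2)   # signed, little-endian
    sign = "-" if disp < 0 else "+"
    return f"{mnemonic} {REGISTERS[reg]},[{REGISTERS[rm]} {sign} 0x{abs(disp):X}]"

print(decode(bytes.fromhex("8B8568FFFFFF")))
print(decode(bytes.fromhex("8D8568FFFFFF")))
```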
In addition to these simple examples you have a lot of other suffix types, and also prefixes that come in front of the basic instruction pattern (most often to indicate operand sizes). To make things even more complex, the bitpatterns can vary depending on the current CPU operation mode, for example real vs. protected mode. That's especially true for prefixes. Actually, the bitpatterns for ASM instructions are a huge mess, and very confusing, apparently lacking any form of human sanity :) The reasons are historical, because the instructions needed to be backwards compatible down to the 8086. A lot of additional instructions and instruction forms were added over time, leaving a lot of chaos.
Hope this clears things up a little.
I have a Computer Architecture module as part of my university course and for the first few weeks we enter the assembly instructions directly into the memory of special educational computers as hex. It's interesting but way slow (inputting, not execution).