# Reading binary file in C++ skips 0s?

This topic is 417 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

First things first: I have never actually read a file as binary in C++ (although, of course, I have used C++ o/i/fstream many times), and I'm also new to emulation.

Recently I followed this tutorial and coded my very first emulator (a Chip 8 emulator):

http://www.multigesture.net/articles/how-to-write-an-emulator-chip-8-interpreter/

But when I ran Pong on my emulator, all I saw were a couple of "O"s at the top of the console window (right now I'm just printing Os and spaces to the console, as I only want to check whether my emulator works). Clearly there was something very wrong with my code, so I downloaded the source code included with the above tutorial to check for any mistakes I might have made, but found none. I compared each function from the tutorial's source with my own code and couldn't find anything wrong.

So I thought that some kind of error must be happening when the program loads the file as binary, although I did include some simple error checking in the function responsible for loading the file and none of those checks ever reported an error. First I added a simple printf to the emulateCycle function:

printf("%X\n", opcode);

Then, in the main function, I replaced the infinite for loop with an 8-step for loop, so my app would just show me the first 8 opcodes. I also opened the pong2.ch8 file (the file I'm trying to load) in a hex editor, so I could compare these opcodes with the "original" ones. You can see what I got in readingerror.jpg. Only the first opcode matches its respective opcode in the pong2.ch8 hex dump. (The file size is actually OK here; it should be 294, as shown.)

I checked the loadprogram function (which loads the binary file) for errors, but found none. So I decided to check whether everything from pong2.ch8 is actually read and written to the buffer and to my emulator's memory correctly. After reading the contents of pong2.ch8 into buffer (which is of type char *), I added a printf to view the contents of buffer. File readingerror2.jpg shows the results: the first three bytes are OK, but the 4th one is missing a zero before the C! And this error keeps appearing later on.

I'm not sure if this is actually the issue and the reason Pong doesn't work correctly on my emulator, but it seems to me there's something really wrong with loading a file as binary. I'm attaching the loadprogram function here, and you can see my whole code in main.txt. I searched the Internet, but I cannot find what I'm doing wrong in this loadprogram function. I also compiled my project with both VS2010 and Dev-cpp, but the results are the same.

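bool loadprogram(const char * filename) {
    fstream pfile;
    pfile.open(filename, ios::in | ios::binary);

    // A stream cannot be portably compared to NULL; test its state instead.
    if (!pfile.is_open()) {
        fputs("File error", stderr);
        return false;
    }

    // Seek to the end to get the file size, then rewind to the start.
    pfile.seekg(0, ios::end);
    long bufferSize = pfile.tellg();
    pfile.seekg(0, ios::beg);
    printf("File size: %d\n", (int)bufferSize);

    char * buffer = (char*)malloc(sizeof(char) * bufferSize);
    if (buffer == NULL) {
        fputs("Memory error", stderr);
        return false;
    }

    // Read the whole file into the buffer.
    pfile.read(buffer, bufferSize);

    if ((4096-512) > bufferSize) {
        // Chip 8 programs are loaded starting at address 0x200 (512).
        for (int i = 0; i<bufferSize; i++) {
            memory[i+512] = buffer[i];
            printf("%X\n", buffer[i]);
        }
    }
    else {
        printf("Error: ROM too big for memory.\n");
        pfile.close();
        free(buffer);
        return false;
    }

    pfile.close();
    free(buffer);

    return true;
}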

The tutorial's author actually wrote this function in C, but I rewrote it in C++ to see if that would help me resolve the issue.
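For comparison, here is a minimal sketch of the same load step using std::ifstream and std::vector, so there is no manual malloc/free. The helper names readRom and loadInto are made up for illustration and are not from the tutorial; the 4096-byte memory and the 0x200 start address are the ones used in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper (not from the tutorial): reads a whole file as raw bytes.
// Returns an empty vector if the file cannot be opened or read.
std::vector<unsigned char> readRom(const char* filename) {
    // ios::ate opens the stream positioned at the end, so tellg() is the size.
    std::ifstream file(filename, std::ios::binary | std::ios::ate);
    if (!file) return {};

    const std::streamoff size = file.tellg();
    if (size <= 0) return {};
    file.seekg(0, std::ios::beg);

    std::vector<unsigned char> bytes(static_cast<std::size_t>(size));
    // In binary mode read() copies bytes verbatim; zero bytes are not skipped.
    file.read(reinterpret_cast<char*>(bytes.data()), size);
    if (!file) return {};
    assert(file.gcount() == size);
    return bytes;
}

// Hypothetical helper: copies a ROM into emulator memory starting at 0x200.
// Returns false if the ROM is empty or does not fit.
bool loadInto(unsigned char (&memory)[4096], const std::vector<unsigned char>& rom) {
    const std::size_t start = 0x200;
    if (rom.empty() || rom.size() > sizeof(memory) - start) return false;
    for (std::size_t i = 0; i < rom.size(); ++i) {
        memory[start + i] = rom[i];
    }
    return true;
}
```

Calling loadInto(memory, readRom("pong2.ch8")) would then replace the body of loadprogram; the debug printing is left out on purpose.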

##### Share on other sites
Is that really the code you used to generate the screenshot? Because it shouldn't be outputting 4 digits per line.

##### Share on other sites

The second screenshot shows the contents of buffer from the loadprogram function. The first one shows the opcodes from the emulateCycle function, and I think both screenshots show what they're supposed to. In emulateCycle the first line is actually:

opcode = memory[pc] << 8 | memory[pc+1];

This combines two adjacent bytes into one opcode, and that's where the 4 digits you're probably referring to come from. Each opcode in Chip 8 is 2 bytes long, so there should be 4 hex digits.
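To make that concrete, here is a small sketch of the fetch step as a standalone function. The name fetchOpcode and its parameters are my own for illustration; only the shift-and-or expression comes from the tutorial line above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: combines two adjacent bytes into one 16-bit opcode.
// The byte at pc becomes the high byte, the byte at pc + 1 the low byte.
std::uint16_t fetchOpcode(const unsigned char* memory, std::size_t memorySize, std::size_t pc) {
    assert(pc + 1 < memorySize);  // both bytes must lie inside memory
    return static_cast<std::uint16_t>((memory[pc] << 8) | memory[pc + 1]);
}
```

With the bytes 22 FC 6B 0C, fetching at offset 0 gives 0x22FC and fetching at offset 2 gives 0x6B0C. Printing with "%04X" keeps any leading zeros visible.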

Edited by mentor

##### Share on other sites
The screenshot on the right looks like it matches the code and what the hexeditor displays. The screenshot on the left is... what? Printing uninitialized memory or something?

##### Share on other sites

OK, as I said, I'm new to emulation and cannot really explain more than what's in the tutorial I linked in the first post. But the important thing is that I compared my code to the tutorial's source code and they were the same (apart from the fact that I'm not using classes here). I later rewrote the loadprogram function from C to C++, but it wasn't working when it was in C, exactly as in the tutorial's source, either.

Chip 8 opcodes are all 2 bytes long. So to get one opcode you need to combine one byte from memory with the next byte in memory, and this is what this line does:

opcode = memory[pc] << 8 | memory[pc+1];

After that I'm printing above opcode and that's what screenshot on the left shows:

printf("%X\n", opcode);

I hope I explained that ok, English is not my native language.

> The screenshot on the right looks like it matches the code and what the hexeditor displays.

I'm not sure, I'm new to this stuff. As you can see, code in hex editor goes like this:

22 FC 6B 0C

While what screenshot on the right shows seems to go like this:

22 FC 6B C

I'm not sure if this is correct or if there should be a zero before the C.

To be honest, I'm not really sure if this is the root of the problem and the reason my emulator doesn't work properly. But if it isn't, I don't know what is. It would be best if anyone interested looked into main.txt. Although there are about 460 lines of code, I can assure you that the emulateCycle function is fully correct (just the same as in the tutorial's source code), and this function takes up most of the file. The rest is rather short.

##### Share on other sites

The problem is that %X by itself doesn't print leading 0s. You probably want %02X.

##### Share on other sites

Damn, that's so stupid of me, but I must agree with you. Then again, it doesn't solve the issue of my emulator not working properly.

So now I know that the contents of the buffer and memory variables are correct and that they match the contents of the pong2.ch8 hex dump. But what the screenshot on the left shows still doesn't feel right. You can see that the first opcode is 22FC, which is correct, but the second one is 6B20 while it should be 6B0C. So is this line not doing its job right?

opcode = memory[pc] << 8 | memory[pc+1];

The point is it's exactly the same as in the tutorial's source code and that's what is misleading me. I mean, there are binaries included in tutorial, not only source code, and tutorial's emulator from binaries works perfect, although it is the same code as mine.

EDIT:

And perhaps there should be ">>" instead of "<<"? Because "<<" "adds" 8 zeros to the left, right? While it should add 8 zeros to the right where memory[pc+1] goes?

Perhaps one way to solve my issue would be not to combine memory[pc] with memory[pc+1], but rather check just memory[pc] in the first switch instruction like this:

switch (memory[pc] & 0xF0) {
    case 0x00:
        //...
        break;

    case 0x10:
        //...
        break;
}

And when this is not enough, because there are multiple instructions in the 0x8000 group for example, I could add another switch inside the appropriate case like this:

switch (memory[pc+1] & 0x0F) {
    case 0x01:
        //...and so on...
}

But the problem is also that Chip 8 has some opcodes which would actually require me to combine memory[pc] with memory[pc+1] in order to read, let's say, a proper memory address (for example, an instruction like "0x1nnn" jumps to address nnn). And I'm still interested in why the line that combines two memory bytes into the opcode variable doesn't really work. Or perhaps there is something I just don't get?
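Keeping the combined 16-bit opcode and masking out the parts you need covers both cases. Below is a minimal sketch, assuming the usual Chip 8 field notation (nnn for a 12-bit address, x and y for register indices, kk for an 8-bit constant, n for the lowest nibble). The Decoded struct and the decode function are hypothetical names, not taken from the tutorial.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative container for the fields of one 16-bit Chip 8 opcode.
struct Decoded {
    std::uint8_t  group;  // top nibble: selects the instruction family
    std::uint8_t  x;      // second nibble
    std::uint8_t  y;      // third nibble
    std::uint8_t  n;      // lowest nibble
    std::uint8_t  kk;     // low byte
    std::uint16_t nnn;    // low 12 bits (an address)
};

// Splits an already combined opcode into its fields with masks and shifts.
Decoded decode(std::uint16_t opcode) {
    Decoded d;
    d.group = static_cast<std::uint8_t>((opcode & 0xF000) >> 12);
    d.x     = static_cast<std::uint8_t>((opcode & 0x0F00) >> 8);
    d.y     = static_cast<std::uint8_t>((opcode & 0x00F0) >> 4);
    d.n     = static_cast<std::uint8_t>(opcode & 0x000F);
    d.kk    = static_cast<std::uint8_t>(opcode & 0x00FF);
    d.nnn   = static_cast<std::uint16_t>(opcode & 0x0FFF);
    return d;
}
```

For 0x22FC this gives group 2 and nnn 0x2FC; for 0x6B0C it gives group 6, x 0xB and kk 0x0C. A first switch on group, with a nested switch on n or kk where a family has several instructions, follows the same idea as the two switches above without giving up the address bits.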

Edited by mentor

##### Share on other sites
To elaborate on what SiCrane said:

%X just means "print this number in capitalized hexadecimal". It's just like how %d is "print this number in decimal".

printf doesn't put leading zeroes on numbers if you don't explicitly tell it to.

To get leading zeroes with %X you put %0<number of digits you want to pad with zeroes>X.

I saw this in the original post but I assumed you were talking about the 00 in 6C00 not being in the right screenshot.

Edited by Nypyren
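A quick way to see the difference between the two format strings is to format the same byte both ways. This is only a sketch; hexPlain and hexPadded are made-up helper names, and the cast to unsigned is there so that a byte such as 0xFC is not sign-extended when char is signed.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a byte with "%X": no padding, so 0x0C comes out as "C".
std::string hexPlain(unsigned char value) {
    char text[16];
    std::snprintf(text, sizeof text, "%X", static_cast<unsigned>(value));
    return text;
}

// Formats a byte with "%02X": at least two digits, padded with zeros.
std::string hexPadded(unsigned char value) {
    char text[16];
    std::snprintf(text, sizeof text, "%02X", static_cast<unsigned>(value));
    return text;
}
```

For the byte 0x0C, hexPlain returns "C" while hexPadded returns "0C", which is what the hex editor shows. For a full opcode the equivalent format would be "%04X".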

##### Share on other sites

I checked my emulator with two other .ch8 files, Tetris and Tron. Tetris repeats the issue: the first opcode is correct, the rest are completely different. Tron actually gave me all of the first eight opcodes right, but Tron doesn't work properly either, so it seems that some mistake in the opcodes must appear later.

So, to be clear: my emulator reads the binary file correctly and saves the Chip 8 program into the memory array correctly. That much I'm fairly sure of. But the emulator does not seem to read opcodes correctly in emulateCycle, or at least not all of them. And that's rather strange. I must have made some mistake that I just cannot find right now.

##### Share on other sites
I would recommend brushing up on your bitwise arithmetic. Your understanding of shift operators seems incorrect, and your description of masking bytes is also incorrect.

##### Share on other sites

Yep, done that already. I don't know why you're talking about masking bytes actually, but I was wrong about shift operator. "<<" is correct here. Still, I have no idea why my emulator doesn't work.

##### Share on other sites
memory[pc] << 8 | memory[pc+1];

That part looks correct to me...

- Operator precedence for << is higher than |: CHECK
- << is signed/unsigned agnostic: CHECK
- Big endian byte order: CHECK

You *might* be experiencing automatic promotion of signed values to larger integer types, but that shouldn't turn your 0C into 20.
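As a side note on that promotion point, here is a sketch of what it would look like if the low byte were stored in a signed char. The two function names are hypothetical, and this only illustrates the failure mode; it is not a claim about the poster's code.

```cpp
#include <cassert>
#include <cstdint>

// Low byte held in a signed char: it is promoted to int before the OR,
// so a negative value sign-extends and overwrites the high byte.
std::uint16_t combineSignedLow(unsigned char high, signed char low) {
    return static_cast<std::uint16_t>((high << 8) | low);
}

// Both bytes unsigned: promotion to int never produces extra set bits.
std::uint16_t combineUnsigned(unsigned char high, unsigned char low) {
    return static_cast<std::uint16_t>((high << 8) | low);
}
```

With high = 0x22 and a low byte of 0xFC stored as signed char (that is, -4), the signed version yields 0xFFFC instead of 0x22FC. A low byte of 0x0C is positive either way, so both versions give 0x6B0C for 6B 0C, which is why promotion cannot explain a 0C turning into 20.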

So where is the 0x20 coming from?

Since you're using C++, I suggest putting a data breakpoint on that element of the array to see what's setting it to 0x20.

Edited by Nypyren

##### Share on other sites

That's totally strange, but the value of this variable is always: 12 '\f'. I mean, always after the value has been written into it from buffer. I have never used breakpoints and a debugger before, but I think I did it properly.

##### Share on other sites
12 in decimal is C in hex, so that's OK. Does the "6B20" problem still happen?

##### Share on other sites

Unfortunately, yes. Also I might be missing something. Variable memory is declared like this:

unsigned char memory[4096];

Chip 8 memory is 4K and Chip 8 program (.ch8 file) starts at 0x200. So the variable I'm looking here for is memory[0x203], right?:

memory[0x200] = 22
memory[0x201] = FC
memory[0x202] = 6B
memory[0x203] = 0C (should be, but console window shows 20)

Please take a look at the attached screenshot. I think that the values shown by the debugger (on the left in the Dev-cpp window) are correct, but the opcodes in the console window aren't. I added watches on the two following opcodes (from 0x204 to 0x207), which aren't shown in the console window, and the debugger shows them correctly as well. For the record, this very same problem happened when I was using VS2010, so I don't think the issue is related to Dev-cpp in any way.

It kinda seems like the problem lies with this line:

opcode = memory[pc] << 8 | memory[pc+1];

Because values in memory are correct, so I cannot think of anything else, but still this line is exactly the same as in tutorial's source code. So I really don't know what could be causing this problem.

##### Share on other sites
You need to debug everything between the point where memory[...] appears to be correct and where it's being printed incorrectly, not just the loading code.

For example, the following program with known-good values prints the expected result:

#include <cstdio>

int main()
{
    unsigned char c1 = 0x6C;
    unsigned char c2 = 0x0C;
    unsigned int opcode = c1 << 8 | c2; // int, unsigned short, and short all work here as well.  What is your opcode type, anyway?

    printf("%04X", opcode); // prints 6C0C

    return 0;
}

So my suspicion is that something is wrong between your loading code and the line that assigns opcode = ...

It's most likely one of the following:
- Your memory malloc isn't large enough and other allocations occupy its space later on.
- Buffer overrun.
- Stack corruption.
- "if (memory[x] == 0x6C && memory[x+1] = 0x20)" (single-equals-in-condition-expression bug)
- memory[x] with a typo in how x is calculated

Since it always happens on the same location, if something is actually modifying the memory array, you can find this extremely quickly by using a data breakpoint.

In Visual Studio:

- Set a normal breakpoint on the line after 'memory' is allocated.
- Calculate the effective address of memory[0x203]. (put "&memory[0x203]" in the watch window and use the address it shows)
- Make a 1-byte data breakpoint at that address. https://msdn.microsoft.com/en-us/library/350dyxd0(v=vs.100).aspx
- Resume the program.

You should see the program stop at the exact statement which writes to that memory address (including your loader and any bugs that might exist). From there you can figure out what's happening and fix it.

Edited by Nypyren

##### Share on other sites

I finally figured out what was wrong. The first opcode is 22FC, which calls the subroutine at address 2FC. So after my emulator executed this instruction, my program counter (pc) was set to 2FC (and not 202), so the next opcode would be 6B20, not 6B0C (because the emulator "jumped" to a different place in memory), and that's OK. The real problem was in the implementation of another opcode, which had a typo I totally missed, and that caused my emulator's strange behaviour.
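For anyone who finds this later, here is a tiny sketch that reproduces the sequence I was seeing. It only interprets the call instruction (2nnn) and steps over everything else; traceOpcodes is a made-up name, and the bytes at 0x2FC are simply the ones my emulator printed, not something I checked against the whole ROM here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical trace helper: returns the opcodes fetched over `steps` cycles.
// Only 2nnn (call) changes pc; the return stack is omitted in this sketch.
std::vector<std::uint16_t> traceOpcodes(const unsigned char* memory, std::uint16_t pc, int steps) {
    std::vector<std::uint16_t> seen;
    for (int i = 0; i < steps; ++i) {
        assert(pc + 1 < 4096);  // assumes the 4096-byte memory used in this thread
        const std::uint16_t opcode =
            static_cast<std::uint16_t>((memory[pc] << 8) | memory[pc + 1]);
        seen.push_back(opcode);
        if ((opcode & 0xF000) == 0x2000) {
            pc = static_cast<std::uint16_t>(opcode & 0x0FFF);  // continue at nnn
        } else {
            pc = static_cast<std::uint16_t>(pc + 2);           // next opcode in sequence
        }
    }
    return seen;
}
```

With 22 FC at 0x200, 6B 0C at 0x202 and 6B 20 at 0x2FC, a two-step trace starting at 0x200 yields 22FC followed by 6B20. So comparing the printed opcodes against the hex editor line by line only works until the first jump or call.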

I'm just surprised I didn't notice this typo before, but now Pong works correctly. I'll later check emulator with other programs, but everything should be fine now. Thank you all for all your help!
