
[.net] Reading Two Bytes as One


I am at my wits' end here and about to break something so I thought I'd tap gamedev once more. I've googled for hours. Searched the gamedev site. Visited lame sites that claim to have the answers but want you to register first. The works. Nothing. Maybe it's because I haven't slept in two days or maybe I'm just worse at programming than I thought but this is stupid hard for me.

Here's the situation: I have a file. Its contents look like 00 00 FF 00 repeated over and over again. When read in to memory, 00 is treated as two individual bytes (a char each). I want 00 to be treated as a single byte so I can write the damn thing to address and get the ascii representation of the hex value.

I found out that the images I was writing out as .tga was writing the hex values of what were supposed to be my hex values. I wasn't pleased. I was simply moving the source file's content to an output with an added header and a new extension. But for some reason, when looking at this in a hex editor, the values were nowhere near to the original's values.

No code because I know it doesn't work and I know that the way I'm doing it assumes that I'm trying to convert a single char into a piece of an address. Here's an example though of what I'm doing:
unsigned int stride32;
unsigned char *buffer = new unsigned char[length];
file.read(buffer, length); //go on until end of file

int abyte = 0;
//we have the original file in a bigass buffer now. Sweet. We have the "hex" values saved and ready for use. 

unsigned char *whereisit = (unsigned char*)&stride32;
*(whereisit + abyte) = buffer[first two chars];
abyte += 1; //next byte in int
*(whereisit + abyte) = buffer[next two chars]; //yes, I skip whitespace
abyte += 1; //next byte in int
//Repeat that until all 4 bytes of int have been filled then....

cout << stride32;

Not the exact code but it sums up what I'm trying to do I guess. I would really appreciate ANY help I can get with this. Thanks again gamedev.net!

Quote:
Original post by caldiar
I have a file. Its contents look like 00 00 FF 00 repeated over and over again.

When read in to memory, 00 is treated as two individual bytes (a char each)

I want 00 to be treated as a single byte so I can write the damn thing to address and get the ascii representation of the hex value.


Maybe this is way off, as I'm not sure if I understand what you're trying to do, but I think you should do something like this:


for each pair of characters (i.e., 00/FF)
int leftDigit = convert left char to a number
int rightDigit = convert right char to a number
char theByte = (char)(leftDigit * 16 + rightDigit);


Again, sorry if this is way off.

EDIT: Made some changes to the pseudo-code.

EDIT 2: Changed it again, thanks Nypyren for pointing out a stupid mistake.

[Edited by - Gage64 on January 14, 2009 2:33:26 AM]

buffer[first two chars]? I assume you mean buffer[first char].

If so, you should read 4 chars, not just 2.

00 is two chars (0 and 0)

I want 00 to be one char so I can use it to modify the 4 bytes that make up my int.

Basically I want 00 00 FF 00 to fit inside the 4 bytes of an int but when 00 00 FF 00 is read into the buffer each individual digit is treated as a byte when it should be pairs of digits.
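To make the problem concrete, here is a minimal sketch of what actually lands in the buffer when the text is read, assuming an ASCII-compatible character set. The helper name raw_bytes is hypothetical and only used for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the numeric value of every character in `text`, i.e. the
// bytes that end up in the buffer after reading the text file.
// raw_bytes is a hypothetical helper name, not part of the thread's code.
std::vector<int> raw_bytes(const std::string& text) {
    std::vector<int> values;
    for (std::size_t i = 0; i < text.size(); ++i) {
        values.push_back(static_cast<unsigned char>(text[i]));
    }
    return values;
}
```

On an ASCII system, raw_bytes("00 FF") yields 0x30, 0x30, 0x20, 0x46, 0x46: five character codes, not the two bytes 0x00 and 0xFF that the text describes.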

EDITED: I sounded like a jerk. I need some sleep


Quote:
Original post by Gage64
Quote:
Original post by caldiar
I have a file. Its contents look like 00 00 FF 00 repeated over and over again.

When read in to memory, 00 is treated as two individual bytes (a char each)

I want 00 to be treated as a single byte so I can write the damn thing to address and get the ascii representation of the hex value.


Maybe this is way off, as I'm not sure if I understand what you're trying to do, but I think you should do something like this:


for each pair of characters (i.e., 00/FF)
int leftDigit = leftChar - '0';
int rightDigit = rightChar - '0';
char theByte = (char)(leftDigit * 16 + rightDigit);


Again, sorry if this is way off.

EDIT: Made some changes to the pseudo-code.



Make sure that you do something different for the A-F characters, since they aren't next to the 0..9 characters in the ASCII table (after '9' there's junk like ':' and other punctuation marks)

such as:

int leftDigit = (leftChar <= '9') ? (leftChar - '0') : (toupper(leftChar) - 'A' + 0xA);
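Putting the pseudo-code and the A-F correction together might look like the sketch below. The function names hex_digit_value and hex_pair_to_byte are hypothetical, and rejecting non-hex characters with an exception is an added assumption, not something the thread asks for.

```cpp
#include <cctype>
#include <stdexcept>

// Convert one hex digit character to its value 0..15.
// hex_digit_value is a hypothetical helper name.
int hex_digit_value(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (!std::isxdigit(uc)) {
        // Assumption: invalid input is reported rather than silently converted.
        throw std::invalid_argument("not a hex digit");
    }
    // '0'..'9' are contiguous; letters are handled separately via toupper.
    return (uc <= '9') ? (uc - '0') : (std::toupper(uc) - 'A' + 0xA);
}

// Combine two digit characters into a single byte, e.g. 'F','F' -> 255.
unsigned char hex_pair_to_byte(char left, char right) {
    return static_cast<unsigned char>(hex_digit_value(left) * 16 +
                                      hex_digit_value(right));
}
```

For instance, hex_pair_to_byte('0', '0') is 0 and hex_pair_to_byte('F', 'F') is 255, so each two-character pair from the file collapses into one byte.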

Quote:
Original post by caldiar
00 is two chars (0 and 0)

I want 00 to be one char so I can use it to modify the 4 bytes that make up my int.


Sorry, but I still don't understand why you can't use my method to convert every such "pair" into a byte, and then use bit manipulation to combine 4 bytes into an int.

Quote:
Original post by Gage64
Quote:
Original post by caldiar
00 is two chars (0 and 0)

I want 00 to be one char so I can use it to modify the 4 bytes that make up my int.


Sorry, but I still don't understand why you can't use my method to convert every such "pair" into a byte, and then use bit manipulation to combine 4 bytes into an int.

You should use that method - along with the changes suggested by Nypyren - to read in 2 chars and convert them to one byte.

Do that 4 times and build your integer from those 4 pairs. In your code you read each char as a byte, so your 4 pairs take up 8 bytes, not just 4.
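The "bit manipulation" step could be sketched as follows. The name pack_bytes is hypothetical, and putting the first byte in the most significant position is an assumption; the thread does not say which byte order the original poster wants.

```cpp
#include <cstdint>

// Pack four already-converted bytes into one 32-bit value.
// Assumption: the first byte becomes the most significant one.
// pack_bytes is a hypothetical helper name.
std::uint32_t pack_bytes(unsigned char b0, unsigned char b1,
                         unsigned char b2, unsigned char b3) {
    return (static_cast<std::uint32_t>(b0) << 24) |
           (static_cast<std::uint32_t>(b1) << 16) |
           (static_cast<std::uint32_t>(b2) << 8)  |
            static_cast<std::uint32_t>(b3);
}
```

With this ordering, pack_bytes(0x00, 0x00, 0xFF, 0x00) gives 0x0000FF00. Unlike writing through a pointer into the int, shifting gives the same numeric result regardless of the machine's endianness.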

Thank you SO MUCH guys! I love you all <3

The reason I wanted to do this conversion is so I could turn a text file into a 32 bit RGBA tga file.

the text 00 00 FF 00 comes out as gibberish in ascii but the tga's hex value matches when opened with a hex editor.

I am curious though, why am I seeing green and blue in the green and blue channels even though I set the pixel to be only pure red?

Shouldn't green and blue be empty?

Quote:
Original post by caldiar
The reason I wanted to do this conversion is so I could turn a text file into a 32 bit RGBA tga file.


std::ifstream input("input.txt");
std::ofstream output("output.txt", std::ios::out | std::ios::binary);

input >> std::hex;

std::copy(std::istream_iterator<unsigned>(input),
std::istream_iterator<unsigned>(),
std::ostream_iterator<unsigned char>(output));


Quote:
I am curious though, why am I seeing green and blue in the green and blue channels even though I set the pixel to be only pure red?
I don't think your eyes are able to see the actual bit patterns as green and blue colors, so I suspect you're using some kind of software to display the channels of your image. Could it be that your software displays a background greenish color to show that you're seeing the green channel?

turns out the alpha channel was skewing the byte order creating a pattern of red/green/blue instead of just red.

Fixed by outputting as a .raw and loading it as an RGBA file in GIMP.

Wow that's some elegant code.

Here's mine (a ton of stuff is hard coded but it's for personal use so I didn't bother to make it do anything but function)


#include <stdlib.h>
#include <stdio.h>
#include <iostream>
#include <fstream>
#include <sstream>

#include <string>
#include <vector>
#include <windows.h>

//#define NULL 0;

typedef unsigned char byte;
typedef unsigned int size_t;
int length;

using namespace std;

ifstream file;
ofstream o_file("output.tga", ios::binary);
int pos = 0;


char header[18];


char *buffer;
//unsigned char condensed[8]; //00 00 00 00 = 8 bytes / 2 (digit pairs)

void *safe_malloc( size_t size )
{
void *p;

p = malloc(size);
if(!p)
printf("SAFE_MALLOC HAS AN ERROR!");

return p;
}

void FlipBytes(){
//unsigned char* ARGH = (unsigned char*)&stride32;
unsigned int printme;
unsigned char *ptr = (unsigned char*)&printme;
int i = 0;
int left;
int right;
char *header;
header = (char*)safe_malloc(18);
memset(header, 0, 18);
//file.seekg(0);
header[2] = 2;
header[12] = 512 & 255;
header[13] = 512 >> 8;
header[14] = 512 & 255;
header[15] = 512 >> 8;
header[16] = 24;

o_file.write(header, 18);

//file.seekg(0, ios::beg);

while(pos < length){
//00 00 00 00
if(buffer[pos] <= '9'){
left = buffer[pos] - '0';
}else{
if(buffer[pos] == 'A'){left = toupper(buffer[pos]) - 'A' + 0xA; }
if(buffer[pos] == 'B'){left = toupper(buffer[pos]) - 'B' + 0xB; }
if(buffer[pos] == 'C'){left = toupper(buffer[pos]) - 'C' + 0xC; }
if(buffer[pos] == 'D'){left = toupper(buffer[pos]) - 'D' + 0xD; }
if(buffer[pos] == 'E'){left = toupper(buffer[pos]) - 'E' + 0xE; }
if(buffer[pos] == 'F'){left = toupper(buffer[pos]) - 'F' + 0xF; }
}
if(buffer[pos + 1] <= '9'){
right = buffer[pos + 1] - '0';
}else{
if(buffer[pos + 1] == 'A'){right = toupper(buffer[pos + 1]) - 'A' + 0xA; }
if(buffer[pos + 1] == 'B'){right = toupper(buffer[pos + 1]) - 'B' + 0xB; }
if(buffer[pos + 1] == 'C'){right = toupper(buffer[pos + 1]) - 'C' + 0xC; }
if(buffer[pos + 1] == 'D'){right = toupper(buffer[pos + 1]) - 'D' + 0xD; }
if(buffer[pos + 1] == 'E'){right = toupper(buffer[pos + 1]) - 'E' + 0xE; }
if(buffer[pos + 1] == 'F'){right = toupper(buffer[pos + 1]) - 'F' + 0xF; }
}

char newbyte = (char)(left * 16 + right);
o_file << newbyte;

//char newbyte = (char)(left + right);

*(ptr + i) = newbyte;
i++;
if(i >= sizeof(unsigned int)){
//o_file << *((unsigned int*)ptr);
i = 0;
}


pos += 3;



/*l1[0] = (char)&buffer[pos];
l1[1] = (char)&buffer[pos + 1];
//l1[2] = buffer[pos + 2];

l2[0] = buffer[pos + 3];
l2[1] = buffer[pos + 4];
//l2[2] = buffer[pos + 5];

l3[0] = buffer[pos + 6];
l3[1] = buffer[pos + 7];
//l3[2] = buffer[pos + 8];

l4[0] = buffer[pos + 9];
l4[1] = buffer[pos + 10];
//l4[2] = buffer[pos + 11];*/




//o_file.write(l4, 3);


//o_file.flush();


//cout << bytearray << endl;
//bytearray = NULL;
}
//if(file.eof()){
// file.close();
//}
}

int main(int argc, char **argv){





char filename[512];
cout << "Enter the name of the file to convert:" << endl;
cin.getline(filename, 256);
cout << "Ok! Sit tight while this is processed..." << endl;
file.open(filename, ios::binary);
file.seekg(0, ios::end);
length = file.tellg();
file.seekg(0, ios::beg);

buffer = new char [length];
//pos = file.tellg();


file.read(buffer, length);
file.close();
if(file != NULL){
FlipBytes();
}
delete[] buffer; // free the buffer before clearing the pointer, otherwise it leaks
buffer = NULL;
return 0;
}






Just a note. All your lines of:
if(buffer[pos + 1] == 'A'){right = toupper(buffer[pos + 1]) - 'A' + 0xA; }
if(buffer[pos + 1] == 'B'){right = toupper(buffer[pos + 1]) - 'B' + 0xB; }
if(buffer[pos + 1] == 'C'){right = toupper(buffer[pos + 1]) - 'C' + 0xC; }
if(buffer[pos + 1] == 'D'){right = toupper(buffer[pos + 1]) - 'D' + 0xD; }
if(buffer[pos + 1] == 'E'){right = toupper(buffer[pos + 1]) - 'E' + 0xE; }
if(buffer[pos + 1] == 'F'){right = toupper(buffer[pos + 1]) - 'F' + 0xF; }
Result in a canceling out of your first two terms as you just end up adding the hex value at the end. This is because the value in your buffer has already been determined to be the value you are subtracting (IE 'C' - 'C' = 0). Also, you are making specific checks to see if the value is an uppercase character so your toupper call isn't going to do much.

You can cut all of these checks down to a simple
right = buffer[pos + 1] - 'A' + 0x0A;
This will subtract the ASCII value of 'A' from the ASCII value in your buffer (IE 'C' - 'A' = 2) and add the result to the hex value 0x0A ('C' - 'A' + 0x0A = 0x0C).

Earlier, it was suggested to use toupper so that if you had any lower case characters in your file they would be converted to upper case. This would make it so that you could subtract 'A' in all cases instead of having to subtract lower case 'a' (a different ASCII value) if some characters were lower case.

Doesn't matter in the long run, but it'll make your code a little cleaner.

I really don't understand what you're trying to do. It looks to me like you are torturing yourself with this haphazard method. Some files are stored as bytes. Read bytes. Some files are stored as integers. Read integers. Reading in 2 bytes as an integer is as simple as fread(buffer, 4, 1, file).

Quote:
Original post by kittycat768
I really don't understand what you're trying to do. It looks to me like you are torturing yourself with this haphazard method. Some files are stored as bytes. Read bytes. Some files are stored as integers. Read integers. Reading in 2 bytes as an integer is as simple as fread(buffer, 4, 1, file).


fread is a legacy C function. The proper C++ solution is given by ToohrVyk above.

Quote:
Original post by caldiar
turns out the alpha channel was skewing the byte order creating a pattern of red/green/blue instead of just red.


In the future, feel free to post (small) pictures to illustrate this kind of problem.

Quote:

Wow that's some elegant (ToohrVyk's) code.


Welcome to C++. :)

Quote:


void *safe_malloc( size_t size )
{
void *p;

p = malloc(size);
if(!p)
printf("SAFE_MALLOC HAS AN ERROR!");

return p;
}



OK. Many problems here.

1) This is C++. Use new. It's idiomatic.
2) This is C++. Use new. It's type-safe. malloc() is not.
3) This is C++. Use new. It doesn't require you to think about the size of the type of thing you want to allocate. malloc() does.
4) This is C++. Use new and new[]. This keeps scalar and array allocations distinct (which may allow the compiler to make some optimizations in the underlying allocation system, and is useful for documentation purposes). malloc() cannot make such a distinction.
5) You already use new[] elsewhere in the code, so be consistent.
6) Actually, forget all of that, and use std::vector instead. Don't mess around with calling delete yourself. Using a standard library container helps immensely with exception safety.
7) Printing an error message locally doesn't make anything "safe". By writing the code this way, you (a) make the writer of the calling code (you) think that no check for a null return value is necessary (when it is); (b) deny the writer of the calling code (you) the option of *not* printing the message, or of formatting it differently, etc.

Only attempt to handle errors at the point where they can be handled. This is largely the point of exceptions in C++: to provide an easy way to get back to the point in the code where the problem can/should be handled.

Often, failed allocations shouldn't be handled at all, but should just let the program crash - in a controlled manner! - without letting other stuff execute. With new (and implicitly with standard library containers), this is easy to arrange: if there is a failure, a std::bad_alloc exception will be thrown, and all you do is - nothing. When no exception handler is found, std::terminate will be called, and you're done. new and new[] will never (in the normal form) return a NULL pointer (although you can use a special form that will, if you really need it).
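A small sketch of the allocation forms mentioned above, for comparison. The function names are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Normal form: on failure this throws std::bad_alloc, so the
// returned pointer never needs a null check.
char* allocate_or_throw(std::size_t n) {
    return new char[n];
}

// Special non-throwing form: returns a null pointer on failure,
// so the caller has to check the result.
char* allocate_or_null(std::size_t n) {
    return new (std::nothrow) char[n];
}

// Point 6: a std::vector owns its storage, zero-initialises the
// elements and frees them automatically, so no delete[] is needed.
std::vector<char> make_zeroed_buffer(std::size_t n) {
    return std::vector<char>(n);
}
```

Memory from the first two functions still has to be released with delete[]; the vector version avoids that bookkeeping entirely.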

As a result, we get:


// Oh, and don't use comments to disable code. Get a proper version control
// system; then you can be aggressive about deleting bits, since you can
// always get them back from the repository. Alternatively, use something like
// #if 0
// ...
// #endif
// for temporary disabling code.
void FlipBytes(){
unsigned int printme;
// Please use C++ style casts in C++.
unsigned char *ptr = reinterpret_cast<unsigned char*>(&printme);
int i = 0;
// Please, do anything you can to make sure your editor is not allowed
// to mix tabs and spaces freely for indentation.
int left; // this was previously indented 4 spaces instead of a tab.
int right;
// In C++, we use std::fill to zero-initialize ranges (arrays,
// containers or portions thereof); but std::vector guarantees that the
// initial elements - if you request any - will be default-constructed
// (which makes them zero for primitive types).
std::vector<char> header(18);
header[2] = 2;
header[12] = 512 & 255;
header[13] = 512 >> 8;
header[14] = 512 & 255;
header[15] = 512 >> 8;
header[16] = 24;

o_file.write(&header.front(), 18);

while (pos < length) {
//00 00 00 00
if (buffer[pos] <= '9'){
left = buffer[pos] - '0';
} else {
if(buffer[pos] == 'A'){left = toupper(buffer[pos]) - 'A' + 0xA; }
if(buffer[pos] == 'B'){left = toupper(buffer[pos]) - 'B' + 0xB; }
if(buffer[pos] == 'C'){left = toupper(buffer[pos]) - 'C' + 0xC; }
if(buffer[pos] == 'D'){left = toupper(buffer[pos]) - 'D' + 0xD; }
if(buffer[pos] == 'E'){left = toupper(buffer[pos]) - 'E' + 0xE; }
if(buffer[pos] == 'F'){left = toupper(buffer[pos]) - 'F' + 0xF; }
}
if(buffer[pos + 1] <= '9'){
right = buffer[pos + 1] - '0';
}else{
if(buffer[pos + 1] == 'A'){right = toupper(buffer[pos + 1]) - 'A' + 0xA; }
if(buffer[pos + 1] == 'B'){right = toupper(buffer[pos + 1]) - 'B' + 0xB; }
if(buffer[pos + 1] == 'C'){right = toupper(buffer[pos + 1]) - 'C' + 0xC; }
if(buffer[pos + 1] == 'D'){right = toupper(buffer[pos + 1]) - 'D' + 0xD; }
if(buffer[pos + 1] == 'E'){right = toupper(buffer[pos + 1]) - 'E' + 0xE; }
if(buffer[pos + 1] == 'F'){right = toupper(buffer[pos + 1]) - 'F' + 0xF; }
}

char newbyte = (char)(left * 16 + right);
o_file << newbyte;

ptr[i++] = newbyte; // similarly here with spaces vs. tabs.
// Also notice that array indexing and pointer arithmetic
// are equivalent.
if(i >= sizeof(unsigned int)){
i = 0;
}
pos += 3;
}
}



Of course, if we notice that

(a) Using globals for something like this is really horrible (this is exactly what function parameters are for: to pass information into the function);
(b) There's no point in constructing the "printme" value any more;
(c) Each hex "nibble" can be converted in the same way, and making a function to convert each one would probably be cleaner;
(d) We could probably use a similar approach to write numbers into the header;
(e) 'A' through 'F' are consecutive in ASCII;
(f) Variables should be declared near first use and scoped as tightly as possible;

we get something more like:


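#include <algorithm>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <vector>

template <int size>
void writeLittleEndian(char* where, int what) {
    // The hand-written header above stores the low byte first, so we do
    // the same here. We make a local copy of 'what', which is modified...
    for (char* c = where; c != where + size; ++c) {
        *c = what & 255;
        what >>= 8;
    }
}

int as_hex(char c) {
    if (c >= '0' && c <= '9') { return c - '0'; }
    if (c >= 'A' && c <= 'F') { return c - 'A' + 0xA; }
    if (c >= 'a' && c <= 'f') { return c - 'a' + 0xA; }
    throw std::domain_error("not a hex digit");
}

// 'buffer' is taken by reference so that 'pos' and 'buffer.end()' belong
// to the same container.
void FlipBytes(std::ofstream& o_file, const std::vector<char>& buffer, std::vector<char>::const_iterator pos){
    std::vector<char> header(18);
    header[2] = 2;
    writeLittleEndian<2>(&header[12], 512);
    writeLittleEndian<2>(&header[14], 512);
    header[16] = 24;

    o_file.write(&header.front(), 18);

    // Each entry is two hex digits followed by one separator character.
    while (buffer.end() - pos >= 2) {
        int left = as_hex(*pos);
        int right = as_hex(*(pos + 1));
        // Don't need a temporary for this.
        o_file << char(left * 16 + right);
        // Don't step past the end if the last pair has no trailing separator.
        pos += std::min<std::ptrdiff_t>(3, buffer.end() - pos);
    }
}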



You know, assuming you still want to do all the work, when it can be done like ToohrVyk showed... :)
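One thing worth checking against the earlier colour-channel report: the header written above declares 24 bits per pixel, while the original poster describes the data as 32-bit RGBA, i.e. four bytes per pixel. A viewer that reads four-byte pixels three bytes at a time would plausibly show shifting red/green/blue patterns. The sketch below is only an illustration of a header whose depth matches the data; make_tga_header and its parameters are hypothetical names, and the layout is the standard 18-byte TGA header for uncompressed true-colour images.

```cpp
#include <cstddef>
#include <vector>

// Build an 18-byte header for an uncompressed true-colour TGA.
// make_tga_header is a hypothetical helper, not code from the thread.
std::vector<unsigned char> make_tga_header(unsigned width, unsigned height,
                                           unsigned bits_per_pixel) {
    std::vector<unsigned char> header(18);  // every field starts as zero
    header[2]  = 2;                         // image type 2: uncompressed true-colour
    header[12] = static_cast<unsigned char>(width & 0xFF);         // width, low byte first
    header[13] = static_cast<unsigned char>((width >> 8) & 0xFF);
    header[14] = static_cast<unsigned char>(height & 0xFF);        // height, low byte first
    header[15] = static_cast<unsigned char>((height >> 8) & 0xFF);
    header[16] = static_cast<unsigned char>(bits_per_pixel);       // 24 or 32
    if (bits_per_pixel == 32) {
        header[17] = 8;                     // low 4 bits: alpha bits per pixel
    }
    return header;
}
```

For a 512x512 image with four bytes per pixel, make_tga_header(512, 512, 32) sets bytes 12 to 15 to 0, 2, 0, 2 and byte 16 to 32.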

Quote:
Original post by kittycat768
I really don't understand what you're trying to do. It looks to me like you are torturing yourself with this haphazard method. Some files are stored as bytes. Read bytes. Some files are stored as integers. Read integers. Reading in 2 bytes as an integer is as simple as fread(buffer, 4, 1, file).


1) As pointed out, fread() belongs to C.
2) All files are stored as bytes. It is up to programs to interpret them as integers. A more accurate statement would be that some files are formatted such that they're intended to be interpreted as a series of integers. Of course, integer sizes and endian-ness are platform-specific.
3) You've missed the point, anyway: the file contains text which looks like what a hex editor would display for a binary file, and he wishes to convert this to the corresponding actual binary data. For example, if the file contains the two bytes '2' '0' in sequence, he would want to write a space character (ASCII 32, 0x20 in hex) to the file.

ToohrVyk's code does this, is blindingly short, and shows off the C++ standard library just as the language designers presumably intended (it would be hard to imagine coming up with classes like istream_iterator otherwise). I would probably use an ostreambuf_iterator for the output, though (ostream_iterator uses operator<<, which will happen to work fine for outputting chars, but doesn't document the intent very well). This kind of thing is more important to think about on the input side (if you use istream_iterator<char>, it will skip whitespace; but here, skipping whitespace is the desired behaviour).
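As a rough sketch of that variation, the same idea with an ostreambuf_iterator on the output side might look like this. The function names to_byte and hex_text_to_bytes are hypothetical, and the string-based wrapper exists only so the example can be tried without files.

```cpp
#include <algorithm>
#include <ios>
#include <iterator>
#include <sstream>
#include <string>

// Narrow a parsed number to one raw byte (assumes values 0..255).
char to_byte(unsigned value) {
    return static_cast<char>(static_cast<unsigned char>(value & 0xFF));
}

// Read whitespace-separated hex numbers from `in` and write each one
// to `out` as a single unformatted byte.
void hex_text_to_bytes(std::istream& in, std::ostream& out) {
    in >> std::hex;  // parse the following numbers as hexadecimal
    std::transform(std::istream_iterator<unsigned>(in),
                   std::istream_iterator<unsigned>(),
                   std::ostreambuf_iterator<char>(out),
                   to_byte);
}

// Convenience wrapper for in-memory text.
std::string hex_text_to_bytes(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    hex_text_to_bytes(in, out);
    return out.str();
}
```

Calling hex_text_to_bytes("00 00 FF 00 20") returns a five-byte string whose third byte is 0xFF and whose last byte is a space. With files, the output stream should be opened with std::ios::binary, as in ToohrVyk's version.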

I know my code is a mess and I typically don't program like that.

That's angry coding that you're seeing :D

the safe_malloc function is a function I borrowed from the source of another program that made use of it in a 24 bit tga writing function it had.

Well, the program works as intended though so it's all good now. Maybe... MAYBE if I get a wild hair up my butt I'll go back over the program and clean it.

Here's the resulting image on a successful run:


I was writing this program so I can have a greater control over the lightmaps used in Call of Duty 4. (For instance, say we have a texture that is a light with a grate over it. The light doesn't cast the shadows of the grate because they're both the same texture. Being able to pull out and recompile the lightmap data with custom painted lightmaps lets me paint the shadows of the grate that should appear)


Thanks again guys for all the help. I'm not used to writing this kind of stuff.

Also, just wondering, why was this moved to the .NET forum?

Quote:
Original post by kittycat768
Legacy is a really nasty term used for things that are old. WinMain() has got to be like 15 years old by now at least. We sure need that legacy code to make our programs work. Stupid legacy code. Why can't they make an STL template to replace WinMain()... fread() belongs to the standard C library. Everything in C is a part of C++. I really don't understand why people try to act like C is a dog and C++ is a cat. In reality C is a puppy and C++ is a dog. fread() is in no risk of becoming extinct anytime soon. Many things in C++ were designed to make things that you'd normally do in C easier. I don't see file reading getting any easier than fread(). If it isn't broken, don't fix it. Also, your wording "proper solution" is simply an opinion. This doesn't mean that I'm not open to learning how to open and read files with STL. I'm actually enjoying learning STL quite a bit. My main point is that fread() deserves our respect.


You are wrong. fread is not a part of C, it is a part of the C Standard Library. If you are using C++ you should not be using the C Standard Library, you should be using the C++ Standard Library and its subset the Standard Template Library. The proper way to read a file with C++ is to use streams, or more accurately file streams.

You are confusing the nature of C/C++ with the nature of the C/C++ Standard Libraries. C++ was meant to be backwards compatible with C. This is purely in relation to the language itself. The standard libraries that accompany each language are in no way meant to be compatible.
