ajm113

Store Binary File Into a String? (C++)

ajm113    355
Hello, I was wondering how I can take a binary file, grab its contents and place them into a string? And yes, I looked at the fread example on cplusplus.com, and I have been Googling for a few hours. Every method I tried either doesn't work or only gets 4 characters of the binary file... I seem to have more luck with text files when using fopen to read the file as binary, which is kind of ironic if you ask me. Anyway, the reason I want to do this is that I'm using a MySQL server to store files. I need to upload the contents of the binary file to MySQL for it to actually store the file, so that I can then do what I want with that file in the database. Any suggestions?

GregMichael    135
"Every method I tried to use doesn't work or it only gets 4 characters of the binary file... I seem to have more luck with text files using the fopen reading the file as a binary which is kinda ironic if you ask me."

Can you post some of the methods you've tried?

One thing that seems "familiar", in that I've seen it done in the past, is something like the following... pseudo code, kind of...

char * buffer = allocate( x number of bytes );

read sizeof(buffer) bytes from file... which reads sizeof(char *) = 4 bytes on a 32-bit build, not x bytes

Oh, what an optimization to find (embarrassingly)... correcting it sped up load times no end... LOL!
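
The mistake described above can be sketched in a few lines. This is only an illustration, not code from anyone in the thread: the helper names `wrong_byte_count` and `right_byte_count` are made up for the example. The point is that `sizeof` applied to a pointer gives the size of the pointer itself (4 bytes in a 32-bit build, typically 8 in a 64-bit one), never the size of the allocation it points at.

```cpp
#include <cstddef>

// Hypothetical helper: the byte count the buggy code would pass to fread.
std::size_t wrong_byte_count(std::size_t allocated)
{
    char* buffer = new char[allocated];
    std::size_t n = sizeof(buffer);   // size of the pointer, not of the allocation
    delete[] buffer;
    return n;
}

// Hypothetical helper: the count has to be carried alongside the pointer.
std::size_t right_byte_count(std::size_t allocated)
{
    char* buffer = new char[allocated];
    std::size_t n = allocated;        // the number of bytes actually allocated
    delete[] buffer;
    return n;
}
```

With a 1024-byte allocation the first helper still reports only the pointer size, which would explain a read that always stops after 4 bytes.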

Zao    971
If the file contains binary data, you should not put it in a string. Strings are for text.

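#include <cstdio>
#include <vector>

FILE* f = fopen(filename, "rb");   // check for NULL in real code
fseek(f, 0, SEEK_END);             // arguments are (stream, offset, origin)
long size = ftell(f);              // position at the end == size in bytes
fseek(f, 0, SEEK_SET);             // rewind before reading

std::vector<char> buf(size);
fread(&buf[0], 1, size, f);        // assumes size > 0
fclose(f);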



And of course, this is the C-style FILE*-based API. For sanity, you could/should use std::ifstream.

In any case, you should use a std::vector<T> whenever you need a contiguous buffer of suitable size to pass somewhere, like, say, MySQL. &v[0] gives you a pointer to the first element, and v.size() gives you the number of elements.
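
A minimal sketch of the std::ifstream route, assuming the whole file fits in memory. The function name `read_all_bytes` is hypothetical, and returning an empty vector on failure is just one possible error policy.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: reads a whole file into a contiguous byte buffer.
// Returns an empty vector if the file cannot be opened or fully read.
std::vector<char> read_all_bytes(const char* path)
{
    // std::ios::ate opens the stream positioned at the end of the file.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file)
        return std::vector<char>();

    const std::streamoff size = file.tellg();   // end position == size in bytes
    if (size <= 0)
        return std::vector<char>();

    std::vector<char> buf(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    file.read(&buf[0], static_cast<std::streamsize>(size));
    if (!file)
        buf.clear();                             // short read: report failure
    return buf;
}
```

The caller can then hand `&buf[0]` and `buf.size()` to whatever API expects a pointer and a length, after checking that the vector is not empty.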

ajm113    355
Hey Zao!

I tried out your method and it works so far. Right now I'm just debugging a little problem with the accumulate function I'm using to convert the vector into the string.

So far the output is correct. What I find really odd is that when I return that string from the function, its output is this: 'Ä'. Is there any reason why it turns out like that when I return the string?


//====================================
//getFileContents(const char *szfilePath)
//====================================

//...


std::vector<char> vBuf(lSize);

fread(&vBuf[0], 1, lSize, pFile);

// terminate
fclose (pFile);

std::string strBuf = accumulate(vBuf.begin(), vBuf.end(), std::string(""));

buffer = (char*)strBuf.c_str();

return buffer;

}

//======================================
//SomeOther.cpp
//======================================

buf = fileSearch.getFileContents(szFilename);

//After that is called buf equals to "Å"?




Is this some kind of garbage collection issue?

the_edd    2109

std::ifstream file("filename.txt", std::ios::binary);
std::istreambuf_iterator<char> b(file), e;
std::string content(b, e);


EDIT:


std::string strBuf = accumulate(vBuf.begin(), vBuf.end(), std::string(""));

buffer = (char*)strBuf.c_str();

return buffer;

}



You're effectively returning a pointer to a local variable. Once the function has returned, strBuf and all of its resources are destroyed. buffer will point to memory with unpredictable content.
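
One way around the dangling pointer, sketched under the assumption that the function is free to change its return type: return the std::string itself, so the caller gets a copy that outlives the function. The name `to_string_by_value` is hypothetical, and the range constructor is swapped in for accumulate here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the tail of getFileContents().
// Returning by value gives the caller its own string, so nothing
// points into a local that has already been destroyed.
std::string to_string_by_value(const std::vector<char>& vBuf)
{
    std::string strBuf(vBuf.begin(), vBuf.end());  // copies every byte, zeros included
    return strBuf;
}
```

The caller would keep the result in a std::string and only call `.c_str()` or `.data()` at the point where a raw pointer is actually needed.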

[Edited by - the_edd on April 25, 2010 4:03:13 PM]

Zahlman    1682
1) You can't just return the .c_str() of a local variable, for the same reason that you can't return a reference to a local variable or a pointer into a local array. So just return the string. But actually, don't just return the string; just return the vector - keep reading.

2) Using std::accumulate like that is going out of your way to do huge amounts of extra work and slow down the program while churning through memory.

You can create a string from a std::vector<char> directly, like so:

std::string strBuf(vBuf.begin(), vBuf.end());


3) But why would you want to do that? As you were already told, std::string instances are for text, and you don't have text, or you would be opening it as a text file rather than as a binary file. Just return the vector.

ajm113    355
Alright, well, now I'm back to the first problem I started with, of only getting 4 characters of the file. But I need it in the form of a string so I can run a MySQL command through C++ to upload the file. Is there some alternative to using a string to store the binary contents, so I can transfer the file via Winsock and MySQL?

Atrix256    539
Yes there are alternatives!

#1 - Convert your binary data into a string of hexadecimal digits. This will double the size of your data, but it will be safe to store in a database text field.

or

#2 - Convert your binary data into a string of base64 encoding. This is more commonly used on the web, and makes your data about 33% larger instead of 100%. Check it out here: http://en.wikipedia.org/wiki/Base64
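
As a rough sketch of option #1, here is one way to turn a byte buffer into hexadecimal text. The function name `to_hex` and the choice of uppercase digits are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: encodes each byte as two uppercase hex digits.
std::string to_hex(const std::vector<char>& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(bytes.size() * 2);              // output is exactly twice as long
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Go through unsigned char so bytes above 0x7F do not turn negative.
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        out.push_back(digits[b >> 4]);          // high nibble
        out.push_back(digits[b & 0x0F]);        // low nibble
    }
    return out;
}
```

For example, the five bytes of "Hello" come out as "48656C6C6F". Because the result contains only the characters 0-9 and A-F, it has no embedded zero bytes or quote characters.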

But why are you trying to store files in a database? Files don't belong in databases; it makes the database perform very poorly.

Store it on disk instead, and if you need to store data about the file too, store that in the database, but keep the file separate!

Zipster    2359
Quote:
Original post by ajm113
Alright, well, now I'm back to the first problem I started with, of only getting 4 characters of the file. But I need it in the form of a string so I can run a MySQL command through C++ to upload the file. Is there some alternative to using a string to store the binary contents, so I can transfer the file via Winsock and MySQL?

Did you try the code the_edd posted? How do you know you only read 4 bytes?

ajm113    355
@Zipster and edd, sorry, I forgot to mention! I tried that and I get pretty much the same effect. I'm just looking at what VS2008 gives me in debug mode. I'm guessing VS2008 isn't always accurate in debug mode with strings that hold binary data?

@Atrix256

I really like the hex idea! I already have a hex converter going and am sending hex to the server for the file right now. Plus I think it would save me from having to create an anti-MySQL-injection system.

The reason I'm using a database instead of a regular old FTP route is security: just so no one can get into my computer using what I created or try to access files the user uploaded. Plus it may be for a website too on the side.

It will make my life easier just using PHP and C++ with MySQL than having to write a security system myself in C++ and PHP to keep people from hacking files or easily doing any damage to them.

I have one more question, if nobody minds me asking.

For some odd reason, even though the variables show the correct values in debug mode in VS2008, sprintf seems to be messing the values up and sometimes crashing. Here is an example of what I have:



bool Mysql_class::UploadFile(int UserId, MYSQL_FILE_PACKET packet)
{
    if(strlen(packet.fileData.c_str()) < 1 || strlen(packet.fileName.c_str()) < 4)
    {
        return false;
    }
    int StatementId = FindAvailableStatement();

    if(StatementId < 0)
    {
        return false;
    }

    try
    {
        char query[2025];

        char* fileName = (char*)packet.fileName.c_str();
        char* gameName = (char*)packet.GameName.c_str();
        char* fileData = (char*)packet.fileData.c_str();
        __int64 CreationTime = packet.fileCreationDate;

        sprintf(query, "UPDATE files SET fileCreationDate=%i, fileData=\'%s\' WHERE owner=%i;", CreationTime, fileData, UserId);

        // pstmt = con->prepareStatement();
        MySQLstatment[StatementId].stmt = con->createStatement();
        MySQLstatment[StatementId].stmt->execute(query);

The value of query is supposed to be this:
"UPDATE files SET fileCreationDate=2147483647, fileData='0x01CF7EC0 48656C6C6F20576F726C6421DA486F77417265596F753F' WHERE owner=1;".

But the variable's value afterwards is really this:
"UPDATE files SET fileCreationDate=2147483647, fileData='(null)' WHERE owner=30375616;" Which is really odd, because fileData isn't empty, and I don't know where the heck 30375616 is coming from...

Sorry if this question seems kind of basic for me to ask, but I'm a bit light-headed because my room is really stuffy.
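
One plausible reading of that output, assuming a 32-bit Visual C++ build (nobody in the thread confirms this): `%i` consumes a 32-bit int, but `CreationTime` is a 64-bit `__int64`, so every later argument is read from the wrong place. The `%s` would then pick up the zero upper half of `CreationTime` and print "(null)", and the final `%i` would print the `fileData` pointer. That fits the numbers: 30375616 in decimal is 0x01CF7EC0, the address shown in front of the expected data. The fixed 2025-byte buffer is a second hazard for larger files. A minimal sketch with matching specifiers and a measured buffer follows; `build_update_query` is a hypothetical helper, and it assumes a C++11 library where `snprintf` and `%lld` are available.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not the thread's code: formats the query with
// specifiers that match the argument types (%lld for the 64-bit time).
// Assumes hexData contains only hex digits, so no escaping is attempted.
std::string build_update_query(long long creationTime,
                               const std::string& hexData,
                               int userId)
{
    static const char fmt[] =
        "UPDATE files SET fileCreationDate=%lld, fileData='%s' WHERE owner=%d;";

    // First pass only measures, so a large payload cannot overflow a fixed array.
    const int needed = std::snprintf(nullptr, 0, fmt,
                                     creationTime, hexData.c_str(), userId);
    if (needed < 0)
        return std::string();

    std::string query(static_cast<std::size_t>(needed) + 1, '\0');
    std::snprintf(&query[0], query.size(), fmt,
                  creationTime, hexData.c_str(), userId);
    query.resize(static_cast<std::size_t>(needed));   // drop the terminator
    return query;
}
```

Building SQL by string formatting is only tolerable here because hex text cannot contain quotes; a prepared statement with bound parameters would avoid the question entirely.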

[Edited by - ajm113 on April 26, 2010 9:53:30 PM]

Atrix256    539
Take it from someone who has made systems like this: storing your files in the database really is a bad idea.

Also, is your application running on a client and connecting to a MySQL server directly? If so, that is also a really, really bad idea, because then everyone will have the login info you used and will be able to connect to your DB with whatever user rights your program has and do whatever they want.

Here's a better idea...

From your client program, use libcurl to talk HTTP to a PHP script, something like:

http://www.mydomain.com/index.php?function=uploadfile

Send your file data as a POST argument, and base64 encode it (because it's way smaller than hex!)
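
For reference, a small base64 encoder might look like the sketch below. It uses the standard alphabet with '=' padding; the function name `to_base64` is made up for the example. The output still has to be URL-encoded before it goes into a form field, since '+' and '/' are not safe there.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: standard base64 with '=' padding.
std::string to_base64(const std::vector<char>& bytes)
{
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((bytes.size() + 2) / 3 * 4);    // 4 output chars per 3 input bytes

    for (std::size_t i = 0; i < bytes.size(); i += 3) {
        const std::size_t remaining = bytes.size() - i;

        // Pack up to three bytes into one 24-bit group, zero-filling the tail.
        unsigned long group = static_cast<unsigned char>(bytes[i]) << 16;
        if (remaining > 1) group |= static_cast<unsigned char>(bytes[i + 1]) << 8;
        if (remaining > 2) group |= static_cast<unsigned char>(bytes[i + 2]);

        // Emit four 6-bit symbols, padding where input bytes were missing.
        out.push_back(table[(group >> 18) & 0x3F]);
        out.push_back(table[(group >> 12) & 0x3F]);
        out.push_back(remaining > 1 ? table[(group >> 6) & 0x3F] : '=');
        out.push_back(remaining > 2 ? table[group & 0x3F] : '=');
    }
    return out;
}
```

Three input bytes become four output characters, which is where the size increase of roughly one third comes from.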

Have your PHP handle the file: write the file to disk with a unique ID such as "1.dat" and store information about that file in your database, where "1" is the primary key of your file information table.

If you want to download a file you can hit a URL like this...

http://www.mydomain.com/index.php?function=download&fileid=1

If you want to, you can add username/password parameters to your PHP script to make sure the user is valid.

There is more security stuff you can do, but this is a whole world better than what you are planning.

No offense, just want to inform!

ajm113    355
Well, actually, MySQL is connected to my server application. For the user to do ANYTHING they always have to supply a username and password, for things like file uploads, downloads, user account checking, etc., and the server automatically logs the user off after a command is done, so they don't stay connected all the time. Plus, using MySQL saves me a lot of time making backups for clients or removing a user, with the database on my PC.

Plus, the thing is, if I did use my web service, yes, I would get more traffic, but I would still have a speed disadvantage and a risk of losing my account, because most hosting companies frown on files that aren't being used for your website. It's kind of stupid, I know, but I'm not going to write a blog about it...

Anyway, the client application stores the login, and whenever it sends the password it will always be encrypted.

And don't worry, I think it's great you're taking your time to inform me of the risks, and I appreciate it; I know you're not intending to kill my hopes and dreams, hehe.

On the other end...

I also figured out why I was only getting a short part of my binary string!

When the buffer passes through anything using strlen(), my code ends up copying only that small part to send to the server.

I'm guessing that because garbage collection isn't included with most C++ compilers by default, they added a check for a "0" in strlen(), which then says the string ends there! Instead I used ".length" on a std::string for my buffer.

It appears to work pretty well for large binary files! Thank you guys!

Atrix256    539
OK, well, I wanted to give you warnings, but if you are set on going that way (especially if you have technical reasons why you need to!) then so be it (:

Good luck man, and I'm glad you got your string thing working.

the_edd    2109
EDIT: sorry, I posted the stuff below in haste and didn't see your most recent post. However, this has nothing to do with garbage collection. strlen determines the length of a string by looking for a 0 byte. That's just what it does. It might be prudent to get hold of a good tutorial text and brush up on your C++ fundamentals.

Quote:
Original post by ajm113
@Zipster and edd, sorry, I forgot to mention! I tried that and I get pretty much the same effect. I'm just looking at what VS2008 gives me in debug mode. I'm guessing VS2008 isn't always accurate in debug mode with strings that hold binary data.


Do you mean you're looking at it in a debugger, or you're compiling your code with debug settings? Whichever method you're using to look at the data, make sure it really is looking at all the data and not stopping at the first '\0' byte.

For example, if you look at a char* in your common or garden variety debugger, it will only show you the data up until the first '\0'. Similarly, using strlen to determine the amount of binary data a char* points at is incorrect.

In binary files '\0' bytes (having numeric value 0) are extremely common. Since you're only seeing 4 characters, I'm guessing the 5th byte of your file is actually equal to 0. Is this the case? Get a hex/binary editor and check.
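
The difference is easy to demonstrate with a buffer that has a zero as its 5th byte. The sketch below is only an illustration; the helper names `length_via_strlen` and `length_via_size` are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: measures the buffer the way strlen does.
std::size_t length_via_strlen(const std::string& data)
{
    return std::strlen(data.c_str());   // stops at the first 0 byte
}

// Hypothetical helper: asks the container how many bytes it holds.
std::size_t length_via_size(const std::string& data)
{
    return data.size();                 // counts every byte, zeros included
}
```

For the eight bytes 'R', 'I', 'F', 'F', 0, 0x10, 0, 0 the first function reports 4 and the second reports 8. The string also has to be built with an explicit length, as in `std::string(raw, 8)`; constructing it from a bare `const char*` would cut it off at the first zero as well.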

So, my code again with a couple of amendments (reading in to a vector now, to move away from the binary/text confusion):


std::ifstream file("filename.txt", std::ios::binary);
std::istreambuf_iterator<char> b(file), e;
std::vector<char> content(b, e);

// The correct way to tell how much data was read:
std::cout << "read " << content.size() << " bytes from file\n"; // try this!

Zahlman    1682
Have you actually examined the contents of the file?

Are you opening the file in binary mode?

And again, why are you putting it in a string? When you call the file a binary file, you are basically saying that you don't want to treat its contents as "text". When you put it in a string, you say that you do want to treat it that way. Which is it?
