[EDIT: nevermind, just read the entirety of the thread... doh]
-me
[edited by - Palidine on May 18, 2004 8:09:27 PM]
Large (I mean absolutely huge) File Parsing (Solved)
Don't solve the problem; sidestep it. Use memory-mapped files. Here is my code to open a file in memory-mapped mode. Once open, you address it as if it were part of memory, i.e. no fseek or anything.
edit: breaking tables bad
[edited by - SiCrane on May 18, 2004 8:25:41 PM]
void CFile::OpenReadFile(void)
{
    HANDLE FileHandle = CreateFile(m_Name.c_str(), GENERIC_READ, FILE_SHARE_READ,
                                   NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (FileHandle == INVALID_HANDLE_VALUE)
        throw FileUtilities::FileNotFound();

    m_Size = (unsigned int)GetFileSize(FileHandle, NULL);
    if (m_Size == 0)
        return;

    m_Handle = CreateFileMapping(FileHandle, NULL, PAGE_READONLY, 0, m_Size, NULL);
    if (m_Handle == NULL)
        throw std::runtime_error("Couldn't create file mapping");

    // The mapping holds its own reference to the file, so the original
    // file handle can be closed once the mapping exists.
    if (!CloseHandle(FileHandle))
        throw std::runtime_error("Couldn't close file handle");

    m_Data = (char*)MapViewOfFile(m_Handle, FILE_MAP_READ, 0, 0, 0);
    if (m_Data == NULL)
        throw std::runtime_error("Couldn't map view of file");
}
Also, though this doesn't solve the problem at hand, there are several freely available programs that you can plug into Apache to get it to spit out separate log files for each day/month/year/whatever. Having a separate log file for each month is a good idea anyway, because then you can delete old log files and not have to worry about the bloat. Just google around and I'm sure you'll find something.
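One commonly used option is the `rotatelogs` program that ships with Apache, driven through a piped log. A sketch of the httpd.conf line (paths and the rotation interval are illustrative, and the exact strftime pattern depends on how you want the files named):

```apacheconf
# Pipe access-log entries through rotatelogs; the %Y%m pattern yields
# one file per month, e.g. access_log.200405. 86400 = check daily.
CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/httpd/access_log.%Y%m 86400" common
```

Old monthly files can then be deleted or archived without ever touching a single ever-growing log.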
-me
Is line 25331 longer than 1022 characters? If you didn't want to worry about keeping the right buffer size you could read character by character into a string until you hit '\n'.

If you get the exact same problem with different C++ runtimes, I doubt it has anything to do with being able to read the entire file. If you have grep or any other program that can display a specific line from a file, take a look at that line.

[edited by - igni ferroque on May 18, 2004 8:24:08 PM]
OK, for some reason I have it working. Apparently it was a "SEARCH" rather than a "GET" or "POST" request that had over 300,000 characters in it. I increased the line size to a million, an extreme amount, and it seemed to work. Thanks for all the help anyway, everyone.
quote:Original post by doodah2001
OK, here's the basic outline of the code. This is the fstream version; however, I have done basically the same thing with FILE* and with the Windows-specific CreateFile().
int main()
{
    ifstream in(filename);
    if (!in.is_open())
        return -1;

    char line[1024];
    while (!in.eof())
    {
        in.getline(line, 1024);
        // call all the line parsing stuff here that just
        // deals with the line, that's it
    }
}
It's nothing complicated; it's very simple. I have run it without doing any of my analysis functions but still get caught up on the same line. I have tried increasing the line size, but that didn't work because I still get caught up.
mat
I'm fairly sure that your problem occurs when the line is longer than 1023 characters.

Note that if the line is longer than 1023 characters, getline sets the failbit of the stream, so the stream isn't valid after that. !in.eof() will not fail, as that only tests for EOF.

You could either use std::getline, which is much safer because it operates on strings, or you could use some code like this:
if (cin.rdstate() & ios::failbit)
{
    // the line was too long:
    // turn off the stream's ios::failbit
    cin.clear(cin.rdstate() & ~ios::failbit);

    // read and discard any unread characters, up to
    // and including the delimiter character
    while (cin.good() && (cin.get() != delim))
        ;
}
else
{
    // Process the line
}
quote:Original post by doodah2001
OK, for some reason I have it working. Apparently it was a "SEARCH" rather than a "GET" or "POST" request that had over 300,000 characters in it. I increased the line size to a million, an extreme amount, and it seemed to work. Thanks for all the help anyway, everyone.
Hehe. That's the problem with fixed-size buffers. They're never big enough =) Glad you fixed the problem!
[edited by - igni ferroque on May 18, 2004 8:26:54 PM]
This topic is closed to new replies.