spaceJockey123

C++ and Text files


Hi all,

I'm in a bit of a tricky situation. I need to write a program (in C/C++) that grabs delimited data from thousands of text files. After that I'll be running data mining algorithms on all the data. Here's a sample from one of the text files:

1:[]1488844,3,2005-09-06[]822109,5,2005-05-13[]885013,4,2005-10-19[]

The '[]' is actually a special symbol that pushes everything onto the next line when I paste it in here.

Thing is, I'm not too sure about file I/O. Are there special libraries I should be using? Also, the text files weigh about 2 GB... I chose to do this task in C because it's efficient with memory, but if I'm going to be running algorithms on this much data, should I be using particular data structures?

I put up another thread about parallel computing because eventually I'll probably be using several computers to do the task together.

Thank you,
Space J

Well, if you're willing to do a bit more coding work than standard I/O programming, you could use memory-mapped files. They let you view segments of a file as though they were loaded into memory (and were thus just a byte array). You can map only a small part of a file at a time (a few kilobytes, for example), so you never have to load the whole file at once and you have very explicit control over what is mapped. It also makes the data much easier to access (assuming you prefer raw buffer manipulation to calling read(), write(), or using stream operators). You'll have to use platform-specific code to do this, however. On Win32, you can check out MSDN: File Mapping. On Linux and similar systems, I believe you can use mmap() and its associated functions.
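
For what it's worth, here is a rough sketch of the mmap() route, assuming a POSIX system such as Linux. It maps the file one small window at a time and just counts newline bytes, which for the sample format above would roughly be the number of lines. The function name countNewlinesMapped and the windowPages parameter are made up for illustration. One detail to keep in mind is that the offset passed to mmap() has to be a multiple of the page size, which is why the window is expressed in pages. On a 32-bit build you would probably also need 64-bit file offsets (for example by defining _FILE_OFFSET_BITS=64) once files reach 2 GB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: counts '\n' bytes in a file by mapping it one
// window at a time. Returns -1 on any error.
long long countNewlinesMapped(const char* path, std::size_t windowPages) {
    if (windowPages == 0) return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    const off_t fileSize = st.st_size;
    // mmap() offsets must be page-aligned, so the window is a whole
    // number of pages.
    const off_t window = static_cast<off_t>(sysconf(_SC_PAGESIZE)) *
                         static_cast<off_t>(windowPages);

    long long count = 0;
    for (off_t offset = 0; offset < fileSize; offset += window) {
        // The last window may be shorter than the others.
        const std::size_t len = static_cast<std::size_t>(
            std::min<off_t>(window, fileSize - offset));

        void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, offset);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }

        // The mapped window can be read like an ordinary byte array.
        const char* bytes = static_cast<const char*>(p);
        for (std::size_t i = 0; i < len; ++i) {
            if (bytes[i] == '\n') ++count;
        }

        munmap(p, len);
    }

    close(fd);
    return count;
}
```

A real parser would also have to deal with records that straddle two windows, which this sketch sidesteps by only counting single bytes.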

Go to http://msdn.microsoft.com and search for fstream (the standard header is <fstream>; fstream.h is the older, pre-standard name). It's definitely easy and should only take you about 3 seconds to learn... but probably not the best for 2 GB files with your type of input...
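
To give an idea of the stream approach, here is a minimal sketch that reads the sample format from the first post line by line. It assumes each file starts with a "number:" line followed by "number,number,date" lines, as in the sample. The names Record, ParsedFile, parseStream and parseFile, and the field names id, value and date, are placeholders, since the thread doesn't say what the columns mean.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Placeholder field names; the thread does not say what the columns mean.
struct Record {
    long id = 0;
    int value = 0;
    std::string date;
};

struct ParsedFile {
    long header = -1;             // the number before the ':' line
    std::vector<Record> records;  // one entry per "a,b,date" line
};

// Reads the sample format from any input stream, one line at a time.
ParsedFile parseStream(std::istream& in) {
    ParsedFile out;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows-style line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;

        if (line.back() == ':') {
            long h = 0;
            if (std::sscanf(line.c_str(), "%ld", &h) == 1) out.header = h;
            continue;
        }

        Record r;
        char date[11] = {0};  // room for "YYYY-MM-DD" plus terminator
        if (std::sscanf(line.c_str(), "%ld,%d,%10s",
                        &r.id, &r.value, date) == 3) {
            r.date = date;
            out.records.push_back(r);
        }
        // Lines that match neither shape are skipped silently here.
    }
    return out;
}

// Convenience wrapper for a file on disk.
ParsedFile parseFile(const std::string& path) {
    std::ifstream in(path);
    return parseStream(in);
}
```

Because std::getline only holds one line in memory at a time, the file size by itself is not the problem with this approach. The vector of records is what grows, so for very large inputs you might process each record as it is read instead of storing them all.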
