Text File I/O - Random Characters Getting Deleted!

4 comments, last by Zahlman 14 years, 5 months ago
Hey all, I'm writing a program that finds all the phrases of a certain length in a file. It runs through the file from start to end, finds every phrase, and stores them in a vector of strings. It works almost perfectly, except that it sometimes deletes one character, in the same place, every time a new phrase is found.

Esperanza feels lost, alone,
feels lost, alone, and
lost, alone, and confused.
alone, and confused. She
and onfused. She believes
onfused. he believes that
he elieves that the
elieves hat the only
hat he only ones
he nly ones who
nly nes who can
As you can see above, starting on the fifth line, the second word in the phrase loses its first letter. The relevant class code for generating the above is below:

void Doc::findPhrases() {
   string curPhrase;
   string buffer;
   fstream file("small.txt");
   int curLoc;

    while(true) {

      curPhrase = "";
      if (file.eof()) return;
      file >> curPhrase;
      curLoc = file.tellg();

      for (int i=0; i<phraseLength-1; i++) {
         if (file.eof()) return;
         file >> buffer;
         curPhrase = curPhrase + " " + buffer;
      }
      phrases.push_back(curPhrase);
      file.seekg(curLoc);
   }

   file.close();
}

void Doc::printPhrases() {
   for (int i=0; i<50; i++) {
      cout << phrases[i] << endl;
   }
}
I think it has something to do with how I read in the first buffer word (the second word in the phrase)... but I'm not sure what exactly is wrong. Any ideas? Thanks for the help!
Could you specify what the lines you gave us are?

Esperanza feels lost, alone,
feels lost, alone, and
lost, alone, and confused.
alone, and confused. She
and onfused. She believes
onfused. he believes that
he elieves that the
elieves hat the only
hat he only ones
he nly ones who
nly nes who can

Is that the source file? Or is that what the program is returning? Is it both? It looks like it only shows a sliding 4 words at a time. Is that correct?
j-locke: That is a sample of what printPhrases() returns. The program is currently set to show phrases of length 4.

The full text of "small.txt" (the source file) is:
Esperanza feels lost, alone, and confused. She believes that the only ones who can understand her feelings are trees. The trees are skinny and pointy. The city had planted them in concrete, which naturally is not the best nor the healthiest place for trees to grow. Esperanza sees the trees as a reflection of herself [skinny and angular]. In spite of their location, the trees seem to survive in a way. This idea of persevering parallels back to Esperanza. She struggles to be successful even though she is in a harsh environment. Her description of the setting leads me to think that she is poor. Therefore, she has limitations due to class. Is this why she feels that she can only confide in trees? For Esperanza, the trees seem to be like a support network for her. She would talk to them when she felt lonely. It was almost as if the trees were her family members. The trees were not only a support network and a reflection of Esperanza. They inspired Esperanza. When Esperanza was ready to give up, she looked at the trees, and they told her to keep, keep, keep, keep. The trees taught Esperanza perseverance, which would definitely help her achieve success.BibliographyFour Skinny TreesWords: 207
Strange; with that source text file and your code I get the correct, expected results:

Esperanza feels lost, alone,
feels lost, alone, and
lost, alone, and confused.
alone, and confused. She
and confused. She believes
confused. She believes that
She believes that the
believes that the only
that the only ones
the only ones who
(etc.)


What compiler are you using, and on what platform? Have you tried a full recompile of the program? Can you reproduce the issue in any simpler programs? I assume that "phrases" is a std::vector<std::string>, is that correct? Are you sure the input file is plain text and not hiding a rogue NULL byte or something similar?


Hmm, it looks like it should be solid to me as well. Have you tried stepping through it in a debugger to see whether the data is being lost when it's read in or when it's printed out?
Ugh. File seeking/rewind operations are, in my experience, usually a sign of doing things wrongly, or at least messily. So is using .eof() to control a loop. In most cases the cleanest way is to use the reading operation itself to control the loop.

To fix these issues, we read a word each time through a single loop, and store a buffer of single words. Each time through the loop that the buffer is full, we construct a "phrase" string from the buffer contents, and discard the first word.

// Precondition: phraseLength >= 1.
void Doc::findPhrases(const string& filename) {
  typedef deque<string> wordlist;
  wordlist words;
  string word;
  ifstream file(filename);
  while (file >> word) {
    words.push_back(word);
    if (words.size() != phraseLength) { continue; }
    wordlist::iterator it = words.begin(), end = words.end();
    string phrase = *it++;
    while (it != end) { phrase += " " + *it++; }
    phrases.push_back(phrase);
    words.pop_front();
  }
}

