Searching txt files


13 replies to this topic

#1 lb2024   Members   -  Reputation: 118

Posted 22 June 2014 - 06:35 PM

I'm currently learning fstream and I am trying to search a txt file I created for certain strings.

 

Once I find a keyword, I want to start copying the strings directly under it until I reach the next keyword, and then copy the strings under that one. The exception is my first keyword: there I want to copy what comes after the "=" sign and stop at the end of the line.

 

For example the txt will look something like this:

 

R = 1-1

cbl

12

14

16

18

 

R = 2-1

cbl

11

13

15

17

 

where R equals the remote number. Cbl means the cables and under "cbl" is the cable numbers (none are going to be integers all are strings).

 

But I want the output to be in the format

 

1-1 12

1-1 14

1-1 16

1-1 18

 

2-1 11

2-1 13

2-1 15

2-1 17

 

The reason I am creating this searching and outputting program is that once the data is in my final format, I plan on importing it into Excel to be placed into a database.

 

I can open the file, but I have no idea how to search word by word, tell the program to copy what follows a keyword, and continue until it reaches another keyword.

 

Help?



#2 Samith   Members   -  Reputation: 2274

Posted 22 June 2014 - 06:47 PM

The ifstream object provides an interface for reading input from a file. It doesn't provide searching or any other "semantic" operations on the file; all of that is up to the programmer.

 

What you'll want to do is use the getline function of your ifstream object to read one line at a time. Then you can determine if that line matches the "R = x-x" type of line, and if so you can use getline to get the subsequent lines, which you will then copy into your output.

 

How will you detect that a line matches the "R = x-x" format? That's up to you, there are a lot of ways you could do it. To me, it looks like the simplest solution is to merely check that the first character is 'R' since it appears no other lines have R as the first character. Therefore that is an identifying property of the remote lines. This approach isn't very safe, and doesn't check for malformed lines, but the amount of safety you think you require is up to you. You can certainly do a more rigorous examination of each line, if that's something you need.


Edited by Samith, 22 June 2014 - 06:49 PM.


#3 lb2024   Members   -  Reputation: 118

Posted 22 June 2014 - 07:00 PM

Thanks for the help, I will research the getline() function. That will most definitely help with gathering the information on each line.

 

But is there a way to tell the program to begin after "=" or after finding the desired keyword? That is, how do I make the program start reading strings right after a desired keyword, or on the next line?



#4 Buckeye   Crossbones+   -  Reputation: 6251

Posted 22 June 2014 - 07:15 PM

Take a look at some of the functions available for std::string, such as:

std::string myString = "";
while( std::string::npos == myString.find("=",0)  ) { ... [load myString with the next input text line, checking for end-of-file]... }
// myString contains "="
// or, for some keyword..
while( std::string::npos == myString.find("keyword",0) ) { .. [load myString, looping until the keyword is found] ...}
//.. continue as desired

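The find() function for std::string returns the position where the quoted string first occurs. If the quoted string is not found, the function returns std::string::npos (no position). There are also variations such as find_first_of("=", 0) and find_last_of("="), which look for any one of the listed characters and are handy if there are multiple occurrences of some token. The 0 (zero) argument is just the position within the string at which to start the search. Note that find_last_of searches backwards from its position argument, so leave that argument out (it defaults to the end of the string) instead of passing 0.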

 

Be sure to check for end-of-file when you're searching through a file.

 

EDIT: you can also input directly to a std::string. E.g.,

file >> myString; // NOTE: a complete line may or may not be input, depending on whitespace.
// The input will end when whitespace is encountered, including the newline character ('\n')
// BUT.. the string will NOT contain any whitespace - spaces, tabs, endlines..
// for instance, for the line: R<space>=<space>1-1
file >> myString; // "R" (without the quotation marks)
file >> myString; // "="
file >> myString; // "1-1"

// for the line: R=<space>1-1
file >> myString; // "R="
file >> myString; // "1-1"
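
If you want to see the whitespace behaviour for yourself, here is a small sketch that applies the same >> extraction to a single line held in a std::istringstream. The function name splitOnWhitespace is just an illustration, not a standard library function.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: breaks one line into whitespace-separated tokens
// using the same operator>> that works on a file stream.
std::vector<std::string> splitOnWhitespace(const std::string& line)
{
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token)          // stops when no more tokens can be read
        tokens.push_back(token);
    return tokens;
}
```

With this, "R = 1-1" yields the three tokens "R", "=" and "1-1", while "R= 1-1" yields only "R=" and "1-1", which is why the exact spacing in your file matters.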

Edited by Buckeye, 22 June 2014 - 07:36 PM.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.


#5 Samith   Members   -  Reputation: 2274

Posted 22 June 2014 - 07:15 PM

That's just your typical string manipulation stuff. What you get back from getline is a string: a c-string if you call the ifstream member function getline with a char buffer, or a std::string if you call the free function std::getline. There are a number of functions for dealing with c-strings (strstr, strchr, strcmp, etc.; you can look them up). If you like, you can construct a std::string from the c-string that the member getline fills in. You can look up std::string as well; quite a few of its member functions will be useful to you.

 

But regardless: what you get back from getline is a string. You have to parse the string yourself, however you choose. You check if the string matches a certain format, if it does, then you grab some substrings of that string and copy them into the output. How you determine what substrings are important is up to you. If you want everything after the first '=' sign then you could use strchr to find the first '=' and go from there, or you could use the 'find' member function of std::string to find the '=' sign, and then only look at the substring that immediately follows the '=' sign. If you know the '=' sign is always the third character of the string, then you could simply use the substring starting at the 4th character. Whatever you want.
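
One possible way to do the "everything after the first '=' sign" part with std::string, as a sketch only. The name valueAfterEquals is made up, and the decision to strip spaces, tabs and a stray carriage return around the value is my assumption about what you want.

```cpp
#include <string>

// Hypothetical helper: returns the text after the first '=' with surrounding
// spaces, tabs and carriage returns removed. Returns an empty string when the
// line has no '=' or nothing follows it.
std::string valueAfterEquals(const std::string& line)
{
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos)
        return "";

    const std::string whitespace = " \t\r";
    const std::string::size_type first = line.find_first_not_of(whitespace, eq + 1);
    if (first == std::string::npos)
        return "";

    // A non-whitespace character exists at `first`, so `last` is valid and >= first.
    const std::string::size_type last = line.find_last_not_of(whitespace);
    return line.substr(first, last - first + 1);
}
```

For the line "R = 1-1" this gives "1-1" whether or not there are spaces around the '=' sign.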

 

EDIT: ninja'd by Buckeye


Edited by Samith, 22 June 2014 - 07:17 PM.


#6 lb2024   Members   -  Reputation: 118

Posted 22 June 2014 - 07:24 PM

Oh thanks a bunch guys. I will research and try my best to create.



#7 LennyLen   Crossbones+   -  Reputation: 4020

Posted 22 June 2014 - 10:39 PM

If you're going to do some research, try looking up string parsing and tokenizing.



#8 jHaskell   Members   -  Reputation: 1087

Posted 23 June 2014 - 06:05 AM


and under "cbl" is the cable numbers (none are going to be integers all are strings).

 

Clarify this statement.  You say none are going to be integers, yet your example text contains only integers.  When it comes to string parsing, the details are essential.



#9 Vortez   Crossbones+   -  Reputation: 2704

Posted 23 June 2014 - 07:18 AM

You could use fscanf, I believe, something like this:

 

(Untested)

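#include <cstdio>

void readRemote()
{
    char c;
    int x, y, z;
    char szString[256] = { 0 };   // zero-initialised without the Windows-only ZeroMemory

    FILE *f = fopen("YourFileName.txt", "r");
    if(!f)
        return;

    // add error checking later (fscanf returns the number of items it read)...
    // note: the scanf functions need the ADDRESS of each variable
    fscanf(f, " %c = %d-%d", &c, &x, &y);   // remote line, e.g. "R = 1-1"
    fscanf(f, "%255s", szString);           // the "cbl" line, limited to the buffer size
    fscanf(f, "%d", &z);                    // one cable number per line
    fscanf(f, "%d", &z);
    fscanf(f, "%d", &z);
    fscanf(f, "%d", &z);

    fclose(f);
}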

http://www.cplusplus.com/reference/cstdio/fscanf/


Edited by Vortez, 23 June 2014 - 07:37 AM.


#10 Paradigm Shifter   Crossbones+   -  Reputation: 5433

Posted 23 June 2014 - 07:38 AM

You'd want to read each line into a string and use sscanf instead (get a const char* from the string with c_str()), since fscanf advances the file read pointer as it goes along, whereas sscanf does not.

 

I prefer the scanf family of functions for this kind of parsing too.
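
Here is a sketch of the read-a-line-then-sscanf approach. The helper name parseRemote is hypothetical, and since the values are meant to be treated as strings, it reads the remote name with %63s instead of %d; the 64-byte buffer is an arbitrary size picked for the example.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: if `line` looks like "R = <name>", stores <name>
// in `name` and returns true; otherwise returns false.
bool parseRemote(const std::string& line, std::string& name)
{
    char buffer[64];   // %63s below leaves room for the terminating '\0'

    // A space in the format matches any amount of whitespace (including none),
    // so both "R = 1-1" and "R=1-1" are accepted.
    if (std::sscanf(line.c_str(), " R = %63s", buffer) != 1)
        return false;

    name = buffer;
    return true;
}
```

Because sscanf works on the string and not on the file, a line that fails to match is not lost; you can simply try another format on the same line.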


"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

#11 lb2024   Members   -  Reputation: 118

Posted 23 June 2014 - 01:32 PM

Just for clarification, I'm using C++, so C looks a bit different to me, but I'm sure I can figure it out.

 

 

 


and under "cbl" is the cable numbers (none are going to be integers all are strings).

 

Clarify this statement.  You say none are going to be integers, yet your example text contains only integers.  When it comes to string parsing, the details are essential.

 

Everything in my txt file right now is going to be used as a string. Even though a value looks like a number, I am only worried about passing it along as a string of text for now. I'll soon be searching text for numerical values and sorting them for other purposes, but right now I'm going step by step, trying to learn string positions and manipulation.

 

 

Where I'm at now is a bit confusing. For example. I'm unsure how to get to the line under the "cbl" and to stop once it gets to another remote number.

if (mystring.find ("r=") != string::npos){
		rString_pos = mystring.npos + 2;
		rname = mystring.substr(rString_pos+1, 4);
		cout <<rname<<endl;		
		}

The above finds the string "r=" and moves the npos to the character afterwards then I put the remaining line into rname.

 

But I can't do the same for cbl. Even the above seems wrong or roundabout (but it works). I'm thinking of using a while loop, but I'm unsure how to write the loop with the right parameters to look for. I tried getline(), but that doesn't work because I don't know how to get the text from the line into my string and to continue until it reaches another keyword.



#12 Buckeye   Crossbones+   -  Reputation: 6251

Posted 24 June 2014 - 06:17 AM

string::npos shouldn't be used as a position. It's a constant normally used as an indication (as mentioned above) that the search failed.

 

Try something like:

char buf[256];
file.getline(buf, 256, '\n');
mystring = buf;

string::size_type pos;   // find() returns an unsigned size type, not an int
if( string::npos != (pos = mystring.find( "R=", 0 ) ) )
{
   rString_pos = pos + 2;                    // first character after "R="
   rname = mystring.substr( rString_pos );   // everything up to the end of the line
   cout << rname << endl;
}

Also, your original post has "R<space>=" with a capital "R" and a <space>. Be sure to test for all cases if you're not sure of the format of the text.
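
If you do need to accept several spellings, one possible sketch is below. The name parseRemoteName is invented for the example, and which variations to accept (upper or lower case "R", optional spaces or tabs around "=") is my assumption about your file.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: accepts "R = 1-1", "R=1-1", "r =1-1" and similar.
// On success stores the text after '=' (leading blanks skipped, trailing
// blanks kept) in `name` and returns true.
bool parseRemoteName(const std::string& line, std::string& name)
{
    const std::string blanks = " \t";

    // First non-blank character must be 'R' or 'r'.
    std::string::size_type pos = line.find_first_not_of(blanks);
    if (pos == std::string::npos ||
        std::toupper(static_cast<unsigned char>(line[pos])) != 'R')
        return false;

    // Next non-blank character must be '='.
    pos = line.find_first_not_of(blanks, pos + 1);
    if (pos == std::string::npos || line[pos] != '=')
        return false;

    // Whatever follows (after optional blanks) is the remote name.
    pos = line.find_first_not_of(blanks, pos + 1);
    if (pos == std::string::npos)
        return false;

    name = line.substr(pos);
    return true;
}
```

Unlike a plain find("R="), this rejects lines such as "Remote" that merely start with the letter R.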


Edited by Buckeye, 24 June 2014 - 06:28 AM.


#13 jHaskell   Members   -  Reputation: 1087

Posted 24 June 2014 - 09:16 AM

pseudo-code:

string inLine
string prefix
while not eof
	getline inLine
	if inLine[0] = 'R'
		prefix = substring(inLine ... )
		continue
	endif
	if inLine[0] = 'c'
		 write out empty line
		continue
	endif
	if size of inLine > 0
		write out prefix " " inLine
	endif
wend

Basically, you have 3 scenarios:

  • whenever you come across an R line, parse out the value you want to retain and assign it to prefix (to be used later for output)
  • whenever you come across a cbl line, write out an empty line (your example output had space between 'sections').
  • whenever you have a non-empty line that isn't an R or cbl, output prefix + line

The continues skip any further processing.



#14 rAm_y_   Members   -  Reputation: 481

Posted 24 June 2014 - 09:36 AM

Parsing/tokenizing all depends on the consistency of the data you're parsing. You look for a common delimiter and work from there. You could then use a std::vector to store your data and print it out.
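
A small sketch of the delimiter idea, using the three-argument form of std::getline to cut a string at a chosen character. The function name split is hypothetical; the standard library does not provide one under that name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of `delimiter`
// and returns the pieces in order. A trailing delimiter does not
// produce an extra empty piece.
std::vector<std::string> split(const std::string& text, char delimiter)
{
    std::vector<std::string> fields;
    std::istringstream in(text);
    std::string field;
    while (std::getline(in, field, delimiter))
        fields.push_back(field);
    return fields;
}
```

For example, split("12,14,16", ',') gives a vector holding "12", "14" and "16", which you can then loop over to print.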





