
Searching txt files

This topic is 1270 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I'm currently learning fstream and I am trying to search a txt file I created for certain strings.

 

Once I find a keyword, I want to copy the strings directly under it until I reach the next keyword, and then start copying the strings under that one. The exception is the first keyword: for it, I want to copy what comes after the "=" sign and stop at the end of the line.

 

For example the txt will look something like this:

 

R = 1-1

cbl

12

14

16

18

 

R = 2-1

cbl

11

13

15

17

 

where R equals the remote number. Cbl means the cables and under "cbl" is the cable numbers (none are going to be integers all are strings).

 

But I want the output to be in this format:

 

1-1 12

1-1 14

1-1 16

1-1 18

 

2-1 11

2-1 13

2-1 15

2-1 17

 

I am creating this search-and-output program because, once the data is in my final format, I plan to import it into Excel and then place it in a database.

 

I can open the file, but I have no idea how to search it word by word, tell the program to copy what comes next to a keyword, and then continue until it reaches another keyword.

 

Help?


The ifstream object provides an interface for reading input from a file. It doesn't provide any searching or otherwise "semantic" uses of the file, all of that is up to the programmer to do on his/her own.

 

What you'll want to do is use the getline function of your ifstream object to read one line at a time. Then you can determine if that line matches the "R = x-x" type of line, and if so you can use getline to get the subsequent lines, which you will then copy into your output.

 

How will you detect that a line matches the "R = x-x" format? That's up to you, there are a lot of ways you could do it. To me, it looks like the simplest solution is to merely check that the first character is 'R' since it appears no other lines have R as the first character. Therefore that is an identifying property of the remote lines. This approach isn't very safe, and doesn't check for malformed lines, but the amount of safety you think you require is up to you. You can certainly do a more rigorous examination of each line, if that's something you need.

Edited by Samith


Thanks for the help, I will research the getline() function. That will most definitely help with gathering the information on each line.

But is there a way to tell the program to begin after the "=" or after finding the desired keyword? In other words, how do I make the program start reading strings right after a desired keyword, or on the next line?


Take a look at some of the functions available for std::string, such as:

std::string myString;
// read lines until one contains "=" (the loop also stops at end-of-file)
while( std::getline(file, myString) && std::string::npos == myString.find("=", 0) ) { }
// if the read succeeded, myString now contains "="
// or, for some keyword..
while( std::getline(file, myString) && std::string::npos == myString.find("keyword", 0) ) { }
//.. continue as desired

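The find() function for std::string returns the position where the quoted string first occurs. If the quoted string is not found, the function returns std::string::npos (no position). The 0 (zero) argument is the position within the string at which to start the search. There are also variations: rfind("=") returns the last occurrence if there are multiple occurrences of some string or token, and find_first_of() and find_last_of() look for any one character from a set rather than a whole string. Note that for rfind() and find_last_of() the position argument is where the backward search starts, so passing 0 there examines only the start of the string; leave it at its default to search the whole string.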

 

Be sure to check for end-of-file when you're searching through a file.

 

EDIT: you can also input directly to a std::string. E.g.,

file >> myString; // NOTE: a complete line may or may not be input, depending on whitespace.
// The input will end when whitespace is encountered, including the newline character ('\n')
// BUT.. the string will NOT contain any whitespace - spaces, tabs, newlines..
// for instance, for the line: R<space>=<space>1-1
file >> myString; // "R" (without the quotation marks)
file >> myString; // "="
file >> myString; // "1-1"

// for the line: R=<space>1-1
file >> myString; // "R="
file >> myString; // "1-1"
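
The whitespace behaviour shown above can be checked without a file by reading from a std::istringstream. This is only a sketch; tokens_of is a made-up helper name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits text into whitespace-separated tokens using operator>>.
std::vector<std::string> tokens_of(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) {  // skips leading whitespace, stops at the next whitespace
        tokens.push_back(token);
    }
    return tokens;
}
```

For "R = 1-1" this yields the three tokens "R", "=" and "1-1"; for "R= 1-1" it yields "R=" and "1-1", so the spacing in the file changes what each read returns.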
Edited by Buckeye


That's just your typical string manipulation stuff. What you get back from getline is a string. With the member function (file.getline) it is a C-string in a char buffer; with the free function std::getline it is a std::string. There are a number of functions for dealing with C-strings (strstr, strchr, strcmp, etc.; you can look them up). If you would like, you can construct a std::string from the C-string that file.getline gives you. You can look up std::string as well. Quite a few member functions of std::string will be useful to you.

 

But regardless: what you get back from getline is a string. You have to parse the string yourself, however you choose. You check if the string matches a certain format, if it does, then you grab some substrings of that string and copy them into the output. How you determine what substrings are important is up to you. If you want everything after the first '=' sign then you could use strchr to find the first '=' and go from there, or you could use the 'find' member function of std::string to find the '=' sign, and then only look at the substring that immediately follows the '=' sign. If you know the '=' sign is always the third character of the string, then you could simply use the substring starting at the 4th character. Whatever you want.
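
As a rough sketch of the "everything after the first '=' sign" option using std::string::find and substr: the helper name value_after_equals is made up, and trimming the surrounding blanks is an added assumption so that "R = 1-1" and "R=1-1" give the same result.

```cpp
#include <string>

// Hypothetical helper: returns the text after the first '=' with surrounding
// spaces, tabs and carriage returns removed, or an empty string if the line
// has no '=' or nothing follows it.
std::string value_after_equals(const std::string& line) {
    const char* const blanks = " \t\r";
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos) return "";
    const std::string::size_type first = line.find_first_not_of(blanks, eq + 1);
    if (first == std::string::npos) return "";  // only blanks after '='
    const std::string::size_type last = line.find_last_not_of(blanks);
    return line.substr(first, last - first + 1);
}
```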

 

EDIT: ninja'd by Buckeye

Edited by Samith



and under "cbl" is the cable numbers (none are going to be integers all are strings).

 

Clarify this statement.  You say none are going to be integers, yet your example text contains only integers.  When it comes to string parsing, the details are essential.


You could use fscanf, I believe, something like this:

 

(Untested)

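// needs #include <cstdio>
char c = 0;
int x = 0, y = 0, z = 0;

char szString[256] = {0}; // zero-initialized; replaces the Windows-only ZeroMemory call

FILE *f = fopen("YourFileName.txt", "r");
if(!f)
    return;

// add error checking later: each fscanf call returns the number of fields it filled
fscanf(f, " %c = %d-%d", &c, &x, &y); // "R = 1-1"; fscanf needs the addresses of the variables
fscanf(f, "%255s", szString);         // "cbl"; the width keeps %s inside the 256-byte buffer
fscanf(f, "%d", &z);                  // first cable number; the "x-y z" output is built from x, y, z
fscanf(f, "%d", &z);
fscanf(f, "%d", &z);
fscanf(f, "%d", &z);

fclose(f);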

http://www.cplusplus.com/reference/cstdio/fscanf/

Edited by Vortez


You'd want to read each line into a string and use sscanf on it instead (c_str() gives you the const char* it needs), since fscanf advances the file read pointer as it goes along, whereas sscanf does not.

 

I prefer the scanf family of functions for this kind of parsing too.
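
Reading a line first and then handing it to sscanf, as suggested above, might look like the following sketch. The helper name parse_remote and the 32-byte buffer size are assumptions made for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: parses a remote line such as "R = 1-1" with sscanf.
// Returns true and fills `remote` when the line matches.
bool parse_remote(const std::string& line, std::string& remote) {
    char value[32] = {0};
    // 'R' and '=' must match literally, the spaces in the format match any
    // amount of whitespace (including none), and %31s reads at most 31
    // non-whitespace characters so the buffer cannot overflow.
    if (std::sscanf(line.c_str(), "R = %31s", value) == 1) {
        remote = value;
        return true;
    }
    return false;
}
```

A line that fails to match leaves the file position untouched, so the same line can then be tried against another pattern or treated as a cable entry.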


Just for clarification, I'm using C++, so C looks a bit different to me, but I'm sure I can figure it out.

 

 

 


and under "cbl" is the cable numbers (none are going to be integers all are strings).

 

Clarify this statement.  You say none are going to be integers, yet your example text contains only integers.  When it comes to string parsing, the details are essential.

 

Everything in my txt file right now is going to be used as strings. Even though a value looks like a number, I am only worried about passing it along as a string of text for now. I'll soon be searching the txt file for numerical values and sorting them for other purposes, but right now I'm going step by step, trying to learn string positions and manipulation.

 

 

Where I'm at now is a bit confusing. For example, I'm unsure how to get to the line under "cbl" and how to stop once I reach another remote number.

if (mystring.find ("r=") != string::npos){
		rString_pos = mystring.npos + 2;
		rname = mystring.substr(rString_pos+1, 4);
		cout <<rname<<endl;		
		}

The above finds the string "r=", moves npos to the character after it, and then puts the rest of the line into rname.

But I can't do the same for cbl. Even the above seems wrong or roundabout (but it works). I'm thinking of using a while loop, but I'm unsure how to write the loop with the right parameters to look for. I tried getline(), but that doesn't work because I don't know how to get the text from the line into my string and continue until it reaches another keyword.


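string::npos shouldn't be used as a position. It's a constant (the largest value the string's size type can hold) normally used as an indication (as mentioned above) that the search failed. Because that type is unsigned, npos + 2 wraps around to 1, so your code only appears to work: it reads from a fixed offset no matter where "r=" actually is.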

 

Try something like:

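char buf[256];
file.getline(buf, 256, '\n');
mystring = buf;

string::size_type pos = mystring.find( "R=", 0 ); // find returns an unsigned size_type, not an int
if( string::npos != pos )
{
   rString_pos = pos + 2; // first character after "R="
   rname = mystring.substr( rString_pos ); // substr (not substring); with one argument it runs to the end of the string
   cout << rname << endl;
}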

Also, your original post has "R<space>=" with a capital "R" and a <space>. Be sure to test for all cases if you're not sure of the format of the text.
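
If the exact spacing and capitalisation are uncertain, one option is a matcher that tolerates them. The following is only a sketch: match_remote is a hypothetical helper, and the set of accepted variants ("R=1-1", "R = 1-1", "r =1-1") is an assumption about what the file might contain.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: accepts an optional-blank, case-insensitive "R",
// then '=', then a value. Stores the value in `remote` on a match.
bool match_remote(const std::string& line, std::string& remote) {
    const char* const blanks = " \t\r";
    std::string::size_type i = line.find_first_not_of(blanks);
    if (i == std::string::npos) return false;  // blank line
    if (std::toupper(static_cast<unsigned char>(line[i])) != 'R') return false;
    i = line.find_first_not_of(blanks, i + 1);  // skip blanks after R
    if (i == std::string::npos || line[i] != '=') return false;
    i = line.find_first_not_of(blanks, i + 1);  // skip blanks after '='
    if (i == std::string::npos) return false;   // nothing after '='
    const std::string::size_type last = line.find_last_not_of(blanks);
    remote = line.substr(i, last - i + 1);
    return true;
}
```

Unlike a check of the first character alone, this rejects a line such as "Remote" because no '=' follows the R.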

Edited by Buckeye


pseudo-code:

string inLine
string prefix
while not eof
	getline inLine
	if inLine[0] = 'R'
		prefix = substring(inLine ... )
		continue
	endif
	if inLine[0] = 'c'
		 write out empty line
		continue
	endif
	if size of inLine > 0
		write out prefix " " inLine
	endif
wend

Basically, you have 3 scenarios:

  • whenever you come across an R line, parse out the value you want to retain and assign it to prefix (to be used later for output)
  • whenever you come across a cbl line, write out an empty line (your example output had space between 'sections').
  • whenever you have a non-empty line that isn't an R or cbl, output prefix + line

The continues skip any further processing.


Parsing/tokenizing all depends on the consistency of the data you're parsing. You look for a common delimiter and work from there. You could then use a std::vector to store your data and print it out.
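
A small sketch of the delimiter idea, assuming a single-character delimiter and using the three-argument form of std::getline. The function name split is made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits `text` on a single-character delimiter and returns the pieces
// in a std::vector, ready to be stored or printed.
std::vector<std::string> split(const std::string& text, char delimiter) {
    std::vector<std::string> parts;
    std::istringstream in(text);
    std::string part;
    while (std::getline(in, part, delimiter)) {  // getline accepts a custom delimiter
        parts.push_back(part);
    }
    return parts;
}
```

Splitting "R=1-1" on '=' gives "R" and "1-1"; when the file has spaces around the '=', the pieces keep those spaces and would still need trimming.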
