Regex and STL search questions

This topic is 4708 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Hi, I have 2 questions.

The first: I have a string that looks like "<a>Hello<a>More Stuff <a>More..", and I have written a function to extract everything in between the tags, but it's very messy and doesn't work if the tags are changed. I couldn't find anything on Google. Is it possible to pass in the start and end tags as search patterns as well, for instance <a href="[A-Za-z-_]*"> and </a>? (string InnerText(string , starttag , endtag);)

My second question: when I tried to do this in Perl as well, I could find patterns that included the tags, but I ran into difficulty extracting the text between the tags (I had to loop through all the extracted values in an array). It was messy, and sometimes empty whitespace values were put into the array. As with the first question, how could I do the same in Perl and also use search patterns for the start and end tags?

Thanks very much.

First of all, GDNet doesn't require HTML for basic text formatting. Simple line breaks function well, and there is some markup available - see the Site FAQ for details. For more complex formatting, HTML is appropriate. This is important because people frequently quote each other in order to provide targeted responses.

Now, on to your questions.

Quote:
Original post by Genjix
[I]s it possible to pass in the start and end tags as search patterns as well?
C++ does not natively (or through its Standard Library) support regular expressions, but you can find third-party libraries that implement them. Alternatively, if you're a text processing guru, you can implement them yourself.
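
For readers on a newer toolchain: C++11, which postdates this reply, added <regex> to the Standard Library. The block below is a minimal sketch of the InnerText function from the question on that basis, not a definitive implementation. The exact signature, the vector<string> return type, and the choice to treat both tags as ECMAScript regular expressions are assumptions made for illustration. It also assumes neither tag pattern contains a capturing group of its own, since that would shift the index of the group holding the inner text.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name taken from the question, signature assumed):
// returns the text between each match of start_tag and the nearest
// following match of end_tag. Both tags are regular expressions in
// ECMAScript syntax, the std::regex default.
// Assumption: the tag patterns contain no capturing groups.
std::vector<std::string> InnerText(const std::string& text,
                                   const std::string& start_tag,
                                   const std::string& end_tag)
{
    // Wrap each tag in a non-capturing group so an alternation inside a
    // tag stays contained. The lazy ([\s\S]*?) captures as little as
    // possible, so each match ends at the first end tag it reaches.
    const std::regex pattern("(?:" + start_tag + ")([\\s\\S]*?)(?:" + end_tag + ")");

    std::vector<std::string> result;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        result.push_back((*it)[1].str());  // group 1 is the inner text
    }
    return result;
}
```

Calling InnerText(page, "<a href=\"[A-Za-z_-]*\">", "</a>") would collect the link texts from page. The character class here places the dash last so that it is read as a literal dash rather than as part of a range. An invalid pattern makes the std::regex constructor throw std::regex_error, which this sketch leaves to the caller.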

Quote:
[H]ow could I do the same in Perl and also have search patterns in start and end tag?
Depending on whether your data is nested or sequential, you will need to construct a two-pass regex algorithm, which extracts the delimiter(s) on the first pass and the delimited data on the second. I've been striving to forget Perl for quite a few years now, so I can't give you exact code.

Anything, however, is possible.


So, using the STL, if I gave you some random string such as "kujgsfdfdsg^&$&<a>Hello</a>%£HGR}@{JHHJDGJHDgjgbjdjdgvj^&78<a>More Stuff</a>JHKJD5_+_yu795<a>...More</a>dfjkfd", how could I parse it and return a vector<string> of all the inner text between the tags (without the tags)?

All the functions I've tried do really strange things (logic errors).


Thanks.

Guest Anonymous Poster
You can use the regex library that is part of the Boost distribution at http://boost.org

Good luck.
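
If pulling in a regex library is not an option, the literal-tag case from the question (fixed <a> and </a> markers, results returned as a vector<string>) can be handled with std::string::find alone. The following is a rough sketch under that assumption. The function name ExtractBetween and its parameters are made up for illustration, and it does not attempt to handle nested tags.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects the text between each literal open tag
// and the next literal close tag, using only std::string::find.
std::vector<std::string> ExtractBetween(const std::string& text,
                                        const std::string& open,
                                        const std::string& close)
{
    std::vector<std::string> result;
    if (open.empty() || close.empty()) {
        return result;  // an empty tag would match everywhere
    }

    std::string::size_type pos = 0;
    while (true) {
        // Find the next open tag at or after the current position.
        const std::string::size_type start = text.find(open, pos);
        if (start == std::string::npos) break;

        // The inner text begins right after the open tag.
        const std::string::size_type body = start + open.size();
        const std::string::size_type stop = text.find(close, body);
        if (stop == std::string::npos) break;  // unterminated tag: give up

        result.push_back(text.substr(body, stop - body));

        // Resume searching after the close tag, so the loop always advances.
        pos = stop + close.size();
    }
    return result;
}
```

Always resuming the search after the close tag that was just consumed keeps the loop moving forward, which is where hand-rolled versions of this tend to go wrong. Text outside the tags is skipped, and an open tag with no matching close tag ends the scan.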

