
Python-Get String Between Two Characters

I need to extract the strings between ~ and ^.
I have a string like this:

~~~~ ABC ^ DEF ^ HGK > LMN ^ 

I need to get the strings between them with Python.
I've tried this:

import re
target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
matchObj = re.findall(r'~(.*?)\^', target)
print(matchObj)

But the result is:

['~~~ ABC ']

What I expect is:

[ABC , DEF , HGK , LMN ]

or

[^ABC , ^DEF , ^HGK , LMN ]

I want to do all this because I am trying to extract text from an HTML page, like this example:

<td class="cell-1">
<div><span class="value-frame">&nbsp;~~~~ ABC ^ DEF ^ HGK > LMN ^ </span></div>
</td>
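
For the HTML side of the question, here is a minimal sketch using only the standard library's html.parser. The class name SpanTextParser and the helper extract_fields are hypothetical names made up for illustration, and the sketch assumes the text always sits inside a span with class "value-frame", as in the snippet above. A real page may need a more robust parser.

```python
import re
from html.parser import HTMLParser


class SpanTextParser(HTMLParser):
    """Collects the text found inside <span class="value-frame"> elements.

    Hypothetical helper; assumes the spans are not nested.
    """

    def __init__(self):
        super().__init__()  # character references such as &nbsp; are decoded
        self.in_span = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == 'span' and ('class', 'value-frame') in attrs:
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == 'span':
            self.in_span = False

    def handle_data(self, data):
        if self.in_span:
            self.chunks.append(data)


def extract_fields(html):
    """Return the delimited fields from every value-frame span in html."""
    parser = SpanTextParser()
    parser.feed(html)
    parser.close()
    text = ''.join(parser.chunks).replace('~', '')
    # Split on '^' or '>'; strip() also removes the non-breaking space
    # that &nbsp; decodes to.
    return [part.strip() for part in re.split(r'\^|>', text) if part.strip()]


html_snippet = '''<td class="cell-1">
<div><span class="value-frame">&nbsp;~~~~ ABC ^ DEF ^ HGK > LMN ^ </span></div>
</td>'''

fields = extract_fields(html_snippet)
print(fields)
```

The splitting step is the same replace-then-split idea applied to the plain string; only the span lookup is specific to the HTML case.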

 


Personally, I would do it something like this:

I would start by removing the '~' characters, since they do not seem to add anything: replace them with an empty string. Then use re.split on either the '^' or the '>' character. You have to escape '^' because it has a special meaning in a regular expression (it anchors a match to the start of the string), and the pipe character '|' creates an "or" condition between the two alternatives.

That gives results with a lot of surrounding whitespace, so I would use a list comprehension to strip each entry and omit any blank entries from the list.

Below is the code:

#!/usr/bin/env python

import re

def main():
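    target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
    # Drop the '~' padding; it carries no information.
    target = target.replace('~', '')
    # Split on either delimiter; '^' is escaped because it is a regex anchor.
    target_list = re.split(r'\^|>', target)
    # Strip whitespace and discard empty entries.
    target_list = [entry.strip() for entry in target_list if entry.strip()]
    print(target_list)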

if __name__ == '__main__':
    main()

That gives me:

['ABC', 'DEF', 'HGK', 'LMN']
Edited by shadowisadog
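
As a side note on the original attempt: the pattern r'~(.*?)\^' requires a literal '~' at the start of every match, and only the beginning of the string contains one, so findall can only ever find a single match. The sketch below shows that, plus one possible findall-based alternative that matches runs of characters that are neither delimiters nor whitespace. It assumes each field is a single word with no internal spaces.

```python
import re

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

# Every match must begin with '~', and the lazy group stops at the first '^'.
# After that first match there is no '~' left, so nothing else is found.
original = re.findall(r'~(.*?)\^', target)

# Alternative (assumes single-word fields): take each run of characters
# that is not '~', '^', '>' or whitespace.
tokens = re.findall(r'[^~^>\s]+', target)

print(original)
print(tokens)
```

Inside the character class, the first '^' negates the class and the second one is a literal caret, so no escaping is needed there.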


Guys, it's python, don't forget it :D

 

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

result = target.replace('^', '').replace('>', '').replace('~', '').split()
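
One thing to keep in mind with the one-liner: split() with no arguments splits on any whitespace, so it only gives the expected result when each field is a single word. The comparison below is a small sketch of that difference; the function names and the sample string with a two-word field are made up for illustration.

```python
import re


def split_on_whitespace(text):
    # Same idea as the one-liner: drop the delimiters, then split on whitespace.
    return text.replace('^', '').replace('>', '').replace('~', '').split()


def split_on_delimiters(text):
    # Split on the delimiters themselves, then trim each field.
    cleaned = text.replace('~', '')
    return [part.strip() for part in re.split(r'\^|>', cleaned) if part.strip()]


sample = ' ~~~~ NEW YORK ^ LA > SF ^ '  # hypothetical input with a two-word field

print(split_on_whitespace(sample))
print(split_on_delimiters(sample))
```

For the single-word data in the question both approaches return the same list, so the shorter version is fine there.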
