koka282

Python-Get String Between Two Characters


I need someone to show me how to get the strings between ~ and ^.
I have a string like this:

~~~~ ABC ^ DEF ^ HGK > LMN ^ 

I need to get the string between them with python.
I've tried this:

import re
target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
matchObj = re.findall(r'~(.*?)\^', target)
print(matchObj)

But the result is:

['~~~ ABC ']

What I expect is:

[ABC , DEF , HGK , LMN ]

or

[^ABC , ^DEF , ^HGK , LMN ]

I want to do all this because I am trying to extract text from an HTML page, like this example:

<td class="cell-1">
<div><span class="value-frame">&nbsp;~~~~ ABC ^ DEF ^ HGK > LMN ^ </span></div>
</td>
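
For the HTML side of the question, one possible approach is to pull the span text out with the standard library's html.parser and then split it. This is only a sketch, assuming Python 3 and that the text always sits in a non-nested span with class "value-frame" as in the snippet above. The names SpanTextParser and extract_tokens are made up for illustration.

```python
import re
from html.parser import HTMLParser


class SpanTextParser(HTMLParser):
    """Collect text found inside <span class="value-frame"> elements.

    Hypothetical helper: assumes the spans are not nested.
    """

    def __init__(self):
        super().__init__()
        self._inside = False  # True while inside a matching span
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == 'span' and ('class', 'value-frame') in attrs:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == 'span':
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_tokens(html):
    """Return the non-empty pieces separated by '~', '^' or '>'."""
    parser = SpanTextParser()
    parser.feed(html)
    parser.close()
    text = ''.join(parser.chunks)
    # &nbsp; is decoded to a non-breaking space, which strip() also removes
    return [part.strip() for part in re.split(r'[~^>]', text) if part.strip()]


page = '''<td class="cell-1">
<div><span class="value-frame">&nbsp;~~~~ ABC ^ DEF ^ HGK > LMN ^ </span></div>
</td>'''

print(extract_tokens(page))
```

For messier real-world pages a dedicated HTML library may be more robust. The point here is only that the tag handling and the string splitting can be kept as two separate steps.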

 


Personally, I would do it something like this:

I would start by removing the '~' characters from the string, since they do not seem to add anything. You can replace them with an empty string. Then use re.split on either the '^' or the '>' character. You have to escape '^' because it has a special meaning in a regular expression (it anchors the match at the start of the string). The pipe character '|' creates an "or" condition in a regular expression.

That gives results with a lot of surrounding whitespace, so I would use a list comprehension to strip the whitespace and omit any blank entries from the list.

Below is the code:

#!/usr/bin/env python

import re

def main():
    target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
    target = target.replace('~', '')
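    # Raw string so the backslash reaches the regex engine unchanged
    target_list = re.split(r'\^|>', target)
    # Strip each entry and drop the ones that end up empty
    target_list = [entry.strip() for entry in target_list if entry.strip()]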
    print(target_list)

if __name__ == '__main__':
    main()

That gives me:

['ABC', 'DEF', 'HGK', 'LMN']
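
If you would rather stay close to the re.findall attempt from the question, a possible variant is to match the runs of characters that are not separators, instead of splitting on the separators. This is a sketch under the assumption that '~', '^' and '>' are the only separators. The function name find_tokens is just for illustration.

```python
import re


def find_tokens(target):
    """Return the runs of text between '~', '^' and '>' characters."""
    # Inside [...], the leading ^ negates the class and the second ^ is literal,
    # so this matches one or more characters that are not '~', '^' or '>'.
    runs = re.findall(r'[^~^>]+', target)
    # Strip whitespace and drop runs that were whitespace only
    return [run.strip() for run in runs if run.strip()]


print(find_tokens(' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '))
```

Unlike a plain whitespace split, this keeps entries that contain spaces together, so 'ABC DEF ^ GHI' would yield 'ABC DEF' and 'GHI'.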
Edited by shadowisadog


Guys, it's python, don't forget it :D

 

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

result = target.replace('^', '').replace('>', '').replace('~', '').split()
