
# Python-Get String Between Two Characters



I need a way to get the strings between ~ and ^.
I have a string like this:

~~~~ ABC ^ DEF ^ HGK > LMN ^

I need to extract the strings between those characters with Python.
I've tried this:

import re
target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
matchObj = re.findall(r'~(.*?)\^', target)
print(matchObj)

But the result is:

['~~~ ABC ']
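A single match is what this pattern should produce: every match of `~(.*?)\^` has to begin at a `~`, and the only `~` characters sit at the front of the string. A minimal sketch that makes this visible (the names `matches` and `tilde_positions` are only for illustration):

```python
import re

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

# Each match must start at a '~' and runs lazily up to the next '^'.
matches = [(m.span(), m.group(1)) for m in re.finditer(r'~(.*?)\^', target)]
print(matches)

# All '~' characters are at the start, so after the first match is consumed
# there is no '~' left for a second match to start from.
tilde_positions = [i for i, ch in enumerate(target) if ch == '~']
print(tilde_positions)
```

So the remaining fields (DEF, HGK, LMN) can never be found with a pattern that requires a leading `~`.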

What I expect is:

[ABC , DEF , HGK , LMN ]

or

[^ABC , ^DEF , ^HGK , LMN ]

I want to do all this because I am trying to extract text from an HTML page, like this example:

<td class="cell-1">
<div><span class="value-frame">&nbsp;~~~~ ABC ^ DEF ^ HGK > LMN ^ </span></div>
</td>
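For the HTML side, here is a rough sketch of one way to pull the text out of the `value-frame` span before splitting it. It assumes Python 3 and uses only the standard library `html.parser`; the class name `ValueFrameParser` and the variable names are made up for illustration, and a real page may need more careful handling (nested spans, several cells, and so on).

```python
import re
from html.parser import HTMLParser


class ValueFrameParser(HTMLParser):
    """Collect the text inside <span class="value-frame"> elements."""

    def __init__(self):
        super().__init__()  # character references such as &nbsp; are decoded
        self.in_value = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == 'span' and ('class', 'value-frame') in attrs:
            self.in_value = True
            self.values.append('')

    def handle_endtag(self, tag):
        if tag == 'span':
            self.in_value = False

    def handle_data(self, data):
        if self.in_value:
            self.values[-1] += data


html_snippet = '''<td class="cell-1">
<div><span class="value-frame">&nbsp;~~~~ ABC ^ DEF ^ HGK > LMN ^ </span></div>
</td>'''

parser = ValueFrameParser()
parser.feed(html_snippet)
parser.close()

# Split on runs of the delimiter characters, then trim and drop empty pieces.
raw = parser.values[0]
fields = [part.strip() for part in re.split(r'[~^>]+', raw) if part.strip()]
print(fields)
```

The decoded `&nbsp;` is a non-breaking space, which `str.strip()` treats as whitespace in Python 3, so it does not end up in the result.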

##### Share on other sites

Personally, I would do it like this:

I would start by removing the '~' characters, since they do not seem to carry any information; replacing them with an empty string does that. Then use re.split on either the '^' or the '>' character. You have to escape '^' because it is an anchor in a regular expression, and the pipe character creates an "or" between the two alternatives.

That gives the fields, but with a lot of surrounding whitespace and a whitespace-only entry at the end, so I would use a list comprehension to strip each entry and omit any blank ones from the list.

Below is the code:

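#!/usr/bin/env python

import re

def main():
    target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
    # The '~' characters carry no information, so drop them.
    target = target.replace('~', '')
    # Split on either delimiter; '^' is escaped because it is a regex anchor.
    target_list = re.split(r'\^|>', target)
    # Trim whitespace and skip entries that are empty after trimming.
    target_list = [entry.strip() for entry in target_list if entry.strip()]
    print(target_list)

if __name__ == '__main__':
    main()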


That gives me:

['ABC', 'DEF', 'HGK', 'LMN']
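The replace-then-split steps above can also be folded into a single split on a character class. This is only a sketch of that variation; the helper name `split_fields` and its `delimiters` parameter are my own and not from the original post.

```python
import re


def split_fields(text, delimiters='~^>'):
    """Split text on runs of delimiter characters and trim each field."""
    # re.escape keeps characters such as '^' literal inside the class.
    pattern = '[' + re.escape(delimiters) + ']+'
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
print(split_fields(target))

# Fields that contain spaces stay in one piece, because only the
# delimiter characters split the string.
print(split_fields('~~ FOO BAR ^ BAZ ^'))
```

Treating '~' as one more delimiter means it no longer needs a separate replace step.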


##### Share on other sites

If you need to use a regexp, I believe the regexp you are looking for is (written as a Python raw string, so the backslashes and the end anchor need no extra escaping):

r"^[ ~\^>]*([A-Z]*)[ ~\^>]*([A-Z]*)[ ~\^>]*([A-Z]*)[ ~\^>]*([A-Z]*).*$"

When you need to test regexps I would recommend:

https://www.debuggex.com/
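As a sketch of how a four-group pattern like the one above could be applied in Python: it is written here as a raw string with a plain `$` end anchor, which is my assumption about the intent. Note that it captures exactly four fields; the `findall` variant at the end is an alternative I am adding for inputs with a different number of uppercase tokens.

```python
import re

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

# Four capture groups, each preceded by any run of spaces and delimiters.
pattern = re.compile(
    r'^[ ~\^>]*([A-Z]*)[ ~\^>]*([A-Z]*)[ ~\^>]*([A-Z]*)[ ~\^>]*([A-Z]*).*$'
)

match = pattern.match(target)
fixed = list(match.groups()) if match else []
print(fixed)

# When the number of fields is not fixed, matching the tokens themselves
# avoids repeating a group per field.
tokens = re.findall(r'[A-Z]+', target)
print(tokens)
```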

##### Share on other sites

Guys, it's Python, don't forget it :D

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
result = target.replace('^', '').replace('>', '').replace('~', '').split()
