Substring[whole word] check using a string variable

In Python 2.7, I am trying the following:
>>> import re
>>> text='0.0.0.0/0 172.36.128.214'
>>> far_end_ip="172.36.128.214"
>>>
>>>
>>> chk=re.search(r"\b172.36.128.214\b",text)
>>> chk
<_sre.SRE_Match object at 0x0000000002349578>
>>> chk=re.search(r"\b172.36.128.21\b",text)
>>> chk
>>> chk=re.search(r"\b"+far_end_ip+"\b",text)
>>>
>>> chk
>>>
Q: How can I make the search work when using the variable far_end_ip?

Two issues:
The last piece of the pattern must be a raw string literal, or have its backslash escaped: ... + r"\b". In a plain string, "\b" is a backspace character, not a word boundary.
You should escape the dots in the text to find, because an unescaped . matches any character: ... + re.escape(far_end_ip)
So:
re.search(r"\b" + re.escape(far_end_ip) + r"\b", text)
See also "How to use a variable inside a regular expression?".
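The two fixes above can be wrapped in a small helper. This is only a minimal sketch: the function name contains_whole_word is hypothetical and not part of the re module.

```python
import re

def contains_whole_word(text, word):
    """Return True if `word` occurs in `text` delimited by word boundaries."""
    # re.escape makes the dots literal; r"\b" keeps the backslash for the regex engine.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text) is not None

text = "0.0.0.0/0 172.36.128.214"
found_full = contains_whole_word(text, "172.36.128.214")    # the full address
found_prefix = contains_whole_word(text, "172.36.128.21")   # only a prefix of the address

# Without re.escape, each "." would match any character:
loose = re.search(r"\b172.36.128.214\b", "172x36x128x214") is not None
```

Here found_full is True, found_prefix is False, and loose is True, which shows why escaping the dots matters.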

Related

Regex how to check last 4 numbers from long number

I would like to check only the last 4 digits of a number with Python.
For example, given the following numbers, I want to check whether the last four digits start with 10 or 02:
201600001057 (I want to get 1057)
201600000216 (I want to get 0216)
Thanks in advance.

Why would you use regex for this?
last4 = str(number)[-4:]
if last4.startswith(('10', '02')):
    print("yes, actually")
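The slicing approach above could be packaged as a function that returns the last four digits only when they start with an accepted prefix. This is a sketch; the name last_four_if_prefixed and its prefixes parameter are made up for illustration.

```python
def last_four_if_prefixed(number, prefixes=("10", "02")):
    """Return the last four characters if they start with one of `prefixes`, else None."""
    last4 = str(number)[-4:]
    # str.startswith accepts a tuple of candidate prefixes.
    return last4 if last4.startswith(prefixes) else None

a = last_four_if_prefixed(201600001057)
b = last_four_if_prefixed(201600000216)
c = last_four_if_prefixed(201600009999)
```

With the numbers from the question, a is '1057' and b is '0216'; c is None because 9999 starts with neither prefix.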

You can do it without a regexp:
>>> s="201600001057"
>>> s[-4:]
'1057'
>>> s[-4:].isdigit()
True
>>> s="201600001057a"
>>> s[-4:].isdigit()
False

(?=(?:10|02))\d{4}$
This should do it. See the demo:
http://regex101.com/r/kP4pZ2/18
import re
print(re.findall(r"(?=(?:10|02))\d{4}$", x, re.M))
Here x is your string.

You could use re.search (re.match only matches at the start of the string, so it would need a leading .* in the pattern). The pattern matches only if the last four digits start with 10 or 02:
>>> import re
>>> s = "201600001057"
>>> s1 = "201600000216"
>>> re.search(r'(?:10|02)\d{2}$', s)
<_sre.SRE_Match object at 0x7fdbb2b6d3d8>
>>> re.search(r'(?:10|02)\d{2}$', s).group()
'1057'
>>> re.search(r'(?:10|02)\d{2}$', s1).group()
'0216'
>>> if re.search(r'(?:10|02)\d{2}$', s1):
...     print 'Matches'
...
Matches
>>> if re.search(r'(?:10|02)\d{2}$', s):
...     print 'Matches'
...
Matches

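The findall function in the re module can be used to extract the last four digits (note that this pattern does not check for the 10 or 02 prefix):
>>> import re
>>> x="201600001057"
>>> re.findall(r'\d{4}$', x)
['1057']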

Moving parts of string around python

I have a string, well, several actually. The strings are simply:
string.a.is.this
or
string.a.im
in that fashion.
and what I want to do is make those strings become:
this.is.a.string
and
im.a.string
What I've tried:
new_string = string.split('.')
new_string = (new_string[3] + '.' + new_string[2] + '.' + new_string[1] + '.' + new_string[0])
Which works fine for making:
string.a.is.this
into
this.is.a.string
but gives me an 'out of range' error if I try it on:
string.a.im
yet if I do:
new_string = (new_string[2] + '.' + new_string[1] + '.' + new_string[0])
that works fine to make:
string.a.im
into
im.a.string
but obviously does not work for:
string.a.is.this
since it is not set up for 4 indices. I was trying to figure out how to make the extra index optional, or find any other workaround or a better method. Thanks.

You can use str.join, str.split, and [::-1]:
>>> mystr = 'string.a.is.this'
>>> '.'.join(mystr.split('.')[::-1])
'this.is.a.string'
>>> mystr = 'string.a.im'
>>> '.'.join(mystr.split('.')[::-1])
'im.a.string'
>>>
To explain better, here is a step-by-step demonstration with the first string:
>>> mystr = 'string.a.is.this'
>>>
>>> # Split the string on .
>>> mystr.split('.')
['string', 'a', 'is', 'this']
>>>
>>> # Reverse the list returned above
>>> mystr.split('.')[::-1]
['this', 'is', 'a', 'string']
>>>
>>> # Join the strings in the reversed list, separating them by .
>>> '.'.join(mystr.split('.')[::-1])
'this.is.a.string'
>>>
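The split, reverse, and join steps above can be collected into one reusable function that works for any number of parts. This is a sketch; the name reverse_dotted and its sep parameter are illustrative only.

```python
def reverse_dotted(s, sep="."):
    """Reverse the order of the `sep`-separated parts of `s`."""
    # reversed() yields the parts back to front, like the [::-1] slice does.
    return sep.join(reversed(s.split(sep)))

four_parts = reverse_dotted("string.a.is.this")
three_parts = reverse_dotted("string.a.im")
```

Because nothing is indexed by position, the same call handles both the three-part and the four-part strings from the question.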

You could also do it with Python's re module:
>>> import re
>>> mystr = 'string.a.is.this'
>>> regex = re.findall(r'([^.]+)', mystr)
>>> '.'.join(regex[::-1])
'this.is.a.string'

Split string into tuple (Upper,lower) 'ABCDefgh' . Python 2.7.6

my_string = 'ABCDefgh'
desired = ('ABCD','efgh')
The only way I can think of is a for loop that checks each character in the string individually, appends it to a new string, and then builds the tuple. Is there a more efficient way to do this?
It will always be in the format UPPERlower.

import re
print(tuple(re.split("([A-Z]+)", my_string)[1:]))

Simple way (two passes):
>>> import itertools
>>> my_string = 'ABCDefgh'
>>> desired = (''.join(itertools.takewhile(lambda c:c.isupper(), my_string)), ''.join(itertools.dropwhile(lambda c:c.isupper(), my_string)))
>>> desired
('ABCD', 'efgh')
Efficient way (one pass):
>>> my_string = 'ABCDefgh'
>>> uppers = []
>>> done = False
>>> i = 0
>>> while not done:
...     c = my_string[i]
...     if c.isupper():
...         uppers.append(c)
...         i += 1
...     else:
...         done = True
...
>>> lowers = my_string[i:]
>>> desired = (''.join(uppers), lowers)
>>> desired
('ABCD', 'efgh')
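The one-pass idea above can also be written as a function that only tracks the split index. The sketch below adds a bounds check so that an all-uppercase or empty string does not raise IndexError; the name split_upper_lower is hypothetical.

```python
def split_upper_lower(s):
    """Split `s` into (leading uppercase run, remainder) in a single pass."""
    i = 0
    # Advance past the leading uppercase characters, stopping at the end of the string.
    while i < len(s) and s[i].isupper():
        i += 1
    return s[:i], s[i:]

desired = split_upper_lower("ABCDefgh")
all_upper = split_upper_lower("ABCD")
empty = split_upper_lower("")
```

For the question's input, desired is ('ABCD', 'efgh'); the other two calls return ('ABCD', '') and ('', '').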

Because I throw itertools.groupby at everything:
>>> my_string = 'ABCDefgh'
>>> from itertools import groupby
>>> [''.join(g) for k,g in groupby(my_string, str.isupper)]
['ABCD', 'efgh']
(A little overpowered here, but scales up to more complicated problems nicely.)

import re
my_string = 'ABCDefg'
desired = (re.search('[A-Z]+', my_string).group(0), re.search('[a-z]+', my_string).group(0))
print(desired)

A more robust approach without using re:
>>> import string
>>> txt = "ABCeUiioualfjNLkdD"
>>> tup = (''.join([char for char in txt if char in string.ascii_uppercase]),
...        ''.join([char for char in txt if char not in string.ascii_uppercase]))
>>> tup
('ABCUNLD', 'eiioualfjkd')
Testing char not in string.ascii_uppercase, rather than char in string.ascii_lowercase, means you never lose data if the string contains non-letters. That helps when such input would otherwise only be rejected, with an error, 20 function calls later.

How do I split the following string?

I have the following string where I need to extract only the first digits from it.
string = '50.2000\xc2\xb0 E'
How do I extract 50.2000 from string?

If the number can be followed by any kind of character, try using a regex:
>>> import re
>>> r = re.compile(r'(\d+\.\d+)')
>>> r.match('50.2000\xc2\xb0 E').group(1)
'50.2000'

mystring = '50.2000\xc2\xb0 E'
print mystring.split("\xc2", 1)[0]
Output
50.2000

If you just want to skip a fixed number of leading characters, slice the string:
start = 10  # start at index 10 (the 11th character)
print(my_string[start:])
Demo:
>>> my_string = 'abcasdkljf23u109842398470ujw{}{\\][\\['
>>> start = 10
>>> print(my_string[start:])
23u109842398470ujw{}{\][\[
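You can also split the string at the first backslash, but only when the string contains literal backslashes (for example, a raw string). In the question's '50.2000\xc2\xb0 E', \xc2 and \xb0 are byte escapes, so there is no backslash to split on: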
>>> s = r'50.2000\xc2\xb0 E'
>>> s.split('\\', 1)
['50.2000', 'xc2\\xb0 E']

You could solve this using a regular expression:
In [1]: import re
In [2]: string = '50.2000\xc2\xb0 E'
In [3]: m = re.match(r'^([0-9]+\.?[0-9]*)', string)
In [4]: m.group(0)
Out[4]: '50.2000'
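If the goal is a numeric value rather than a string, the regex approach can be wrapped so that a missing number is handled explicitly. This is a sketch under that assumption; the name leading_number is invented for illustration.

```python
import re

def leading_number(text):
    """Return the number at the start of `text` as a float, or None if there is none."""
    # re.match anchors at the start; the fractional part is optional.
    match = re.match(r"\d+(?:\.\d+)?", text)
    return float(match.group(0)) if match else None

value = leading_number("50.2000\xc2\xb0 E")
missing = leading_number("E 50.2000")
```

Here value is 50.2, and missing is None because the second string does not start with a digit.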

Python Regular expression repeat

I have a string like this
--x123-09827--x456-9908872--x789-267504
I am trying to get all the values, like
123:09827
456:9908872
789:267504
I've tried (--x([0-9]+)-([0-9])+)+
but it only gives me the last pair. I am testing it in Python:
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> p = "(--x([0-9]+)-([0-9]+))+"
>>> re.match(p,x)
>>> re.match(p,x).groups()
('--x789-267504', '789', '267504')
How should I write a pattern with nested repetition?
Thanks a lot!
David

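Code it like this:
import re
x = "--x123-09827--x456-9908872--x789-267504"
p = "--x(?:[0-9]+)-(?:[0-9]+)"
# The groups are non-capturing, so findall returns the full matches, e.g. '--x123-09827'
print(re.findall(p, x))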

Just use the .findall method instead, it makes the expression simpler.
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> r = re.compile(r"--x(\d+)-(\d+)")
>>> r.findall(x)
[('123', '09827'), ('456', '9908872'), ('789', '267504')]
You can also use .finditer which might be helpful for longer strings.
>>> [m.groups() for m in r.finditer(x)]
[('123', '09827'), ('456', '9908872'), ('789', '267504')]
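To get from the tuples to the a:b strings the question asks for, the pairs can be formatted after matching. The sketch below also shows why the original pattern returned only the last pair: a capturing group inside a repetition keeps just its final iteration. The variable names are illustrative.

```python
import re

x = "--x123-09827--x456-9908872--x789-267504"

# One match per "--x<digits>-<digits>" occurrence, each as a (first, second) tuple.
pairs = re.findall(r"--x(\d+)-(\d+)", x)
formatted = ["{0}:{1}".format(a, b) for a, b in pairs]

# A repeated group retains only the values from its last repetition.
last_only = re.match(r"(?:--x(\d+)-(\d+))+", x).groups()
```

Here formatted is ['123:09827', '456:9908872', '789:267504'], while last_only is ('789', '267504').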

Use re.finditer or re.findall. Then you don't need the extra pair of parentheses that wrap the entire expression. For example,
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> p = "--x([0-9]+)-([0-9]+)"
>>> for m in re.finditer(p,x):
...     print '{0} {1}'.format(m.group(1),m.group(2))

Try this:
p='--x([0-9]+)-([0-9]+)'
re.findall(p,x)

No need to use regex:
>>> "--x123-09827--x456-9908872--x789-267504".replace('--x',' ').replace('-',':').strip()
'123:09827 456:9908872 789:267504'

You don't need regular expressions for this. Here is a simple one-liner, non-regex solution:
>>> input = "--x123-09827--x456-9908872--x789-267504"
>>> [ x.replace("-", ":") for x in input.split("--x")[1:] ]
['123:09827', '456:9908872', '789:267504']
If this is an exercise on regex, here is a solution that uses the repetition (technically), though the findall(...) solution may be preferred:
>>> import re
>>> input = "--x123-09827--x456-9908872--x789-267504"
>>> regex = '--x(.+)'
>>> [ x.replace("-", ":") for x in re.match(regex*3, input).groups() ]
['123:09827', '456:9908872', '789:267504']
