% symbol and regular expressions [duplicate] - python

This question already has answers here:
What does % do to strings in Python?
(4 answers)
Closed 3 years ago.
How does this line of code work? Google searches on individual characters don't work well.
re.sub(r'(.*>.*/.*)%s(_R[12].*)' % sample.group(1), r'\1%s\2' % sample_name[1], line)
What I don't understand:
"% sample.group(1)" .... what is % doing?
'\1%s\2' %
%s
What I understand:
re.sub(x, y, z) replaces matches of pattern x with y in string z
r is for raw (don't mess with \)
arrays & indexes
_R[12].* matches "_R" and a 1 or 2 followed by any characters.
line (it's a string)
Thanks!

The % string operator is used for string interpolation/formatting. Think sprintf or String.format:
r'(.*>.*/.*)%s(_R[12].*)' % sample.group(1)
is equivalent to (when sample.group(1) is a string)
r'(.*>.*/.*)' + sample.group(1) + r'(_R[12].*)'
Specifically, the s conversion type (i.e., %s) is documented as:
String (converts any Python object using str()).
.format is the modern way to go, though.
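To see the whole line from the question at work, here is a minimal sketch. The input line, the regex that produces sample, and the contents of sample_name are made up for illustration; only the shape of the re.sub call follows the question.

```python
import re

# Hypothetical inputs; the real data behind the question is not shown.
line = ">run1/path/ABC123_R1.fastq"
sample = re.search(r"/(\w+?)_R[12]", line)   # group(1) is "ABC123"
sample_name = ["ABC123", "patient7"]

# % splices the captured sample ID into the pattern before re.sub sees it.
pattern = r"(.*>.*/.*)%s(_R[12].*)" % sample.group(1)
# Same for the replacement: \1 and \2 are backreferences, %s is filled by %.
replacement = r"\1%s\2" % sample_name[1]

result = re.sub(pattern, replacement, line)
print(pattern)      # (.*>.*/.*)ABC123(_R[12].*)
print(replacement)  # \1patient7\2
print(result)       # >run1/path/patient7_R1.fastq
```

If the captured text could contain regex metacharacters, passing it through re.escape() before interpolating would be the safer choice.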

Related

Compilation of complex regex expressions [duplicate]

This question already has answers here:
String formatting in Python [duplicate]
(14 answers)
Closed 2 years ago.
I'm trying to understand the following code related to complex regex.
I do not understand how the full_regex line operates. What is the purpose of the '%s' placeholders and of the % before (regex1, regex2, ...)?
Can someone please help with this?
regex1 = r'(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})'
regex2 = r'((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\S]*[+\s]\d{1,2}[,]{0,1}[+\s]\d{4})'
regex3 = r'(\d{1,2}[+\s](?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\S]*[+\s]\d{4})'
regex4 = r'((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\S]*[+\s]\d{4})'
regex5 = r'(\d{1,2}[/-][1|2]\d{3})'
regex6 = r'([1|2]\d{3})'
full_regex = '(%s|%s|%s|%s|%s|%s)' % (regex1, regex2, regex3, regex4, regex5, regex6)
The expression
full_regex = '(%s|%s|%s|%s|%s|%s)' % (regex1, regex2, regex3, regex4, regex5, regex6)
just merges all of the other regexps into one big one that alternates between all of them; that's not regex syntax, it's just Python string interpolation.
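A small sketch of the same idea, using two shortened stand-ins for the question's patterns. The names numeric_date and year_only and the sample text are illustrative assumptions, not part of the original code.

```python
import re

# Two shortened stand-ins for the question's regex1..regex6 (illustrative only).
numeric_date = r"(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})"
year_only = r"(\d{4})"

# % fills each %s in order from the tuple. The | characters are plain text in
# the format string and only act as regex alternation once the result is used
# as a pattern.
full_regex = "(%s|%s)" % (numeric_date, year_only)
print(full_regex)  # ((\d{1,2}[/-]\d{1,2}[/-]\d{2,4})|(\d{4}))

text = "seen 04/20/2009 and in 1998"
matches = [m.group(0) for m in re.finditer(full_regex, text)]
print(matches)  # ['04/20/2009', '1998']
```

Building the same string with "(%s)" % "|".join([numeric_date, year_only]) avoids having to count placeholders when the list of patterns grows.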

Python regex and escape characters [duplicate]

This question already has answers here:
Regular expression with backslash in Python3
(2 answers)
Closed 3 years ago.
I have the following regex code:
import re

tmp = 'c:\\\\temp'
m = re.search(tmp, tmp)
if m is None:
    print('Unable to find a ticker in ' + filename)
else:
    print("REGEX RESULT - " + m.group(0))
which returns None. No matter how many or few backslashes I use for variable tmp, I still get None as result. How can I perform regex to search for a backslashed file path?
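A raw string (r'...') stops Python from processing backslash escapes in the literal, but the regex engine still treats every backslash in the pattern as an escape character, so a path used as its own pattern will not match itself. To search for a literal path, escape the pattern with re.escape and leave the searched text as it is:
tmp = r'c:\temp'
m = re.search(re.escape(tmp), tmp)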
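A raw string only changes how Python reads the literal, not how the regex engine reads the pattern. The minimal sketch below checks this directly; the variable names are illustrative, and re.escape is the standard-library helper for matching text literally.

```python
import re

path = r"c:\temp"          # raw literal: one real backslash, 7 characters

# Used directly as a pattern, \t is read by the regex engine as a tab escape,
# so the pattern no longer describes the path itself.
unescaped = re.search(path, path)

# re.escape() backslash-escapes the special characters so they match literally.
escaped = re.search(re.escape(path), path)

print(unescaped)           # None
print(escaped.group(0))    # c:\temp
```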

Python regex with variable {}-multiplier [duplicate]

This question already has answers here:
How do I escape curly-brace ({}) characters in a string while using .format (or an f-string)?
(23 answers)
Closed 6 years ago.
Say you wanted to create a pattern that matches sequences of var consecutive digits. You could do it this way:
p = re.compile(r"\d{"+str(var)+"}")
or this way:
p = re.compile(r"\d{%d}" % var)
But how would you do it using format()?
I tried both:
p = re.compile(r"\d{0}".format(var))
and:
p = re.compile(r"\d{{0}}".format(var))
but neither of these worked.
You need triple braces, {{{ and }}}: two for each escaped literal brace and one for the placeholder:
In [1]: var = 6
In [2]: r"\d{{{0}}}".format(var)
Out[2]: '\\d{6}'
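The sketch below puts the working form next to the two attempts from the question, so it is visible what each one expands to. The value of var and the sample text are arbitrary choices for illustration.

```python
import re

var = 3  # hypothetical repeat count

# {{ and }} produce literal braces; the inner {0} is the placeholder.
pattern = r"\d{{{0}}}".format(var)
print(pattern)  # \d{3}

# What the two attempts from the question actually produce:
print(r"\d{0}".format(var))    # \d3    (the braces are consumed by the placeholder)
print(r"\d{{0}}".format(var))  # \d{0}  (escaped braces, nothing substituted)

print(re.findall(pattern, "12 345 6789"))  # ['345', '678']
```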

Python CLI: print 'some text' vs 'some text' [duplicate]

This question already has answers here:
Understanding repr( ) function in Python
(5 answers)
Why do backslashes appear twice?
(2 answers)
Closed 7 years ago.
Why does >>> 'c\\\h' produce 'c\\\\h' in the Python CLI,
but >>> print 'c\\\h' produces c\\h?
The Python interpreter running in REPL mode prints the representation (repr builtin) of the result of the last statement (if it exists and is not None):
>>> 5 + 6
11
For str objects, the representation is a string literal in the same form as you would write it in your code (although the quotes may differ), so it includes escape sequences:
>>> '\n\t1'
'\n\t1'
>>> print repr('\n\t1')
'\n\t1'
The print statement (or function), on the other hand, prints the plain string conversion (str builtin) of an object, so the characters that the escape sequences stand for are written out as-is:
>>> print '\n\t1'
<---- newline
1 <---- tab + 1
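The same comparison can be made explicit by calling repr() and str() directly. This sketch assumes Python 3 (print as a function) and writes the string as 'c\\\\h', which holds the same four characters as the question's 'c\\\h' without relying on the unrecognized \h escape.

```python
s = 'c\\\\h'  # c, two backslashes, h

shown_by_repl = repr(s)   # what the interactive prompt echoes
shown_by_print = str(s)   # what print() writes

print(len(s))             # 4
print(shown_by_repl)      # 'c\\\\h'
print(shown_by_print)     # c\\h
```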

Output variable value in a string [duplicate]

This question already has answers here:
Is there a Python equivalent to Ruby's string interpolation?
(9 answers)
Closed 8 years ago.
In Ruby I can do this:
"This is a string with the value of #{variable} shown."
How do I do that same thing in Python?
You have a lot of options.
"This is a string with the value of " + str(variable) + " shown."
"This is a string with the value of %s shown." % (str(variable))
"This is a string with the value of {0} shown.".format(variable)
The modern/preferred way is to use str.format:
"This is a string with the value of {} shown.".format(variable)
Below is a demonstration:
>>> 'abc{}'.format(123)
'abc123'
>>>
Note that in Python versions before 2.7, you need to explicitly number the format fields:
"This is a string with the value of {0} shown.".format(variable)
Another option is string.Template:
from string import Template
s = Template('$who likes $what')
s.substitute(who='tim', what='kung pao')
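For a side-by-side check, the sketch below builds the same sentence with each approach mentioned above. The value assigned to variable and the placeholder name $v are arbitrary choices for illustration.

```python
from string import Template

variable = 42  # hypothetical value

by_concat = "This is a string with the value of " + str(variable) + " shown."
by_percent = "This is a string with the value of %s shown." % variable
by_format = "This is a string with the value of {} shown.".format(variable)
by_template = Template("This is a string with the value of $v shown.").substitute(v=variable)

# All four approaches build the same text.
print(by_concat)  # This is a string with the value of 42 shown.
print(by_concat == by_percent == by_format == by_template)  # True
```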
