Python regex: Bad character range - python

I have the following regular expression to find emojis in a text:
re.compile(u'([\U00002600-\U000027BF])|([\U0001F300-\U0001F64F])|([\U0001F680-\U0001F6FF])')
It is working well in Python 3 but in Python 2.7 I get this:
sre_constants.error: bad character range
How can I fix it to support both Python 2.7 and Python 3?

Use a raw string, r'(...', instead of u'(...', like this:
re.compile(r'([\U00002600-\U000027BF\U0001F300-\U0001F64F\U0001F680-\U0001F6FF])')
Also note that you can specify multiple ranges inside a single [...] character class.
https://regex101.com/r/WuQ3Zr/1
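One caveat worth checking: the re module only interprets \U escapes itself from Python 3.3 on, so on Python 2.7 the raw-string pattern may compile without actually matching emoji. The "bad character range" error typically shows up on narrow Python 2 builds, which store characters above U+FFFF as surrogate pairs. A minimal sketch of a fallback, assuming that is the cause here (the names EMOJI_WIDE, EMOJI_NARROW and emoji_re are made up for illustration):

```python
import re

# Pattern for builds where one code point is one character
# (Python 3, or a "wide" Python 2 build).
EMOJI_WIDE = (u'[\U00002600-\U000027BF'
              u'\U0001F300-\U0001F64F'
              u'\U0001F680-\U0001F6FF]')

# Same ranges spelled as UTF-16 surrogate pairs, for "narrow" Python 2 builds.
EMOJI_NARROW = (u'[\u2600-\u27BF]'
                u'|\ud83c[\udf00-\udfff]'   # U+1F300..U+1F3FF
                u'|\ud83d[\udc00-\ude4f]'   # U+1F400..U+1F64F
                u'|\ud83d[\ude80-\udeff]')  # U+1F680..U+1F6FF

try:
    emoji_re = re.compile(EMOJI_WIDE)
except re.error:
    # Narrow build: the wide ranges are rejected, so use the surrogate form.
    emoji_re = re.compile(EMOJI_NARROW)

found = emoji_re.findall(u'sun \u2600 and smile \U0001F600')
```

On Python 3 the first pattern compiles and the fallback is never used; the try/except only matters on interpreters that reject the wide ranges.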

Related

How to substitute variable in string directly? [duplicate]

I'm trying out Python 3.6. Going through new code, I stumbled upon this new syntax:
f"My formatting string!"
It seems we can do things like this:
>>> name = "George"
>>> print(f"My cool string is called {name}.")
My cool string is called George.
Can anyone shed some light on the inner workings of this? In particular what is the scope of the variables that an f-prefixed string can take?
See PEP 498 Literal String Interpolation:
The expressions that are extracted from the string are evaluated in the context where the f-string appeared. This means the expression has full access to local and global variables. Any valid Python expression can be used, including function and method calls.
So the expressions are evaluated as if they appear in the same scope; locals, closures, and globals all work the same as in other code in the same context.
You'll find more details in the reference documentation:
Expressions in formatted string literals are treated like regular Python expressions surrounded by parentheses, with a few exceptions. An empty expression is not allowed, and a lambda expression must be surrounded by explicit parentheses. Replacement expressions can contain line breaks (e.g. in triple-quoted strings), but they cannot contain comments. Each expression is evaluated in the context where the formatted string literal appears, in order from left to right.
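To make the scoping rule concrete, here is a small sketch (Python 3.6+; the names GREETING, make_greeter and punctuation are made up for illustration) in which a single f-string reads a global, a closure variable and the result of a method call:

```python
GREETING = "Hello"  # module-level (global) name


def make_greeter(name):
    punctuation = "!"  # local to the enclosing function

    def greet():
        # GREETING is looked up as a global; name and punctuation come
        # from the enclosing scope, exactly as in ordinary code here.
        return f"{GREETING}, {name.upper()}{punctuation}"

    return greet


message = make_greeter("George")()
print(message)  # Hello, GEORGE!
```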
Since you are trying out a 3.6 alpha build, please do read the What's New In Python 3.6 documentation. It summarises all changes, including links to the relevant documentation and PEPs.
And just to be clear: 3.6 isn't released yet; the first alpha is not expected to be released until May 2016. See the 3.6 release schedule.
f-strings also support any Python expressions inside the curly braces.
print(f"My cool string is called {name.upper()}.")
It might also be worth noting that PEP 498 has a backport for Python versions before 3.6:
pip install fstring
from fstring import fstring
x = 1
y = 2.0
plus_result = "3.0"
print fstring("{x}+{y}={plus_result}")
# Prints: 1+2.0=3.0
The letter f stands for "format", as in f"hello {somevar}". The f before the opening double quote, together with the {} placeholders, tells Python 3: "this string needs to be formatted, so put these variables in there and format it."
Hope this is clear.

Sublime Text syntax: Python 3.6 f-strings

I am trying to modify the default Python.sublime_syntax file to handle Python’s f-string literals properly. My goal is to have expressions in interpolated strings recognised as such:
f"hello {person.name if person else 'there'}"
-----------source.python----------
------string.quoted.double.block.python------
Within f-strings, ranges of text between a single { and another } (but terminating before format specifiers such as !r}, :<5}, etc—see PEP 498) should be recognised as expressions. As far as I know, that might look a little like this:
...
string:
  - match: "(?<=[^\{]\{)[^\{].*)(?=(!(s|r|a))?(:.*)?\})" # I'll need a better regex
    push: expressions
However, upon inspecting the built-in Python.sublime_syntax file, the string contexts in particular are too unwieldy to even approach (~480 lines?) and I have no idea where to begin. Thanks heaps for any info.
There was an update to syntax highlighting in build 3127, which includes "Significant improvements to Python syntax highlighting".
However, a couple of users have stated that in build 3176 syntax highlighting still does not correctly highlight Python expressions located within f-strings. According to @Jollywatt, it is set to source.python f"string.quoted.double.block {constant.other.placeholder}" rather than f"string.quoted.double.block {source.python}".
It looks like Sublime uses this tool, PackageDev, "to ease the creation of snippets, syntax definitions, etc. for Sublime Text."

C++ Transpose a Python regular expression into PCRE

I need to transpose a regular expression I wrote in Python into C++ using PCRE cpp wrapper.
My original python code does the following:
self.reg = re.compile(r'(?<![/,\-\s])\s+(?![/,\-\s])')
myfields = self.reg.split(line_of_text)
...
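For reference, the behaviour that has to be reproduced on the C++ side can be checked in Python on a sample line. This is only a sketch; the sample text is made up for illustration:

```python
import re

# Split on whitespace runs that are neither preceded nor followed by
# '/', ',', '-' or another whitespace character.
reg = re.compile(r'(?<![/,\-\s])\s+(?![/,\-\s])')

sample = "alpha  beta, gamma - delta"
fields = reg.split(sample)
print(fields)  # ['alpha', 'beta, gamma - delta']
```

The whitespace after the comma and around the hyphen is left alone, so only the gap between "alpha" and "beta" acts as a field separator.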
I tried to create a pcrecpp regular expression as follows:
pcrecpp::RE reg("(?<![/,\\-\\s])\\s+(?![/,\\-\\s])");
But it doesn't work: neither PartialMatch() nor FullMatch() matches.
Moreover, I have not yet found a method that does something similar to Python's re.split().
I'm not very experienced with PCRE. Is there a specific syntax?
Any feedback?
Thanks.
z.
The pcrecpp::RE class uses / as a delimiter (I believe). The syntax is pretty similar to Perl's.
So you most likely need to escape the forward slash to fix your problem.
bool matched = pcrecpp::RE("(?<![\\/\\s,-])\\s+(?![\\/\\s,-])").PartialMatch("foo bar");
Escape the forward slashes, like this:
(?<![\/,\-\s])\s+(?![\/,\-\s])

Python print statement shows unwanted characters

I am new to python and am having difficulties with an assignment for a class.
Here's my code:
print ('Plants for each semicircle garden: ',round(semiPlants,0))
Here's what gets printed:
('Plants for each semicircle garden:', 50.0)
As you can see, I am getting the parentheses and apostrophes, which I do not want shown.
You're clearly using Python 2.x when you think you're using Python 3.x. In Python 2.x, the contents of the parentheses are interpreted as a tuple.
One fix is to use string formatting to do this:
print ( 'Plants for each semicircle garden: {0}'.format(round(semiPlants,0)))
This works with Python 2.6 and later, because parentheses around a single argument are not interpreted as a tuple. To get a 1-tuple, you need to write (some_object,).
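The difference between the two forms can be seen without print at all. A small sketch, assuming a made-up value for semiPlants:

```python
semiPlants = 49.6  # hypothetical value, for illustration only

# Two comma-separated items in parentheses form a tuple. Python 2's print
# statement receives that tuple and shows its repr, quotes included.
as_tuple = ('Plants for each semicircle garden: ', round(semiPlants, 0))
tuple_text = repr(as_tuple)

# A single formatted string in parentheses is still just a string.
as_string = ('Plants for each semicircle garden: {0}'.format(round(semiPlants, 0)))

# A one-element tuple needs the trailing comma.
one_tuple = (as_string,)

print(tuple_text)
print(as_string)
```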
You've tagged this question Python-3.x, but it looks like you are actually running your code with Python 2.
To see which version you're using, run python -V.
Remove the parentheses, since print is a statement, not a function, in Python 2.x:
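The version can also be checked from inside the script itself, which rules out the case where the python command and the editor point at different interpreters. A small sketch using the standard sys module (the variable names are illustrative):

```python
import sys

# sys.version_info is a tuple-like object: (major, minor, micro, ...).
major, minor = sys.version_info[0], sys.version_info[1]
running_python2 = (major == 2)

print('Running Python {0}.{1}'.format(major, minor))
```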
print 'Plants for each semicircle garden: ',round(semiPlants,0)
Plants for each semicircle garden: 50.0

Split shell-like syntax in Haskell?

How can I split a string in shell-style syntax in Haskell? The equivalent in Python is shlex.split.
>>> shlex.split('''/nosuchconf "/this doesn't exist either" "yep"''')
['/nosuchconf', "/this doesn't exist either", 'yep']
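For anyone porting this, it may help to pin down a few more of the rules that shlex.split applies in its default POSIX mode: double quotes, single quotes, and backslash-escaped spaces all keep a word together. A short sketch (the sample command string is made up):

```python
import shlex

# Double-quoted, single-quoted and backslash-escaped spaces
# all stay inside a single word.
command = r'''a "b c" 'd e' f\ g'''
parts = shlex.split(command)
print(parts)  # ['a', 'b c', 'd e', 'f g']
```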
I'm not sure exactly what you mean: do you want to get all quoted sub-strings from a String? Note that, unlike Python, Haskell has only one set of quotes that indicates a String, namely "...".
Possibilities to consider:
The words and lines functions
The split package
Write a custom parser using polyparse, uu-parsinglib, parsec, etc.
It may be useful if you specified why you want such functionality: are you trying to parse existing shell scripts? Then language-sh might be of use. But you shouldn't use such Strings internally in Haskell; use [String] or something similar instead.
