Using a walrus operator in if statement does not work - python

I have a simple function that should return a prefix based on a pattern, or None if the name does not match. I tried using the walrus operator, but it does not seem to work. Any idea why?
import re
def get_prefix(name):
    if m := re.match(f'^.+(\d\d)-(\d\d)-(\d\d\d\d)$', name) is not None:
        return m.group(3) + m.group(2) + m.group(1)
get_prefix('abc 10-12-2020')
Traceback
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in get_prefix
AttributeError: 'bool' object has no attribute 'group'

You're setting m to the result of re.match(f'^.+(\d\d)-(\d\d)-(\d\d\d\d)$', name) is not None, which is a boolean, because := binds less tightly than the is not comparison.
You probably mean
if (m := re.match(f'^.+(\d\d)-(\d\d)-(\d\d\d\d)$', name)) is not None:
But you don't need is not None here anyway. Matches are truthy and None is falsey. So you just need:
if m := re.match(f'^.+(\d\d)-(\d\d)-(\d\d\d\d)$', name):
(Arguably it's better practice to use () whenever you're using an assignment expression, to make clear what's being assigned.)
See PEP 572, section "Relative precedence of :=".
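To see the precedence issue concretely, here is a minimal sketch contrasting the two spellings. The raw-string pattern, the `PATTERN` and `unparenthesized` names, and the explicit `return None` are assumptions added for illustration; the original uses an f-string with no placeholders.

```python
import re

# Raw string avoids invalid-escape warnings for \d; otherwise the same pattern.
PATTERN = r'^.+(\d\d)-(\d\d)-(\d\d\d\d)$'

def get_prefix(name):
    # With no trailing comparison, m is the match object (or None).
    if m := re.match(PATTERN, name):
        return m.group(3) + m.group(2) + m.group(1)
    return None

# Without parentheses, := captures the result of the whole comparison.
if unparenthesized := re.match(PATTERN, 'abc 10-12-2020') is not None:
    print(type(unparenthesized).__name__)  # bool

print(get_prefix('abc 10-12-2020'))  # 20201210
print(get_prefix('no date here'))    # None
```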

Related

Is the function "next" a good practice to find the first occurrence in an iterable?

I've been learning about iterators and discovered this interesting way of getting the first element in a list that satisfies a condition (with a default value in case none is found):
first_occurence = next((x for x in range(1,10) if x > 5), None)
For me, it seems a very useful, clear way of obtaining the result.
But since I've never seen that in production code, and since next is a little more "low-level" in the Python structure, I was wondering if that could be bad practice for some reason. Is that the case, and why?
It's fine. It's efficient, it's fairly readable, etc.
If you always expect a result, or if None is itself a possible result (so using None as a placeholder makes it hard to tell a real result from the default), it may be better to use the EAFP form instead of providing a default. Either catch the StopIteration that next raises when no item is found, or let it bubble up when the failure comes from the caller's input not meeting specs (so it's up to them to handle it). It looks even cleaner at the point of use that way:
first_occurence = next(x for x in range(1,10) if x > 5)
Alternatively, when None is a valid result, you can use an explicit sentinel object that's guaranteed unique like so:
sentinel = object()  # An anonymous object you construct can't possibly appear in the input
first_occurence = next((x for x in range(1,10) if x > 5), sentinel)
if first_occurence is not sentinel:  # Compare with is for performance and to avoid broken __eq__ comparing equal to sentinel
    ...  # a matching item was found; use first_occurence here
A common use case for one of these constructs is to replace a call to any when you need to know not only whether any item passed the test, but which item did (any can only return True or False, so it's unsuited to finding which item passed).
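As a rough illustration of that swap, the sketch below runs any and next over the same condition. The `numbers` list and the `_missing` sentinel name are made up for the example.

```python
numbers = [1, 4, 7, 9]  # hypothetical sample data

# any() only reports whether some item passed the test.
found = any(x > 5 for x in numbers)

# next() with a sentinel default also reports which item passed first.
_missing = object()
first_big = next((x for x in numbers if x > 5), _missing)

print(found)  # True
if first_big is not _missing:
    print(first_big)  # 7
```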
We can wrap it up in a function to provide an even nicer interface:
_raise = object()
# can pass either an iterable or an iterator
def first(iterable, condition, *, default=_raise, exctype=None):
    """Get the first value from `iterable` which meets `condition`.
    Will consume elements from the iterable.
    default -> if no element meets the condition, return this instead.
    exctype -> if no element meets the condition and there is no default,
               raise this kind of exception rather than `StopIteration`.
               (It will be chained from the original `StopIteration`.)
    """
    try:
        # `iter` is idempotent; this makes sure we have an iterator
        return next(filter(condition, iter(iterable)))
    except StopIteration as e:
        if default is not _raise:
            return default
        if exctype:
            raise exctype() from e
        raise
Let's test it:
>>> first(range(10), lambda x: x > 5)
6
>>> first(range(10), lambda x: x > 11)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in first
StopIteration
>>> first(range(10), lambda x: x > 11, exctype=ValueError)
Traceback (most recent call last):
  File "<stdin>", line 4, in first
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in first
ValueError
>>> first(range(10), lambda x: x > 11, default=None)
>>>

Python all short-circuiting with None element

I read everywhere that Python all and any functions support short-circuiting.
However:
a = None
all((a is not None, a + 1 > 2))
Throws the following error:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-6e28870e65c8>", line 1, in <module>
    all((a is not None, a + 1 > 2))
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
I would have expected the code not to evaluate a + 1 > 2 since a is None.
Why is this happening? Is this because each term is evaluated before the call? Am I forced to use the and operator as in a is not None and a + 1 > 2?
The tuple (a is not None, a + 1 > 2) needs to be created before all() can be called. It is during the creation of the tuple that the TypeError is raised. all() doesn't even get a chance to run.
If you want to see all's short circuiting in action, pass it a generator expression. For example:
>>> all('foobaR'[i].islower() for i in range(7))
False
>>> all('foobar'[i].islower() for i in range(7))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
IndexError: string index out of range
In the first run all stops when it hits the 'R'.islower() case since that's False. In the second run it keeps going until i == 6, which triggers an index error.
"Am I forced to use the and operator as in a is not None and a + 1 > 2?"
I wouldn't say "forced" (I'm sure there are other, more convoluted options), but yes, that's the obvious and idiomatic way to write it.
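A small sketch of both behaviours: the generator form lets all stop early, and the plain and form never evaluates its right-hand side when the left is false. The `checked` list and the `is_lower` helper are hypothetical, added only to make the evaluation order visible.

```python
checked = []  # records which characters were actually tested

def is_lower(ch):
    checked.append(ch)
    return ch.islower()

# Generator expression: items are produced one at a time, so all() can stop early.
result = all(is_lower(ch) for ch in 'foObar')
print(result)   # False
print(checked)  # ['f', 'o', 'O']

# With `and`, the right operand is skipped when the left one is false.
a = None
ok = a is not None and a + 1 > 2
print(ok)       # False
```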

TypeError: 'int' object is not callable for a recursive function

a = 3
def f(x):
    x = (x**3-4*x)/(3(x**2)-4)
    return x
while True:
    print(a)
    a = f(a)
I'm getting a type error here, and I'm not sure why. I'm trying to run this recursive function. Is there any way to fix this?
You need a * operator between the 3 and the opening parenthesis. Multiplication is implied only in mathematical notation; in Python, this looks like you're trying to call 3 as a function:
3(x**2)
So it would be
3*(x**2)
For example
>>> 3(5*2)
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    3(5*2)
TypeError: 'int' object is not callable
>>> 3*(5*2)
30
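Putting the fix back into the original snippet gives something like the sketch below. The loop is capped at five iterations, which is an assumption made here so the example terminates; the original loops forever.

```python
def f(x):
    # Explicit * between 3 and the parenthesised term.
    return (x**3 - 4*x) / (3*(x**2) - 4)

a = 3
for _ in range(5):  # assumed cap; the original uses `while True`
    print(a)
    a = f(a)
```

Note that f is applied repeatedly in a loop here; it does not call itself, so it is iterative rather than recursive.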

What is the defined behavior in Python for no return statement being reached?

Given the following Python code:
def avg(a):
    if len(a):
        return sum(a) / len(a)
What is the language-defined behavior of avg when the length of a is zero? Or is its behavior unspecified by the language, so that it should not be relied upon in Python code?
The default return value is None.
From the documentation on Calls:
A call always returns some value, possibly None, unless it raises an exception. How this value is computed depends on the type of the callable object.
If len(a) is 0, that will be treated as a False-like value, and your return statement won't be reached. When the flow of control drops out of the bottom of a function with no explicit return statement being reached, Python functions implicitly return None:
>>> print(avg([]))
None
If len(a) is not defined - in other words, if the object has no __len__() method - you'll get a TypeError:
>>> print(avg(False))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in avg
TypeError: object of type 'bool' has no len()
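If the None result for an empty input is intentional, it can help to spell it out. The sketch below is one way to do that; the name `avg_explicit` is hypothetical.

```python
def avg(a):
    if len(a):
        return sum(a) / len(a)
    # Falling off the end here implicitly returns None.

def avg_explicit(a):
    # Same behaviour, with the empty case written out.
    if not len(a):
        return None
    return sum(a) / len(a)

print(avg([]))                  # None
print(avg_explicit([]))         # None
print(avg_explicit([1, 2, 3]))  # 2.0
```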

Why is sum's start value not the zero value of the iterable's elements?

Why can't sum determine the correct zero value automatically?
>>> sum((['1'], ['2']))
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    sum((['1'], ['2']))
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> sum((['1'], ['2']), [])
['1', '2']
It is simple to implement like this:
>>> def sum(s, start=None):
...     it = iter(s)
...     n = next(it)
...     if start is None:
...         start = type(n)()
...     return n + __builtins__.sum(it, start)
...
>>> sum((['1'], ['2']))
['1', '2']
>>>
But sum does not join strings anyway, so maybe the point is just to encourage using the proper method for each kind of "summing".
On the other hand, if it is meant to be used only for numbers, why not name it sum_numbers rather than sum, to make that clear?
EDIT: to handle an empty sequence we must add a little code:
>>> sum([])
Traceback (most recent call last):
  File "<pyshell#36>", line 1, in <module>
    sum([])
  File "<pyshell#28>", line 3, in sum
    n = next(it)
StopIteration
>>> def sum(s, start=None):
...     it = iter(s)
...     try:
...         n = next(it)
...     except StopIteration:
...         return 0
...     if start is None:
...         start = type(n)()
...     return n + __builtins__.sum(it, start)
...
>>> sum([])
0
>>>
Inferring the zero value is impossible in the general case. What if the iterable produces instances of a user-defined class that has no zero-argument constructor? And as you've shown, it's easy to provide it yourself.
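To make that concrete, here is a minimal sketch with a hypothetical `Vec` class whose constructor requires arguments. Calling `type(n)()` fails for it, while passing an explicit start value to the built-in sum works.

```python
class Vec:
    """Hypothetical 2-D vector; the constructor has no zero-argument form."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

vectors = [Vec(1, 2), Vec(3, 4)]

try:
    zero = type(vectors[0])()  # what the inferred-start version would attempt
except TypeError as exc:
    print('cannot infer zero:', exc)

# Supplying the start value explicitly works with the built-in sum.
total = sum(vectors, Vec(0, 0))
print(total.x, total.y)  # 4 6
```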
