Like the title says, is there any way to name variables (i.e., lists) used within a nested list comprehension in Python?
I could come up with a fitting example, but I think the question is clear enough.
Here is an example of pseudo code:
[... [r for r in some_list if r.some_attribute == something_from_within_this_list comprehension] ... [r for r in some_list if r.some_attribute == something_from_within_this_list comprehension] ...]
Is there any way to avoid the repetition here and simply add a variable for this temporary list only for use within the list comprehension?
CLARIFICATION:
The list comprehension is already working fine, so it's not a question of 'can it be done with a list comprehension'. And it is quicker than its original form as a for statement too, so it's not one of those 'for statements vs list comprehensions' questions either. It is simply a question of making the list comprehension more readable by giving names to variables internal to the list comprehension alone. Just googling around I haven't really found any answer. I found this and this, but that's not really what I am after.
Based on my understanding of what you want to do: no, you cannot do it.
You cannot carry out assignments in list comprehensions because a list comprehension is essentially of the form
[expression(x, y) for x in expression_that_creates_a_container
                  for y in some_other_expression_that_creates_a_container(x)
                  if predicate(y, x)]
Granted there are a few other cases but they're all about like that. Note that nowhere does there exist room for a statement which is what a name assignment is. So you cannot assign to a name in the context of a list comprehension except by using the for my_variable in syntax.
If you have the list comprehension working, you could post it and see if it can be simplified. Solutions based on itertools are often a good alternative to burly list comprehensions.
I think I understand exactly what you meant, and I came up with a "partial solution" to this problem. The solution works fine, but is not efficient.
Let me explain with an example:
I was just trying to find a Pythagorean triplet whose sum is 1000. The python code to solve it is just:
def pythagoreanTriplet(sum):
    for a in xrange(1, sum/2):
        for b in xrange(1, sum/3):
            c = sum - a - b
            if c > 0 and c**2 == a**2 + b**2:
                return a, b, c
But I wanted to code it in a functional programming-like style:
def pythagoreanTriplet2(sum):
    return next((a, b, sum-a-b) for a in xrange(1, sum/2) for b in xrange(1, sum/3) if (sum-a-b) > 0 and (sum-a-b)**2 == a**2 + b**2)
As can be seen in the code, I calculate (sum-a-b) 3 times, and I wanted to store the result in an internal variable to avoid redundant calculation. The only way I found to do that was by adding another loop with a single value to declare an internal variable:
def pythagoreanTriplet3(sum):
    return next((a, b, c) for a in xrange(1, sum/2) for b in xrange(1, sum/3) for c in [sum-a-b] if c > 0 and c**2 == a**2 + b**2)
It works fine... but as I said at the beginning of the post, it is not an efficient method. Comparing the 3 methods with cProfile, the time required for each method is as follows:
First method: 0.077 seconds
Second method: 0.087 seconds
Third method: 0.109 seconds
Some people could classify the following as a "hack", but it is definitely useful in some cases.
f = lambda i,j: int(i==j)  # A dummy function (here Kronecker's delta)
a = tuple(tuple(i + (2+f_ij)*j + (i + (1+f_ij)*j)**2
                for j in range(4)
                for f_ij in (f(i,j),))  # "Assign" value f(i,j) to f_ij.
          for i in range(4))
print(a)
# Output: ((0, 3, 8, 15), (2, 13, 14, 23), (6, 13, 44, 33), (12, 21, 32, 93))
This approach is particularly convenient if the function f is costly to evaluate. Because it is somewhat unusual, it may be a good idea to document the "assignment" line, as I did above.
I'm just gonna go out on a limb here, because I have no idea what you really are trying to do. I'm just going to guess that you are trying to shoehorn more than you should be into a single expression. Don't do that, just assign subexpressions to variables:
sublist = [r for r in some_list if r.some_attribute == something_from_within_this_list comprehension]
composedlist = [... sublist ... sublist ...]
This feature was added in Python 3.8 (see PEP 572). It is called "assignment expressions", and the operator is := .
Examples from the documentation:
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
stuff = [[y := f(x), x/y] for x in range(5)]
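As a rough sketch of how this applies to the Pythagorean triplet answer above, the one-element `for c in [sum-a-b]` loop could be replaced by an assignment expression. This assumes Python 3.8 or later; the function name, the parameter name `total`, and the switch from `xrange` and `/` to `range` and `//` are my own adaptation, not part of any answer here.

```python
def pythagorean_triplet(total):
    """Return the first (a, b, c) with a + b + c == total and a*a + b*b == c*c."""
    return next(
        (a, b, c)
        for a in range(1, total // 2)
        for b in range(1, total // 3)
        # c is bound once per (a, b) pair and reused in the test and the result
        if (c := total - a - b) > 0 and c * c == a * a + b * b
    )


print(pythagorean_triplet(12))
```

The name bound by `:=` is visible in the rest of the condition and in the result expression, so the subtraction is written only once.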
Related
Firstly, I understand that this is not a best practice. However, this is for a coding challenge where the aim is to make the shortest code possible. Here is the challenge.
The challenge itself is quite easy to solve, as illustrated by the below code:
def solution(a, b, n):
    op = []
    for i in range(1, n+1):
        op += [a * b]
        a += 1
        b += 1
    return sum(op)
However, I want to make this code as short as possible, and to do so, I wanted to use a list comprehension and lambda, like so:
sol = lambda a,b,n: [(op+=[a*b, a+=1, b+=1) for i in range(1, n+1)]
As you can see, in this comprehension I need to append an element to the end of the list and increment a and b. In another Stack Overflow post, I saw that someone had used a tuple to achieve this; however, when I try this, I get the error
sol = lambda a,b,n: [(a+=1,b+=1) for i in range(1,n+1)]
^
SyntaxError: invalid syntax
How can I fix this? Any other suggestions are also welcome
There's no need for the list. And you don't need to increment the variables with +=, use the range() function to get a sequence of incrementing numbers. You can then use zip() to pair up the two sequences of numbers.
def solution(a, b, n):
    return sum(a * b for a, b in zip(range(a, a+n), range(b, b+n)))
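Since the goal in the question was the shortest code, the same idea also fits in a lambda. This is only a sketch: the name `sol` is borrowed from the question, and the single offset `i` added to both `a` and `b` is my own shortcut in place of `zip()`.

```python
# i is the offset added to both a and b on the i-th step (0 .. n-1)
sol = lambda a, b, n: sum((a + i) * (b + i) for i in range(n))

print(sol(1, 2, 3))  # 1*2 + 2*3 + 3*4
```

No assignment is needed inside the expression, because the incremented values are computed from the loop index instead of being stored.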
I'm using Python 3.5.
As part of a problem, I'm trying to design a function that takes a list as input and reverts it. So if x = [a, b, c] the function would make x = [c, b, a].
The problem is, I'm not allowed to use any built-in functions, and it has got me stuck. My initial thought was the following loop inside a function:
for revert in range(1, len(x) + 1):
    y.append(x[-revert])
And it works. But the problem is I'm using len(x), which I believe is a built-in function, correct?
So I searched around and have made the following very simple code:
y = x[::-1]
Which does exactly what I wanted, but it just seems almost too simple/easy and I'm not sure whether "::" counts as a function.
So I was wondering if anyone had any hints/ideas how to manually design said function? It just seems really hard when you can't use any built-in functions and it has me stuck for quite some time now.
range and len are both built-in functions. Since list methods are accepted, you could do this with insert. It is reeaallyy slow* but it does the job for small lists without using any built-ins:
def rev(l):
    r = []
    for i in l:
        r.insert(0, i)
    return r
By continuously inserting at the zero-th position you end up with a reversed version of the input list:
>>> print(rev([1, 2, 3, 4]))
[4, 3, 2, 1]
Doing:
def rev(l):
    return l[::-1]
could also be considered a solution. ::-1 (:: has a different result) isn't a function (it's a slice) and [] is, again, a list method. Also, contrasting insert, it is faster and way more readable; just make sure you're able to understand and explain it. A nice explanation of how it works can be found in this S.O answer.
*Reeaaalllyyyy slow, see juanpa.arrivillaga's answer for cool plot and append with pop and take a look at in-place reverse on lists as done in Yoav Glazner's answer.
:: is not a function; it's part of Python's slice syntax, as is [].
How do you check whether :: and [] are functions or not? Simple:
import dis
a = [1,2]
dis.dis(compile('a[::-1]', '', 'eval'))
  1           0 LOAD_NAME                0 (a)
              3 LOAD_CONST               0 (None)
              6 LOAD_CONST               0 (None)
              9 LOAD_CONST               2 (-1)
             12 BUILD_SLICE              3
             15 BINARY_SUBSCR
             16 RETURN_VALUE
If :: and [] were functions, you would find a CALL_FUNCTION instruction among the Python instructions executed for the a[::-1] expression. So, they aren't.
Look at what the Python instructions look like when you call a function, let's say the list() function:
>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE
So, basically
def rev(f):
    return f[::-1]
works fine. But I think you should do something like what Jim suggested in his answer if your question is homework or was set by your teacher. You can still add this quickest way as a side note.
If your teacher complains about the [::-1] notation, show him the example I gave you.
Another way ( just for completeness :) )
def another_reverse(lst):
    new_lst = lst.copy()  # make a copy if you don't want to ruin lst...
    new_lst.reverse()  # notice! this will reverse it in place
    return new_lst
Here's a solution that doesn't use built-in functions but relies on list methods. It reverses in place, as implied by your specification:
>>> x = [1,2,3,4]
>>> def reverse(seq):
...     temp = []
...     while seq:
...         temp.append(seq.pop())
...     seq[:] = temp
...
>>> reverse(x)
>>> x
[4, 3, 2, 1]
>>>
ETA
Jim, your answer using insert at position 0 was driving me nuts! That solution is quadratic time! You can use append and pop with a temporary list to achieve linear time using simple list methods. See (reverse is in blue, rev is green):
If it feels a little bit like "cheating" using seq[:] = temp, we could always loop over temp and append every item into seq and the time complexity would still be linear but probably slower since it isn't using the C-based internals.
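The variant described in the last paragraph, which avoids the slice assignment, might look like the sketch below. The function name `reverse_in_place` is my own; only the list methods `pop` and `append` are used.

```python
def reverse_in_place(seq):
    """Reverse seq in place using only list.pop() and list.append()."""
    temp = []
    # Popping from the end empties seq and fills temp in reversed order.
    while seq:
        temp.append(seq.pop())
    # seq is now empty; appending item by item replaces seq[:] = temp.
    for item in temp:
        seq.append(item)


x = [1, 2, 3, 4]
reverse_in_place(x)
print(x)
```

Each element is popped once and appended twice, so the running time stays linear in the length of the list.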
Your example that works:
y = x[::-1]
uses Python's slice notation, which is not a function in the sense that I assume you're asking about. Essentially, :: acts as a separator. A more verbose version of your code would be:
y = x[len(x):None:-1]
or
y = x[start:end:step]
I probably wouldn't be complaining that python makes your life really, really easy.
Edit to be super pedantic. Someone could argue that calling [] at all is using an inbuilt python function because it's really syntactical sugar for the method __getitem__().
x.__getitem__(0) == x[0]
And using :: does make use of the slice() object.
x.__getitem__(slice(len(x), None, -1)) == x[::-1]
But... if you were to argue this, anything you write in python would be using inbuilt python functions.
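To make the equivalence concrete, here is a small sketch showing the three spellings side by side. The variable names are illustrative only.

```python
x = [1, 2, 3, 4]

via_syntax = x[::-1]                                # slice syntax
via_slice_object = x[slice(None, None, -1)]         # explicit slice object
via_dunder = x.__getitem__(slice(None, None, -1))   # what [] calls under the hood

print(via_syntax, via_slice_object, via_dunder)
```

All three produce a new reversed list and leave x untouched.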
Another way for completeness, range() takes an optional step parameter that will allow you to step backwards through the list:
def reverse_list(l):
    return [l[i] for i in range(len(l)-1, -1, -1)]
The most pythonic and efficient way to achieve this is list slicing. And, since you mentioned that you may not use any built-in function, it completely satisfies your requirement. For example:
>>> def reverse_list(list_obj):
...     return list_obj[::-1]
...
>>> reverse_list([1, 3, 5 , 3, 7])
[7, 3, 5, 3, 1]
Just iterate the list from right to left to get the items:
a = [1,2,3,4]
def reverse_the_list(a):
    reversed_list = []
    for i in range(0, len(a)):
        reversed_list.append(a[len(a) - i - 1])
    return reversed_list
new_list = reverse_the_list(a)
print(new_list)
I love python. However, one thing that bugs me a bit is that I don't know how to format functional activities in a fluid manner like I can in javascript.
example (randomly created on the spot): Can you help me convert this to python in a fluent looking manner?
var even_set = [1,2,3,4,5]
    .filter(function(x){return x%2 === 0;})
    .map(function(x){
        console.log(x); // prints it for fun
        return x;
    })
    .reduce(function(num_set, val) {
        num_set[val] = true;
        return num_set; // reduce needs the accumulator back
    }, {});
I'd like to know if there are fluid options? Maybe a library.
In general, I've been using list comprehensions for most things but it's a real problem if I want to print
e.g., how can I print every even number between 1 and 5 in Python 2.x using a list comprehension? (In Python 3 print() is a function, but in Python 2 it isn't.) It's also a bit annoying that a list is constructed and returned. I'd rather just use a for loop.
Update: Here's yet another library/option, one that I adapted from a gist and that is available on PyPI as infixpy:
from infixpy import *
a = (Seq(range(1,51))
     .map(lambda x: x * 4)
     .filter(lambda x: x <= 170)
     .filter(lambda x: len(str(x)) == 2)
     .filter(lambda x: x % 20 == 0)
     .enumerate()
     .map(lambda x: 'Result[%d]=%s' % (x[0], x[1]))
     .mkstring(' .. '))
print(a)
pip3 install infixpy
Older
I am looking now at an answer that strikes closer to the heart of the question:
fluentpy https://pypi.org/project/fluentpy/ :
Here is the kind of method chaining for collections that a streams programmer (in scala, java, others) will appreciate:
import fluentpy as _
(
    _(range(1,50+1))
    .map(_.each * 4)
    .filter(_.each <= 170)
    .filter(lambda each: len(str(each))==2)
    .filter(lambda each: each % 20 == 0)
    .enumerate()
    .map(lambda each: 'Result[%d]=%s' %(each[0],each[1]))
    .join(',')
    .print()
)
And it works fine:
Result[0]=20,Result[1]=40,Result[2]=60,Result[3]=80
I am just now trying this out. It will be a very good day today if this were working as it is shown above.
Update: Look at this: maybe python can start to be more reasonable as one-line shell scripts:
python3 -m fluentpy "lib.sys.stdin.readlines().map(str.lower).map(print)"
Here it is in action on the command line:
$ echo -e "Hello World line1\nLine 2\nLine 3\nGoodbye" \
  | python3 -m fluentpy "lib.sys.stdin.readlines().map(str.lower).map(print)"
hello world line1
line 2
line 3
goodbye
There is an extra newline that should be cleaned up - but the gist of it is useful (to me anyways).
Generators, iterators, and itertools give added powers to chaining and filtering actions. But rather than remember (or look up) rarely used things, I gravitate toward helper functions and comprehensions.
For example in this case, take care of the logging with a helper function:
def echo(x):
    print(x)
    return x
Selecting even values is easy with the if clause of a comprehension. And since the final output is a dictionary, use that kind of comprehension:
In [118]: d={echo(x):True for x in s if x%2==0}
2
4
In [119]: d
Out[119]: {2: True, 4: True}
or to add these values to an existing dictionary, use update.
new_set.update({echo(x):True for x in s if x%2==0})
another way to write this is with an intermediate generator:
{y:True for y in (echo(x) for x in s if x%2==0)}
Or combine the echo and filter in one generator
def even(s):
    for x in s:
        if x%2==0:
            print(x)
            yield(x)
followed by a dict comp using it:
{y:True for y in even(s)}
Comprehensions are the fluent python way of handling filter/map operations.
Your code would be something like:
def evenize(input_list):
    return [x for x in input_list if x % 2 == 0]
Comprehensions don't work well with side effects like console logging, so do that in a separate loop. Chaining function calls isn't really that common an idiom in python. Don't expect that to be your bread and butter here. Python libraries tend to follow the "alter state or return a value, but not both" pattern. Some exceptions exist.
Edit: On the plus side, python provides several flavors of comprehensions, which are awesome:
List comprehension: [x for x in range(3)] == [0, 1, 2]
Set comprehension: {x for x in range(3)} == {0, 1, 2}
Dict comprehension: {x: x**2 for x in range(3)} == {0: 0, 1: 1, 2: 4}
Generator comprehension (or generator expression): (x for x in range(3)) == <generator object <genexpr> at 0x10fc7dfa0>
With the generator comprehension, nothing has been evaluated yet, so it is a great way to prevent blowing up memory usage when pipelining operations on large collections.
For instance, if you try to do the following, even with python3 semantics for range:
for number in [x**2 for x in range(10000000000000000)]:
    print(number)
you will get a memory error trying to build the initial list. On the other hand, change the list comprehension into a generator comprehension:
for number in (x**2 for x in range(10**20)):
    print(number)
and there is no memory issue (it just takes forever to run). What happens is that the range object gets built (it only stores the start, stop and step values: 0, 10**20, and 1), and then the for-loop begins iterating over the genexp object. Effectively, the for-loop calls
GENEXP_ITERATOR = iter(genexp)
number = next(GENEXP_ITERATOR)
# run the loop one time
number = next(GENEXP_ITERATOR)
# run the loop one time
# etc.
(Note the GENEXP_ITERATOR object is not visible at the code level)
next(GENEXP_ITERATOR) tries to pull the first value out of genexp, which then starts iterating on the range object, pulls out one value, squares it, and yields out the value as the first number. The next time the for-loop calls next(GENEXP_ITERATOR), the generator expression pulls out the second value from the range object, squares it and yields it out for the second pass on the for-loop. The first set of numbers are no longer held in memory.
This means that no matter how many items in the generator comprehension, the memory usage remains constant. You can pass the generator expression to other generator expressions, and create long pipelines that never consume large amounts of memory.
import os
import pathlib

def pipeline(filenames):
    # assumes Python 3.6+, where os.path.exists() and open() accept pathlib.Path
    basepath = pathlib.Path('/usr/share/stories')
    fullpaths = (basepath / fn for fn in filenames)
    realfiles = (fn for fn in fullpaths if os.path.exists(fn))
    openfiles = (open(fn) for fn in realfiles)
    def read_and_close(file):
        output = file.read(100)
        file.close()
        return output
    prefixes = (read_and_close(file) for file in openfiles)
    noncliches = (prefix for prefix in prefixes if not prefix.startswith('It was a dark and stormy night'))
    return {prefix[:32]: prefix for prefix in noncliches}
At any time, if you need a data structure for something, you can pass the generator comprehension to another comprehension type (as in the last line of this example), at which point, it will force the generators to evaluate all the data they have left, but unless you do that, the memory consumption will be limited to what happens in a single pass over the generators.
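A smaller, self-contained sketch of the same idea: the source below is infinite, yet only as many values are computed as the final step asks for. The names `squares`, `even_squares` and `first_five` are illustrative.

```python
from itertools import count, islice

# Nothing is computed when these two generator expressions are created.
squares = (x * x for x in count())            # conceptually infinite
even_squares = (s for s in squares if s % 2 == 0)

# Only here are values pulled through the pipeline, one at a time.
first_five = list(islice(even_squares, 5))
print(first_five)
```

Building a list from `squares` directly would never finish; `islice` caps how much of the pipeline is ever evaluated.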
The biggest dealbreaker for the code you wrote is that Python doesn't support multiline anonymous functions. The return value of filter or map is a list in Python 2 (and a lazy iterator in Python 3), so you can continue to chain them if you so desire. However, you'll either have to define the functions ahead of time or use a lambda.
Arguments against doing this notwithstanding, here is a translation into Python of your JS code.
from __future__ import print_function
from functools import reduce

def print_and_return(x):
    print(x)
    return x

def iseven(x):
    return x % 2 == 0

def add_to_dict(d, x):
    d[x] = True
    return d

even_set = list(reduce(add_to_dict,
                       map(print_and_return,
                           filter(iseven, [1, 2, 3, 4, 5])), {}))
It should work on both Python 2 and Python 3.
There's a library that already does exactly what you are looking for: fluent syntax, lazy evaluation, and an order of operations that matches the order in which it is written, as well as many other good things such as multiprocess or multithreaded Map/Reduce.
It's named pyxtension; it's production-ready and maintained on PyPI.
Your code would be rewritten in this form:
from pyxtension.streams import stream

def console_log(x):
    print(x)
    return x

even_set = stream([1,2,3,4,5])\
    .filter(lambda x: x % 2 == 0)\
    .map(console_log)\
    .reduce(lambda num_set, val: num_set.__setitem__(val, True) or num_set, {})
Replace map with mpmap for multiprocessed map, or fastmap for multithreaded map.
We can use Pyterator for this (disclaimer: I am the author).
We define the function that prints and returns (which I believe you can omit completely however).
def print_and_return(x):
    print(x)
    return x
then
from pyterator import iterate
even_dict = (
    iterate([1,2,3,4,5])
    .filter(lambda x: x%2==0)
    .map(print_and_return)
    .map(lambda x: (x, True))
    .to_dict()
)
# {2: True, 4: True}
where I have converted your reduce into a sequence of tuples that can be converted into a dictionary.
Is there an equivalent of startswith for lists in Python?
I would like to know if a list a starts with a list b, like
len(a) >= len(b) and a[:len(b)] == b ?
You can just write a[:len(b)] == b
if len(b) > len(a), no error will be raised.
For large lists, this will be more efficient:
from itertools import izip
...
result = len(a) >= len(b) and all(x==y for (x, y) in izip(a, b))
For small lists, your code is fine. The length check can be omitted, as DavidK said, but it would not make a big difference.
PS: No, there's no built-in to check if a list starts with another list, but as you already know, it's trivial to write such a function yourself.
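Such a function could be sketched as follows for Python 3, where the built-in zip is already lazy. The name `starts_with` is my own. The length check matters here, because zip stops at the shorter input and would otherwise report a match when b is longer than a.

```python
def starts_with(a, b):
    """Return True if sequence a begins with the items of sequence b."""
    # zip() stops at the shorter input, so the length must be checked first.
    return len(a) >= len(b) and all(x == y for x, y in zip(a, b))


print(starts_with([1, 2, 3], [1, 2]))
print(starts_with([1], [1, 2]))
```

Unlike `a[:len(b)] == b`, this version does not copy a prefix of a, and it stops at the first mismatch.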
It does not get much simpler than what you have (and the check on the lengths is not even needed)...
For an overview of more extended/elegant options for finding sublists in lists, you can check out the main answer to this post: elegant find sub-list in list
I have a loop of the following type:
a = range(10)
b = [something]
for i in range(len(a)-1):
    b.append(someFunction(b[-1], a[i], a[i+1]))
However, the for-loop is killing a lot of performance. I have tried to write a window generator that gives me 2 elements each time, but it still requires an explicit for-loop in the end. Is there a way to make this shorter and more efficient in a pythonic way?
Thanks
edit: I forgot the element in b.. sorry guys. However the solution to my previous problem is very helpful in other problem I have too. Thanks.
Consider this
def make_b( a, seed ):
    yield seed
    for x, y in zip( a[:-1], a[1:] ):
        seed= someFunction( seed, x, y )
        yield seed
Which lets you do this
a = range(10)
b= list(make_b(a,something))
Note that you can often use this:
b = make_b(a, something)
Instead of actually creating b as a list. Using b as a generator saves you considerable storage (and some time) because you may not really need a list object in the first place. Often, you only need something iterable.
Similarly for a. It does not have to be a list, merely something iterable -- like a generator function with a yield statement.
For your initially stated problem of mapping a function over pairs of an input sequence the following will work, and is about as efficient as it gets while staying in Python land.
from itertools import tee
a = range(10)
a1, a2 = tee(a)
next(a2)
b = map(someFunction, a1, a2)
As for the expanded problem where you need to access the result of the previous iteration: this kind of inner state is present in the functional concept unfold. But Python doesn't include an unfold construct, and for a good reason: for loops are more readable in this case and most likely faster too. As for making it more Pythonic, I suggest lifting the pairwise iteration out into a function and creating an explicit loop variable.
from itertools import izip, tee

def pairwise(seq):
    a, b = tee(seq)
    next(b)
    return izip(a, b)

def unfold_over_pairwise(unfolder, seq, initial):
    state = initial
    for cur_item, next_item in pairwise(seq):
        state = unfolder(state, cur_item, next_item)
        yield state

b = [something]
b.extend(unfold_over_pairwise(someFunction, a, initial=b[-1]))
If the looping overhead really is a problem, then someFunction must be something really simple. In that case it probably is best to write the whole loop in a faster language, such as C.
Some loop or other will always be around, but one possibility that might reduce overhead is:
import itertools
def generate(a, item):
    a1, a2 = itertools.tee(a)
    next(a2)
    for x1, x2 in itertools.izip(a1, a2):
        item = someFunction(item, x1, x2)
        yield item
to be used as:
b.extend(generate(a, b[-1]))
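On Python 3.8 or later, the same running-state loop can also be sketched with itertools.accumulate, whose `initial` argument plays the role of the seed element. Here `some_function` is a hypothetical stand-in for someFunction, and `seed` stands in for the first element of b.

```python
from itertools import accumulate


def some_function(prev, x1, x2):
    # Hypothetical stand-in for someFunction from the question.
    return prev + x1 * x2


a = list(range(5))
seed = 10

# Each step receives the previous result plus one (a[i], a[i+1]) pair.
b = list(accumulate(zip(a, a[1:]),
                    lambda state, pair: some_function(state, *pair),
                    initial=seed))
print(b)
```

The loop still happens, but inside accumulate rather than in Python-level append calls; whether that is measurably faster depends on how cheap someFunction is.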
Try something like this:
a = range(10)
b = [something]
s = len(b)
b+= [0] * (len(a) - 1)
[ b.__setitem__(i, someFunction(b[i-1], a[i-s], a[i-s+1])) for i in range(s, len(b))]
Also:
using functions from itertools should also be useful (see earlier posts)
maybe you can rewrite someFunction and use map instead of a list comprehension