Add an arbitrary element to an xrange()? - python

In Python 2, it's more memory-efficient to use xrange() instead of range() when iterating.
The trouble I'm having is that I want to iterate over a large range of numbers, large enough that I need xrange(), and after that I want to check one additional arbitrary element.
With range(), it's easy: x = range(...) + [arbitrary element].
But with xrange(), there doesn't seem to be a cleaner solution than this:
for i in xrange(...):
    if foo(i):
        ...
if foo(arbitrary element):
    ...
Any suggestions for cleaner solutions? Is there a way to "append" an arbitrary element to a generator?

itertools.chain lets you make a combined iterator from multiple iterables without concatenating them (so no expensive temporaries):
from itertools import chain
# Must wrap arbitrary element in one-element tuple (or list)
for i in chain(xrange(...), (arbitrary_element,)):
    if foo(i):
        ...

I would recommend keeping the arbitrary_element check out of the loop, but if you want to make it part of the loop, you can use itertools.chain:
import itertools
for i in itertools.chain(xrange(...), [arbitrary_element]):
    ...
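A minimal runnable sketch of the chain approach, assuming Python 3 (where range() is already lazy, as xrange() is in Python 2). The predicate foo, the range bound, and the extra value 99 are hypothetical stand-ins for the parts elided in the question:

```python
from itertools import chain


def foo(i):
    # Hypothetical predicate standing in for the question's foo().
    return i % 2 == 1


extra = 99  # hypothetical "arbitrary element"

# chain() yields the range values lazily, then the single extra value;
# no intermediate list of the range is ever built.
matches = [i for i in chain(range(6), (extra,)) if foo(i)]
print(matches)  # [1, 3, 5, 99]
```

The extra value has to be wrapped in a tuple or list because chain() expects iterables, not bare elements.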

Related

Convert for loop into list comprehension with assignment?

I am trying to convert a for loop with an assignment into a list comprehension.
More precisely I am trying to only replace one element from a list with three indexes.
Can it be done?
for i in range(len(data)):
    data[i][0] = data[i][0].replace('+00:00','Z').replace(' ','T')
If you really, really want to convert it to a list comprehension, you could try something like this, assuming the sub-lists have three elements, as you stated in the question:
new_data = [[a.replace('+00:00','Z').replace(' ','T'), b, c] for (a, b, c) in data]
Note that this does not modify the existing list; it creates a new one. In this case, though, I'd just stick with a regular for loop, which conveys much better what you are actually doing. Instead of iterating over the indices, you could iterate over the elements directly:
for x in data:
    x[0] = x[0].replace('+00:00','Z').replace(' ','T')
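To make the difference between the two approaches concrete, here is a small sketch. The sample rows and the helper name to_iso are made up for illustration; the comprehension builds new sub-lists, while the loop mutates the existing ones:

```python
# Hypothetical sample rows: a timestamp string followed by two other values.
data = [["2020-01-01 10:00:00+00:00", 1, 2],
        ["2020-01-02 11:30:00+00:00", 3, 4]]


def to_iso(ts):
    # Same two replacements as in the question.
    return ts.replace('+00:00', 'Z').replace(' ', 'T')


# The comprehension creates a new list of new sub-lists...
new_data = [[to_iso(a), b, c] for (a, b, c) in data]
untouched = data[0][0]  # ...so the original timestamp is still unchanged here.

# The plain loop changes the existing sub-lists in place.
for row in data:
    row[0] = to_iso(row[0])

print(new_data[0][0])  # 2020-01-01T10:00:00Z
```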
I believe it could be done, but that's not the best way to do it.
First, you would create a line with a high Jones complexity for anyone else reading your code.
Second, you would exceed the preferred line length of about 80 characters, which again makes the code harder to read.
Third, list comprehensions are meant to build and return new lists, whereas here you are changing your original list. That is not good practice either.
List comprehensions are useful for building lists, so one is not recommended here. Still, you can try this simple solution, which prints only the converted first element of each sub-list:
print([ele[0].replace('+00:00','Z').replace(' ','T') for ele in data])
I don't recommend using a list comprehension in this case, but if you really want to use one, here is an example.
It can handle sub-lists of different lengths, if you need that.
code:
data = [["1 +00:00",""],["2 +00:00","",""],["3 +00:00"]]
print([[i[0].replace('+00:00','Z').replace(' ','T'),*i[1:]] for i in data])
result:
[['1TZ', ''], ['2TZ', '', ''], ['3TZ']]

Itertools Python

Is there any way to use the itertools product function so that it returns each combination of the lists step by step?
For example:
itertools.product(*mylist)
-> the solution should return the first combination of the lists, then the second one, and so on.
As @ggorlen has explained, itertools.product(...) returns an iterator. For example, if you have
import itertools
mylist = [('Hello','Hi'),('Andy','Betty')]
iterator = itertools.product(*mylist)
next(iterator) or iterator.__next__() will evaluate to the tuple ('Hello', 'Andy') the first time you call it. Subsequent calls return ('Hello', 'Betty'), then ('Hi', 'Andy'), and finally ('Hi', 'Betty'); any call after that raises StopIteration.
You can also convert an iterator into a list with list(iterator), if you are more comfortable with a list, but if you just need the first few values and mylist is big, this would be really inefficient, and it might be worth the time familiarising yourself with iterators.
Do consider whether you are really just iterating through iterator though. If so, just use
for combination in iterator:
    pass # Body of loop
Even if you just need the first n elements and n is large you can use
for combination in itertools.islice(iterator, n):
    pass # Body of loop
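Putting those pieces together, a short sketch of stepping through the product one combination at a time (the variable names first, second, rest, and done are only for illustration):

```python
import itertools

mylist = [('Hello', 'Hi'), ('Andy', 'Betty')]
iterator = itertools.product(*mylist)

# Each call to next() produces exactly one combination, as a tuple.
first = next(iterator)   # ('Hello', 'Andy')
second = next(iterator)  # ('Hello', 'Betty')

# islice() takes at most 5 more items; only 2 remain, so it stops early.
rest = list(itertools.islice(iterator, 5))

# Passing a default to next() avoids StopIteration once the iterator is empty.
done = next(iterator, None)

print(first, second, rest, done)
```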

How to replace using for loop in range(len) by using enumerate

Is there a simple way to use enumerate instead of for loop with range(len)? For example, here I loop to replace all values of each element in subarrays by the index of its subarray.
list = []
for i in range(len(nparray)):
    j = [i]*(len(nparray[i]))
    list.append(j)
My nparray is np.array with 6 subarrays, and each subarray has different size.
enumerate won't replace the use of for, just make it arguably nicer. You can use list comprehension however:
[[i]*len(x) for i,x in enumerate(nparray)]
Also, avoid using list as a variable name, since it's already the name of a builtin.
To use enumerate, you first need two target variables, because enumerate yields (index, value) tuples.
Using your example in a list comprehension, it could look like this:
listR = [[idx]*(len(val)) for idx,val in enumerate(multiarray)]
If you want to dig deeper, see https://docs.python.org/2/library/functions.html#enumerate
I hope this helps you.
Regards
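A small sketch comparing the two forms. It assumes a plain ragged list of lists (the name nested and its contents are made up) standing in for the NumPy array from the question:

```python
# Hypothetical ragged data standing in for the question's nparray.
nested = [[10, 20, 30], [40], [50, 60]]

# Loop form: enumerate() supplies the index and the sub-list together,
# so range(len(...)) and nested[i] are no longer needed.
result = []
for i, sub in enumerate(nested):
    result.append([i] * len(sub))

# Equivalent list comprehension.
same = [[i] * len(sub) for i, sub in enumerate(nested)]

print(result)  # [[0, 0, 0], [1], [2, 2]]
```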

Are these two generator expressions doing the same thing?

Assuming that there is a list to work on, I am not sure whether these two lines of code have the same return values:
sum(lst[i] for i in lst[:-1] if lst[i]<0)
sum(lst[i] for i in range(len(lst)-1) if lst[i]<0)
Furthermore, could I have replaced sum(lst[i]... with sum(i... and still get the exact same result?
In the first you are looping over the elements of lst; do not use those values as an index. Instead, just use the values directly:
sum(elem for elem in lst[:-1] if elem < 0)
I renamed i to elem to make this clearer; now it is the equivalent of your second version, where you use indices generated by range().
When you already have a sequence, and need to iterate over the values, there rarely is a need to use range() to produce indices instead.
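To see that the two lines from the question really can differ, here is a sketch with a made-up list. The first version treats the values as indices, so it picks up elements the second version never looks at:

```python
lst = [1, -2, 3, -1]  # hypothetical sample data

# Values 1, -2, 3 are (mis)used as indices: lst[1], lst[-2], lst[3].
by_value_as_index = sum(lst[i] for i in lst[:-1] if lst[i] < 0)

# Proper indices 0, 1, 2.
by_index = sum(lst[i] for i in range(len(lst) - 1) if lst[i] < 0)

# Iterating over the values directly, as suggested above.
by_value = sum(elem for elem in lst[:-1] if elem < 0)

print(by_value_as_index, by_index, by_value)  # -3 -2 -2
```

With other data the first version could also raise an IndexError, since nothing guarantees the values are valid indices.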

Why does len() not support iterators?

Many of Python's built-in functions (any(), all(), sum() to name some) take iterables but why does len() not?
One could always use sum(1 for i in iterable) as an equivalent, but why is it len() does not take iterables in the first place?
Many iterables are produced by generators, which don't have a well-defined length. Take the following, which iterates forever:
def sequence(i=0):
    while True:
        i+=1
        yield i
Basically, to have a well defined length, you need to know the entire object up front. Contrast that to a function like sum. You don't need to know the entire object at once to sum it -- Just take one element at a time and add it to what you've already summed.
Be careful with idioms like sum(1 for i in iterable): often this will just exhaust the iterable so you can't use it anymore. It could also be slow if producing each element involves a lot of computation. It might be worth asking yourself why you need to know the length a priori. This might give you some insight into what type of data structure to use (frequently list and tuple work just fine), or you may be able to perform your operation without needing to call len.
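A short sketch of that pitfall, using a made-up generator expression named numbers:

```python
# A one-shot iterable: a generator expression over a small range.
numbers = (n * n for n in range(5))

# Counting works, but it walks through every element to do so...
count = sum(1 for _ in numbers)

# ...which leaves nothing behind for later use.
leftover = list(numbers)

print(count, leftover)  # 5 []
```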
This is an iterable:
def forever():
    while True:
        yield 1
Yet, it has no length. If you want to find the length of a finite iterable, the only way to do so, by definition of what an iterable is (something you can repeatedly call to get the next element until you reach the end) is to expand the iterable out fully, e.g.:
len(list(the_iterable))
As mgilson pointed out, you might want to ask yourself - why do you want to know the length of a particular iterable? Feel free to comment and I'll add a specific example.
If you want to keep track of how many elements you have processed, instead of doing:
num_elements = len(the_iterable)
for element in the_iterable:
    ...
do:
num_elements = 0
for element in the_iterable:
    num_elements += 1
    ...
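The same counting can also be sketched with enumerate(), which keeps the counter update out of the loop body. The generator yield_numbers and the list seen below are illustrative only:

```python
def yield_numbers():
    # A small one-shot iterable used for illustration.
    for n in (1, 2, 3, 5, 7):
        yield n


seen = []
num_elements = 0  # stays 0 if the iterable turns out to be empty
for num_elements, element in enumerate(yield_numbers(), start=1):
    seen.append(element)  # process each element exactly once

print(num_elements, seen)  # 5 [1, 2, 3, 5, 7]
```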
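If you want a memory-efficient way of seeing how many elements end up being in a comprehension, you might be tempted to write something like this (which raises a TypeError, because a generator expression has no len()):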
num_relevant = len(x for x in xrange(100000) if x%14==0)
It wouldn't be efficient to do this (you don't need the whole list):
num_relevant = len([x for x in xrange(100000) if x%14==0])
sum would probably be the most handy way, but it looks quite weird and it isn't immediately clear what you're doing:
num_relevant = sum(1 for _ in (x for x in xrange(100000) if x%14==0))
So, you should probably write your own function:
def exhaustive_len(iterable):
    length = 0
    for _ in iterable:
        length += 1
    return length
exhaustive_len(x for x in xrange(100000) if x%14==0)
The long name is to help remind you that it does consume the iterable, for example, this won't work as you might think:
def yield_numbers():
    yield 1; yield 2; yield 3; yield 5; yield 7
the_nums = yield_numbers()
total_nums = exhaustive_len(the_nums)
for num in the_nums:
    print num
because exhaustive_len has already consumed all the elements.
EDIT: Ah in that case you would use exhaustive_len(open("file.txt")), as you have to process all lines in the file one-by-one to see how many there are, and it would be wasteful to store the entire file in memory by calling list.
