Suppose I want the element at index 0, the elements at indices 3 through 200, and every third element from index 201 through the end, from a list in Python.
One way to do it is with distinct indexing and concatenation:
new_list = old_list[0:1] + old_list[3:201] + old_list[201::3]
Is there a way to do this with just one index on old_list? I would like something like the following (I know this doesn't syntactically work since list indices cannot be lists and since Python unfortunately doesn't have slice literals; I'm just looking for something close):
new_list = old_list[[0, 3:201, 201::3]]
I can achieve some of this by switching to NumPy arrays, but I'm more interested in how to do it for native Python lists. I could also create a slice maker or something like that, and possibly strong-arm it into giving me a single slice object representing the composition of all my desired slices.
But I'm looking for something that doesn't involve creating a new class to manage the slices. I want to just concatenate the slice syntax, feed that to my list, and have the list understand that it means to take the slices separately and concatenate their respective results at the end.
A slice maker object (e.g. SliceMaker from your other question, or np.s_) can accept multiple comma-separated slices; they are received as a tuple of slices or other objects:
from numpy import s_
s_[0, 3:5, 6::3]
Out[1]: (0, slice(3, 5, None), slice(6, None, 3))
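If you'd rather not depend on NumPy just for s_, a minimal pure-Python stand-in is easy to sketch (the class name is illustrative):
class SliceMaker:
    def __getitem__(self, item):
        # indexing with commas delivers a tuple of slices/ints to __getitem__
        return item

make_slice = SliceMaker()
make_slice[0, 3:5, 6::3]  # (0, slice(3, 5, None), slice(6, None, 3))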
NumPy uses this for multidimensional arrays, but you can use it for slice concatenation:
def xslice(arr, slices):
    if isinstance(slices, tuple):
        return sum((arr[s] if isinstance(s, slice) else [arr[s]] for s in slices), [])
    elif isinstance(slices, slice):
        return arr[slices]
    else:
        return [arr[slices]]
xslice(list(range(10)), s_[0, 3:5, 6::3])
Out[1]: [0, 3, 4, 6, 9]
xslice(list(range(10)), s_[1])
Out[2]: [1]
xslice(list(range(10)), s_[:])
Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
import numpy as np
a = list(range(15, 50, 3))
# %%timeit -n 10000 -> 41.1 µs ± 1.71 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
[a[index] for index in np.r_[1:3, 5:7, 9:11]]
---
[18, 21, 30, 33, 42, 45]
import numpy as np
a = np.arange(15, 50, 3).astype(np.int32)
# %%timeit -n 10000 -> 31.9 µs ± 5.68 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
a[np.r_[1:3, 5:7, 9:11]]
---
array([18, 21, 30, 33, 42, 45], dtype=int32)
import numpy as np
a = np.arange(15, 50, 3).astype(np.int32)
# %%timeit -n 10000 -> 7.17 µs ± 1.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
slices = np.s_[1:3, 5:7, 9:11]
np.concatenate([a[_slice] for _slice in slices])
---
array([18, 21, 30, 33, 42, 45], dtype=int32)
It seems that using NumPy is the faster way.
Adding the NumPy handling to xslice from ecatmur's answer:
import numpy as np
def xslice(x, slices):
    """Extract slices from array-like

    Args:
        x: array-like
        slices: slice or tuple of slice objects
    """
    if isinstance(slices, tuple):
        if isinstance(x, np.ndarray):
            return np.concatenate([x[_slice] for _slice in slices])
        else:
            return sum((x[s] if isinstance(s, slice) else [x[s]] for s in slices), [])
    elif isinstance(slices, slice):
        return x[slices]
    else:
        return [x[slices]]
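For example, a quick check of both branches (note that the ndarray branch expects actual slice objects; a bare integer there would produce a 0-d array, which np.concatenate rejects):
x = np.arange(10)
xslice(x, np.s_[0:1, 3:5, 6::3])                # array([0, 3, 4, 6, 9])
xslice(list(range(10)), np.s_[0:1, 3:5, 6::3])  # [0, 3, 4, 6, 9]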
You're probably better off writing your own sequence type.
>>> import operator
>>> L = list(range(20))
>>> L
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> operator.itemgetter(*(list(range(1, 5)) + list(range(10, 18, 3))))(L)
(1, 2, 3, 4, 10, 13, 16)
And to get you started on that:
>>> operator.itemgetter(*(list(range(*slice(1, 5).indices(len(L)))) + list(range(*slice(10, 18, 3).indices(len(L))))))(L)
(1, 2, 3, 4, 10, 13, 16)
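A sketch of how that could be wrapped into a reusable helper (multi_slice is an illustrative name, not a library function):
import operator

def multi_slice(seq, *slices):
    # Resolve each slice against the sequence length, then fetch every
    # selected index in a single itemgetter call.
    indices = [i for s in slices for i in range(*s.indices(len(seq)))]
    return operator.itemgetter(*indices)(seq)

L = list(range(20))
multi_slice(L, slice(1, 5), slice(10, 18, 3))  # (1, 2, 3, 4, 10, 13, 16)
Keep in mind that itemgetter returns a bare item, not a 1-tuple, when only one index ends up selected.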
Not sure if this is "better", but it works, so why not...
[y for x in [old_list[slice(*a)] for a in ((0,1),(3,201),(201,None,3))] for y in x]
It's probably slow (especially compared to chain), but it's basic Python (3.5.2 used for testing).
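For comparison, a chain-based equivalent of the same idea (a sketch using itertools):
from itertools import chain

old_list = list(range(300))  # example data
new_list = list(chain.from_iterable(
    old_list[slice(*a)] for a in ((0, 1), (3, 201), (201, None, 3))
))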
Why don't you create a custom slicing function for your purpose?
>>> from itertools import chain, islice
>>> it = range(50)
>>> def cslice(iterable, *selectors):
...     return chain(*(islice(iterable, *s) for s in selectors))
...
>>> list(cslice(it,(1,5),(10,15),(25,None,3)))
[1, 2, 3, 4, 10, 11, 12, 13, 14, 25, 28, 31, 34, 37, 40, 43, 46, 49]
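Note that itertools.islice does not support negative start/stop/step values, so this cslice cannot express negative indices or reversed slices.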
You could extend list to allow multiple slices and indices:
class MultindexList(list):
    def __getitem__(self, key):
        if type(key) is tuple or type(key) is list:
            r = []
            for index in key:
                item = super().__getitem__(index)
                if type(index) is slice:
                    r += item
                else:
                    r.append(item)
            return r
        else:
            return super().__getitem__(key)

a = MultindexList(range(10))
print(a[1:3])          # [1, 2]
print(a[[1, 2]])       # [1, 2]
print(a[1, 1:3, 4:6])  # [1, 1, 2, 4, 5]
Related
There are some questions that come close, but I haven't found a specific answer to this. I'm trying to do some in-place sorting of a numpy 3D array along a given axis. I don't want simple sorting, though; I want to reorder the array according to my own index. For example:
a = np.random.rand(3, 3, 3)
and let's say I want to resort the last dimension according to the following indices of the old array:
new_order = [1,2,0]
I would expect to be able to say:
a[:,:,new_order] = a
but this does not behave as expected. Suggestions?
np.ndarray.sort is the only sort that claims to be inplace, and it does not give you much control.
Placing the order index on the left works, but it can give unpredictable results. Evidently it does some sort of sequential assignment, so an assignment made early on can change values that are read from the right-hand side later.
In [719]: a=np.arange(12).reshape(3,4)
In [720]: a[:,[0,1,3,2]]=a
In [721]: a
Out[721]:
array([[ 0, 1, 2, 2],
[ 4, 5, 6, 6],
[ 8, 9, 10, 10]])
To do this sort of assignment predictably requires some sort of buffering.
In [728]: a[:,[0,1,3,2]]=a.copy()
In [729]: a
Out[729]:
array([[ 0, 1, 3, 2],
[ 4, 5, 7, 6],
[ 8, 9, 11, 10]])
Indexing on the right gets around this, but it is not in-place; the variable a now points to a new object.
In [731]: a=a[:,[0,1,3,2]]
In [732]: a
Out[732]:
array([[ 0, 1, 3, 2],
[ 4, 5, 7, 6],
[ 8, 9, 11, 10]])
However assignment with [:] may solve this:
In [738]: a=np.arange(12).reshape(3,4)
In [739]: a.__array_interface__
Out[739]:
{'data': (181868592, False), # 181... is the id of the data buffer
'descr': [('', '<i4')],
'shape': (3, 4),
'strides': None,
'typestr': '<i4',
'version': 3}
In [740]: a[:]=a[:,[0,1,3,2]]
In [741]: a.__array_interface__
Out[741]:
{'data': (181868592, False), # same data buffer
'descr': [('', '<i4')],
'shape': (3, 4),
'strides': None,
'typestr': '<i4',
'version': 3}
In [742]: a
Out[742]:
array([[ 0, 1, 3, 2],
[ 4, 5, 7, 6],
[ 8, 9, 11, 10]])
The fact that the data buffer address is the same indicates that this was an in-place action. But it would be good to test this with other indexing to make sure it does what you want.
But, is 'inplace' sorting necessary? If the array is very large it might be needed to avoid memory errors. But we'd have to test the alternatives to see if they work.
In-place also matters if some other variable uses the same data. For example:
b = a.T  # a transpose, a view on the same buffer
With a[:] = ..., the rows of b are reordered as well; a and b continue to share the same data. With a = ..., b is unchanged, and a and b are now decoupled.
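A minimal sketch of that coupling (values are illustrative):
import numpy as np

a = np.arange(12).reshape(3, 4)
b = a.T                      # a view sharing a's data buffer
a[:] = a[:, [0, 1, 3, 2]]    # in-place: b sees the reordered data too
print(b.base is a)           # True: still coupled
a = a[:, [0, 1, 3, 2]]       # rebinds a to a brand-new array
print(b.base is a)           # False: b still views the old buffer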
Unfortunately, numpy does not have a built-in solution for this. The only way is either to use some clever assignments or to write your own custom method.
Using cycle detection, a set for remembering visited indices, and an auxiliary copy for caching values along the axis, I wrote a custom method that should be useful for reordering large ndarrays:
import numpy as np

def put_at(index, axis=-1, slc=(slice(None),)):
    """Build the numpy indexer tuple that addresses `index` along `axis`."""
    # e.g. put_at(2, axis=0) -> (2,); put_at(2, axis=1) -> (slice(None), 2)
    return (axis < 0) * (Ellipsis,) + axis * slc + (index,) + (-1 - axis) * slc

def reorder_inplace(array, new_order, axis=0):
    """
    Reindex (reorder) the array along an axis.

    :param array: The array to reindex.
    :param new_order: A list with the new index order. Must be a valid permutation.
    :param axis: The axis to reindex.
    """
    if np.size(array, axis=axis) != len(new_order):
        raise ValueError(
            'The new order did not match the indexed array along dimension {0}; '
            'the dimension is {1} but the length of the new order is {2}'.format(
                axis, np.size(array, axis=axis), len(new_order)
            )
        )
    visited = set()
    for index, source in enumerate(new_order):
        if index not in visited and index != source:
            # Walk the permutation cycle starting at `index`, caching the
            # values that the cycle will eventually overwrite.
            initial_values = np.take(array, index, axis=axis).copy()
            destination = index
            visited.add(destination)
            while source != index:
                if source in visited:
                    raise IndexError(
                        'The new order is not unique; '
                        'duplicate found at position {0} with value {1}'.format(
                            destination, source
                        )
                    )
                array[put_at(destination, axis=axis)] = array.take(source, axis=axis)
                destination = source
                source = new_order[destination]
                visited.add(destination)
            array[put_at(destination, axis=axis)] = initial_values
Example:
In[4]: a = np.arange(15).reshape(3, 5)
In[5]: a
Out[5]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
Reorder on axis 0:
In[6]: reorder_inplace(a, [2, 0, 1], axis=0)
In[7]: a
Out[7]:
array([[10, 11, 12, 13, 14],
[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9]])
Reorder on axis 1:
In[10]: reorder_inplace(a, [3, 2, 0, 4, 1], axis=1)
In[11]: a
Out[11]:
array([[ 3, 2, 0, 4, 1],
[ 8, 7, 5, 9, 6],
       [13, 12, 10, 14, 11]])
Timing and memory for small array of 1000 x 1000
In[5]: a = np.arange(1000 * 1000).reshape(1000, 1000)
In[6]: %timeit reorder_inplace(a, np.random.permutation(1000))
8.19 ms ± 18.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In[7]: %memit reorder_inplace(a, np.random.permutation(1000))
peak memory: 81.75 MiB, increment: 0.49 MiB
In[8]: %timeit a[:] = a[np.random.permutation(1000), :]
3.27 ms ± 9.49 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In[9]: %memit a[:] = a[np.random.permutation(1000), :]
peak memory: 89.56 MiB, increment: 0.01 MiB
For a small array, the memory consumption is not very different, but the numpy version is much faster.
Timing and memory for 20000 x 20000
In[5]: a = np.arange(20000 * 20000).reshape(20000, 20000)
In[6]: %timeit reorder_inplace(a, np.random.permutation(20000))
1.16 s ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In[7]: %memit reorder_inplace(a, np.random.permutation(20000))
peak memory: 3130.77 MiB, increment: 0.19 MiB
In[8]: %timeit a[:] = a[np.random.permutation(20000), :]
1.84 s ± 2.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In[9]: %memit a[:] = a[np.random.permutation(20000), :]
peak memory: 6182.80 MiB, increment: 3051.76 MiB
When the size of the array increases by a notch, the numpy version becomes much slower, and its memory consumption is very high. The custom in-place reordering uses a negligible amount.
Here you are,
a = a[:, :, new_order]
Also, here is a 'NumPy for MATLAB users' page that I found useful when I was getting started:
http://mathesaurus.sourceforge.net/matlab-numpy.html
Does the range function allow concatenation? I.e., I want to make a range(30) and concatenate it with range(2000, 5002), so my concatenated range would be 0, 1, 2, ..., 29, 2000, 2001, ..., 5001.
Code like this does not work on my latest Python (version 3.3.0):
range(30) + range(2000, 5002)
You can use itertools.chain for this:
from itertools import chain
concatenated = chain(range(30), range(2000, 5002))
for i in concatenated:
...
It works for arbitrary iterables. Note that there's a difference in the behavior of range() between Python 2 and 3 that you should know about: in Python 2 range returns a list, while in Python 3 it returns a lazy range object, which is memory-efficient but not always what you want.
Lists can be concatenated with +; range objects and iterators cannot.
I like the simplest solutions possible (including efficiency), though it is not always clear whether a given solution is such. Anyway, range() in Python 3 returns a lazy sequence, which you can wrap in any construct that iterates. list() can construct a list from any iterable, and the + operator on lists does concatenation. I am using smaller values in the example:
>>> list(range(5))
[0, 1, 2, 3, 4]
>>> list(range(10, 20))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> list(range(5)) + list(range(10,20))
[0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
This is exactly what range(5) + range(10, 20) did in Python 2 -- because range() returned a list.
In Python 3, this is only useful if you really want to construct the list. Otherwise, I recommend Lev Levitsky's solution with itertools.chain. The documentation also shows a very straightforward implementation:
def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
The solution by Inbar Rose is fine and functionally equivalent. Anyway, my +1 goes to Lev Levitsky and to his argument about using the standard libraries. From The Zen of Python...
In the face of ambiguity, refuse the temptation to guess.
#!python3
import timeit
number = 10000
t = timeit.timeit('''\
for i in itertools.chain(range(30), range(2000, 5002)):
    pass
''',
'import itertools', number=number)
print('itertools:', t/number * 1000000, 'microsec/one execution')

t = timeit.timeit('''\
for x in (i for j in (range(30), range(2000, 5002)) for i in j):
    pass
''', number=number)
print('generator expression:', t/number * 1000000, 'microsec/one execution')
In my opinion, itertools.chain is more readable. But what really is important...
itertools: 264.4522138986938 microsec/one execution
generator expression: 785.3081048010291 microsec/one execution
... it is about 3 times faster.
python >= 3.5
You can use iterable unpacking in lists (see PEP 448: Additional Unpacking Generalizations).
If you need a list,
[*range(2, 5), *range(3, 7)]
# [2, 3, 4, 3, 4, 5, 6]
This preserves order and does not remove duplicates. Or, you might want a tuple,
(*range(2, 5), *range(3, 7))
# (2, 3, 4, 3, 4, 5, 6)
... or a set,
# note that this drops duplicates
{*range(2, 5), *range(3, 7)}
# {2, 3, 4, 5, 6}
It also happens to be faster than calling itertools.chain.
from itertools import chain
%timeit list(chain(range(10000), range(5000, 20000)))
%timeit [*range(10000), *range(5000, 20000)]
738 µs ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
665 µs ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The benefit of chain, however, is that you can pass an arbitrary list of ranges.
ranges = [range(2, 5), range(3, 7), ...]
flat = list(chain.from_iterable(ranges))
OTOH, unpacking generalisations haven't been "generalised" to arbitrary sequences, so you will still need to unpack the individual ranges yourself.
This can be done using a list comprehension.
>>> [i for j in (range(10), range(15, 20)) for i in j]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 16, 17, 18, 19]
This works for your request; a fully general version would be a long answer, so I will not post it here.
note: can be made into a generator for increased performance:
for x in (i for j in (range(30), range(2000, 5002)) for i in j):
# code
or even into a generator variable.
gen = (i for j in (range(30), range(2000, 5002)) for i in j)
for x in gen:
# code
With the help of the extend method, we can concatenate two lists.
>>> a = list(range(1,10))
>>> a.extend(range(100,105))
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 100, 101, 102, 103, 104]
range() in Python 2.x returns a list:
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
xrange() in Python 2.x returns a lazy xrange object, not a list:
>>> xrange(10)
xrange(10)
And in Python 3, range() also returns a lazy range object, from which you can get an iterator:
>>> r = range(10)
>>> iterator = r.__iter__()
>>> iterator.__next__()
0
>>> iterator.__next__()
1
>>> iterator.__next__()
2
So it is clear that you cannot concatenate iterators other than by using chain(), as pointed out in the other answers.
You can wrap each range call in list() to build lists and concatenate those:
list(range(3,7))+list(range(2,9))
I came to this question because I was trying to concatenate an unknown number of ranges, that might overlap, and didn't want repeated values in the final iterator. My solution was to use set and the union operator like so:
range1 = range(1,4)
range2 = range(2,6)
concatenated = set.union(set(range1), set(range2))
for i in concatenated:
print(i)
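Note that a set is unordered, so if the order of the result matters you still need to sort it, e.g.:
merged = sorted(set(range(1, 4)) | set(range(2, 6)))
print(merged)  # [1, 2, 3, 4, 5]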
So, I simply want to make this faster:
for x in range(matrix.shape[0]):
    for y in range(matrix.shape[1]):
        if matrix[x][y] == 2 or matrix[x][y] == 3 or matrix[x][y] == 4 or matrix[x][y] == 5 or matrix[x][y] == 6:
            if x not in heights:
                heights.append(x)
I simply iterate over a 2D matrix (usually around 18x18 or 22x22) and collect the row indices x where a value from 2 to 6 appears. But it's kinda slow, and I wonder what the fastest way to do this is.
Thank you very much!
For a NumPy-based approach, you can do:
np.flatnonzero(((a>=2) & (a<=6)).any(1))
# array([1, 2, 6], dtype=int64)
Where:
a = np.random.randint(0,30,(7,7))
print(a)
array([[25, 27, 28, 21, 18, 7, 26],
[ 2, 18, 21, 13, 27, 26, 2],
[23, 27, 18, 7, 4, 6, 13],
[25, 20, 19, 15, 8, 22, 0],
[27, 23, 18, 22, 25, 17, 15],
[19, 12, 12, 9, 29, 23, 21],
[16, 27, 22, 23, 8, 3, 11]])
Timings on a larger array:
a = np.random.randint(0,30, (1000,1000))
%%timeit
heights = []
for x in range(a.shape[0]):
    for y in range(a.shape[1]):
        if a[x][y] == 2 or a[x][y] == 3 or a[x][y] == 4 or a[x][y] == 5 or a[x][y] == 6:
            if x not in heights:
                heights.append(x)
# 3.17 s ± 59.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
yatu = np.flatnonzero(((a>=2) & (a<=6)).any(1))
# 965 µs ± 11.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
np.allclose(yatu, heights)
# True
Vectorizing with numpy yields roughly a 3200x speedup.
It looks like you want to find if 2, 3, 4, 5 or 6 appear in the matrix.
You can use np.isin() to create a matrix of true/false values, then use that as an indexer:
>>> arr = np.array([1,2,3,4,4,0]).reshape(2,3)
>>> arr[np.isin(arr, [2,3,4,5,6])]
array([2, 3, 4, 4])
Optionally, turn that into a plain Python set() for faster in lookups and no duplicates.
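For instance, a one-line sketch using the same arr:
present = set(arr[np.isin(arr, [2, 3, 4, 5, 6])].tolist())  # {2, 3, 4}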
To get the positions in the array where those numbers appear, use argwhere:
>>> np.argwhere(np.isin(arr, [2,3,4,5,6]))
array([[0, 1],
[0, 2],
[1, 0],
[1, 1]])
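And to recover just the row indices, like the heights list in the question, you could take the unique first column of that result (a sketch):
>>> np.unique(np.argwhere(np.isin(arr, [2, 3, 4, 5, 6]))[:, 0])
array([0, 1])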
I would like to know the fastest way to compute the intersection of two lists within a numba function. Just for clarification, here is an example of the intersection of two lists:
Input :
lst1 = [15, 9, 10, 56, 23, 78, 5, 4, 9]
lst2 = [9, 4, 5, 36, 47, 26, 10, 45, 87]
Output :
[9, 10, 4, 5]
The problem is that this needs to be computed within the numba function, and therefore e.g. sets cannot be used. Do you have an idea?
My current code is very basic. I assume that there is room for improvement.
@nb.njit
def intersection(lst1, lst2):
    result = []
    for element1 in lst1:
        for element2 in lst2:
            if element1 == element2:
                result.append(element1)
    ....
Since numba compiles your code to machine code, you're probably near the best you can do for such a simple operation.
I ran some benchmarks, shown below:
@nb.njit
def loop_intersection(lst1, lst2):
    result = []
    for element1 in lst1:
        for element2 in lst2:
            if element1 == element2:
                result.append(element1)
    return result

@nb.njit
def set_intersect(lst1, lst2):
    return set(lst1).intersection(set(lst2))
Results
loop_intersection
40.4 µs ± 1.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
set_intersect
42 µs ± 6.74 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
I played with this a bit to try to learn something, realizing that the answer had already been given. When I run the accepted answer I get a return value of [9, 10, 5, 4, 9]; it wasn't clear to me whether the repeated 9 was acceptable or not.
Assuming it's OK, I ran a trial using a list comprehension to see if it made any difference. My results:
from numba import jit

def createLists():
    l1 = [15, 9, 10, 56, 23, 78, 5, 4, 9]
    l2 = [9, 4, 5, 36, 47, 26, 10, 45, 87]
    return l1, l2

@jit
def listComp():
    l1, l2 = createLists()
    return [i for i in l1 for j in l2 if i == j]

%timeit listComp()
5.84 microseconds +/- 10.5 nanoseconds
Or, if you can use NumPy, this code is even faster, removes the duplicate "9", and is much faster still with the Numba signature.
import numpy as np
from numba import jit, int64

@jit(int64[:](int64[:], int64[:]))
def JitListComp(l1, l2):
    l3 = np.array([i for i in l1 for j in l2 if i == j])
    return np.unique(l3)

@jit
def CreateList():
    l1 = np.array([15, 9, 10, 56, 23, 78, 5, 4, 9])
    l2 = np.array([9, 4, 5, 36, 47, 26, 10, 45, 87])
    return JitListComp(l1, l2)

CreateList()
Out[39]: array([ 4, 5, 9, 10])
%timeit CreateList()
1.71 µs ± 10.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
You can use set operations for this:
def intersection(lst1, lst2):
    return list(set(lst1) & set(lst2))
Then simply call intersection(lst1, lst2). This is the easiest way.
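For example (set iteration order is arbitrary, so do not rely on the ordering of the result):
print(intersection([15, 9, 10, 56, 23, 78, 5, 4, 9],
                   [9, 4, 5, 36, 47, 26, 10, 45, 87]))
# e.g. [9, 10, 4, 5] -- duplicates removed, order not guaranteed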