Splitting an array into sequential groups - python

I have an array with elements [2, 3, 4, 5, 6, 7, 8, 9, 10, ...]. I wish to split this up as follows: [[2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], ...]. I am not sure how to do this because the elements in each split must be repeated, and I am unsure of how to create these repeated elements. Any help will be much appreciated.

You can use zip() to create 3-tuples, and then use list() to transform the resulting tuples into lists.
data = [2, 3, 4, 5, 6, 7, 8, 9, 10]
# Prints [[2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9], [8, 9, 10]]
print(list(list(item) for item in zip(data, data[1:], data[2:])))

Use a comprehension:
N = 3
l = [2, 3, 4, 5, 6, 7, 8, 9, 10]
ll = [l[i:i+N] for i in range(len(l)-N+1)]
Output:
>>> ll
[[2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9], [8, 9, 10]]

The third-party module more_itertools has a function that creates triples from a list:
import more_itertools
lst = [2, 3, 4, 5, 6, 7, 8, 9, 10]
out = list(more_itertools.triplewise(lst))
Output:
[(2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9), (8, 9, 10)]
It is not part of the standard library, so you will have to install it beforehand (with pip, conda, or whatever you use).

The zip idea generalised for varying chunk size:
lst = [2, 3, 4, 5, 6, 7, 8, 9, 10]
result = list(zip(*(lst[i:] for i in range(3))))
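The hard-coded 3 in the zip expression can be turned into a parameter. A minimal sketch, assuming a hypothetical helper name `windows` (not part of the original answers) and that the input supports slicing:

```python
def windows(seq, n):
    """Return all overlapping length-n windows of seq as lists."""
    if n <= 0:
        raise ValueError("n must be positive")
    # zip stops at the shortest shifted slice, so only full windows are produced
    return [list(group) for group in zip(*(seq[i:] for i in range(n)))]

data = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(windows(data, 3))
print(windows(data, 4))
```

If n is larger than the sequence, some of the shifted slices are empty and the result is an empty list.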

zip() and comprehension, suggested above, are probably the ways to go in "real life", but in order to better understand the problem, consider this very simple approach:
data = [2, 3, 4, 5, 6, 7, 8, 9, 10]
result = []
for i in range(0, len(data) - 2):
    result.append([data[i], data[i + 1], data[i + 2]])
print(result)

Related

Split list into sublists of length x (with or without overlap)

There are many similar questions on here, but I can't find exactly what I'm looking for.
I want to split a list into sublists, each of which is exactly length x. This can include overlap, and the area of overlap doesn't matter so much. For example:
list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
max_len = 3
desired_result = [[1, 2, 3], [3, 4, 5], [6, 7, 8], [8, 9, 10]]
# or
max_len = 4
desired_result = [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]
# or
max_len = 5
desired_result = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
It doesn't matter how many final sublists there are, though I don't want any more than necessary.
It also doesn't matter where the overlap happens, I just need to capture all the individual items in the original list and have each sublist result in the same number of items.
Thanks!
You can adjust the accepted answer by NedBatchelder in this thread to work for the described scenario.
This is a generator function which I think is a pretty neat solution.
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    # n must not be 0
    for i in range(0, len(lst), n):
        if i + n >= len(lst):
            yield lst[-n:]
        else:
            yield lst[i:i + n]

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for i in range(1, 11):
    print(list(chunks(l, i)))
Expected output:
[[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
[[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
[[1, 2, 3, 4], [5, 6, 7, 8], [7, 8, 9, 10]]
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6], [5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7], [4, 5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7, 8], [3, 4, 5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7, 8, 9], [2, 3, 4, 5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]
The trick as I see it is to iterate an index in steps of x but "clip" the last one to be no less than x from the end:
>>> a = list(range(1, 11))
>>> x = 3
>>> [a[i:i+x] for i in (min(i, len(a) - x) for i in range(0, len(a), x))]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
Taken straight from itertools' recipes:
from itertools import zip_longest
def grouper(iterable, n, *, incomplete='fill', fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, fillvalue='x') --> ABC DEF Gxx
    # grouper('ABCDEFG', 3, incomplete='strict') --> ABC DEF ValueError
    # grouper('ABCDEFG', 3, incomplete='ignore') --> ABC DEF
    args = [iter(iterable)] * n
    if incomplete == 'fill':
        return zip_longest(*args, fillvalue=fillvalue)
    if incomplete == 'strict':
        return zip(*args, strict=True)
    if incomplete == 'ignore':
        return zip(*args)
    else:
        raise ValueError('Expected fill, strict, or ignore')
You may then use the "fill" option and pass the last item as the fill value if you wish to guarantee the size of every sublist:
>>> list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> max_len = 3
>>> list(grouper(list_to_split, max_len, fillvalue=list_to_split[-1]))
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 10, 10)]
Iterate over the list, slicing max_len elements at a time. We start the slice at min(idx, len(list_to_split) - max_len) in case we're too close to the end of the list:
list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
max_len = 3
result = []
for idx in range(0, len(list_to_split), max_len):
    start = min(idx, len(list_to_split) - max_len)
    result.append(list_to_split[start:start + max_len])
print(result)
You can turn this into a list comprehension, but it's admittedly not very readable:
list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
max_len = 3
result = [
    list_to_split[
        min(idx, len(list_to_split) - max_len):
        min(idx, len(list_to_split) - max_len) + max_len]
    for idx in range(0, len(list_to_split), max_len)
]
print(result)
Both of these output:
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
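The clipping idea can also be wrapped in a reusable function. This is only a sketch: the name `fixed_chunks` and the choice to raise an error when the list is shorter than the chunk size are assumptions, not something the answers above specify.

```python
def fixed_chunks(seq, size):
    """Split seq into chunks of exactly `size` items.

    The last chunk may overlap the previous one.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if len(seq) < size:
        raise ValueError("seq is shorter than size, so no full chunk exists")
    result = []
    for idx in range(0, len(seq), size):
        # Clip the start so the final slice never runs past the end of seq
        start = min(idx, len(seq) - size)
        result.append(seq[start:start + size])
    return result

numbers = list(range(1, 11))
print(fixed_chunks(numbers, 3))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
print(fixed_chunks(numbers, 5))  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
```

Without the length check, a list shorter than `size` would give a negative clipped start and silently return a short chunk.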
The following code works without any libraries, though there are also many libraries you could use:
def main():
    '''The Main'''
    l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    x = 3
    print([l[i:i+x] for i in range(0, len(l), x)])

if __name__ == '__main__':
    main()
Output (note that the last sublist is shorter than x):
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

Is there a better method to create such a numpy array?

I want a numpy array like this:
b = np.array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
Is there a faster way to create a NumPy array like this instead of typing them manually?
You can do something like this:
>>> np.repeat(np.arange(1, 10).reshape(-1,1), 6, axis=1)
array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
Explanation:
np.arange(1, 10).reshape(-1,1) creates an array
array([[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
np.repeat(_, 6, axis=1) repeats this 6 times along axis 1 (the second axis, i.e. across the columns).
Yes. There are plenty of methods. This is one:
np.repeat(np.arange(1,10),6,axis=0).reshape(9,6)
Another method is to use broadcasting:
>>> np.arange(1,10)[:,None] * np.ones(6, dtype=int)
array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
For any w*l size, convert a list of lists into an np.array like so:
w = 6
l = 9
np.array([[1+i]*w for i in range(l)])
array([[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9]])
np.transpose(np.array(([np.arange(1,10)] * 6)))
np.arange(1,10) creates a NumPy array with the values 1 to 9.
[] puts the array into a list.
* 6 repeats that list entry 6 times.
np.array() converts the resulting structure (a list of arrays) to a NumPy array.
np.transpose() swaps rows and columns so that each value runs along its own row.
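Two further options, shown here as a sketch rather than as a recommendation over the answers above, are np.tile and np.broadcast_to. The variable names `rows`, `cols`, and `column` are only illustrative.

```python
import numpy as np

rows, cols = 9, 6
column = np.arange(1, rows + 1)[:, None]      # shape (9, 1)

tiled = np.tile(column, (1, cols))            # copies the column 6 times
view = np.broadcast_to(column, (rows, cols))  # read-only view, no data copied

print(tiled.shape)
print(np.array_equal(tiled, view))
```

np.broadcast_to avoids copying, but the result is read-only; call .copy() on it if the array needs to be modified.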

Loop through a list of lists to find max value

The following shows a sample of a list of lists that I have:
[[['point_5', [5, 6, 7, 8], 11.0],
['point_5', [6, 7, 8, 9],12.57]],
[['point_18', [3, 4, 5, 6],6.25],
['point_18', [3, 5, 6, 7],7.2],
['point_18', [4, 5, 6, 7],7.55],
['point_18', [6, 7, 8, 9],14.0],
['point_19', [3, 5, 6, 7],8.166],
['point_19', [5, 6, 7, 8],9.285],
['point_19', [6, 7, 8, 9],11.0]]]
I need to define a loop which searches through each element of this list of lists and returns the maximum value of the last element of each list. What I mean is that for example for [6,7,8,9] we have:
['point_5', [6, 7, 8, 9], 12.57]
['point_18', [6, 7, 8, 9],14.0]
['point_19', [6, 7, 8, 9],11.0]
Since max(12.57,14.0,11.0) = 14.0 then what I am looking for is a list that one of its element is [point_18,[6,7,8,9],14.0].
Another example is that since the only element that has [3, 4, 5, 6] is ['point_18', [3, 4, 5, 6],6.25] then another element of the new list should be [point_18,[3,4,5,6],6.25].
In fact, the new list of lists that I am trying to create should be like the following list:
New_list = [['point_5',[5,6,7,8],11.0],['point_18',[6,7,8,9],14.0],['point_18', [3, 4, 5, 6],6.25],['point_19', [3, 5, 6, 7],8.166],['point_18', [4, 5, 6, 7],7.55]].
I am not sure if it is a good idea or not but what I have done is that first I tried to extract each unique [x,y,i,j] in a list through the following code:
A = []
for i in bob:
    for j in i:
        A.append(j[1])
import itertools
A.sort()
B = list(A for A,_ in itertools.groupby(A))
Now B is:
[[3, 4, 5, 6],
[3, 5, 6, 7],
[4, 5, 6, 7],
[5, 6, 7, 8],
[6, 7, 8, 9]]
Then I want to search for each element of this list in the main lists of list and find the max value.
Any help would be appreciated.
I think you can break the problem into parts.
I don't know whether you already have a list with the first step done, so I am assuming you don't.
This is the list that I am working with:
your_list = [['point_5', [5, 6, 7, 8], 11.0],
['point_5', [6, 7, 8, 9],12.57],
['point_18', [3, 4, 5, 6],6.25],
['point_18', [3, 5, 6, 7],7.2],
['point_18', [4, 5, 6, 7],7.55],
['point_18', [6, 7, 8, 9],14.0],
['point_19', [3, 5, 6, 7],8.166],
['point_19', [5, 6, 7, 8],9.285],
['point_19', [6, 7, 8, 9],11.0]]
First, group the entries in a dict, keyed by the inner list (converted to a tuple so it can be used as a key).
sorted_dict = {}
for i in your_list:
    if tuple(i[1]) in sorted_dict:
        sorted_dict[tuple(i[1])].append(i)
    else:
        sorted_dict[tuple(i[1])] = [i]
Then, select the max of each group and put it in a list.
return_list = []
for key, values in sorted_dict.items():
    print(values)
    return_list.append(sorted(values, key=lambda x: float(x[2]))[-1])  # sort each group by its third value and keep the largest
return_list should now hold the values you're looking for.
I am not sure if this is what you're looking for, so comment if there's any problem.
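The grouping and the max selection can also be done in a single pass, keeping only the best entry seen so far for each key. This is a sketch of that variation, not the answerer's code; the name `best` is an assumption.

```python
your_list = [['point_5', [5, 6, 7, 8], 11.0],
             ['point_5', [6, 7, 8, 9], 12.57],
             ['point_18', [3, 4, 5, 6], 6.25],
             ['point_18', [3, 5, 6, 7], 7.2],
             ['point_18', [4, 5, 6, 7], 7.55],
             ['point_18', [6, 7, 8, 9], 14.0],
             ['point_19', [3, 5, 6, 7], 8.166],
             ['point_19', [5, 6, 7, 8], 9.285],
             ['point_19', [6, 7, 8, 9], 11.0]]

best = {}  # maps tuple(coordinates) -> entry with the largest third value so far
for entry in your_list:
    key = tuple(entry[1])  # lists are not hashable, tuples are
    if key not in best or entry[2] > best[key][2]:
        best[key] = entry

return_list = list(best.values())
print(return_list)
```

Because dicts preserve insertion order, the result lists each coordinate group in the order it first appears in your_list.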

padding while creating sublists

Is there an elegant way to pad the last sublist with zeroes while creating sublists from a list of integers?
So far I have this one-liner and need to fill the last sublist with 2 zeroes:
[lst[x:x+3] for x in range(0, len(lst), 3)]
For example, given
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
the result should be:
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 0, 0]]
With itertools.zip_longest, consuming the same iterator created off of the list, and fill in the missing values as 0 :
[[*i] for i in itertools.zip_longest(*[iter(lst)] * 3, fillvalue=0)]
Example:
In [1219]: lst =[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
In [1220]: [[*i] for i in itertools.zip_longest(*[iter(lst)] * 3, fillvalue=0)]
Out[1220]: [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 0, 0]]
Without itertools:
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print([lst[x:x+3]+[0]*(x-len(lst)+3) for x in range(0, len(lst), 3)])
Prints:
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 0, 0]]
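The same padding logic can be spelled out as a function with the chunk size and fill value as parameters. This is a minimal sketch; the name `padded_chunks` and its default fill of 0 are assumptions made for illustration.

```python
def padded_chunks(lst, size, fill=0):
    """Split lst into chunks of `size` items, padding the last chunk with `fill`."""
    chunks = []
    for start in range(0, len(lst), size):
        chunk = lst[start:start + size]
        # size - len(chunk) is 0 for full chunks, so nothing is appended to them
        chunks.append(chunk + [fill] * (size - len(chunk)))
    return chunks

lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(padded_chunks(lst, 3))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 0, 0]]
```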

Better way to get sublist in python

I am working on the following problem:
This function returns a list of all possible sublists in L of length n without skipping elements in L. The sublists in the returned list should be ordered in the way they appear in L, with those sublists starting from a smaller index being at the front of the list.
Example 1, if L = [10, 4, 6, 8, 3, 4, 5, 7, 7, 2] and n = 4 then your function should return the list [[10, 4, 6, 8], [4, 6, 8, 3], [6, 8, 3, 4], [8, 3, 4, 5], [3, 4, 5, 7], [4, 5, 7, 7], [5, 7, 7, 2]]
My solution works but how can I make it shorter? What is a better way to do this?
def getSublists(L, n):
    newN = n
    myList = []
    for i in range(len(L)):
        orginalLen = L[i:n]
        if(len(orginalLen) == n):
            myList.append(L[i:n])
            n = n + 1
        else:
            myList.append(L[i:n])
            n = n + 1
    if(newN == 1):
        print(myList)
    else:
        print(myList[:len(myList)-(n-1)])

getSublists([10, 4, 6, 8, 3, 4, 5, 7, 7, 2],4)
getSublists([1], 1)
getSublists([0, 0, 0, 0, 0], 2)
OUTPUT
[[10, 4, 6, 8], [4, 6, 8, 3], [6, 8, 3, 4], [8, 3, 4, 5], [3, 4, 5, 7], [4, 5, 7, 7], [5, 7, 7, 2]]
[[1]]
[[0, 0], [0, 0], [0, 0], [0, 0]]
l = [1,2,3,4,5,6,87,9]
n = ..
print([l[i:i+n] for i in range(len(l)-n+1)])
Maybe this is what you need.
In one line:
get_sublists = lambda ls, n: [ls[x:x+n] for x in range(len(ls)-n+1)]
get_sublists([10, 4, 6, 8, 3, 4, 5, 7, 7, 2], 4)
[[10, 4, 6, 8], [4, 6, 8, 3], [6, 8, 3, 4], [8, 3, 4, 5], [3, 4, 5, 7], [4, 5, 7, 7], [5, 7, 7, 2]]
def get_sublists(L, n):
    # the +1 is needed so that the last full window is included
    return [L[i:i+n] for i in range(len(L)-n+1)]
I expanded the program a little to make it easier for the reader to understand.
def getSublists(L, n):
    new_list = []
    for i in range(len(L)-n+1):
        a = L[i:i+n]
        new_list.append(a)
    return new_list
Result:
[[10, 4, 6, 8],
[4, 6, 8, 3],
[6, 8, 3, 4],
[8, 3, 4, 5],
[3, 4, 5, 7],
[4, 5, 7, 7],
[5, 7, 7, 2]]
This is pretty readable I think, to understand the concept. The idea here is to iterate through the indices from 0 up to the length of L minus 4, and take the sublist of L from your current index i to i+4. Stopping at length - 4 ensures that every slice has exactly 4 elements (slicing past the end would not raise an error, it would just return shorter lists).
>>> for i in range(len(L)-4+1):
...     print(L[i:i+4])
[10, 4, 6, 8]
[4, 6, 8, 3]
[6, 8, 3, 4]
[8, 3, 4, 5]
[3, 4, 5, 7]
[4, 5, 7, 7]
[5, 7, 7, 2]
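The slicing answers above need a list (or another sequence that supports len and slicing). For arbitrary iterables, a sliding window can be kept in a collections.deque instead. The sketch below follows the sliding_window recipe from the itertools documentation; it yields tuples rather than lists.

```python
from collections import deque
from itertools import islice

def sliding_window(iterable, n):
    """Yield overlapping length-n tuples from any iterable."""
    it = iter(iterable)
    # Pre-fill the window with the first n - 1 items
    window = deque(islice(it, n - 1), maxlen=n)
    for item in it:
        window.append(item)  # maxlen=n drops the oldest item automatically
        yield tuple(window)

L = [10, 4, 6, 8, 3, 4, 5, 7, 7, 2]
for sub in sliding_window(L, 4):
    print(list(sub))
```

If the input has fewer than n items, the generator yields nothing.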
