Replace climbing sequence with its average - python

I have a random list like this
X = [0, 1, 5, 6, 7, 10, 15]
and need to find and replace every climbing sequence with its average.
In the end it should look like this:
X = [0, 6, 10, 15] #the 0 and 1 to 0; and the 5,6,7 to 6
I tried to find the sequence by subtracting the second value from the first like this:
y = 0
z = []
while X[y + 1] - X[y] == 1:
    z.append(X[y])
    y = y + 1
And now I don't know how to delete, for example, 5, 6 and 7 and replace them with their average, 6.

You can use itertools.groupby on the list with a key function that returns each item's difference with an incremental counter:
from itertools import groupby, count
from statistics import mean
X = [0, 1, 5, 6, 7, 10, 15]
c = count()
X = [int(mean(g)) for _, g in groupby(X, key=lambda i: i - next(c))]
X becomes:
[0, 6, 10, 15]
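To see why that key function groups consecutive runs, here is a small sketch (the variable names keys and runs are only illustrative). Subtracting a running counter from each item gives a value that stays constant while the items climb by exactly 1 and changes as soon as the run breaks:

```python
from itertools import count, groupby

X = [0, 1, 5, 6, 7, 10, 15]

# value - position is constant inside a run of consecutive integers
keys = [value - index for index, value in enumerate(X)]
print(keys)  # [0, 0, 3, 3, 3, 5, 9]

# groupby starts a new group each time that key changes
c = count()
runs = [list(g) for _, g in groupby(X, key=lambda v: v - next(c))]
print(runs)  # [[0, 1], [5, 6, 7], [10], [15]]
```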

You can iterate over the list, collecting each climbing sequence into its own sublist, and then take the mean of each sublist.
>>> x = [0, 1, 5, 6, 7, 10, 15]
>>> res = [[x[0]]]
>>> for i in range(1, len(x)):
...     if x[i] == x[i-1] + 1:
...         res[-1].append(x[i])
...     else:
...         res.append([x[i]])
...
>>> res
[[0, 1], [5, 6, 7], [10], [15]]
>>> [int(sum(l)/len(l)) for l in res]
[0, 6, 10, 15]

Here's a starting technique: make a new list that's the difference of adjacent elements in the list:
diff = [X[i] - X[i-1] for i in range(1, len(X)) ]
There are more "Pythonic" ways to do this, but I want to make sure this is accessible to newer programmers.
You now have diff as
[1, 4, 1, 1, 3, 5]
Where you have a 1 in diff, you have a climbing pair in X. Iterate through diff to find a sequence of 1 values. Where you find one, take the slice of X that corresponds to those 1 values (a run of k ones covers k + 1 elements of X). For a slice of odd length, the middle element is your mean; for a slice of even length, the mean falls halfway between the two middle elements.
If the value is not 1, then you simply take the corresponding element of X, as you've been doing.
Append the identified values to z, and there's your desired result.
Can you take it from there?
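One possible way to finish the diff-based idea is sketched below. The function name replace_runs_with_mean is made up for illustration, and the floor division is an assumption taken from the question's example, where 0 and 1 become 0:

```python
def replace_runs_with_mean(values):
    """Replace each run of consecutive integers with its floored mean."""
    if not values:
        return []
    diff = [values[i] - values[i - 1] for i in range(1, len(values))]
    result = []
    start = 0  # index in `values` where the current run begins
    for i, d in enumerate(diff):
        if d != 1:
            # the run covers values[start:i + 1]; close it, start a new one
            run = values[start:i + 1]
            result.append(sum(run) // len(run))
            start = i + 1
    run = values[start:]  # the last run is still open after the loop
    result.append(sum(run) // len(run))
    return result

print(replace_runs_with_mean([0, 1, 5, 6, 7, 10, 15]))  # [0, 6, 10, 15]
```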

This doesn't really answer the question, which is a fairly basic CS 101 question that people should try to figure out themselves, but what I noticed about the nice answer by @blhsing was that it appeared fairly slow. I found that mean() is incredibly slow!
from itertools import groupby, count
from statistics import mean
from timeit import timeit
def generate_1step_seq1(xs):
    result = []
    n = 0
    while n < len(xs):
        # sequences with step of 1 only
        if not result or xs[n] == result[-1] + 1:
            result += [xs[n]]
        else:
            # int result, rounding down
            yield sum(result) // len(result)
            result = [xs[n]]
        n += 1
    if result:
        yield sum(result) // len(result)
def generate_1step_seq2(xs):
    c = count()
    return [int(sum(xs) // len(xs)) for xs in [list(g) for _, g in groupby(xs, key=lambda i: i - next(c))]]
def generate_1step_seq3(xs):
    c = count()
    return [int(mean(g)) for _, g in groupby(xs, key=lambda i: i - next(c))]
values = [0, 1, 5, 6, 7, 10, 15]
print(list(generate_1step_seq1(values)))
print(generate_1step_seq2(values))
print(generate_1step_seq3(values))
print(timeit(lambda: list(generate_1step_seq1(values)), number=10000))
print(timeit(lambda: list(generate_1step_seq2(values)), number=10000))
print(timeit(lambda: list(generate_1step_seq3(values)), number=10000))
Initially I figured that was probably due to the tiny list size, but even for large lists, mean() is horribly slow. Does anyone happen to know why? Perhaps it is due to the very cautious nature of the statistics module's _sum, which tries to avoid float rounding errors?
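A small sketch that is consistent with that guess, assuming CPython 3: as far as I know, statistics.mean accumulates its total exactly instead of in floats, which costs time but survives inputs where a plain float sum overflows. The list big is just an illustrative input:

```python
from statistics import mean

big = [1e308, 1e308]

# a plain float sum overflows to infinity before the division
naive = sum(big) / len(big)
print(naive)      # inf

# statistics.mean sums exactly, so the result stays finite
careful = mean(big)
print(careful)    # 1e+308
```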

Related

How to increase efficiency of comparing elements of a list?

Whenever I code on online platforms and somehow I have to compare the elements of a list to one another, I use the following code which according to me is the most efficient possible. This is the last code which I was practicing. It was to find the maximum index between 2 same elements.
max=0
for i in range(len(mylist)):
    if max==(len(mylist)-1):
        break
    for j in range(i + 1, len(mylist)):
        if mylist[i] == mylist[j]:
            if max<(abs(i-j)):
                max=abs(i-j)
It runs most of the test cases, but sometimes it shows "time limit exceeded." I know it is related to the constraints and time complexity but I still can't find a better way. If anyone could help me, that would be great.
It's easier to use C-based built-in functions in Python. Also, don't name variables after Python types such as list.
x = [item for i, item in enumerate(l) if item in l[i+1:]]
# do something with list of values
You could group by equal elements and then find the difference in-group, and keep the maximum:
lst = [1, 3, 5, 3, 7, 8, 9, 1]
groups = {}
for i, v in enumerate(lst):
    groups.setdefault(v, []).append(i)
result = max(max(group) - min(group) for group in groups.values())
print(result)
Output
7
The complexity of this approach is O(n).
def get_longest_distance_between_same_elements_in_list(mylist):
    positions = dict()
    longest_distance = 0
    if len(mylist) < 1:
        return longest_distance
    for index in range(0, len(mylist)):
        if mylist[index] in positions:
            positions[mylist[index]].append(index)
        else:
            positions[mylist[index]] = [index]
    for key, value in positions.items():
        if len(value) > 1 and longest_distance < value[len(value)-1] - value[0]:
            longest_distance = value[len(value)-1] - value[0]
    return longest_distance
l1 = [1, 3, 5, 3, 7, 8, 9, 1]
l2 = [9]
l3 = []
l4 = [4, 4, 4, 4, 4]
l5 = [10, 10, 3, 4, 5, 4, 10, 56, 4]
print(get_longest_distance_between_same_elements_in_list(l1))
print(get_longest_distance_between_same_elements_in_list(l2))
print(get_longest_distance_between_same_elements_in_list(l3))
print(get_longest_distance_between_same_elements_in_list(l4))
print(get_longest_distance_between_same_elements_in_list(l5))
Output -
7
0
0
4
6
Time Complexity : O(n)
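Both O(n) answers store every index of every value. Since only the first occurrence matters for the longest distance, a single pass that remembers first positions is enough. This is a minimal sketch, assuming the elements are hashable; the function name longest_distance_first_seen is made up for illustration:

```python
def longest_distance_first_seen(items):
    """Largest index gap between two equal elements, or 0 if none repeat."""
    first_seen = {}  # value -> index of its first occurrence
    longest = 0
    for index, item in enumerate(items):
        # setdefault stores the index only the first time a value appears
        first = first_seen.setdefault(item, index)
        longest = max(longest, index - first)
    return longest

print(longest_distance_first_seen([1, 3, 5, 3, 7, 8, 9, 1]))  # 7
print(longest_distance_first_seen([10, 10, 3, 4, 5, 4, 10, 56, 4]))  # 6
```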

Sum of all numbers in the first or second place in an array

I have a 2d list, for example:
list1 = [[1,2],[3,4],[5,6],[7,8]]
and I want to find the sum of all the numbers at the n'th place of every element.
For example if I want the answer for 0, I would calculate:
my_sum = list1[0][0] + list1[1][0] + list1[2][0] + list1[3][0]
or
my_sum = 0
place = 0
for i in range(len(list1)):
    my_sum += list1[i][place]
print(my_sum)
Output: 16
Is there a more elegant way to do this? Or one that uses only one line of code?
I mean as fictional code for example:
fictional_function(list1,place) = 16
Since you are looking for a functional solution, consider operator.itemgetter:
from operator import itemgetter
L = [[1,2],[3,4],[5,6],[7,8]]
res = sum(map(itemgetter(0), L)) # 16
For performance and simpler syntax, you can use a 3rd party library such as NumPy:
import numpy as np
A = np.array([[1,2],[3,4],[5,6],[7,8]])
res = A[:, 0].sum() # 16
As a generalization, if you want multiple indices (e.g. 0 and 1), you could use reduce combined with an element-wise sum, something like this:
from functools import reduce
def fictional_function(lst, *places):
    s_places = set(places)
    def s(xs, ys):
        return [x + y for x, y in zip(xs, ys)]
    return [x for i, x in enumerate(reduce(s, lst)) if i in s_places]
list1 = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(fictional_function(list1, 0))
print(fictional_function(list1, 0, 1))
print(fictional_function(list1, *[1, 0]))
Output
[16]
[16, 20]
[16, 20]
The idea is that the function s sums two lists element-wise, for example:
s([1, 2], [3, 4]) # [4, 6]
With reduce, s is applied across the list of lists, and finally the result is filtered for the intended indices (places) only.
list1 = [[1,2],[3,4],[5,6],[7,8]]
ind = 0
sum_ind = sum(list(zip(*list1))[ind])
The above can also be written as a function that takes the list and the index as input and returns the sum of the elements at that index.
What we do above is first use zip to collect the elements that share an index into one tuple each, then pick the tuple for the index we want and pass it to sum.
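For comparison, the same sum can be written with a plain generator expression, which needs no imports and never builds the transposed rows. The function name column_sum is hypothetical:

```python
def column_sum(rows, place):
    """Sum the element at index `place` of every row."""
    return sum(row[place] for row in rows)

list1 = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(column_sum(list1, 0))  # 16
print(column_sum(list1, 1))  # 20
```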

Find the permutations that sums to the three smallest numbers

I asked the same thing yesterday but was finding a hard time finding the right sentence to describe my problem, so I deleted it. But here it is again.
Let us say that we have 3 lists:
list1 = [1, 2]
list2 = [2, 3]
list3 = [1]
Let us say I want to find the 3 permutations of these list, which when added together, it results in the smallest number possible. So here, the permutations that we want would be:
1,2,1
2,2,1
1,3,1
Because the sum of the numbers on each permutation creates the smallest numbers possible.
2,3,1
Will not be a part of the solution since the sum is larger than the other three, thus, not a part of the three smallest.
Of course, using itertools to list all the permutations and summing the numbers in each one would be the most obvious solution, but I was wondering if there is a more efficient algorithm for this, considering it should be able to handle 1000 lists.
NOTE: If the number of lists is N, then I would need to find N permutations. Thus, if there are 3 lists, I find the 3 smallest permutations.
PRECONDITIONS:
-A part of the precondition is that all of these lists are sorted.
-The number of elements on all list is 2N-1, to deal with the case where only one list have more than 1 element.
-All of the lists are sorted from smallest.
Since the lists are sorted, the smallest element in each list is the first one, the sum of which gives us the "minimal sum permutation". Picking any element except from the first one is going to increase the sum value.
We start off by calculating the difference between element i and the first one for each list. For example, for the lists [1, 3, 4, 8] and [3, 9, 12, 15], these differences would be [2, 3, 7] and [6, 9, 12] respectively. We keep them separate in cost_lists, because they will be needed later on. But in cost_global, we pool them all together and by sorting them in ascending order, we find a solution where for all lists but one we choose the minimal value. To keep track which element from which list will give us the next minimum sum, we group the difference values with both the index of the list it comes from and which element in that list it is.
However, this is not a complete approach. It is possible, for example, that taking the next value from two lists incurs a smaller cost than taking the next value from one list. So, we have to search for the product of the combinations for k = 2, 3, ..., N. Doing that normally would result to N**N complexity, but we can take some really good shortcuts.
From the partial solution above, we have a list of the minimal costs in order. Since we want only the first N minimal sums, we check what the cost value of the Nth permutation is (threshold). So, when we search for a group of two next values, we can safely ignore their sum if it exceeds our current threshold. And since the difference values within lists are in ascending order, once we cross the threshold, we can instantly exit the loop. Similarly, if we haven't found any new combinations within the threshold for k = 2, it is pointless to look for k > 2. Considering that most likely the smallest sum costs will be the result of a single nonminimal value, or a few small ones (unless most lists have massive differences between sequential values), we are bound to exit these loops rather quickly. The code I came up to achieve this is fairly ugly, but it effectively does the same as
for k in xrange(2, len(lists)):
    for comb in itertools.combinations(cost_lists, k):
        for group in itertools.product(*comb):
            if sum(g[0] for g in group) <= threshold:
                cost_global.append(group)
except that we exit the loops as soon as we can guarantee that no further results will be found, lest we pointlessly sift through an innumerable number of combinations/products which are over the threshold.
def filter_cost(cost_lists, threshold):
    cost = [[i for i in ilist if i[0] <= threshold] for ilist in cost_lists]
    # the algorithm requires that we remove any lists that have become empty
    return [ilist for ilist in cost if ilist]
def _combi(cost_lists, k, start, depth, subtotal, threshold):
    if depth == k:
        for i in xrange(start, len(cost_lists)):
            for value in cost_lists[i]:
                if value[0] + subtotal > threshold:
                    break
                yield (value,)
    else:
        for i in xrange(start, len(cost_lists)):
            for value in cost_lists[i]:
                if value[0] + subtotal > threshold:
                    break
                for c in _combi(cost_lists, k, i+1, depth+1,
                                value[0]+subtotal, threshold):
                    yield (value,) + c
def combinations_product(cost_lists, k, threshold):
    for i in xrange(len(cost_lists)-k+1):
        for value in cost_lists[i]:
            if value[0] > threshold:
                break
            for comb in _combi(cost_lists, k, i+1, 2, value[0], threshold):
                temp = (value,) + comb
                cost, ilists, ith_items = zip(*temp)
                yield sum(cost), ilists, ith_items
def find_smallest_sum_permutations(lists):
    N = len(lists)  # number of lists; previously taken from a global N
    minima = [min(x) for x in lists]
    cost_local = []
    cost_global = []
    for i, ilist in enumerate(lists):
        if len(ilist) > 1:
            first = ilist[0]
            diff = [(num-first, i, j) for j, num in enumerate(ilist[1:], 1)]
            cost_local.append(diff)
            cost_global.extend(diff)
    cost_global.sort()
    threshold_index = len(lists) - 2
    cost_threshold = cost_global[threshold_index][0]
    cost_local = filter_cost(cost_local, cost_threshold)
    for k in xrange(2, len(lists)):
        group_combinations = tuple(combinations_product(cost_local, k,
                                                        cost_threshold))
        if group_combinations:
            cost_global.extend(group_combinations)
            cost_global.sort()
            cost_threshold = cost_global[threshold_index][0]
            cost_local = filter_cost(cost_local, cost_threshold)
        else:
            break
    permutations = [minima]
    for k in xrange(N-1):
        _, ilist, ith_item = cost_global[k]
        if type(ilist) == int:
            permutation = [minima[i]
                           if i != ilist else lists[ilist][ith_item]
                           for i in xrange(N)]
        else:
            # multiple nonminimal values combination
            mapping = dict(zip(ilist, ith_item))
            permutation = [minima[i]
                           if i not in mapping else lists[i][mapping[i]]
                           for i in xrange(N)]
        permutations.append(permutation)
    return permutations
Examples
Example in the question.
>>> lists = [
...     [1, 2],
...     [2, 3],
...     [1],
... ]
>>> for p in find_smallest_sum_permutations(lists):
...     print p, sum(p)
...
[1, 2, 1] 4
[2, 2, 1] 5
[1, 3, 1] 5
Example I had generated with random lists.
>>> import random
>>> N = 5
>>> random.seed(1024)
>>> lists = [sorted(random.sample(range(10*N), 2*N-1)) for _ in xrange(N)]
>>> for p in find_smallest_sum_permutations(lists):
...     print p, sum(p)
...
[4, 4, 1, 6, 0] 15
[4, 6, 1, 6, 0] 17
[4, 4, 3, 6, 0] 17
[4, 4, 1, 6, 4] 19
[4, 6, 3, 6, 0] 19
Example by user2357112 which had caught a glaring error in my previous iteration.
>>> lists = [
...     [1, 2, 30, 40],
...     [1, 2, 30, 40],
...     [10, 20, 30, 40],
...     [10, 20, 30, 40],
... ]
>>> for p in find_smallest_sum_permutations(lists):
...     print p, sum(p)
...
[1, 1, 10, 10] 22
[2, 1, 10, 10] 23
[1, 2, 10, 10] 23
[2, 2, 10, 10] 24
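For small inputs, the brute-force approach mentioned in the question makes a handy reference for checking a faster implementation. This is only a sketch written for Python 3: it enumerates every selection, so it is exponential in the number of lists and will not scale to 1000 of them. The function name smallest_sums_brute_force is hypothetical:

```python
import heapq
from itertools import product

def smallest_sums_brute_force(lists, n=None):
    """Return the n selections (one item per list) with the smallest sums."""
    if n is None:
        n = len(lists)  # the question asks for N results for N lists
    return heapq.nsmallest(n, product(*lists), key=sum)

lists = [[1, 2], [2, 3], [1]]
for combo in smallest_sums_brute_force(lists):
    print(combo, sum(combo))
```

The sums it prints for the question's example are 4, 5 and 5, matching the output above; the order of the two selections that tie at 5 may differ.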
The trick is to only generate the combinations that might possibly be needed, and store them in a heap. Each one that you pull out is the smallest one you have not yet seen. And the fact that THAT combination has been pulled out tells you that there are new ones which might also be small.
See https://docs.python.org/2/library/heapq.html for how to use a heap. We also need code for generating combinations. And with that, here is working code for getting the first n combinations for any list of lists:
import heapq
from itertools import count
# Helper class for storing combinations.
class ListSelector:
    def __init__(self, lists, indexes):
        self.lists = lists
        self.indexes = indexes
    def value(self):
        answer = 0
        for i in range(0, len(self.lists)):
            answer = answer + self.lists[i][self.indexes[i]]
        return answer
    def values(self):
        return [self.lists[i][self.indexes[i]] for i in range(0, len(self.lists))]
    # These are the next combinations. We are willing to increment any
    # leading 0, or the first non-zero value. This will provide one and
    # only one path to each possible combination.
    def next_selectors(self):
        lists = self.lists
        indexes = self.indexes
        selectors = []
        for i in range(0, len(lists)):
            if len(lists[i]) <= indexes[i] + 1:
                if 0 == indexes[i]:
                    continue
                else:
                    break
            new_indexes = [
                indexes[j] + (0 if j != i else 1)
                for j in range(0, len(lists))]
            selectors.append(ListSelector(lists, new_indexes))
            if 0 < indexes[i]:
                break
        return selectors
# This will just return an iterator over all combinations, from smallest
# to largest. It does NOT generate them until needed.
def combinations(lists):
    sel = ListSelector(lists, [0 for _ in range(len(lists))])
    # The counter breaks ties between equal sums, so the heap never has to
    # compare two ListSelector objects (a TypeError in Python 3).
    tiebreak = count()
    upcoming = [(sel.value(), next(tiebreak), sel)]
    while len(upcoming):
        value, _, sel = heapq.heappop(upcoming)
        yield sel
        for next_sel in sel.next_selectors():
            heapq.heappush(upcoming, (next_sel.value(), next(tiebreak), next_sel))
# This just gets the first n of them. (It will return less if less.)
def smallest_n_combinations(n, lists):
    i = 0
    for sel in combinations(lists):
        yield sel
        i = i + 1
        if i == n:
            break
# Example usage
lists = [
    [1, 2, 5],
    [2, 3, 4],
    [1]]
for sel in smallest_n_combinations(3, lists):
    print(sel.value(), sel.values(), sel.indexes)
(This could be made more efficient for a long list of lists with tricks like caching the value inside of ListSelector and calculating it incrementally for new ones.)

Extract elements of list at odd positions

So I want to create a list which is a sublist of some existing list.
For example,
L = [1, 2, 3, 4, 5, 6, 7], I want to create a sublist li such that li contains all the elements in L at odd positions.
While I can do it by
L = [1, 2, 3, 4, 5, 6, 7]
li = []
count = 0
for i in L:
    if count % 2 == 1:
        li.append(i)
    count += 1
But I want to know if there is another way to do the same efficiently and in fewer number of steps.
Solution
Yes, you can:
l = L[1::2]
And this is all. The result will contain the elements placed on the following positions (0-based, so first element is at position 0, second at 1 etc.):
1, 3, 5
so the result (actual numbers) will be:
2, 4, 6
Explanation
The [1::2] at the end is just a notation for list slicing. Usually it is in the following form:
some_list[start:stop:step]
If we omitted start, the default (0) would be used. So the first element (at position 0, because the indexes are 0-based) would be selected. In this case the second element will be selected.
Because the second argument (stop) is omitted, the default is used (the end of the list). So the list is iterated from the second element to the end.
We also provided third argument (step) which is 2. Which means that one element will be selected, the next will be skipped, and so on...
So, to sum up, in this case [1::2] means:
1. take the second element (which, by the way, is an odd element, if you judge from the index),
2. skip one element (because we have step=2, so we are skipping one, in contrast to step=1, which is the default),
3. take the next element,
4. repeat steps 2.-3. until the end of the list is reached.
EDIT: @PreetKukreti gave a link for another explanation on Python's list slicing notation. See here: Explain Python's slice notation
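A short sketch of how the three slice arguments interact, using the list from the question (the variable names are only illustrative):

```python
L = [1, 2, 3, 4, 5, 6, 7]

odd_positions = L[1::2]    # start at index 1, run to the end, step 2
even_positions = L[::2]    # start defaults to 0
bounded = L[1:5:2]         # stop at index 5 (exclusive)

print(odd_positions)   # [2, 4, 6]
print(even_positions)  # [1, 3, 5, 7]
print(bounded)         # [2, 4]

# The bracket notation is shorthand for a slice object; None means "default".
print(L[slice(1, None, 2)] == odd_positions)  # True
```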
Extras - replacing counter with enumerate()
In your code, you explicitly create and increase the counter. In Python this is not necessary, as you can enumerate through some iterable using enumerate():
for count, i in enumerate(L):
    if count % 2 == 1:
        l.append(i)
The above serves exactly the same purpose as the code you were using:
count = 0
for i in L:
    if count % 2 == 1:
        l.append(i)
    count += 1
More on emulating for loops with counter in Python: Accessing the index in Python 'for' loops
For the odd positions, you probably want:
>>> list_ = list(range(10))
>>> print(list_[1::2])
[1, 3, 5, 7, 9]
I like List comprehensions because of their Math (Set) syntax. So how about this:
L = [1, 2, 3, 4, 5, 6, 7]
odd_numbers = [y for x,y in enumerate(L) if x%2 != 0]
even_numbers = [y for x,y in enumerate(L) if x%2 == 0]
Basically, if you enumerate over a list, you'll get the index x and the value y. What I'm doing here is putting the value y into the output list (even or odd) and using the index x to find out if that point is odd (x%2 != 0).
You can also use itertools.islice if you don't need to create a list but just want to iterate over the odd/even elements
import itertools
L = [1, 2, 3, 4, 5, 6, 7]
li = itertools.islice(L, 1, len(L), 2)
You can make use of bitwise AND operator &:
>>> x = [1, 2, 3, 4, 5, 6, 7]
>>> y = [i for i in x if i&1]
>>> y
[1, 3, 5, 7]
This will give you the odd elements in the list. Now to extract the elements at odd indices you just need to change the above a bit:
>>> x = [10, 20, 30, 40, 50, 60, 70]
>>> y = [j for i, j in enumerate(x) if i&1]
>>> y
[20, 40, 60]
Explanation
The bitwise AND operator is used with 1, and the reason it works is that an odd number, when written in binary, must have 1 as its last (least significant) digit. Let's check:
23 = 1 * (2**4) + 0 * (2**3) + 1 * (2**2) + 1 * (2**1) + 1 * (2**0) = 10111
14 = 1 * (2**3) + 1 * (2**2) + 1 * (2**1) + 0 * (2**0) = 1110
AND operation with 1 will only return 1 (1 in binary will also have last digit 1), iff the value is odd.
Check the Python Bitwise Operator page for more.
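A quick sketch to check this, using the two numbers from the explanation; bin() shows the binary form, and n & 1 keeps only the last bit:

```python
for n in (23, 14):
    print(n, bin(n), n & 1)
# 23 0b10111 1
# 14 0b1110 0

# Applied to indices, the test selects the same items as the slice [1::2].
x = [10, 20, 30, 40, 50, 60, 70]
odd_index_items = [item for index, item in enumerate(x) if index & 1]
print(odd_index_items == x[1::2])  # True
```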
P.S: You can tactically use this method if you want to select odd and even columns in a dataframe. Let's say x and y coordinates of facial key-points are given as columns x1, y1, x2, etc... To normalize the x and y coordinates with width and height values of each image you can simply perform:
for i in range(df.shape[1]):
    if i&1:
        df.iloc[:, i] /= heights
    else:
        df.iloc[:, i] /= widths
This is not exactly related to the question but for data scientists and computer vision engineers this method could be useful.

Split list into smaller lists (split in half)

I am looking for a way to easily split a python list in half.
So that if I have an array:
A = [0,1,2,3,4,5]
I would be able to get:
B = [0,1,2]
C = [3,4,5]
A = [1,2,3,4,5,6]
B = A[:len(A)//2]
C = A[len(A)//2:]
If you want a function:
def split_list(a_list):
    half = len(a_list)//2
    return a_list[:half], a_list[half:]
A = [1,2,3,4,5,6]
B, C = split_list(A)
A little more generic solution (you can specify the number of parts you want, not just split 'in half'):
def split_list(alist, wanted_parts=1):
    length = len(alist)
    return [ alist[i*length // wanted_parts: (i+1)*length // wanted_parts]
             for i in range(wanted_parts) ]
A = [0,1,2,3,4,5,6,7,8,9]
print(split_list(A, wanted_parts=1))
print(split_list(A, wanted_parts=2))
print(split_list(A, wanted_parts=8))
f = lambda A, n=3: [A[i:i+n] for i in range(0, len(A), n)]
f(A)
n - the predefined length of result arrays
def split(arr, size):
    arrs = []
    while len(arr) > size:
        piece = arr[:size]
        arrs.append(piece)
        arr = arr[size:]
    arrs.append(arr)
    return arrs
Test:
x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(split(x, 5))
result:
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13]]
If you don't care about the order...
def split(list):
    return list[::2], list[1::2]
list[::2] gets every second element in the list starting from the 0th element.
list[1::2] gets every second element in the list starting from the 1st element.
Using list slicing. The syntax is basically my_list[start_index:end_index]
>>> i = [0,1,2,3,4,5]
>>> i[:3] # same as i[0:3] - grabs from first to third index (0->2)
[0, 1, 2]
>>> i[3:] # same as i[3:len(i)] - grabs from fourth index to end
[3, 4, 5]
To get the first half of the list, you slice from the first index to len(i)//2 (where // is integer division, so 3//2 will give the floored result of 1, instead of the invalid list index of 1.5):
>>> i[:len(i)//2]
[0, 1, 2]
...and then swap the values around to get the second half:
>>> i[len(i)//2:]
[3, 4, 5]
B, C = A[:len(A)//2], A[len(A)//2:]
Here is a general solution that splits arr into count parts:
def split(arr, count):
    return [arr[i::count] for i in range(count)]
def splitter(A):
    B = A[0:len(A)//2]
    C = A[len(A)//2:]
    return (B,C)
I tested, and the double slash is required to force int division in python 3. My original post was correct, although wysiwyg broke in Opera, for some reason.
If you have a big list, It's better to use itertools and write a function to yield each part as needed:
from itertools import islice
def make_chunks(data, SIZE):
    it = iter(data)
    # use `xrange` if you are in python 2.7:
    for i in range(0, len(data), SIZE):
        yield [k for k in islice(it, SIZE)]
You can use this like:
A = [0, 1, 2, 3, 4, 5, 6]
size = len(A) // 2
for sample in make_chunks(A, size):
    print(sample)
The output is:
[0, 1, 2]
[3, 4, 5]
[6]
Thanks to @thefourtheye and @Bede Constantinides
This is similar to other solutions, but a little faster.
# Usage: split_half([1,2,3,4,5]) Result: ([1, 2], [3, 4, 5])
def split_half(a):
    half = len(a) >> 1
    return a[:half], a[half:]
There is an official Python recipe for the more generalized case of splitting an array into smaller arrays of size n.
from itertools import izip_longest  # named zip_longest in Python 3
def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
This code snippet is from the python itertools doc page.
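The recipe as quoted targets Python 2. A Python 3 adaptation might look like the sketch below, keeping the argument order used in this answer; the only substantive change is that izip_longest is called zip_longest in Python 3:

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    # n references to the SAME iterator, so zip pulls n items per chunk
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

chunks = ["".join(group) for group in grouper(3, "ABCDEFG", "x")]
print(chunks)  # ['ABC', 'DEF', 'Gxx']
```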
10 years later.. I thought - why not add another:
arr = 'Some random string' * 10; n = 4
print([arr[e:e+n] for e in range(0,len(arr),n)])
While the answers above are more or less correct, you may run into trouble if you compute the split point with a / 2: in Python 3 (and in earlier versions if you specify from __future__ import division at the beginning of your script), / always returns a float, which cannot be used as a slice index. You are in any case better off going for integer division, i.e. a // 2, in order to get "forward" compatibility of your code.
#for python 3
A = [0,1,2,3,4,5]
l = len(A)/2
B = A[:int(l)]
C = A[int(l):]
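A minimal sketch of the difference in Python 3, using an odd-length list (an assumption made here to show where the extra element goes):

```python
A = [0, 1, 2, 3, 4, 5, 6]

print(len(A) / 2)   # 3.5, true division always gives a float
print(len(A) // 2)  # 3, floor division keeps the result an int

half = len(A) // 2
B, C = A[:half], A[half:]
print(B, C)         # [0, 1, 2] [3, 4, 5, 6]

try:
    A[:len(A) / 2]  # a float is not a valid slice index in Python 3
except TypeError as exc:
    print("TypeError:", exc)
```

With floor division the second half receives the extra element when the length is odd.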
General solution split list into n parts with parameter verification:
def sp(l,n):
    # split list l into n parts
    if l:
        p = len(l) if n < 1 else len(l) // n # no split
        p = p if p > 0 else 1 # split down to elements
        for i in range(0, len(l), p):
            yield l[i:i+p]
    else:
        yield [] # empty list split returns empty list
Since there was no restriction put on which packages we can use: NumPy has a function called split with which you can easily split an array any way you like.
Example
import numpy as np
A = np.array(list('abcdef'))
np.split(A, 2)  # needs an equal division; np.array_split also handles uneven lengths
With hints from @ChristopheD
def line_split(N, K=1):
    length = len(N)
    return [N[i*length//K:(i+1)*length//K] for i in range(K)]
A = [0,1,2,3,4,5,6,7,8,9]
print(line_split(A,1))
print(line_split(A,2))
Another take on this problem in 2020: here is a generalization. I interpret "divide a list in half" to mean exactly two lists, with no spillover into a third list when there is an odd one out. For instance, if the list length is 19, chunking with a size of 19 // 2 = 9 gives two lists of length 9 and a third list of length 1, so three lists in total. For a general solution that always gives two lists, I assume it is acceptable for the two lists to differ in length (one will be longer than the other) and for the order to be mixed (alternating, in this case).
"""
arrayinput --> is an array of length N that you wish to split 2 times
"""
ctr = 1 # lets initialize a counter
holder_1 = []
holder_2 = []
for i in range(len(arrayinput)):
    if ctr == 1 :
        holder_1.append(arrayinput[i])
    elif ctr == 2:
        holder_2.append(arrayinput[i])
    ctr += 1
    if ctr > 2 : # if it exceeds 2 then we reset
        ctr = 1
This concept works for any number of partitions (you'd have to tweak the code depending on how many parts you want) and is rather straightforward to interpret. To speed things up, you can even write this loop in Cython / C / C++. Then again, I've tried this code on relatively small lists (~10,000 rows) and it finishes in a fraction of a second.
Just my two cents.
Thanks!
from itertools import islice
Input = [2, 5, 3, 4, 8, 9, 1]
small_list_length = [1, 2, 3, 1]
Input1 = iter(Input)
Result = [list(islice(Input1, elem)) for elem in small_list_length]
print("Input list :", Input)
print("Split length list: ", small_list_length)
print("List after splitting", Result)
You can try something like this with numpy
import numpy as np
np.array_split([1,2,3,4,6,7,8], 2)
result:
[array([1, 2, 3, 4]), array([6, 7, 8])]
