Split a Python list logarithmically

I am trying to do the following..
I have a list of n elements. I want to split this list into 32 separate lists which contain more and more elements as we go towards the end of the original list. For example from:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
I want to get something like this:
b = [[1],[2,3],[4,5,6,7],[8,9,10,11,12]]
I've done the following for a list containing 1024 elements:
for i in range(0, 32):
    c = a[i**2:(i+1)**2]
    b.append(c)
But I am stupidly struggling to find a reliable way to do it for other numbers like 256, 512, 2048 or for another number of lists instead of 32.

Use an iterator, a for loop with enumerate and itertools.islice:
import itertools
def logsplit(lst):
    iterator = iter(lst)
    for n, e in enumerate(iterator):
        yield itertools.chain([e], itertools.islice(iterator, n))
Works with any number of elements. Example:
for r in logsplit(range(50)):
    print(list(r))
Output:
[0]
[1, 2]
[3, 4, 5]
[6, 7, 8, 9]
... some more ...
[36, 37, 38, 39, 40, 41, 42, 43, 44]
[45, 46, 47, 48, 49]
In fact, this is very similar to this problem, except it's using enumerate to get variable chunk sizes.
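One caveat, as far as I can tell: each chunk yielded by logsplit is a lazy view on the same shared iterator, so the chunks have to be consumed in order, one at a time. A minimal sketch of materialising them safely (the helper name logsplit_lists is made up for this example):

```python
import itertools

def logsplit(lst):
    iterator = iter(lst)
    for n, e in enumerate(iterator):
        yield itertools.chain([e], itertools.islice(iterator, n))

def logsplit_lists(lst):
    # Turn each lazy chunk into a list before asking for the next one
    return [list(chunk) for chunk in logsplit(lst)]

print(logsplit_lists(range(10)))
# [[0], [1, 2], [3, 4, 5], [6, 7, 8, 9]]
```

Collecting the generator with list(logsplit(...)) first and reading the chunks afterwards would not give the same result, because every chunk pulls from the same underlying iterator.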

This is incredibly messy, but gets the job done. Note that you're going to get some empty bins at the beginning if you're logarithmically slicing the list. Your examples give arithmetic index sequences.
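from math import log, exp
def split_list(_list, divs):
    n = float(len(_list))
    log_n = log(n)
    # Geometrically spaced cut points, starting at 0 and 1
    indices = [0] + [int(exp(log_n*i/divs)) for i in range(divs)]
    # The last slice runs from the final cut point to the end of the list
    unfiltered = [_list[indices[i]:indices[i+1]] for i in range(divs)] + [_list[indices[-1]:]]
    filtered = [sublist for sublist in unfiltered if sublist]
    # Pad the front with empty bins to make up for the dropped empty slices
    return [[] for _ in range(divs - len(filtered))] + filtered
print(split_list(list(range(1024)), 32))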
Edit: After looking at the comments, here's an example that may fit what you want:
def split_list(_list):
    copy, output = list(_list), []
    length = 1
    while copy:
        output.append([])
        for _ in range(length):
            if len(copy) > 0:
                output[-1].append(copy.pop(0))
        length *= 2
    return output
print(split_list(range(15)))
# [[0], [1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14]]
Note that this code is not efficient, but it can be used as a template for writing a better algorithm.
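As one possible starting point for that better algorithm, here is a sketch that produces the same doubling chunks with slices instead of repeated pop(0) calls. The name split_doubling is only an illustration, and the input is assumed to fit in memory as a list:

```python
def split_doubling(items):
    """Split items into chunks of size 1, 2, 4, 8, ... using slices."""
    items = list(items)
    output, start, length = [], 0, 1
    while start < len(items):
        # Slicing past the end is safe, so the last chunk may be shorter
        output.append(items[start:start + length])
        start += length
        length *= 2
    return output

print(split_doubling(range(15)))
# [[0], [1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14]]
```

Each element is copied once, so the work grows linearly with the length of the list.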

Something like this should solve the problem.
import math
b = []
# ceil(sqrt(len(a))) chunks are enough to cover every element
for i in range(math.ceil(math.sqrt(len(a)))):
    c = a[i**2:min((i+1)**2, len(a))]
    b.append(c)
Not very pythonic but does what you want.
def splitList(a, n, inc):
    """
    a list to split
    n number of sublist
    inc ideal difference between the number of elements in two successive sublists
    """
    zr = len(a) # remaining number of elements to split into sublists
    st = 0 # starting index in the full list of the next sublist
    nr = n # remaining number of sublist to construct
    nc = 1 # number of elements in the next sublist
    #
    b = []
    while (zr/nr >= nc and nr > 1):
        b.append(a[st:st+nc])
        st, zr, nr, nc = st+nc, zr-nc, nr-1, nc+inc
    #
    nc = int(zr/nr)
    for i in range(nr-1):
        b.append(a[st:st+nc])
        st = st+nc
    #
    b.append(a[st:max(st+nc, len(a))])
    return b
# Example of call
# b = splitList(a, 32, 2)
# to split a into 32 sublist, where each list ideally has 2 more element
# than the previous

There's always this.
>>> def log_list(l):
...     if len(l) == 0:
...         return [] #If the list is empty, return an empty list
...     new_l = [] #Initialise new list
...     new_l.append([l[0]]) #Add first iteration to new list inside of an array
...     for i in l[1:]: #For each other iteration,
...         if len(new_l) == len(new_l[-1]):
...             new_l.append([i]) #Create new array if previous is full
...         else:
...             new_l[-1].append(i) #If previous not full, add to it
...     return new_l
>>> log_list([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]

Related

How find all pairs equal to N in a list

I have a problem with this algorithm: I have to find pairs in the list
[4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
whose sum equals 12. Once two numbers (elements) have been paired, they cannot be used again.
For now, I have the code below. I have tried to delete the numbers from the list after matching them, but I suspect this causes an indexing issue.
It looks very easy, but it is still not working. ;/
class Pairs():
    def __init__(self, sum, n, arr ):
        self.sum = sum
        self.n = n
        self.arr = arr
    def find_pairs(self):
        self.n = len(self.arr)
        for i in range(0, self.n):
            for j in range(i+1, self.n):
                if (self.arr[i] + self.arr[j] == self.sum):
                    print("[", self.arr[i], ",", " ", self.arr[j], "]", sep = "")
                    self.arr.pop(i)
                    self.arr.pop(j-1)
                    self.n = len(self.arr)
                    i+=1
def Main():
    sum = 12
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    n = len(arr)
    obj_Pairs = Pairs(sum, n, arr)
    obj_Pairs.find_pairs()
if __name__ == "__main__":
    Main()
update:
Thank you guys for the fast answers!
I've tried your solutions, and unfortunately, it is still not exactly what I'm looking for. I know that the expected output should look like this: [4, 8], [0, 12], [1, 11], [4, 8], [12, 0]. So in your first solution, there is still an issue with duplicated elements, and in the second one [4, 8] and [12, 0] are missing. Sorry for not giving output at the beginning.

With this problem you need to keep track of what numbers have already been tried. Python has a Counter class that will hold the count of each of the elements present in a given list.
The algorithm I would use is:
create counter of elements in list
iterate list
for each element, check if (target - element) exists in counter and count of that item > 0
decrement count of element and (target - element)
from collections import Counter
class Pairs():
    def __init__(self, target, arr):
        self.target = target
        self.arr = arr
    def find_pairs(self):
        count_dict = Counter(self.arr)
        result = []
        for num in self.arr:
            if count_dict[num] > 0:
                difference = self.target - num
                if difference in count_dict and count_dict[difference] > 0:
                    result.append([num, difference])
                    count_dict[num] -= 1
                    count_dict[difference] -= 1
        return result
if __name__ == "__main__":
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    obj_Pairs = Pairs(12, arr)
    result = obj_Pairs.find_pairs()
    print(result)
Output:
[[4, 8], [8, 4], [0, 12], [12, 0], [1, 11]]
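One edge case the code above does not seem to cover is a number that is exactly half the target (for example 6 with a target of 12): a single 6 would be paired with itself. A hedged sketch of a variant that requires two remaining copies in that situation (the function name find_pairs_counter is hypothetical):

```python
from collections import Counter

def find_pairs_counter(arr, target):
    counts = Counter(arr)
    result = []
    for num in arr:
        other = target - num
        # A number paired with itself needs two remaining copies
        needed = 2 if other == num else 1
        if counts[num] > 0 and counts[other] >= needed:
            result.append([num, other])
            counts[num] -= 1
            counts[other] -= 1
    return result

print(find_pairs_counter([6, 6, 6, 1], 12))
# [[6, 6]]
```

On the list from the question this returns the same pairs as the class above, since that list contains no 6.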

Brief
If you have learned about hashmaps and linked lists/deques, you can consider using auxiliary space to map values to their indices.
Pro:
It does make the time complexity linear.
Doesn't modify the input
Cons:
Uses extra space
Uses a different strategy from the original. If this is for a class and you haven't learned about the data structures applied then don't use this.
Code
from collections import deque # two-ended linked list
class Pairs():
    def __init__(self, sum, n, arr ):
        self.sum = sum
        self.n = n
        self.arr = arr
    def find_pairs(self):
        mp = {} # take advantage of a map of values to their indices
        res = [] # resultant pair list
        for idx, elm in enumerate(self.arr):
            if mp.get(elm, None) is None:
                mp[elm] = deque() # index list is actually a two-ended linked list
            mp[elm].append(idx) # insert this element
            comp_elm = self.sum - elm # value that matches
            if mp.get(comp_elm, None) is not None and mp[comp_elm]: # there is a match
                # match left->right
                res.append((comp_elm, elm))
                mp[comp_elm].popleft()
                mp[elm].pop()
        for pair in res: # Display
            print("[", pair[0], ",", " ", pair[1], "]", sep = "")
        # in case you want to do further processing
        return res
def Main():
    sum = 12
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    n = len(arr)
    obj_Pairs = Pairs(sum, n, arr)
    obj_Pairs.find_pairs()
if __name__ == "__main__":
    Main()
Output
$ python source.py
[4, 8]
[0, 12]
[4, 8]
[1, 11]
[12, 0]

To fix your code, a few remarks:
If you iterate over a list in a for loop, you shouldn't modify it; use a while loop if you want to modify the underlying list (you can rewrite this solution to use a while loop).
Because the outer loop visits each element only once, you only need to keep track of the elements "popped" in the inner loop.
So the code:
class Pairs():
    def __init__(self, sum, arr ):
        self.sum = sum
        self.arr = arr
        self.n = len(arr)
    def find_pairs(self):
        j_pop = []
        for i in range(0, self.n):
            for j in range(i+1, self.n):
                if (self.arr[i] + self.arr[j] == self.sum) and (j not in j_pop):
                    print("[", self.arr[i], ",", " ", self.arr[j], "]", sep = "")
                    j_pop.append(j)
def Main():
    sum = 12
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    obj_Pairs = Pairs(sum, arr)
    obj_Pairs.find_pairs()
if __name__ == "__main__":
    Main()
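If the goal is the exact output listed in the question's update, one option is to record the indices of both elements of every pair and stop scanning as soon as the current element has found a partner. This is only a sketch under that reading of the requirements; the function name find_pairs_once is made up for the example:

```python
def find_pairs_once(arr, target):
    used = set()   # indices that already belong to a pair
    pairs = []
    for i in range(len(arr)):
        if i in used:
            continue
        for j in range(i + 1, len(arr)):
            if j not in used and arr[i] + arr[j] == target:
                pairs.append([arr[i], arr[j]])
                used.update((i, j))
                break   # arr[i] is now taken, move on to the next i
    return pairs

arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
print(find_pairs_once(arr, 12))
# [[4, 8], [0, 12], [1, 11], [4, 8], [12, 0]]
```

The input list is never modified, so there is no index shifting to worry about, at the cost of a quadratic scan.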

How to compute the difference within each sublist of a list?

I want to create a list containing the difference between the last and the first element of each sublist of a list. Each sublist has at most N elements, so the final list will have roughly len(list)/N + 1 elements.
Here is a little example:
I have a list : [0, 10, 20, 5, 10, 30, 20, 35]. I chose to have a maximum of 3 elements per sublist. I will have the following sublists [0, 10, 20], [5, 10, 30], [20, 35].
Now I apply the difference in each sublist and I get the values 20, 25, 15 (because 20-0, 30-5 and 35-20).
I want the result to be in a list, so to have [20, 25, 15] as a final result.

As requested in the original version of the question, and according to the edits made by the OP, we create a list of increasing sequences, then calculate the differences between the max and min of each of them:
def spans(data):
    sequences = [[data[0]]]
    for val in data[1:]:
        if val >= sequences[-1][-1]:
            sequences[-1].append(val)
        else:
            sequences.append([val])
    return [s[-1] - s[0] for s in sequences]
Sample run with the OP's data:
data = [0, 10, 20, 5, 10, 30, 20, 35]
print(spans(data))
# [20, 25, 15]
Another one:
print(spans([2, 4, 6, 8, 9, 4, 5, -2, -1, 5, 4]))
# [7, 1, 7, 0]

Here is a script that should help you.
Be careful when the last sublist has only one element: I don't know how you want to handle this case. I treated that element as the result for its sublist, but you may prefer to ignore the last sublist or to use 0 as the result.
Explanations are the comments in the script:
# Initial list
l = [0, 10, 20, 5, 10, 30, 20, 35]
# Number of elements to consider in each sublist.
STEP = 3
# Get the size of the list
length = len(l)
# The result list, empty at the beginning, that will be populated by the differences
result_list = []
# Iterate through sublists of exactly STEP elements
i = 0
while (i+STEP-1)<length:
    result_list.append(l[i+STEP-1]-l[i])
    i += STEP
# Special case for the possible little last sublist
# No difference done and element is kept if there is only one element
if i==length-1:
    result_list.append(l[-1])
# Else, do the difference in the last sublist
elif i<length-1:
    result_list.append(l[-1]-l[i])
Here is the script that takes the max-min of each sublist, as the OP originally asked:
l = [1,2,3,4,5,6,7,8]
n = 3
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
# Create the sublists
grouped_l = list(chunks(l,n))
# Do the max-min on each sublist
res = []
for i in grouped_l:
    res.append(max(i)-min(i))
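For the last-minus-first variant described in the question, the same chunking idea can be written compactly with two list comprehensions. This is a minimal sketch; chunk_spans is a name chosen for the example, and a final sublist with a single element yields 0 here, which is just one of the possible conventions mentioned above:

```python
def chunk_spans(values, size):
    # Cut the list into consecutive chunks of at most `size` elements
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    # Last element minus first element of every chunk
    return [chunk[-1] - chunk[0] for chunk in chunks]

print(chunk_spans([0, 10, 20, 5, 10, 30, 20, 35], 3))
# [20, 25, 15]
```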

Python: split list into indices based on consecutive identical values

Could you advise me on how to write a script that splits a list according to how many times each value occurs? I mean:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
Here there are 4 items equal to 11, 2 equal to 12, 6 equal to 15 and 3 equal to 20.
So I have to split another list, for example range(0, 100), into parts of 4, 2, 6 and 3 elements.
I counted the identical values and wrote a function to split a list, but it doesn't work with a list:
from collections import Counter
from itertools import islice
div=Counter(my_list).values() ##counts same values in the list
def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())
What I need:
Out: ([0,1,2,3],[4,5],[6,7,8,9,10,11], etc...)

You can use enumerate, itertools.groupby, and operator.itemgetter:
In [45]: import itertools
In [46]: import operator
In [47]: [[e[0] for e in d[1]] for d in itertools.groupby(enumerate(my_list), key=operator.itemgetter(1))]
Out[47]: [[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
What this does is as follows:
First it enumerates the items.
It groups them, using the second item in each enumeration tuple (the original value).
In the resulting list per group, it uses the first item in each tuple (the enumeration)
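If the aim is to cut a different list (such as the range(0, 100) mentioned in the question) into pieces of the same sizes, the group lengths from groupby can be fed to islice. A small sketch under that assumption; split_like is a hypothetical helper name:

```python
from itertools import groupby, islice

def split_like(keys, values):
    # Length of every run of consecutive identical keys: 4, 2, 6, 3 for my_list
    sizes = [len(list(group)) for _, group in groupby(keys)]
    it = iter(values)
    # Take that many items from values for each run
    return [list(islice(it, size)) for size in sizes]

my_list = [11, 11, 11, 11, 12, 12, 15, 15, 15, 15, 15, 15, 20, 20, 20]
print(split_like(my_list, range(100)))
# [[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
```

Any values left over after the last run (here 15 to 99) are simply not used.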

A solution in Python 3, if you only want to use Counter:
from collections import Counter
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
count = Counter(my_list)
div= list(count.keys()) # take only keys
div.sort()
l = []
num = 0
for i in div:
    t = []
    for j in range(count[i]): # loop number of times it occurs in the list
        t.append(num)
        num+=1
    l.append(t)
print(l)
Output:
[[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
Alternate Solution using set:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
val = set(my_list) # filter only unique elements
ans = []
num = 0
for i in val:
    temp = []
    for j in range(my_list.count(i)): # loop till number of occurrence of each unique element
        temp.append(num)
        num+=1
    ans.append(temp)
print(ans)
EDIT:
As per the changes requested in the comments by @Protoss Reed, to get the desired output:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
val = list(set(my_list)) # filter only unique elements
val.sort() # because set is not sorted by default
ans = []
index = 0
l2 = [54,21,12,45,78,41,235,7,10,4,1,1,897,5,79]
for i in val:
    temp = []
    for j in range(my_list.count(i)): # loop till number of occurrence of each unique element
        temp.append(l2[index])
        index+=1
    ans.append(temp)
print(ans)
Output:
[[54, 21, 12, 45], [78, 41], [235, 7, 10, 4, 1, 1], [897, 5, 79]]
Here I have to convert the set into a list because a set is not sorted; I think the rest is self-explanatory.
Another solution, for when the input is not always sorted (using OrderedDict):
from collections import OrderedDict
v = OrderedDict({})
my_list=[12,12,11,11,11,11,20,20,20,15,15,15,15,15,15]
l2 = [54,21,12,45,78,41,235,7,10,4,1,1,897,5,79]
for i in my_list: # maintain count in dict
    if i in v:
        v[i]+=1
    else:
        v[i]=1
ans =[]
index = 0
for key,values in v.items():
    temp = []
    for j in range(values):
        temp.append(l2[index])
        index+=1
    ans.append(temp)
print(ans)
Output:
[[54, 21], [12, 45, 78, 41], [235, 7, 10], [4, 1, 1, 897, 5, 79]]
Here I use OrderedDict to maintain the order of the input sequence, which is arbitrary (unpredictable) in the case of a set.
That said, I prefer @Ami Tavory's solution, which is more Pythonic.
[Extra work: if anybody can convert this solution into a list comprehension, that would be awesome. I tried but could not manage it; if you succeed, please post it in the comments, as it will help me understand.]
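On the list-comprehension request: one way that seems to work is to precompute the end offset of every group with itertools.accumulate and then slice. This is only a sketch; it assumes Python 3.7+, where Counter keeps keys in first-seen order, and it assumes equal values are adjacent in my_list:

```python
from collections import Counter
from itertools import accumulate

my_list = [11, 11, 11, 11, 12, 12, 15, 15, 15, 15, 15, 15, 20, 20, 20]
l2 = [54, 21, 12, 45, 78, 41, 235, 7, 10, 4, 1, 1, 897, 5, 79]

sizes = list(Counter(my_list).values())   # [4, 2, 6, 3]
ends = list(accumulate(sizes))            # [4, 6, 12, 15]
# Each group is the slice that ends at `end` and is `size` items long
ans = [l2[end - size:end] for size, end in zip(sizes, ends)]
print(ans)
# [[54, 21, 12, 45], [78, 41], [235, 7, 10, 4, 1, 1], [897, 5, 79]]
```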

Remove items from a list in Python based on previous items in the same list

Say I have a simple list of numbers, e.g.
simple_list = range(100)
I would like to shorten this list such that the gaps between the values are greater than or equal to 5 for example, so it should look like
[0, 5, 10...]
FYI the actual list does not have regular increments but it is ordered
I'm trying to use list comprehension to do it but the below obviously returns an empty list:
simple_list2 = [x for x in simple_list if x-simple_list[max(0,x-1)] >= 5]
I could do it in a loop by appending to a list if the condition is met but I'm wondering specifically if there is a way to do it using list comprehension?

This is not a use case for a comprehension. You have to use a loop, because any number of consecutive elements may lie less than five apart; you cannot just check the next element, or the next n elements, unless you know the data has a very specific format:
simple_list = list(range(100))
def f(l):
    it = iter(l)
    i = next(it)
    for ele in it:
        if abs(ele - i) >= 5:
            yield i
            i = ele
    yield i
simple_list[:] = f(simple_list)
print(simple_list)
[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
A better example to use would be:
l = [1, 2, 2, 2, 3, 3, 3, 10, 12, 13, 13, 18, 24]
l[:] = f(l)
print(l)
Which would return:
[1, 10, 18, 24]
If your data is always in ascending order, you can drop the abs and just test if ele - i >= 5.

If I understand your question correctly, which I'm not sure I do (please clarify), you can do this easily. Assume that a is the list you want to process.
[v for i,v in enumerate(a) if abs(a[i] - a[i - 1]) >= 5]
This gives all elements whose difference from the previous one (or should it be the next?) is greater than or equal to 5. There are some variations of this, depending on what you need. Should the first element be excluded from the comparison? The previous implementation compares it with index -1 (the last element) and includes it if the criterion is met; this one excludes it from the result:
[v for i,v in enumerate(a) if i != 0 and abs(a[i] - a[i - 1]) >= 5]
On the other hand, should it always be included? Then use this:
[v for i,v in enumerate(a) if (i != 0 and abs(a[i] - a[i - 1]) >= 5) or (i == 0)]
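Note that comparing each element with its immediate neighbour is not the same as comparing it with the last element that was kept, which is what the question's [0, 5, 10...] example appears to want. A short sketch of the difference, assuming ascending input (the helper name thin is hypothetical):

```python
def thin(values, gap):
    kept = []
    for v in values:
        # Compare against the last value that was kept, not the previous element
        if not kept or v - kept[-1] >= gap:
            kept.append(v)
    return kept

a = list(range(20))
neighbours = [v for i, v in enumerate(a) if i == 0 or a[i] - a[i - 1] >= 5]
print(neighbours)   # [0]
print(thin(a, 5))   # [0, 5, 10, 15]
```

With regularly spaced data the neighbour test never fires after the first element, because consecutive values are always exactly 1 apart.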

Python - summing and grouping through a list

I have a big list of numbers like so:
a = [133000, 126000, 123000, 108000, 96700, 96500, 93800,
93200, 92100, 90000, 88600, 87000, 84300, 82400, 80700,
79900, 79000, 78800, 76100, 75000, 15300, 15200, 15100,
8660, 8640, 8620, 8530, 2590, 2590, 2580, 2550, 2540, 2540,
2510, 2510, 1290, 1280, 1280, 1280, 1280, 951, 948, 948,
947, 946, 945, 609, 602, 600, 599, 592, 592, 592, 591, 583]
What I want to do is cycle through this list one by one checking if a value is above a certain threshold (for example 40000). If it is above this threshold we put that value in a new list and forget about it. Otherwise we wait until the sum of the values is above the threshold and when it is we put the values in a list and then continue cycling. At the end, if the final values don't sum to the threshold we just add them to the last list.
If I'm not being clear consider the simple example, with the threshold being 15
[20, 10, 9, 8, 8, 7, 6, 2, 1]
The final list should look like this:
[[20], [10, 9], [8, 8], [7, 6, 2, 1]]
I'm really bad at maths and python and I'm at my wits end. I have some basic code I came up with but it doesn't really work:
def sortthislist(list):
    list = a
    newlist = []
    for i in range(len(list)):
        while sum(list[i]) >= 40000:
            newlist.append(list[i])
    return newlist
Any help at all would be greatly appreciated. Sorry for the long post.

The function below will accept your input list and a limit to check against, and then output the grouped list:
a = [20, 10, 9, 8, 8, 7, 6, 2, 1]
def func(a, lim):
    out = []
    temp = []
    for i in a:
        if i > lim:
            out.append([i])
        else:
            temp.append(i)
            if sum(temp) > lim:
                out.append(temp)
                temp = []
    return out
print(func(a, 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
With Python you can iterate over the list itself rather than over its indices, which is why I use for i in a rather than for i in range(len(a)).
Within the function out is the list that you want to return at the end; temp is a temporary list that is populated with numbers until the sum of temp exceeds your lim value, at which point this temp is then appended to out and replaced with an empty list.
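The question also asks that trailing values which never reach the threshold be added to the last list, whereas func, as far as I can tell, drops them. A possible variant is sketched below; group_by_threshold is a made-up name, and it keeps a running total instead of calling sum() on every step:

```python
def group_by_threshold(values, limit):
    groups, current, total = [], [], 0
    for v in values:
        current.append(v)
        total += v
        if total > limit:
            # The current group is now above the limit, so close it
            groups.append(current)
            current, total = [], 0
    if current:
        # Leftovers join the last group, or form the only group if none exists
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups

print(group_by_threshold([20, 10, 9, 8, 8, 7, 6, 2, 1], 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
print(group_by_threshold([20, 10, 3], 15))
# [[20, 10, 3]]
```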

def group(L, threshold):
    answer = []
    start = 0
    sofar = L[0]
    for i,num in enumerate(L[1:],1):
        if sofar >= threshold:
            answer.append(L[start:i])
            sofar = L[i]
            start = i
        else:
            sofar += L[i]
    if i<len(L) and sofar>=threshold:
        answer.append(L[i:])
    return answer
Output:
In [4]: group([20, 10, 9, 8, 8, 7, 6, 2, 1], 15)
Out[4]: [[20], [10, 9], [8, 8], [7, 6, 2]]

Hope this will help :)
vlist = [20, 10, 3, 9, 7, 6, 5, 4]
threshold = 15
result = []
tmp = []
for v in vlist:
    if v > threshold:
        tmp.append(v)
        result.append(tmp)
        tmp = []
    elif sum(tmp) + v > threshold:
        tmp.append(v)
        result.append(tmp)
        tmp = []
    else:
        tmp.append(v)
if tmp != []:
    result.append(tmp)
Here is the result:
[[20], [10, 3, 9], [7, 6, 5], [4]]

Here's yet another way:
def group_by_sum(a, lim):
    out = []
    group = None
    for i in a:
        if group is None:
            group = []
            out.append(group)
        group.append(i)
        if sum(group) > lim:
            group = None
    return out
print(group_by_sum(a, 15))

We already have plenty of working answers, but here are two other approaches.
We can use itertools.groupby to collect such groups, given a stateful accumulator that understands the contents of the group. We end up with a set of (key,group) pairs, so some additional filtering gets us only the groups. Additionally since itertools provides iterators, we convert them to lists for printing.
from itertools import groupby
class Thresholder:
    def __init__(self, threshold):
        self.threshold=threshold
        self.sum=0
        self.group=0
    def __call__(self, value):
        if self.sum>self.threshold:
            self.sum=value
            self.group+=1
        else:
            self.sum+=value
        return self.group
print([list(g) for k,g in groupby([20, 10, 9, 8, 8, 7, 6, 2, 1], Thresholder(15))])
The operation can also be done as a single reduce call:
from functools import reduce
def accumulator(result, value):
    last=result[-1]
    if sum(last)>threshold:
        result.append([value])
    else:
        last.append(value)
    return result
threshold=15
print(reduce(accumulator, [20, 10, 9, 8, 8, 7, 6, 2, 1], [[]]))
This version scales poorly to many values due to the repeated call to sum(), and the global variable for the threshold is rather clumsy. Also, calling it for an empty list will still leave one empty group.
Edit: The question logic demands that values above the threshold get put in their own groups (not sharing with collected smaller values). I did not think of that while writing these versions, but the accepted answer by Ffisegydd handles it. There is no effective difference if the input data is sorted in descending order, as all the sample data appears to be.
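The drawbacks noted for the reduce version (repeated sum() calls, a global threshold, an empty group for empty input) can probably be avoided by carrying the running total in the reduce state and closing over the threshold. A hedged sketch; group_with_reduce and step are names invented for this example:

```python
from functools import reduce

def group_with_reduce(values, threshold):
    def step(state, value):
        groups, running = state
        if not groups or running > threshold:
            # Start a new group once the previous one has passed the threshold
            groups.append([value])
            running = value
        else:
            groups[-1].append(value)
            running += value
        return groups, running

    groups, _ = reduce(step, values, ([], 0))
    return groups

print(group_with_reduce([20, 10, 9, 8, 8, 7, 6, 2, 1], 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
```

Like the versions above, this does not give values above the threshold a group of their own when they follow smaller collected values.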
