Python heapq: split and merge into an ordered heapq

I want to split two heapqs (used as priority queues), then combine them and have the resulting heapq ordered with respect to both of the original heapqs.
Is this possible in python?
My current code:
from heapq import heappush, heappop

population = []
for i in range(0, 6):
    heappush(population, i)

new_population = []
for i in range(4, 9):
    heappush(new_population, i)

split_index = len(population) // 2
temp_population = population[:split_index]
population = new_population[:split_index] + temp_population
print(population)
print(heappop(population))
Output:
[4, 5, 6, 0, 1, 2]
4
Wanted output:
[0, 1, 2, 4, 5, 6]
0

Use nlargest instead of slicing, then reheapify the combined lists.
from heapq import nlargest, heapify, heappop

n = len(population) // 2
population = nlargest(n, population) + nlargest(n, new_population)
heapify(population)  # heapify works in place and returns None
print(heappop(population))
You may want to benchmark, though, whether sorting the two original lists and then merging the results is faster. Python's sort routine is very fast on nearly sorted input, and it may impose a lot less overhead than the heapq functions. The final heapify step may not be necessary if you don't actually need a priority queue (since the merged result is already sorted).
from itertools import islice
from heapq import merge, heapify

n = len(population)  # == len(new_population), presumably
population = list(islice(merge(sorted(population), sorted(new_population)), n))
heapify(population)  # optional: only needed if you still want a heap

Related

Split list into N sublists with approximately equal sums

I have a list of integers, and I need to split it into a given number of sublists (with no restrictions on order or the number of elements in each), in a way that minimizes the average difference in sums of each sublist.
For example:
>>> x = [4, 9, 1, 5]
>>> sublist_creator(x, 2)
[[9], [4, 1, 5]]
because list(map(sum, sublist_creator(x, 2))) yields [9, 10], minimizing the average distance. Alternatively, [[9, 1], [4, 5]] would have been equally correct, and my use case has no preference between two possibilities.
The only way I can think of to do this is by checking, iteratively, all possible combinations, but I'm working with a list of ~5000 elements and need to split it into ~30 sublists, so that approach is prohibitively expensive.
Here's the outline:
1. create N empty lists
2. sort() your input array in ascending order
3. pop() the last element from the sorted array
4. append() the popped element to the list with the lowest sum() of its elements
5. repeat 3 and 4 until the input array is empty
6. profit!!!
With M = 5000 elements and N = 30 lists, this approach takes about O(N*M) time if you carefully store the intermediate sums of the sublists instead of recalculating them from scratch every time.
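For concreteness, here is a minimal sketch of that outline (the name sublist_creator_greedy and the plain linear scan for the smallest running total are my own choices; the heap-based answers below do the same lookup more efficiently):
def sublist_creator_greedy(x, n):
    sublists = [[] for _ in range(n)]
    sums = [0] * n                            # running totals, so step 4 never re-sums
    for value in sorted(x, reverse=True):     # steps 2-3: take the largest remaining value
        i = sums.index(min(sums))             # step 4: the sublist with the lowest total (O(N) scan)
        sublists[i].append(value)
        sums[i] += value
    return sublists
For the example in the question this returns [[9, 1], [5, 4]], one of the two equally good splits.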
@lenik's solution has the right idea, but you can use a heap queue that keeps track of the total of each sub-list and its index, in sorted order, to reduce the cost of finding the sub-list with the minimum total to O(log n), resulting in an overall O(m log n) time complexity:
import heapq

def sublist_creator(lst, n):
    lists = [[] for _ in range(n)]
    totals = [(0, i) for i in range(n)]
    heapq.heapify(totals)
    for value in lst:
        total, index = heapq.heappop(totals)
        lists[index].append(value)
        heapq.heappush(totals, (total + value, index))
    return lists
so that:
sublist_creator(x, 2)
returns:
[[4, 1, 5], [9]]
Implementation of @lenik's idea using Python's underrated priority queue module, heapq. This follows his idea pretty much exactly, except that each list is given a first element that holds its running sum. Since lists are compared lexicographically and heapq is a min-heap implementation, all we have to do is strip off those first elements when we're done.
Using heapreplace will help avoid unnecessary resizing operations during the updates.
from heapq import heapreplace

def sublist_creator(x, n, sort=True):
    bins = [[0] for _ in range(n)]
    if sort:
        x = sorted(x, reverse=True)   # largest values first, per the outline above
    for i in x:
        least = bins[0]               # the bin with the smallest running sum
        least[0] += i
        least.append(i)
        heapreplace(bins, least)      # re-sift the mutated root
    return [x[1:] for x in bins]
Given M = len(x) and N = n, the sort is O(M log M) and the loop does M insertions, which are O(log N) worst case. So for M >= N, we can say that asymptotically the algorithm is O(M log M). If the array is pre-sorted, it's O(M log N).
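For reference, a quick check with the sample input from the question (the exact ordering of the bins may differ, but the sums give the desired 9/10 split):
>>> sublist_creator([4, 9, 1, 5], 2)
[[9], [5, 4, 1]]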

Writing a faster implementation for insert() in python

Essentially, I need to write a faster implementation as a replacement for insert() to insert an element in a particular position in a list.
The inputs are given in a list as [(index, value), (index, value), (index, value)]
For example, doing this to insert 10,000 elements into a 1,000,000-element list takes about 2.7 seconds:
def do_insertions_simple(l, insertions):
    """Performs the insertions specified into l.
    @param l: list in which to do the insertions. It is not modified.
    @param insertions: list of pairs (i, x), indicating that x should
        be inserted at position i.
    """
    r = list(l)
    for i, x in insertions:
        r.insert(i, x)
    return r
My assignment asks me to make the insertions run at least 8x faster.
My current implementation:
def do_insertions_fast(l, insertions):
    """Implement here a faster version of do_insertions_simple """
    # insert insertions[x][i] at l[i]
    result = list(l)
    for x, y in insertions:
        result = result[:x] + list(y) + result[x:]
    return result
Sample input:
import string

l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
insertions = [(0, 'a'), (2, 'b'), (2, 'b'), (7, 'c')]

r1 = do_insertions_simple(l, insertions)
r2 = do_insertions_fast(l, insertions)
print("r1:", r1)
print("r2:", r2)
assert_equal(r1, r2)

is_correct = False
for _ in range(20):
    l, insertions = generate_testing_case(list_len=100, num_insertions=20)
    r1 = do_insertions_simple(l, insertions)
    r2 = do_insertions_fast(l, insertions)
    assert_equal(r1, r2)
    is_correct = True
The error I'm getting while running the above code:
r1: ['a', 0, 'b', 'b', 1, 2, 3, 'c', 4, 5, 6, 7, 8, 9]
r2: ['a', 0, 'b', 'b', 1, 2, 3, 'c', 4, 5, 6, 7, 8, 9]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-54e0c44a8801> in <module>()
12 l, insertions = generate_testing_case(list_len=100, num_insertions=20)
13 r1 = do_insertions_simple(l, insertions)
---> 14 r2 = do_insertions_fast(l, insertions)
15 assert_equal(r1, r2)
16 is_correct = True
<ipython-input-7-b421ee7cc58f> in do_insertions_fast(l, insertions)
4 result=list(l)
5 for x,y in insertions:
----> 6 result = result[:x]+list(y)+result[x:]
7 return result
8 #raise NotImplementedError()
TypeError: 'float' object is not iterable
The file uses the nose framework to check my answers, so if there are any functions you don't recognize, they're probably from that framework.
I know that it is inserting the lists correctly; however, it keeps raising the error "'float' object is not iterable".
I've also tried a different method which did work (sliced the list, added the element, appended the rest of the list, and then updated the list), but that was 10 times slower than insert().
I'm not sure how to continue.
edit: I've been looking at the entire question wrong, for now I'll try to do it myself but if I'm stuck again I'll ask a different question and link that here
From your question, emphasis mine:
I need to write a faster implementation as a replacement for insert() to insert an element in a particular position in a list
You won't be able to. If there were a faster way, the existing insert() function would already use it. Anything you write in pure Python will not even get close to its speed.
What you can do is write a faster way to do multiple insertions.
Let's look at an example with two insertions:
>>> a = list(range(15))
>>> a.insert(5, 'X')
>>> a.insert(10, 'Y')
>>> a
[0, 1, 2, 3, 4, 'X', 5, 6, 7, 8, 'Y', 9, 10, 11, 12, 13, 14]
Since every insert shifts all values to the right of it, this in general is an O(m*(n+m)) time algorithm, where n is the original size of the list and m is the number of insertions.
Another way to do it is to build the result piece by piece, taking the insertion points into account:
>>> a = list(range(15))
>>> b = []
>>> b.extend(a[:5])
>>> b.append('X')
>>> b.extend(a[5:9])
>>> b.append('Y')
>>> b.extend(a[9:])
>>> b
[0, 1, 2, 3, 4, 'X', 5, 6, 7, 8, 'Y', 9, 10, 11, 12, 13, 14]
This is O(n+m) time, as all values are copied just once and there's no shifting. It's just somewhat tricky to determine the correct piece lengths, as earlier insertions affect later ones, especially if the insertion indexes aren't sorted (in which case it would also take O(m log m) additional time to sort them). That's why I had to use a[5:9] and a[9:] instead of a[5:10] and a[10:].
(Yes, I know, extend/append internally copy some more if the capacity is exhausted, but if you understand things enough to point that out, then you also understand that it doesn't matter :-)
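As a rough sketch of this piece-by-piece idea, assuming the insertion indices are strictly increasing (as in the example above), each index can be mapped back into the original list by subtracting the number of earlier insertions; the function name do_insertions_piecewise is mine:
def do_insertions_piecewise(l, insertions):
    # assumes the insertion indices are strictly increasing
    result = []
    copied = 0                    # how much of the original list has been copied so far
    for k, (i, x) in enumerate(insertions):
        orig = i - k              # k earlier insertions have already shifted positions right
        result.extend(l[copied:orig])
        result.append(x)
        copied = orig
    result.extend(l[copied:])
    return result
Calling do_insertions_piecewise(list(range(15)), [(5, 'X'), (10, 'Y')]) reproduces the result shown above.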
One option is to use a different data structure, which supports faster insertions.
The obvious suggestion would be a binary tree of some sort. You can insert nodes into a balanced binary tree in O(log n) time, so long as you're able to find the right insertion point in O(log n) time. A solution to that is for each node to store and maintain its own subtree's cardinality; then you can find a node by index without iterating through the whole tree. Another possibility is a skip list, which supports insertion in O(log n) average time.
However, the problem is that you are writing in Python, so you have a major disadvantage trying to write something faster than the built-in list.insert method, because that's implemented in C, and Python code is a lot slower than C code. It's not unusual to write an O(log n) algorithm in Python that only beats the built-in O(n) implementation for very large n, and even n = 1,000,000 may not be large enough to win by a factor of 8 or more. This could mean a lot of wasted effort if you try implementing your own data structure and it turns out not to be fast enough.
I think the expected solution for this assignment will be something like Heap Overflow's answer. That said, there is another way to approach this question which is worth considering because it avoids the complications of working out the correct indices to insert at if you do the insertions out of order. My idea is to take advantage of the efficiency of list.insert but to call it on shorter lists.
If the data is still stored in Python lists, then the list.insert method can still be used to get the efficiency of a C implementation, but if the lists are shorter then the insert method will be faster. Since you only need to win by a constant factor, you can divide the input list into, say, 256 sublists of roughly equal size. Then for each insertion, insert it at the correct index in the correct sublist; and finally join the sublists back together again. The time complexity is O(nm), which is the same as the "naive" solution, but it has a lower constant factor.
To compute the correct insertion index we need to subtract the lengths of the sublists to the left of the one we're inserting in; we can store the cumulative sublist lengths in a prefix sum array, and update this array efficiently using numpy. Here's my implementation:
from itertools import islice, chain, accumulate
import numpy as np

def do_insertions_split(lst, insertions, num_sublists=256):
    n = len(lst)
    sublist_len = n // num_sublists
    lst_iter = iter(lst)
    sublists = [list(islice(lst_iter, sublist_len)) for i in range(num_sublists - 1)]
    sublists.append(list(lst_iter))
    lens = [0]
    lens.extend(accumulate(len(s) for s in sublists))
    lens = np.array(lens)
    for idx, val in insertions:
        # could use binary search, but num_sublists is small
        j = np.argmax(lens >= idx)
        sublists[j-1].insert(idx - lens[j-1], val)
        lens[j:] += 1
    return list(chain.from_iterable(sublists))
It is not as fast as @iz_'s implementation (linked from the comments), but it beats the simple algorithm by a factor of almost 20, which is sufficient according to the problem statement. The times below were measured using timeit on a list of length 1,000,000 with 10,000 insertions.
simple -> 2.1252768037122087 seconds
iz -> 0.041302349785668824 seconds
split -> 0.10893724981304054 seconds
Note that my solution still loses to @iz_'s by a factor of about 2.5. However, @iz_'s solution requires the insertion points to be sorted, whereas mine works even when they are unsorted:
from random import randint

lst = list(range(1_000_000))
insertions = [(randint(0, len(lst)), "x") for _ in range(10_000)]

# uncomment if the insertion points should be sorted
# insertions.sort()

r1 = do_insertions_simple(lst, insertions)
r2 = do_insertions_iz(lst, insertions)
r3 = do_insertions_split(lst, insertions)

if r1 != r2: print('iz failed') # prints
if r1 != r3: print('split failed') # doesn't print
Here is my timing code, in case anyone else wants to compare. I tried a few different values for num_sublists; anything between 200 and 1,000 seemed to be about equally good.
from timeit import timeit

algorithms = {
    'simple': do_insertions_simple,
    'iz': do_insertions_iz,
    'split': do_insertions_split,
}
reps = 10
for name, func in algorithms.items():
    t = timeit(lambda: func(lst, insertions), number=reps) / reps
    print(name, '->', t, 'seconds')
list(y) attempts to iterate over y and create a list of its elements. If y is an integer, it is not iterable, and you get the error you mentioned. You instead probably want to create a list literal containing y, like so: [y]
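Applied to the loop from the question, the fix looks like this (it removes the TypeError, though each iteration still copies the whole list, so it stays slow):
result = list(l)
for x, y in insertions:
    result = result[:x] + [y] + result[x:]  # [y] wraps the value instead of iterating over it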

Best way to directly generate a sublist from a python iterator

It is easy to convert an entire iterator sequence into a list using list(iterator), but what is the best/fastest way to directly create a sublist from an iterator without first creating the entire list, i.e. how to best create list(iterator)[m:n] without first creating the entire list?
It seems obvious that it should not* (at least not always) be possible to do so directly for m > 0, but it should be for n less than the length of the sequence. [p for i,p in zip(range(n), iterator)] comes to mind, but is that the best way?
The context is simple: Creating the entire list would cause a RAM overflow, so it needs to be broken down. So how do you do this efficiently and/or python-ic-ly?
*The list comprehension I mentioned could obviously be used for m > 0 by calling next(iterator) m times prior to execution, but I don't enjoy the lack of python-ness here.
itertools.islice:
from itertools import islice
itr = (i for i in range(10))
m, n = 3, 8
result = list(islice(itr, m, n))
print(result)
# [3, 4, 5, 6, 7]
In addition, you can pass a step argument if you want:
itr = (i for i in range(10))
m, n, step = 3, 8, 2
result = list(islice(itr, m, n, step))
print(result)
# [3, 5, 7]
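One detail worth noting (my addition, not part of the original answer): islice only advances the iterator as far as it needs to, so the skipped prefix is never stored as a list and the remaining items stay available afterwards:
from itertools import islice

itr = iter(range(10))
print(list(islice(itr, 3, 8)))  # [3, 4, 5, 6, 7]
print(next(itr))                # 8 -- iteration resumes right after the slice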

compute suffix maximums using itertools.accumulate

What's the recommended way to compute the suffix maximums of a sequence of integers?
Following is the brute-force approach (O(n**2) time), based on the problem definition:
>>> A
[9, 9, 4, 3, 6]
>>> [max(A[i:]) for i in range(len(A))]
[9, 9, 6, 6, 6]
One O(n) approach using itertools.accumulate() is the following, which uses two list constructors:
>>> A
[9, 9, 4, 3, 6]
>>> list(reversed(list(itertools.accumulate(reversed(A), max))))
[9, 9, 6, 6, 6]
Is there a more pythonic way to do this?
Slice-reversal makes things more concise and less nested:
list(itertools.accumulate(A[::-1], max))[::-1]
It's still something you'd want to bundle up into a function, though:
from itertools import accumulate

def suffix_maximums(l):
    return list(accumulate(l[::-1], max))[::-1]
If you're using NumPy, you'd want numpy.maximum.accumulate:
import numpy

def numpy_suffix_maximums(array):
    return numpy.maximum.accumulate(array[::-1])[::-1]
Personally when I think "Pythonic" I think "simple and easy-to-read", so here's my Pythonic version:
def suffix_max(a_list):
    last_max = a_list[-1]
    maxes = []
    for n in reversed(a_list):
        last_max = max(n, last_max)
        maxes.append(last_max)
    return list(reversed(maxes))
For what it's worth, this looks to be about 50% slower than the itertools.accumulate approach, but we're talking 25ms vs 17ms for a list of 100,000 ints, so it may not much matter.
If speed is the utmost concern and the range of numbers you expect to see is significantly smaller than the length of the list you're working with, it might be worth using run-length encoding (RLE):
def suffix_max_rle(a_list):
    last_max = a_list[-1]
    count = 1
    max_counts = []
    for n in a_list[-2::-1]:
        if n <= last_max:
            count += 1
        else:
            max_counts.append([last_max, count])
            last_max = n
            count = 1
    max_counts.append([last_max, count])  # flush the final run
    return list(reversed(max_counts))
This is about 4 times faster than the above, and about 2.5 times faster than the itertools approach, for a list of 100,000 ints in the range 0-10000. Provided, again, that your range of numbers is significantly smaller than the length of your lists, it will take less memory, too.
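To make the output format concrete, here is what the RLE version returns for the question's sample list; each pair is [suffix maximum, run length], so it expands back to [9, 9, 6, 6, 6]:
>>> A = [9, 9, 4, 3, 6]
>>> suffix_max_rle(A)
[[9, 2], [6, 3]]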

using python itertools to manage nested for loops

I am trying to use itertools.product to manage the bookkeeping of some nested for loops, where the number of nested loops is not known in advance. Below is a specific example where I have chosen two nested for loops; the choice of two is only for clarity, what I need is a solution that works for an arbitrary number of loops.
This question provides an extension/generalization of the question appearing here:
Efficient algorithm for evaluating a 1-d array of functions on a same-length 1d numpy array
Now I am extending the above technique using an itertools trick I learned here:
Iterating over an unknown number of nested loops in python
Preamble:
from itertools import product

def trivial_functional(i, j): return lambda x : (i+j)*x

idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]

func_table = []
for items in product(*joint):
    f = trivial_functional(*items)
    func_table.append(f)
At the end of the above itertools loop, I have a 12-element, 1-d array of functions, func_table, each element having been built from the trivial_functional.
Question:
Suppose I am given a pair of integers, (i_1, i_2), where these integers are to be interpreted as the indices of idx1 and idx2, respectively. How can I use itertools.product to determine the correct corresponding element of the func_table array?
I know how to hack the answer by writing my own function that mimics the itertools.product bookkeeping, but surely there is a built-in feature of itertools.product that is intended for exactly this purpose?
I don't know of a way of calculating the flat index other than doing it yourself. Fortunately this isn't that difficult:
def product_flat_index(factors, indices):
    if len(factors) == 1:
        return indices[0]
    # the last index varies fastest, so it contributes directly;
    # everything before it is scaled by the length of the last factor
    return product_flat_index(factors[:-1], indices[:-1]) * len(factors[-1]) + indices[-1]

>>> product_flat_index(joint, (2, 1))
7
An alternative approach is to store the results in a nested array in the first place, making translation unnecessary, though this is more complex:
from functools import reduce
from operator import getitem, setitem, itemgetter

def get_items(container, indices):
    return reduce(getitem, indices, container)

def set_items(container, indices, value):
    c = reduce(getitem, indices[:-1], container)
    setitem(c, indices[-1], value)

def initialize_table(lengths):
    if len(lengths) == 1: return [0] * lengths[0]
    subtable = initialize_table(lengths[1:])
    return [subtable[:] for _ in range(lengths[0])]

func_table = initialize_table(list(map(len, joint)))
for items in product(*map(enumerate, joint)):
    f = trivial_functional(*map(itemgetter(1), items))
    set_items(func_table, list(map(itemgetter(0), items)), f)

>>> get_items(func_table, (2, 1)) # same as func_table[2][1]
<function>
So numerous answers were quite useful, thanks to everyone for the solutions.
It turns out that if I recast the problem slightly with Numpy, I can accomplish the same bookkeeping, and solve the problem I was trying to solve with vastly improved speed relative to pure python solutions. The trick is just to use Numpy's reshape method together with the normal multi-dimensional array indexing syntax.
Here's how this works. We just convert func_table into a Numpy array, and reshape it:
import numpy as np

func_table = np.array(func_table)
component_dimensions = [len(idx1), len(idx2)]
func_table = func_table.reshape(component_dimensions)
Now func_table can be used to return the correct function not just for a single 2d point, but for a full array of 2d points:
dim1_pts = [3,1,2,1,3,3,1,3,0]
dim2_pts = [0,1,2,1,2,0,1,2,1]
func_array = func_table[dim1_pts, dim2_pts]
As usual, Numpy to the rescue!
This is a little messy, but here you go:
from itertools import product

def trivial_functional(i, j): return lambda x : (i+j)*x

idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [enumerate(idx1), enumerate(idx2)]

func_map = {}
for indexes, items in map(lambda x: zip(*x), product(*joint)):
    f = trivial_functional(*items)
    func_map[indexes] = f

print(func_map[(2, 0)](5)) # 40 = (3+5)*5
I'd suggest using enumerate() in the right place:
from itertools import product

def trivial_functional(i, j): return lambda x : (i+j)*x

idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]

func_table = []
for items in product(*joint):
    f = trivial_functional(*items)
    func_table.append(f)
From what I understood from your comments and your code, func_table is simply indexed by the occurrence of a certain input in the sequence. You can access it again using:
for index, items in enumerate(product(*joint)):
    # because of the append(), index is now the
    # position of the function created from the
    # respective tuple in product(*joint)
    func_table[index](some_value)
