Writing a faster implementation for insert() in Python

Essentially, I need to write a faster implementation as a replacement for insert() to insert an element in a particular position in a list.
The inputs are given in a list as [(index, value), (index, value), (index, value)]
For example, doing this to insert 10,000 elements into a 1,000,000-element list takes about 2.7 seconds:
def do_insertions_simple(l, insertions):
    """Performs the insertions specified into l.
    :param l: list in which to do the insertions. It is not modified.
    :param insertions: list of pairs (i, x), indicating that x should
        be inserted at position i.
    """
    r = list(l)
    for i, x in insertions:
        r.insert(i, x)
    return r
My assignment asks me to speed up the time taken to complete the insertions by 8x or more
My current implementation:
def do_insertions_fast(l, insertions):
    """Implement here a faster version of do_insertions_simple """
    # insert insertions[x][i] at l[i]
    result = list(l)
    for x, y in insertions:
        result = result[:x] + list(y) + result[x:]
    return result
Sample input:
import string
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
insertions = [(0, 'a'), (2, 'b'), (2, 'b'), (7, 'c')]
r1 = do_insertions_simple(l, insertions)
r2 = do_insertions_fast(l, insertions)
print("r1:", r1)
print("r2:", r2)
assert_equal(r1, r2)
is_correct = False
for _ in range(20):
    l, insertions = generate_testing_case(list_len=100, num_insertions=20)
    r1 = do_insertions_simple(l, insertions)
    r2 = do_insertions_fast(l, insertions)
    assert_equal(r1, r2)
is_correct = True
The error I'm getting while running the above code:
r1: ['a', 0, 'b', 'b', 1, 2, 3, 'c', 4, 5, 6, 7, 8, 9]
r2: ['a', 0, 'b', 'b', 1, 2, 3, 'c', 4, 5, 6, 7, 8, 9]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-54e0c44a8801> in <module>()
12 l, insertions = generate_testing_case(list_len=100, num_insertions=20)
13 r1 = do_insertions_simple(l, insertions)
---> 14 r2 = do_insertions_fast(l, insertions)
15 assert_equal(r1, r2)
16 is_correct = True
<ipython-input-7-b421ee7cc58f> in do_insertions_fast(l, insertions)
4 result=list(l)
5 for x,y in insertions:
----> 6 result = result[:x]+list(y)+result[x:]
7 return result
8 #raise NotImplementedError()
TypeError: 'float' object is not iterable
The file is using the nose framework to check my answers, so if there are any functions you don't recognize, they're probably from that framework.
I know that it is inserting the elements correctly, but it keeps raising the error "'float' object is not iterable".
I've also tried a different method that did work (slice the list, add the element, then add the rest of the list and update it), but that was 10 times slower than insert().
I'm not sure how to continue.
Edit: I've been looking at the entire question wrong; for now I'll try to do it myself, but if I'm stuck again I'll ask a different question and link it here.

From your question, emphasis mine:
I need to write a faster implementation as a replacement for insert() to insert an element in a particular position in a list
You won't be able to. If there were a faster way, the existing insert() function would already use it. Anything you write yourself will not come close to its speed.
What you can do is write a faster way to do multiple insertions.
Let's look at an example with two insertions:
>>> a = list(range(15))
>>> a.insert(5, 'X')
>>> a.insert(10, 'Y')
>>> a
[0, 1, 2, 3, 4, 'X', 5, 6, 7, 8, 'Y', 9, 10, 11, 12, 13, 14]
Since every insert shifts all values to the right of it, this in general is an O(m*(n+m)) time algorithm, where n is the original size of the list and m is the number of insertions.
Another way to do it is to build the result piece by piece, taking the insertion points into account:
>>> a = list(range(15))
>>> b = []
>>> b.extend(a[:5])
>>> b.append('X')
>>> b.extend(a[5:9])
>>> b.append('Y')
>>> b.extend(a[9:])
>>> b
[0, 1, 2, 3, 4, 'X', 5, 6, 7, 8, 'Y', 9, 10, 11, 12, 13, 14]
This is O(n+m) time, as all values are copied just once and there's no shifting. It's just somewhat tricky to determine the correct piece lengths, as earlier insertions affect later ones, especially if the insertion indexes aren't sorted (in that case it would also take O(m log m) additional time to sort them). That's why I had to use a[5:9] and a[9:] above instead of a[5:10] and a[10:].
(Yes, I know, extend/append internally copy some more if the capacity is exhausted, but if you understand things enough to point that out, then you also understand that it doesn't matter :-)
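For concreteness, here is a minimal sketch of that piece-by-piece idea (my own illustration, not part of the original answer). It copies original items into the output lazily and falls back to a plain insert for any insertion point that lands inside the part already built, so its output always matches do_insertions_simple:
def do_insertions_merge(l, insertions):
    """Sketch only: build the result piece by piece instead of shifting."""
    result = []
    taken = 0                                 # items of l already copied out
    for i, x in insertions:
        if i >= len(result):
            need = i - len(result)            # originals that must precede x
            result.extend(l[taken:taken + need])
            taken += need
            result.append(x)
        else:
            # insertion point lies inside the already-built prefix,
            # so fall back to a plain insert on that (hopefully short) prefix
            result.insert(i, x)
    result.extend(l[taken:])
    return result
This is fast when most insertion points fall at or beyond the material already emitted; if many of them land inside the built prefix, it degrades back towards the simple version.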

One option is to use a different data structure, which supports faster insertions.
The obvious suggestion would be a binary tree of some sort. You can insert nodes into a balanced binary tree in O(log n) time, so long as you're able to find the right insertion point in O(log n) time. A solution to that is for each node to store and maintain its own subtree's cardinality; then you can find a node by index without iterating through the whole tree. Another possibility is a skip list, which supports insertion in O(log n) average time.
However, the problem is that you are writing in Python, so you have a major disadvantage trying to write something faster than the built-in list.insert method, because that's implemented in C, and Python code is a lot slower than C code. It's not unusual to write an O(log n) algorithm in Python that only beats the built-in O(n) implementation for very large n, and even n = 1,000,000 may not be large enough to win by a factor of 8 or more. This could mean a lot of wasted effort if you try implementing your own data structure and it turns out not to be fast enough.
I think the expected solution for this assignment will be something like Heap Overflow's answer. That said, there is another way to approach this question which is worth considering because it avoids the complications of working out the correct indices to insert at if you do the insertions out of order. My idea is to take advantage of the efficiency of list.insert but to call it on shorter lists.
If the data is still stored in Python lists, then the list.insert method can still be used to get the efficiency of a C implementation, but if the lists are shorter then the insert method will be faster. Since you only need to win by a constant factor, you can divide the input list into, say, 256 sublists of roughly equal size. Then for each insertion, insert it at the correct index in the correct sublist; and finally join the sublists back together again. The time complexity is O(nm), which is the same as the "naive" solution, but it has a lower constant factor.
To compute the correct insertion index we need to subtract the lengths of the sublists to the left of the one we're inserting in; we can store the cumulative sublist lengths in a prefix sum array, and update this array efficiently using numpy. Here's my implementation:
from itertools import islice, chain, accumulate
import numpy as np

def do_insertions_split(lst, insertions, num_sublists=256):
    n = len(lst)
    sublist_len = n // num_sublists
    lst_iter = iter(lst)
    sublists = [list(islice(lst_iter, sublist_len)) for i in range(num_sublists - 1)]
    sublists.append(list(lst_iter))
    lens = [0]
    lens.extend(accumulate(len(s) for s in sublists))
    lens = np.array(lens)
    for idx, val in insertions:
        # could use binary search, but num_sublists is small
        j = np.argmax(lens >= idx)
        sublists[j-1].insert(idx - lens[j-1], val)
        lens[j:] += 1
    return list(chain.from_iterable(sublists))
It is not as fast as @iz_'s implementation (linked from the comments), but it beats the simple algorithm by a factor of almost 20, which is sufficient according to the problem statement. The times below were measured using timeit on a list of length 1,000,000 with 10,000 insertions.
simple -> 2.1252768037122087 seconds
iz -> 0.041302349785668824 seconds
split -> 0.10893724981304054 seconds
Note that my solution still loses to @iz_'s by a factor of about 2.5. However, @iz_'s solution requires the insertion points to be sorted, whereas mine works even when they are unsorted:
from random import randint

lst = list(range(1_000_000))
insertions = [(randint(0, len(lst)), "x") for _ in range(10_000)]
# uncomment if the insertion points should be sorted
# insertions.sort()
r1 = do_insertions_simple(lst, insertions)
r2 = do_insertions_iz(lst, insertions)
r3 = do_insertions_split(lst, insertions)
if r1 != r2: print('iz failed')     # prints
if r1 != r3: print('split failed')  # doesn't print
Here is my timing code, in case anyone else wants to compare. I tried a few different values for num_sublists; anything between 200 and 1,000 seemed to be about equally good.
from timeit import timeit

algorithms = {
    'simple': do_insertions_simple,
    'iz': do_insertions_iz,
    'split': do_insertions_split,
}
reps = 10
for name, func in algorithms.items():
    t = timeit(lambda: func(lst, insertions), number=reps) / reps
    print(name, '->', t, 'seconds')

list(y) attempts to iterate over y and build a list of its elements. If y is a number (an int or a float), it is not iterable, and you get the error you mentioned. You instead probably want a list literal containing just y, like so: [y]
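Applied to the loop body from the question, the fix is a small change (shown here only to illustrate the point):
result = result[:x] + [y] + result[x:]  # [y] is a one-element list; list(y) would try to iterate y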

Related

Increment all entries in an array by 'n' without a for loop

I have an array:
arr = [5,5,5,5,5,5]
I want to increment a particular range in the arr by 'n'. So if n=2 and the range is [2,5].
The array should look like this:
arr = [5,5,7,7,7,5]
I need to do this without a for loop, for a problem I'm trying to solve.
Tried:
arr[2:5] = [n]*3
but that obviously replaces the entries and becomes:
arr = [5,5,2,2,2,5]
Any suggestions would be highly appreciated.
n = 2
arr_range = slice(2, 5)
arr = [5,5,7,7,7,5]
arr[arr_range] = map(lambda x: x+n, arr[arr_range])
# arr
# [5, 5, 9, 9, 9, 5]
But I would recommend using numpy...
import numpy as np
n = 2
arr_range = slice(2, 5)
arr = np.array([5,5,7,7,7,5])
arr[arr_range] += n
You actually have a list, not an array. If you convert it to a Numpy array it is simple.
>>> n=3
>>> arr = np.array([5,5,5,5,5,5])
>>> arr[2:5] += n
>>> arr
array([5, 5, 8, 8, 8, 5])
You have basically two options (for code see below):
Use slice assignment via a list comprehension (a[:] = [x+1 for x in a]),
Use a for-loop (even though you exclude this in your question, I don't see a legitimate reason for doing so).
They come with pros and cons. Let's assume you are going to replace some fraction of the list items (as opposed to a fixed number of items). The for-loop runs in Python and hence might be slower, but it has O(1) memory usage. The list comprehension and slice assignment both operate in C (assuming you are using CPython), but they have O(N) memory usage due to the temporary list.
Using a generator doesn't buy anything since it is converted to a list anyway before the assignment happens (this is necessary because if the generator had fewer or more items than the slice, the list would need to be resized accordingly; see the source code).
Using a map adds even more overhead since it needs to call the mapped function on every item.
The following is a performance comparison of the different methods. The for-loop is fastest for very small lists since it has minimal overhead (just the range object). For more than about a dozen items, the list comprehension clearly outperforms the other methods and especially for larger lists (len(a) > 3e5) the difference to the generator becomes noticeable (the generator cannot provide information about its size, so the generated list needs to be resized as more items are fetched). For very large lists the difference between for-loop and list comprehension seems to shrink again since the memory overhead tends to outweigh the loop cost, but reaching that point would require unusually large lists (where you'd be better off using something like Numpy anyway).
This is the code using the perfplot package:
import numpy
import perfplot

def use_generator(a):
    i = slice(0, len(a)//2)
    a[i] = (x+1 for x in a[i])

def use_map(a):
    i = slice(0, len(a)//2)
    a[i] = map(lambda x: x+1, a[i])

def use_list(a):
    i = slice(0, len(a)//2)
    a[i] = [x+1 for x in a[i]]

def use_loop(a):
    for i in range(len(a)//2):
        a[i] += 1

perfplot.show(
    setup=lambda n: [0]*n,
    kernels=[use_generator, use_map, use_list, use_loop],
    n_range=[2**k for k in range(1, 26)],
    xlabel="len(a)",
    equality_check=None,
)

Best way to directly generate a sublist from a python iterator

It is easy to convert an entire iterator sequence into a list using list(iterator), but what is the best/fastest way to directly create a sublist from an iterator without first creating the entire list, i.e. how to best create list(iterator)[m:n] without first creating the entire list?
It seems obvious that it should not* (at least not always) be possible to do so directly for m > 0, but it should be for n less than the length of the sequence. [p for i,p in zip(range(n), iterator)] comes to mind, but is that the best way?
The context is simple: Creating the entire list would cause a RAM overflow, so it needs to be broken down. So how do you do this efficiently and/or python-ic-ly?
*The list comprehension I mentioned could obviously be used for m > 0 by calling next(iterator) m times prior to execution, but I don't enjoy the lack of python-ness here.
itertools.islice:
from itertools import islice
itr = (i for i in range(10))
m, n = 3, 8
result = list(islice(itr, m, n))
print(result)
# [3, 4, 5, 6, 7]
In addition, you can add an argument as the step if you wanted:
itr = (i for i in range(10))
m, n, step = 3, 8, 2
result = list(islice(itr, m, n, step))
print(result)
# [3, 5, 7]

How to sum variable-size parts of a collection?

I want to calculate the sum of a collection, for sections of different sizes:
d = (1, 2, 3, 4, 5, 6, 7, 8, 9)
sz = (2, 3, 4)
# here I expect 1+2=3, 3+4+5=12, 6+7+8+9=30
itd = iter(d)
result = tuple( sum(tuple(next(itd) for i in range(s))) for s in sz )
print("result = {}".format(result))
I wonder whether the solution I came up with is the most 'pythonic' (elegant, readable, concise) way to achieve what I want...
In particular, I wonder whether there is a way to get rid of the separate iterator 'itd', and whether it would be easier to work with slices?
I would use itertools.islice since you can directly use the values in sz as the step size at each point:
>>> from itertools import islice
>>> it=iter(d)
>>> [sum(islice(it,s)) for s in sz]
[3, 12, 30]
Then you can convert that to a tuple if needed.
The iter is certainly needed in order to step through the tuple at the point where the last slice left off. Otherwise each slice would be d[0:s]
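For illustration, slicing d directly instead of consuming the shared iterator would restart from index 0 each time and give the wrong sums:
>>> [sum(d[0:s]) for s in sz]
[3, 6, 10]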
There's no reason to get rid of your iterator – iterating over d is what you are doing, after all.
You do seem to have an overabundance of tuples in that code, though. The line that's doing all the work could be made more legible by getting rid of them:
it = iter(d)
result = [sum(next(it) for _ in range(s)) for s in sz]
# [3, 12, 30]
… which has the added advantage that now you're producing a list rather than a tuple. d and sz also make more sense as lists, by the way: they're variable-length sequences of homogeneous data, not fixed-length sequences of heterogeneous data.
Note also that it is the conventional name for an arbitrary iterator, and _ is the conventional name for any variable that must exist but is never actually used.
Going a little further, next(it) for _ in range(s) is doing the same work that islice() could do more legibly:
from itertools import islice
it = iter(d)
result = [sum(islice(it, s)) for s in sz]
# [3, 12, 30]
… at which point, I'd say the code's about as elegant, readable and concise as it's likely to get.

Return tuple of any two items in list, which if summed equal given int

For example, assume a given list of ints:
int_list = list(range(-10,10))
[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
What is the most efficient way to find if any given two values in int_list sum to equal a given int, say 2?
I was asked this in a technical phone interview this morning on how to efficiently handle this scenario with an int_list of say, 100 million items (I rambled and had no good answer :/).
My first idea was:
from itertools import combinations
int_list = list(range(-10,10))
combo_list = list(combinations(int_list, 2))
desired_int = 4
filtered_tuples = list(filter(lambda x: sum(x) == desired_int, combo_list))
filtered_tuples
[(-5, 9), (-4, 8), (-3, 7), (-2, 6), (-1, 5), (0, 4), (1, 3)]
Which doesn't even work with a range of only range(-10000, 10000)
Also, does anyone know of a good online Python performance testing tool?
For any integer A there is at most one integer B that will sum together to equal integer N. It seems easier to go through the list, do the arithmetic, and do a membership test to see if B is in the set.
int_list = set(range(-500000, 500000))
TARGET_NUM = 2

def filter_tuples(int_list, target):
    for int_ in int_list:
        other_num = target - int_
        if other_num in int_list:
            yield (int_, other_num)

filtered_tuples = filter_tuples(int_list, TARGET_NUM)
Note that this will duplicate results. E.g. (-2, 4) is a separate response from (4, -2). You can fix this by changing your function:
def filter_tuples(int_list, target):
    # iterate over a copy so the set can shrink inside the loop
    for int_ in list(int_list):
        other_num = target - int_
        if int_ in int_list and other_num in int_list:
            int_list.remove(int_)
            int_list.discard(other_num)  # discard in case other_num == int_
            yield (int_, other_num)
EDIT: See my other answer for an even better approach (with caveats).
What is the most efficient way to find if any given two values in int_list sum to equal a given int, say 2?
My first inclination was to do it with the itertools module's combinations and the short-cutting power of any, but it could be quite slower than Adam's approach:
>>> import itertools
>>> int_list = list(range(-10,10))
>>> any(i + j == 2 for i, j in itertools.combinations(int_list, 2))
True
Seems to be fairly responsive for larger ranges:
>>> any(i + j == 2 for i, j in itertools.combinations(xrange(-10000,10000), 2))
True
>>> any(i + j == 2 for i, j in itertools.combinations(xrange(-1000000,1000000), 2))
True
Takes about 10 seconds on my machine:
>>> any(i + j == 2 for i, j in itertools.combinations(xrange(-10000000,10000000), 2))
True
A more literal approach using math:
Assume a given list of ints:
int_list = list(range(-10,10)) ... [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
What is the most efficient way to find if any given two values in int_list sum to equal a given int, say 2? ... how to efficiently handle this scenario with an int_list of say, 100 million items.
It's clear that we can deduce the requirements that we can apply a single parameter, n, for the range of integers, of the form range(-n, n), which means every integer from negative n up to but not including positive n. From there the requirements are clearly to whether some number, x, is a sum of any two integers in that range.
Any such range can be trivially shown to contain a pair that sum to any number in that range and n-1 beyond it, so it's a waste of computing power to search for it.
def x_is_sum_of_2_diff_numbers_in_range(x, n):
    if isinstance(x, int) and isinstance(n, int):
        return -(n*2) < x < (n - 1)*2
    else:
        raise ValueError('args x and n must be ints')
Computes nearly instantly:
>>> x_is_sum_of_2_diff_numbers_in_range(2, 1000000000000000000000000000)
True
Testing the edge-cases:
def main():
    print(x_is_sum_of_2_diff_numbers_in_range(x=5, n=4))   # True
    print(x_is_sum_of_2_diff_numbers_in_range(x=6, n=4))   # False
    print(x_is_sum_of_2_diff_numbers_in_range(x=-7, n=4))  # True
    print(x_is_sum_of_2_diff_numbers_in_range(x=-8, n=4))  # False
EDIT:
Since I can see that a more generalized version of this problem (where the list could contain any given numbers) is a common one, I can see why some people have a preconceived approach to this, but I still stand by my interpretation of this question's requirements, and I consider this answer the best approach for this more specific case.
I would have thought that any solution that depends on a doubly nested iteration over the list (albeit having the inner loop concealed by a nifty Python function) is O(n^2).
It is worth considering sorting the input. For any reasonable comparison-based sort, this will be O(n.lg(n)), which is already better than O(n^2). You might do better with a radix sort or pre-sort (making something like a bucket sort) depending on the range of the input list.
Having sorted the input, it is an O(n) operation to find a pair of numbers that sum to any given number, so your overall complexity is O(n.lg(n)).
In practice, it's an open question whether, for the stipulated “large number” of elements, a brute-force O(n^2) algorithm with nice cache behavior (zipping through arrays in order) would outperform the asymptotically better algorithm that moves a lot of data around, but eventually the one with the lower asymptotic complexity will win.
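A minimal sketch of the sort-then-scan idea (my own illustration; has_pair_with_sum is a hypothetical helper, not from any of the answers above): after sorting, walk one pointer from each end and move them inward depending on whether the current sum is too small or too large.
def has_pair_with_sum(sorted_nums, target):
    """Return True if two elements of the sorted list sum to target."""
    lo, hi = 0, len(sorted_nums) - 1
    while lo < hi:
        s = sorted_nums[lo] + sorted_nums[hi]
        if s == target:
            return True
        if s < target:
            lo += 1   # need a bigger sum
        else:
            hi -= 1   # need a smaller sum
    return False

# e.g. has_pair_with_sum(sorted(int_list), 2)
This scan is O(n) on its own; the sort that precedes it dominates at O(n log n).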

Python: Fast extraction of intersections among all possible 2-combinations in a large number of lists

I have a dataset of ca. 9K lists of variable length (1 to 100K elements). I need to calculate the length of the intersection of all possible 2-list combinations in this dataset. Note that elements in each list are unique so they can be stored as sets in python.
What is the most efficient way to perform this in python?
Edit I forgot to specify that I need to have the ability to match the intersection values to the corresponding pair of lists. Thanks everybody for the prompt response and apologies for the confusion!
If your sets are stored in s, for example:
s = [set([1, 2]), set([1, 3]), set([1, 2, 3]), set([2, 4])]
Then you can use itertools.combinations to take them two by two, and calculate the intersection (note that, as Alex pointed out, combinations is only available since version 2.6). Here with a list comprehension (just for the sake of the example):
from itertools import combinations
[ i[0] & i[1] for i in combinations(s,2) ]
Or, in a loop, which is probably what you need:
for i in combinations(s, 2):
    inter = i[0] & i[1]
    # process the intersection set result "inter"
So, to have the length of each one of them, that "processing" would be:
l = len(inter)
This would be quite efficient, since it's using iterators to compute every combinations, and does not prepare all of them in advance.
Edit: Note that with this method, each set in the list "s" can actually be something else that returns a set, like a generator. The list itself could simply be a generator if you are short on memory. It could be much slower though, depending on how you generate these elements, but you wouldn't need to have the whole list of sets in memory at the same time (not that it should be a problem in your case).
For example, if each set is made from a function gen:
def gen(parameter):
    while more_sets():
        # ... some code to generate the next set 'x'
        yield x

with open("results", "wt") as f_results:
    for i in combinations(gen("data"), 2):
        inter = i[0] & i[1]
        f_results.write("%d\n" % len(inter))
Edit 2: How to collect indices (following redrat's comment).
Besides the quick solution I answered in comment, a more efficient way to collect the set indices would be to have a list of (index, set) instead of a list of set.
Example with new format:
s = [(0, set([1, 2])), (1, set([1, 3])), (2, set([1, 2, 3]))]
If you are building this list to calculate the combinations anyway, it should be simple to adapt to your new requirements. The main loop becomes:
with open("results", "wt") as f_results:
for i in combinations(s, 2):
inter = i[0][1] & i[1][1]
f_results.write("length of %d & %d: %d\n" % (i[0][0],i[1][0],len(inter))
In the loop, i[0] and i[1] would be a tuple (index, set), so i[0][1] is the first set, i[0][0] its index.
As you need to produce roughly N*N/2 results, i.e., O(N squared) outputs, no approach can be less than O(N squared) -- in any language, of course. (N is "about 9K" in your question). So, I see nothing intrinsically faster than (a) making the N sets you need, and (b) iterating over them to produce the output -- i.e., the simplest approach. IOW:
def lotsofintersections(manylists):
    manysets = [set(x) for x in manylists]
    moresets = list(manysets)
    for s in reversed(manysets):
        moresets.pop()
        for z in moresets:
            yield s & z
This code's already trying to add some minor optimization (e.g. by avoiding slicing or popping off the front of lists, which might add other O(N squared) factors).
If you have many cores and/or nodes available and are looking for parallel algorithms, it's a different case of course -- if that's your case, can you mention the kind of cluster you have, its size, how nodes and cores can best communicate, and so forth?
Edit: as the OP has casually mentioned in a comment (!) that they actually need the numbers of the sets being intersected (really, why omit such crucial parts of the specs?! at least edit the question to clarify them...), this would only require changing this to:
L = len(manysets)
for i, s in enumerate(reversed(manysets)):
    moresets.pop()
    for j, z in enumerate(moresets):
        yield L - i, j + 1, s & z
(if you need to "count from 1" for the progressive identifiers -- otherwise obvious change).
But if that's part of the specs you might as well use simpler code -- forget moresets, and:
L = len(manysets)
for i in range(L):
    s = manysets[i]
    for j in range(i + 1, L):
        yield i, j, s & manysets[j]
this time assuming you want to "count from 0" instead, just for variety;-)
Try this:
from functools import reduce  # reduce is not a builtin on Python 3

_lists = [[1, 2, 3, 7], [1, 3], [1, 2, 3], [1, 3, 4, 7]]
_sets = map(set, _lists)
_intersection = reduce(set.intersection, _sets)
And to obtain the indexes:
_idxs = [ map(_i.index, _intersection ) for _i in _lists ]
PS: Sorry I misunderstood the question
