speed up list iteration bottleneck

I have a bottleneck in a piece of code that is ruining the performance of my code. I re-wrote the section, but, after timing it, things didn't improve.
The problem is as follows. Given a list of fixed-length-lists of integers
data = [[1,2,3], [3,2,1], [8,1,0], [1,3,4]]
For each column, I need to append the index of each sublist to a separate list as many times as the value that sublist holds in that column. There is one such list per column in the data.
For instance, for the above data, there will be three resulting lists since the sub-lists have three columns.
There are 4 sublists, so we expect the numbers 0-3 to appear in each of the final lists.
We expect the following three lists to be generated from the above data
[[0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3],
[0, 0, 1, 1, 2, 3, 3, 3],
[0, 0, 0, 1, 3, 3, 3, 3]]
I have two ways of doing this:
# Approach 1: nested loops with append()
processed_data = list([] for _ in range(len(data[0])))
for n in range(len(data)):
    sub_list = data[n]
    for k, proc_list in enumerate(processed_data):
        for _ in range(sub_list[k]):
            proc_list.append(n)

# Approach 2: one list comprehension per column
processed_data = []
for i, col in enumerate(zip(*data)):
    processed_data.append([j for j,count in enumerate(col) for _ in range(count)])
The average size of the data list is around 100,000.
Is there a way I can speed this up?

You can't improve the computational complexity of your algorithm unless you're able to tweak the output format (see below). In other words, you'll at best be able to improve the speed by a modest percentage (and the percentage will be independent of the size of the input).
I don't see any obvious implementation issues. The one idea I had was to get rid of the large number of append() calls, and the overhead incurred by gradual list expansion, by preallocating the output matrix, but @juanpa.arrivillaga suggests in their comment that append() is in fact heavily optimized in CPython. If you're on another interpreter, you could try it: you know that the length of the output list for column c equals the sum of all the input numbers in column c. So you can preallocate each output list with [0] * sum_of_input_values_at_column_c, and then do proc_list[i] = n instead of proc_list.append(n) (incrementing i manually). This does, however, require two passes over the input, so it might not actually be an improvement: your problem is quite memory-intensive, as its core computation is extremely simple.
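A minimal sketch of the preallocation idea, assuming every count is a non-negative integer. The function name `expand_preallocated` and the `positions` list are made up for illustration, and whether this beats append() on a given interpreter would have to be timed:

```python
def expand_preallocated(data):
    """Build one index list per column by writing into preallocated lists."""
    n_cols = len(data[0])
    # First pass: the output length for column c is the sum of column c.
    sizes = [sum(col) for col in zip(*data)]
    processed = [[0] * size for size in sizes]
    positions = [0] * n_cols  # next free slot in each output list (hypothetical helper)
    # Second pass: write index n into the next sub_list[k] slots of column k.
    for n, sub_list in enumerate(data):
        for k, count in enumerate(sub_list):
            start = positions[k]
            proc_list = processed[k]
            for i in range(start, start + count):
                proc_list[i] = n
            positions[k] = start + count
    return processed


data = [[1, 2, 3], [3, 2, 1], [8, 1, 0], [1, 3, 4]]
result = expand_preallocated(data)
print(result)
```

For the sample data this produces the same three lists as the question's two approaches.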
The reason that you can't improve the computational complexity is that it is already optimal: any algorithm needs to spend time generating its output, so the size of the output is a lower bound on the running time. In your case, the size of the output equals the sum of the values in your input matrix (and it's generally considered bad when the running time depends on the input values themselves rather than on the number of input values). That is also the number of iterations your algorithm performs, so it is optimal.
However, if the output of this function is going to reside in memory to be consumed by another function (rather than being written to a file), and you are able to make some adaptations in that function, you could instead output a matrix of generators, where each generator knows that it needs to generate sub_list[k] occurrences of n. Then the complexity of your algorithm becomes proportional to the size of the input matrix (but consuming the output will still take the same amount of time that it would have taken to generate the full output).
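One way the "matrix of generators" could look, as a rough sketch: `lazy_expand` is a hypothetical name, and itertools.repeat is used here as a stand-in for a hand-written generator. Each cell is a one-shot iterator, so it can be consumed only once:

```python
from itertools import repeat


def lazy_expand(data):
    """Return lazy[k][n]: an iterator that yields n exactly data[n][k] times."""
    return [
        [repeat(n, count) for n, count in enumerate(col)]
        for col in zip(*data)
    ]


data = [[1, 2, 3], [3, 2, 1], [8, 1, 0], [1, 3, 4]]
lazy = lazy_expand(data)  # cost proportional to the size of the input matrix

# Consuming a column still costs time proportional to that column's output.
column0 = [n for cell in lazy[0] for n in cell]
print(column0)
```

Building `lazy` touches each input cell once; the expansion work is deferred to whoever iterates over the cells.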

Perhaps itertools can make this go faster for you by minimizing the amount of python code inside loops:
data = [[1,2,3], [3,2,1], [8,1,0], [1,3,4]]
from itertools import chain,repeat,starmap
result = [ list(chain.from_iterable(starmap(repeat,r)))
           for r in map(enumerate,zip(*data)) ]
print(result)
[[0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3],
[0, 0, 1, 1, 2, 3, 3, 3],
[0, 0, 0, 1, 3, 3, 3, 3]]
If you're processing the output in the same order as the result's rows come out, you can convert this to a generator and use it directly in your main process:
iResult = ( chain.from_iterable(starmap(repeat,r))
            for r in map(enumerate,zip(*data)) )
for iRow in iResult: # iRow is also an iterator
    for resultItem in iRow:
        # Perform your item processing here
        print(resultItem, end=" ")
    print()
0 1 1 1 2 2 2 2 2 2 2 2 3
0 0 1 1 2 3 3 3
0 0 0 1 3 3 3 3
This avoids creating and storing the lists of indexes altogether (i.e. it removes that bottleneck), but only if you process the result sequentially.

Related

How to Partition an array into 2 arrays with equal sums

We have an array of integers that has to be partitioned into 2 arrays. My goal is not just to say whether it's possible or not; it has to return the 2 arrays as output.
Input = [ 1, 2, 3, 4, 6]
output = [1, 3, 4] [2, 6]
Both the arrays need to have the same sum. In this case, it is 8 for both arrays. All the elements should be used and no integers should repeat again in the output.
This is how I am trying.
def partition(nums):
    if sum(nums) % 2:
        return "Not possible"
    target = (sum(nums))/2
    possible = set()
    possible.add(0)
    for i in range(len(nums)):
        next = set()
        for t in possible:
            next.add(t + nums[i])
            if t + nums[i] == target:
                sub = [t, nums[i]]
                print(sub)
            next.add(t)
        possible = next

nums = [1, 2, 3, 4, 6]
print(partition(nums))
This code repeats the same elements and makes an array like [4,4]. I don't understand what to do to stop that.
I am a newbie. So you can completely rewrite it and come up with your own technique. Is it even possible to do something like that?

One approach is to use the knapsack (subset-sum) algorithm: the knapsack has to hold a weight of exactly (total sum)/2. Pick the items whose weights add up to (total sum)/2; the remaining items will then have the same total weight.
Another approach is backtracking: run through the list looking for a combination of numbers that sums to (total sum)/2, and return it as soon as one is found. This will be inefficient, though.
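A minimal sketch of the knapsack/subset-sum idea, assuming all numbers are non-negative integers. The name `partition_equal_sum` and the `reachable` dictionary are illustrative choices. Tracking indices rather than values is what prevents the same element from being used twice:

```python
def partition_equal_sum(nums):
    """Split nums into two lists with equal sums, or return None if impossible."""
    total = sum(nums)
    if total % 2:
        return None
    target = total // 2
    # reachable maps a sum to the indices of one subset that produces it.
    reachable = {0: []}
    for i, value in enumerate(nums):
        # Iterate over a snapshot so nums[i] is added at most once per subset.
        for s, indices in list(reachable.items()):
            new_sum = s + value
            if new_sum <= target and new_sum not in reachable:
                reachable[new_sum] = indices + [i]
    if target not in reachable:
        return None
    chosen = set(reachable[target])
    first = [v for i, v in enumerate(nums) if i in chosen]
    second = [v for i, v in enumerate(nums) if i not in chosen]
    return first, second


print(partition_equal_sum([1, 2, 3, 4, 6]))  # ([1, 3, 4], [2, 6])
```

The dictionary holds at most target + 1 entries, so the work is bounded by roughly len(nums) * target steps, ignoring the cost of copying the index lists.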

Python LIFO list/array - shifting data, to replace first input with newest value

My goal is to find the highest high in a set of price data. However, I'm currently struggling to append data to a list in LIFO order (last in, first out) in my for loop, which loops through a large set of data. So for example:
I have a list []
append to list item by item in for loop: list [1, 2, 3, 4, 5]
Once I have reached the desired list length (in this case 5), I want to shift everything down whilst deleting '1'. For example, it could go to [2, 3, 4, 5, 1] and then replace '1' with '6', resulting in [2, 3, 4, 5, 6]
highs_list = np.array([])
list_length = 50
for current in range(1, len(df.index)):
    previous = current - 1
    if len(highs_list) < list_length:
        np.append(highs_list, df.loc[current, 'high'])
    else:
        np.roll(highs_list, -1)
        highs_list[49] = df.loc[current, 'high']

If you insert 1, 2, 3, 4, 5 and then want to remove 1 to insert 6 then this seems to be a FIFO movement, since the First IN (1) is the First OUT.
Anyhow, standard Python lists allow for this by using append() and pop(0), but the in-memory shift of the elements has time complexity O(n).
A much more efficient tool for this task is collections.deque, which also provides a rotate method to allow exactly the [1,2,3,4,5] => [2,3,4,5,1] transformation.
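A short sketch of how collections.deque could be used for this. The `maxlen` argument goes beyond what the text above mentions: it makes the deque drop its oldest item automatically once it is full. The window size of 5 and the sample prices are illustrative:

```python
from collections import deque

window = deque(maxlen=5)  # oldest item is dropped automatically when full
for price in [1, 2, 3, 4, 5, 6]:
    window.append(price)
    highest = max(window)  # highest high within the current window

print(list(window))  # [2, 3, 4, 5, 6]

# rotate(-1) performs the [1, 2, 3, 4, 5] -> [2, 3, 4, 5, 1] shift.
d = deque([1, 2, 3, 4, 5])
d.rotate(-1)
print(list(d))  # [2, 3, 4, 5, 1]
```

With `maxlen` set, no explicit shift or overwrite is needed; append() alone keeps the window at the desired length.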

Function Failing at Large List Sizes

I have a question: Starting with a 1-indexed array of zeros and a list of operations, for each operation add a value to each array element between two given indices, inclusive. Once all operations have been performed, return the maximum value in the array.
Example: n = 10, Queries = [[1,5,3],[4,8,7],[6,9,1]]
The array looks like this after each operation (indices 1-5 get 3 added, and so on):
[0,0,0, 0, 0,0,0,0,0, 0]
[3,3,3, 3, 3,0,0,0,0, 0]
[3,3,3,10,10,7,7,7,0, 0]
[3,3,3,10,10,8,8,8,1, 0]
Finally you output the max value in the final list:
[3,3,3,10,10,8,8,8,1, 0]
My current solution:
def Operations(size, Array):
    ResultArray = [0]*size
    Values = [[i.pop(2)] for i in Array]
    for index, i in enumerate(Array):
        #Current Values in = Sum between the current values in the Results Array AND the added operation of equal length
        #Results Array
        ResultArray[i[0]-1:i[1]] = list(map(sum, zip(ResultArray[i[0]-1:i[1]], Values[index]*len(ResultArray[i[0]-1:i[1]]))))
    Result = max(ResultArray)
    return Result

def main():
    nm = input().split()
    n = int(nm[0])
    m = int(nm[1])
    queries = []
    for _ in range(m):
        queries.append(list(map(int, input().rstrip().split())))
    result = Operations(n, queries)

if __name__ == "__main__":
    main()
Example input: The first line contains two space-separated integers n and m, the size of the array and the number of operations.
Each of the next m lines contains three space-separated integers a,b and k, the left index, right index and summand.
5 3
1 2 100
2 5 100
3 4 100
Compiler Error at Large Sizes:
Runtime Error
Currently this solution works for smaller final lists of length 4000, but in other test cases where length = 10,000,000 it fails. I do not know why this is the case, and I cannot provide the example input since it is so massive. Is there any clear reason why it would fail in larger cases?

I think the problem is that you make too many intermediate throwaway lists here:
ResultArray[i[0]-1:i[1]] = list(map(sum, zip(ResultArray[i[0]-1:i[1]], Values[index]*len(ResultArray[i[0]-1:i[1]]))))
The slice ResultArray[i[0]-1:i[1]] creates a new list, and you build it twice, once just to get its length, which is a complete waste of resources. Then you make another list with Values[index]*len(...), and finally you collect the sums into yet another list, which is also thrown away once it has been assigned into the original. That makes 4 throwaway lists. For example, say the slice has 5,000,000 elements: you are then allocating 4 of those, or 20,000,000 extra elements, 15,000,000 of which you don't really need. And if your original list has 10,000,000 elements, well, just do the math...
You can get the same result as your list(map(...)) with a list comprehension like
[v+Values[index][0] for v in ResultArray[i[0]-1:i[1]] ]
Now we use two fewer lists, and we can drop one more by making it a generator expression, given that slice assignment does not require a list specifically, just something iterable
(v+Values[index][0] for v in ResultArray[i[0]-1:i[1]] )
I don't know whether slice assignment internally turns it into a list first, but hopefully it doesn't, and with that we are back to just one extra list.
Here is an example
>>> a=[0]*10
>>> a
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> a[1:5] = (3+v for v in a[1:5])
>>> a
[0, 3, 3, 3, 3, 0, 0, 0, 0, 0]
>>>
We can reduce it to zero extra lists (assuming that internally it doesn't make one) by using itertools.islice
>>> import itertools
>>> a[3:7] = (1+v for v in itertools.islice(a,3,7))
>>> a
[0, 3, 3, 4, 4, 1, 1, 0, 0, 0]
>>>
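Putting the islice idea into the whole function might look like the sketch below. The lower-case name `operations` and the tuple unpacking of each query are choices made for this illustration. As far as I can tell, CPython reads the whole right-hand iterable before it modifies the list, which is why reading from the same list is safe here, but treat that as an implementation detail:

```python
from itertools import islice


def operations(size, queries):
    """Apply each (a, b, k) query to a 1-indexed array of zeros; return the max."""
    result = [0] * size
    for a, b, k in queries:
        # Add k to elements a..b (1-indexed, inclusive) without slicing a copy.
        result[a - 1:b] = (v + k for v in islice(result, a - 1, b))
    return max(result)


print(operations(10, [[1, 5, 3], [4, 8, 7], [6, 9, 1]]))  # 10
```

Unlike the original, this does not pop the third value out of each query, so the caller's list is left unchanged.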

Nested array computations in Python using numpy

I am trying to use numpy in Python in solving my project.
I have a random binary array rndm = [1, 0, 1, 1] and a resource_arr = [[2, 3], 4, 2, [1, 2]]. What I am trying to do is to multiply the array element wise, then get their sum. As an expected output for the sample above,
output = 5 0 2 3. I find it hard to solve such a problem because of the nested array/list.
So far my code looks like this:
def fitness_score():
    output = numpy.add(rndm * resource_arr)
    return output

fitness_score()
I keep getting
ValueError: invalid number of arguments.
I think this is because of the addition that I am trying to do. Any help would be appreciated. Thank you!

Numpy treats its arrays as matrices, and resource_arr is not a (valid) matrix. In your case a python list is more suitable:
def sum_nested(l):
    tmp = []
    for element in l:
        if isinstance(element, list):
            tmp.append(numpy.sum(element))
        else:
            tmp.append(element)
    return tmp
In this function we check for each element inside l if it is a list. If so, we sum its elements. On the other hand, if the encountered element is just a number, we leave it untouched. Please note that this only works for one level of nesting.
Now, if we run sum_nested([[2, 3], 4, 2, [1, 2]]) we will get the values [5, 4, 2, 3]. All that's left is multiplying this result by the elements of rndm, which can be achieved easily using numpy:
def fitness_score(a, b):
    return numpy.multiply(a, sum_nested(b))

Numpy is all about non-jagged arrays. You can do things with jagged arrays, but doing so efficiently and elegantly isn't trivial.
Almost always, finding a way to map your data structure to a non-nested one, for instance by encoding the information as below, will be more flexible and more performant.
resource_arr = (
    [0, 0, 1, 2, 3, 3],
    [2, 3, 4, 2, 1, 2]
)
That is, an integer denoting the 'row' each value belongs to, paired with an array of equal size of the values themselves.
This may 'feel' wasteful when coming from a C-style way of doing arrays (omg more memory consumption), but staying away from nested data structures is almost certainly your best bet in terms of performance, and in terms of how much of the numpy/scipy ecosystem will actually be compatible with your data representation. Whether it really uses more memory is actually rather questionable; every new Python object uses a ton of bytes, so if you have only a few elements per nesting, it is the more memory-efficient solution too.
In this case, that would give you the following efficient solution to your problem:
output = np.bincount(*resource_arr) * rndm

I have not worked much with pandas/numpy, so I'm not sure if this is the most efficient way, but it works (at least for the example you have shown):
import numpy as np
rndm = [1, 0, 1, 1]
resource_arr = [[2, 3], 4, 2, [1, 2]]
multiplied_output = np.multiply(rndm, resource_arr)
print(multiplied_output)
output = []
for elem in multiplied_output:
    output.append(sum(elem)) if isinstance(elem, list) else output.append(elem)
final_output = np.array(output)
print(final_output)

Sorting Function. Explanation

def my_sort(array):
    length_of_array = range(1, len(array))
    for i in length_of_array:
        value = array[i]
        last_value = array[i-1]
        if value<last_value:
            array[i]=last_value
            array[i-1]=value
            my_sort(array)
    return array
I know what the function does in general. It's a sorting algorithm... But I don't know what each individual part/section does.

Well, I have to say that the best way to understand this is to experiment with it, learn what it is using, and, basically, learn Python. :)
However, I'll go through the lines one-by-one to help:
1. Define a function named my_sort that accepts one argument named array. The rest of the lines are contained in this function.
2. Create a range of numbers using range that spans from 1 inclusive to the length of array non-inclusive. Then, assign this range to the variable length_of_array.
3. Start a for-loop that iterates through the range defined in the preceding line. Furthermore, assign each number returned to the variable i. This for-loop encloses lines 4 through 9.
4. Create a variable value that is equal to the item returned by indexing array at position i.
5. Create a variable last_value that is equal to the item returned by indexing array at position i-1.
6. Test if value is less than last_value. If so, run lines 7 through 9.
7. Make the i index of array equal last_value.
8. Make the i-1 index of array equal value.
9. Rerun my_sort recursively, passing in the argument array.
10. Return array for this iteration of the recursive function.
When array is finally sorted, the recursion will end and you will be left with array all nice and sorted.
I hope this shed some light on the subject!

I'll see what I can do for you. The code, for reference:
def my_sort(array):
    length_of_array = range(1, len(array))
    for i in length_of_array:
        value = array[i]
        last_value = array[i-1]
        if value<last_value:
            array[i]=last_value
            array[i-1]=value
            my_sort(array)
    return array
def my_sort(array):
A function that takes an array as an argument.
length_of_array = range(1, len(array))
We set the variable length_of_array to a range of numbers that we can iterate over, based on the number of items in array. I assume you know what range does, but if you don't, in short you can iterate over it in the same way you'd iterate over a list. (In Python 2 you could also use xrange() here.)
for i in length_of_array:
    value = array[i]
    last_value = array[i-1]
What we're doing is using the range to indirectly traverse the array because there's the same total of items in each. If we look closely, though, value uses the i as its index, which starts off at 1, so value is actually array[1], and last_value is array[1-1] or array[0].
if value<last_value:
    array[i]=last_value
    array[i-1]=value
So now we're comparing the values. Let's say we passed in [3, 1, 3, 2, 6, 4]. We're at the first iteration of the loop, so we're essentially saying, if array[1], which is 1, is less than array[0], which is 3, swap them. Of course 1 is less than 3, so swap them we do. But since the code can only compare each item to the previous item, there's no guarantee that array will be properly sorted from lowest to highest. Each iteration could unswap a properly swapped item if the item following it is larger (e.g. [2,5,6,4] will remain the same on the first two iterations -- they will be skipped over by the if test -- but when it hits the third, 6 will swap with 4, which is still wrong). In fact, if we were to finish this out without the call to my_sort(array) directly below it, our original array would evaluate to [1, 3, 2, 3, 4, 6]. Not quite right.
my_sort(array)
So we call my_sort() recursively. What we're basically saying is, if on the first iteration something is wrong, correct it, then pass the new array back to my_sort(). This sounds weird at first, but it works. If the if test was never satisfied at all, that would mean no item in our original list was smaller than the one before it, which is another way (the computer's way, really) of saying it was sorted in ascending order to begin with. That's the key. So if any list item is smaller than the preceding item, we jerk it one index left. But we don't really know if that's correct -- maybe it needs to go further still. So we have to go back to the beginning (i.e., call my_sort() again on our newly-minted list) and recheck to see if we should pull it left again. If we can't, the if test keeps failing (each item is no smaller than the one before it) until it hits the next error. On each iteration, this teases the same smaller number leftward by one index until it's in its correct position. This sounds more confusing than it is, so let's just look at the output for each iteration:
[3, 1, 3, 2, 6, 4]
[1, 3, 3, 2, 6, 4]
[1, 3, 2, 3, 6, 4]
[1, 2, 3, 3, 6, 4]
[1, 2, 3, 3, 4, 6]
Are you seeing what's going on? How about if we only look at what's changing on each iteration:
[3, 1, ... # Wrong; swap. Further work ceases; recur (return to beginning with a fresh call to my_sort()).
[1, 3, 3, 2, ... # Wrong; swap. Further work ceases; recur
[1, 3, 2, ... # Wrong; swap. Further work ceases; recur
[1, 2, 3, 3, 6, 4] # Wrong; swap. Further work ceases; recur
[1, 2, 3, 3, 4, 6] # No number is smaller than the one before it; correct.
This allows the function to call itself as many times as it needs to pull a number from the back to the front. Again, each time it's called, it focuses on the first wrong instance, pulling it one left until it puts it in its proper position. Hope that helps! Let me know if you're still having trouble.
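To watch this happen, here is a small instrumented variant. `my_sort_traced` and the `trace` list are names invented for this sketch; the logic is meant to mirror my_sort, with a tuple swap in place of the two assignments:

```python
def my_sort_traced(array, trace):
    """Same logic as my_sort, but records a copy of the list after every swap."""
    for i in range(1, len(array)):
        if array[i] < array[i - 1]:
            array[i], array[i - 1] = array[i - 1], array[i]
            trace.append(list(array))  # snapshot taken right after the swap
            my_sort_traced(array, trace)
    return array


trace = []
result = my_sort_traced([3, 1, 3, 2, 6, 4], trace)
for snapshot in trace:
    print(snapshot)
```

For the example input this records four snapshots, one per swap, matching the last four lines of the listing above.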
