Indexing numpy array as slow as python append - python

I have always been told that Python's native append is a slow function and should be avoided in for loops. However, after a couple of small tests I have found that a numpy array actually performs slightly worse than list append when I iterate over it with a for loop:
First Test - array/list construction
Python native list append:
def pythonAppend(n):
    x = []
    for i in range(n):
        x.append(i)
    return x

%timeit pythonAppend(1000000)
Numpy - allocate array, then assign by index:
import numpy as np

def numpyConstruct(n):
    x = np.zeros(n)
    for i in range(n):
        x[i] = i
    return x

%timeit numpyConstruct(1000000)
Results:
Python Time: 179 ms
Numpy Time: 189 ms
Second Test - Accessing Elements
n = 1000000
x = pythonAppend(n)
arr = numpyConstruct(n)
order = arr[:]; np.random.shuffle(order); order = list(order.astype(int))
def listAccess(x):
    for i in range(len(x)):
        x[i] = i/3.
    return x

def listAccessOrder(x, order):
    for i in order:
        x[i] = i/3.
    return x
%timeit listAccess(x)
%timeit listAccess(arr)
%timeit listAccessOrder(x, order)
%timeit listAccessOrder(arr, order)
%timeit arr / 3.
Results:
python - sequential: 175 ms
numpy - sequential: 198 ms
python - shuffled access: 2.08 s
numpy - shuffled access: 2.15 s
numpy - vectorised access: 1.92 ms
These results surprise me. I would have thought that, since (I assumed) Python lists are linked lists, accessing elements would be slower than with numpy because of having to follow a chain of pointers. Or am I misunderstanding the Python list implementation? Also, why do the Python lists perform slightly better than the numpy equivalents? I guess most of the inefficiency comes from using a Python for loop, but Python's append still out-competes numpy at both accessing and allocating.

Python lists are not linked lists. The data buffer of the list object contains links/pointers to objects stored elsewhere in memory, so fetching the ith element is easy. The buffer also has growth headroom, so append is a simple matter of inserting the new element's pointer. The buffer has to be reallocated periodically, but Python manages that seamlessly.
A numpy array also has a 1d data buffer, but it contains the numeric values themselves (more generally, the bytes required by the dtype). Locating an element is easy, but fetching it requires creating a new Python object to 'contain' the value ('boxing'). Assignment requires the reverse conversion, from the Python object to the bytes that will be stored.
Generally we've found that making a new array by appending to a list (with one np.array call at the end) is competitive with assignment to a preallocated array.
Iteration through a numpy array is normally slower than iterating through a list.
What we strongly discourage is using np.append (or some variant) to grow an array iteratively. That makes a new array each time, with full copy.
numpy arrays are fast when the iteration is done in the compiled code. Normally that's with the whole-array methods. The C code can iterate over the data buffer, operating on the values directly, rather than invoking Python object methods at each step.
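To make those points concrete, here is a minimal sketch comparing list-append-then-convert, a preallocated array, and a whole-array method. The helper names are mine and the timings will vary by machine; treat it as an illustration, not a benchmark.
import numpy as np
import timeit

n = 1_000_000

def build_by_append(n):
    # grow a Python list, convert to an array once at the end
    out = []
    for i in range(n):
        out.append(i)
    return np.array(out)

def build_preallocated(n):
    # allocate the full array up front, assign element by element
    out = np.zeros(n)
    for i in range(n):
        out[i] = i
    return out

print("append + np.array   :", timeit.timeit(lambda: build_by_append(n), number=5))
print("preallocate + fill  :", timeit.timeit(lambda: build_preallocated(n), number=5))
print("np.arange (whole-array):", timeit.timeit(lambda: np.arange(n, dtype=float), number=5))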

Related

Is it possible to speed-up conversion of a list into an array in python?

In my code, I have noticed that the conversion of a list into an array takes a significant amount of time.
I'm wondering if there are any faster ways to convert a list to an array in Python. Here are my three attempts:
import numpy as np
from timeit import timeit
from array import array

added_data = list(range(100000))   # a plain Python list, as stated below

def test1():
    np.asarray(added_data, dtype=np.float16)

def test2():
    np.array(added_data, dtype=np.float16)

def test3():
    array('f', added_data)

print(timeit(test1, number=100))
print(timeit(test2, number=100))
print(timeit(test3, number=100))
In other words:
Input: <type 'list'>
Output: <type 'array.array'>
It is very unlikely that there's a faster way to convert a list of values into an array than the obvious and simple approaches you've already tried. If there was a better way, the numpy authors probably would have implemented it in np.asarray or the np.array constructor itself. I also want to note that array.array creates a much less sophisticated object than the numpy functions, so it's probably not what you want.
What you might be able to do to improve your program's overall performance is to avoid creating the list in the first place. Perhaps you can read external data from a file directly into an array with np.loadtxt or np.load (depending on how it is formatted). Or maybe you can generate the array from scratch with functions like np.arange, rather than using a normal Python function like range that (in Python 2) returns a list.
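For instance, a minimal sketch of generating the array directly instead of converting a list (float32 is used here instead of the question's float16, because float16 cannot represent values above ~65504):
import numpy as np

n = 100000

# building a list first, then converting it (what the question does)
added_data = list(range(n))
a1 = np.asarray(added_data, dtype=np.float32)

# generating the array directly, with no intermediate list at all
a2 = np.arange(n, dtype=np.float32)

assert np.array_equal(a1, a2)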
Appending items to a numpy array one at a time causes performance problems. Never do that.
Alternatives:
1- Append the items to a list, then convert that list into a numpy array.
2- Use deque from collections. This is the best way.
import collections
a = collections.deque([1,2,3,4])
a.append(5)
The following gives the same result, appending one item at a time:
from array import array

def test4():
    arr = array('d')
    for item in added_data:
        arr.append(item)
But you can try:
from array import array
def test5():
    dataset_array = array('d')
    dataset_array.extend(added_data)
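Putting alternatives 1 and 2 together, a minimal sketch might look like this (the incoming items and the float32 dtype are just for illustration):
import collections
import numpy as np

buf = collections.deque()            # or a plain list; both append cheaply
for item in range(100000):           # stand-in for a stream of incoming values
    buf.append(item)

result = np.asarray(buf, dtype=np.float32)   # convert once at the end, never np.append in a loop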

What is the quickest way to iterate through a numpy array

I noticed a meaningful difference between iterating through a numpy array "directly" versus iterating through via the tolist method. See timing below:
directly
[i for i in np.arange(10000000)]
via tolist
[i for i in np.arange(10000000).tolist()]
Considering I've discovered one way to go faster, I wanted to ask what else might speed things up.
What is the fastest way to iterate through a numpy array?
This is actually not surprising. Let's examine the methods one at a time, starting with the slowest.
[i for i in np.arange(10000000)]
This method asks python to reach into the numpy array (stored in the C memory scope), one element at a time, allocate a Python object in memory, and create a pointer to that object in the list. Each time you pipe between the numpy array stored in the C backend and pull it into pure python, there is an overhead cost. This method adds in that cost 10,000,000 times.
Next:
[i for i in np.arange(10000000).tolist()]
In this case, using .tolist() makes a single call to the numpy C backend and allocates all of the elements in one shot to a list. You then are using python to iterate over that list.
Finally:
list(np.arange(10000000))
This basically does the same thing as above, but it creates a list of numpy's native type objects (e.g. np.int64). Using list(np.arange(10000000)) and np.arange(10000000).tolist() should be about the same time.
So, in terms of iteration, the primary advantage of using numpy is that you don't need to iterate. Operations are applied in a vectorized fashion over the array; iteration just slows it down. If you find yourself iterating over array elements, you should look into restructuring your algorithm so that it uses only numpy operations (it has soooo many built-ins!), or, if really necessary, you can use np.apply_along_axis, np.apply_over_axes, or np.vectorize.
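As a rough, machine-dependent illustration of that last point (the squaring operation is an arbitrary example of mine):
import numpy as np
import timeit

n = 1_000_000

# element-by-element work in a Python loop
loop_time = timeit.timeit(lambda: [i * i for i in range(n)], number=10)

# the same work as a single whole-array (vectorized) operation
vec_time = timeit.timeit(lambda: np.arange(n) ** 2, number=10)

print("python loop:", loop_time, "s   vectorized:", vec_time, "s")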
These are my timings on a slower machine
In [1034]: timeit [i for i in np.arange(10000000)]
1 loop, best of 3: 2.16 s per loop
If I generate the range directly (Py3, so this is a lazy range object rather than a list), times are much better. Take this as a baseline for a list comprehension of this size.
In [1035]: timeit [i for i in range(10000000)]
1 loop, best of 3: 1.26 s per loop
tolist converts the arange to a list first; takes a bit longer, but the iteration is still on a list
In [1036]: timeit [i for i in np.arange(10000000).tolist()]
1 loop, best of 3: 1.6 s per loop
Using list() - same time as direct iteration on the array; that suggests that the direct iteration first does this.
In [1037]: timeit [i for i in list(np.arange(10000000))]
1 loop, best of 3: 2.18 s per loop
In [1038]: timeit np.arange(10000000).tolist()
1 loop, best of 3: 927 ms per loop
about the same time as iterating over the .tolist result
In [1039]: timeit list(np.arange(10000000))
1 loop, best of 3: 1.55 s per loop
In general if you must loop, working on a list is faster. Access to elements of a list is simpler.
Look at the elements returned by indexing.
a[0] is another numpy object; it is constructed from the values in a, but not simply a fetched value
list(a)[0] is the same type; the list is just [a[0], a[1], a[2]]
In [1043]: a = np.arange(3)
In [1044]: type(a[0])
Out[1044]: numpy.int32
In [1045]: ll=list(a)
In [1046]: type(ll[0])
Out[1046]: numpy.int32
but tolist converts the array into a pure list, in this case, as list of ints. It does more work than list(), but does it in compiled code.
In [1047]: ll=a.tolist()
In [1048]: type(ll[0])
Out[1048]: int
In general don't use list(anarray). It rarely does anything useful, and is not as powerful as tolist().
What's the fastest way to iterate through an array? None. At least not in Python; in C code there are fast ways.
a.tolist() is the fastest, vectorized way of creating a list of integers from an array. It iterates, but does so in compiled code.
But what is your real goal?
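To make the list() vs tolist() distinction concrete on a small 2-d array (a minimal sketch):
import numpy as np

a = np.arange(6).reshape(2, 3)

print(list(a))       # [array([0, 1, 2]), array([3, 4, 5])]  -- still numpy row arrays
print(a.tolist())    # [[0, 1, 2], [3, 4, 5]]                -- plain nested lists of Python ints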
The speedup via tolist only holds for 1D arrays. Once you add a second axis, the performance gain disappears:
1D
import numpy as np
import timeit
num_repeats = 10
x = np.arange(10000000)
via_tolist = timeit.timeit("[i for i in x.tolist()]", number=num_repeats, globals={"x": x})
direct = timeit.timeit("[i for i in x]",number=num_repeats, globals={"x": x})
print(f"tolist: {via_tolist / num_repeats}")
print(f"direct: {direct / num_repeats}")
tolist: 0.430838281600154
direct: 0.49088368080047073
2D
import numpy as np
import timeit
num_repeats = 10
x = np.arange(10000000*10).reshape(-1, 10)
via_tolist = timeit.timeit("[i for i in x.tolist()]", number=num_repeats, globals={"x": x})
direct = timeit.timeit("[i for i in x]", number=num_repeats, globals={"x": x})
print(f"tolist: {via_tolist / num_repeats}")
print(f"direct: {direct / num_repeats}")
tolist: 2.5606724178003786
direct: 1.2158976945000177
My test case has a numpy array
[[ 34 107]
[ 963 144]
[ 921 1187]
[ 0 1149]]
I'm going through this only once using range and enumerate
USING range
from timeit import default_timer

loopTimer1 = default_timer()
for l1 in range(0, 4):
    print(box[l1])
print("Time taken by range: ", default_timer() - loopTimer1)
Result
[ 34 107]
[963 144]
[ 921 1187]
[ 0 1149]
Time taken by range: 0.0005405639985838206
USING enumerate
loopTimer2 = default_timer()
for l2, v2 in enumerate(box):
    print(box[l2])
print("Time taken by enumerate: ", default_timer() - loopTimer2)
Result
[ 34 107]
[963 144]
[ 921 1187]
[ 0 1149]
Time taken by enumerate: 0.00025605700102460105
In this test case, enumerate works faster.

Python Typed Array of a Certain Size

This will create an empty array of type signed int:
import array
a = array.array('i')
What is an efficient (performance-wise) way to specify the array length (as well as the array's rank - the number of dimensions)?
I understand that NumPy allows you to specify array size at creation, but can it be done in standard Python?
Initialising an array of fixed size in python
That question deals mostly with lists, and no consideration is given to performance. The main reason to use an array instead of a list is performance.
The array constructor accepts as a 2nd argument an iterable. So, the following works to efficiently create and initialize the array to 0..N-1:
x = array.array('i', range(N))
This does not create a separate N element vector or list.
(If using python 2, use xrange instead). Of course, if you need different initialization you may use generator object instead of range. For example, you can use generator expressions to fill the array with zeros:
a=array.array('i',(0 for i in range(N)))
Python has no 2D (or higher) array. You have to construct one from a list of 1D arrays.
The truth is, if you are looking for a high performance implementation, you should probably use Numpy.
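If you do stay with array.array, a minimal sketch of a "2D" structure built as a list of 1D typed rows (the dimensions here are arbitrary):
import array

rows, cols = 4, 3
# a list of 1D typed arrays; each row is a contiguous, fixed-type buffer
matrix = [array.array('i', [0] * cols) for _ in range(rows)]

matrix[2][1] = 7
print(matrix[2])     # array('i', [0, 7, 0])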
It's simple and fast to just use:
array.array('i', [0]) * n
Timing of different ways to initialize an array on my machine:
n = 10 ** 7
array('i', [0]) * n # 21.9 ms
array('i', [0]*n) # 395.2 ms
array('i', range(n)) # 810.6 ms
array('i', (0 for _ in range(n))) # 1238.6 ms
You said
The main reason to use an array instead of a list is performance.
Surely arrays use less memory than lists.
But in my experiments, I found no evidence that an array is always faster than a normal list.

How to efficiently construct a numpy array from a large set of data?

If I have a huge list of lists in memory and I wish to convert it into an array, does the naive approach cause Python to make a copy of all the data, taking twice the space in memory? Should I instead convert the list of lists vector by vector, popping entries as I go?
# for instance
list_of_lists = [[...], ..., [...]]
arr = np.array(list_of_lists)
Edit:
Is it better to create an empty array of a known size and then populate it incrementally, thus avoiding the list_of_lists object entirely? Could this be accomplished by something as simple as some_array[i] = some_list_of_float_values?
I'm just putting this here as it's a bit long for a comment.
Have you read the numpy documentation for array?
numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)
"""
...
copy : bool, optional
If true (default), then the object is copied. Otherwise, a copy will
only be made if __array__ returns a copy, if obj is a nested sequence,
or if a copy is needed to satisfy any of the other requirements (dtype,
order, etc.).
...
"""
When you say you don't want to copy the data of the original array when creating the numpy array, what data structure are you hoping to end up with?
A lot of the speedup you get from using numpy is because the C arrays that are created are contiguous in memory. A list in Python is just an array of pointers to objects, so you have to go and find the objects every time - which isn't the case in numpy, as it's not written in Python.
If you just want the numpy array to reference the Python lists in your 2D structure, then you'll lose the performance gains.
If you do np.array(my_2D_python_array, copy=False), I don't know what it will actually produce, but you could easily test it yourself. Look at the shape of the array, and see what kind of objects it houses.
If you want the numpy array to be contiguous though, as some point you're going to have to allocate all of the memory it needs (which if it's as large as you're suggesting, it sounds like it might be difficult to find a contiguous section large enough).
Sorry that was pretty rambling, just a comment. How big are the actual arrays you're looking at?
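On the edit: yes, assigning a whole list of floats into one row of a preallocated array works; a minimal sketch (the shape and the row data are made up for illustration):
import numpy as np

rows, cols = 1000, 3                              # assumed dimensions
arr = np.empty((rows, cols), dtype=np.float64)    # allocate the contiguous buffer once

for i in range(rows):
    some_list_of_float_values = [i * 0.1, i * 0.2, i * 0.3]   # hypothetical row data
    arr[i] = some_list_of_float_values            # the list is converted into the row in one step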
Here's a plot of the cpu usage and memory usage of a small sample program:
from __future__ import division
# Make a large python 2D array
N, M = 10000, 18750
print("%i x %i = %i doubles = %f GB" % (N, M, N * M, N * M * 8 / 10**9))

# grab pid to monitor memory and cpu usage
import os
pid = os.getpid()
os.system("python moniter.py -p " + str(pid) + " &")

print("building python matrix")
large_2d_array = [[n + m*M for n in range(N)] for m in range(M)]

import numpy
from datetime import datetime

print(datetime.now(), "creating numpy array with copy")
np1 = numpy.array(large_2d_array, copy=True)
print(datetime.now(), "deleting array")
del np1

print(datetime.now(), "creating numpy array without copy")
np1 = numpy.array(large_2d_array, copy=False)
print(datetime.now(), "deleting array")
del np1
1, 2, and 3 are the points where each of the matrices finishes being created. Note that the native Python list of lists takes up much more memory than the numpy arrays - Python objects each have their own overhead, and the lists are lists of objects. For the numpy array this is not the case, so it is considerably smaller.
Also note that using the copy flag on the Python object has no effect - new data is always created. You could get around this by creating a numpy array of Python objects (using dtype=object), but I wouldn't advise it.

Python numpy array vs list

I need to perform some calculations on a large list of numbers.
Do array.array or numpy.array offer significant performance boost over typical arrays?
I don't have to do complicated manipulations on the arrays, I just need to be able to access and modify values,
e.g.
import numpy
x = numpy.array([0] * 1000000)
for i in range(1, len(x)):
    x[i] = x[i-1] + i
So I will not really be needing concatenation, slicing, etc.
Also, it looks like array throws an error if I try to assign values that don't fit in C long:
import numpy
a = numpy.array([0])
a[0] += 1232234234234324353453453
print(a)
On console I get:
a[0] += 1232234234234324353453453
OverflowError: Python int too large to convert to C long
Is there a variation of array that lets me put in unbounded Python integers?
Or would doing it that way take away the point of having arrays in the first place?
You first need to understand the difference between arrays and lists.
An array is a contiguous block of memory consisting of elements of some type (e.g. integers).
You cannot change the size of an array once it is created.
It therefore follows that each integer element in an array has a fixed size, e.g. 4 bytes.
On the other hand, a list is merely an "array" of addresses (which also have a fixed size).
But then each element holds the address of something else in memory, which is the actual integer that you want to work with. Of course, the size of this integer is irrelevant to the size of the array. Thus you can always create a new (bigger) integer and "replace" the old one without affecting the size of the array, which merely holds the address of an integer.
Of course, this convenience of a list comes at a cost: Performing arithmetic on the integers now requires a memory access to the array, plus a memory access to the integer itself, plus the time it takes to allocate more memory (if needed), plus the time required to delete the old integer (if needed). So yes, it can be slower, so you have to be careful what you're doing with each integer inside an array.
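A minimal sketch of that difference, using array.array for the fixed-size case (the big literal is just an example value):
import array

a = array.array('i', [0])      # fixed-size C ints in one contiguous buffer
lst = [0]                      # a list of pointers to Python int objects

lst[0] = 1232234234234324353453453      # fine: the list now points at a new, bigger int object
try:
    a[0] = 1232234234234324353453453    # the fixed-size slot cannot hold this value
except OverflowError as e:
    print("array.array:", e)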
Your first example could be sped up. Python loops and access to individual items in a numpy array are slow. Use vectorized operations instead:
import numpy as np
x = np.arange(1000000).cumsum()
You can put unbounded Python integers to numpy array:
a = np.array([0], dtype=object)
a[0] += 1232234234234324353453453
Arithmetic operations will be slower in this case than with fixed-size C integers.
For most uses, lists are fine. Sometimes, though, working with numpy arrays is more convenient. For example:
a = [1,2,3,4,5,6,7,8,9,10]
b = [5,8,9]
Consider the list 'a'. If you want to access the elements of a list at the discrete indices given in list 'b', writing
a[b]
will not work. But when you use them as numpy arrays, you can simply write
a[b]
to get the output array([6, 9, 10]).
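A runnable version of that example (a minimal sketch):
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = [5, 8, 9]

print(a[b])          # [ 6  9 10]  -- fancy indexing with a list of indices
# indexing a plain Python list with another list would raise TypeError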
Do array.array or numpy.array offer significant performance boost over typical arrays?
I tried to test this a bit with the following code:
import timeit, math, array
from functools import partial
import numpy as np
# from the question (with +1 in place of +i)
def calc1(x):
    for i in range(1, len(x)):
        x[i] = x[i-1] + 1

# a floating point operation
def calc2(x):
    for i in range(0, len(x)):
        x[i] = math.sin(i)
L = int(1e5)
# np
print('np 1: {:.5f} s'.format(timeit.timeit(partial(calc1, np.array([0] * L)), number=20)))
print('np 2: {:.5f} s'.format(timeit.timeit(partial(calc2, np.array([0] * L)), number=20)))
# np but with vectorized form
vfunc = np.vectorize(math.sin)
print('np 2 vectorized: {:.5f} s'.format(timeit.timeit(partial(vfunc, np.arange(0, L)), number=20)))
# with list
print('list 1: {:.5f} s'.format(timeit.timeit(partial(calc1, [0] * L), number=20)))
print('list 2: {:.5f} s'.format(timeit.timeit(partial(calc2, [0] * L), number=20)))
# with array
print('array 1: {:.5f} s'.format(timeit.timeit(partial(calc1, array.array("f", [0] * L)), number=20)))
print('array 2: {:.5f} s'.format(timeit.timeit(partial(calc2, array.array("f", [0] * L)), number=20)))
And the results were that list executes fastest here (Python 3.3, NumPy 1.8):
np 1: 2.14277 s
np 2: 0.77008 s
np 2 vectorized: 0.44117 s
list 1: 0.29795 s
list 2: 0.66529 s
array 1: 0.66134 s
array 2: 0.88299 s
Which seems to be counterintuitive. There doesn't seem to be any advantage in using numpy or array over list for these simple examples.
To OP: For your use case use lists.
My rules for when to use which, considering robustness and speed:
list: (most robust, fastest for mutable cases)
Ex. When your list is constantly mutating as in a physics simulation. When you are "creating" data from scratch that may be unpredictable in nature.
np.array: (less robust, fastest for linear algebra & data post-processing)
Ex. When you are "post processing" a data set that you have already collected via sensors or a simulation; performing operations that can be vectorized.
Do array.array or numpy.array offer significant performance boost over typical arrays?
It can, depending on what you're doing.
Or would doing it that way take away the point of having arrays in the first place?
Pretty much, yeah.
Use a = numpy.zeros(number_of_elements, dtype=numpy.int64), which gives you an array of 64-bit integers. These can store any integer between -2^63 and 2^63 - 1 (approximately between -10^19 and 10^19), which is usually more than enough.
