How does memory allocation occur in a NumPy array? - Python

import numpy as np
a = np.arange(5)
for i in a:
print("Id of {} : {} \n".format(i,id(i)))
>>>>
Id of 0 : 2295176255984
Id of 1 : 2295176255696
Id of 2 : 2295176255984
Id of 3 : 2295176255696
Id of 4 : 2295176255984
I want to understand how the elements of a NumPy array are allocated in memory; judging by the output above, it seems to work differently from Python lists.
Any help is appreciated.

In [68]: arr = np.arange(5)
In [69]: arr
Out[69]: array([0, 1, 2, 3, 4])
One way of viewing the attributes of a numpy array is:
In [70]: arr.__array_interface__
Out[70]:
{'data': (139628245945184, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (5,),
'version': 3}
data is something like the id of its data buffer, where the values are actually stored. We can't use this number in other code, but it is useful when checking whether one array is a view of another. The rest is used to interpret those values.
The memory for arr is a C array, 40 bytes long (5*8), somewhere; exactly where does not matter to us. Any view of arr will work with the same data buffer; a copy will have its own data buffer.
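As a quick sketch of that point (the variable names here are mine, and the exact addresses differ on every run), you can check whether two arrays share a buffer with np.shares_memory:
import numpy as np

arr = np.arange(5)
v = arr[1:4]          # a view: same data buffer, different offset/shape
c = arr[1:4].copy()   # a copy: its own data buffer

print(np.shares_memory(arr, v))   # True  - the view reuses arr's buffer
print(np.shares_memory(arr, c))   # False - the copy has its own buffer
v[0] = 99
print(arr)                        # arr changes too: [ 0 99  2  3  4]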
Iterating on the array is like accessing values one by one:
In [71]: i = arr[1]
In [72]: i
Out[72]: 1
In [73]: type(i)
Out[73]: numpy.int64
This i is not a reference to an element of arr. It is a new object with the same value. It's a lot like a 0d array, with many of the same attributes, including:
In [74]: i.__array_interface__
Out[74]:
{'data': (25251568, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (),
'version': 3,
'__ref': array(1)}
This is why you can't make much sense of the ids you see during iteration. It is also why iterating on a NumPy array is slower than iterating on a list; we strongly discourage iterating like this.
Contrast that with a list, where elements are stored (in some sort of data buffer) by reference:
In [78]: a,b,c = 100,'b',{}
In [79]: id(a)
Out[79]: 9788064
In [80]: alist=[a,b,c]
In [81]: id(alist[0])
Out[81]: 9788064
The list actually contains a, or, if you prefer, a reference to the same object that the variable a references. Remember, Python is object-oriented all the way down.
In sum, Python lists contain references. Numpy arrays contain values, which its own methods access and manipulate. There is an object dtype that does contain references, but let's not go there.
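A small sketch of that contrast (names here are illustrative; the scalar type shown assumes a 64-bit Linux/macOS default):
import numpy as np

a = 1000
alist = [a, 'b', {}]
print(alist[0] is a)     # True  - the list stores a reference to the very same object

arr = np.arange(5)
print(arr[1] is arr[1])  # False - each indexing builds a new numpy scalar from the buffer
print(type(arr[1]))      # numpy.int64 (typically, on 64-bit Linux/macOS)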

I'm a fan of Code with Mosh. He teaches these kinds of topics on his YouTube channel as well as on Udemy. I've purchased his Udemy course on data structures and algorithms, which goes deep into how things work.
For example, while teaching about arrays, he shows how to build one yourself so that you understand the underlying concepts.
You can take a look here: https://www.youtube.com/watch?v=BBpAmxU_NQo
If you're only interested in the NumPy array itself, I'll start with the differences:
Difference between a NumPy array and a list
NumPy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. A NumPy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
The Python core library provides lists. A list is the Python equivalent of an array, but it is resizeable and can contain elements of different types.
A common beginner question is what the real difference is. The answer is performance. NumPy data structures do better in:
Size - NumPy data structures take up less space
Performance - they have a need for speed and are faster than lists
Functionality - SciPy and NumPy have optimized functions, such as linear algebra operations, built in (a short sketch follows this list)
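As one illustration of that built-in functionality (a minimal sketch; the sizes are arbitrary), a matrix product runs entirely in optimized compiled code:
import numpy as np

A = np.random.rand(200, 300)
B = np.random.rand(300, 100)

C = A @ B          # matrix product, dispatched to compiled (BLAS) routines
print(C.shape)     # (200, 100)
Doing the same with nested lists would require several nested Python loops.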
Another key notable difference is in how they store and make use of memory
Memory
The main benefits of using NumPy arrays should be smaller memory consumption and better runtime behaviour.
For Python lists - every new element needs another eight bytes for the reference to the new object, and the new integer object itself consumes 28 bytes.
NumPy takes up less space: an integer array of length n needs only a small fixed object header plus n times the item size (8 bytes for the default integer dtype), because the values sit directly in one buffer instead of being separate Python objects.
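If you want to sanity-check those numbers yourself, here is a rough sketch (the exact byte counts vary with the Python and NumPy versions, so treat them as illustrative):
import sys
import numpy as np

n = 1000
lst = list(range(n))
arr = np.arange(n)

# list: the pointer buffer, plus one separate Python int object per element
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)
# array: the object header plus one fixed-size slot (arr.itemsize bytes) per element
array_bytes = sys.getsizeof(arr)

print(list_bytes)    # roughly 8 bytes per pointer + ~28 bytes per int, plus list overhead
print(array_bytes)   # roughly a ~100-byte header + n * arr.itemsize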
If you are curious and want me to prove that NumPy really takes less time:
# importing required packages
import numpy
import time
# size of arrays and lists
size = 1000000
# declaring lists
list1 = range(size)
list2 = range(size)
# declaring arrays
array1 = numpy.arange(size)
array2 = numpy.arange(size)
# capturing time before the multiplication of Python lists
initialTime = time.time()
# multiplying elements of both lists and storing the result in another list
resultantList = [(a * b) for a, b in zip(list1, list2)]
# calculating execution time
print("Time taken by Lists to perform multiplication:",
(time.time() - initialTime),
"seconds")
# capturing time before the multiplication of Numpy arrays
initialTime = time.time()
# multiplying elements of both NumPy arrays and storing the result in another NumPy array
resultantArray = array1 * array2
# calculating execution time
print("Time taken by NumPy Arrays to perform multiplication:",
(time.time() - initialTime),
"seconds")
Output:
Time taken by Lists : 0.15030384063720703 seconds
Time taken by NumPy Arrays : 0.005921125411987305 seconds
Wait... there is a big disadvantage too:
Requires contiguous allocation of memory -
Insertion and deletion can become costly, because the data is stored in contiguous memory locations, so adding or removing an element requires shifting everything after it.
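A hedged illustration of that cost (np.insert has to allocate a brand-new array and copy everything, while list.insert only shifts pointers inside the existing buffer):
import numpy as np

arr = np.arange(10)
lst = list(range(10))

arr2 = np.insert(arr, 5, 99)   # builds a new, larger array and copies all values
lst.insert(5, 99)              # shifts pointers within the list's buffer, in place

print(arr2 is arr)             # False - insertion gave us a different array object
print(len(arr), len(arr2))     # 10 11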
If you want to learn more about numpy:
https://www.educba.com/introduction-to-numpy/
You can thank me later!


Difference between list and NumPy array memory size

I've heard that NumPy arrays are more efficient than Python's built-in lists and that they take less space in memory. As I understand it, NumPy stores the values next to each other in memory, while the Python list implementation stores 8-byte pointers to the values. However, when I test this in a Jupyter notebook, both objects turn out to have the same size.
import numpy as np
from sys import getsizeof
array = np.array([_ for _ in range(4)])
getsizeof(array), array
Returns (128, array([0, 1, 2, 3]))
Same as:
l = list([_ for _ in range(4)])
getsizeof(l), l
Gives (128, [0, 1, 2, 3])
Can you provide any clear example on how can I show that in jupyter notebook?
getsizeof is not a good measure of memory use, especially with lists. As you note, the list has a buffer of pointers to objects elsewhere in memory. getsizeof reports the size of that buffer, but tells us nothing about the objects.
With
In [66]: list(range(4))
Out[66]: [0, 1, 2, 3]
the list has its basic object storage, plus the buffer with 4 pointers (plus some growth room). The numbers are stored elsewhere. In this case the numbers are small, and already created and cached by the interpreter, so their storage doesn't add anything. But larger numbers (and floats) are created with each use, and take up space. Also, a list can contain anything - pointers to other lists, strings, dicts, whatever.
In [67]: arr = np.array([i for i in range(4)]) # via list
In [68]: arr
Out[68]: array([0, 1, 2, 3])
In [69]: np.array(range(4)) # more direct
Out[69]: array([0, 1, 2, 3])
In [70]: np.arange(4) # faster
Out[70]: array([0, 1, 2, 3])
arr too has a basic object storage with attributes like shape and dtype. It too has a data buffer, but for a numeric dtype like this, that buffer holds actual numeric values (8-byte integers), not pointers to Python integer objects.
In [71]: arr.nbytes
Out[71]: 32
That data buffer only takes 32 bytes - 4*8.
For this small example it's not surprising that getsizeof returns the same thing. The basic object storage is more significant than where the 4 values are stored. It's when working with thousands of values, and multidimensional arrays, that memory use differs significantly.
But more important is the calculation speed. With an array you can do things like arr+1 or arr.sum(). These operate in compiled code and are quite fast. Similar list operations have to iterate, at slow Python speed, through the pointers, fetching values, etc. But doing the same sort of iteration on arrays is even slower.
As a general rule, if you start with lists, and do list operations such as append and list comprehensions, it's best to stick with them.
But if you can create the arrays once, or from other arrays, and then use NumPy methods, you'll get 10x speed improvements. Arrays are indeed faster, but only if you use them in the right way; they aren't a simple drop-in substitute for lists.
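To get a feel for that difference, here is a small sketch (timings vary by machine; the point is the relative ordering, not the exact numbers):
import time
import numpy as np

arr = np.arange(1_000_000)
lst = list(range(1_000_000))

t0 = time.time()
total_arr = (arr + 1).sum()           # whole-array operations run in compiled code
t1 = time.time()
total_lst = sum(x + 1 for x in lst)   # the list version iterates at Python speed
t2 = time.time()
total_iter = sum(x + 1 for x in arr)  # iterating over the array is slower still
t3 = time.time()

print(t1 - t0, t2 - t1, t3 - t2)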
A NumPy array keeps general information in the array object header (shape, data type, etc.). All the values are stored in one contiguous block of memory. Lists, by contrast, allocate a new memory block for every new object and store a pointer to it, so when you iterate over a list you are not iterating directly over the data in memory; you are following pointers. That is not handy when you are working with large data. Here is an example:
import sys
import numpy as np

random_values_numpy = np.arange(1000)
random_values = list(range(1000))

# NumPy: bytes per element, and the whole data buffer
print(random_values_numpy.itemsize)
print(random_values_numpy.size * random_values_numpy.itemsize)

# Python list: the pointer buffer, plus the int objects it points to
print(sys.getsizeof(random_values))
print(sys.getsizeof(random_values) + sum(sys.getsizeof(v) for v in random_values))

Numpy view contiguous part of non-contiguous array as dtype of bigger size

I was trying to generate an array of trigrams (i.e. consecutive three-letter combinations) from a very long char array:
# data is actually load from a source file
a = np.random.randint(0, 256, 2**28, 'B').view('c')
Since making a copy is not efficient (and creates problems like cache misses), I generated the trigrams directly using stride tricks:
tri = np.lib.stride_tricks.as_strided(a, (len(a) - 2, 3), a.strides * 2)
This generates a trigram array with shape (2**28 - 2, 3) where each row is a trigram. Now I want to convert the trigrams to strings (i.e. S3) so that NumPy displays them more "reasonably" (instead of as individual chars).
tri = tri.view('S3')
It gives the exception:
ValueError: To change to a dtype of a different size, the array must be C-contiguous
I understand that, in general, data should be contiguous in order to create a meaningful view, but this data is contiguous "where it matters": each group of three elements is contiguous.
So I'm wondering how to view a contiguous part of a non-contiguous np.ndarray as a dtype of bigger size. A more "standard" way would be better, though hackish ways are also welcome. It seems I can set the shape and strides freely with np.lib.stride_tricks.as_strided, but I can't force the dtype, which is the problem here.
EDIT
A non-contiguous array can be made by simple slicing. For example:
np.empty((8, 4), 'uint32')[:, :2].view('uint64')
will throw the same exception as above (even though, from a memory point of view, I should be able to do this). This case is much more common than my example above.
If you have access to a contiguous array from which your non-contiguous one is derived, it is typically possible to work around this limitation.
For example, your trigrams can be obtained like so:
>>> a = np.random.randint(0, 256, 2**28, 'B').view('c')
>>> a
array([b')', b'\xf2', b'\xf7', ..., b'\xf4', b'\xf1', b'z'], dtype='|S1')
>>> np.lib.stride_tricks.as_strided(a[:0].view('S3'), ((2**28)-2,), (1,))
array([b')\xf2\xf7', b'\xf2\xf7\x14', b'\xf7\x14\x1b', ...,
b'\xc9\x14\xf4', b'\x14\xf4\xf1', b'\xf4\xf1z'], dtype='|S3')
In fact, this example demonstrates that all we need for view casting is a contiguous "stub" at the base of the memory buffer; afterwards, because as_strided does not do many checks, we are essentially free to do whatever we like.
It seems we can always get such a stub by slicing down to a size-0 array. For your second example:
>>> X = np.empty((8, 4), 'uint32')[:, :2]
>>> np.lib.stride_tricks.as_strided(X[:0].view(np.uint64), (8, 1), X.strides)
array([[140133325248280],
[ 32],
[ 32083728],
[ 31978800],
[ 0],
[ 29686448],
[ 32],
[ 32362720]], dtype=uint64)
As of numpy 1.23.0, you will be able to do exactly what you want without jumping through extra hoops. I've added PR#20722 to numpy to address pretty much this exact issue. The idea is that if your new dtype is smaller than the current, you can clearly expand a unit or contiguous axis without any problems. If the new dtype is larger, you can shrink a contiguous axis.
With the update, your code runs out of the box:
>>> a = np.random.randint(0, 256, 2**28, 'B').view('c')
>>> a
array([b'\x19', b'\xf9', b'\r', ..., b'\xc3', b'\xa3', b'{'], dtype='|S1')
>>> tri = np.lib.stride_tricks.as_strided(a, (len(a)-2,3), a.strides*2)
>>> tri.view('S3')
array([[b'\x9dB\xeb'],
[b'B\xebU'],
[b'\xebU\xa4'],
...,
[b'-\xcbM'],
[b'\xcbM\x97'],
[b'M\x97o']], dtype='|S3')
The array has to have a unit dimension or be contiguous in the last axis, which is true in your case.
I've also added PR#20694 to introduce slicing to the np.char module. If that PR gets accepted as-is, you will be able to do:
>>> np.char.slice_(a.view(f'U{len(a)}'), step=1, chunksize=3)

Memory Efficiency of NumPy

While learning NumPy, I came across this claimed advantage:
NumPy requires less memory than a traditional list.
import numpy as np
import sys
# Less Memory
l = range(1000)
print(sys.getsizeof(l[3])*len(l))
p = np.arange(1000)
print(p.itemsize*p.size)
This looks convincing, but then when I try
print(sys.getsizeof(p[3])*len(p))
it shows a higher memory size than the list.
Can someone help me understand this behavior?
First of all, as mentioned in the comments, getsizeof() is not a good function to rely on for this purpose, because it does not have to hold true for third-party extensions, as it is implementation specific. Also, as mentioned in the documentation, if you want to find the size of containers and all their contents, there is a recipe available at: https://code.activestate.com/recipes/577504/.
Now, regarding the NumPy arrays, it's very important to know how NumPy determines its arrays' types. For that purpose, you can read: How does numpy determine the array's dtype and what it means?
To sum up, the most important reason NumPy performs better in memory management is that it provides a wide variety of types that you can use for different kinds of data. You can read about NumPy's data types here: https://docs.scipy.org/doc/numpy-1.14.0/user/basics.types.html. Another reason is that NumPy is a library designed to work with matrices and arrays, and for that reason there are many under-the-hood optimizations in how their items consume memory.
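For instance (a small sketch; the dtype you can safely pick depends on the range of values you actually need to store):
import numpy as np

a_default = np.arange(1000)                # default integer dtype, usually int64 on 64-bit Linux/macOS
a_small = np.arange(1000, dtype=np.int16)  # values 0..999 fit comfortably in int16

print(a_default.nbytes)   # typically 8000
print(a_small.nbytes)     # 2000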
It's also noteworthy that Python provides an array module designed to perform efficiently by constraining item types.
Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character.
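A minimal sketch of that standard-library module:
from array import array

a = array('i', range(1000))   # 'i' = signed int, typically 4 bytes per item
print(a.itemsize)             # 4 on most platforms
print(a.buffer_info())        # (address, item count) of the underlying C buffer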
It's easier to understand the memory use of arrays:
In [100]: p = np.arange(10)
In [101]: sys.getsizeof(p)
Out[101]: 176
In [102]: p.itemsize*p.size
Out[102]: 80
The data buffer of p is 80 bytes long. The rest of p is object overhead: attributes like shape, strides, etc.
An indexed element of the array is a numpy object.
In [103]: q = p[0]
In [104]: type(q)
Out[104]: numpy.int64
In [105]: q.itemsize*q.size
Out[105]: 8
In [106]: sys.getsizeof(q)
Out[106]: 32
So this multiplication doesn't tell us anything useful:
In [109]: sys.getsizeof(p[3])*len(p)
Out[109]: 320
Though it may help us estimate the size of this list:
In [110]: [i for i in p]
Out[110]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [111]: type(_[0])
Out[111]: numpy.int64
In [112]: sys.getsizeof(__)
Out[112]: 192
The list of 10 int64 objects occupies 320+192 bytes, more or less (the list overhead and its pointer buffer, plus the objects pointed to).
We can extract an int object from the array with item:
In [115]: p[0].item()
Out[115]: 0
In [116]: type(_)
Out[116]: int
In [117]: sys.getsizeof(p[0].item())
Out[117]: 24
Lists of the same length can have differing sizes, depending on how much growth space they have:
In [118]: sys.getsizeof(p.tolist())
Out[118]: 144
Further complicating things is the fact that small integers are stored differently from large ones - those below 256 are cached as unique objects by the interpreter.
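A quick way to see that caching (this relies on a CPython implementation detail - ints from -5 to 256 are pre-allocated - so don't depend on it in real code):
a, b = 100, 100
print(a is b)            # True - both names point at the same cached int object

big1 = int("1000000")
big2 = int("1000000")
print(big1 is big2)      # False - larger ints are created as separate objects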

How to efficiently construct a numpy array from a large set of data?

If I have a huge list of lists in memory and I wish to convert it into an array, does the naive approach cause Python to make a copy of all the data, taking twice the space in memory? Should I instead convert the list of lists vector by vector, popping as I go?
# for instance
list_of_lists = [[...], ..., [...]]
arr = np.array(list_of_lists)
Edit:
Is it better to create an empty array of a known size and then populate it incrementally, thus avoiding the list_of_lists object entirely? Could this be accomplished by something as simple as some_array[i] = some_list_of_float_values?
I'm just putting this here as it's a bit long for a comment.
Have you read the numpy documentation for array?
numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)
"""
...
copy : bool, optional
If true (default), then the object is copied. Otherwise, a copy will
only be made if __array__ returns a copy, if obj is a nested sequence,
or if a copy is needed to satisfy any of the other requirements (dtype,
order, etc.).
...
"""
When you say you don't want to copy the data of the original array when creating the numpy array, what data structure are you hoping to end up with?
A lot of the speedup you get from using NumPy comes from the C arrays it creates being contiguous in memory. A list in Python is just an array of pointers to objects, so you have to go and find each object every time - which isn't the case in NumPy, since its core isn't written in Python.
If you just have the NumPy array reference the Python lists in your 2D structure, you'll lose the performance gains.
If you do np.array(my_2D_python_array, copy=False), I don't know what it will actually produce, but you could easily test it yourself. Look at the shape of the array, and see what kind of objects it houses.
If you want the NumPy array to be contiguous, though, at some point you're going to have to allocate all of the memory it needs (and if it's as large as you're suggesting, it may be difficult to find a contiguous section big enough).
Sorry that was pretty rambling, just a comment. How big are the actual arrays you're looking at?
Here's a plot of the cpu usage and memory usage of a small sample program:
from __future__ import division
# Make a large Python 2D array
N, M = 10000, 18750
print("%i x %i = %i doubles = %f GB" % (N, M, N * M, N * M * 8 / 10**9))
# grab pid to monitor memory and cpu usage
import os
pid = os.getpid()
os.system("python moniter.py -p " + str(pid) + " &")
print("building python matrix")
large_2d_array = [[n + m * M for n in range(N)] for m in range(M)]
import numpy
from datetime import datetime
print(datetime.now(), "creating numpy array with copy")
np1 = numpy.array(large_2d_array, copy=True)
print(datetime.now(), "deleting array")
del np1
print(datetime.now(), "creating numpy array without copy")
np1 = numpy.array(large_2d_array, copy=False)
print(datetime.now(), "deleting array")
del np1
1, 2, and 3 are the points where each of the matrices finishes being created. Note that the native Python list of lists takes up much more memory than the NumPy arrays: Python objects each have their own overhead, and the lists are lists of objects. For the NumPy array this is not the case, so it is considerably smaller.
Also note that using copy=False on the Python object has no effect - new data is always created. You could get around this by creating a NumPy array of Python objects (using dtype=object), but I wouldn't advise it.
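Regarding the edit in the question: yes, preallocating and filling row by row works, and it avoids ever holding the list of lists and the array in memory at the same time. A minimal sketch, with made-up sizes and a stand-in for some_list_of_float_values:
import numpy as np

n_rows, n_cols = 1000, 500
arr = np.empty((n_rows, n_cols), dtype=np.float64)   # allocate the full buffer once

for i in range(n_rows):
    row = [float(i)] * n_cols   # stand-in for one list of float values
    arr[i] = row                # copies the row's values into the preallocated buffer

print(arr.shape, arr.dtype)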

how to take a matrix in python?

I want to create a matrix of size 1234*5678, filled with 1 to 5678 in row-major order.
I think you will need NumPy to hold such a big matrix efficiently, not just for computation. You have ~7e6 items; at 4/8 bytes each that is already roughly 28/56 MB in pure C, and several times that in Python without an efficient data structure (a list of rows, each row a list).
Now, concerning your question:
import numpy as np
a = np.empty((1234, 5678), dtype=np.int32)
a[:] = np.linspace(1, 5678, 5678)
You first create an array of the requested size, with a 4-byte integer type (np.int32; the old np.int alias has been removed from recent NumPy versions). The third line uses broadcasting, so each row (a[0], a[1], ... a[1233]) is assigned the values of the np.linspace call (which gives you an array of [1, ....., 5678]). If you want F storage, that is column major:
a = np.empty((1234, 5678), dtype=np.int32, order='F')
...
The matrix a will take only a tiny amount of memory more than an array in C, and for computation at least, the indexing capabilities of arrays are much better than those of Python lists.
A nitpick: numeric is the name of the old numerical package for python - the recommended name is numpy.
Or just use Numerical Python if you want to do mathematical operations on the matrix too (like multiplication, ...). Whether it uses row-major order for the matrix layout in memory I can't tell you, but it is covered in the documentation.
Here's a forum post that has some code examples of what you are trying to achieve.
