I was wondering about the best way (or any good way, actually) to simulate a small RAM memory in Python.
In most languages I would simply create a fixed-size array of char, but this seems to be surprisingly complex in Python.
The closest thing I found was this:
self.two_KB_internal_ram = []  # goes from $0000-$07FF
for x in range(2048):
    self.two_KB_internal_ram.append(0)
print("two_KB_internal_ram: ", type(self.two_KB_internal_ram[0]))
However, the type shows that each element is an int, not a char.
Is there a way of doing this with chars? If not (or even if there is), what would be a good way to emulate a ram memory?
To get a fixed-size array you can use np.array, which is capable of holding strings as well.
Something like this:
import numpy as np
np.array(["a"] * 2048)
But why would you do that in Python?
To save memory, I want to use fewer bytes (4) for each int I have, instead of 24.
I looked at structs, but I don't really understand how to use them.
https://docs.python.org/3/library/struct.html
When I do the following:
myInt = struct.pack('I', anInt)
sys.getsizeof(myInt) doesn't return 4 like I expected.
Is there something that I am doing wrong? Is there another way for Python to save memory for each variable?
ADDED: I have 750,000,000 integers in an array that I wish to be able to use given an index.
If you want to hold many integers in an array, use a NumPy ndarray. NumPy is a very popular third-party package that handles arrays more compactly than Python alone does. NumPy is not in the standard library (adding it was considered) so that it can be updated more frequently than Python itself. NumPy is one of the reasons Python has become so popular for data science and other scientific uses.
NumPy's np.int32 type uses four bytes per integer. Declare your array full of zeros with
import numpy as np
myarray = np.zeros((750000000,), dtype=np.int32)
Or if you just want the array and do not want to spend any time initializing the values,
myarray = np.empty((750000000,), dtype=np.int32)
You then fill and use the array as you like. There is some Python overhead for the complete array, so the array's size will be slightly larger than 4 * 750000000, but the size will be close.
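A minimal sketch of filling and using such an array (a smaller size is used here just so the example runs quickly):
import numpy as np
myarray = np.zeros((1000,), dtype=np.int32)  # smaller size for illustration
myarray[42] = 123456                         # write a value at an index
print(myarray[42])                           # read it back
print(myarray.nbytes)                        # raw data size: 4 * 1000 bytes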
I have a Python program that needs to pass an array to a .dll that is expecting an array of c doubles. This is currently done by the following code, which is the fastest method of conversion I could find:
from array import array
from ctypes import *
import numpy as np
python_array = np.array(some_python_array)
temp = array('d', python_array.astype('float'))
c_double_array = (c_double * len(temp)).from_buffer(temp)
...where 'np.array' is just there to show that in my case the python_array is a numpy array. Let's say I now have two c_double arrays, c_double_array_a and c_double_array_b. The issue I'm having is that I would like to append c_double_array_b to c_double_array_a without converting back and forth through whatever Python typically uses for arrays. Is there a way to do this with the ctypes library?
I've been reading through the docs here, but nothing seems to detail combining two ctypes arrays after creation. It is very important in my program that they can be combined after creation; of course it would be trivial to just append python_array_b to python_array_a and then convert, but that won't work in my case.
Thanks!
P.S. If anyone knows a way to speed up the conversion code, that would also be greatly appreciated; it takes on the order of 150 ms per million elements, and my program typically handles 1-10 million elements at a time.
Leaving aside the construction of the ctypes arrays (for which Mark's comment is surely relevant), the issue is that C arrays are not resizable: you can't append to or extend them. (There are wrappers that provide these features, which may be useful references here.) What you can do is make a new array the size of the two existing arrays combined and then ctypes.memmove them into it. It might be possible to improve performance by using realloc, but you'd have to go even lower than normal ctypes memory management to use it.
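A minimal sketch of that approach (the array sizes and contents here are only illustrative):
import ctypes
from ctypes import c_double, sizeof, memmove
c_double_array_a = (c_double * 3)(1.0, 2.0, 3.0)
c_double_array_b = (c_double * 2)(4.0, 5.0)
# allocate a new array large enough for both, then copy each block of bytes into place
combined = (c_double * (len(c_double_array_a) + len(c_double_array_b)))()
memmove(combined, c_double_array_a, sizeof(c_double_array_a))
memmove(ctypes.byref(combined, sizeof(c_double_array_a)), c_double_array_b, sizeof(c_double_array_b))
print(list(combined))  # [1.0, 2.0, 3.0, 4.0, 5.0]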
I read 32-bit integer audio data (given as a string by previous commands) into a numpy.int32 array with:
myarray = numpy.fromstring(data, dtype=numpy.int32)
But then I want to store it in memory as int16 (I know this will decrease the bit depth / resolution / sound quality):
myarray = myarray >> 16
my_16bit_array = myarray.astype('int16')
It works very well, but is there a faster solution? (Here I use a string buffer, one array in int32, and one array in int16; I wanted to know if it's possible to save one step.)
How about this?
np.fromstring(data, dtype=np.int16)[0::2]
Note however, that overhead of the kind you describe here is common when working with numpy, and cannot always be avoided. If this kind of overhead isn't acceptable for your application, make sure that you plan ahead to write extension modules for the performance critical parts.
Note: it should be 0::2 or 1::2 depending on the endianness of your platform
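A small sketch of the equivalence (illustrative only; it assumes a little-endian platform and uses np.frombuffer, the non-deprecated equivalent of np.fromstring):
import numpy as np
samples = np.array([100000, -100000, 42], dtype=np.int32)  # hypothetical audio samples
data = samples.tobytes()
shifted = (np.frombuffer(data, dtype=np.int32) >> 16).astype('int16')  # original two-step approach
sliced = np.frombuffer(data, dtype=np.int16)[1::2]                     # one-step slicing (little-endian)
print(np.array_equal(shifted, sliced))  # True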
Let's say I need to save a matrix (each line corresponds to one row) that can be loaded from Fortran later. Which method should I prefer? Is converting everything to strings the only approach?
You can save them in binary format as well. Please see the documentation for the struct standard module; it has a pack function for converting Python objects into binary data.
For example:
import struct
value = 3.141592654
data = struct.pack('d', value)
open('file.ext', 'wb').write(data)
You can convert each element of your matrix and write it to a file. Fortran should be able to load that binary data. You can speed up the process by converting a row as a whole, like this:
row_data = struct.pack('d' * len(matrix_row), *matrix_row)
Please note that 'd' * len(matrix_row) is a constant for your matrix size, so you need to calculate that format string only once.
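For example, writing a whole matrix row by row might look like this (a sketch; the names are just placeholders):
import struct
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # hypothetical matrix
row_format = 'd' * len(matrix[0])            # computed once for the whole matrix
with open('file.ext', 'wb') as f:
    for matrix_row in matrix:
        f.write(struct.pack(row_format, *matrix_row))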
I don't know Fortran, so it's hard to tell what is easy for you to parse on that side.
It sounds like your options are either saving the doubles in plaintext (meaning, 'converting' them to strings) or in binary (using struct and the like). Which one is better depends on your situation.
I would go with the plaintext solution, as it means the files will be easily readable and you won't have to deal with various low-level details (endianness, default double sizes).
But there are cases where binary is better (for example, if you have a really big list of doubles and space matters, or if it is easier for you to parse and you need the optimization), though this is likely not your case.
You can use JSON
import json
matrix = [[2.3452452435, 3.34134], [4.5, 7.9]]
data = json.dumps(matrix)
open('file.ext', 'w').write(data)
File content will look like:
[[2.3452452435, 3.3413400000000002], [4.5, 7.9000000000000004]]
If legibility and ease of access are important (and the file size is reasonable), Fortran can easily parse a simple array of numbers, at least if it knows the size of the matrix beforehand (with something like READ(FILE_ID, '2(F)'), I think):
1.234 5.6789e4
3.1415 9.265358978
42 ...
Two nested for loops in your Python code can easily write your matrix in this form.
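For instance, a minimal sketch of such a loop (purely illustrative):
matrix = [[1.234, 5.6789e4], [3.1415, 9.265358978]]  # hypothetical matrix
with open('matrix.txt', 'w') as f:
    for row in matrix:
        for value in row:
            f.write(str(value) + ' ')
        f.write('\n')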
How do I create a big array in Python, and how can I create it efficiently?
in C/C++:
byte *data = (byte*)malloc(10000);
or
byte *data = new byte[10000];
in Python...?
Have a look at the array module:
import array
array.array('B', [0] * 10000)
Instead of passing a list to initialize it, you can pass a generator, which is more memory efficient.
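For example (a sketch of the generator form):
import array
data = array.array('B', (0 for _ in range(10000)))  # initialized from a generator instead of a list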
You can pre-allocate a list with:
l = [0] * 10000
which will be slightly faster than .appending to it (as it avoids intermediate reallocations). However, this will generally allocate space for a list of pointers to integer objects, which will be larger than an array of bytes in C.
If you need memory efficiency, you could use an array object, i.e.:
import array, itertools
a = array.array('b', itertools.repeat(0, 10000))
Note that these may be slightly slower to use in practice, as there is an unboxing process when accessing elements (they must first be converted to a python int object).
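As a rough illustration of the size difference (a sketch; exact numbers vary by platform and Python version):
import array, sys
lst = [0] * 10000
arr = array.array('B', [0] * 10000)
print(sys.getsizeof(lst))  # list of pointers: roughly 8 bytes per slot, plus overhead
print(sys.getsizeof(arr))  # packed bytes: roughly 1 byte per element, plus overhead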
You can efficiently create a big array with the array module, but using it won't be as fast as C. If you intend to do some math, you'd be better off with numpy.array.
Check this question for comparison.
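For instance, a quick sketch with NumPy (assuming byte-sized elements, matching the C example above):
import numpy as np
data = np.zeros(10000, dtype=np.uint8)  # 10000 zero-initialized bytes in one contiguous block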
Typically with Python, you'd just create a list
mylist = []
and use it as an array. Alternatively, I think you might be looking for the array module. See http://docs.python.org/library/array.html.