How to convert specific elements within a numpy array to integers? - python

I've written a script that gives me the result of dividing two variables ("A" and "B"), each of which is a numpy array with 26 elements. Usually, with any two elements from "A" and "B," the result of the operation is a float, and the element in the output array that corresponds to that operation shows up as a float. But strangely, even if the output is supposed to be an integer (almost always 0 or 1), the integer will show up as "0." or "1." in the output array. Is there any way to turn these specific elements of the array back into integers, rather than keep them as floats?
I'd like to write a simple if statement that will convert any output elements that are supposed to be integers back into integers (i.e., make "0." into "0"). But I'm having some trouble with that. Any ideas?

You will probably want to read about data types:
http://docs.scipy.org/doc/numpy/user/basics.types.html
An entire numpy array has a single datatype. For the operation you are doing, it would not make sense to ask that A/B sometimes be an integer and sometimes be a float: the division of two float arrays is a float array.
Complication: it is possible to specify mixed-type arrays:
http://docs.scipy.org/doc/numpy/user/basics.rec.html#structured-arrays

The strength of Numpy arrays is that many low-level operations can be performed quickly on the data because most (not all) types used by these arrays have a fixed size in memory. For instance, the floats you are using probably require 8 bytes each. The most important thing in that case is that all the data share the same type and fit in the same amount of memory. You can work around that if you really want (and need) to, but I would not suggest starting with such special cases. Try to learn the strength of these arrays when used with this requirement (which means accepting that you can't mix integers and floats in the same array).
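As a minimal sketch of the point above (the arrays a and b are made-up stand-ins for "A" and "B"), the result of the division has one dtype for every element. If the goal is only to see "0" instead of "0.", formatting at print time is usually enough; an object array can hold a mix of Python ints and floats, but it gives up the fixed-size storage described above.

```python
import numpy as np

# Hypothetical stand-ins for the question's "A" and "B"
a = np.array([2.0, 3.0, 0.0])
b = np.array([2.0, 4.0, 5.0])

ratio = a / b            # the whole array shares one dtype
print(ratio.dtype)       # float64

# Option 1: keep the floats, drop the trailing ".0" only when displaying
formatted = [format(float(v), "g") for v in ratio]
print(formatted)         # ['1', '0.75', '0']

# Option 2: an object array mixing Python ints and floats.
# This works, but it loses the speed and compactness of a float64 array.
mixed = np.array(
    [int(v) if float(v).is_integer() else float(v) for v in ratio],
    dtype=object,
)
print([type(v).__name__ for v in mixed])  # ['int', 'float', 'int']
```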

Related

How to create a numpy array without having a specified number of elements?

I want to create a numpy array that can hold integer values, without knowing a priori how many integer values it can hold. However, I do know the maximum number of integer values that may be held.
With lists, I just create an empty list, and append to the list as and when a new 'suitable' number is found. However, I cannot do such a thing with an array. The closest I have is to create a zero array using np.zeros(N) (with N being the max possible number of integers) and then change elements to the required number as and when needed. However, the problem with this is that I will still have zero elements, which I cannot delete/pop off, like I can with lists.
Is there a workaround?
'Arrays' are supposed to be a fixed length data structure which is why you might not be able to find a direct solution to your problem.
It is usually better to build a list and convert it into a numpy array at the end.
import numpy as np
arr = []
for i in some_list:  # some_list: any iterable of candidate values
    arr.append(i)
x = np.array(arr)
x is now your numpy array.
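Since the question says the maximum count is known, another option is to preallocate and then slice off the unused tail. This is only a sketch: max_count and the "multiple of 3" test are made up for illustration and stand in for whatever the real limit and suitability check are.

```python
import numpy as np

max_count = 10                      # assumed upper bound on how many values are kept
buf = np.empty(max_count, dtype=np.int64)
count = 0

for candidate in range(20):
    if candidate % 3 == 0:          # hypothetical "suitable" test
        buf[count] = candidate
        count += 1

# Slicing gives a view of just the filled part; the unused tail is ignored
result = buf[:count]
print(result)                       # [ 0  3  6  9 12 15 18]
```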

How to iterate over each `numpy.float16`

For a certain task, I have too many repeated calls to a complex function, call it f(x), where x is a float. I do not have very large floats and not too much precision is required, so I thought why not use a lookup table for f(x), where x is a float16, so the maximum size of the lookup table is 2**16. I was planning on making a small Python demo using np.float16. I am a bit stuck on how to iterate over the range of all floats. In C/C++, I would have used a uint16_t and kept incrementing it. How do I create this table using Python?
You can generate all the possible values using arange and then reinterpret the values as float16 values using view. Here is an example:
np.arange(65536, dtype=np.uint16).view(np.float16)
This should give you all possible float16 values. Note that many are NaN values.
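A sketch of how the table itself might be built and queried on top of that view. The function f here is a made-up placeholder for the expensive function, and lookup is a hypothetical helper name; the idea is that the 16-bit pattern of a float16 doubles as the table index.

```python
import numpy as np

# Every 16-bit pattern, reinterpreted as a float16 value
bits = np.arange(65536, dtype=np.uint16)
xs = bits.view(np.float16)

def f(x):
    # Hypothetical stand-in for the expensive function
    return np.sqrt(np.abs(x))

# Evaluate once for every representable input; NaN/inf inputs are left as-is
with np.errstate(all="ignore"):
    table = f(xs.astype(np.float32)).astype(np.float16)

def lookup(x):
    # Round x to float16, then use its bit pattern as the index
    idx = np.float16(x).view(np.uint16)
    return table[idx]

print(lookup(4.0))             # 2.0
print(int(np.isnan(xs).sum())) # 2046 of the 65536 patterns are NaN
```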

How can I make my Python program use 4 bytes for an int instead of 24 bytes?

To save memory, I want to use fewer bytes (4) for each int I have instead of 24.
I looked at structs, but I don't really understand how to use them.
https://docs.python.org/3/library/struct.html
When I do the following:
myInt = struct.pack('I', anInt)
sys.getsizeof(myInt) doesn't return 4 like I expected.
Is there something that I am doing wrong? Is there another way for Python to save memory for each variable?
ADDED: I have 750,000,000 integers in an array that I wish to be able to use given an index.
If you want to hold many integers in an array, use a numpy ndarray. Numpy is a very popular third-party package that stores arrays more compactly than plain Python does. It was once considered for inclusion in the standard library, but it stayed separate so that it could be updated more frequently than Python itself. Numpy is one of the reasons Python has become so popular for data science and other scientific uses.
Numpy's np.int32 type uses four bytes for an integer. Declare your array full of zeros with
import numpy as np
myarray = np.zeros((750000000,), dtype=np.int32)
Or if you just want the array and do not want to spend any time initializing the values,
myarray = np.empty((750000000,), dtype=np.int32)
You then fill and use the array as you like. There is some Python overhead for the complete array, so the array's size will be slightly larger than 4 * 750000000, but the size will be close.
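A scaled-down sketch of the size claim (one million elements instead of 750,000,000, so it runs quickly). It also shows why the struct attempt in the question looked wrong: the packed payload really is 4 bytes, but sys.getsizeof reports the whole Python bytes object, header included. The '<I' format is used here as an assumption so that the size is 4 regardless of platform.

```python
import struct
import sys

import numpy as np

n = 1_000_000                          # scaled down from 750,000,000
arr = np.zeros(n, dtype=np.int32)

print(arr.itemsize)                    # 4 bytes per element
print(arr.nbytes)                      # 4000000 bytes of raw data

# The per-array overhead is small and does not grow with n
overhead = sys.getsizeof(arr) - arr.nbytes

# struct.pack does produce 4 bytes of payload...
packed = struct.pack("<I", 12345)
print(len(packed))                     # 4
# ...but getsizeof also counts the bytes object's own header
print(sys.getsizeof(packed) > 4)       # True
```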

Quick way to access first element in Numpy array with arbitrary number of dimensions?

I have a function that I want to have quickly access the first (aka zeroth) element of a given Numpy array, which itself might have any number of dimensions. What's the quickest way to do that?
I'm currently using the following:
a.reshape(-1)[0]
This reshapes the possibly multidimensional array into a 1D array and grabs the zeroth element, which is short, sweet and often fast. However, I think this would work poorly with some arrays, e.g., an array that is a transposed view of a large array, as I worry this would end up needing to create a copy rather than just another view of the original array, in order to get everything in the right order. (Is that right? Or am I worrying needlessly?) Regardless, it feels like this is doing more work than what I really need, so I imagine some of you may know a generally faster way of doing this?
Other options I've considered are creating an iterator over the whole array and drawing just one element from it, or creating a vector of zeroes containing one zero for each dimension and using that to fancy-index into the array. But neither of these seems all that great either.
a.flat[0]
This should be pretty fast and never require a copy. (Note that a.flat is an instance of numpy.flatiter, not an array, which is why this operation can be done without a copy.)
You can use a.item(0); see the documentation at numpy.ndarray.item.
A possible disadvantage of this approach is that the return value is a Python data type, not a numpy object. For example, if a has data type numpy.uint8, a.item(0) will be a Python integer. If that is a problem, a.flat[0] is better; see @user2357112's answer.
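A small sketch comparing the options on a transposed view, which is the case the question worries about. The array contents are made up; indexing with a tuple of zeros is included as one more way to avoid the copy.

```python
import numpy as np

# A transposed (non-contiguous) view of a small uint8 array
a = np.arange(10, 34, dtype=np.uint8).reshape(2, 3, 4).T

first_flat = a.flat[0]           # numpy scalar, no copy of the array
first_item = a.item(0)           # plain Python int
first_idx = a[(0,) * a.ndim]     # index with one zero per dimension

print(type(first_flat).__name__) # uint8
print(type(first_item).__name__) # int
print(first_flat, first_item, first_idx)  # 10 10 10

# For this transposed view, reshape(-1) has to copy the data
print(np.shares_memory(a.reshape(-1), a))  # False
```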
np.hsplit(x, 2)[0]
Source: https://numpy.org/doc/stable/reference/generated/numpy.dsplit.html
Source:
https://numpy.org/doc/stable/reference/generated/numpy.hsplit.html
## y -- numpy array of shape (1, Ty)
If you want the first element of the shape (the size of the first dimension):
use y.shape[0]
If you want the second element of the shape (the size of the second dimension):
use y.shape[1]
You can also use take for more complicated extraction (to get a few elements):
numpy.take(a, indices, axis=None, out=None, mode='raise')
Take elements from an array along an axis.
Source:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html
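A short illustration of numpy.take on a made-up 2D array. With axis=None (the default) the indices refer to the flattened array; with an axis they select along that axis.

```python
import numpy as np

a = np.array([[10, 20, 30],
              [40, 50, 60]])

# axis=None: indices are positions in the flattened array
flat_pick = np.take(a, [0, 2])
print(flat_pick)            # [10 30]

# axis=1: pick columns 0 and 2 from every row
col_pick = np.take(a, [0, 2], axis=1)
print(col_pick)
# [[10 30]
#  [40 60]]
```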

Long integer shape of Numpy arrays

If I construct a numpy matrix like this:
A = array([[1,2,3],[4,5,6]])
and then type A.shape I get the result:
(2L, 3L)
Why am I getting a shape with the format long?
I can restart everything and I still have the same problem. And as far as I can see, it is only when I construct arrays I have this problem, otherwise I get short (regular) integers.
As @CédricJulien puts it in the comments, there is no problem with long numbers in this case; this should be treated as an implementation detail.
The real answer for your question can, of course, only be found inside numpy's source code, but the fact that the dimensions are long in this case should not matter for any use you have for the arrays or these indexes.
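For what it's worth, the L suffix is a Python 2 display detail for its separate long type; Python 3 has a single int type, so the same code prints (2, 3) there. A quick sketch showing that the shape entries behave like ordinary integers either way:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

rows, cols = A.shape
print(A.shape)                  # (2, 3) on Python 3

# The entries compare, multiply and index like any other integer
total = rows * cols
same_shape = np.zeros(A.shape)
print(total, same_shape.shape)  # 6 (2, 3)
```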
