How to have an array of arrays in Python

I'm new to Python, but I'm solid at coding in VB.NET. I'm trying to hold numerical values in a jagged array; to do this in VB.NET I would do the following:
Dim jag(3)() As Double
For i = 0 To 3
    ReDim jag(i)(length of this row)
Next
Now, I know Python doesn't use explicit declarations like this (maybe it can, but I don't know how!). I have tried something like this:
a(0) = someOtherArray
But that doesn't work - I get the error "can't assign to function call". Any advice on a smoother way to do this? I'd prefer to stay away from using a 2D matrix, as the different elements of a (i.e. a(0), a(1), ...) are different lengths.

arr = [[]]
I'm not sure exactly what you're trying to do; Python lists are dynamically sized, but if you want a predefined length and dimensions, use a list comprehension:
arr = [[0 for x in range(3)] for y in range(3)]
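If the rows need different lengths, as in your jagged array, the same comprehension idiom works; a minimal sketch where row i holds i + 1 zeros:
jag = [[0 for x in range(i + 1)] for i in range(4)]
print(jag)  # [[0], [0, 0], [0, 0, 0], [0, 0, 0, 0]]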

From Microsoft documentation:
A jagged array is an array whose elements are arrays. The elements of
a jagged array can be of different dimensions and sizes
Python documentation about Data Structures.
You could store a list inside another list, or use a dictionary that stores a list. Depending on how deeply nested your arrays are, this might not be the best option.
numbersList = []
listofNumbers = [1, 2, 3]
secondListofNumbers = [4, 5, 6]
numbersList.append(listofNumbers)
numbersList.append(secondListofNumbers)
for number in numbersList:
    print(number)
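A dictionary keyed by row index works just as well when the rows have different lengths; a small sketch:
rows = {0: [1, 2, 3], 1: [4, 5], 2: [6]}
for key, values in rows.items():
    print(key, values)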

Related

Getting a single array containing several sub-arrays iteratively

I have a little question about NumPy. What I want to do is the following:
given two NumPy arrays arr1 = [1,2,3] and arr2 = [3,4,5], I would like to obtain a new array arr3 = [[1,2,3],[3,4,5]], but in an iterative way. For a single instance, this is obtained simply by typing arr3 = np.array([arr1, arr2]).
What I have instead are several arrays, e.g. [4,3,1 ..], [4,3,5, ...], [1,2,1,...], and I would like to end up with [[4,3,1 ..], [4,3,5, ...], [1,2,1,...]], potentially using a for loop. How should I do this?
EDIT:
Ok I'm trying to add more details to the overall problem. First, I have a list of strings list_strings=['A', 'B','C', 'D', ...]. I'm using a specific method to obtain informative numbers out of a single string, so for example I have method(list_strings[0]) = [1,2,3,...], and I can do this for each single string I have in the initial list.
What I would like to come up with is an iterative for loop that gathers the numbers extracted from each string in turn in the way I described at the beginning, i.e. a single array with all the numeric sub-arrays containing the information extracted from each string. Hope this makes more sense now, and sorry if I haven't explained it correctly; I'm really new to programming and trying to figure things out.
Well, if your strings are in a list, we want to put the arrays that result from calling method in a list as well. Python's list comprehension is a great way to achieve that.
import numpy as np

list_strings = ['A', ...]
list_of_converted_strings = [method(item) for item in list_strings]
arr = np.array(list_of_converted_strings)
NumPy arrays are of fixed dimensions, i.e. a 2D NumPy array of shape n x m has n rows and m columns. If you want to convert a list of lists into a NumPy array, all the sublists in the main list should be of the same length; you cannot convert it into a 2D NumPy array if the sublists are of varying size.
For example, the code below will give an error:
np.array([[1], [3, 4]])
So if all the sublists are of the same size, then you can use
np.array([method(x) for x in strings])
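Since the question asks for an explicit loop, the same result can also be built iteratively; a minimal sketch, assuming method() returns equal-length lists as described above:
import numpy as np

converted = []
for item in list_strings:
    converted.append(method(item))  # method() is the OP's string-to-numbers function
arr = np.array(converted)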

Minimum of pairs between two lists, is there a quicker way?

I have two (very long) lists. I want to find the sum of the minimum of each pair in the lists. E.g., if
X = [2,3,4]
Y = [5,4,2]
then, the sum would be 2+3+2 = 7.
At the moment, I'm doing this by zipping the lists and using a list comprehension. My lists are X and Y:
mins = [min(x,y) for x,y in zip(X,Y)]
summed_mins = sum(mins)
This is causing serious runtime issues in my program. Is there a faster way to do this? List comprehensions are the fastest that I know of.
You can use Python generators and the built-in map function to avoid the creation of the list, but this will probably be just slightly faster (thanks to Veedrac):
summed_mins = sum(map(min, X, Y))
Alternatively, you can use Numpy. Here is how:
summed_mins = np.stack((X, Y)).min(axis=0).sum()
If you can store the input lists directly as NumPy arrays, this can be much faster.
If you can even store them directly in a single 2D NumPy array, you don't need the np.stack call, resulting in much faster code.
If you cannot store/create the input directly as NumPy arrays, you can create the NumPy arrays on the fly quickly by specifying the data type (assuming you are sure the lists contain small integers). Here is an example:
summed_mins = np.stack((np.array(X, np.int64), np.array(Y, np.int64))).min(axis=0).sum()
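Another option worth noting (my own suggestion, not from the original answer): np.minimum computes the element-wise minimum directly, so no stacking is needed:
import numpy as np

X = np.array([2, 3, 4])
Y = np.array([5, 4, 2])
summed_mins = np.minimum(X, Y).sum()  # element-wise minimum, then sum -> 7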

Numpy broadcasting - using a variable value

EDIT:
As my question was badly formulated, I decided to rewrite it.
Does NumPy allow creating an array from a function, without using Python's standard list comprehension?
With list comprehension I could have:
array = np.array([f(i) for i in range(100)])
with f a given function.
But if the constructed array is really big, using a Python list would be slow and would eat a lot of memory.
If such a way doesn't exist, I suppose I could first create an array of the size I want:
array = np.arange(100)
And then map a function over it.
array = f(array)
According to results from another post, it seems that it would be a reasonable solution.
Let's say I want to use the add function with a simple int value; it would be as follows:
array = np.array([i for i in range(5)])
array + 5
But now what if I want the value (here 5) to be something that varies according to the index of the array element? For example the operation:
array + [i for i in range(5)]
What object can I use to define special rules for a variable value within a vectorized operation?
You can add two arrays together like this:
Simple adding two arrays using numpy in python?
This assumes your "variable by index" is just another array.
For your specific example, a jury-rigged solution would be to use numpy.arange() as in:
In [4]: array + np.arange(5)
Out[4]: array([0, 2, 4, 6, 8])
In general, you can find some NumPy ufunc that does the job of your custom function, or you can compose them in a Python function that returns an ndarray, something like:
def custom_func():
    # code for your tasks
    return arr
You can then simply add the returned result to your already defined array as in:
array + custom_func()
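For a concrete illustration (my own sketch, not part of the original answer), a hypothetical custom_func built only from vectorized NumPy operations could look like this:
import numpy as np

def custom_func(n):
    # build the per-index values with vectorized ufuncs instead of a Python loop
    idx = np.arange(n)
    return idx * 2 + 1  # any composition of NumPy ufuncs works here

array = np.arange(5)
result = array + custom_func(5)  # array([1, 4, 7, 10, 13])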

Using list comprehension in data cubes

I am currently trying to use list comprehensions to filter some values in a data cube of images; however, I got lost making the jump from 2 (as we can see here or here) to 3 dimensions.
For a single image, the line of code that accomplishes what I want is:
AM2 = [[x if x > 1e-5 else 0 for x in line] for line in AM[0]]
How do I take this to also consider the different images that are stacked on the top of each other? I assume I would need to add a third nested loop but so far all my attempts to do so failed.
In my particular case the data cube is composed of NumPy arrays with dimensions of (100x400x900). Are list comprehensions still advisable for filtering values over that volume of data?
Thanks for your time.
Don't use list comprehensions for NumPy arrays; you lose their speed and power. Instead, use NumPy boolean (advanced) indexing. For example, your comprehension can be written as
AM2 = AM.copy()  # use AM2 = AM.copy()[0] if you just want the first slice, as in your example
AM2[AM2 < 1e-5] = 0
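np.where gives an equivalent one-liner (my addition, not from the original answer) that avoids the explicit copy-and-assign:
import numpy as np

AM2 = np.where(AM > 1e-5, AM, 0)  # keep values above the threshold, zero out the rest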
For pure Python nested lists, try this:
AM2 = [[[x if x > 1e-5 else 0 for x in line] for line in A] for A in AM]
See FHTMitchell's answer if these are numpy arrays.

how to deal in a pythonic way with list of arrays or single array

I have this issue:
in my software I am either dealing with a single array or a list of 3 arrays (they are 1 or 3 components of a pixelized sky map).
If the single array were a list of 1 array, then it would be very easy to iterate over it transparently, regardless of the number of elements.
Now, let's say I want to square these maps:
my_map = np.ones(100) # case of single component
# my_map = [np.ones(100) for c in [0, 1, 2]] # case of 3 components
if isinstance(my_map, list):  # this is ugly
    my_map_2 = [m**2 for m in my_map]
else:
    my_map_2 = my_map ** 2
Would you have any suggestions on how to improve this?
Why wouldn't you directly create a 2D array?
my_array = np.ones((100,3), dtype=float)
That way, you could directly square your 'three' arrays at once. You could still access individual elements as:
(x, y, z) = my_array.T
where .T is a shortcut for the .transpose method.
Using this approach would be far more efficient than looping on a list, especially if you apply the same function to the three arrays. Even if you want, say, to square the first array, double the second and take the square root of the third, you could:
my_array[:, 0] **= 2
my_array[:, 1] *= 2
my_array[:, 2] **= 0.5
You can wrap your return value in a list when it is a single array, so that you always iterate over a list:
my_map = []
temp = np.ones(100)  # case of single component
# Append your temp value, either a single array or a list, to the empty list
my_map.append(temp)
my_map_2 = [m**2 for m in my_map]
I assume that your call (here np.ones(100)) may return either a single array or a list.
Have you tried numpy.asarray? Then the if-else would just be
my_map = numpy.asarray(my_map)**2
Also check out numpy.asanyarray if you want to handle subclasses of ndarrays as well.
I often put a numpy.asarray call at the beginning of my functions, so they work on both lists and arrays.
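To make that concrete, here is a minimal sketch (my own, assuming the three component maps all have the same length):
import numpy as np

def square_map(my_map):
    # np.asarray leaves an ndarray untouched and converts a list of
    # equal-length arrays into a 2D array, so both cases work
    return np.asarray(my_map) ** 2

single = square_map(np.ones(100))        # shape (100,)
triple = square_map([np.ones(100)] * 3)  # shape (3, 100)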
