How to split list of arrays into individual arrays? - python

I have a list that contains two arrays of different lengths.
c = [array([ 3.00493560e+05, 3.04300000e+01, 3.21649837e-01,
6.50984546e+05, 3.00493379e+05, 3.03073203e+01]), array([ 14.])]
I want to split them based on their dimensions to get two separate arrays.
a = array([ 3.00493560e+05, 3.04300000e+01, 3.21649837e-01,
6.50984546e+05, 3.00493379e+05, 3.03073203e+01])
b = array([ 14.])
I tried np.split(c, 6), but it splits the array based on a given length and creates one big array, so it's not what I am expecting.
I also tried to use
a = c[c[:, 0] < 1.5]
b = c[c[:, 1] > 5]
It works, but sometimes values in the second array are the same as values in the first array...

From my understanding, you wish to split a list of numpy arrays into individual python lists. You can do the following:
a,b = [ [individualArray] for individualArray in c]
This will give you the desired output:
a = [array([ 3.00493560e+05, 3.04300000e+01, 3.21649837e-01,
6.50984546e+05, 3.00493379e+05, 3.03073203e+01])]
b = [array([ 14.])]
EDIT
In case c contains more than 2 arrays, you can generalize this approach by generating a list of split arrays:
splitArraysList = [ [individualArray] for individualArray in c ]
If the arrays are very big, you can use a generator instead of a list, to iterate on the individual arrays in the split list:
splitArraysList = ( [individualArray] for individualArray in c )

Maybe what you want is something like this:
a = np.concatenate([i for i in c if len(i) == 6])
b = np.concatenate([i for i in c if len(i) == 1])
This assumes numpy is imported as np, and that you want a to hold the values of all arrays with a length of 6 and b the values of all arrays with a length of 1.
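As a minimal sketch of the two ideas above: when the list holds exactly two arrays, plain unpacking already gives the two separate arrays, and for an arbitrary number of arrays they can be grouped by length. The name by_length is only illustrative and does not come from the question.

```python
import numpy as np

c = [np.array([3.00493560e+05, 3.04300000e+01, 3.21649837e-01,
               6.50984546e+05, 3.00493379e+05, 3.03073203e+01]),
     np.array([14.])]

# With exactly two arrays in the list, plain unpacking is enough.
a, b = c

# For an arbitrary number of arrays, group them by length instead.
# by_length is a hypothetical name: it maps a length to the arrays of that length.
by_length = {}
for arr in c:
    by_length.setdefault(len(arr), []).append(arr)

print(a.shape, b.shape)   # (6,) (1,)
print(sorted(by_length))  # [1, 6]
```

Unpacking raises a ValueError if c does not contain exactly two items, so the grouped version is the safer choice when the number of arrays can vary.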

Related

How to create a list of n arrays from a list of tuples, each tuple containing n arrays? (Other than with a for loop)

The issue
I have a list which contains 4 tuples. It is the output of multiprocessing.Pool.map() , but I don't think that's important.
Each tuple contains 3 numpy arrays.
What is a good way to create 3 arrays, i.e. append (vstack) all the first arrays into one, all the second into another, etc.? That is, create the orange output from the orange arrays, etc., in the screenshot below:
What I have tried
I could of course do a very banal loop, like in the toy example below; it works, but it doesn't seem very elegant. I presume there's a more elegant/ pythonic way?
import numpy as np

x = np.random.rand(10, 2)
a = (x, 2*x, 3*x)
b = a
c = a
d = a
my_list = [a, b, c, d]
num_items = len(my_list[0])
out = [None] * num_items
for i in range(num_items):  # 3 arrays in each tuple
    out[i] = []
    for l in my_list:
        out[i].append(l[i])
    out[i] = np.vstack(out[i])
my_array = np.array(my_list).swapaxes(0,1) # puts the `out` dimension in front
my_array.shape
Out[]: (3, 4, 10, 2)
if you want to concatenate the first dimension:
my_array.reshape(3, -1, 2) #-> shape (3, 40, 2)
if you really want a list:
list(my_array) #-> list of 3 arrays of shape (4, 10, 2)
You can try:
np.concatenate(my_list, axis=1)
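Another option, sketched here with the toy data from the question, is to let zip(*my_list) regroup the tuples position by position and stack each group. This keeps the result as a list of three separate arrays, which also works when the arrays at different positions have different shapes.

```python
import numpy as np

x = np.random.rand(10, 2)
a = (x, 2 * x, 3 * x)
my_list = [a, a, a, a]

# zip(*my_list) regroups the tuples position by position:
# all first arrays together, all second arrays together, and so on.
out = [np.vstack(arrays) for arrays in zip(*my_list)]

print(len(out))      # 3
print(out[0].shape)  # (40, 2)
```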

Numpy split array by grouping array

There are the following two arrays of equal length. My goal is to split array B into groups defined by array A, so that I end up with 3 arrays or a list of arrays. The final list of arrays should consist of the following rows of array B:
First and second
Third and fifth
Fourth
The order is not really relevant.
A = array([[-1],
[ 1],
[ 0],
[ 0],
[ 1]])
B = array([[ 624.5 , 548. ],
[ 912.8201, 564.3444],
[1564.5 , 764. ],
[1463.4163, 785.9251],
[1698.0757, 846.6306]])
I ran into this problem while using the DBSCAN clustering function. Array A describes the clusters (0, 1) of the points in array B. The value -1 marks a point as an outlier. (The values used are not precise.)
My goal is to calculate the compactness, ... of each found cluster
The numpy_indexed package (disclaimer: I am its author) was designed with this type of use case in mind.
import numpy_indexed as npi
C = npi.group_by(A).split(B)
Not sure what you mean by compactness of each group; but rather than splitting and doing subsequent computations, it is typically more efficient to compute reductions over groups directly; whereby you can reuse the grouping object for increased efficiency:
groups = npi.group_by(A)
mean = groups.mean(B)
std = groups.std(B)
Keep it simple:
[data[labels == l] for l in np.unique(labels)]
Similarly, you can build a dict in a one-liner.
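A minimal sketch of the masking approach with the arrays from the question. One assumption is made explicit here: A has shape (5, 1), so it is flattened first, because a boolean mask of shape (5, 1) cannot select rows of B, which has shape (5, 2). The names labels and groups are illustrative.

```python
import numpy as np

A = np.array([[-1], [1], [0], [0], [1]])
B = np.array([[624.5, 548.0],
              [912.8201, 564.3444],
              [1564.5, 764.0],
              [1463.4163, 785.9251],
              [1698.0757, 846.6306]])

labels = A.ravel()  # flatten (5, 1) -> (5,) so it can mask the rows of B

# One entry per cluster label; each value holds the rows of B with that label.
groups = {int(label): B[labels == label] for label in np.unique(labels)}

for label, rows in groups.items():
    print(label, rows.shape)
```

Each value in groups is a regular 2-D array, so per-cluster statistics such as rows.mean(axis=0) can be computed directly on it.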
This is a bit lengthy, but it should work.
final_dict = {}
for counter in range(0, len(A)):
    key = int(A[counter][0])  # A has shape (n, 1); a plain int is usable as a dict key
    if key not in final_dict:
        final_dict[key] = [B[counter]]
    else:
        final_dict[key].append(B[counter])
final_array = []
for key, value in final_dict.items():
    final_array.append(np.array(value))
Basically, since you have odd values like -1 to work with, you can use them as the keys of a dictionary and then iterate over the dictionary to get the groups of values, which you can then append to a final output array.

Python sum values from multiple lists (more than two)

Looking for a pythonic way to sum values from multiple lists:
I have got the following list of lists:
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = [a,b,c,d]
I am looking for the output:
[8,10,8]
I've used:
print ([sum(x) for x in zip(*my_list )])
but zip only works when I have 2 elements in my_list.
Any idea?
zip works for an arbitrary number of iterables:
>>> list(map(sum, zip(*my_list)))
[8, 10, 8]
which is, of course, roughly equivalent to your comprehension which also works:
>>> [sum(x) for x in zip(*my_list)]
[8, 10, 8]
NumPy has a nice way of doing this, and it can also handle very large arrays. First, create my_list as a NumPy array:
import numpy as np
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = np.array([a,b,c,d])
To get the sum over the columns, you can do the following
np.sum(my_list, axis=0)
Alternatively, the sum over the rows can be retrieved by
np.sum(my_list, axis=1)
I'd make it a numpy array and then sum along axis 0:
import numpy
my_list = numpy.array([a,b,c,d])
my_list.sum(axis=0)
Output:
[ 8 10 8]
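For a quick side-by-side check, the sketch below runs the zip-based comprehension and the NumPy version on the four lists from the question. One caveat that applies to the zip version: zip stops at the shortest input, so lists of unequal length are silently truncated.

```python
import numpy as np

a = [0, 5, 2]
b = [2, 1, 1]
c = [1, 1, 1]
d = [5, 3, 4]
my_list = [a, b, c, d]

# zip(*my_list) yields one tuple per column, however many lists there are.
column_sums = [sum(column) for column in zip(*my_list)]

# np.sum accepts the nested list directly; tolist() converts back to Python ints.
numpy_sums = np.sum(my_list, axis=0).tolist()

print(column_sums)  # [8, 10, 8]
print(numpy_sums)   # [8, 10, 8]
```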

Check how many numpy array within a numpy array are equal to other numpy arrays within another numpy array of different size

My problem
Suppose I have
a = np.array([ np.array([1,2]), np.array([3,4]), np.array([5,6]), np.array([7,8]), np.array([9,10])])
b = np.array([ np.array([5,6]), np.array([1,2]), np.array([3,192])])
They are two arrays, of different sizes, containing other arrays (the inner arrays have same sizes!)
I want to count how many items of b (i.e. inner arrays) are also in a. Notice that I am not considering their position!
How can I do that?
My Try
count = 0
for bitem in b:
    for aitem in a:
        if np.array_equal(aitem, bitem):  # a bare aitem == bitem is element-wise and ambiguous in an if
            count += 1
Is there a better way? Especially in one line, maybe with some comprehension..
The numpy_indexed package contains efficient (nlogn, generally) and vectorized solutions to these types of problems:
import numpy_indexed as npi
count = len(npi.intersection(a, b))
Note that this is subtly different than your double loop, discarding duplicate entries in a and b for instance. If you want to retain duplicates in b, this would work:
count = npi.in_(b, a).sum()
Duplicate entries in a could also be handled by doing npi.count(a) and factoring in the result of that; but anyway, I'm just rambling on for illustration purposes, since I imagine the distinction probably does not matter to you.
Here is a simple way to do it:
a = np.array([ np.array([1,2]), np.array([3,4]), np.array([5,6]), np.array([7,8]), np.array([9,10])])
b = np.array([ np.array([5,6]), np.array([1,2]), np.array([3,192])])
count = np.count_nonzero(
    np.any(np.all(a[:, np.newaxis, :] == b[np.newaxis, :, :], axis=-1), axis=0))
print(count)
# prints 2
You can do what you want in a one-liner as follows:
count = sum([np.array_equal(x,y) for x,y in product(a,b)])
Explanation
Here's an explanation of what's happening:
1. Iterate through the two arrays using itertools.product, which creates an iterator over the Cartesian product of the two arrays.
2. Compare the two arrays in each tuple (x, y) from step 1 using np.array_equal.
3. Sum the results: True counts as 1 when summed.
Full example:
The final code looks like this:
import numpy as np
from itertools import product
a = np.array([ np.array([1,2]), np.array([3,4]), np.array([5,6]), np.array([7,8]), np.array([9,10])])
b = np.array([ np.array([5,6]), np.array([1,2]), np.array([3,192])])
count = sum([np.array_equal(x,y) for x,y in product(a,b)])
# output: 2
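The pairwise comparison above performs len(a) * len(b) comparisons. As an alternative sketch in plain Python, the rows of a can be turned into a set of tuples so that each row of b needs only one membership test. The name rows_of_a is illustrative. Like the double loop, this counts every matching item of b, but duplicate rows in a are counted once.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
b = np.array([[5, 6], [1, 2], [3, 192]])

# Tuples are hashable, so the rows of a can be stored in a set.
rows_of_a = {tuple(row) for row in a.tolist()}

# Count the rows of b that also appear somewhere in a.
count = sum(tuple(row) in rows_of_a for row in b.tolist())

print(count)  # 2
```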
You can convert the rows to dtype np.void and then use np.in1d on the resulting 1-D arrays:
def void_arr(a):
    return np.ascontiguousarray(a).view(np.dtype((np.void, a.dtype.itemsize * a.shape[1])))
b[np.in1d(void_arr(b), void_arr(a))]
array([[5, 6],
[1, 2]])
If you just want the number of intersections, it's
np.in1d(void_arr(b), void_arr(a)).sum()
2
Note: if there are repeat items in b or a, then np.in1d(void_arr(b), void_arr(a)).sum() likely won't be equal to np.in1d(void_arr(a), void_arr(b)).sum(). I've reversed the order from my original answer to match your question (i.e. how many elements of b are in a?)
For more information, see the third answer here

Two dimensional array in python

I want to know how to declare a two dimensional array in Python.
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
The first two assignments work fine. But when I try to do, arr[1].append("bb1"), I get the following error:
IndexError: list index out of range.
Am I doing anything silly in trying to declare the 2-D array?
Edit:
but I do not know the number of elements in the array (both rows and columns).
You do not "declare" arrays or anything else in python. You simply assign to a (new) variable. If you want a multidimensional array, simply add a new array as an array element.
arr = []
arr.append([])
arr[0].append('aa1')
arr[0].append('aa2')
or
arr = []
arr.append(['aa1', 'aa2'])
There aren't multidimensional arrays as such in Python, what you have is a list containing other lists.
>>> arr = [[]]
>>> len(arr)
1
What you have done is declare a list containing a single list. So arr[0] contains a list but arr[1] is not defined.
You can define a list containing two lists as follows:
arr = [[],[]]
Or to define a longer list you could use:
>>> arr = [[] for _ in range(5)]
>>> arr
[[], [], [], [], []]
What you shouldn't do is this:
arr = [[]] * 3
As this puts the same list in all three places in the container list:
>>> arr[0].append('test')
>>> arr
[['test'], ['test'], ['test']]
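To make the difference concrete, this short sketch builds the container both ways and checks object identity with the is operator. The names shared and independent are illustrative.

```python
shared = [[]] * 3                      # three references to one and the same list
independent = [[] for _ in range(3)]   # three distinct lists

shared[0].append('test')
independent[0].append('test')

print(shared)                            # [['test'], ['test'], ['test']]
print(independent)                       # [['test'], [], []]
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```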
What you're using here are not arrays, but lists (of lists).
If you want multidimensional arrays in Python, you can use Numpy arrays. You'd need to know the shape in advance.
For example:
import numpy as np
arr = np.empty((3, 2), dtype=object)
arr[0, 1] = 'abc'
You are trying to append to the second element of the list, but it does not exist yet.
Create it first:
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr.append([])
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
We can create a multidimensional array dynamically as follows.
Create two variables to read x (the number of rows) and y (the number of columns) from standard input:
print("Enter the value of x: ")
x=int(input())
print("Enter the value of y: ")
y=int(input())
Create a list of lists with initial values of 0 (or anything else) using the following code:
z=[[0 for col in range(0,y)] for row in range(0,x)]
This creates x rows of y columns each for your data.
Read data from standard input:
for i in range(x):
    for j in range(y):
        z[i][j]=input()
Display the result:
for i in range(x):
    for j in range(y):
        print(z[i][j],end=' ')
    print()
Another way to display the dynamically created array is:
for row in z:
    print(row)
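A non-interactive sketch of the same idea, without reading from standard input. make_grid is a hypothetical helper, not part of any library. The outer comprehension builds the rows and the inner one the columns, so grid[i][j] is valid for i below rows and j below cols.

```python
def make_grid(rows, cols, fill=0):
    """Return a list of `rows` independent lists, each holding `cols` copies of `fill`."""
    return [[fill for _ in range(cols)] for _ in range(rows)]

grid = make_grid(2, 3)
grid[1][2] = 'x'   # last column of the last row

for row in grid:
    print(row)
# [0, 0, 0]
# [0, 0, 'x']
```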
When constructing multi-dimensional lists in Python I usually use something similar to ThiefMaster's solution, but rather than appending items to index 0, then appending items to index 1, etc., I always use index -1 which is automatically the index of the last item in the array.
i.e.
arr = []
arr.append([])
arr[-1].append("aa1")
arr[-1].append("aa2")
arr.append([])
arr[-1].append("bb1")
arr[-1].append("bb2")
arr[-1].append("bb3")
will produce the 2D-array (actually a list of lists) you're after.
You can first append elements to the initialized array and then for convenience, you can convert it into a numpy array.
import numpy as np
a = [] # start with an empty list
a.append(['aa1']) # append elements
a.append(['aa2'])
a.append(['aa3'])
print(a)
a_np = np.asarray(a) # convert to numpy array
print(a_np)
a = [[] for index in range(n)]  # n empty rows, where n is the number of rows you need
For competitive programming
1) To read the values of a 2D array from input
row = int(input())
main_list = []
for i in range(0, row):
    temp_list = list(map(int, input().split()))
    main_list.append(temp_list)
2) To display the 2D array
for i in range(0, row):
    for j in range(0, len(main_list[i])):
        print(main_list[i][j], end=' ')
    print()
The methods above did not work for me inside a for loop, where I wanted to copy data from a 2D array to a new array under an if condition. This method does work:
a_2d_list = [[1, 2], [3, 4]]
a_2d_list.append([5, 6])
print(a_2d_list)
OUTPUT - [[1, 2], [3, 4], [5, 6]]
x = 3  # rows
y = 3  # columns
a = []  # create an empty list first
for i in range(x):
    a.append([0] * y)  # append a row of y zeros to the original list
    for j in range(y):
        a[i][j] = input("Enter the value")
In my case I had to do this:
for index, user in enumerate(users):
    table_body.append([])
    table_body[index].append(user.user.id)
    table_body[index].append(user.user.username)
Output:
[[1, 'john'], [2, 'bill']]
