Numpy array of strings into an array of integers - python

I have the following array:
pattern = array([['[0, 0, 1, 0, 0]'],
['[0, 1, 1, 1, 1]'],
['[0, 1, 1, 1, 0]'],
['[0, 0, 1, 1, 1]'],
['[0, 0, 0, 1, 1]'],
['[0, 0, 1, 0, 1]'],
['[0, 0, 0, 0, 1]'],
['[1, 0, 1, 0, 0]'],
['[0, 1, 0, 1, 1]'],
['[0, 0, 1, 1, 0]'],
['[1, 1, 1, 1, 1]'],
['[1, 1, 1, 1, 0]']], dtype='<U15')
and I want to get it in non-string format, like the following:
import numpy
my_array = numpy.array([[0, 0, 1, 0, 0],
[0, 1, 1, 1, 1],
[0, 1, 1, 1, 0],
[0, 0, 1, 1, 1],
[0, 0, 0, 1, 1],
[0, 0, 1, 0, 1],
[0, 0, 0, 0, 1],
[1, 0, 1, 0, 0],
[0, 1, 0, 1, 1],
[0, 0, 1, 1, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 0]
])
Any idea on how to do it non-manually?

It is possible to do this with numpy string operations: strip the brackets ([]), split on the comma, and recast the result into an array with int dtype:
np.array(np.char.split(np.char.strip(pattern[:, 0], '[]'), ', ').tolist(), 'int')
However, a list comprehension that does the same thing with Python string methods is, in my opinion, much easier to read (and faster as well):
np.array([row[0][1:-1].split(', ') for row in pattern], dtype='int')
# array([[0, 0, 1, 0, 0],
# [0, 1, 1, 1, 1],
# [0, 1, 1, 1, 0],
# [0, 0, 1, 1, 1],
# [0, 0, 0, 1, 1],
# [0, 0, 1, 0, 1],
# [0, 0, 0, 0, 1],
# [1, 0, 1, 0, 0],
# [0, 1, 0, 1, 1],
# [0, 0, 1, 1, 0],
# [1, 1, 1, 1, 1],
# [1, 1, 1, 1, 0]])
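If the strings are always valid Python list literals, as they are in the question, another option might be to parse each one with ast.literal_eval instead of stripping and splitting by hand. A minimal sketch, assuming a shortened three-row pattern (the rows are copied from the question; the name parsed is only for illustration):

```python
import ast
import numpy as np

# Shortened version of the array from the question (assumed: one string per row).
pattern = np.array([['[0, 0, 1, 0, 0]'],
                    ['[0, 1, 1, 1, 1]'],
                    ['[1, 1, 1, 1, 0]']])

# literal_eval evaluates each string as a Python literal (here, a list of ints)
# without executing arbitrary code.
parsed = np.array([ast.literal_eval(row[0]) for row in pattern], dtype=int)
print(parsed)
```

This trades some speed for robustness to spacing differences such as '[0,1, 1]', which the split(', ') approaches would not handle.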

Related

How to calculate rolling up values in python

I have a python dict with the following values
d = {
"k1": [[0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0, 0]],
"k2": [[0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1]],
"k3": [[0, 1, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0, 0, 0], [0, 1, 1, 1, 1, 1, 1]],
"k4": [[0, 1, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0, 0, 0], [0, 1, 1, 1, 1, 1, 1],[0, 1, 1, 1, 1, 1, 1]]
}
I need to apply a reduce/roll up function on this.
For k1, I am expecting 6: combining the three lists position by position with OR gives 1, 1, 1, 1, 1, 0, 1, and 1+1+1+1+1+0+1 = 6. Likewise, I expect 4 for k2 and 7 for k3.
For e.g. for k1, I can calculate using this
k1_sum = sum([x | y | z for x,y,z in zip([0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0, 0])])
k2_sum = sum([x | y for x,y in zip([0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1])])
How do I unpack the values dynamically? The number of lists per key is not fixed, and there can be many keys.
I would like to write a function and by passing each key, I would like to get the rolled up value.
functools.reduce is perfect for the job:
from operator import or_
from functools import reduce
d = {
"k1": [[0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0, 0]],
"k2": [[0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1]],
"k3": [[0, 1, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0, 0, 0], [0, 1, 1, 1, 1, 1, 1]],
"k4": [[0, 1, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0, 0, 0], [0, 1, 1, 1, 1, 1, 1], [0, 1, 1, 1, 1, 1, 1]]
}
sums = {key: sum(reduce(or_, t) for t in zip(*xss)) for key, xss in d.items()}
print(sums)
Result:
{'k1': 6, 'k2': 4, 'k3': 7, 'k4': 7}
Don't use heavier libraries than you need to (like numpy), unless you're using them anyway or you find they give you a performance advantage you need.
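Since the question asks for a function that takes a key and returns the rolled-up value, the same reduce idea could be wrapped as below. This is only a sketch: the name rolled_up is made up for illustration, and the dict is shortened to two keys from the question.

```python
from functools import reduce
from operator import or_

def rolled_up(d, key):
    """OR the lists stored under `key` position by position, then count the ones."""
    return sum(reduce(or_, column) for column in zip(*d[key]))

d = {
    "k1": [[0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0, 0]],
    "k2": [[0, 1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1]],
}
print(rolled_up(d, "k1"))
print(rolled_up(d, "k2"))
```

zip(*d[key]) is what handles the variable number of lists: it unpacks however many lists the key holds and yields one tuple per position.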

Generating binary entries array in python

I would like to generate an array as follows:
[[0,0,0],
[0,0,1],
[0,1,0],
[0,1,1],
[1,0,0],
[1,0,1],
[1,1,0],
[1,1,1]]
I tried to achieve this with 3 nested for loops, but I want to go on to 4, 5, and higher numbers of bits, and that approach would not scale easily.
Is there any simple way for doing this?
I can't figure out why you want this, but here goes:
For 3:
>>> [[int(x) for x in "{0:03b}".format(y)] for y in range(8)]
[[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
>>>
For 5:
>>> [[int(x) for x in "{0:05b}".format(y)] for y in range(32)]
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 1], [0, 0, 0, 1, 0], [0, 0, 0, 1, 1], [0, 0, 1, 0, 0], [0, 0, 1, 0, 1], [0, 0, 1, 1, 0], [0, 0, 1, 1, 1], [0, 1, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0], [0, 1, 0, 1, 1], [0, 1, 1, 0, 0], [0, 1, 1, 0, 1], [0, 1, 1, 1, 0], [0, 1, 1, 1, 1], [1, 0, 0, 0, 0], [1, 0, 0, 0, 1], [1, 0, 0, 1, 0], [1, 0, 0, 1, 1], [1, 0, 1, 0, 0], [1, 0, 1, 0, 1], [1, 0, 1, 1, 0], [1, 0, 1, 1, 1], [1, 1, 0, 0, 0], [1, 1, 0, 0, 1], [1, 1, 0, 1, 0], [1, 1, 0, 1, 1], [1, 1, 1, 0, 0], [1, 1, 1, 0, 1], [1, 1, 1, 1, 0], [1, 1, 1, 1, 1]]
>>>
Matching your formatting is harder.
You can use itertools.product to do this.
>>> import itertools
>>> list(itertools.product([0,1], repeat=3))
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
https://docs.python.org/3/library/itertools.html#itertools.product
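To get lists rather than tuples, and for any number of bits, the product call can be wrapped in a small helper. A sketch, with binary_entries being a made-up name:

```python
from itertools import product

def binary_entries(n):
    """Return all 2**n bit patterns of length n, as lists, in counting order."""
    return [list(bits) for bits in product([0, 1], repeat=n)]

print(binary_entries(3))
```

product varies the last position fastest, so the rows come out in the same order as counting in binary.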
You can use a recursive function like the following:
def generate_binary_entries(n, t=[[]]): # n: length of bit number
    if n == 0:
        return t
    new_t = []
    for entry in t:
        new_t.append(entry + [0])
        new_t.append(entry + [1])
    return generate_binary_entries(n - 1, new_t)
Then
generate_binary_entries(4)
generates
[[0, 0, 0, 0],
[0, 0, 0, 1],
[0, 0, 1, 0],
[0, 0, 1, 1],
[0, 1, 0, 0],
[0, 1, 0, 1],
[0, 1, 1, 0],
[0, 1, 1, 1],
[1, 0, 0, 0],
[1, 0, 0, 1],
[1, 0, 1, 0],
[1, 0, 1, 1],
[1, 1, 0, 0],
[1, 1, 0, 1],
[1, 1, 1, 0],
[1, 1, 1, 1]]

Turning a list into list of lists

I am writing a function which takes columns=c and rows=r (the two can be unequal!) and should return a list of lists, where each row is a list containing c elements. How do I create such sublists given the list below?
list = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
should return:
[[0, 0, 0, 0, 0], [1, 1, 0, 1, 1], [0, 0, 1, 1, 1], [1, 1, 1, 1, 0], [0, 1, 0, 1, 1]]
I tried to use split() however it seems like it works for strings only.
Numpy:
import numpy
r, c = 4, 5  # 4 rows of 5 columns each
list_ = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0]
numpy.array(list_).reshape(r, c).tolist()
# out (example list shortened so the result is not 5x5):
[[0, 0, 0, 0, 0], [1, 1, 0, 1, 1], [0, 0, 1, 1, 1], [1, 1, 1, 1, 0]]
However, if your goal is just to create an r-by-c array of random zeroes and ones, you would be better off using:
numpy.random.randint(0, high=2, size=(r, c))
# out
array([[1, 1, 1, 0, 0],
[1, 1, 0, 0, 0],
[0, 1, 1, 1, 0],
[1, 0, 0, 1, 0]])
Use itertools.islice. (Also, don't use list as a variable name: it shadows the built-in list type.)
from itertools import islice
def chunker(data, rows, cols):
    d = iter(data)
    return [list(islice(d, cols)) for row in range(rows)]
data = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
result = chunker(data, 4, 5)
Result:
[[0, 0, 0, 0, 0],
[1, 1, 0, 1, 1],
[0, 0, 1, 1, 1],
[1, 1, 1, 1, 0]]
You can use a list comprehension:
c, r = 5, 5  # c items per row, r rows
data = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
list_of_lists = [data[i - c: i] for i in range(c, len(data) + 1, c)]
L = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print([L[i:i+4] for i in range(0, len(L), 4)])
output:
[[0, 0, 0, 0], [0, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1], [1, 1, 1, 0], [0, 1, 0, 1], [1]]
Using slicing and a list comprehension:
new_list = [list[i:i+5] for i in range(0, len(list), 5)]
Try this:
ls = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
[ls[i*5:i*5+5] for i in range(len(ls)//5)]
Out[1]:
[[0, 0, 0, 0, 0],
[1, 1, 0, 1, 1],
[0, 0, 1, 1, 1],
[1, 1, 1, 1, 0],
[0, 1, 0, 1, 1]]
Or as a function:
def split_list(items, length):
    return [items[i*length:i*length+length] for i in range(len(items)//length)]
split_list(ls, 5)
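Several of the snippets above silently drop or shorten the last chunk when the list length does not match rows times columns. If that should be an error instead, a check can be added up front. A sketch, assuming a hypothetical helper named to_rows that takes both dimensions as the question describes:

```python
def to_rows(data, rows, cols):
    """Split a flat list into `rows` lists of `cols` items each."""
    if len(data) != rows * cols:
        raise ValueError(f"expected {rows * cols} items, got {len(data)}")
    # Step through the list in strides of `cols`, taking one slice per row.
    return [data[i:i + cols] for i in range(0, len(data), cols)]

data = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print(to_rows(data, 5, 5))
```

Whether a mismatch should raise or just truncate depends on the use case; raising is only one possible choice.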

Numpy multidimensional slice

If I have a 2d Numpy array:
array([[0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 0, 1, 1]])
and I want to slice each row up to and including the first index position equal to 1, as below:
array([[0, 1],
[0, 0, 0, 0, 1],
[1],
[0, 0, 1],
[0, 0, 1]])
Is it possible to achieve this using broadcasting, or must all output arrays have the same shape? I have a solution using the following, but I was curious if this could be achieved using broadcasting?
x = np.random.choice([0,1], size = [5,10])
idx = x.argmax(axis = 1)
np.array([row[:i] for row, i in zip(x, idx + 1)])
You can do this using dtype=object:
a =np.array([[0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 0, 1, 1]])
idx = a.argmax(axis = 1)
a = np.array([row[:i] for row, i in zip(a, idx + 1)], dtype=object)
The output is:
a = array([array([0, 1]), array([0, 0, 0, 0, 1]), array([1]),
array([0, 0, 1]), array([0, 0, 1])], dtype=object)
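A regular NumPy array has to be rectangular, so the ragged result cannot come out of broadcasting alone; some per-row step is needed. One thing to watch with argmax is that it returns 0 for a row containing no 1 at all, so such a row would be cut to a single element. A sketch that guards against this, assuming (my choice, not something stated in the question) that rows without a 1 should be kept whole; the function name is made up:

```python
import numpy as np

def slice_through_first_one(a):
    """Cut each row just after its first 1; rows with no 1 are kept whole (an assumption)."""
    is_one = (a == 1)
    has_one = is_one.any(axis=1)
    first = is_one.argmax(axis=1)  # 0 for rows with no 1, hence the has_one check
    stops = np.where(has_one, first + 1, a.shape[1])
    return [row[:stop] for row, stop in zip(a, stops)]

a = np.array([[0, 1, 0, 1],
              [0, 0, 0, 0],
              [1, 0, 1, 1]])
for piece in slice_through_first_one(a):
    print(piece)
```

Returning a plain list of arrays avoids the object-dtype array altogether; wrap it in np.array(..., dtype=object) if an array is really needed.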

Neighbourhood of Scipy Labels

I've got an array of objects labeled with scipy.ndimage.measurements.label, called Labels. I've got another array, Data, containing stuff related to Labels. How can I make a third array, Neighbourhoods, that maps each position x,y to its nearest label L?
Given Labels and Data, how can I use python/numpy/scipy to get Neighbourhoods?
Labels = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]] )
Data = array([[1, 1, 1, 1, 1, 1, 2, 3, 4, 5],
[1, 0, 0, 0, 0, 1, 2, 3, 4, 5],
[1, 0, 0, 0, 0, 1, 2, 3, 4, 4],
[1, 0, 0, 0, 0, 1, 2, 3, 3, 3],
[1, 0, 0, 0, 0, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 1, 0, 0, 0, 1],
[3, 3, 3, 3, 2, 1, 0, 0, 0, 1],
[4, 4, 4, 3, 2, 1, 0, 0, 0, 1],
[5, 5, 4, 3, 2, 1, 1, 1, 1, 1]] )
Neighbourhoods = array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 1, 1, 1, 0, 2],
[1, 0, 0, 0, 0, 1, 1, 0, 2, 2],
[1, 0, 0, 0, 0, 1, 0, 2, 2, 2],
[1, 1, 1, 1, 1, 0, 2, 2, 2, 2],
[1, 1, 1, 1, 0, 2, 0, 0, 0, 2],
[1, 1, 1, 0, 2, 2, 0, 0, 0, 2],
[1, 1, 0, 2, 2, 2, 0, 0, 0, 2],
[1, 1, 2, 2, 2, 2, 2, 2, 2, 2]] )
Note: I'm not sure what should happen with ties, so I used zeros in the above Neighbourhoods
As suggested by David Zaslavsky, this is a job for a Voronoi diagram. Here is a numpy implementation: http://blancosilva.wordpress.com/2010/12/15/image-processing-with-numpy-scipy-and-matplotlibs-in-sage/
The relevant function is scipy.ndimage.distance_transform_edt. It has a return_indices option that can be exploited to do what you need (as well as to calculate the raw distances, Data in your example).
As an example:
import numpy as np
from scipy.ndimage import distance_transform_edt
labels = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]] )
i, j = distance_transform_edt(labels == 0, return_distances=False,
                              return_indices=True)
neighborhoods = labels[i, j]
print(neighborhoods)
This yields:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 2, 2, 2],
[1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
[1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
[1, 1, 2, 2, 2, 2, 2, 2, 2, 2]])
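The result above assigns every position to one label, so ties are resolved arbitrarily rather than marked with zeros as in the question. If ties need to be flagged, one possible approach is to compute the distance to each label separately and compare. The sketch below is a brute-force NumPy version, not the distance_transform_edt method: it uses exact integer squared distances, needs memory proportional to the grid size times the number of labeled cells, and assumes at least one nonzero label. The function name and the tie_value parameter are made up for illustration.

```python
import numpy as np

def nearest_label_with_ties(labels, tie_value=0):
    """Nearest-label map; positions equidistant from two or more labels get tie_value."""
    rows, cols = np.indices(labels.shape)
    label_ids = [lab for lab in np.unique(labels) if lab != 0]
    dists = []
    for lab in label_ids:
        r, c = np.nonzero(labels == lab)
        # Squared distance from every cell to every cell of this label, then the minimum.
        d2 = (rows[..., None] - r) ** 2 + (cols[..., None] - c) ** 2
        dists.append(d2.min(axis=-1))
    dists = np.stack(dists)                      # shape: (n_labels, height, width)
    best = dists.min(axis=0)
    nearest = np.asarray(label_ids)[dists.argmin(axis=0)]
    ties = (dists == best).sum(axis=0) > 1       # more than one label at minimum distance
    return np.where(ties, tie_value, nearest)

labels = np.array([[1, 0, 0, 0, 2]])
print(nearest_label_with_ties(labels))
```

Unlike the Neighbourhoods array in the question, this keeps labeled cells mapped to their own label; only genuinely tied positions become tie_value.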
