I have some observations collected at sea that we managed to classify into 2 clusters (blue and red) based on their properties. As you can see in the example below, when projected, the classification looks "spatially coherent", or at least the clusters don't look randomly distributed. I'm looking for a statistic that quantifies this spatial coherence, either for each class or for the full classification. I have seen examples in the PySAL and esda modules, but none with this type of data: a two-dimensional labeled array (values 1 and 2) with missing data (zero values). I don't know how to proceed.
This is the code example:
import matplotlib as mpl
import matplotlib.pyplot as plt
# DATA EXAMPLE
# I have a regular grid, with not-sampled (d==0) and sampled (d>0) areas.
# Sampled areas were classified as '1' and '2', based on some measurements
# that we collected at each location.
d = [[0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0],
[1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2],
[0, 1, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2],
[0, 1, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 2, 2, 2],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2],
[0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2]]
# VISUAL
# class 1 (blue) and class 2 (red)
cmap = mpl.colors.ListedColormap(['w','b','r'])
plt.pcolormesh(d, cmap=cmap)
That's what you see when running the example:
Any advice on how to proceed? Thanks in advance!
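One possible starting point, offered as a sketch rather than a PySAL/esda recipe: a join-count style statistic counts how many pairs of adjacent sampled cells share the same class, and a permutation test compares that count with what a random relabelling of the sampled cells would give. The function names, the rook (edge-sharing) adjacency, and the small demo grid below are assumptions made for illustration; zeros are treated as not sampled, as in the example above.

```python
import numpy as np

def same_class_joins(grid):
    """Return (same-class joins, all joins) between edge-adjacent sampled cells."""
    g = np.asarray(grid)
    same = total = 0
    # Compare each cell with its right neighbour, then with the cell below it.
    for a, b in ((g[:, :-1], g[:, 1:]), (g[:-1, :], g[1:, :])):
        sampled = (a > 0) & (b > 0)          # both cells were sampled
        total += int(sampled.sum())
        same += int((sampled & (a == b)).sum())
    return same, total

def permutation_test(grid, n_perm=999, seed=0):
    """Shuffle labels among sampled cells and return (observed, total, pseudo p-value)."""
    rng = np.random.default_rng(seed)
    g = np.asarray(grid)
    mask = g > 0
    observed, total = same_class_joins(g)
    hits = 0
    for _ in range(n_perm):
        shuffled = g.copy()
        shuffled[mask] = rng.permutation(g[mask])   # sampled locations stay fixed
        if same_class_joins(shuffled)[0] >= observed:
            hits += 1
    # One-sided: how often is a random labelling at least as coherent?
    return observed, total, (hits + 1) / (n_perm + 1)

# Small hypothetical grid: class 1 on the left, class 2 on the right, 0 = not sampled.
demo = [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [0, 1, 2, 0]]
print(same_class_joins(demo))   # (10, 13)
print(permutation_test(demo))
```

A small pseudo p-value suggests that same-class neighbours occur more often than a random labelling would produce. Passing `d` from the example instead of `demo` applies the same test to the full grid.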
So say I have a 2D array such as:
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 3, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]
And I want to set all the values 2 levels out around the 3 to a specific number like:
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 1, 1, 3, 1, 1, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]
Note that the 3 could be at any position in the list; I'm using a random generator to place it. How could I achieve this? Maybe using a for loop?
Carrying on from the comment: I find numpy super useful for slicing like this:
import numpy as np
arr = np.array([
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 3, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
])
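xs, ys = np.where(arr == 3)
# Clamp the start indices at 0: a negative start would wrap around when the 3 is near an edge
r0, c0 = max(xs[0] - 2, 0), max(ys[0] - 2, 0)
arr[r0: xs[0] + 3, c0: ys[0] + 3] = 1
arr[xs[0], ys[0]] = 3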
This is obviously possible in pure Python with lists as well, but you will probably be knee-deep in double iteration.
Here's a pure Python approach (albeit rather clumsy):
A = [
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 3, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]
for j, a in enumerate(A):
    try:
        i = a.index(3)
    except ValueError:
        continue
    # Fill rows j-2..j+2 and columns i-2..i+2, clamped to the grid
    for r in range(max(0, j-2), min(j+3, len(A))):
        for k in range(max(0, i-2), min(i+3, len(a))):
            A[r][k] = 1
    a[i] = 3
    break
print(A)
A pure Python version:
A = [
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 3, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]
location = [(i, j) for i, row in enumerate(A) for j, item in enumerate(row) if item == 3][0]
for i, row in enumerate(A):
    for j, item in enumerate(row):
        if (abs(i - location[0]) <= 2) and (abs(j - location[1]) <= 2) and not ((i, j) == location):
            A[i][j] = 1
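Since the 3 is placed by a random generator, its position is usually already known, so there is no need to search for it or to scan the whole grid. A minimal sketch under that assumption, looping only over the clamped neighbourhood (the helper name `mark_around` and its parameters are made up for illustration):

```python
import random

def mark_around(grid, row, col, radius=2, fill=1):
    """Fill the square within `radius` of (row, col), leaving the centre untouched."""
    n_rows, n_cols = len(grid), len(grid[0])
    # Clamp both ranges so a centre near an edge does not index outside the grid.
    for i in range(max(row - radius, 0), min(row + radius + 1, n_rows)):
        for j in range(max(col - radius, 0), min(col + radius + 1, n_cols)):
            if (i, j) != (row, col):
                grid[i][j] = fill
    return grid

size = 10
grid = [[0] * size for _ in range(size)]       # one independent list per row
row, col = random.randrange(size), random.randrange(size)
grid[row][col] = 3
mark_around(grid, row, col)
for line in grid:
    print(line)
```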
I am currently working on a project that involves creating an array of 10 binary values (0 or 1), each drawn from a binomial distribution with a given success rate (= ci_rate[i]/1'000).
Because the rate is different for each of the 10 years, I run a loop 10 times, creating 20'000 binomial values each time (one for each of the 20'000 scenarios).
The success rate is very small, but a success is an absorbing state for the following years. Simplified to only 10 scenarios and 10 years, I would like to output the following:
[1,0,0,0,0,0,0,0,0,0]
[1,0,0,0,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
Currently I am solving the problem in this way:
ci_sim = []
for j in range(20000):
    tem = np.zeros(len(ci_rate))
    for i in range(len(ci_rate)):
        if i == 0:
            tem[0] = (np.random.binomial(1, p = ci_rate[i] / 1000))
        else:
            tem[i]= int(np.where(tem[i-1]==1, 1, np.random.binomial(1, p = ci_rate[i] / 1000)))
    ci_sim.append(tem)
Can anyone suggest a more time-efficient way to solve this?
This solution first ignores the persistence rule and enforces it afterwards using maximum.accumulate.
ci_rate = np.random.uniform(0, 0.1, 10)
res = np.maximum.accumulate(np.random.random((20000, ci_rate.size))<ci_rate, axis=1).view(np.int8)
res[:20]
#
# array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
# [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=int8)
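To see what the running maximum does here, a tiny hand-made example may help. The `draws` array below is hypothetical data, not simulation output: each row is a scenario and each column a year, with True marking a success drawn in that year.

```python
import numpy as np

# Hypothetical independent per-year draws for three scenarios.
draws = np.array([
    [False, True,  False, False],
    [False, False, False, False],
    [True,  False, True,  False],
])

# Running maximum along each row: once a True appears, every later year stays True.
absorbed = np.maximum.accumulate(draws, axis=1).astype(np.int8)
print(absorbed)
# [[0 1 1 1]
#  [0 0 0 0]
#  [1 1 1 1]]
```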
My attempt would be:
import numpy as np
ci_rate = np.random.normal(size=20)
ci_rate = (ci_rate - min(ci_rate)) /(max(ci_rate) - min(ci_rate)) - 0.7
ci_rate[ci_rate < 0] = 0
r = []
for i in range(100):
    t = np.random.binomial(1, ci_rate)
    r += [t.tolist()]
    ci_rate = [1 if j == 1 else i for i, j in zip(ci_rate, t)]
#output
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
I suggest the geometric distribution, since it looks like you are trying to model the number of trials until the first success.
Below I compare the geometric distribution approach with the other answer in terms of computation time.
EDIT:
%%timeit
ci_rate = np.random.uniform(0, 0.1, nb_years)
successful_trail = np.random.geometric(p=ci_rate)
ci_sim=np.zeros((nb_scenarios,nb_years))
for i in range(nb_years):
    ci_sim[i,successful_trail[i]:]=1
## 10000 loops, best of 3: 41.4 µs per loop
%%timeit
ci_rate = np.random.uniform(0, 0.1, nb_years)
res = np.maximum.accumulate(np.random.random((nb_scenarios, ci_rate.size))<ci_rate, axis=1).view(np.int8)
## 100 loops, best of 3: 2.97 ms per loop
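When the rate changes from year to year, the first-success idea can still be used by sampling the absorption year from its cumulative distribution instead of from a single geometric distribution. The following is only a sketch of that idea, not the code timed above: it reuses the names `nb_scenarios`, `nb_years` and `ci_rate`, and assumes the rates are already probabilities between 0 and 1.

```python
import numpy as np

rng = np.random.default_rng(0)
nb_scenarios, nb_years = 20000, 10
ci_rate = rng.uniform(0, 0.1, nb_years)      # hypothetical yearly probabilities

# Probability that a scenario has been absorbed by the end of each year:
# 1 minus the probability of surviving every year so far.
absorbed_by = 1.0 - np.cumprod(1.0 - ci_rate)

# One uniform draw per scenario. Because absorbed_by never decreases,
# u < absorbed_by[k] stays true for every later year once it becomes true.
u = rng.random(nb_scenarios)
ci_sim = (u[:, None] < absorbed_by[None, :]).astype(np.int8)

print(ci_sim.shape)          # (20000, 10)
```

This needs one random number per scenario rather than one per scenario and year, and each row is non-decreasing by construction.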
def create_octahedron(size):
    x = []
    y = []
    z = []
    if size % 2 == 0 or size <= 1:
        return x
    for i in range(size):
        x.append(0)
    for i in range(size):
        y.append(x)
    for i in range(size):
        z.append(y)
    for i in range(size):
        for u in range(size):
            for v in range(size):
                if i == len(z)//2:
                    if u == len(y)//2:
                        if v == len(x)//2:
                            z[3][3][3] = 1
    print(z)
create_octahedron(7)
[[[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]]]
This is the output I keep getting, but I expect a single 1 at the middle of the entire structure, not one in every row. I am much less interested in how to fix this, as I already know how. What I want to know is why the code gives this output.
Because you append the same list every time. In z, every element points to the same y, and in y, every row points to the same x. If you try z[0][0][0] = 2, you will see that the first element of every row changes to 2.
To avoid this, create a new x/y list before each append.
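A small sketch of the difference, using a size of 3 to keep the output short (the variable names are made up for illustration):

```python
size = 3

row = [0] * size
shared = [row for _ in range(size)]               # three references to one list
independent = [[0] * size for _ in range(size)]   # a new list for every row

shared[1][1] = 1
independent[1][1] = 1

print(shared)                   # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
print(independent)              # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(shared[0] is shared[1])   # True: both names refer to the same object

# The same pattern one level deeper gives an independent 3-D structure.
cube = [[[0] * size for _ in range(size)] for _ in range(size)]
cube[1][1][1] = 1
```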
I have a column vector that signifies the day of the week
[1,2,2,3,4]
I need to binarise this vector in the sense that every item in the original vector must be transformed to a vector where the number indicates an index that needs to be 1 and the rest must be 0.
[[0,1,0,0,0,0,0,0,0],
[0,0,1,0,0,0,0,0,0],
[0,0,1,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0,0],
[0,0,0,0,1,0,0,0,0]]
Do it by composing each binary row from zeros, with a 1 at the given position, in a list comprehension, which gives a nice one-liner:
w=[1,2,2,3,4]
m = [[0]*(pos)+[1]+[0]*(9-pos-1) for pos in w]
result:
m = [[0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0]]
A simple list comprehension would be:
>>> vector = [1,2,2,3,4]
>>> [[int(i==j) for i in range(10)] for j in vector]
[[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]]
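If numpy is available, the same result can be sketched by indexing the rows of an identity matrix. The width of 9 is an assumption taken from the 9-column output in the question, and `n_columns` is a name introduced here for illustration.

```python
import numpy as np

vector = [1, 2, 2, 3, 4]
n_columns = 9   # assumed width, matching the 9-column output in the question

# Row k of the identity matrix is all zeros with a 1 at index k,
# so indexing with the vector picks one such row per item.
one_hot = np.eye(n_columns, dtype=int)[vector]
print(one_hot)
```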