I have an array of coordinates, and I would like to split it into two arrays wherever there is a large gap in the Y value. The post "Split an array dependent on the array values in Python" does this based on the X value, and the method I use looks like this:
array = [[1,5],[3,5],[6,7],[8,7],[25,25],[26,50],.....]
n = len(array)
for i in range(n-1):
    if abs(array[i][0] - array[i+1][0]) >= 10:
        arr1 = array[:i+1]
        arr2 = array[i+1:]
I figured that when I want to split it dependent on the Y value I could just change:
if abs(array[i][0] - array[i+1][0]) to if abs(array[0][i] - array[0][i+1])
This does not work and I get IndexError: list index out of range.
I'm quite new to coding and I'm wondering why this does not work for finding gap in Y value when it works for finding the gap in the X value?
Also, how should I go about splitting the array depending on the Y value?
Any help is much appreciated!
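Each point is stored as [x, y], so the Y value is element 1 of each point. array[0][i] instead reads the i-th component of the first point, which has only two components, hence the IndexError once i reaches 2. You have to switch to this: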
array = [[1,5],[3,5],[6,7],[8,7],[25,25],[26,50]]
n = len(array)
for i in range(n-1):
    if abs(array[i][1] - array[i+1][1]) >= 10:
        arr1 = array[:i+1]
        arr2 = array[i+1:]
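Note that the loop above keeps going after the first match, so arr1 and arr2 end up reflecting the last large gap it finds. As a minimal sketch of an alternative, assuming NumPy is available and that you want a split at every large gap, np.diff and np.split can do the work. The function name split_on_y_gaps and the min_gap parameter are made up for illustration:

```python
import numpy as np

def split_on_y_gaps(points, min_gap=10):
    """Split [x, y] points wherever consecutive Y values differ by at least min_gap."""
    pts = np.asarray(points)
    # absolute difference between each Y value and the next one
    gaps = np.abs(np.diff(pts[:, 1]))
    # a gap between rows i and i+1 means the cut goes before row i+1
    cut_points = np.flatnonzero(gaps >= min_gap) + 1
    return np.split(pts, cut_points)

parts = split_on_y_gaps([[1, 5], [3, 5], [6, 7], [8, 7], [25, 25], [26, 50]])
for part in parts:
    print(part.tolist())
```

With the sample data there are two large gaps in Y (7 to 25 and 25 to 50), so this returns three groups rather than two.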
I have a matrix X, and the labels to each vector of that matrix as a np array Y. If the value of Y is +1 I want to multiply the vector of matrix X by another vector W. If the value of Y is -1 I want to multiply the vector of matrix X by another vector Z.
I tried the following loop:
for i in X:
    if Y[i] == 1:
        np.sum(np.multiply(i, np.log(w)) + np.multiply((1-i), np.log(1-w)))
    elif Y[i] == -1:
        np.sum(np.multiply(i, np.log(z)) + np.multiply((1-i), np.log(1-z)))
IndexError: arrays used as indices must be of integer (or boolean) type
i is the index of X, but I'm not sure how to align the index of X to the value in that index of Y.
How can I do this?
Look into np.where, it is exactly what you need:
res = np.where(Y == +1, np.dot(X, W), np.dot(X, Z))
This assumes that Y can take only value +1 and -1. If that's not the case you can adapt the script above but we need to know what do you expect as result when Y takes a different value.
Also, try to avoid explicit for loops when using numpy; vectorized code is typically much faster, often by orders of magnitude on large arrays.
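The np.dot calls above stand in for whatever per-row computation you need. As a rough sketch, assuming X is a 2-D array of 0/1 values and w and z are 1-D arrays of probabilities strictly between 0 and 1, the expression from the question could be vectorized like this. The function name per_row_scores and the sample data are made up for illustration:

```python
import numpy as np

def per_row_scores(X, Y, w, z):
    """For each row of X, sum x*log(p) + (1-x)*log(1-p), with p = w where Y == 1 and p = z otherwise."""
    score_w = (X * np.log(w) + (1 - X) * np.log(1 - w)).sum(axis=1)
    score_z = (X * np.log(z) + (1 - X) * np.log(1 - z)).sum(axis=1)
    # pick the w-based score where the label is +1, the z-based score elsewhere
    return np.where(Y == 1, score_w, score_z)

X = np.array([[1, 0], [0, 1]])
Y = np.array([1, -1])
w = np.array([0.5, 0.5])
z = np.array([0.25, 0.25])
scores = per_row_scores(X, Y, w, z)
```

Both candidate scores are computed for every row and np.where then selects one per row, which is simple but does roughly twice the arithmetic of the loop.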
While it is better to use a no-iteration approach, I think you need a refresher on basic Python iteration.
Make an array:
In [57]: x = np.array(['one','two','three'])
Your style of iteration:
In [58]: for i in x: print(i)
one
two
three
To iterate over indices, use range (or np.arange):
In [59]: for i in range(x.shape[0]): print(i,x[i])
0 one
1 two
2 three
enumerate is a handy way of getting both indices and values:
In [60]: for i,v in enumerate(x): print(i,v)
0 one
1 two
2 three
Your previous question had the same problem (and correction):
How to iterate over a numpy matrix?
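Applied to the question, a small sketch of pairing each row of X with its label might look like the following. It uses zip, which is an alternative to enumerate when you only need the matching values and not the index. The sample arrays and the "w"/"z" tags are placeholders for illustration:

```python
import numpy as np

X = np.array([[1, 0], [0, 1], [1, 1]])
Y = np.array([1, -1, 1])

picked = []
for row, label in zip(X, Y):
    # row is one vector of X, label is the matching entry of Y
    which = "w" if label == 1 else "z"
    picked.append((which, row.tolist()))

print(picked)
```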
I have two arrays of the same length that contain elements from 0 to 1. For example:
x = np.linspace(0,1,100)
y = np.random.permutation(x)
I grouped the elements of x in bins of width 0.1:
bins = np.arange(0,1,0.1)
x_bin = []
for i in range(1,10):
    x_bin.append(x[np.digitize(x,bins)==i])
Now I would like to slice y in groups which have the same lengths of the arrays in x_bin.
How can I do that?
A possible way is:
y0 = y[0:len(x_bin[0])]
and so on, but it is not very elegant.
This may be what you want to use as a more elegant solution than using loops:
l = [len(x) for x in x_bin] # get bin lengths
split_indices = np.cumsum(l) # sum up lengths for correct split indices
y_split = np.split(y, split_indices)
I got the array lengths via a list comprehension and then split the np array using the gathered indices. This can be shortened to a single Python statement, but it is much easier to read this way.
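One detail to keep in mind: because the cumulative sum includes the final index, np.split returns one extra trailing piece holding whatever is left of y after the last bin. Here is a self-contained sketch using the setup from the question; the names pieces and remainder are chosen for illustration:

```python
import numpy as np

x = np.linspace(0, 1, 100)
y = np.random.permutation(x)

bins = np.arange(0, 1, 0.1)
x_bin = [x[np.digitize(x, bins) == i] for i in range(1, 10)]

lengths = [len(b) for b in x_bin]          # bin lengths
pieces = np.split(y, np.cumsum(lengths))   # one piece per bin, plus the leftover
y_split, remainder = pieces[:-1], pieces[-1]

print([len(p) for p in y_split], len(remainder))
```

The remainder is non-empty here because the loop in the question only collects bins 1 to 9, so values of x at or above the last bin edge are not in x_bin.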
A possible way is:
y0 = y[0:len(x_bin[0])]
and so on, but it is not very elegant.
Instead of using y0 = ..., y1 = ..., you can build a list of slices:
slices = []
start = 0
for group in x_bin:
    slices.append(y[start:start + len(group)])
    start += len(group)
(this is a sketch of the principle: each slice starts where the previous one ended)
Instead of having y0, y1 and so on, you will have slices[0], slices[1] and so on.
I'm trying to create an array based on values from another data frame in Python. I want it to fill the array as such.
If x > or = 3 in the dataframe then it inputs a 0 in the array.
If x < 3 in the dataframe then it inputs a 1 in the array.
If x = 0 in the dataframe then it inputs a 0 in the array.
Below is the code I have so far but the result is coming out as just [0]
array = np.array([])
for x in df["disc"]:
    for y in array:
        if x >= 3:
            y=0
        elif x < 3:
            y=1
        else:
            y=0
Any help would be much appreciated thank you.
When working with numpy arrays, it is more efficient if you can avoid using explicit loops in Python at all. (The actual looping takes place inside compiled C code.)
disc = df["disc"]
# make an array containing 0 where disc >= 3, elsewhere 1
array = np.where(disc >= 3, 0, 1)
# now set it equal to 0 in any places where disc == 0
array[disc == 0] = 0
It could also be done in a single statement (other than the initial assignment of disc) using:
array = np.where((disc >= 3) | (disc == 0), 0, 1)
Here the | does an element-by-element "or" test on the boolean arrays. (It has higher precedence than comparison operators, so the parentheses around the comparisons are needed.)
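The same condition can be written the other way round, marking the places that should get a 1. The sketch below checks that both forms agree, using a plain NumPy array as a hypothetical stand-in for df["disc"] and assuming the data has no missing values:

```python
import numpy as np

# hypothetical stand-in for df["disc"]
disc = np.array([0, 1, 2, 3, 5, 0, 2.5])

# 0 where disc >= 3 or disc == 0, elsewhere 1
flags_a = np.where((disc >= 3) | (disc == 0), 0, 1)

# equivalent form: 1 where disc < 3 and disc != 0, elsewhere 0
flags_b = np.where((disc < 3) & (disc != 0), 1, 0)

print(flags_a.tolist())
print(np.array_equal(flags_a, flags_b))
```

The two forms stop being equivalent if the column contains NaN, since every comparison with NaN is False.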
This is a simple problem. There are many ways to solve this, I think the easiest way is to use a list. You can use a list and append the values according to the conditions.
array = []
for x in df["disc"]:
    if x >= 3:
        array.append(0)
    elif x < 3:
        array.append(1)
    else:
        array.append(0)
Your code doesn't seem to be doing anything to the array, as you are trying to modify the variable y, rather than the array itself. y doesn't reference the array, it just holds the values found. The second loop also doesn't do anything due to the array being empty - it's looping through 0 elements. What you need rather than another for loop is to simply append to the array.
With a list, you would use the .append() method to add an element, however as you appear to be using numpy, you'd want to use the append(arr, values) function it provides, like so:
array = np.array([])
for x in df["disc"]:
    if x >= 3:
        array = np.append(array, 0)
    elif x < 3:
        array = np.append(array, 1)
    else:
        array = np.append(array, 0)
I'll also note that these conditions can be simplified by combining the two branches which append a 0. Namely, if x < 3 and x is not 0, then add a 1, otherwise add a 0. Thus, the code can be rewritten as follows.
array = np.array([])
for x in df["disc"]:
    if x < 3 and x != 0:
        array = np.append(array, 1)
    else:
        array = np.append(array, 0)
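One thing to be aware of is that np.append returns a new array each time, so calling it inside a loop copies the data on every iteration. A possible sketch that keeps the loop-style logic but converts to an array only once, with the list values as a hypothetical stand-in for df["disc"]:

```python
import numpy as np

# hypothetical stand-in for df["disc"]
values = [0, 1, 2, 3, 5]

# collect the flags in a plain list, then convert to an array once
flags = np.array([1 if (v < 3 and v != 0) else 0 for v in values])

print(flags.tolist())
```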
I have a matrix (a multi-dimensional array) in Python in which each element is an integer from 0 to 7. How can I randomly initialize it so that each element holds a value different from the values of its 4 neighbours (left, right, top, bottom)? Can this be implemented in numpy?
You can write your own matrix initializer.
Go through array[i][j] for each i, j and pick a random number between 0 and 7.
If the number equals either the left element, array[i][j-1], or the upper one, array[i-1][j], generate it again. Checking only these two is enough, because the right and lower neighbours are checked against this element when they are generated.
With 8 possible values, the chance of hitting such a bad case is at most 2/8 = 1/4, at most 1/16 for two in a row, at most 1/64 for three in a row, and so on; the probability drops off very quickly.
The average case complexity for n elements in a matrix would be O(n).
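The steps above can be sketched in pure Python as follows. This is only an illustration under the stated assumptions; the function name init_grid and its parameters (n_values, seed) are made up here:

```python
import random

def init_grid(rows, cols, n_values=8, seed=None):
    """Fill a rows x cols grid so that no cell equals its left or upper neighbour."""
    rng = random.Random(seed)
    grid = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            while True:
                v = rng.randrange(n_values)
                # redraw if the value clashes with the left neighbour
                if j > 0 and v == grid[i][j - 1]:
                    continue
                # redraw if the value clashes with the upper neighbour
                if i > 0 and v == grid[i - 1][j]:
                    continue
                grid[i][j] = v
                break
    return grid

grid = init_grid(4, 5, seed=0)
for row in grid:
    print(row)
```

If you need a NumPy array afterwards, np.array(grid) converts the result.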
A simpler problem that might get you started is to do the same for a 1d array. A pure-python solution would look like:
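import random

def sample_1d(n, upper):
    # the first value is unconstrained
    x = [random.randrange(upper)]
    for i in range(1, n):
        # draw from the upper - 1 values that differ from the previous one
        xi = random.randrange(upper - 1)
        if xi >= x[-1]:
            xi += 1
        x.append(xi)
    return x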
You can vectorize this as:
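import numpy as np

def sample_1d_v(n, upper):
    x = np.empty(n, dtype=int)
    x[0] = 0
    # consecutive differences are drawn from [1, upper) and accumulated modulo upper
    x[1:] = np.cumsum(np.random.randint(1, upper, size=n-1)) % upper
    # shift by a random starting value and wrap back into [0, upper)
    x = (x + np.random.randint(upper)) % upper
    return x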
The trick here is noting that if adjacent values must be different, then the difference between them (modulo upper) is uniformly distributed in [1, upper).
Consider a 100X100 array.
i) Generate an array of several thousand random locations within such an array, e.g. (3,75) and (56, 34).
ii) Calculate how often one of your random locations falls within 15 pixels of any of the (straight) edges.
I am trying to do the above question in order to help me learn the programming language Python; I am new to programming.
Here is what I have got so far:
from __future__ import division
from pylab import *
import math as m
from numpy import *
from random import randrange
N = 3000
coords_array = array([randrange(100) for _ in range(2 * N)]).reshape(N, 2)
This creates the array of N random locations, and now I am trying to create a loop that will append a 1 to an empty list if x>85 or y>85 or x<15 or y<15, and append a zero to the same list if x or y is anything else. Then I would find the sum of the list, which would be my count of how many of the random locations fall within the edges.
This is the kind of thing I am trying to do:
coordinate=coords_array[x,y]
b=[]
def location(x,y):
    if x>85 or y>85:
        b.append(1)
    if x<15 or y<15:
        b.append(1)
    else:
        b.append(0)
print b
print x
But i am having trouble assigning the array as x and y variables. I want to be able assign each row of the set of random coordinates as an x,y pair so that i can use it in my loop.
But i do not know how to do it!
Please can someone show me how to do it?
Thank you
Ok, the answer to this:
But i am having trouble assigning the array as x and y variables. I
want to be able assign each row of the set of random coordinates as an
x,y pair so that i can use it in my loop
Would be this:
for pair in coords_array:
    pass  # Do something with the pair
NumPy arrays behave like regular Python sequences: a for loop iterates over their first axis, so pair will be an array of (in your case) two elements, x and y. You can also unpack them directly:
for x, y in coords_array:
    pass  # Do something with x and y
NB: I think you wanted to write the function like this:
def location(x,y):
    if x>85 or y>85:
        b.append(1)
    elif x<15 or y<15:
        b.append(1)
    else:
        b.append(0)
or
def location(x,y):
    if x>85 or y>85 or x<15 or y<15:
        b.append(1)
    else:
        b.append(0)
or even
def location(x,y):
    if not (15 <= x <= 85) or not (15 <= y <= 85):
        b.append(1)
    else:
        b.append(0)
Otherwise, as @TokenMacGuy points out, you'd be inserting two values in certain cases (for example when x>85 and y<15).
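Putting the loop and the test together, one possible sketch is shown below. It returns a boolean from a helper instead of appending to a global list; the helper name near_edge and the four sample points are made up for illustration, and the thresholds are the ones from the question:

```python
import numpy as np

def near_edge(x, y):
    """True if the point is in the edge band, using the question's thresholds."""
    return x < 15 or y < 15 or x > 85 or y > 85

# small hand-made sample in place of the random coordinates
coords_array = np.array([[3, 75], [56, 34], [90, 50], [15, 85]])

# each row unpacks into an x, y pair
count = sum(1 for x, y in coords_array if near_edge(x, y))
print(count)
```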
NB: from your question I understand you want to write this code specifically to learn Python, but you could do this in a much more straightforward (and efficient) way by just using NumPy functionality
You can let numpy do the looping for you:
n = 3000
coords = np.random.randint(100, size=(n, 2))
x, y = coords.T
is_close_to_edge = (x < 15) | (x >= 85) | (y < 15) | (y >= 85)
count_close_to_edge = np.sum(is_close_to_edge)
Note that the first index of a 100 element array is 0 and the last 99, hence items within 15 positions of the edges are 0...14 and 85...99, hence the >= in the comparison. In the code above, is_close_to_edge is your list, with boolean values.
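To answer "how often" as a fraction, you can take the mean of the boolean array. As a sanity check under the same thresholds, the interior is 70 by 70 of the 100 by 100 positions, so the expected fraction is 1 - 0.7**2 = 0.51. This sketch uses np.random.default_rng with a fixed seed, which is a choice made here for reproducibility and not part of the code above:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, chosen for reproducibility
n = 3000
coords = rng.integers(0, 100, size=(n, 2))
x, y = coords.T

is_close_to_edge = (x < 15) | (x >= 85) | (y < 15) | (y >= 85)
fraction = is_close_to_edge.mean()   # share of points in the edge band
expected = 1 - (70 / 100) ** 2       # interior is 70 x 70 out of 100 x 100

print(fraction, expected)
```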