I'm trying to do a sorting algorithm and I need to do something like this:
arr = numpy.array([2,4,5,6])
#some function here or something
array([5,2,4,6])
#element "5" moved from position 3 to 1, and all of the others moved up one position
What I mean is I want to change an element's position (index) and move all the other elements up one position. Is this possible?
You could use numpy.roll() with a subset assignment:
arr = numpy.array([2,4,5,6])
arr[:3] = numpy.roll(arr[:3],1)
print(arr)
[5 2 4 6]
If you know the position/index of the element to be shifted, then you could do:
import numpy as np
indx = 2
np.r_[arr[indx], np.delete(arr, indx)]
Out[]: array([5, 2, 4, 6])
You can also do this in place, which avoids first creating an intermediate array of n-1 elements and then building the final array by concatenation:
idx = 2
tmp = arr[idx]
arr[1:idx+1] = arr[0:idx]
arr[0] = tmp
So there is more than one way of doing this, and the choice depends on your algorithm's constraints.
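The answers above all move one element to the front. A minimal sketch of the general case (moving from any index to any other index) could look like the following. The helper name move_element is hypothetical, and it returns a new array rather than modifying the input in place.

```python
import numpy as np

def move_element(arr, src, dst):
    """Return a copy of arr with the element at index src moved to index dst."""
    value = arr[src]
    without = np.delete(arr, src)          # copy of arr with the element removed
    return np.insert(without, dst, value)  # re-insert it at the target position

arr = np.array([2, 4, 5, 6])
moved = move_element(arr, 2, 0)
print(moved)  # [5 2 4 6]
```

Because np.delete and np.insert both allocate new arrays, this is convenient rather than fast; the in-place slice assignment shown earlier is the better fit when memory matters.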
I want to compare two arrays by index position, checking whether arr2[i] is in arr1[i:]. If the arrays have equal length this is easy, but either array can be the shorter one. Could you please help me find the shorter array dynamically so that I can loop over it?
arr1 = [1,2,3,4,5,7,8,9]
arr2 = [4,5,6]
for i in range(min(arr1,arr2)):
if minimum_arr[i] in max_arr[i:].
---> how to dynamically solve this issue, please help.
As I understand your problem, you should set minimum_arr and max_arr before the for loop. Note that list indexing starts at 0. I also simplified the if statement so that it checks the whole of max_arr rather than the slice max_arr[i:]; keep the slice if the position of the match matters to you.
arr1 = [1, 2, 3, 4, 5, 7, 8, 9]
arr2 = [4, 5, 6]
minimum_arr = arr1 if len(arr1) < len(arr2) else arr2
# Use the same condition for both, so equal lengths still yield two different lists
max_arr = arr2 if len(arr1) < len(arr2) else arr1
for i in range(len(minimum_arr)):
    if minimum_arr[i] in max_arr:
        pass  # ...
I would determine which list is the shorter before entering the loop. You can do this by sorting a list of the two lists by their length, using the built-in sorted and setting the key argument to the len function.
Then your code would look like:
arr1 = [1,2,3,4,5,7,8,9]
arr2 = [4,5,6]
# Sort the two arrays by length
short_arr, long_arr = sorted([arr1, arr2], key=len)
for i in range(len(short_arr)):
    if short_arr[i] in long_arr[i:]:
        pass
You can use enumerate instead of range to get the current object and the index you're at currently as you iterate the smaller list. As for getting the smaller list you need to use the key argument and I also suggest using sorted instead of using min and max.
list1 = [1,2,3,4,5,7,8,9]
list2 = [4,5,6]
min_list, max_list = sorted([list1, list2], key=len)
for i, n in enumerate(min_list):
    if n in max_list[i:]:
        ...
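To make the result of the loop concrete, here is a small sketch that collects the matches instead of leaving the loop body empty. The function name positional_matches is hypothetical; it simply combines the sorted-by-length and enumerate ideas from the answers above.

```python
def positional_matches(a, b):
    """Return (index, value) pairs from the shorter list found in longer[index:]."""
    # sorted() is stable, so with equal lengths `a` is treated as the shorter list
    shorter, longer = sorted([a, b], key=len)
    return [(i, v) for i, v in enumerate(shorter) if v in longer[i:]]

arr1 = [1, 2, 3, 4, 5, 7, 8, 9]
arr2 = [4, 5, 6]
print(positional_matches(arr1, arr2))  # [(0, 4), (1, 5)]
```

Here 6 is not reported because it does not occur in arr1[2:].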
I have a 3D numpy array of the shape (1, 60, 1). Now I need to remove the first value of the second dimension and instead append a new value at the end.
If it was a list, the code would look somewhat like this:
x = [1, 2, 3, 4]
x = x[1:]
x.append(5)
resulting in this list: [2, 3, 4, 5]
What would be the easiest way to do this with numpy?
I have basically never really worked with numpy before, so that's probably a pretty trivial problem, but thanks for your help!
import numpy as np
arr = np.arange(60)  # create an ndarray with 60 values
arr = arr.reshape(1, 60, 1)  # shape it as mentioned in the question
arr = np.roll(arr, -1, axis=1)  # roll along the second dimension (-1 is 1 step to the left)
# Now your last value is in the second-last position, the second-last value in the third-last position and so on (your first value moves to the last position)
arr[:, -1, :] = 1000  # index the last location and assign the value you want
print(arr)
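If you need to do this repeatedly, for example to maintain a sliding window, the roll-and-overwrite steps can be wrapped in a small function. This is only a sketch; the name push is hypothetical, and it assumes the window axis is axis 1 as in the question.

```python
import numpy as np

def push(window, new_value):
    """Drop the first entry along axis 1 and put new_value at the end."""
    shifted = np.roll(window, -1, axis=1)  # np.roll returns a new array
    shifted[:, -1, :] = new_value          # overwrite the wrapped-around value
    return shifted

window = np.arange(60).reshape(1, 60, 1)
updated = push(window, 1000)
print(updated[0, :3, 0], updated[0, -2:, 0])  # [1 2 3] [  59 1000]
```

The original window is left untouched because np.roll copies the data.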
I'd like to randomly remove a fraction of elements from a list without changing the order of the list.
Say I had some data and I wanted to remove 1/4 of them:
data = [1,2,3,4,5,6,7,8,9,10]
n = len(data) / 4
I'm thinking I need a loop to run through the data and delete a random element 'n' times? So something like:
for i in xrange(n):
    random = np.randint(1,len(data))
    del data[random]
My question is, is this the most 'pythonic' way of doing this? My list will be ~5000 elements long and I want to do this multiple times with different values of 'n'.
Thanks!
Sequential deleting is a bad idea since deletion in a list is O(n). Instead do something like this:
import random
def delete_rand_items(items,n):
    to_delete = set(random.sample(range(len(items)),n))
    return [x for i,x in enumerate(items) if i not in to_delete]
You can use random.sample like this:
import random
a = [1,2,3,4,5,6,7,8,9,10]
no_elements_to_delete = len(a) // 4
no_elements_to_keep = len(a) - no_elements_to_delete
b = set(random.sample(a, no_elements_to_keep)) # the `if i in b` on the next line would benefit from b being a set for large lists
b = [i for i in a if i in b] # you need this to restore the order
print(len(a)) # 10
print(b) # [1, 2, 3, 4, 5, 8, 9, 10]
print(len(b)) # 8
Three notes on the above.
You are not modifying the original list in place, but you could.
You are not actually deleting elements but rather keeping elements, which amounts to the same thing (you just have to adjust the ratios).
The drawback is the list comprehension that restores the order of the elements.
As @koalo says in the comments, the above will not work properly if the elements in the original list are not unique. I could easily fix that, but then my answer would be identical to the one posted by @JohnColeman. So if that might be the case, just use his instead.
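To illustrate the uniqueness caveat, here is a sketch that samples indices rather than values, so duplicates in the list are handled correctly and the order is preserved. The function name remove_fraction and its rng parameter are hypothetical additions for this example; passing a seeded random.Random only makes the run repeatable.

```python
import random

def remove_fraction(items, fraction, rng=random):
    """Return a new list with round-down(len * fraction) random items removed."""
    n = int(len(items) * fraction)
    # Sample positions, not values, so repeated values are not a problem
    drop = set(rng.sample(range(len(items)), n))
    return [x for i, x in enumerate(items) if i not in drop]

data = [1, 1, 2, 2, 3, 3, 4, 4]
kept = remove_fraction(data, 0.25, random.Random(0))
print(len(data), len(kept))  # 8 6
```

Since a new list is built in a single pass, the cost is linear in the list length no matter how many items are removed.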
Is the order meaningful?
If not, you can do something like:
from random import shuffle
shuffle(data)
data=data[:len(data)-n]
I suggest using numpy indexing as in
import numpy as np
data = np.array([1,2,3,4,5,6,7,8,9,10])
n = len(data)//4  # integer division, so that n can be used as a size
indices = sorted(np.random.choice(len(data),len(data)-n,replace=False))
result = data[indices]
I think it will be more convenient this way:
import random
n = round(len(data) *0.3)
for i in range(n):
    data.pop(random.randrange(len(data)))
I'm trying to iterate through a two dimensional array in Python and compare items in the array to ints, however I am faced with a ton of various errors whenever I attempt to do such. I'm using numpy and pandas.
My dataset is created as follows:
filename = "C:/Users/User/My Documents/JoeTest.csv"
datas = pandas.read_csv(filename)
dataset = datas.values
Then, I attempt to go through the data, grabbing certain elements of it.
def model_building(data):
    global blackKings
    flag = 0;
    blackKings.append(data[0][1])
    for i in data:
        if data[i][39] == 1:
            if data[i][40] == 1:
                values.append(1)
            else:
                values.append(-1)
        else:
            if data[i][40] == 1:
                values.append(-1)
            else:
                values.append(1)
        for j in blackKings:
            if blackKings[j] != data[i][1]:
                flag = 1
        if flag == 1:
            blackKings.append(data[i][1])
            flag = 0;
However, doing so leaves me with a ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). I don't want to use either of these, as I'm looking to compare the actual value of that one specific element. Is there another way around this problem?
You need to tell us something about this: dataset = datas.values
It's probably a 2d array, since it derives from a load of a csv. But what shape and dtype? Maybe even a sample of the array.
Is that the data argument in the function?
What are blackKings and values? You treat them like lists (with append).
for i in data:
    if data[i][39] == 1:
This doesn't make sense. With for i in data, if data is 2d, i is the first row, then the second row, etc. If you want i to be an index, use something like
for i in range(data.shape[0]):
2d array indexing is normally done with data[i,39].
But in your case data[i][39] is probably an array.
Anytime you use an array in a if statement, you'll get this ValueError, because there are multiple values.
If i were proper indexes, then data[i,39] would be a single value.
To illustrate:
In [41]: data=np.random.randint(0,4,(4,4))
In [42]: data
Out[42]:
array([[0, 3, 3, 2],
[2, 1, 0, 2],
[3, 2, 3, 1],
[1, 3, 3, 3]])
In [43]: for i in data:
    ...:     print('i',i)
    ...:     print('data[i]',data[i].shape)
    ...:
i [0 3 3 2]     # 1st row
data[i] (4, 4)
i [2 1 0 2]     # 2nd row, a 4-element 1d array
data[i] (4, 4)
...
Here i is a 4-element array; using it as an index, data[i] selects four whole rows and produces a (4, 4) array. It isn't selecting one value, but rather many values.
Instead you need to iterate in one of these ways:
In [46]: for row in data:
    ...:     if row[3]==1:
    ...:         print(row)
[3 2 3 1]
In [47]: for i in range(data.shape[0]):
    ...:     if data[i,3]==1:
    ...:         print(data[i])
[3 2 3 1]
To debug a problem like this you need to look at intermediate values, and especially their shapes. Don't just assume. Check!
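The following sketch reproduces the error on the small array used above and then shows the row-wise fix. Column 3 stands in for the question's column 39; the variable names message and rows_ending_in_one are only for illustration.

```python
import numpy as np

data = np.array([[0, 3, 3, 2],
                 [2, 1, 0, 2],
                 [3, 2, 3, 1],
                 [1, 3, 3, 3]])

message = None
for i in data:                 # i is a whole row, not an integer index
    try:
        if data[i][3] == 1:    # data[i][3] is a 4-element array here
            pass
    except ValueError as exc:
        message = str(exc)     # "The truth value of an array ... is ambiguous ..."
        break

# Iterating over rows and indexing each row tests one scalar at a time.
rows_ending_in_one = [row.tolist() for row in data if row[3] == 1]
print(rows_ending_in_one)      # [[3, 2, 3, 1]]
```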
I'm going to attempt to rewrite your function
def model_building(data):
    global blackKings
    blackKings.append(data[0, 1])
    # Your nested if statements were performing an xor
    # This is a vectorized version of the same thing
    values = np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1
    # not sure where `values` is defined. If you really wanted to
    # append to it, you can do
    # values = np.append(values, np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1)
    # Your blackKings / flag logic can be reduced
    kings = np.asarray(blackKings)  # blackKings is a list, so convert before broadcasting
    # mask[j] is True when data[j, 1] differs from every known king
    mask = (kings[:, None] != data[:, 1]).all(0)
    blackKings = list(np.append(kings, data[:, 1][mask]))  # keep it a list so append still works
This may not be perfect because it is difficult to parse your logic considering you are missing some pieces. But hopefully you can adopt some of what I've included here and improve your code.
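As a quick check of the xor claim, the sketch below compares the vectorized expression with the nested-if logic on all four combinations of two flag columns. The two-column array flags is made up for the example and stands in for columns 39 and 40 of the real data.

```python
import numpy as np

# One row per combination of the two flag columns
flags = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])

a = flags[:, 0] == 1
b = flags[:, 1] == 1
vectorized = np.logical_xor(a, b) * -2 + 1   # True -> -1, False -> 1

# Same result written out row by row, mirroring the nested if/else
looped = [1 if (x == 1) == (y == 1) else -1 for x, y in flags]

print(vectorized.tolist())  # [1, -1, -1, 1]
```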
What is the easiest and cleanest way to get the first AND the last elements of a sequence? E.g., I have a sequence [1, 2, 3, 4, 5], and I'd like to get [1, 5] via some kind of slicing magic. What I have come up with so far is:
l = len(s)
result = s[0:l:l-1]
I actually need this for a bit more complex task. I have a 3D numpy array, which is cubic (i.e. is of size NxNxN, where N may vary). I'd like an easy and fast way to get a 2x2x2 array containing the values from the vertices of the source array. The example above is an oversimplified, 1D version of my task.
Use this:
result = [s[0], s[-1]]
Since you're using a numpy array, you may want to use fancy indexing:
a = np.arange(27)
indices = [0, -1]
b = a[indices] # array([0, 26])
For the 3d case:
vertices = [(0,0,0),(0,0,-1),(0,-1,0),(0,-1,-1),(-1,-1,-1),(-1,-1,0),(-1,0,0),(-1,0,-1)]
indices = tuple(zip(*vertices)) # Can store this for later use. Must be a tuple so that numpy indexes one axis per entry.
a = np.arange(27).reshape((3,3,3)) # dummy array for testing. Can be any shape size :)
vertex_values = a[indices].reshape((2,2,2))
I first write down all the vertices (although I am willing to bet there is a clever way to do it using itertools which would let you scale this up to N dimensions ...). The order you specify the vertices is the order they will be in the output array. Then I "transpose" the list of vertices (using zip) so that all the x indices are together and all the y indices are together, etc. (that's how numpy likes it). At this point, you can save that index array and use it to index your array whenever you want the corners of your box. You can easily reshape the result into a 2x2x2 array (although the order I have it is probably not the order you want).
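One possible way to realize the itertools idea mentioned above is to generate the corner coordinates with itertools.product, which also scales to any number of dimensions. This is a sketch under that assumption; the function name corner_values is hypothetical.

```python
import itertools
import numpy as np

def corner_values(a):
    """Return the corner elements of a as an array of shape (2,) * a.ndim."""
    # Every combination of first (0) and last (-1) index along each axis
    corners = itertools.product((0, -1), repeat=a.ndim)
    # Transpose into one tuple of indices per axis; a tuple is required
    # so that numpy treats each entry as the index array for one axis
    index = tuple(zip(*corners))
    return a[index].reshape((2,) * a.ndim)

a = np.arange(27).reshape((3, 3, 3))
print(corner_values(a).tolist())
# [[[0, 2], [6, 8]], [[18, 20], [24, 26]]]
```

Because product enumerates the corners in lexicographic order, the reshaped result keeps each corner at the matching position of the 2x2x2 output.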
This would give you a list of the first and last element in your sequence:
result = [s[0], s[-1]]
Alternatively, this would give you a tuple
result = s[0], s[-1]
With the particular case of a (N,N,N) ndarray X that you mention, would the following work for you?
s = slice(0,N,N-1)
X[s,s,s]
Example
>>> N = 3
>>> X = np.arange(N*N*N).reshape(N,N,N)
>>> s = slice(0,N,N-1)
>>> print(X[s,s,s])
[[[ 0  2]
  [ 6  8]]

 [[18 20]
  [24 26]]]
>>> from operator import itemgetter
>>> first_and_last = itemgetter(0, -1)
>>> first_and_last([1, 2, 3, 4, 5])
(1, 5)
Why do you want to use a slice? Getting each element with
result = [s[0], s[-1]]
is better and more readable.
If you really need to use the slice, then your solution is the simplest working one that I can think of.
This also works for the 3D case you've mentioned.