I have a numpy array,
myarray= np.array([49, 7, 44, 27, 13, 35, 171])
I want to replace every value that is greater than 45, so I applied the code below:
myarray=np.where(myarray> 45,myarray - 45, myarray)
but this is applied only once in that array, for example, the above array becomes
myarray= np.array([4, 7, 44, 27, 13, 35, 126])
Expected array
myarray= np.array([4, 7, 44, 27, 13, 35, 36])
How do I repeat the np.where until the condition is satisfied? In the above array I don't want any value to be greater than 45. Is there a Pythonic way of doing it? Thanks in advance.
np.where applies the subtraction only once per element, so 171 - 45 gives 126 and the operation is finished. Try the modulo operator if you want to do it this way:
myarray = np.array([49, 7, 44, 27, 13, 35, 171])
myarray = np.where(myarray> 45, myarray % 45, myarray)
print(myarray)
The output matches your expected array:
[ 4 7 44 27 13 35 36]
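One caveat with the plain modulo is worth sketching: an exact multiple of 45 becomes 0, whereas repeatedly subtracting 45 while the value is greater than 45 would stop at 45. The shifted form below is a suggested variant, not part of the original answer, and it assumes positive integers; the extra value 90 is added only to show the difference.

```python
import numpy as np

# 90 is appended to the question's data to expose the edge case.
myarray = np.array([49, 7, 44, 27, 13, 35, 171, 90])

# Plain modulo: 90 % 45 == 0, which repeated subtraction would never produce.
plain_mod = np.where(myarray > 45, myarray % 45, myarray)

# Shifted modulo maps values above 45 into 1..45, like repeated subtraction does.
shifted_mod = np.where(myarray > 45, (myarray - 1) % 45 + 1, myarray)

print(plain_mod)
print(shifted_mod)
```

For the array in the question the two forms agree, since none of its values is a multiple of 45.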
Not the most pythonic or even optimal way, but pretty easy to understand. You can try this:
import numpy as np
myarray= np.array([49, 7, 44, 27, 13, 35, 171])
for i in range(len(myarray)):
    while myarray[i] > 45:
        myarray[i] = myarray[i] - 45
print(myarray)
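If the goal is literally to rerun np.where until no value exceeds 45, a minimal sketch could loop over the whole array instead of over single elements. This is one possible reading of the question, not the approach of either answer above.

```python
import numpy as np

myarray = np.array([49, 7, 44, 27, 13, 35, 171])

# Repeat the vectorized subtraction while any element is still above 45.
while (myarray > 45).any():
    myarray = np.where(myarray > 45, myarray - 45, myarray)

print(myarray)
```

Each pass is vectorized, and the number of passes is set by the largest value (171 needs three).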
I am translating code from MATLAB to Python. I need to extract the lower subdiagonal values of a matrix. My attempt in python seems to extract the same values (sum is equal), but in different order. This is a problem as I need to apply corrcoef after.
The original Matlab code is using an array of indices to subset a matrix.
MATLAB code:
values = 1:100;
matrix = reshape(values,[10,10]);
subdiag = find(tril(ones(10),-1));
matrix_subdiag = matrix(subdiag);
subdiag_sum = sum(matrix_subdiag);
disp(matrix_subdiag(1:10))
disp(subdiag_sum)
Output:
2
3
4
5
6
7
8
9
10
13
1530
My attempt in Python
import numpy as np
matrix = np.arange(1,101).reshape(10,10)
matrix_t = matrix.T #to match MATLAB arrangement
matrix_subdiag = matrix_t[np.tril_indices((10), k = -1)]
subdiag_sum = np.sum(matrix_subdiag)
print(matrix_subdiag[0:10], subdiag_sum)
Output:
[2 3 13 4 14 24 5 15 25 35] 1530
How do I get the same order output? Where is my error?
Thank you!
For the sum, use numpy.triu directly on the non-transposed matrix:
S = np.triu(matrix, k=1).sum()
# 1530
For the values in MATLAB's order, index the non-transposed matrix with numpy.triu_indices_from, which walks the upper triangle row by row:
idx = matrix[np.triu_indices_from(matrix, k=1)]
output:
array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 18, 19, 20,
24, 25, 26, 27, 28, 29, 30, 35, 36, 37, 38, 39, 40, 46, 47, 48, 49,
50, 57, 58, 59, 60, 68, 69, 70, 79, 80, 90])
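The reason the order differs can be sketched as follows: MATLAB's reshape and find work in column-major order, while NumPy fills and scans in row-major order. Reading the lower triangle of the MATLAB matrix column by column is therefore the same as reading the upper triangle of the NumPy matrix row by row. The names `matlab_matrix` and `mask` are illustrative only.

```python
import numpy as np

matrix = np.arange(1, 101).reshape(10, 10)  # NumPy fills row by row
matlab_matrix = matrix.T                    # same layout as MATLAB's reshape(1:100, [10, 10])

# Boolean mask indexing scans in row-major order, so the strict upper
# triangle of `matrix` comes out in the order MATLAB's find() produces
# for the strict lower triangle of `matlab_matrix`.
mask = np.triu(np.ones((10, 10), dtype=bool), k=1)
values = matrix[mask]

print(values[:10], values.sum())
```

Applying the mask is equivalent to indexing with np.triu_indices_from(matrix, k=1).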
I am trying to create a list of 6-number lists from the numbers 1 to 49 by looping over all possible subsets of 1 to 49.
The issue is that the code stops at number 15 and nothing is printed in PyCharm (the Excel file is written but stops at record 38759).
import itertools
import pandas as pd
stuff = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
all=[]
for L in range(0, len(stuff)+1):
    for subset in itertools.combinations(stuff, L):
        alist=list(subset)
        if len(subset)==6:
            all.append(alist)
all_tuple=tuple(all)
df = pd.DataFrame(all_tuple,columns=['z1','z2','z3','z4','z5','z6'])
print(df)
df.to_excel('test.xlsx')
If I understand correctly, you are trying to find the possible combinations of 6 numbers sampled from the list [1, 2, 3, ..., 49] without replacement.
But your code calculates the combinations of all lengths and then only saves those of length 6.
To get a clue as to why your code does not terminate quickly, consider the number of combinations of 6 numbers:
>>> print(len(list(itertools.combinations(range(1, 50), 6))))
13983816
So, if there are 14 million possible combinations of 6 numbers, imagine how many combinations there are of 7, 8, 9, ...
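To put numbers on this without enumerating anything, the counts can be computed directly. This sketch assumes Python 3.8 or later for math.comb.

```python
import math

# Number of 6-element subsets of 49 items, computed without building them.
six_number_draws = math.comb(49, 6)

# The original loop visits every subset of every length, i.e. 2**49 of them.
all_subsets = 2 ** 49

print(six_number_draws)
print(all_subsets)
```

The second figure is on the order of 10**14, which is why the original loop never gets close to finishing.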
Here is some code to calculate only the 14 million combinations of length 6:
combs = list(itertools.combinations(range(1, 50), 6))
Or, if you really want to build the dataframe:
# Warning, this takes about 25 seconds
combs = itertools.combinations(range(1, 50), 6)
df = pd.DataFrame(combs, columns=['z1','z2','z3','z4','z5','z6'])
Bear in mind that this will take up quite a bit of memory. An Excel worksheet holds at most 1,048,576 rows, so the 14 million rows will not fit in a single sheet.
Also, don't reuse built-in names as variable names. all is a built-in Python function, and assigning to it hides that function.
I have a multiband raster (84 bands). I am reading the raster using GDAL and converting it to a numpy array. In numpy the array shape is (84, 3, 5): 84 bands, 3 rows and 5 columns. I want to compute the ratio between the first band and each of the other bands, so that I get 83 arrays, each representing a pixel-by-pixel ratio. For example, I have:
Band 1
[[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15]]
Band 2
[[21, 22, 23, 24, 25],
[26, 27, 28, 29, 30],
[31, 32, 33, 34, 35]]
Band 3
[[31, 32, 33, 34, 35],
[36, 37, 38, 39, 40],
[41, 42, 43, 44, 45]]
...
...
Band84
I need to loop through all the bands in such a way that I get these: Band2/Band1; Band3/Band1; ... ; Band84/Band1
Band2/Band1
[[1/21, 2/22, 3/23, 4/24, 5/25],
[6/26, 7/27, 8/28, 9/29, 10/30],
[11/31, 12/32, 13/33, 14/34, 15/35]]
And so on...
Is there any way to vectorize this calculation?
I really appreciate your advice.
If I understand correctly, you need
Band2/Band1; Band3/Band1 ... Band84/Band1
Band3/Band2; Band4/Band2 ... Band84/Band2
...
Band84/Band83
It should be something like this
for a in range(0, len(all_bands)-1):
    for b in range(a+1, len(all_bands)):
        print(all_bands[b] / all_bands[a])
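If only the ratios against the first band are needed, as the question's list suggests, NumPy broadcasting can produce them without a Python loop. The sketch below uses a hypothetical 3-band stand-in built from the question's sample values, and it shows both directions because the question labels the result Band2/Band1 while its sample values read 1/21.

```python
import numpy as np

# Hypothetical stand-in for the GDAL array: shape (bands, rows, cols).
band1 = np.arange(1, 16, dtype=float).reshape(3, 5)
all_bands = np.stack([band1, band1 + 20, band1 + 30])

# all_bands[0] has shape (3, 5) and broadcasts against (n-1, 3, 5).
ratios = all_bands[1:] / all_bands[0]          # Band2/Band1, Band3/Band1, ...
inverse_ratios = all_bands[0] / all_bands[1:]  # Band1/Band2, Band1/Band3, ...

print(ratios.shape)
```

With the real 84-band array the result would have shape (83, 3, 5). Pixels where the divisor is zero, for example nodata values, would give inf or nan and may need masking first.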
I was wondering what the use of the comma was when slicing Python arrays - I have an example that appears to work, but the line that looks weird to me is
p = 20*numpy.log10(numpy.abs(numpy.fft.rfft(data[:2048, 0])))
Now, I know that when slicing an array, the first number is start, the next is end, and the last is step, but what does the comma after the end number designate? Thanks.
It is being used to extract a specific column from a 2D array.
So your example would extract column 0 (the first column) from the first 2048 rows (0 to 2047). Note however that this syntax will only work for numpy arrays and not general python lists.
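A small sketch of both points, using a made-up two-column array in place of the question's data: the comma form selects rows and a column on a NumPy array, and the same expression on a nested Python list raises TypeError.

```python
import numpy as np

# Hypothetical 2-column data, e.g. 4096 stereo samples.
data = np.arange(8192).reshape(4096, 2)
first_column = data[:2048, 0]  # rows 0..2047 of column 0

# A plain nested list does not accept a tuple index.
nested = data.tolist()
try:
    nested[:2048, 0]
    list_supports_tuple_index = True
except TypeError:
    list_supports_tuple_index = False

print(first_column.shape, list_supports_tuple_index)
```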
Empirically - create an array using numpy
m = np.fromfunction(lambda i, j: (i +1)* 10 + j + 1, (9, 4), dtype=int)
which assigns an array like below to m
array(
[[11, 12, 13, 14],
[21, 22, 23, 24],
[31, 32, 33, 34],
[41, 42, 43, 44],
[51, 52, 53, 54],
[61, 62, 63, 64],
[71, 72, 73, 74],
[81, 82, 83, 84],
[91, 92, 93, 94]])
Now for the slice
m[:,0]
giving us
array([11, 21, 31, 41, 51, 61, 71, 81, 91])
I may have misinterpreted Khan Academy (so take with grain of salt):
In linear algebra terms, m[:,n] is taking the nth column vector of
the matrix m
See Abhranil's note on how this specific interpretation applies only to NumPy.
It slices with a tuple. What exactly the tuple means depends on the object being sliced. In NumPy arrays, it performs an m-dimensional slice on an n-dimensional array.
>>> class C(object):
...     def __getitem__(self, val):
...         print(val)
...
>>> c = C()
>>> c[1:2,3:4]
(slice(1, 2, None), slice(3, 4, None))
>>> c[5:6,7]
(slice(5, 6, None), 7)
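The same equivalence can be checked on a NumPy array: writing the comma form is the same as passing an explicit tuple of slice objects and integers. The array `a` below is purely illustrative.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

# The comma form and the explicit tuple reach __getitem__ as the same object.
with_comma = a[1:3, 2]
with_tuple = a[(slice(1, 3), 2)]

print(with_comma, with_tuple)
```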
range(5, 15) [1, 1, 5, 6, 10, 10, 10, 11, 17, 28]
range(6, 24) [4, 10, 10, 10, 15, 16, 18, 20, 24, 30]
range(7, 41) [9, 18, 19, 23, 23, 26, 28, 40, 42, 44]
range(11, 49) [9, 23, 24, 27, 29, 31, 43, 44, 45, 45]
range(38, 50) [1, 40, 41, 42, 44, 48, 49, 49, 49, 50]
I get the above output from a print call inside a function. What I really want is a combined list of the range and the other numbers, for example in the top line 5,6,7...15,1,1,5,6 etc.
The output range comes from
range_draws=range(int(lower),int(upper))
which I naively thought would give a range. The other numbers come from a sliced list.
Could someone help me get the desired result?
The range() function returns a special range object to save on memory (no need to keep all the numbers in memory when only the start, end and step size will do). Cast it to a list to 'expand' it:
list(yourrange) + otherlist
To quote the documentation:
The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).
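Applied to the first line of the question's output, this could look like the sketch below. The values of `lower` and `upper` are assumed for illustration. Note that range excludes its stop value, so including 15 as in the desired output needs `upper + 1`.

```python
lower, upper = "5", "15"  # hypothetical inputs, as strings like in the question
other_numbers = [1, 1, 5, 6, 10, 10, 10, 11, 17, 28]

range_draws = range(int(lower), int(upper))

# list() expands the range object so it can be concatenated with a list.
combined = list(range_draws) + other_numbers

# range() stops before `upper`; add 1 if the upper bound should be included.
inclusive = list(range(int(lower), int(upper) + 1)) + other_numbers

print(combined)
print(inclusive)
```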