This question already has answers here:
Repeating each element of a numpy array 5 times
(2 answers)
Closed last year.
I'm using numpy.
I want to repeat numbers from 0 to 10000 five times.
Let me show you a simple example below.
0
0
0
0
0
1
1
1
1
1
2
2
2
2
2
3
3
3
3
3
9999
9999
9999
9999
9999
10000
10000
10000
10000
10000
How can I do this?
Use numpy.repeat with numpy.arange:
import numpy as np
a = np.repeat(np.arange(10001), 5)
print(a)
[ 0 0 0 ... 10000 10000 10000]
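To make the ordering easier to check by eye, here is a minimal sketch on a much smaller range. The names small, repeated and tiled are only illustrative, and np.tile is shown purely as a contrast, since it repeats the whole sequence rather than each element.

```python
import numpy as np

small = np.arange(3)            # [0 1 2]

# np.repeat repeats each element in place
repeated = np.repeat(small, 2)
print(repeated)                 # [0 0 1 1 2 2]

# np.tile repeats the whole array, which is NOT what the question asks for
tiled = np.tile(small, 2)
print(tiled)                    # [0 1 2 0 1 2]
```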
I am developing a model that takes a Pandas DataFrame as input, where each row represents a period for a given id. I need to calculate some columns (the output of the model). The problem is that the columns depend on each other: D = F(A, fixed inputs), A is B at t-1 (the value of B in the previous period), and B is A - D. Because each column depends on the others and on previous values, the only way I have found to resolve this is to iterate over the rows with itertuples(), but that is too slow. Is there a more efficient way to do this, perhaps without iteration?
This is the simplified initial table (there are more columns and operations):
Id Period MoneyInitial MoneyBoP Money_EoP Money_Paid
0 0001 1 1000 0 0 0
1 0001 2 1000 0 0 0
2 0001 3 1000 0 0 0
3 0001 4 1000 0 0 0
4 0001 5 1000 0 0 0
5 0001 6 1000 0 0 0
6 0001 7 1000 0 0 0
7 0001 8 1000 0 0 0
The desired output would be:
For period 1 of each contract, MoneyBoP equals MoneyInitial; for every later period it equals Money_EoP of the previous period.
Money_Paid is a function that takes MoneyBoP and other inputs (these are already calculated in the initial table).
Money_EoP is MoneyBoP + Money_Paid.
So the desired output table would be:
Id Period MoneyInitial MoneyBoP Money_EoP Money_Paid
0 0001 1 1000 1000 900 -100
1 0001 2 1000 900 850 -50
2 0001 3 1000 850 700 -150
3 0001 4 1000 700 600 -100
4 0001 5 1000 600 450 -150
5 0001 6 1000 450 300 -150
6 0001 7 1000 150 50 -100
7 0001 8 1000 50 0 -50
It looks like all the values can be calculated knowing the number of periods and MoneyInitial:
import pandas as pd
# Some function to calculate MoneyPaid from BoP
def MoneyPaid(BoP):
    return round(-BoP * .1, 2)
# Calculate BoP, EoP and MoneyPaid for each of the n periods
def Calculate_Data(start, n):
    d = []  # one (BoP, EoP, MoneyPaid) tuple per period
    for i in range(0, n):
        bop = start
        mp = MoneyPaid(bop)
        start = start + mp  # EoP of this period becomes BoP of the next
        d.append((bop, start, mp))
    return pd.DataFrame(d)
df[['MoneyBoP','Money_EoP','Money_Paid']] = Calculate_Data(df.iloc[0]['MoneyInitial'], len(df))
The result of this is
Id Period MoneyInitial MoneyBoP Money_EoP Money_Paid
0 1 1 1000 1000.00 900.00 -100.00
1 1 2 1000 900.00 810.00 -90.00
2 1 3 1000 810.00 729.00 -81.00
3 1 4 1000 729.00 656.10 -72.90
4 1 5 1000 656.10 590.49 -65.61
5 1 6 1000 590.49 531.44 -59.05
6 1 7 1000 531.44 478.30 -53.14
7 1 8 1000 478.30 430.47 -47.83
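If Money_Paid really is a fixed fraction of MoneyBoP, as the placeholder MoneyPaid function above assumes, the recurrence has a closed form and the loop can be dropped. The sketch below rests on that assumption: rate is a hypothetical parameter, the input frame is made up to mirror the question's table, and the per-step rounding is left out, so results can differ from the loop in the last decimal. For an arbitrary Money_Paid function this shortcut does not apply.

```python
import pandas as pd

rate = 0.1  # assumed: Money_Paid = -rate * MoneyBoP, as in the placeholder function

# Hypothetical input mirroring the question's table
df = pd.DataFrame({
    "Id": ["0001"] * 8,
    "Period": range(1, 9),
    "MoneyInitial": 1000.0,
})

# 0-based position of each row within its Id
k = df.groupby("Id").cumcount()

# Closed form of BoP_t = BoP_{t-1} * (1 - rate), starting from MoneyInitial
df["MoneyBoP"] = df["MoneyInitial"] * (1 - rate) ** k
df["Money_Paid"] = -rate * df["MoneyBoP"]
df["Money_EoP"] = df["MoneyBoP"] + df["Money_Paid"]

print(df.round(2))
```

Using cumcount per Id keeps the calculation correct when the frame holds several contracts, provided the rows of each contract are sorted by Period.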
I'm working on creating a Sankey plot and have the raw data mapped so that I know the source and target node of each row. I'm having an issue with grouping by source and target and then counting the number of times each pair occurs. E.g. using the table below, I want to find out how many times 0 -> 4 occurs and record that in the dataframe.
index event_action_num next_action_num
227926 0 6
227928 1 5
227934 1 6
227945 1 7
227947 1 6
227951 0 7
227956 0 6
227958 2 6
227963 0 6
227965 1 6
227968 1 5
227972 3 6
Where I want to end up is:
event_action_num next_action_num count_of
0 4 1728
0 5 2382
0 6 3739
etc
I have tried:
df_new_2 = df_new.groupby(['event_action_num', 'next_action_num']).count()
but it doesn't give me the result I'm looking for.
Thanks in advance
Try using agg('size') instead of count(), applied to the original dataframe:
df_new.groupby(['event_action_num', 'next_action_num']).agg('size')
For your sample data output will be:
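As a minimal sketch, the twelve sample rows from the question can be grouped like this. The frame is rebuilt by hand here (the original index values are dropped), and reset_index(name='count_of') is one way to get the count into a named column like the one the question asks for.

```python
import pandas as pd

# The twelve sample rows from the question, retyped by hand
df_new = pd.DataFrame({
    "event_action_num": [0, 1, 1, 1, 1, 0, 0, 2, 0, 1, 1, 3],
    "next_action_num":  [6, 5, 6, 7, 6, 7, 6, 6, 6, 6, 5, 6],
})

# size() counts rows per (source, target) pair;
# reset_index turns the result back into a flat dataframe
counts = (
    df_new.groupby(["event_action_num", "next_action_num"])
          .size()
          .reset_index(name="count_of")
)
print(counts)
```

In this sample, the pair 0 -> 6 occurs 3 times and 1 -> 5 occurs twice.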
I'm very new to Python. Although I understand the basics of looping, I can't work out how the output below is arrived at.
In particular, how do the three for loops combine to give this output? When I try to write the output on paper without referring to an IDE, I can't see what logic to apply.
Code:
n = 4
a = 3
z = 2
for i in range(n):
    for j in range(a):
        for p in range(z):
            print(i, j, p)
Output is:
0 0 0
0 0 1
0 1 0
0 1 1
0 2 0
0 2 1
1 0 0
1 0 1
1 1 0
1 1 1
1 2 0
1 2 1
2 0 0
2 0 1
2 1 0
2 1 1
2 2 0
2 2 1
3 0 0
3 0 1
3 1 0
3 1 1
3 2 0
3 2 1
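The first loop iterates four times.
The second loop iterates three times each time it runs. However, since it is nested inside the first loop, its body runs twelve times in total (4 * 3).
The third loop iterates two times each time it runs. However, since it is nested inside the first and second loops, its body runs twenty-four times in total (4 * 3 * 2). The innermost variable p changes fastest and the outermost variable i changes slowest, which is why the last column of the output cycles 0, 1 while the first column stays the same for six lines.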
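One way to check the counting above is to collect the triples instead of printing them and compare them with itertools.product, which yields the same combinations in the same order. This is only a sketch, and the names nested and flat are illustrative.

```python
from itertools import product

n, a, z = 4, 3, 2

# Same three nested loops as in the question, collecting instead of printing
nested = []
for i in range(n):
    for j in range(a):
        for p in range(z):
            nested.append((i, j, p))

# product varies the rightmost range fastest, like the innermost loop
flat = list(product(range(n), range(a), range(z)))

print(len(nested))      # 24
print(nested == flat)   # True
print(nested[:3])       # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
```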
I have a square numpy 2D matrix.
2 2 2 2
2 2 2 2
2 2 2 2
2 2 2 2
And I need to set a certain count of random matrix values to 0. Let's say it is 5 elements. That means any 5 from 16 matrix values must be set to 0. For example new matrix could be
2 2 0 0
0 2 2 2
2 2 2 2
0 2 0 2
or
2 0 2 2
2 2 0 2
2 2 0 2
0 2 2 0
or some other arrangement.
How can I do this efficiently?
This will do it:
import random
arr1d = arr.ravel()
randidx = random.sample(range(len(arr1d)), 5)
arr1d[randidx] = 0
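This modifies arr because ravel() returns a view rather than a copy whenever it can, which is the case for a contiguous array. For a non-contiguous array ravel() returns a copy, and arr would be left unchanged.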
For more on how the random numbers can be generated, see: Non-repetitive random number in numpy
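The same idea can be sketched with NumPy alone, assuming a NumPy version that provides np.random.default_rng (1.17 or later). Drawing flat indices without replacement guarantees five distinct cells, and assigning through arr.flat writes into arr itself whether or not it is contiguous. The array below is a made-up stand-in for the question's matrix.

```python
import numpy as np

rng = np.random.default_rng()

# Stand-in for the question's 4x4 matrix of 2s
arr = np.full((4, 4), 2)

# Pick 5 distinct flat positions out of arr.size (16 here)
idx = rng.choice(arr.size, size=5, replace=False)

# .flat indexes the array as if it were 1-D and modifies it in place
arr.flat[idx] = 0

print(arr)
```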
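Let's say your matrix is called "matrix":
import random
zeroed = set()
# keep drawing until 5 distinct cells have been set to 0
while len(zeroed) < 5:
    # randrange excludes the upper bound, so the indices stay in range
    random1 = random.randrange(size_x_ofmatrix)
    random2 = random.randrange(size_y_ofmatrix)
    zeroed.add((random1, random2))
    matrix[random1, random2] = 0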
This question already has answers here:
How to turn a float number like 293.4662543 into 293.47 in python?
(8 answers)
Closed 7 years ago.
I'm trying to convert polar coordinates into x-y coordinates:
df['x']=df.apply(lambda x: x['speed'] * math.cos(math.radians(x['degree'])),axis=1)
df['y']=df.apply(lambda x: x['speed'] * math.sin(math.radians(x['degree'])),axis=1)
df.head()
This produces
The problem is that the values of x have too many decimal places. How can I make them shorter?
In "How to turn a float number like 293.4662543 into 293.47 in python?" I found that I can do "%.2f" % 1.2399, but is this a good approach?
Actually, np.round works well in this case:
> from pandas import DataFrame
> import numpy as np
> a = DataFrame(np.random.normal(size=10).reshape((5,2)))
> a
0 1
0 -1.444689 -0.991011
1 1.054962 -0.288084
2 -0.700032 -0.604181
3 0.693142 2.281788
4 -1.647281 -1.309406
> np.round(a,2)
0 1
0 -1.44 -0.99
1 1.05 -0.29
2 -0.70 -0.60
3 0.69 2.28
4 -1.65 -1.31
You can also round an individual column by simply overwriting it with the rounded values:
> a[1] = np.round(a[1],3)
> a
0 1
0 0.028320 -1.104
1 -0.121453 -0.179
2 -1.906779 -0.347
3 0.234835 -0.522
4 -0.309782 0.129
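Applied to the polar conversion from the question, a sketch could look like the following. The speed and degree values are made up for illustration, and the row-wise apply is swapped for vectorized NumPy calls, which is a substitution of mine rather than part of the answer above.

```python
import numpy as np
import pandas as pd

# Made-up data with the column names used in the question
df = pd.DataFrame({"speed": [2.0, 2.0, 1.0], "degree": [0.0, 90.0, 45.0]})

# Convert degrees to radians once for the whole column
rad = np.deg2rad(df["degree"])

# Compute both coordinates column-wise and round to 2 decimals
df["x"] = (df["speed"] * np.cos(rad)).round(2)
df["y"] = (df["speed"] * np.sin(rad)).round(2)

print(df)
```

Unlike "%.2f" % value, which produces strings, rounding keeps the columns numeric, so they can still be used in later calculations.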