Split a Pandas Dataframe into multiple Dataframes based on Triangular Number Series

Split a Pandas Dataframe into multiple Dataframes based on Triangular Number Series - python

I have a DataFrame (df) and I need to split it into n number of Dataframes based on the column numbers. But, it has to follow the Triangular Series pattern:
df1 = df[[0]]
df2 = df[[1,2]]
df3 = df[[3,4,5]]
df4 = df[[6,7,8,9]]
etc.

Consider the dataframe df
df = pd.DataFrame(
np.arange(100).reshape(10, 10),
columns=list('ABCDEFGHIJ')
)
df
A B C D E F G H I J
0 0 1 2 3 4 5 6 7 8 9
1 10 11 12 13 14 15 16 17 18 19
2 20 21 22 23 24 25 26 27 28 29
3 30 31 32 33 34 35 36 37 38 39
4 40 41 42 43 44 45 46 47 48 49
5 50 51 52 53 54 55 56 57 58 59
6 60 61 62 63 64 65 66 67 68 69
7 70 71 72 73 74 75 76 77 78 79
8 80 81 82 83 84 85 86 87 88 89
9 90 91 92 93 94 95 96 97 98 99
i_s, j_s = np.arange(4).cumsum(), np.arange(1, 5).cumsum()
df1, df2, df3, df4 = [
df.iloc[:, i:j] for i, j in zip(i_s, j_s)
]
Verify
pd.concat(dict(enumerate([df.iloc[:, i:j] for i, j in zip(i_s, j_s)])), axis=1)
0 1 2 3
A B C D E F G H I J
0 0 1 2 3 4 5 6 7 8 9
1 10 11 12 13 14 15 16 17 18 19
2 20 21 22 23 24 25 26 27 28 29
3 30 31 32 33 34 35 36 37 38 39
4 40 41 42 43 44 45 46 47 48 49
5 50 51 52 53 54 55 56 57 58 59
6 60 61 62 63 64 65 66 67 68 69
7 70 71 72 73 74 75 76 77 78 79
8 80 81 82 83 84 85 86 87 88 89
9 90 91 92 93 94 95 96 97 98 99

first get Triangular Number Series, then apply it to dataframe
n = len(df.columns.tolist())
end = 0
i = 0
res = []
while end < n:
begin = end
end = i*(i+1)/2
res.append(begin,end)
idx = map( lambda x:range(x),res)
for i in idx:
df[i]

Related

Pivoting an numpy array by using pandas [duplicate]

How do I convert a list of lists to a panda dataframe?
it is not in the form of coloumns but instead in the form of rows.
#!/usr/bin/env python
from random import randrange
import pandas
data = [[[randrange(0,100) for j in range(0, 12)] for y in range(0, 12)] for x in range(0, 5)]
print data
df = pandas.DataFrame(data[0], columns=['B','P','F','I','FP','BP','2','M','3','1','I','L'])
print df
for example:
data[0][0] == [64, 73, 76, 64, 61, 32, 36, 94, 81, 49, 94, 48]
I want it to be shown as rows and not coloumns.
currently it shows somethign like this
B P F I FP BP 2 M 3 1 I L
0 64 73 76 64 61 32 36 94 81 49 94 48
1 57 58 69 46 34 66 15 24 20 49 25 98
2 99 61 73 69 21 33 78 31 16 11 77 71
3 41 1 55 34 97 64 98 9 42 77 95 41
4 36 50 54 27 74 0 8 59 27 54 6 90
5 74 72 75 30 62 42 90 26 13 49 74 9
6 41 92 11 38 24 48 34 74 50 10 42 9
7 77 9 77 63 23 5 50 66 49 5 66 98
8 90 66 97 16 39 55 38 4 33 52 64 5
9 18 14 62 87 54 38 29 10 66 18 15 86
10 60 89 57 28 18 68 11 29 94 34 37 59
11 78 67 93 18 14 28 64 11 77 79 94 66
I want the rows and coloumns to be switched. Moreover, How do I make it for all 5 main lists?
This is how I want the output to look like with other coloumns also filled in.
B P F I FP BP 2 M 3 1 I L
0 64
1 73
1 76
2 64
3 61
4 32
5 36
6 94
7 81
8 49
9 94
10 48
However. df.transpose() won't help.

This is what I came up with
data = [[[randrange(0,100) for j in range(0, 12)] for y in range(0, 12)] for x in range(0, 5)]
print data
df = pandas.DataFrame(data[0], columns=['B','P','F','I','FP','BP','2','M','3','1','I','L'])
print df
df1 = df.transpose()
df1.columns = ['B','P','F','I','FP','BP','2','M','3','1','I','L']
print df1

import numpy
df = pandas.DataFrame(numpy.asarray(data[x]).T.tolist(),
columns=['B','P','F','I','FP','BP','2','M','3','1','I','L'])

Project Euler problem 11 in Python - Row by row iterations not working

In order to solve problem 11, I have sought to implement 4 loops. Each of the 4 loops iterates in a different direction, so for example the first loop (which I will use to demonstrate my issue below) starts vertically from the top left of the grid. The logic of the loop is to go through the top row and then move down a row and follow the same multiplication pattern. After 16 iterations there are no more combinations of numbers and so the loop stops.
In order to test whether or not the function works, I want to print a list of all the iterations to ensure that it prints 360 unique numbers. The idea being that I can then alter the code to start with figure = 0, and with each iteration I can check to see if the number produced is bigger than the current value for figure. If it is, then figure is replaced with the value of that iteration.
My issue is that the output of my code is the same list of 20 numbers 16 times. Any help with this one would be highly appreciated! I know that there are many ways of doing this, and that I can look up the answers, but I want to get my own logic/solution working before I look at any answers, and this is the main blocker at the moment.
#code starts here
twenmat = [20*20 matrix]
newlist = []
figure = 0
for items in twenmat:
for x in range(0,20):
y = 0
newlist.append(twenmat[0+y][x]*twenmat[1+y][x]*twenmat[2+y][x]*twenmat[3+y][x])
y = y + 1
if y == 16:
break
print(newlist)
#end of script

Rather than manipulating individual coordinates, you could just shift the matrix by 1, 2 and 3 in each direction and perform cell by cell of multiplications between shifted matrices. Record the maximum of these products as you go through the 4 directions (right, down, down-right, up-right):
data =\
"""08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48"""
M = [ [*map(int,line.split())] for line in data.split("\n") ]
...
# shift the matrix by a positive or negative amount vertically and horizontally
# empty positions are filled with 1 so that the products aren't impacted
def shift(m,v,h):
if v<0 : m = [[1]*len(m)]*-v + m[:v]
else : m = m[v:] + [[1]*len(m)]*v
if h<0 : m = [ [1]*-h + r[:h] for r in m ]
else : m = [ r[h:] + [1]*h for r in m ]
return m
# base matrix multiplied cell by cell with 3 shifted versions ...
maxProd = 0
for dv,dh in [(0,1),(1,0),(1,1),(-1,1)]:
m = M # start with non-shifted values
for i in range(1,4):
# multiply by each shifted copies cell by cell
m = [ [a*b for a,b in zip(r0,r1)]
for r0,r1 in zip(m,shift(M,dv*i,dh*i)) ]
# record maximum of all resulting products
maxProd = max(maxProd,max((max(row) for row in m)))
print(maxProd) # 70600674
To illustrate this shifting process, let's look at the 3 shifted versions going down-right on the main diagonal (offset: 1,1):
shifted by 1:
49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 4 56 62 0 1
49 31 73 55 79 14 29 93 71 40 67 53 88 30 3 49 13 36 65 1
70 95 23 4 60 11 42 69 24 68 56 1 32 56 71 37 2 36 91 1
31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80 1
47 32 60 99 3 45 2 44 75 33 53 78 36 84 20 35 17 12 50 1
98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70 1
26 20 68 2 62 12 20 95 63 94 39 63 8 40 91 66 49 94 21 1
55 58 5 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72 1
36 23 9 75 0 76 44 20 45 35 14 0 61 33 97 34 31 33 95 1
17 53 28 22 75 31 67 15 94 3 80 4 62 16 14 9 53 56 92 1
39 5 42 96 35 31 47 55 58 88 24 0 17 54 24 36 29 85 57 1
56 0 48 35 71 89 7 5 44 44 37 44 60 21 58 51 54 17 58 1
80 81 68 5 94 47 69 28 73 92 13 86 52 17 77 4 89 55 40 1
52 8 83 97 35 99 16 7 97 57 32 16 26 26 79 33 27 98 66 1
36 68 87 57 62 20 72 3 46 33 67 46 55 12 32 63 93 53 69 1
42 16 73 38 25 39 11 24 94 72 18 8 46 29 32 40 62 76 36 1
69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 4 36 16 1
73 35 29 78 31 90 1 74 31 49 71 48 86 81 16 23 57 5 54 1
70 54 71 83 51 54 69 16 92 33 48 61 43 52 1 89 19 67 48 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
shifted by 2:
31 73 55 79 14 29 93 71 40 67 53 88 30 3 49 13 36 65 1 1
95 23 4 60 11 42 69 24 68 56 1 32 56 71 37 2 36 91 1 1
16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80 1 1
32 60 99 3 45 2 44 75 33 53 78 36 84 20 35 17 12 50 1 1
81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70 1 1
20 68 2 62 12 20 95 63 94 39 63 8 40 91 66 49 94 21 1 1
58 5 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72 1 1
23 9 75 0 76 44 20 45 35 14 0 61 33 97 34 31 33 95 1 1
53 28 22 75 31 67 15 94 3 80 4 62 16 14 9 53 56 92 1 1
5 42 96 35 31 47 55 58 88 24 0 17 54 24 36 29 85 57 1 1
0 48 35 71 89 7 5 44 44 37 44 60 21 58 51 54 17 58 1 1
81 68 5 94 47 69 28 73 92 13 86 52 17 77 4 89 55 40 1 1
8 83 97 35 99 16 7 97 57 32 16 26 26 79 33 27 98 66 1 1
68 87 57 62 20 72 3 46 33 67 46 55 12 32 63 93 53 69 1 1
16 73 38 25 39 11 24 94 72 18 8 46 29 32 40 62 76 36 1 1
36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 4 36 16 1 1
35 29 78 31 90 1 74 31 49 71 48 86 81 16 23 57 5 54 1 1
54 71 83 51 54 69 16 92 33 48 61 43 52 1 89 19 67 48 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
shifter by 3:
23 4 60 11 42 69 24 68 56 1 32 56 71 37 2 36 91 1 1 1
71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80 1 1 1
60 99 3 45 2 44 75 33 53 78 36 84 20 35 17 12 50 1 1 1
28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70 1 1 1
68 2 62 12 20 95 63 94 39 63 8 40 91 66 49 94 21 1 1 1
5 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72 1 1 1
9 75 0 76 44 20 45 35 14 0 61 33 97 34 31 33 95 1 1 1
28 22 75 31 67 15 94 3 80 4 62 16 14 9 53 56 92 1 1 1
42 96 35 31 47 55 58 88 24 0 17 54 24 36 29 85 57 1 1 1
48 35 71 89 7 5 44 44 37 44 60 21 58 51 54 17 58 1 1 1
68 5 94 47 69 28 73 92 13 86 52 17 77 4 89 55 40 1 1 1
83 97 35 99 16 7 97 57 32 16 26 26 79 33 27 98 66 1 1 1
87 57 62 20 72 3 46 33 67 46 55 12 32 63 93 53 69 1 1 1
73 38 25 39 11 24 94 72 18 8 46 29 32 40 62 76 36 1 1 1
41 72 30 23 88 34 62 99 69 82 67 59 85 74 4 36 16 1 1 1
29 78 31 90 1 74 31 49 71 48 86 81 16 23 57 5 54 1 1 1
71 83 51 54 69 16 92 33 48 61 43 52 1 89 19 67 48 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Each number is moved to the next position diagonally so the product of cells at a given position corresponds to the 4 values going down-right on the main diagonal.
We do this for all directions to get the maximum product.

How to write this code in an optimal (pythonic) way?

I have the following code in R and I need to write it in an optimal way in python using pandas. I wrote it but it takes a long time to run.
1) is there someone who can confirm that this is an equivalent of R code in python
2) how to write it in a pythonic way(optimal way)
in R
for (i in 1:dim(df1)[1])
df1$column1[i] <- sum(df2[i,4:33])
in Python
for i in range(df1.shape[0]):
df1['column1'][i] = df2.iloc[i,3:34].sum()

These are two ways to make the replacement
df1['column1'] = df2.iloc[:, 3:34].sum(axis=1)
OR
df1.loc[:, 'column1'] = df2.iloc[:, 3:34].sum(axis=1)

Use vectorized operations:
>>> df = pd.DataFrame(np.random.randint(0, 100, (10, 15)), columns=list('abcdefghijklmno'))
>>> df
a b c d e f g h i j k l m n o
0 71 93 12 32 17 23 35 57 26 89 4 29 28 83 30
1 98 78 75 0 61 81 8 17 93 71 48 47 72 52 11
2 13 62 93 48 31 23 42 66 77 99 59 1 40 72 87
3 7 5 5 43 83 19 59 36 18 96 50 60 46 45 54
4 32 69 93 6 7 12 15 49 29 11 37 83 75 97 84
5 52 53 43 61 93 85 91 99 65 62 35 89 55 77 62
6 44 7 41 56 40 11 39 91 87 46 95 48 30 75 16
7 93 15 63 23 14 20 7 33 29 31 41 40 82 0 16
8 46 63 59 59 81 51 34 41 89 68 20 64 95 70 74
9 33 58 49 91 51 46 43 83 37 53 47 32 42 12 59
Then simply:
>>> df['column1'] = df.iloc[:, 3:8].sum(axis=1)
>>> df
a b c d e f g h i j k l m n o column1
0 71 93 12 32 17 23 35 57 26 89 4 29 28 83 30 164
1 98 78 75 0 61 81 8 17 93 71 48 47 72 52 11 167
2 13 62 93 48 31 23 42 66 77 99 59 1 40 72 87 210
3 7 5 5 43 83 19 59 36 18 96 50 60 46 45 54 240
4 32 69 93 6 7 12 15 49 29 11 37 83 75 97 84 89
5 52 53 43 61 93 85 91 99 65 62 35 89 55 77 62 429
6 44 7 41 56 40 11 39 91 87 46 95 48 30 75 16 237
7 93 15 63 23 14 20 7 33 29 31 41 40 82 0 16 97
8 46 63 59 59 81 51 34 41 89 68 20 64 95 70 74 266
9 33 58 49 91 51 46 43 83 37 53 47 32 42 12 59 314
>>>

How to create a pandas dataframe array ,whose specific column always has value greater than a particular column -by using np.random.randint

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
print(df)
I want column 'A' always to have a value greater than column 'B'.

df.A, df.B = df[['A', 'B']].max(axis=1), df[['A', 'B']].min(axis=1)

Try this:
newdf = df.apply(lambda x: x if x[0]>x[1] else [*x[:2][::-1],*x[2:]],axis=1)
print(newdf)
Output:
A B C D
0 85 14 22 85
1 62 54 20 1
2 82 78 48 59
3 81 59 54 39
4 92 12 79 44
5 69 64 8 11
6 49 34 48 69
7 68 28 80 27
8 72 17 2 40
9 26 15 49 62
10 29 2 86 12
11 69 7 32 99
12 39 35 65 32
13 45 36 36 12
14 54 21 29 79
15 91 82 35 80
16 67 16 4 37
17 94 82 93 37
18 64 18 2 15
19 13 11 28 82
20 78 9 93 45
21 72 41 16 33
22 92 71 62 69
23 87 79 71 11
24 31 14 8 24
25 85 27 43 3
26 82 34 14 52
27 41 32 39 48
28 13 12 24 86
29 96 17 14 80
.. .. .. .. ..
70 17 13 20 91
71 26 7 57 96
72 41 0 24 58
73 98 68 90 13
74 88 35 81 56
75 65 43 70 86
76 82 81 44 68
77 97 45 23 66
78 81 45 78 48
79 62 24 43 62
80 43 13 42 49
81 97 28 75 45
82 3 0 54 40
83 57 46 16 38
84 87 46 35 13
85 41 13 78 89
86 62 36 94 23
87 84 35 69 93
88 63 18 39 3
89 45 42 30 6
90 81 8 49 82
91 28 28 11 47
92 97 81 49 92
93 86 24 82 40
94 76 72 30 51
95 93 92 1 69
96 97 76 38 81
97 87 49 26 64
98 98 25 93 55
99 57 2 87 10
[100 rows x 4 columns]

You can apply it to any no of columns.
import numpy as np
import pandas as pd
#np.random.seed(1)
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
#we are just sorting values of each rows in descending order.
df.values[:,::-1].sort()
print(df)
It gives following output:
A B C D
0 72 37 12 9
1 79 75 64 5
2 76 71 16 1
3 50 25 20 6
4 84 28 18 11
5 68 50 29 14
6 96 94 87 87
7 86 13 9 7
8 63 61 57 22
9 81 60 1 0
10 88 47 13 8
11 72 71 30 3
12 70 57 49 21
13 68 43 24 3
14 80 76 52 26
15 82 64 41 15
16 98 87 68 25
17 26 25 22 7
18 67 27 23 9
19 83 57 38 37
20 34 32 10 8

Issue with merging time series variables to create new DataFrame with arbitrary index

So I am trying to merge the following columns of data which are currently indexed as daily entries (but only have points once per week). I have separated the columns into year variables but am having trouble getting them into a combined dataframe and disregard the date index so that I can build out min/max columns by week over the years. I am not sure how to get merge/join function to do this.
#Create year variables, append to new dataframe with new index
I have the following:
def minmaxdata():
Totrigs = dataforgraphs()
tr = Totrigs
yrs=[tr['2007'],tr['2008'],tr['2009'],tr['2010'],tr['2011'],tr['2012'],tr['2013'],tr['2014']]
yrlist = ['tr07','tr08','tr09','tr10','tr11','tr12','tr13','tr14']
dic = dict(zip(yrlist,yrs))
yr07,yr08,yr09,yr10,yr11,yr12,yr13,yr14 =dic['tr07'],dic['tr08'],dic['tr09'],dic['tr10'],dic['tr11'],dic['tr12'],dic['tr13'],dic['tr14']
minmax = yr07.append([yr08,yr09,yr10,yr11,yr12,yr13,yr14],ignore_index=True)
I would like a Dataframe like the following:
2007 2008 2009 2010 2011 2012 2013 2014 min max
1 10 13 10 12 34 23 22 14 10 34
2 25 ...
3 22
4 ...
5
.
.
. ...
52

I'm not sure what your original data look like, but I don't think it's a good idea to hard-code all years. You lose re-usability. I'll setup a sequence of random integers indexed by date with one date per week.
In [65]: idx = pd.date_range ('2007-1-1','2014-12-31',freq='W')
In [66]: df = pd.DataFrame(np.random.randint(100, size=len(idx)), index=idx, columns=['value'])
In [67]: df.head()
Out[67]:
value
2007-01-07 7
2007-01-14 2
2007-01-21 85
2007-01-28 55
2007-02-04 36
In [68]: df.tail()
Out[68]:
value
2014-11-30 76
2014-12-07 34
2014-12-14 43
2014-12-21 26
2014-12-28 17
Then get year of the week:
In [69]: df['year'] = df.index.year
In [70]: df['week'] = df.groupby('year').cumcount()+1
(You may try df.index.week for week# but I've seen weird behavior like starting from week #53 in Jan.)
Finally, do a pivot table to transform and get row-wise max/min:
In [71]: df2 = df.pivot_table(index='week', columns='year', values='value')
In [72]: df2['max'] = df2.max(axis=1)
In [73]: df2['min'] = df2.min(axis=1)
And now our dataframe df2 looks like this and should be what you need:
In [74]: df2
Out[74]:
year 2007 2008 2009 2010 2011 2012 2013 2014 max min
week
1 7 82 13 32 24 58 18 10 82 7
2 2 5 29 0 2 97 59 83 97 0
3 85 89 8 83 63 73 47 49 89 8
4 55 5 1 44 78 10 13 87 87 1
5 36 41 48 98 98 24 24 69 98 24
6 51 43 62 60 44 57 34 33 62 33
7 37 66 72 46 28 11 73 36 73 11
8 30 13 86 93 46 67 95 15 95 13
9 78 84 16 21 70 39 43 90 90 16
10 9 2 88 15 39 81 44 96 96 2
11 34 76 16 44 44 26 30 77 77 16
12 2 24 23 13 25 69 25 74 74 2
13 66 91 67 77 18 47 95 66 95 18
14 59 52 22 42 40 99 88 21 99 21
15 76 17 31 57 43 31 91 67 91 17
16 76 38 53 43 84 45 78 9 84 9
17 88 53 34 22 99 93 61 42 99 22
18 78 19 82 19 5 80 55 69 82 5
19 54 92 56 6 2 85 7 67 92 2
20 8 56 86 41 60 76 31 81 86 8
21 64 76 11 38 41 98 39 72 98 11
22 21 86 34 1 15 27 26 95 95 1
23 82 90 3 17 62 18 93 20 93 3
24 47 42 32 27 83 8 22 14 83 8
25 15 66 70 16 4 22 26 14 70 4
26 12 68 21 7 86 2 27 10 86 2
27 85 85 9 39 17 94 67 42 94 9
28 73 80 96 49 46 23 69 84 96 23
29 57 74 6 71 79 31 79 7 79 6
30 18 84 85 34 71 69 0 62 85 0
31 24 40 93 53 72 46 44 71 93 24
32 95 4 58 57 68 27 95 71 95 4
33 65 84 87 41 38 45 71 33 87 33
34 62 14 41 83 79 63 44 13 83 13
35 49 96 50 62 25 45 69 63 96 25
36 6 38 86 34 98 60 67 80 98 6
37 99 44 26 19 19 20 57 17 99 17
38 2 40 7 65 68 58 68 13 68 2
39 72 31 83 65 69 39 10 76 83 10
40 90 31 42 20 7 8 62 79 90 7
41 10 46 82 96 30 43 12 84 96 10
42 79 38 28 78 25 9 80 2 80 2
43 64 83 63 40 29 86 10 15 86 10
44 89 91 62 48 53 69 16 0 91 0
45 99 26 85 45 26 53 79 86 99 26
46 35 14 46 25 74 6 68 44 74 6
47 17 9 84 88 29 83 85 1 88 1
48 18 69 55 16 77 35 16 76 77 16
49 60 4 36 50 81 28 50 34 81 4
50 36 29 38 28 81 86 71 43 86 28
51 41 82 95 27 95 77 74 26 95 26
52 2 81 89 82 28 2 11 17 89 2
53 NaN NaN NaN NaN NaN 0 NaN NaN 0 0
EDIT:
If you need max/min over a certain columns, just list them. In this case (2007-2013), they are consecutive so you can do the following.
df2['max_2007to2013'] = df2[range(2007,2014)].max(axis=1)
If not, simply list them like: df2[[2007,2010,2012,2013]].max(axis=1)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Split a Pandas Dataframe into multiple Dataframes based on Triangular Number Series - python

I have a DataFrame (df) and I need to split it into n number of Dataframes based on the column numbers. But, it has to follow the Triangular Series pattern: df1 = df[[0]] df2 = df[[1,2]] df3 = df[[3,4,5]] df4 = df[[6,7,8,9]] etc.

first get Triangular Number Series, then apply it to dataframe n = len(df.columns.tolist()) end = 0 i = 0 res = [] while end < n: begin = end end = i*(i+1)/2 res.append(begin,end) idx = map( lambda x:range(x),res) for i in idx: df[i]

Related

Pivoting an numpy array by using pandas [duplicate]

Project Euler problem 11 in Python - Row by row iterations not working

How to write this code in an optimal (pythonic) way?

How to create a pandas dataframe array ,whose specific column always has value greater than a particular column -by using np.random.randint

Issue with merging time series variables to create new DataFrame with arbitrary index

Categories

Resources