I have a really simple CSV file like this (I used the Fibonacci numbers as an example):
nn,number
1,1
2,1
3,2
4,3
5,5
6,8
7,13
8,21
9,34
10,55
11,89
12,144
13,233
14,377
15,610
16,987
17,1597
18,2584
19,4181
20,6765
21,10946
22,17711
23,28657
24,46368
25,75025
26,121393
27,196418
and I am just trying to process the rows in bulk in the following manner (the Fibonacci numbers themselves are irrelevant):
import csv
b=0
s=1
i=1
itera=0
maximum=10000
bulk_save=10
csv_file='really_simple.csv'
fo = open(csv_file)
reader = csv.reader(fo)
##Skipping headers
_headers=reader.next()
while (s>0) and itera<maximum:
    print 'processing...'
    b+=1
    tobesaved=[]
    for row,i in zip(reader,range(1,bulk_save+1)):
        itera+=1
        tobesaved.append(row)
        print itera,row[0]
    s=len(tobesaved)
    print 'chunk no '+str(b)+' processed '+str(s)+' rows'
print 'Exit.'
The output I get is a bit weird (as if the reader skips an entry at the end of each loop):
processing...
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
chunk no 1 commited 10 rows
processing...
11 12
12 13
13 14
14 15
15 16
16 17
17 18
18 19
19 20
20 21
chunk no 2 commited 10 rows
processing...
21 23
22 24
23 25
24 26
25 27
chunk no 3 commited 5 rows
processing...
chunk no 4 commited 0 rows
Exit.
Do you have any idea what the problem could be?
My guess is the zip function.
The reason I structured the code like this (reading the data in chunks) is that I need to bulk-save the CSV entries to an sqlite3 database, using executemany and committing at the end of every zip loop, so that I do not overload my memory.
Thanks!
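Your guess is right: zip is the culprit. For each pair it first takes a row from reader and only then asks range for the next number. When range runs out after bulk_save items, zip has already pulled one more row from reader, and that row is thrown away. That is why rows 11 and 22 are missing from your output. Accumulate the rows yourself instead: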
import csv
def process(rows, chunk_no):
    for no, data in rows:
        print no, data
    print 'chunk no {} process {} rows'.format(chunk_no, len(rows))
csv_file='really_simple.csv'
with open(csv_file) as fo:
    reader = csv.reader(fo)
    _headers = reader.next()
    chunk_no = 1
    tobesaved = []
    for row in reader:
        tobesaved.append(row)
        if len(tobesaved) == 10:
            process(tobesaved, chunk_no)
            chunk_no += 1
            tobesaved = []
    if tobesaved:
        process(tobesaved, chunk_no)
prints
1 1
2 1
3 2
4 3
5 5
6 8
7 13
8 21
9 34
10 55
chunk no 1 process 10 rows
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
chunk no 2 process 10 rows
21 10946
22 17711
23 28657
24 46368
25 75025
26 121393
27 196418
chunk no 3 process 7 rows
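For Python 3, where `reader.next()` is spelled `next(reader)`, the same chunking can be sketched with `itertools.islice`, which pulls at most `size` rows per call and so never reads a row only to discard it. The helper name `iter_chunks` and the in-memory sample standing in for `really_simple.csv` are assumptions made for illustration.

```python
import csv
import io
from itertools import islice


def iter_chunks(iterable, size):
    """Yield lists of up to `size` items until the iterable is exhausted."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Stand-in for open('really_simple.csv'): a header plus 27 data rows
sample = io.StringIO(
    "nn,number\n" + "".join("{},{}\n".format(i, i) for i in range(1, 28))
)
reader = csv.reader(sample)
next(reader)  # skip the header row

chunk_sizes = [len(chunk) for chunk in iter_chunks(reader, 10)]
print(chunk_sizes)  # prints [10, 10, 7]
```

In the sqlite3 use case from the question, each chunk could be passed to `cursor.executemany(...)` and followed by `connection.commit()`.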
Strange issue, when I run this code:
data = open("data.txt", "r")
output = open("output.txt", "w")
for line in data:
    output.write(line)
It only starts writing to the output file at line 22.
data.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
output.txt
22
23
24
25
26
27
28
29
30
This only happens when running it in a JupyterLab notebook. Bug or feature? Or am I missing something?
Huh, strange, because I tried almost identical code on my machine and it copied all 30 lines. The only thing I did differently was use absolute file paths (as raw strings, so the backslashes are not treated as escapes), so my code was:
data = open(r"C:\Users\User\Jupyter Notebook\data.txt", "r")
outputs = open(r"C:\Users\User\Jupyter Notebook\outputs.txt", "w")
for line in data:
    outputs.write(line)
Can you see if this works?
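One thing worth ruling out, as an assumption the thread does not confirm: neither file is ever closed, so buffered writes may not have reached the disk while the notebook kernel is still running. A minimal sketch that closes both files with `with` blocks is below; the helper name `copy_lines` and the temporary files standing in for `data.txt` and `output.txt` are hypothetical.

```python
import os
import tempfile


def copy_lines(src_path, dst_path):
    """Copy src to dst line by line; both files are closed on exit."""
    with open(src_path, "r") as data, open(dst_path, "w") as output:
        for line in data:
            output.write(line)


# Temporary files standing in for data.txt and output.txt
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "data.txt")
dst = os.path.join(workdir, "output.txt")
with open(src, "w") as f:
    f.write("".join("{}\n".format(n) for n in range(1, 31)))

copy_lines(src, dst)

with open(dst) as f:
    copied = f.read().split()
print(len(copied), copied[0], copied[-1])  # prints 30 1 30
```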
I have a df with time data and I would like to convert these values to seconds (see the example below).
Compression_level Size (M) Real time (s) User time (s) Sys time (s)
0 0 265 0:19.938 0:24.649 0:3.062
1 1 76 0:17.910 0:25.929 0:3.098
2 2 74 1:02.619 0:27.724 0:3.014
3 3 73 0:20.607 0:27.937 0:3.193
4 4 67 0:19.598 0:28.853 0:2.925
5 5 67 0:21.032 0:30.119 0:3.206
6 6 66 0:27.013 0:31.462 0:3.106
7 7 65 0:27.337 0:36.226 0:3.060
8 8 64 0:37.651 0:47.246 0:2.933
9 9 64 0:59.241 1:8.333 0:3.027
This is the output I would like to obtain.
df["Real time (s)"]
0 19.938
1 17.910
2 62.619
...
I have some useful code, but I do not know how to apply it across a data frame:
import time
import datetime
x = time.strptime("00:01:00","%H:%M:%S")
datetime.timedelta(hours=x.tm_hour,minutes=x.tm_min, seconds=x.tm_sec).total_seconds()
Prepend 00: (for 0 hours) with Series.radd, pass the result to to_timedelta, and then call Series.dt.total_seconds:
df["Real time (s)"] = pd.to_timedelta(df["Real time (s)"].radd('00:')).dt.total_seconds()
print (df)
Compression_level Size (M) Real time (s) User time (s) Sys time (s)
0 0 265 19.938 0:24.649 0:3.062
1 1 76 17.910 0:25.929 0:3.098
2 2 74 62.619 0:27.724 0:3.014
3 3 73 20.607 0:27.937 0:3.193
4 4 67 19.598 0:28.853 0:2.925
5 5 67 21.032 0:30.119 0:3.206
6 6 66 27.013 0:31.462 0:3.106
7 7 65 27.337 0:36.226 0:3.060
8 8 64 37.651 0:47.246 0:2.933
9 9 64 59.241 1:8.333 0:3.027
Solution for processing multiple columns:
def to_td(x):
return pd.to_timedelta(x.radd('00:')).dt.total_seconds()
cols = ["Real time (s)", "User time (s)", "Sys time (s)"]
df[cols] = df[cols].apply(to_td)
print (df)
Compression_level Size (M) Real time (s) User time (s) Sys time (s)
0 0 265 19.938 24.649 3.062
1 1 76 17.910 25.929 3.098
2 2 74 62.619 27.724 3.014
3 3 73 20.607 27.937 3.193
4 4 67 19.598 28.853 2.925
5 5 67 21.032 30.119 3.206
6 6 66 27.013 31.462 3.106
7 7 65 27.337 36.226 3.060
8 8 64 37.651 47.246 2.933
9 9 64 59.241 68.333 3.027
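The arithmetic behind the conversion can also be sketched without pandas: split each value on the colon and compute minutes × 60 + seconds. The helper name `mmss_to_seconds` is hypothetical, and the sketch assumes every value has exactly one colon, as in the sample data.

```python
def mmss_to_seconds(value):
    """Convert a 'minutes:seconds' string such as '1:02.619' to seconds."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + float(seconds)


# Values taken from the sample data frame
samples = ["0:19.938", "1:02.619", "1:8.333"]
converted = [mmss_to_seconds(s) for s in samples]
print(converted)
```

On a data frame, the same helper could be applied to one column with `df[col].map(mmss_to_seconds)`.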
I have to deal with a square matrix (N x N) (N will change depending on the system, but the matrix will always be a square matrix).
Here is an example:
0 1 2 3 4
0 5.1677124550E-001 5.4962112499E-005 3.2484393256E-002 -1.8901697652E-001 -6.7156804753E-003
1 5.5380106796E-005 5.6159927753E-001 -1.9000545049E-003 -1.4737748792E-002 -7.2598453774E-002
2 3.2486915835E-002 -1.8996351539E-003 5.6791783316E-001 7.2316374186E-002 1.5013066446E-003
3 -1.8901411495E-001 -1.4737367075E-002 7.2315825338E-002 6.2721160365E-001 3.1553528602E-002
4 -6.7136454124E-003 -7.2597907350E-002 1.5007743348E-003 3.1554372311E-002 2.7318109331E-001
5 6.6738948243E-002 1.4102132238E-003 -1.2689244944E-001 4.7666038803E-002 1.8559074897E-002
6 -2.5293332676E-002 3.7536452002E-002 -1.3453018251E-002 -1.3177136905E-001 6.8262612506E-002
7 5.0951492945E-003 2.1082303893E-005 2.2599127408E-004 1.0287898189E-001 -1.1117916184E-001
8 1.0818230191E-003 -1.2435319909E-002 8.1008075834E-003 -4.2864102001E-002 4.2865913226E-002
9 -1.8399671295E-002 -2.1579653853E-002 -8.3073582356E-003 -2.1848513510E-001 -7.3408914625E-002
10 3.4566032399E-003 -4.0687639382E-003 1.3769999130E-003 -1.1873434189E-001 -3.3274201039E-002
11 6.6093238125E-003 1.7153435473E-002 4.9392012712E-003 -8.4590814134E-002 -4.3601041176E-002
12 -1.1418316960E-001 -1.1241625427E-001 -3.2263873516E-002 -1.9323129435E-002 -2.6233049625E-002
13 -1.1352899039E-001 -2.2898299860E-001 -5.3035072561E-002 7.4480651562E-004 6.3778892206E-004
14 -3.2197359289E-002 -5.3404040557E-002 -6.2530142462E-002 9.6648204015E-003 1.5382174347E-002
15 -1.2210509335E-001 1.1380412205E-001 -3.8374895516E-002 -1.2823165326E-002 2.3865200517E-002
16 1.1478157080E-001 -2.1487971631E-001 5.9955334103E-002 -1.2803721235E-003 -2.2477259002E-004
17 -3.9162044498E-002 6.0167325377E-002 -6.7692892326E-002 6.3814569032E-003 -1.3309923267E-002
18 -5.1386866211E-002 -1.1483215267E-003 -3.8482481829E-002 2.2227734790E-003 2.4860195290E-004
19 -1.8287048910E-003 -4.5442287955E-002 -7.6787332291E-003 7.6970470456E-004 -1.8456603178E-003
20 -3.4812676792E-002 -7.8376169613E-003 -3.1205975353E-001 -2.8005140005E-003 3.9792109835E-004
21 2.6908361866E-003 3.7102890907E-004 2.8494060446E-002 -4.8904422930E-002 -5.8840348122E-004
22 -1.6354677061E-003 2.2592828188E-003 1.6591434361E-004 -4.9992263663E-003 -4.3243295112E-002
23 -1.4297833794E-003 -1.7830154308E-003 -1.1426700328E-002 1.7125095395E-003 -1.2016863398E-002
24 1.6271802154E-003 1.6383303957E-003 -7.8049656555E-004 3.7177399735E-003 -1.0472268655E-002
25 -4.1949740427E-004 1.5301971185E-004 -9.8681335931E-004 -2.2257204483E-004 -5.1722898203E-003
26 1.0290471110E-003 9.3255502541E-004 7.7166886713E-004 4.5630851485E-003 -4.3761358485E-003
27 -7.0031784470E-004 -3.5205332654E-003 -1.6311730073E-003 -1.2805479632E-002 -6.5565487971E-003
28 7.4046927792E-004 1.9332629981E-003 3.7374682636E-004 3.9965654817E-003 -6.2275912806E-003
29 -3.4680278867E-004 -2.3027344089E-003 -1.1338817043E-003 -1.2023581780E-002 -5.4242202971E-003
5 6 7 8 9
0 6.6743285428E-002 -2.5292337123E-002 5.0949675928E-003 1.0817408844E-003 -1.8399704662E-002
1 1.4100215877E-003 3.7536256943E-002 2.1212526899E-005 -1.2435482773E-002 -2.1579384876E-002
2 -1.2689432485E-001 -1.3453164785E-002 2.2618690004E-004 8.1008703937E-003 -8.3084039605E-003
3 4.7663851818E-002 -1.3181118094E-001 1.0290976691E-001 -4.2887391630E-002 -2.1847562123E-001
4 1.8558453001E-002 6.8311145594E-002 -1.1122358467E-001 4.2891711956E-002 -7.3413776745E-002
5 6.5246209445E-001 -3.7960754525E-002 5.8439215647E-002 -9.0620367134E-002 -8.4164313206E-002
6 -3.7935271881E-002 1.9415336793E-001 -6.8115262349E-002 5.0899890760E-002 -3.3687874555E-002
7 5.8422477033E-002 -6.8128901087E-002 3.9950499633E-001 -4.4336879147E-002 -4.0665928103E-002
8 -9.0612201567E-002 5.0902528870E-002 -4.4330072001E-002 1.2680415316E-001 1.7096405711E-002
9 -8.4167028549E-002 -3.3690056890E-002 -4.0677875424E-002 1.7097273427E-002 5.2579065978E-001
10 -6.4841142152E-002 -5.4453858464E-003 -2.4697277476E-001 8.5069643903E-005 1.8744016178E-001
11 -1.0367060076E-001 1.5864203200E-002 -1.6074822795E-002 -5.5265410413E-002 -7.3152548403E-002
12 -9.0665723957E-003 3.3027526012E-003 1.8484849938E-003 -7.5841163489E-004 -3.3700244298E-003
13 4.7717318460E-004 -1.8118719766E-003 1.6014630540E-003 -2.3830908057E-004 2.1049292570E-003
14 4.3836856576E-003 -1.7242302777E-003 -1.2023546553E-003 4.0533783460E-004 1.4850814596E-003
15 -1.2402059167E-002 -7.4793143461E-003 -3.8769252328E-004 3.9551076185E-003 1.0737706641E-003
16 -9.3076805579E-005 -1.6074185601E-003 1.7551579833E-003 -5.1663470094E-004 1.1072804383E-003
17 4.6817349747E-003 3.6900011954E-003 -8.6155331565E-004 -9.1007768778E-005 -7.3899260162E-004
18 3.2959550689E-002 3.0400921147E-003 3.9724187499E-004 -1.9220339108E-003 1.8075790317E-003
19 7.0905456379E-004 -5.0949208181E-004 -4.6021500516E-004 -7.9847500945E-004 1.4079850530E-004
20 -1.8687467448E-002 -6.3913023759E-004 -7.3566296037E-004 2.3726543730E-003 -1.0663719038E-003
21 3.6598966411E-003 -8.2335128379E-003 7.5645765132E-004 -2.1824880567E-002 -3.5125687811E-003
22 -1.6198130808E-002 8.4576317115E-003 -6.2045498682E-004 3.3460766491E-002 3.2638760335E-003
23 -3.2057393808E-001 -1.1315081941E-002 3.4822885510E-003 -5.8263446092E-003 2.9508421818E-004
24 -2.6366856593E-002 -5.8331954255E-004 1.1995976399E-003 3.4813904521E-003 -5.0942740761E-002
25 6.5474742063E-003 -5.7681583908E-003 -2.2680039574E-002 -3.3264360995E-002 4.8925407218E-003
26 -1.1288074542E-002 -4.5938216710E-003 -1.9339903561E-003 1.0812058656E-002 2.3005958417E-002
27 1.8937006089E-002 6.5590668002E-003 -2.9973042787E-003 -9.1466195902E-003 -2.0027029530E-001
28 -5.0006834397E-003 -3.1011487603E-002 -2.1071980031E-002 1.5171078954E-002 -6.3286786806E-002
29 1.0199591553E-002 -7.9372677248E-004 3.0157129340E-003 3.3043947441E-003 1.2554933598E-001
10 11 12 13 14
0 3.4566170422E-003 6.6091516193E-003 -1.1418209846E-001 -1.1352717720E-001 -3.2196213169E-002
1 -4.0687114857E-003 1.7153538295E-002 -1.1241515840E-001 -2.2897846552E-001 -5.3401852861E-002
2 1.3767476381E-003 4.9395834885E-003 -3.2262805417E-002 -5.3032729716E-002 -6.2527093260E-002
3 -1.1874067860E-001 -8.4586993618E-002 -1.9322697616E-002 7.4504831410E-004 9.6646936748E-003
4 -3.3280804952E-002 -4.3604931512E-002 -2.6232842935E-002 6.3789697287E-004 1.5382093474E-002
5 -6.4845769217E-002 -1.0366990398E-001 -9.0664935892E-003 4.7719667654E-004 4.3835884630E-003
6 -5.4306282394E-003 1.5863464756E-002 3.3027917727E-003 -1.8118646089E-003 -1.7242102753E-003
7 -2.4687457565E-001 -1.6075394559E-002 1.8484728466E-003 1.6014634135E-003 -1.2023496466E-003
8 8.5962912652E-005 -5.5265657567E-002 -7.5843145596E-004 -2.3831274033E-004 4.0533385644E-004
9 1.8744386918E-001 -7.3152643002E-002 -3.3700964189E-003 2.1048865009E-003 1.4850822567E-003
10 4.2975054072E-001 1.0364270794E-001 -1.5875283846E-003 6.7147216913E-004 1.2875627684E-004
11 1.0364402707E-001 6.0712435750E-001 5.1492123223E-003 8.2705404716E-004 -1.8653698814E-003
12 -1.5875318643E-003 5.1492269487E-003 1.2662026379E-001 1.2488481495E-001 3.3008712754E-002
13 6.7147489686E-004 8.2705994225E-004 1.2488477299E-001 2.4603749137E-001 5.7666439818E-002
14 1.2875157882E-004 -1.8653719810E-003 3.3008614344E-002 5.7666322609E-002 6.3196096154E-002
15 1.1375173141E-003 -1.2188735107E-003 9.5708352328E-003 -1.3282223067E-002 5.3571128896E-003
16 2.1319373893E-004 -2.6367828437E-004 1.4833724552E-002 -2.0115235494E-002 7.8461850894E-003
17 2.3051283757E-004 3.4044831571E-004 4.9262824289E-003 -6.6151918659E-003 1.1684894610E-003
18 -5.6658408835E-004 1.5710333316E-003 -2.6543076573E-003 1.0490950154E-003 -1.5676208892E-002
19 1.0005496308E-003 1.0400419914E-003 -2.7122935995E-003 -5.3716049248E-005 -2.6747366947E-002
20 3.1068907684E-004 5.3348953665E-004 -4.7934824223E-004 4.4853558686E-004 -6.0300656596E-003
21 2.7080517882E-003 -1.9033626829E-002 8.8615570289E-004 -3.7735646663E-004 -7.4101143501E-004
22 -2.9622921796E-003 -2.4159082408E-002 6.6943323966E-004 1.1154593780E-004 1.5914682394E-004
23 3.2842560830E-003 -6.2612752482E-003 1.5738434272E-004 4.6284599959E-004 4.0588132107E-004
24 1.6971737369E-003 2.4217812563E-002 4.3246402884E-004 9.5059931011E-005 3.5484698283E-004
25 -7.4868993750E-002 -8.7332668698E-002 -6.0147742690E-005 -4.8099146029E-005 1.1509155506E-004
26 -9.3177706949E-002 -2.9315061874E-001 2.1287190612E-004 5.0813661565E-005 2.6955715462E-004
27 -7.0097859908E-002 1.2458191360E-001 -1.2846480258E-003 1.2192486380E-004 4.6853704861E-004
28 -6.9485493530E-002 4.8763866344E-002 7.7223643475E-004 1.3853535883E-004 5.4636752811E-005
29 4.8961381968E-002 -1.5272337445E-001 -8.8648769643E-004 -4.4975303480E-005 5.9586006091E-004
15 16 17 18 19
0 -1.2210501176E-001 1.1478027359E-001 -3.9162145749E-002 -5.1389252158E-002 -1.8288904037E-003
1 1.1380272374E-001 -2.1487588526E-001 6.0165774430E-002 -1.1487007778E-003 -4.5441546655E-002
2 -3.8374694597E-002 5.9953296524E-002 -6.7691825286E-002 -3.8484030260E-002 -7.6800715249E-003
3 -1.2822729286E-002 -1.2805898275E-003 6.3813065178E-003 2.2220841872E-003 7.6991955181E-004
4 2.3864994996E-002 -2.2470892452E-004 -1.3309838494E-002 2.4851560674E-004 -1.8460620529E-003
5 -1.2402212045E-002 -9.2994801153E-005 4.6817064931E-003 3.2958166488E-002 7.0866732024E-004
6 -7.4793278406E-003 -1.6074103229E-003 3.6899979002E-003 3.0392561951E-003 -5.0946020505E-004
7 -3.8770026733E-004 1.7551659565E-003 -8.6155605026E-004 3.9692465089E-004 -4.6038088334E-004
8 3.9551171890E-003 -5.1663991899E-004 -9.1008948343E-005 -1.9220277566E-003 -7.9837924658E-004
9 1.0738350084E-003 1.1072790098E-003 -7.3897453645E-004 1.8057852560E-003 1.4013275714E-004
10 1.1375075076E-003 2.1317640112E-004 2.3050639764E-004 -5.6673414945E-004 1.0005316579E-003
11 -1.2189105982E-003 -2.6367792495E-004 3.4043235164E-004 1.5732522246E-003 1.0407973658E-003
12 9.5708232459E-003 1.4833737759E-002 4.9262816092E-003 -2.6542614308E-003 -2.7122986789E-003
13 -1.3282260152E-002 -2.0115238348E-002 -6.6152067653E-003 1.0491248568E-003 -5.3705750675E-005
14 5.3571028398E-003 7.8462085672E-003 1.1684872139E-003 -1.5676176683E-002 -2.6747374282E-002
15 1.3378635756E-001 -1.2613361119E-001 4.2401828623E-002 -2.6595403473E-003 1.9873360401E-003
16 -1.2613349126E-001 2.3154756121E-001 -6.5778628114E-002 -2.2828335280E-003 1.4601821131E-003
17 4.2401749392E-002 -6.5778591727E-002 6.8187241643E-002 -1.6653902450E-002 2.5505038138E-002
18 -2.6595920073E-003 -2.2828074980E-003 -1.6653942562E-002 5.4855247002E-002 2.4729783529E-003
19 1.9873415121E-003 1.4601899329E-003 2.5505058190E-002 2.4729967206E-003 4.4724663284E-002
20 -3.8366743828E-004 -8.8746730931E-004 -6.4420927497E-003 3.6656962180E-002 8.1224860664E-003
21 9.2845385141E-004 3.6802433505E-004 -9.5040708316E-004 -5.1941208846E-003 -1.2444625713E-004
22 -5.0318487549E-004 1.4342911215E-004 2.8985859503E-004 2.0416113478E-004 9.1951318240E-004
23 7.4036073171E-004 -3.4730013615E-004 -1.3351566400E-004 2.3474188588E-003 1.3102362758E-005
24 -2.7749145090E-004 4.7724454321E-005 5.5527644806E-005 -1.7302886151E-004 -1.7726879169E-004
25 -2.5090250470E-004 2.1741519930E-005 2.7208805916E-004 -2.5982303487E-004 -1.9668228900E-004
26 -1.4489113997E-004 -3.0397727583E-005 2.7239543481E-005 -6.0050637375E-004 -2.9892198193E-005
27 -1.6519482597E-005 1.6435294198E-004 5.0961893634E-005 1.4077278097E-004 -1.9027010603E-005
28 -2.3547595249E-004 7.6124571826E-005 1.0117983985E-004 -1.1534040559E-004 -1.0579685787E-004
29 7.0507166233E-005 1.1552377841E-004 -4.5931305760E-005 -2.0007797315E-004 -1.3505340062E-004
20 21 22 23 24
0 -3.4812101478E-002 2.6911592086E-003 -1.6354152863E-003 -1.4301333227E-003 1.6249964844E-003
1 -7.8382610347E-003 3.7103408229E-004 2.2593110441E-003 -1.7829862164E-003 1.6374435740E-003
2 -3.1205423941E-001 2.8493671639E-002 1.6587990556E-004 -1.1426237591E-002 -7.8189111866E-004
3 -2.8004725758E-003 -4.8903739721E-002 -4.9988134121E-003 1.7100983514E-003 3.7179545055E-003
4 3.9806443322E-004 -5.8790208912E-004 -4.3242458298E-002 -1.2016207108E-002 -1.0472139534E-002
5 -1.8686790048E-002 3.6592865292E-003 -1.6198931842E-002 -3.2057224847E-001 -2.6367531700E-002
6 -6.3919412091E-004 -8.2335246704E-003 8.4576155591E-003 -1.1315054733E-002 -5.8369163532E-004
7 -7.3581915791E-004 7.5646519519E-004 -6.2047477465E-004 3.4823216513E-003 1.1991380964E-003
8 2.3726528036E-003 -2.1824763131E-002 3.3460717579E-002 -5.8262172949E-003 3.4812921433E-003
9 -1.0665296285E-003 -3.5124206435E-003 3.2639684654E-003 2.9530797749E-004 -5.0943824872E-002
10 3.1067613876E-004 2.7079189356E-003 -2.9623459983E-003 3.2841200274E-003 1.6984442797E-003
11 5.3351732140E-004 -1.9033427571E-002 -2.4158940046E-002 -6.2609613281E-003 2.4221378111E-002
12 -4.7937892256E-004 8.8611314755E-004 6.6939922854E-004 1.5740024716E-004 4.3249394082E-004
13 4.4851926804E-004 -3.7736678097E-004 1.1153694999E-004 4.6284806253E-004 9.5077824774E-005
14 -6.0300787410E-003 -7.4096053004E-004 1.5918637627E-004 4.0586523098E-004 3.5485782222E-004
15 -3.8368712363E-004 9.2843754228E-004 -5.0316845184E-004 7.4036906127E-004 -2.7745851356E-004
16 -8.8745240886E-004 3.6801936222E-004 1.4342995270E-004 -3.4729860789E-004 4.7711904531E-005
17 -6.4420819427E-003 -9.5038506002E-004 2.8983698019E-004 -1.3352326563E-004 5.5544671478E-005
18 3.6656852373E-002 -5.1941195232E-003 2.0415783452E-004 2.3474119607E-003 -1.7153048632E-004
19 8.1224361521E-003 -1.2444681834E-004 9.1951236579E-004 1.3097434442E-005 -1.7668019335E-004
20 3.3911554853E-001 2.8652507893E-003 -6.8339696880E-005 3.7476484447E-004 8.3606654277E-004
21 2.8652527558E-003 6.1967615286E-002 -3.2455918220E-003 7.8074203872E-003 -1.5351890960E-003
22 -6.8340068690E-005 -3.2455946984E-003 4.1826230856E-002 6.5337193429E-003 -3.1932674182E-003
23 3.7476336333E-004 7.8073802579E-003 6.5336763366E-003 3.4246747567E-001 -2.2590437719E-005
24 8.3515185725E-004 -1.5351889308E-003 -3.1932682244E-003 -2.2585651674E-005 4.7006835231E-002
25 5.3158843621E-007 1.0652535047E-003 1.4954902777E-003 2.4073368793E-004 1.1954474977E-003
26 5.5963948637E-004 -4.4872582333E-004 -1.4772351943E-003 6.3199701928E-004 -2.1389718034E-002
27 -1.7619372799E-004 9.0741766644E-004 9.8175835796E-004 -2.9459682310E-004 7.2835611826E-004
28 2.5127782091E-004 -9.3298199434E-004 6.8787235133E-005 1.2732690365E-004 7.9688727422E-003
29 2.6201943695E-004 1.7128017387E-004 1.2934748675E-003 3.4008367645E-004 1.9615268308E-002
25 26 27 28 29
0 -4.2035299977E-004 1.0294528397E-003 -7.0032537135E-004 7.4047266192E-004 -3.4678947810E-004
1 1.5264932827E-004 9.3263518942E-004 -3.5205362458E-003 1.9332600101E-003 -2.3027335108E-003
2 -9.8735571502E-004 7.7177183895E-004 -1.6311830663E-003 3.7374078263E-004 -1.1338849320E-003
3 -2.2267753982E-004 4.5631164845E-003 -1.2805227755E-002 3.9967067646E-003 -1.2023590679E-002
4 -5.1722782688E-003 -4.3757731112E-003 -6.5561880794E-003 -6.2274289617E-003 -5.4242286711E-003
5 6.5472637324E-003 -1.1287788747E-002 1.8937046693E-002 -5.0006811267E-003 1.0199602824E-002
6 -5.7685226078E-003 -4.5935456207E-003 6.5591405092E-003 -3.1011377655E-002 -7.9382348181E-004
7 -2.2680665405E-002 -1.9338350120E-003 -2.9972765688E-003 -2.1071947728E-002 3.0156847654E-003
8 -3.3264515239E-002 1.0812126530E-002 -9.1466888768E-003 1.5170890552E-002 3.3044094214E-003
9 4.8928775025E-003 2.3007654009E-002 -2.0026482543E-001 -6.3285758846E-002 1.2554808336E-001
10 -7.4869041758E-002 -9.3178724533E-002 -7.0098856149E-002 -6.9485640501E-002 4.8962839723E-002
11 -8.7330564494E-002 -2.9314613543E-001 1.2458021507E-001 4.8763534298E-002 -1.5272144228E-001
12 -6.0132426168E-005 2.1286995818E-004 -1.2846479090E-003 7.7223667108E-004 -8.8648784383E-004
13 -4.8090893023E-005 5.0813447259E-005 1.2192474211E-004 1.3853537972E-004 -4.4975512069E-005
14 1.1509828375E-004 2.6955725919E-004 4.6853708025E-004 5.4636589826E-005 5.9585997916E-004
15 -2.5088560837E-004 -1.4490239429E-004 -1.6517113547E-005 -2.3547725232E-004 7.0506301073E-005
16 2.1741623849E-005 -3.0396484786E-005 1.6435437640E-004 7.6123660238E-005 1.1552303684E-004
17 2.7209709129E-004 2.7234932342E-005 5.0963084246E-005 1.0117936124E-004 -4.5931984725E-005
18 -2.5882735848E-004 -6.0031848430E-004 1.4070861538E-004 -1.1535910049E-004 -2.0001808065E-004
19 -1.9638025822E-004 -2.9919459983E-005 -1.9047914816E-005 -1.0580143635E-004 -1.3503643634E-004
20 8.4829116415E-007 5.5948891149E-004 -1.7619563318E-004 2.5127749619E-004 2.6202088722E-004
21 1.0652521780E-003 -4.4872868033E-004 9.0739586785E-004 -9.3299673048E-004 1.7126146660E-004
22 1.4954902653E-003 -1.4772362211E-003 9.8175151528E-004 6.8801505444E-005 1.2934673074E-003
23 2.4072903510E-004 6.3199689136E-004 -2.9460500091E-004 1.2731327319E-004 3.4007600115E-004
24 1.1952923145E-003 -2.1389995888E-002 7.2832026293E-004 7.9688600183E-003 1.9615297182E-002
25 9.4289717269E-002 1.0562741426E-001 -1.7552990896E-004 7.0060843371E-003 8.7782610441E-003
26 1.0562750999E-001 3.0308674016E-001 -1.6382699707E-003 -5.5832273099E-003 -1.1726448645E-002
27 -1.7551353029E-004 -1.6382784849E-003 2.0673701256E-001 8.2101212014E-002 -1.3115219203E-001
28 7.0060896795E-003 -5.5832572276E-003 8.2101377926E-002 8.7668224780E-002 -5.4259499038E-002
29 8.7782416309E-003 -1.1726450275E-002 -1.3115216547E-001 -5.4259354736E-002 1.5092602943E-001
This should be a 30x30 matrix and I'm trying:
data = pd.read_fwf('C:/Users/henri/Documents/Projects/Python-Lessons/ORCA/orca.hess',
widths=[9, 19, 19, 19, 19, 19])
But it reads as 185x6. I'd like to ignore the first column (the row numbers 0-29), and I don't need the column indexes (also 0-29) for any mathematical operation. Also, pandas appears to round my numbers, and I'd like to keep the original format.
Here is a snip of my output:
Unnamed: 0 0 1 2 3 4
0 0.0 5.167712e-01 0.000055 0.032484 -0.189017 -0.006716
1 1.0 5.538011e-05 0.561599 -0.001900 -0.014738 -0.072598
2 2.0 3.248692e-02 -0.001900 0.567918 0.072316 0.001501
Any help is much appreciated, guys.
import pandas as pd
filename = 'data'
df = pd.read_fwf(filename, widths=[9, 19, 19, 19, 19, 19])
df = df.rename(columns={'Unnamed: 0':'row'})
df = df.dropna(subset=['row'], how='any')
df['col'] = df.groupby('row').cumcount()
df = df.pivot(index='row', columns='col')
df = df.dropna(how='any', axis=1)
df.columns = range(len(df.columns))
print(df.head())
yields
0 1 2 3 4 5 6 \
row
0.0 0.516771 0.066743 0.003457 -0.122105 -0.034812 -0.000420 0.000055
1.0 0.000055 0.001410 -0.004069 0.113803 -0.007838 0.000153 0.561599
2.0 0.032487 -0.126894 0.001377 -0.038375 -0.312054 -0.000987 -0.001900
3.0 -0.189014 0.047664 -0.118741 -0.012823 -0.002800 -0.000223 -0.014737
4.0 -0.006714 0.018558 -0.033281 0.023865 0.000398 -0.005172 -0.072598
7 8 9 ... 20 21 22 \
row ...
0.0 -0.025292 0.006609 0.114780 ... -0.113527 -0.051389 -0.001430
1.0 0.037536 0.017154 -0.214876 ... -0.228978 -0.001149 -0.001783
2.0 -0.013453 0.004940 0.059953 ... -0.053033 -0.038484 -0.011426
3.0 -0.131811 -0.084587 -0.001281 ... 0.000745 0.002222 0.001710
4.0 0.068311 -0.043605 -0.000225 ... 0.000638 0.000249 -0.012016
23 24 25 26 27 28 29
row
0.0 0.000740 -0.006716 -0.018400 -0.032196 -0.001829 0.001625 -0.000347
1.0 0.001933 -0.072598 -0.021579 -0.053402 -0.045442 0.001637 -0.002303
2.0 0.000374 0.001501 -0.008308 -0.062527 -0.007680 -0.000782 -0.001134
3.0 0.003997 0.031554 -0.218476 0.009665 0.000770 0.003718 -0.012024
4.0 -0.006227 0.273181 -0.073414 0.015382 -0.001846 -0.010472 -0.005424
[5 rows x 30 columns]
After parsing the file with
df = pd.read_fwf(filename, widths=[9, 19, 19, 19, 19, 19])
df = df.rename(columns={'Unnamed: 0':'row'})
the column headers can be identified by having a df['row'] value of NaN.
So they can be removed with
df = df.dropna(subset=['row'], how='any')
Now the row numbers keep repeating from 0 to 29. If we group by the row
value, then we can assign an intra-group "cumulative count" to the rows within
each group. That is, the first row of the group gets assigned the value 0, the
next row 1, etc. -- within that group -- and the process is repeated for each
group.
df['col'] = df.groupby('row').cumcount()
# row 0 1 2 3 4 col
# 0 0.0 5.167712e-01 0.000055 0.032484 -0.189017 -0.006716 0
# 1 1.0 5.538011e-05 0.561599 -0.001900 -0.014738 -0.072598 0
# 2 2.0 3.248692e-02 -0.001900 0.567918 0.072316 0.001501 0
# ...
# 182 27.0 -1.755135e-04 -0.001638 0.206737 0.082101 -0.131152 5
# 183 28.0 7.006090e-03 -0.005583 0.082101 0.087668 -0.054259 5
# 184 29.0 8.778242e-03 -0.011726 -0.131152 -0.054259 0.150926 5
Now the desired DataFrame can be obtained by pivoting:
df = df.pivot(index='row', columns='col')
and relabeling the columns:
df.columns = range(len(df.columns))
A more NumPy-based approach might look like this:
import numpy as np
import pandas as pd
filename = 'data'
df = pd.read_csv(filename, sep=r'\s+')
arr = df.values
N = df.index.max()+1
arr = np.delete(arr, np.arange(N, len(arr), N+1), axis=0)
chunks = np.split(arr, np.arange(N, len(arr), N))
result = pd.DataFrame(np.hstack(chunks)).dropna(axis=1)
print(result)
This will also work for any sized matrix.
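The stitching idea can also be sketched in plain Python, without pandas: skip the lines that hold only column numbers, and append every data line's values to the row named by its first field. The function name `parse_block_matrix` and the small 3x3 sample are assumptions made for illustration; the sketch relies on header lines containing nothing but integers.

```python
def parse_block_matrix(text):
    """Rebuild a matrix printed in column blocks; returns a list of rows."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if all(field.isdigit() for field in fields):
            continue  # a header line holding only column numbers
        row_number = int(fields[0])
        values = [float(field) for field in fields[1:]]
        # Later blocks extend the same row further to the right
        rows.setdefault(row_number, []).extend(values)
    return [rows[number] for number in sorted(rows)]


# Hypothetical 3x3 matrix printed as a two-column block and a one-column block
sample = """
        0                   1
0   1.0000000000E+000   2.5000000000E-001
1   2.5000000000E-001   3.0000000000E+000
2  -5.0000000000E-001   7.5000000000E-001
        2
0  -5.0000000000E-001
1   7.5000000000E-001
2   4.0000000000E+000
"""
matrix = parse_block_matrix(sample)
print(matrix)
```

The values are parsed as full double-precision floats. The shortened numbers in a pandas printout reflect display precision only, and a format such as `'{:.10E}'.format(x)` prints ten decimals in scientific notation.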
I have an existing dataframe that is sorted like this:
In [3]: result_GB_daily_average
Out[3]:
NREL Avert
Month Day
1 1 14.718417 37.250000
2 40.381167 45.250000
3 42.512646 40.666667
4 12.166896 31.583333
5 14.583208 50.416667
6 34.238000 45.333333
7 45.581229 29.125000
8 60.548479 27.916667
9 48.061583 34.041667
10 20.606958 37.583333
11 5.418833 70.833333
12 51.261375 43.208333
13 21.796771 42.541667
14 27.118979 41.958333
15 8.230542 43.625000
16 14.233958 48.708333
17 28.345875 51.125000
18 43.896375 55.500000
19 95.800542 44.500000
20 53.763104 39.958333
21 26.171437 50.958333
22 20.372688 66.916667
23 20.594042 42.541667
24 16.889083 48.083333
25 16.416479 42.125000
26 28.459625 40.125000
27 1.055229 49.833333
28 36.798792 42.791667
29 27.260083 47.041667
30 23.584917 55.750000
... ... ...
12 2 34.491604 55.916667
3 26.444333 53.458333
4 15.088333 45.000000
5 10.213500 32.083333
6 19.087688 17.000000
7 23.078292 17.375000
8 41.523667 29.458333
9 17.173854 37.833333
10 11.488687 52.541667
11 15.203479 30.000000
12 8.390917 37.666667
13 70.067062 23.458333
14 24.281729 25.583333
15 31.826104 33.458333
16 5.085271 42.916667
17 3.778229 46.916667
18 31.276958 57.625000
19 7.399458 46.916667
20 18.531958 39.291667
21 26.831937 35.958333
22 55.514000 32.375000
23 24.018875 34.041667
24 54.454125 43.083333
25 57.379812 25.250000
26 94.520833 33.958333
27 49.693854 27.500000
28 2.406438 46.916667
29 7.133833 53.916667
30 7.829167 51.500000
31 5.584646 55.791667
I would like to split this dataframe into 12 different data frames, one for each month. The problem is that they all have slightly different lengths because the number of days in a month varies, so my attempts at using np.array_split have failed. How can I split this based on the Month index?
One solution:
df=result_GB_daily_average
[df.iloc[df.index.get_level_values('Month')==i+1] for i in range(12)]
or, shorter (.loc selects on the Month labels, which run from 1 to 12):
[df.loc[i] for i in range(1, 13)]
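Another option, sketched here under the assumption that the index levels are named `Month` and `Day` as in the printout, is to let `groupby` do the splitting, which copes with months of different lengths. The small frame below is a made-up stand-in for `result_GB_daily_average`.

```python
import pandas as pd

# Small stand-in for result_GB_daily_average (values are made up)
index = pd.MultiIndex.from_tuples(
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)], names=["Month", "Day"]
)
df = pd.DataFrame(
    {
        "NREL": [14.7, 40.4, 42.5, 12.2, 14.6],
        "Avert": [37.3, 45.3, 40.7, 31.6, 50.4],
    },
    index=index,
)

# One DataFrame per month, keyed by the month number
by_month = {month: group for month, group in df.groupby(level="Month")}
print({month: len(group) for month, group in by_month.items()})  # prints {1: 3, 2: 2}
```

Each group keeps the full (Month, Day) index, so `by_month[1]` can be used like a slice of the original frame.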
In index.csv, the fourth column holds ten numbers ranging from 1 to 5. Each number is an index that corresponds to a row of numbers in filename.csv.
The row number in filename.csv is the index, and each row has three numbers. My question is how to use a nested loop to copy the numbers from filename.csv into index.csv.
from numpy import genfromtxt
import numpy as np
import csv
data1 = genfromtxt('filename.csv', delimiter=',')
data2 = genfromtxt('index.csv', delimiter=',')
f = open('index.csv','wb')
write = csv.writer(f, delimiter=',',quoting=csv.QUOTE_ALL)
for row in data2:
for ch_row in data1:
if ( data2[row,3] == ch_row ):
write.writerow(data1[data2[row,3],:])
For example, the fourth column of index.csv contains 1,2,5,3,4,1,4,5,2,3 and filename.csv contains:
# filename.csv
20 30 50
70 60 45
35 26 77
93 37 68
13 08 55
What I need is to write the indexed row from filename.csv to index.csv, storing these numbers in the 5th, 6th and 7th columns:
# index.csv
# 4 5 6 7
... 1 20 30 50
... 2 70 60 45
... 5 13 08 55
... 3 35 26 77
... 4 93 37 68
... 1 20 30 50
... 4 93 37 68
... 5 13 08 55
... 2 70 60 45
... 3 35 26 77
Can anyone help me solve this problem?
You need to indent your last 2 lines. Also, it looks like you are writing to the file from which you are reading.
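A nested loop is not strictly needed, because the index value can address the row directly. The sketch below assumes comma-delimited files and 1-based indices in the fourth column, and writes to a separate output instead of the file being read. The helper name `append_indexed_rows` and the placeholder values `a,b,c` for the first three columns (elided in the question) are hypothetical.

```python
import csv
import io


def append_indexed_rows(index_rows, lookup_rows, column=3):
    """Append to each index row the lookup row named by its 1-based index."""
    result = []
    for row in index_rows:
        position = int(row[column]) - 1  # the file uses 1-based indices
        result.append(list(row) + list(lookup_rows[position]))
    return result


# In-memory stand-ins for filename.csv and index.csv
lookup_file = io.StringIO("20,30,50\n70,60,45\n35,26,77\n93,37,68\n13,08,55\n")
index_file = io.StringIO("a,b,c,1\na,b,c,2\na,b,c,5\n")

lookup_rows = list(csv.reader(lookup_file))
index_rows = list(csv.reader(index_file))
combined = append_indexed_rows(index_rows, lookup_rows)

# Write to a separate output rather than the file being read
output_file = io.StringIO()
csv.writer(output_file).writerows(combined)
print(output_file.getvalue())
```

With real files, the `io.StringIO` objects would be replaced by `open(..., newline='')` handles, with the output going to a new file name.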