How to add more data to an existing plotly graph? - python

I have successfully plotted the below data using plotly from an Excel file.
Here is my code:
import pandas as pd
import plotly.express as px
file_loc1 = "AgeGroupData_time_to_treatment.xlsx"
df_centroid_Coord = pd.read_excel(file_loc1, index_col=None, na_values=['NA'], usecols="C:D,AB")
df_centroid_Coord.head()
df_centroid_Coord['Ambulance_Treatment_Time'] = df_centroid_Coord['Base_TT']
fig = px.scatter(df_centroid_Coord, x="x", y="y",
                 title="Southern Region Centroids",
                 color='Ambulance_Treatment_Time',
                 hover_name="KnNamn",
                 hover_data=['Ambulance_Treatment_Time', "TotPop"],
                 log_x=True, size_max=60,
                 color_continuous_scale='Reds', range_color=(0.5,2), width=1250, height=1000)
fig.update_traces(marker={'size': 8, 'symbol': 1})
#fig.update_traces(marker={'symbol': 1})
fig.update_layout(paper_bgcolor="LightSteelBlue")
fig.show()
The plotted data points are drawn as squares.
Now I want to plot more data points, as circles or any other shape, on the same plotly graph by reading another Excel file. Please have a look at the data below.
How can I add the new data to the existing plotly graph?
Map data with total population and treatment time (Base_TT):
ID KnNamn x y TotPop Base_TT
1 2 Växjö 14.662290 57.027520 9 1.599971
2 3 Bromölla 14.494072 56.065635 264 1.307165
3 4 Trelleborg 13.219968 55.478675 40 1.411554
4 5 Tomelilla 14.005013 55.721209 6 1.968138
5 6 Halmstad 12.737361 56.710973 386 1.309849
6 7 Alvesta 14.566685 56.748729 47 1.719117
7 8 Laholm 13.241388 56.413591 0 2.000620
8 9 Tingsryd 14.943081 56.542837 16 1.668725
9 10 Sölvesborg 14.574474 56.056953 1147 1.266862
10 11 Halmstad 13.068009 56.635666 38 1.589239
11 12 Tingsryd 14.699642 56.479597 3 1.960050
12 13 Vellinge 13.029769 55.484749 61 1.254957
13 14 Örkelljunga 13.169010 56.232819 12 1.429789
14 15 Svalöv 13.059068 55.853696 26 1.553722
15 16 Sjöbo 13.738205 55.601936 6 1.326429
16 17 Hässleholm 13.729872 56.347672 13 1.709021
17 18 Olofström 14.588037 56.290604 6 1.444833
18 19 Eslöv 13.168712 55.900311 3 1.527547
19 20 Ronneby 15.024222 56.273317 3 1.692005
20 21 Ängelholm 12.910101 56.246689 19 1.090544
Ambulance Data:
ID Ambulance station name Longtitude Latitude
0 1 Älmhult 14.128734 56.547992
1 2 Ängelholm 12.870739 56.242114
2 3 Alvesta 14.549503 56.920740
3 4 Östra Ljungby 13.057450 56.188099
4 5 Broby 14.080958 56.254481
5 6 Bromölla 14.466869 56.072272
6 7 Förslöv 12.814913 56.350098
7 9 Hässleholm 13.778234 56.161536
8 10 Höganäs 12.556995 56.206016
9 11 Hörby 13.643265 55.849811
10 12 Halmstad, Väster 12.819960 56.674306
11 13 Halmstad, Öster 12.882289 56.676871
12 14 Helsingborg 12.738642 56.084708
13 15 Hyltebruk 13.238277 56.993058
14 16 Karlshamn 14.854022 56.186596
15 17 Karlskrona 15.606300 56.183054
16 18 Kristianstad 14.171371 56.031201
17 20 Löddeköpinge 12.995037 55.766946
18 21 Laholm 13.033763 56.498955
19 22 Landskrona 12.867245 55.872659
20 23 Lenhovda 15.283913 57.001953
21 24 Lessebo 15.267357 56.756860
22 25 Ljungby 13.935399 56.835023
23 26 Lund 13.226607 55.695212
24 27 Markaryd 13.591491 56.452057
25 28 Olofström 14.545848 56.272221
26 29 Osby 13.983674 56.384833
27 30 Perstorp 13.388304 56.130752
28 31 Ronneby 15.280554 56.211863
29 32 Sölvesborg 14.570503 56.052113
30 33 Simrishamn 14.338632 55.552765
Merged Dataset for plotting
KnNamn x y TotPop Base_TT Ambulance station name Longtitude Latitude
Växjö 14.66229 57.02752 9 1.599971 Ängelholm 12.87074 56.24211
Bromölla 14.49407 56.06564 264 1.307165 Alvesta 14.5495 56.92074
Trelleborg 13.21997 55.47868 40 1.411554 Östra Ljungby 13.05745 56.1881
Tomelilla 14.00501 55.72121 6 1.968138 Broby 14.08096 56.25448
Halmstad 12.73736 56.71097 386 1.309849
Alvesta 14.56669 56.74873 47 1.719117
Laholm 13.24139 56.41359 0 2.00062
Tingsryd 14.94308 56.54284 16 1.668725

If the two datasets hold the same kind of data under different column names, rename the columns of one to match the other before charting.
To add a second layer, reuse the trace created by plotly.express in a graph_objects figure: first add the chart that was already completed, then add a scatter trace built from the longitude and latitude columns. The station names and locations are drawn with a scatter trace in markers+text mode.
import plotly.express as px
import plotly.graph_objects as go
# df_station is assumed to hold the ambulance data shown in the question
df_station.rename(columns={'Longtitude':'x', 'Latitude':'y'}, inplace=True)
df_centroid_Coord['Ambulance_Treatment_Time'] = df_centroid_Coord['Base_TT']
sca = px.scatter(df_centroid_Coord, x="x", y="y",
                 title="Southern Region Centroids",
                 color='Ambulance_Treatment_Time',
                 hover_name="KnNamn",
                 #hover_data= ['Ambulance_Treatment_Time', "TotPop"],
                 log_x=True,
                 size_max=60,
                 color_continuous_scale='Reds',
                 range_color=(0.5,2),
                 )
sca.update_traces(marker={'size': 8, 'symbol': 1})
fig = go.Figure()
# First layer: the trace already built by plotly.express
fig.add_trace(go.Scatter(sca.data[0]))
# Second layer: ambulance stations as labelled markers
fig.add_trace(go.Scatter(x=df_station['x'],
                         y=df_station['y'],
                         mode='markers+text',
                         text=df_station['Ambulance station name'],
                         textposition='top center',
                         showlegend=False,
                         marker=dict(
                             size=5,
                             symbol=2,
                             color='blue'
                         )
                         )
              )
#fig.update_traces(marker={'symbol': 1})
fig.update_layout(width=625, height=500, paper_bgcolor="LightSteelBlue")
fig.show()

Related

Sort rows of curve shaped data in python

I have a dataset that consists of 5 rows that are formed like a curve. I want to separate the inner row from the other or if possible each row and store them in a separate array. Is there any way to do this, like somehow flatten the curved data and sorting it afterwards based on the x and y values?
I would like to assign each row from left to right numbers from 0 to the max of the row. Right now the labels for each dot are not useful for me and I can't change the labels.
Here are the first 50 data points of my data set:
x y
0 -6.4165 0.3716
1 -4.0227 2.63
2 -7.206 3.0652
3 -3.2584 -0.0392
4 -0.7565 2.1039
5 -0.0498 -0.5159
6 2.363 1.5329
7 -10.7253 3.4654
8 -8.0621 5.9083
9 -4.6328 5.3028
10 -1.4237 4.8455
11 1.8047 4.2297
12 4.8147 3.6074
13 -5.3504 8.1889
14 -1.7743 7.6165
15 1.1783 6.9698
16 4.3471 6.2411
17 7.4067 5.5988
18 -2.6037 10.4623
19 0.8613 9.7628
20 3.8054 9.0202
21 7.023 8.1962
22 9.9776 7.5563
23 0.1733 12.6547
24 3.7137 11.9097
25 6.4672 10.9363
26 9.6489 10.1246
27 12.5674 9.3369
28 3.2124 14.7492
29 6.4983 13.7562
30 9.2606 12.7241
31 12.4003 11.878
32 15.3578 11.0027
33 6.3128 16.7014
34 9.7676 15.6557
35 12.2103 14.4967
36 15.3182 13.5166
37 18.2495 12.5836
38 9.3947 18.5506
39 12.496 17.2993
40 15.3987 16.2716
41 18.2212 15.1871
42 21.1241 14.0893
43 12.3548 20.2538
44 15.3682 18.9439
45 18.357 17.8862
46 21.0834 16.6258
47 23.9992 15.4145
48 15.3776 21.9402
49 18.3568 20.5803
50 21.1733 19.3041
It seems that your curves follow a pattern, so you can select the curve of interest using slicing. I had to offset the selection slightly to get the five curves, because the first 8 points are not in the same order as the rest of the data. The initial 8 data points are therefore discarded, but they can be added back in afterwards if required.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({ 'x': [-6.4165, -4.0227, -7.206, -3.2584, -0.7565, -0.0498, 2.363, -10.7253, -8.0621, -4.6328, -1.4237, 1.8047, 4.8147, -5.3504, -1.7743, 1.1783, 4.3471, 7.4067, -2.6037, 0.8613, 3.8054, 7.023, 9.9776, 0.1733, 3.7137, 6.4672, 9.6489, 12.5674, 3.2124, 6.4983, 9.2606, 12.4003, 15.3578, 6.3128, 9.7676, 12.2103, 15.3182, 18.2495, 9.3947, 12.496, 15.3987, 18.2212, 21.1241, 12.3548, 15.3682, 18.357, 21.0834, 23.9992, 15.3776, 18.3568, 21.1733],
'y': [0.3716, 2.63, 3.0652, -0.0392, 2.1039, -0.5159, 1.5329, 3.4654, 5.9083, 5.3028, 4.8455, 4.2297, 3.6074, 8.1889, 7.6165, 6.9698, 6.2411, 5.5988, 10.4623, 9.7628, 9.0202, 8.1962, 7.5563, 12.6547, 11.9097, 10.9363, 10.1246, 9.3369, 14.7492, 13.7562, 12.7241, 11.878, 11.0027, 16.7014, 15.6557, 14.4967, 13.5166, 12.5836, 18.5506, 17.2993, 16.2716, 15.1871, 14.0893, 20.2538, 18.9439, 17.8862, 16.6258, 15.4145, 21.9402, 20.5803, 19.3041]})
# Generate the 5 dataframes
df_list = [df.iloc[i+8::5, :] for i in range(5)]
# Generate the plot
fig = plt.figure()
for frame in df_list:
    plt.scatter(frame['x'], frame['y'])
plt.show()
# Print the data of the innermost curve
print(df_list[4])
OUTPUT:
The 5th dataframe df_list[4] contains the data of the innermost plot.
x y
12 4.8147 3.6074
17 7.4067 5.5988
22 9.9776 7.5563
27 12.5674 9.3369
32 15.3578 11.0027
37 18.2495 12.5836
42 21.1241 14.0893
47 23.9992 15.4145
You can then add the missing data like this:
# Retrieve the two missing points of the inner curve
inner_curve = pd.concat([df_list[4], df[5:7]]).sort_index(ascending=True)
print(inner_curve)
# Plot the inner curve only
fig2 = plt.figure()
plt.scatter(inner_curve['x'], inner_curve['y'], color = '#9467BD')
plt.show()
OUTPUT: inner curve
x y
5 -0.0498 -0.5159
6 2.3630 1.5329
12 4.8147 3.6074
17 7.4067 5.5988
22 9.9776 7.5563
27 12.5674 9.3369
32 15.3578 11.0027
37 18.2495 12.5836
42 21.1241 14.0893
47 23.9992 15.4145
Complete Inner Curve
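If you also want the 0-to-max numbering along each curve that the question asks for, the same index pattern can be turned into labels. This is only a sketch under the same assumption as the slicing above (from index 8 onward the points cycle through the 5 curves in a fixed order); label_rows, n_rows and offset are names made up for this example, and the frame holds the first 18 points from the question.

```python
import numpy as np
import pandas as pd

def label_rows(df, n_rows=5, offset=8):
    """Label points by curve and by position along that curve.

    Assumes that, from index `offset` onward, the points cycle through
    the `n_rows` curves in a fixed order (as in the sample data).
    """
    regular = df.iloc[offset:].copy()
    k = np.arange(len(regular))
    regular["curve"] = k % n_rows       # which curve the point lies on
    regular["position"] = k // n_rows   # 0, 1, 2, ... along that curve
    return regular

# First 18 points of the question's data
df = pd.DataFrame({
    "x": [-6.4165, -4.0227, -7.206, -3.2584, -0.7565, -0.0498, 2.363,
          -10.7253, -8.0621, -4.6328, -1.4237, 1.8047, 4.8147,
          -5.3504, -1.7743, 1.1783, 4.3471, 7.4067],
    "y": [0.3716, 2.63, 3.0652, -0.0392, 2.1039, -0.5159, 1.5329,
          3.4654, 5.9083, 5.3028, 4.8455, 4.2297, 3.6074,
          8.1889, 7.6165, 6.9698, 6.2411, 5.5988],
})

labeled = label_rows(df)
inner = labeled[labeled["curve"] == 4]   # same points as df_list[4]
```

The irregular first points are still left out here and would need to be labelled by hand, as in the concat step above.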

Seaborn figure with multiple axis (year) and month on x-axis

I am trying to get familiar with seaborn. I want to create one or both of these figures (a bar plot and a line plot). They show the 12 months on the x-axis and 3 years, each with its own line or bar color.
That is the data creating script including the data in comments.
#!/usr/bin/env python3
import random as rd
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
rd.seed(0)
a = pd.DataFrame({
'Y': [2016]*12 + [2017]*12 + [2018]*12,
'M': list(range(1, 13)) * 3,
'n': rd.choices(range(100), k=36)
})
print(a)
# Y M n
# 0 2016 1 84
# 1 2016 2 75
# 2 2016 3 42
# ...
# 21 2017 10 72
# 22 2017 11 89
# 23 2017 12 68
# 24 2018 1 47
# 25 2018 2 10
# ...
# 34 2018 11 54
# 35 2018 12 1
b = a.pivot_table(columns='M', index='Y')
print(b)
# n
# M 1 2 3 4 5 6 7 8 9 10 11 12
# Y
# 2016 84 75 42 25 51 40 78 30 47 58 90 50
# 2017 28 75 61 25 90 98 81 90 31 72 89 68
# 2018 47 10 43 61 91 96 47 86 26 80 54 1
I'm not even sure which form of the dataframe (a, b, or something else) I should use here.
What I tried
I assume that in seaborn terms it is a countplot() I want. Maybe I am wrong?
>>> sns.countplot(data=a)
<AxesSubplot:ylabel='count'>
>>> plt.show()
The result is meaningless.
I don't know how I could add the pivoted dataframe b to seaborn.
You could do the first plot with a relplot, using hue as a categorical grouping variable:
sns.relplot(data=a, x='M', y='n', hue='Y', kind='line')
I'd use these colour and size settings to make it more similar to the plot you wanted:
sns.relplot(data=a, x='M', y='n', hue='Y', kind='line', palette='pastel', height=3, aspect=3)
The equivalent axes-level code would be sns.lineplot(data=a, x='M', y='n', hue='Y', palette='pastel')
Your second can be done with catplot:
sns.catplot(kind='bar', data=a, x='M', y='n', hue='Y')
Or the axes-level function sns.barplot. In that case let's move the default legend location:
sns.barplot(data=a, x='M', y='n', hue='Y')
plt.legend(bbox_to_anchor=(1.05, 1))
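On the question of which form of the dataframe to use: the hue-based calls above expect the long form (a), and a pivoted table can be converted back to it with melt. The sketch below uses a small hand-written frame in place of the random data (the six values are the first three months of 2016 and 2017 from the printed table); wide and long are just illustrative names.

```python
import pandas as pd

# Small deterministic stand-in for the question's frame `a`
a = pd.DataFrame({
    "Y": [2016] * 3 + [2017] * 3,
    "M": [1, 2, 3] * 2,
    "n": [84, 75, 42, 28, 75, 61],
})

# Wide form: one row per month, one column per year
wide = a.pivot(index="M", columns="Y", values="n")

# Back to long form: one row per (month, year) observation
long = wide.reset_index().melt(id_vars="M", var_name="Y", value_name="n")
print(long)
```

As far as I know, sns.lineplot(data=wide) also accepts the wide form directly and draws one line per column, but the long form is the one that works with x, y and hue across the plotting functions shown above.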

Is it possible to generate a clock chart using Plotly?

I'm developing a dataviz project and I came across the report generated by Last.FM, in which there is a clock chart to represent the distribution of records by hours.
The chart in question is this:
It is an interactive graph, so I tried to use the Plotly library to try to replicate the chart, but without success.
Is there any way to replicate this in Plotly? Here are the data I need to represent
listeningHour = df.hour.value_counts().rename_axis('hour').reset_index(name='counts')
listeningHour
hour counts
0 17 16874
1 18 16703
2 16 14741
3 19 14525
4 23 14440
5 22 13455
6 20 13119
7 21 12766
8 14 11605
9 13 11575
10 15 11491
11 0 10220
12 12 7793
13 1 6057
14 9 3774
15 11 3476
16 10 1674
17 8 1626
18 2 1519
19 3 588
20 6 500
21 7 163
22 4 157
23 5 26
The closest chart type that Plotly provides is the polar bar chart (Barpolar), and I have written code that uses it with your data. As far as I could find, there is no way to place the tick labels inside the doughnut hole. The key points of the code are that the angular axis starts at 0:00 and runs clockwise, and that the clock labels are a list of 24 tick texts that are empty except for a label every 6 hours. The bar centers are offset by 7.5 degrees so that each bar sits between two angular grid lines.
import numpy as np
import plotly.graph_objects as go
# Assumption: listeningHour is the hour/counts frame from the question.
# Sort it by hour so that bar i corresponds to hour i.
df = listeningHour.sort_values('hour').reset_index(drop=True)
r = df['counts'].tolist()
# 24 bar centers: 7.5, 22.5, ..., 352.5 degrees
theta = np.arange(7.5, 360, 15)
width = [15]*24
# Label only every 6th hour; the raw f-string keeps the LaTeX backslash intact
ticktexts = [rf'$\large{i}$' if i % 6 == 0 else '' for i in np.arange(24)]
fig = go.Figure(go.Barpolar(
    r=r,
    theta=theta,
    width=width,
    marker_color=df['counts'],
    marker_colorscale='Blues',
    marker_line_color="white",
    marker_line_width=2,
    opacity=0.8
))
fig.update_layout(
    template=None,
    polar=dict(
        hole=0.4,
        bgcolor='rgb(223, 223,223)',
        radialaxis=dict(
            showticklabels=False,
            ticks='',
            linewidth=2,
            linecolor='white',
            showgrid=False,
        ),
        angularaxis=dict(
            tickvals=np.arange(0,360,15),
            ticktext=ticktexts,
            showline=True,
            direction='clockwise',
            period=24,
            linecolor='white',
            gridcolor='white',
            showticklabels=True,
            ticks=''
        )
    )
)
fig.show()
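One detail worth checking: the bars are drawn in the order of r, so the counts have to be in hour order, whereas value_counts() returns them sorted by frequency. A small helper along the following lines could prepare the inputs; hourly_bars is a hypothetical name, and filling missing hours with 0 is my own assumption about how gaps should be handled.

```python
import numpy as np
import pandas as pd

def hourly_bars(counts_by_hour):
    """Return (r, theta, width) for a 24-slot polar bar chart.

    `counts_by_hour` has 'hour' and 'counts' columns in any order;
    hours that never occur are filled with 0.
    """
    s = (counts_by_hour.set_index("hour")["counts"]
         .reindex(range(24), fill_value=0))
    width = 360 / 24                            # 15 degrees per hour
    theta = np.arange(24) * width + width / 2   # bar centers: 7.5, 22.5, ...
    return s.tolist(), theta, [width] * 24

# Three rows from the question, deliberately not in hour order
sample = pd.DataFrame({"hour": [17, 0, 5], "counts": [16874, 10220, 26]})
r, theta, width = hourly_bars(sample)
```

The returned r, theta and width can then be passed to go.Barpolar in place of the hand-built lists.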

Converting Basemap to Cartopy, are there equivalent functions such as Basemap's shiftgrid()?

I am converting an application that uses matplotlib's toolkit Basemap to using Cartopy in preparation for moving from Python 2 to Python 3.
I have found similar functions in Cartopy for Basemap's 'addcyclic()' and 'maskoceans()',
However I cannot find something similar in either numpy or Cartopy for Basemap's shiftgrid() function.
This is the code using Basemap:
'''
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import cartopy
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import numpy as np
from mpl_toolkits.basemap import shiftgrid
bmap = Basemap(projection='ortho', lat_0=0, lon_0=0)
lons = np.arange(30, 410, 30)
lons[1] = 70
lats = np.arange(0, 100, 10)
data = np.indices((lats.shape[0], lons.shape[0]))
data = data[0] + data[1]
data, lons = shiftgrid(180., data, lons, start=False)
llons, llats = np.meshgrid(lons, lats)
x, y = bmap(llons, llats)
bmap.contourf(x, y, data)
bmap.drawcoastlines()
'''
The initial data:
data
'''
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12]
[ 1 2 3 4 5 6 7 8 9 10 11 12 13]
[ 2 3 4 5 6 7 8 9 10 11 12 13 14]
[ 3 4 5 6 7 8 9 10 11 12 13 14 15]
[ 4 5 6 7 8 9 10 11 12 13 14 15 16]
[ 5 6 7 8 9 10 11 12 13 14 15 16 17]
[ 6 7 8 9 10 11 12 13 14 15 16 17 18]
[ 7 8 9 10 11 12 13 14 15 16 17 18 19]
[ 8 9 10 11 12 13 14 15 16 17 18 19 20]
[ 9 10 11 12 13 14 15 16 17 18 19 20 21]]
lons
[ 30 70 90 120 150 180 210 240 270 300 330 360 390]
After the 'data, lons = shiftgrid(180., data, lons, start=False)':
data
[[ 5 6 7 8 9 10 11 12 1 2 3 4 5]
[ 6 7 8 9 10 11 12 13 2 3 4 5 6]
[ 7 8 9 10 11 12 13 14 3 4 5 6 7]
[ 8 9 10 11 12 13 14 15 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 5 6 7 8 9]
[10 11 12 13 14 15 16 17 6 7 8 9 10]
[11 12 13 14 15 16 17 18 7 8 9 10 11]
[12 13 14 15 16 17 18 19 8 9 10 11 12]
[13 14 15 16 17 18 19 20 9 10 11 12 13]
[14 15 16 17 18 19 20 21 10 11 12 13 14]]
lons
[-180 -150 -120 -90 -60 -30 0 30 70 90 120 150 180]
'''
I have tried the following cartopy code to recreate what the Basemap shiftgrid did.
This is the Cartopy code, some things are commented out as I tried them at one time:
'''
DATA_CRS = ccrs.PlateCarree()
lons = np.arange(30, 410, 30)
lons[1] = 70
lats = np.arange(0, 100, 10)
data = np.indices((lats.shape[0], lons.shape[0]))
data = data[0] + data[1]
# data2 = np.roll(data, -5)
# lons2 = np.mod(lons2 - 180.0, 360.0) - 180.0
cm_lon = 0
#llons, llats = np.meshgrid(lons2, lats)
llons, llats = np.meshgrid(lons, lats)
PROJECTION = ccrs.Orthographic(central_longitude=cm_lon)
fig1 = plt.figure(num=1, figsize=(11, 8.5), dpi=150)
ax = plt.axes(projection=PROJECTION)
ax.add_feature(cfeature.COASTLINE, linewidths=0.7)
ax.add_feature(cfeature.BORDERS, edgecolor='black', linewidths=0.7)
ax.contourf(llons, llats, data, transform=ccrs.PlateCarree())
'''
I left the data and the longitudes as they were originally and only set 'central_longitude' in the projection.
The Basemap image shows the entire globe but the Cartopy image only shows from the equator up.
The color of the data seems similar except for the far right side, so I'm concerned the data didn't map the same in Cartopy as it did in Basemap.
So, the question is... Is there anything equivalent to Basemap's shiftgrid() or do I need to figure out something similar to Basemap's shiftgrid() or just use the 'central_longitude' in the projection?
I don't seem to be able to paste the .png files.
Any help is really appreciated.
I have searched the web looking for equivalent functions but haven't found one for the shiftgrid().
Thank you.
I'm not aware of any shiftgrid equivalent. It may be worth opening an issue over on the CartoPy issue tracker requesting such a feature. It would help in doing so to mention a solid use case to help drive the functionality.
This must be the most inelegant solution, but what I have been doing for several of Basemap's useful features that are not (yet?) in cartopy, is just to copy the function definitions from Basemap's source code. It works fine. For example, shiftgrid:
import numpy as np
import numpy.ma as ma

def shiftgrid(lon0,datain,lonsin,start=True,cyclic=360.0):
    """
    Shift global lat/lon grid east or west.
    .. tabularcolumns:: |l|L|
    ==============   ====================================================
    Arguments        Description
    ==============   ====================================================
    lon0             starting longitude for shifted grid
                     (ending longitude if start=False). lon0 must be on
                     input grid (within the range of lonsin).
    datain           original data with longitude the right-most
                     dimension.
    lonsin           original longitudes.
    ==============   ====================================================
    .. tabularcolumns:: |l|L|
    ==============   ====================================================
    Keywords         Description
    ==============   ====================================================
    start            if True, lon0 represents the starting longitude
                     of the new grid. if False, lon0 is the ending
                     longitude. Default True.
    cyclic           width of periodic domain (default 360)
    ==============   ====================================================
    returns ``dataout,lonsout`` (data and longitudes on shifted grid).
    """
    if np.fabs(lonsin[-1]-lonsin[0]-cyclic) > 1.e-4:
        # Use all data instead of raise ValueError, 'cyclic point not included'
        start_idx = 0
    else:
        # If cyclic, remove the duplicate point
        start_idx = 1
    if lon0 < lonsin[0] or lon0 > lonsin[-1]:
        raise ValueError('lon0 outside of range of lonsin')
    i0 = np.argmin(np.fabs(lonsin-lon0))
    i0_shift = len(lonsin)-i0
    if ma.isMA(datain):
        dataout = ma.zeros(datain.shape,datain.dtype)
    else:
        dataout = np.zeros(datain.shape,datain.dtype)
    if ma.isMA(lonsin):
        lonsout = ma.zeros(lonsin.shape,lonsin.dtype)
    else:
        lonsout = np.zeros(lonsin.shape,lonsin.dtype)
    if start:
        lonsout[0:i0_shift] = lonsin[i0:]
    else:
        lonsout[0:i0_shift] = lonsin[i0:]-cyclic
    dataout[...,0:i0_shift] = datain[...,i0:]
    if start:
        lonsout[i0_shift:] = lonsin[start_idx:i0+start_idx]+cyclic
    else:
        lonsout[i0_shift:] = lonsin[start_idx:i0+start_idx]
    dataout[...,i0_shift:] = datain[...,start_idx:i0+start_idx]
    return dataout,lonsout
I found the shiftgrid function in Basemap's source code.
You can define it as a standalone function and call it alongside Cartopy.
import numpy as np
import numpy.ma as ma

def shiftgrid(lon0,datain,lonsin,start=True,cyclic=360.0):
    """
    Shift global lat/lon grid east or west.
    .. tabularcolumns:: |l|L|
    ==============   ====================================================
    Arguments        Description
    ==============   ====================================================
    lon0             starting longitude for shifted grid
                     (ending longitude if start=False). lon0 must be on
                     input grid (within the range of lonsin).
    datain           original data with longitude the right-most
                     dimension.
    lonsin           original longitudes.
    ==============   ====================================================
    .. tabularcolumns:: |l|L|
    ==============   ====================================================
    Keywords         Description
    ==============   ====================================================
    start            if True, lon0 represents the starting longitude
                     of the new grid. if False, lon0 is the ending
                     longitude. Default True.
    cyclic           width of periodic domain (default 360)
    ==============   ====================================================
    returns ``dataout,lonsout`` (data and longitudes on shifted grid).
    """
    if np.fabs(lonsin[-1]-lonsin[0]-cyclic) > 1.e-4:
        # Use all data instead of raise ValueError, 'cyclic point not included'
        start_idx = 0
    else:
        # If cyclic, remove the duplicate point
        start_idx = 1
    if lon0 < lonsin[0] or lon0 > lonsin[-1]:
        raise ValueError('lon0 outside of range of lonsin')
    i0 = np.argmin(np.fabs(lonsin-lon0))
    i0_shift = len(lonsin)-i0
    if ma.isMA(datain):
        dataout = ma.zeros(datain.shape,datain.dtype)
    else:
        dataout = np.zeros(datain.shape,datain.dtype)
    if ma.isMA(lonsin):
        lonsout = ma.zeros(lonsin.shape,lonsin.dtype)
    else:
        lonsout = np.zeros(lonsin.shape,lonsin.dtype)
    if start:
        lonsout[0:i0_shift] = lonsin[i0:]
    else:
        lonsout[0:i0_shift] = lonsin[i0:]-cyclic
    dataout[...,0:i0_shift] = datain[...,i0:]
    if start:
        lonsout[i0_shift:] = lonsin[start_idx:i0+start_idx]+cyclic
    else:
        lonsout[i0_shift:] = lonsin[start_idx:i0+start_idx]
    dataout[...,i0_shift:] = datain[...,start_idx:i0+start_idx]
    return dataout,lonsout
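If all you need is longitudes re-expressed in the -180 to 180 range with the data columns reordered to match, plain numpy may be enough. The sketch below is not the Basemap algorithm and not a drop-in replacement for shiftgrid: wrap_longitudes is a name made up for this example, it assumes longitude is the last axis of the data, and it does nothing special about a duplicated cyclic end point.

```python
import numpy as np

def wrap_longitudes(data, lons):
    """Re-express longitudes in [-180, 180) and reorder data to match.

    A simplified stand-in for shiftgrid, not a drop-in replacement:
    a duplicated cyclic end point is not treated specially.
    """
    lons = np.asarray(lons, dtype=float)
    wrapped = (lons + 180.0) % 360.0 - 180.0
    # Sort the wrapped longitudes and apply the same order to the data
    order = np.argsort(wrapped, kind="stable")
    return np.asanyarray(data)[..., order], wrapped[order]

lons = np.array([0.0, 90.0, 180.0, 270.0])
data = np.arange(8).reshape(2, 4)
shifted, new_lons = wrap_longitudes(data, lons)
print(new_lons)   # longitudes now run from -180 to 90
print(shifted)
```

For grids that repeat the first longitude at the end, the copied shiftgrid function above is the safer choice, since it handles that case explicitly.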

Pandas - formatting a NxN matrix

I have to deal with a square matrix (N x N) (N will change depending on the system, but the matrix will always be a square matrix).
Here is an example:
0 1 2 3 4
0 5.1677124550E-001 5.4962112499E-005 3.2484393256E-002 -1.8901697652E-001 -6.7156804753E-003
1 5.5380106796E-005 5.6159927753E-001 -1.9000545049E-003 -1.4737748792E-002 -7.2598453774E-002
2 3.2486915835E-002 -1.8996351539E-003 5.6791783316E-001 7.2316374186E-002 1.5013066446E-003
3 -1.8901411495E-001 -1.4737367075E-002 7.2315825338E-002 6.2721160365E-001 3.1553528602E-002
4 -6.7136454124E-003 -7.2597907350E-002 1.5007743348E-003 3.1554372311E-002 2.7318109331E-001
5 6.6738948243E-002 1.4102132238E-003 -1.2689244944E-001 4.7666038803E-002 1.8559074897E-002
6 -2.5293332676E-002 3.7536452002E-002 -1.3453018251E-002 -1.3177136905E-001 6.8262612506E-002
7 5.0951492945E-003 2.1082303893E-005 2.2599127408E-004 1.0287898189E-001 -1.1117916184E-001
8 1.0818230191E-003 -1.2435319909E-002 8.1008075834E-003 -4.2864102001E-002 4.2865913226E-002
9 -1.8399671295E-002 -2.1579653853E-002 -8.3073582356E-003 -2.1848513510E-001 -7.3408914625E-002
10 3.4566032399E-003 -4.0687639382E-003 1.3769999130E-003 -1.1873434189E-001 -3.3274201039E-002
11 6.6093238125E-003 1.7153435473E-002 4.9392012712E-003 -8.4590814134E-002 -4.3601041176E-002
12 -1.1418316960E-001 -1.1241625427E-001 -3.2263873516E-002 -1.9323129435E-002 -2.6233049625E-002
13 -1.1352899039E-001 -2.2898299860E-001 -5.3035072561E-002 7.4480651562E-004 6.3778892206E-004
14 -3.2197359289E-002 -5.3404040557E-002 -6.2530142462E-002 9.6648204015E-003 1.5382174347E-002
15 -1.2210509335E-001 1.1380412205E-001 -3.8374895516E-002 -1.2823165326E-002 2.3865200517E-002
16 1.1478157080E-001 -2.1487971631E-001 5.9955334103E-002 -1.2803721235E-003 -2.2477259002E-004
17 -3.9162044498E-002 6.0167325377E-002 -6.7692892326E-002 6.3814569032E-003 -1.3309923267E-002
18 -5.1386866211E-002 -1.1483215267E-003 -3.8482481829E-002 2.2227734790E-003 2.4860195290E-004
19 -1.8287048910E-003 -4.5442287955E-002 -7.6787332291E-003 7.6970470456E-004 -1.8456603178E-003
20 -3.4812676792E-002 -7.8376169613E-003 -3.1205975353E-001 -2.8005140005E-003 3.9792109835E-004
21 2.6908361866E-003 3.7102890907E-004 2.8494060446E-002 -4.8904422930E-002 -5.8840348122E-004
22 -1.6354677061E-003 2.2592828188E-003 1.6591434361E-004 -4.9992263663E-003 -4.3243295112E-002
23 -1.4297833794E-003 -1.7830154308E-003 -1.1426700328E-002 1.7125095395E-003 -1.2016863398E-002
24 1.6271802154E-003 1.6383303957E-003 -7.8049656555E-004 3.7177399735E-003 -1.0472268655E-002
25 -4.1949740427E-004 1.5301971185E-004 -9.8681335931E-004 -2.2257204483E-004 -5.1722898203E-003
26 1.0290471110E-003 9.3255502541E-004 7.7166886713E-004 4.5630851485E-003 -4.3761358485E-003
27 -7.0031784470E-004 -3.5205332654E-003 -1.6311730073E-003 -1.2805479632E-002 -6.5565487971E-003
28 7.4046927792E-004 1.9332629981E-003 3.7374682636E-004 3.9965654817E-003 -6.2275912806E-003
29 -3.4680278867E-004 -2.3027344089E-003 -1.1338817043E-003 -1.2023581780E-002 -5.4242202971E-003
5 6 7 8 9
0 6.6743285428E-002 -2.5292337123E-002 5.0949675928E-003 1.0817408844E-003 -1.8399704662E-002
1 1.4100215877E-003 3.7536256943E-002 2.1212526899E-005 -1.2435482773E-002 -2.1579384876E-002
2 -1.2689432485E-001 -1.3453164785E-002 2.2618690004E-004 8.1008703937E-003 -8.3084039605E-003
3 4.7663851818E-002 -1.3181118094E-001 1.0290976691E-001 -4.2887391630E-002 -2.1847562123E-001
4 1.8558453001E-002 6.8311145594E-002 -1.1122358467E-001 4.2891711956E-002 -7.3413776745E-002
5 6.5246209445E-001 -3.7960754525E-002 5.8439215647E-002 -9.0620367134E-002 -8.4164313206E-002
6 -3.7935271881E-002 1.9415336793E-001 -6.8115262349E-002 5.0899890760E-002 -3.3687874555E-002
7 5.8422477033E-002 -6.8128901087E-002 3.9950499633E-001 -4.4336879147E-002 -4.0665928103E-002
8 -9.0612201567E-002 5.0902528870E-002 -4.4330072001E-002 1.2680415316E-001 1.7096405711E-002
9 -8.4167028549E-002 -3.3690056890E-002 -4.0677875424E-002 1.7097273427E-002 5.2579065978E-001
10 -6.4841142152E-002 -5.4453858464E-003 -2.4697277476E-001 8.5069643903E-005 1.8744016178E-001
11 -1.0367060076E-001 1.5864203200E-002 -1.6074822795E-002 -5.5265410413E-002 -7.3152548403E-002
12 -9.0665723957E-003 3.3027526012E-003 1.8484849938E-003 -7.5841163489E-004 -3.3700244298E-003
13 4.7717318460E-004 -1.8118719766E-003 1.6014630540E-003 -2.3830908057E-004 2.1049292570E-003
14 4.3836856576E-003 -1.7242302777E-003 -1.2023546553E-003 4.0533783460E-004 1.4850814596E-003
15 -1.2402059167E-002 -7.4793143461E-003 -3.8769252328E-004 3.9551076185E-003 1.0737706641E-003
16 -9.3076805579E-005 -1.6074185601E-003 1.7551579833E-003 -5.1663470094E-004 1.1072804383E-003
17 4.6817349747E-003 3.6900011954E-003 -8.6155331565E-004 -9.1007768778E-005 -7.3899260162E-004
18 3.2959550689E-002 3.0400921147E-003 3.9724187499E-004 -1.9220339108E-003 1.8075790317E-003
19 7.0905456379E-004 -5.0949208181E-004 -4.6021500516E-004 -7.9847500945E-004 1.4079850530E-004
20 -1.8687467448E-002 -6.3913023759E-004 -7.3566296037E-004 2.3726543730E-003 -1.0663719038E-003
21 3.6598966411E-003 -8.2335128379E-003 7.5645765132E-004 -2.1824880567E-002 -3.5125687811E-003
22 -1.6198130808E-002 8.4576317115E-003 -6.2045498682E-004 3.3460766491E-002 3.2638760335E-003
23 -3.2057393808E-001 -1.1315081941E-002 3.4822885510E-003 -5.8263446092E-003 2.9508421818E-004
24 -2.6366856593E-002 -5.8331954255E-004 1.1995976399E-003 3.4813904521E-003 -5.0942740761E-002
25 6.5474742063E-003 -5.7681583908E-003 -2.2680039574E-002 -3.3264360995E-002 4.8925407218E-003
26 -1.1288074542E-002 -4.5938216710E-003 -1.9339903561E-003 1.0812058656E-002 2.3005958417E-002
27 1.8937006089E-002 6.5590668002E-003 -2.9973042787E-003 -9.1466195902E-003 -2.0027029530E-001
28 -5.0006834397E-003 -3.1011487603E-002 -2.1071980031E-002 1.5171078954E-002 -6.3286786806E-002
29 1.0199591553E-002 -7.9372677248E-004 3.0157129340E-003 3.3043947441E-003 1.2554933598E-001
10 11 12 13 14
0 3.4566170422E-003 6.6091516193E-003 -1.1418209846E-001 -1.1352717720E-001 -3.2196213169E-002
1 -4.0687114857E-003 1.7153538295E-002 -1.1241515840E-001 -2.2897846552E-001 -5.3401852861E-002
2 1.3767476381E-003 4.9395834885E-003 -3.2262805417E-002 -5.3032729716E-002 -6.2527093260E-002
3 -1.1874067860E-001 -8.4586993618E-002 -1.9322697616E-002 7.4504831410E-004 9.6646936748E-003
4 -3.3280804952E-002 -4.3604931512E-002 -2.6232842935E-002 6.3789697287E-004 1.5382093474E-002
5 -6.4845769217E-002 -1.0366990398E-001 -9.0664935892E-003 4.7719667654E-004 4.3835884630E-003
6 -5.4306282394E-003 1.5863464756E-002 3.3027917727E-003 -1.8118646089E-003 -1.7242102753E-003
7 -2.4687457565E-001 -1.6075394559E-002 1.8484728466E-003 1.6014634135E-003 -1.2023496466E-003
8 8.5962912652E-005 -5.5265657567E-002 -7.5843145596E-004 -2.3831274033E-004 4.0533385644E-004
9 1.8744386918E-001 -7.3152643002E-002 -3.3700964189E-003 2.1048865009E-003 1.4850822567E-003
10 4.2975054072E-001 1.0364270794E-001 -1.5875283846E-003 6.7147216913E-004 1.2875627684E-004
11 1.0364402707E-001 6.0712435750E-001 5.1492123223E-003 8.2705404716E-004 -1.8653698814E-003
12 -1.5875318643E-003 5.1492269487E-003 1.2662026379E-001 1.2488481495E-001 3.3008712754E-002
13 6.7147489686E-004 8.2705994225E-004 1.2488477299E-001 2.4603749137E-001 5.7666439818E-002
14 1.2875157882E-004 -1.8653719810E-003 3.3008614344E-002 5.7666322609E-002 6.3196096154E-002
15 1.1375173141E-003 -1.2188735107E-003 9.5708352328E-003 -1.3282223067E-002 5.3571128896E-003
16 2.1319373893E-004 -2.6367828437E-004 1.4833724552E-002 -2.0115235494E-002 7.8461850894E-003
17 2.3051283757E-004 3.4044831571E-004 4.9262824289E-003 -6.6151918659E-003 1.1684894610E-003
18 -5.6658408835E-004 1.5710333316E-003 -2.6543076573E-003 1.0490950154E-003 -1.5676208892E-002
19 1.0005496308E-003 1.0400419914E-003 -2.7122935995E-003 -5.3716049248E-005 -2.6747366947E-002
20 3.1068907684E-004 5.3348953665E-004 -4.7934824223E-004 4.4853558686E-004 -6.0300656596E-003
21 2.7080517882E-003 -1.9033626829E-002 8.8615570289E-004 -3.7735646663E-004 -7.4101143501E-004
22 -2.9622921796E-003 -2.4159082408E-002 6.6943323966E-004 1.1154593780E-004 1.5914682394E-004
23 3.2842560830E-003 -6.2612752482E-003 1.5738434272E-004 4.6284599959E-004 4.0588132107E-004
24 1.6971737369E-003 2.4217812563E-002 4.3246402884E-004 9.5059931011E-005 3.5484698283E-004
25 -7.4868993750E-002 -8.7332668698E-002 -6.0147742690E-005 -4.8099146029E-005 1.1509155506E-004
26 -9.3177706949E-002 -2.9315061874E-001 2.1287190612E-004 5.0813661565E-005 2.6955715462E-004
27 -7.0097859908E-002 1.2458191360E-001 -1.2846480258E-003 1.2192486380E-004 4.6853704861E-004
28 -6.9485493530E-002 4.8763866344E-002 7.7223643475E-004 1.3853535883E-004 5.4636752811E-005
29 4.8961381968E-002 -1.5272337445E-001 -8.8648769643E-004 -4.4975303480E-005 5.9586006091E-004
15 16 17 18 19
0 -1.2210501176E-001 1.1478027359E-001 -3.9162145749E-002 -5.1389252158E-002 -1.8288904037E-003
1 1.1380272374E-001 -2.1487588526E-001 6.0165774430E-002 -1.1487007778E-003 -4.5441546655E-002
2 -3.8374694597E-002 5.9953296524E-002 -6.7691825286E-002 -3.8484030260E-002 -7.6800715249E-003
3 -1.2822729286E-002 -1.2805898275E-003 6.3813065178E-003 2.2220841872E-003 7.6991955181E-004
4 2.3864994996E-002 -2.2470892452E-004 -1.3309838494E-002 2.4851560674E-004 -1.8460620529E-003
5 -1.2402212045E-002 -9.2994801153E-005 4.6817064931E-003 3.2958166488E-002 7.0866732024E-004
6 -7.4793278406E-003 -1.6074103229E-003 3.6899979002E-003 3.0392561951E-003 -5.0946020505E-004
7 -3.8770026733E-004 1.7551659565E-003 -8.6155605026E-004 3.9692465089E-004 -4.6038088334E-004
8 3.9551171890E-003 -5.1663991899E-004 -9.1008948343E-005 -1.9220277566E-003 -7.9837924658E-004
9 1.0738350084E-003 1.1072790098E-003 -7.3897453645E-004 1.8057852560E-003 1.4013275714E-004
10 1.1375075076E-003 2.1317640112E-004 2.3050639764E-004 -5.6673414945E-004 1.0005316579E-003
11 -1.2189105982E-003 -2.6367792495E-004 3.4043235164E-004 1.5732522246E-003 1.0407973658E-003
12 9.5708232459E-003 1.4833737759E-002 4.9262816092E-003 -2.6542614308E-003 -2.7122986789E-003
13 -1.3282260152E-002 -2.0115238348E-002 -6.6152067653E-003 1.0491248568E-003 -5.3705750675E-005
14 5.3571028398E-003 7.8462085672E-003 1.1684872139E-003 -1.5676176683E-002 -2.6747374282E-002
15 1.3378635756E-001 -1.2613361119E-001 4.2401828623E-002 -2.6595403473E-003 1.9873360401E-003
16 -1.2613349126E-001 2.3154756121E-001 -6.5778628114E-002 -2.2828335280E-003 1.4601821131E-003
17 4.2401749392E-002 -6.5778591727E-002 6.8187241643E-002 -1.6653902450E-002 2.5505038138E-002
18 -2.6595920073E-003 -2.2828074980E-003 -1.6653942562E-002 5.4855247002E-002 2.4729783529E-003
19 1.9873415121E-003 1.4601899329E-003 2.5505058190E-002 2.4729967206E-003 4.4724663284E-002
20 -3.8366743828E-004 -8.8746730931E-004 -6.4420927497E-003 3.6656962180E-002 8.1224860664E-003
21 9.2845385141E-004 3.6802433505E-004 -9.5040708316E-004 -5.1941208846E-003 -1.2444625713E-004
22 -5.0318487549E-004 1.4342911215E-004 2.8985859503E-004 2.0416113478E-004 9.1951318240E-004
23 7.4036073171E-004 -3.4730013615E-004 -1.3351566400E-004 2.3474188588E-003 1.3102362758E-005
24 -2.7749145090E-004 4.7724454321E-005 5.5527644806E-005 -1.7302886151E-004 -1.7726879169E-004
25 -2.5090250470E-004 2.1741519930E-005 2.7208805916E-004 -2.5982303487E-004 -1.9668228900E-004
26 -1.4489113997E-004 -3.0397727583E-005 2.7239543481E-005 -6.0050637375E-004 -2.9892198193E-005
27 -1.6519482597E-005 1.6435294198E-004 5.0961893634E-005 1.4077278097E-004 -1.9027010603E-005
28 -2.3547595249E-004 7.6124571826E-005 1.0117983985E-004 -1.1534040559E-004 -1.0579685787E-004
29 7.0507166233E-005 1.1552377841E-004 -4.5931305760E-005 -2.0007797315E-004 -1.3505340062E-004
20 21 22 23 24
0 -3.4812101478E-002 2.6911592086E-003 -1.6354152863E-003 -1.4301333227E-003 1.6249964844E-003
1 -7.8382610347E-003 3.7103408229E-004 2.2593110441E-003 -1.7829862164E-003 1.6374435740E-003
2 -3.1205423941E-001 2.8493671639E-002 1.6587990556E-004 -1.1426237591E-002 -7.8189111866E-004
3 -2.8004725758E-003 -4.8903739721E-002 -4.9988134121E-003 1.7100983514E-003 3.7179545055E-003
4 3.9806443322E-004 -5.8790208912E-004 -4.3242458298E-002 -1.2016207108E-002 -1.0472139534E-002
5 -1.8686790048E-002 3.6592865292E-003 -1.6198931842E-002 -3.2057224847E-001 -2.6367531700E-002
6 -6.3919412091E-004 -8.2335246704E-003 8.4576155591E-003 -1.1315054733E-002 -5.8369163532E-004
7 -7.3581915791E-004 7.5646519519E-004 -6.2047477465E-004 3.4823216513E-003 1.1991380964E-003
8 2.3726528036E-003 -2.1824763131E-002 3.3460717579E-002 -5.8262172949E-003 3.4812921433E-003
9 -1.0665296285E-003 -3.5124206435E-003 3.2639684654E-003 2.9530797749E-004 -5.0943824872E-002
10 3.1067613876E-004 2.7079189356E-003 -2.9623459983E-003 3.2841200274E-003 1.6984442797E-003
11 5.3351732140E-004 -1.9033427571E-002 -2.4158940046E-002 -6.2609613281E-003 2.4221378111E-002
12 -4.7937892256E-004 8.8611314755E-004 6.6939922854E-004 1.5740024716E-004 4.3249394082E-004
13 4.4851926804E-004 -3.7736678097E-004 1.1153694999E-004 4.6284806253E-004 9.5077824774E-005
14 -6.0300787410E-003 -7.4096053004E-004 1.5918637627E-004 4.0586523098E-004 3.5485782222E-004
15 -3.8368712363E-004 9.2843754228E-004 -5.0316845184E-004 7.4036906127E-004 -2.7745851356E-004
16 -8.8745240886E-004 3.6801936222E-004 1.4342995270E-004 -3.4729860789E-004 4.7711904531E-005
17 -6.4420819427E-003 -9.5038506002E-004 2.8983698019E-004 -1.3352326563E-004 5.5544671478E-005
18 3.6656852373E-002 -5.1941195232E-003 2.0415783452E-004 2.3474119607E-003 -1.7153048632E-004
19 8.1224361521E-003 -1.2444681834E-004 9.1951236579E-004 1.3097434442E-005 -1.7668019335E-004
20 3.3911554853E-001 2.8652507893E-003 -6.8339696880E-005 3.7476484447E-004 8.3606654277E-004
21 2.8652527558E-003 6.1967615286E-002 -3.2455918220E-003 7.8074203872E-003 -1.5351890960E-003
22 -6.8340068690E-005 -3.2455946984E-003 4.1826230856E-002 6.5337193429E-003 -3.1932674182E-003
23 3.7476336333E-004 7.8073802579E-003 6.5336763366E-003 3.4246747567E-001 -2.2590437719E-005
24 8.3515185725E-004 -1.5351889308E-003 -3.1932682244E-003 -2.2585651674E-005 4.7006835231E-002
25 5.3158843621E-007 1.0652535047E-003 1.4954902777E-003 2.4073368793E-004 1.1954474977E-003
26 5.5963948637E-004 -4.4872582333E-004 -1.4772351943E-003 6.3199701928E-004 -2.1389718034E-002
27 -1.7619372799E-004 9.0741766644E-004 9.8175835796E-004 -2.9459682310E-004 7.2835611826E-004
28 2.5127782091E-004 -9.3298199434E-004 6.8787235133E-005 1.2732690365E-004 7.9688727422E-003
29 2.6201943695E-004 1.7128017387E-004 1.2934748675E-003 3.4008367645E-004 1.9615268308E-002
25 26 27 28 29
0 -4.2035299977E-004 1.0294528397E-003 -7.0032537135E-004 7.4047266192E-004 -3.4678947810E-004
1 1.5264932827E-004 9.3263518942E-004 -3.5205362458E-003 1.9332600101E-003 -2.3027335108E-003
2 -9.8735571502E-004 7.7177183895E-004 -1.6311830663E-003 3.7374078263E-004 -1.1338849320E-003
3 -2.2267753982E-004 4.5631164845E-003 -1.2805227755E-002 3.9967067646E-003 -1.2023590679E-002
4 -5.1722782688E-003 -4.3757731112E-003 -6.5561880794E-003 -6.2274289617E-003 -5.4242286711E-003
5 6.5472637324E-003 -1.1287788747E-002 1.8937046693E-002 -5.0006811267E-003 1.0199602824E-002
6 -5.7685226078E-003 -4.5935456207E-003 6.5591405092E-003 -3.1011377655E-002 -7.9382348181E-004
7 -2.2680665405E-002 -1.9338350120E-003 -2.9972765688E-003 -2.1071947728E-002 3.0156847654E-003
8 -3.3264515239E-002 1.0812126530E-002 -9.1466888768E-003 1.5170890552E-002 3.3044094214E-003
9 4.8928775025E-003 2.3007654009E-002 -2.0026482543E-001 -6.3285758846E-002 1.2554808336E-001
10 -7.4869041758E-002 -9.3178724533E-002 -7.0098856149E-002 -6.9485640501E-002 4.8962839723E-002
11 -8.7330564494E-002 -2.9314613543E-001 1.2458021507E-001 4.8763534298E-002 -1.5272144228E-001
12 -6.0132426168E-005 2.1286995818E-004 -1.2846479090E-003 7.7223667108E-004 -8.8648784383E-004
13 -4.8090893023E-005 5.0813447259E-005 1.2192474211E-004 1.3853537972E-004 -4.4975512069E-005
14 1.1509828375E-004 2.6955725919E-004 4.6853708025E-004 5.4636589826E-005 5.9585997916E-004
15 -2.5088560837E-004 -1.4490239429E-004 -1.6517113547E-005 -2.3547725232E-004 7.0506301073E-005
16 2.1741623849E-005 -3.0396484786E-005 1.6435437640E-004 7.6123660238E-005 1.1552303684E-004
17 2.7209709129E-004 2.7234932342E-005 5.0963084246E-005 1.0117936124E-004 -4.5931984725E-005
18 -2.5882735848E-004 -6.0031848430E-004 1.4070861538E-004 -1.1535910049E-004 -2.0001808065E-004
19 -1.9638025822E-004 -2.9919459983E-005 -1.9047914816E-005 -1.0580143635E-004 -1.3503643634E-004
20 8.4829116415E-007 5.5948891149E-004 -1.7619563318E-004 2.5127749619E-004 2.6202088722E-004
21 1.0652521780E-003 -4.4872868033E-004 9.0739586785E-004 -9.3299673048E-004 1.7126146660E-004
22 1.4954902653E-003 -1.4772362211E-003 9.8175151528E-004 6.8801505444E-005 1.2934673074E-003
23 2.4072903510E-004 6.3199689136E-004 -2.9460500091E-004 1.2731327319E-004 3.4007600115E-004
24 1.1952923145E-003 -2.1389995888E-002 7.2832026293E-004 7.9688600183E-003 1.9615297182E-002
25 9.4289717269E-002 1.0562741426E-001 -1.7552990896E-004 7.0060843371E-003 8.7782610441E-003
26 1.0562750999E-001 3.0308674016E-001 -1.6382699707E-003 -5.5832273099E-003 -1.1726448645E-002
27 -1.7551353029E-004 -1.6382784849E-003 2.0673701256E-001 8.2101212014E-002 -1.3115219203E-001
28 7.0060896795E-003 -5.5832572276E-003 8.2101377926E-002 8.7668224780E-002 -5.4259499038E-002
29 8.7782416309E-003 -1.1726450275E-002 -1.3115216547E-001 -5.4259354736E-002 1.5092602943E-001
This should be a 30x30 matrix and I'm trying:
data = pd.read_fwf('C:/Users/henri/Documents/Projects/Python-Lessons/ORCA/orca.hess',
widths=[9, 19, 19, 19, 19, 19])
But it is read as 185x6. I'd like to ignore the first column (the row numbers, 0-29), and I don't need the column indexes (also 0-29) for any mathematical operation. Also, pandas appears to round my numbers and I'd like to keep the original format.
Here is a snip of my output:
Unnamed: 0 0 1 2 3 4
0 0.0 5.167712e-01 0.000055 0.032484 -0.189017 -0.006716
1 1.0 5.538011e-05 0.561599 -0.001900 -0.014738 -0.072598
2 2.0 3.248692e-02 -0.001900 0.567918 0.072316 0.001501
Any help is much appreciated, guys.
import pandas as pd
filename = 'data'
df = pd.read_fwf(filename, widths=[9, 19, 19, 19, 19, 19])
df = df.rename(columns={'Unnamed: 0':'row'})
df = df.dropna(subset=['row'], how='any')
df['col'] = df.groupby('row').cumcount()
df = df.pivot(index='row', columns='col')
df = df.dropna(how='any', axis=1)
df.columns = range(len(df.columns))
print(df.head())
yields
0 1 2 3 4 5 6 \
row
0.0 0.516771 0.066743 0.003457 -0.122105 -0.034812 -0.000420 0.000055
1.0 0.000055 0.001410 -0.004069 0.113803 -0.007838 0.000153 0.561599
2.0 0.032487 -0.126894 0.001377 -0.038375 -0.312054 -0.000987 -0.001900
3.0 -0.189014 0.047664 -0.118741 -0.012823 -0.002800 -0.000223 -0.014737
4.0 -0.006714 0.018558 -0.033281 0.023865 0.000398 -0.005172 -0.072598
7 8 9 ... 20 21 22 \
row ...
0.0 -0.025292 0.006609 0.114780 ... -0.113527 -0.051389 -0.001430
1.0 0.037536 0.017154 -0.214876 ... -0.228978 -0.001149 -0.001783
2.0 -0.013453 0.004940 0.059953 ... -0.053033 -0.038484 -0.011426
3.0 -0.131811 -0.084587 -0.001281 ... 0.000745 0.002222 0.001710
4.0 0.068311 -0.043605 -0.000225 ... 0.000638 0.000249 -0.012016
23 24 25 26 27 28 29
row
0.0 0.000740 -0.006716 -0.018400 -0.032196 -0.001829 0.001625 -0.000347
1.0 0.001933 -0.072598 -0.021579 -0.053402 -0.045442 0.001637 -0.002303
2.0 0.000374 0.001501 -0.008308 -0.062527 -0.007680 -0.000782 -0.001134
3.0 0.003997 0.031554 -0.218476 0.009665 0.000770 0.003718 -0.012024
4.0 -0.006227 0.273181 -0.073414 0.015382 -0.001846 -0.010472 -0.005424
[5 rows x 30 columns]
After parsing the file with
df = pd.read_fwf(filename, widths=[9, 19, 19, 19, 19, 19])
df = df.rename(columns={'Unnamed: 0':'row'})
the column header rows can be identified by having a df['row'] value of NaN.
So they can be removed with
df = df.dropna(subset=['row'], how='any')
Now the row numbers keep repeating from 0 to 29. If we group by the row
value, then we can assign an intra-group "cumulative count" to the rows within
each group. That is, the first row of the group gets assigned the value 0, the
next row 1, etc. -- within that group -- and the process is repeated for each
group.
df['col'] = df.groupby('row').cumcount()
# row 0 1 2 3 4 col
# 0 0.0 5.167712e-01 0.000055 0.032484 -0.189017 -0.006716 0
# 1 1.0 5.538011e-05 0.561599 -0.001900 -0.014738 -0.072598 0
# 2 2.0 3.248692e-02 -0.001900 0.567918 0.072316 0.001501 0
# ...
# 182 27.0 -1.755135e-04 -0.001638 0.206737 0.082101 -0.131152 5
# 183 28.0 7.006090e-03 -0.005583 0.082101 0.087668 -0.054259 5
# 184 29.0 8.778242e-03 -0.011726 -0.131152 -0.054259 0.150926 5
Now the desired DataFrame can be obtained by pivoting:
df = df.pivot(index='row', columns='col')
and relabeling the columns:
df.columns = range(len(df.columns))
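One thing worth checking after the pivot is the column order. In the output above, column 6 of row 0 holds 0.000055, which in the question's snippet is the value under column 1, so pivot appears to group the columns by their original header first and by block second. The following is a minimal sketch on a made-up 3x4 matrix (the data and the names full, blocks, naive and ordered are hypothetical, not taken from the ORCA file) showing that effect and one possible way to restore the order with swaplevel and sort_index before relabeling:

```python
import numpy as np
import pandas as pd

# Made-up 3x4 matrix, stored as two column blocks of width 2 stacked
# vertically (the layout left once the block header rows are dropped).
full = np.arange(12, dtype=float).reshape(3, 4)
blocks = [
    pd.DataFrame({'row': range(3),
                  'a': full[:, 2 * k],
                  'b': full[:, 2 * k + 1]})
    for k in range(2)
]
df = pd.concat(blocks, ignore_index=True)

# 0 for the first occurrence of each row label, 1 for the second.
df['col'] = df.groupby('row').cumcount()

# Plain pivot: columns come out as (a, 0), (a, 1), (b, 0), (b, 1).
naive = df.pivot(index='row', columns='col')

# Put the block number first and sort, giving (0, a), (0, b), (1, a), (1, b).
ordered = naive.swaplevel(axis=1).sort_index(axis=1)
ordered.columns = range(ordered.shape[1])

print(naive.to_numpy()[0])    # first row in pivot order
print(ordered.to_numpy()[0])  # first row in matrix order
```

For the 30x30 case the same two calls would go right after df.pivot(...), assuming the block headers are read as the strings '0' to '4'.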
A more NumPy-based approach might look like this:
import numpy as np
import pandas as pd
filename = 'data'
df = pd.read_csv(filename, sep=r'\s+')  # delim_whitespace=True is deprecated in recent pandas
arr = df.values
N = df.index.max()+1
arr = np.delete(arr, np.arange(N, len(arr), N+1), axis=0)
chunks = np.split(arr, np.arange(N, len(arr), N))
result = pd.DataFrame(np.hstack(chunks)).dropna(axis=1)
print(result)
This should also work for a matrix of any size.
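On the rounding mentioned in the question: as far as I know, pandas parses these values into float64 and keeps them at full double precision; only the printed representation is shortened. A minimal sketch, assuming the goal is just to see the original digits again, using the float_format argument of DataFrame.to_string (the sample value is copied from the data above; the variable names are illustrative):

```python
import pandas as pd

# One value copied from the Hessian data; stored as float64.
df = pd.DataFrame({'0': [-1.2210501176E-001]})

# The default display shortens the number, but the stored value is unchanged.
print(df)

# Format every float with 10 decimals in scientific notation, for display only.
text = df.to_string(float_format='{:.10E}'.format)
print(text)
```

Python writes the exponent with two digits (E-01) rather than the three used in the file. Calling pd.set_option('display.float_format', '{:.10E}'.format) applies the same formatting to all later printing.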
