Rotate the axis of an excel chart using openpyxl - python

there
I'm trying to using openpyxl to deal with Excel Data.Drawing the picture out and exporting them is OK.But the x_axis is not pretty enough,I want to rotate that but haven't found the solution in the doc.
Here is the solution using XlsxWriter:solution.
My Code is something like this:
from openpyxl import load_workbook
from openpyxl.chart import (
ScatterChart,
LineChart,
Reference,
Series,
shapes,
text,
axis)
wb = load_workbook('text.xlsx')
ws = wb.active
c5 = ScatterChart()
x = Reference(ws, min_col=1, min_row=2, max_row=ws.max_row)
for i in range(3,5):
values = Reference(ws, min_col=i, min_row=1, max_row=ws.max_row)
series = Series(values, xvalues=x, title_from_data=True)
# series.marker.symbol = 'triangle'
c5.series.append(series)
c5.x_axis.number_format='yyyy/mm/dd'
c5.x_axis.title = 'Date'
Thanks all!
My Python version is 3.5.2 and openpyxl is 2.4.0
----------------newcode to rotate but make file broken and need repaired
from openpyxl.chart.text import RichText
from openpyxl.drawing.text import RichTextProperties
c5.x_axis.txPr = RichText(bodyPr=RichTextProperties(rot="-2700000"))
------------- code in Excel xml
<valAx>
some codes here
<txPr>
<a:bodyPr rot="-1350000" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"/>
</txPr>
<crossAx val="20"/>
</valAx>
codes above breaks the file until I add these in it
<txPr>
<a:bodyPr rot="-1350000" xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"/>
<a:p xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:pPr xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:defRPr xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"/>
</a:pPr>
<a:endParaRPr xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" lang="zh-CN"/>
</a:p>
</txPr>

Thanks to #Charlie Clark
I finally get the answer
The adding codes are below:
from openpyxl.chart.text import RichText
from openpyxl.drawing.text import RichTextProperties,Paragraph,ParagraphProperties, CharacterProperties
c5.x_axis.txPr = RichText(bodyPr=RichTextProperties(anchor="ctr",anchorCtr="1",rot="-2700000",
spcFirstLastPara="1",vertOverflow="ellipsis",wrap="square"),
p=[Paragraph(pPr=ParagraphProperties(defRPr=CharacterProperties()), endParaRPr=CharacterProperties())])
It is a little complicated ,But can be learned from openpyxl documentation
txPr is typed RichText,and it consists of (bodyPr and p),bodyPr defines it's properties, p is a sequence and decides if the axis will be shown or not.
it can rotate the x_axis of the chart -45 degrees.
It also might be a bit more convenient to make a copy of an existing property and set its rotation:
chart.x_axis.title = 'Date'
chart.x_axis.txPr = deepcopy(chart.x_axis.title.text.rich)
chart.x_axis.txPr.properties.rot = "-2700000"

Related

Save "DataFrame.style" table in Jupyter as png? [duplicate]

I constructed a pandas dataframe of results. This data frame acts as a table. There are MultiIndexed columns and each row represents a name, ie index=['name1','name2',...] when creating the DataFrame. I would like to display this table and save it as a png (or any graphic format really). At the moment, the closest I can get is converting it to html, but I would like a png. It looks like similar questions have been asked such as How to save the Pandas dataframe/series data as a figure?
However, the marked solution converts the dataframe into a line plot (not a table) and the other solution relies on PySide which I would like to stay away simply because I cannot pip install it on linux. I would like this code to be easily portable. I really was expecting table creation to png to be easy with python. All help is appreciated.
Pandas allows you to plot tables using matplotlib (details here).
Usually this plots the table directly onto a plot (with axes and everything) which is not what you want. However, these can be removed first:
import matplotlib.pyplot as plt
import pandas as pd
from pandas.table.plotting import table # EDIT: see deprecation warnings below
ax = plt.subplot(111, frame_on=False) # no visible frame
ax.xaxis.set_visible(False) # hide the x axis
ax.yaxis.set_visible(False) # hide the y axis
table(ax, df) # where df is your data frame
plt.savefig('mytable.png')
The output might not be the prettiest but you can find additional arguments for the table() function here.
Also thanks to this post for info on how to remove axes in matplotlib.
EDIT:
Here is a (admittedly quite hacky) way of simulating multi-indexes when plotting using the method above. If you have a multi-index data frame called df that looks like:
first second
bar one 1.991802
two 0.403415
baz one -1.024986
two -0.522366
foo one 0.350297
two -0.444106
qux one -0.472536
two 0.999393
dtype: float64
First reset the indexes so they become normal columns
df = df.reset_index()
df
first second 0
0 bar one 1.991802
1 bar two 0.403415
2 baz one -1.024986
3 baz two -0.522366
4 foo one 0.350297
5 foo two -0.444106
6 qux one -0.472536
7 qux two 0.999393
Remove all duplicates from the higher order multi-index columns by setting them to an empty string (in my example I only have duplicate indexes in "first"):
df.ix[df.duplicated('first') , 'first'] = '' # see deprecation warnings below
df
first second 0
0 bar one 1.991802
1 two 0.403415
2 baz one -1.024986
3 two -0.522366
4 foo one 0.350297
5 two -0.444106
6 qux one -0.472536
7 two 0.999393
Change the column names over your "indexes" to the empty string
new_cols = df.columns.values
new_cols[:2] = '','' # since my index columns are the two left-most on the table
df.columns = new_cols
Now call the table function but set all the row labels in the table to the empty string (this makes sure the actual indexes of your plot are not displayed):
table(ax, df, rowLabels=['']*df.shape[0], loc='center')
et voila:
Your not-so-pretty but totally functional multi-indexed table.
EDIT: DEPRECATION WARNINGS
As pointed out in the comments, the import statement for table:
from pandas.tools.plotting import table
is now deprecated in newer versions of pandas in favour of:
from pandas.plotting import table
EDIT: DEPRECATION WARNINGS 2
The ix indexer has now been fully deprecated so we should use the loc indexer instead. Replace:
df.ix[df.duplicated('first') , 'first'] = ''
with
df.loc[df.duplicated('first') , 'first'] = ''
There is actually a python library called dataframe_image
Just do a
pip install dataframe_image
Do the imports
import pandas as pd
import numpy as np
import dataframe_image as dfi
df = pd.DataFrame(np.random.randn(6, 6), columns=list('ABCDEF'))
and style your table if you want by:
df_styled = df.style.background_gradient() #adding a gradient based on values in cell
and finally:
dfi.export(df_styled,"mytable.png")
The best solution to your problem is probably to first export your dataframe to HTML and then convert it using an HTML-to-image tool.
The final appearance could be tweaked via CSS.
Popular options for HTML-to-image rendering include:
WeasyPrint
wkhtmltopdf/wkhtmltoimage
Let us assume we have a dataframe named df.
We can generate one with the following code:
import string
import numpy as np
import pandas as pd
np.random.seed(0) # just to get reproducible results from `np.random`
rows, cols = 5, 10
labels = list(string.ascii_uppercase[:cols])
df = pd.DataFrame(np.random.randint(0, 100, size=(5, 10)), columns=labels)
print(df)
# A B C D E F G H I J
# 0 44 47 64 67 67 9 83 21 36 87
# 1 70 88 88 12 58 65 39 87 46 88
# 2 81 37 25 77 72 9 20 80 69 79
# 3 47 64 82 99 88 49 29 19 19 14
# 4 39 32 65 9 57 32 31 74 23 35
Using WeasyPrint
This approach uses a pip-installable package, which will allow you to do everything using the Python ecosystem.
One shortcoming of weasyprint is that it does not seem to provide a way of adapting the image size to its content.
Anyway, removing some background from an image is relatively easy in Python / PIL, and it is implemented in the trim() function below (adapted from here).
One also would need to make sure that the image will be large enough, and this can be done with CSS's #page size property.
The code follows:
import weasyprint as wsp
import PIL as pil
def trim(source_filepath, target_filepath=None, background=None):
if not target_filepath:
target_filepath = source_filepath
img = pil.Image.open(source_filepath)
if background is None:
background = img.getpixel((0, 0))
border = pil.Image.new(img.mode, img.size, background)
diff = pil.ImageChops.difference(img, border)
bbox = diff.getbbox()
img = img.crop(bbox) if bbox else img
img.save(target_filepath)
img_filepath = 'table1.png'
css = wsp.CSS(string='''
#page { size: 2048px 2048px; padding: 0px; margin: 0px; }
table, td, tr, th { border: 1px solid black; }
td, th { padding: 4px 8px; }
''')
html = wsp.HTML(string=df.to_html())
html.write_png(img_filepath, stylesheets=[css])
trim(img_filepath)
Using wkhtmltopdf/wkhtmltoimage
This approach uses an external open source tool and this needs to be installed prior to the generation of the image.
There is also a Python package, pdfkit, that serves as a front-end to it (it does not waive you from installing the core software yourself), but I will not use it.
wkhtmltoimage can be simply called using subprocess (or any other similar means of running an external program in Python).
One would also need to output to disk the HTML file.
The code follows:
import subprocess
df.to_html('table2.html')
subprocess.call(
'wkhtmltoimage -f png --width 0 table2.html table2.png', shell=True)
and its aspect could be further tweaked with CSS similarly to the other approach.
Although I am not sure if this is the result you expect, you can save your DataFrame in png by plotting the DataFrame with Seaborn Heatmap with annotations on, like this:
http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.heatmap.html#seaborn.heatmap
It works right away with a Pandas Dataframe. You can look at this example: Efficiently ploting a table in csv format using Python
You might want to change the colormap so it displays a white background only.
Hope this helps.
Edit:
Here is a snippet that does this:
import matplotlib
import seaborn as sns
def save_df_as_image(df, path):
# Set background to white
norm = matplotlib.colors.Normalize(-1,1)
colors = [[norm(-1.0), "white"],
[norm( 1.0), "white"]]
cmap = matplotlib.colors.LinearSegmentedColormap.from_list("", colors)
# Make plot
plot = sns.heatmap(df, annot=True, cmap=cmap, cbar=False)
fig = plot.get_figure()
fig.savefig(path)
The solution of #bunji works for me, but default options don't always give a good result.
I added some useful parameter to tweak the appearance of the table.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import table
import numpy as np
dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))
df.index = [item.strftime('%Y-%m-%d') for item in df.index] # Format date
fig, ax = plt.subplots(figsize=(12, 2)) # set size frame
ax.xaxis.set_visible(False) # hide the x axis
ax.yaxis.set_visible(False) # hide the y axis
ax.set_frame_on(False) # no visible frame, uncomment if size is ok
tabla = table(ax, df, loc='upper right', colWidths=[0.17]*len(df.columns)) # where df is your data frame
tabla.auto_set_font_size(False) # Activate set fontsize manually
tabla.set_fontsize(12) # if ++fontsize is necessary ++colWidths
tabla.scale(1.2, 1.2) # change size table
plt.savefig('table.png', transparent=True)
The result:
I had the same requirement for a project I am doing. But none of the answers came elegant to my requirement. Here is something which finally helped me, and might be useful for this case:
from bokeh.io import export_png, export_svgs
from bokeh.models import ColumnDataSource, DataTable, TableColumn
def save_df_as_image(df, path):
source = ColumnDataSource(df)
df_columns = [df.index.name]
df_columns.extend(df.columns.values)
columns_for_table=[]
for column in df_columns:
columns_for_table.append(TableColumn(field=column, title=column))
data_table = DataTable(source=source, columns=columns_for_table,height_policy="auto",width_policy="auto",index_position=None)
export_png(data_table, filename = path)
There is a Python library called df2img available at https://pypi.org/project/df2img/ (disclaimer: I'm the author). It's a wrapper/convenience function using plotly as backend.
You can find the documentation at https://df2img.dev.
import pandas as pd
import df2img
df = pd.DataFrame(
data=dict(
float_col=[1.4, float("NaN"), 250, 24.65],
str_col=("string1", "string2", float("NaN"), "string4"),
),
index=["row1", "row2", "row3", "row4"],
)
Saving a pd.DataFrame as a .png-file can be done fairly quickly. You can apply formatting, such as background colors or alternating the row colors for better readability.
fig = df2img.plot_dataframe(
df,
title=dict(
font_color="darkred",
font_family="Times New Roman",
font_size=16,
text="This is a title",
),
tbl_header=dict(
align="right",
fill_color="blue",
font_color="white",
font_size=10,
line_color="darkslategray",
),
tbl_cells=dict(
align="right",
line_color="darkslategray",
),
row_fill_color=("#ffffff", "#d7d8d6"),
fig_size=(300, 160),
)
df2img.save_dataframe(fig=fig, filename="plot.png")
If you're okay with the formatting as it appears when you call the DataFrame in your coding environment, then the absolute easiest way is to just use print screen and crop the image using basic image editing software.
Here's how it turned out for me using Jupyter Notebook, and Pinta Image Editor (Ubuntu freeware).
As jcdoming suggested, use Seaborn heatmap():
import seaborn as sns
import matplotlib.pyplot as plt
fig = plt.figure(facecolor='w', edgecolor='k')
sns.heatmap(df.head(), annot=True, cmap='viridis', cbar=False)
plt.savefig('DataFrame.png')
The easiest and fastest way to convert a Pandas dataframe into a png image using Anaconda Spyder IDE- just double-click on the dataframe in variable explorer, and the IDE table will appear, nicely packaged with automatic formatting and color scheme. Just use a snipping tool to capture the table for use in your reports, saved as a png:
This saves me lots of time, and is still elegant and professional.
The following would need extensive customisation to format the table correctly, but the bones of it works:
import numpy as np
from PIL import Image, ImageDraw, ImageFont
import pandas as pd
df = pd.DataFrame({ 'A' : 1.,
'B' : pd.Series(1,index=list(range(4)),dtype='float32'),
'C' : np.array([3] * 4,dtype='int32'),
'D' : pd.Categorical(["test","train","test","train"]),
'E' : 'foo' })
class DrawTable():
def __init__(self,_df):
self.rows,self.cols = _df.shape
img_size = (300,200)
self.border = 50
self.bg_col = (255,255,255)
self.div_w = 1
self.div_col = (128,128,128)
self.head_w = 2
self.head_col = (0,0,0)
self.image = Image.new("RGBA", img_size,self.bg_col)
self.draw = ImageDraw.Draw(self.image)
self.draw_grid()
self.populate(_df)
self.image.show()
def draw_grid(self):
width,height = self.image.size
row_step = (height-self.border*2)/(self.rows)
col_step = (width-self.border*2)/(self.cols)
for row in range(1,self.rows+1):
self.draw.line((self.border-row_step//2,self.border+row_step*row,width-self.border,self.border+row_step*row),fill=self.div_col,width=self.div_w)
for col in range(1,self.cols+1):
self.draw.line((self.border+col_step*col,self.border-col_step//2,self.border+col_step*col,height-self.border),fill=self.div_col,width=self.div_w)
self.draw.line((self.border-row_step//2,self.border,width-self.border,self.border),fill=self.head_col,width=self.head_w)
self.draw.line((self.border,self.border-col_step//2,self.border,height-self.border),fill=self.head_col,width=self.head_w)
self.row_step = row_step
self.col_step = col_step
def populate(self,_df2):
font = ImageFont.load_default().font
for row in range(self.rows):
print(_df2.iloc[row,0])
self.draw.text((self.border-self.row_step//2,self.border+self.row_step*row),str(_df2.index[row]),font=font,fill=(0,0,128))
for col in range(self.cols):
text = str(_df2.iloc[row,col])
text_w, text_h = font.getsize(text)
x_pos = self.border+self.col_step*(col+1)-text_w
y_pos = self.border+self.row_step*row
self.draw.text((x_pos,y_pos),text,font=font,fill=(0,0,128))
for col in range(self.cols):
text = str(_df2.columns[col])
text_w, text_h = font.getsize(text)
x_pos = self.border+self.col_step*(col+1)-text_w
y_pos = self.border - self.row_step//2
self.draw.text((x_pos,y_pos),text,font=font,fill=(0,0,128))
def save(self,filename):
try:
self.image.save(filename,mode='RGBA')
print(filename," Saved.")
except:
print("Error saving:",filename)
table1 = DrawTable(df)
table1.save('C:/Users/user/Pictures/table1.png')
The output looks like this:
People who use Plotly for data visualization:
You can easily convert the dataframe to go.Table.
You can save the dataframe with columns names.
You can format the dataframe through go.Table.
You can save the dataframe as pdf, jpg, or png with different scales and high resolution.
import plotly.express as px
df = px.data.medals_long()
fig = go.Figure(data=[
go.Table(
header=dict(values=list(df.columns),align='center'),
cells=dict(values=df.values.transpose(),
fill_color = [["white","lightgrey"]*df.shape[0]],
align='center'
)
)
])
fig.write_image('image.png',scale=6)
Note: the image is downloaded in the same directory where the current python file is running.
Output:
I really like the way Jupyter notebooks format the DataFrame and this library exports it in the same format:
import dataframe_image as dfi
dfi.export(df, "df.png")
There is also a dpi argument in case you want to increase the quality of the image. I'd recommend 300 for an ok quality, 600 for exelent, 1200 for perfect and more than that is probably too much.
import dataframe_image as dfi
dfi.export(df, "df.png", dpi = 600)

Choropleth map is not showing color variation in the output

I am not getting any color variation in my output map even after the choropleth is linked with the geo_data and the data frame is linked with data parameter in the choropleth method.
I have provided the "key_on" parameter rightly and the "columns" parameter rightly. I have removed all the NULL values from the Dataframe.
import pandas as pd
from pandas import read_csv
import folium
import os
import webbrowser
crimes = read_csv('Dataframe.csv',error_bad_lines=False)
vis = os.path.join('Community_Areas.geojson')
m = folium.Map(location = [41.878113, -87.629799], zoom_start = 10, tiles = "cartodbpositron")
m.choropleth(geo_data=vis, data = crimes, columns = ['Community Area', 'count'], fill_color = 'YlGn', key_on = 'feature.properties.area_numbe')
folium.LayerControl().add_to(m)
m.save('map.html')
webbrowser.open(filepath)
I expected a colored choropleth map, but the actual output was completely grey. I will add the code, data, output in the link below.
Code link: https://github.com/rahul0070/Stackoverflow_question_data
Data
,Community Area,count
0,25.0,92679
1,8.0,48751
2,43.0,47731
3,23.0,45943
4,29.0,44819
5,28.0,42243
6,71.0,40624
7,67.0,40157
8,24.0,39680
9,32.0,38513
10,49.0,37227
11,68.0,37023
12,69.0,35874
13,66.0,33877
14,44.0,32256
15,6.0,31043
16,26.0,30565
17,27.0,28113
18,61.0,27362
19,22.0,27329
20,46.0,26897
21,19.0,26198
22,30.0,24362
23,53.0,22645
24,42.0,21284
25,7.0,21047
26,1.0,19932
27,3.0,19799
28,15.0,17820
29,38.0,17660
30,2.0,17213
31,73.0,17071
32,16.0,15926
33,40.0,14943
34,58.0,14143
35,31.0,13934
36,63.0,13203
37,70.0,12980
38,35.0,12965
39,14.0,12714
40,77.0,12612
41,21.0,12587
42,75.0,11156
43,65.0,10812
44,51.0,10348
45,56.0,10176
46,4.0,9984
47,33.0,8987
48,60.0,8982
49,76.0,8938
50,20.0,8922
51,17.0,8536
52,41.0,8076
53,48.0,7823
54,5.0,7761
55,45.0,7485
56,39.0,7466
57,52.0,7150
58,54.0,6777
59,10.0,6372
60,11.0,6072
61,34.0,6053
62,62.0,5736
63,50.0,5730
64,59.0,5695
65,57.0,5244
66,64.0,5194
67,72.0,4962
68,37.0,4908
69,13.0,4570
70,36.0,3359
71,74.0,3145
72,55.0,3109
73,18.0,3109
74,12.0,2454
75,47.0,2144
76,9.0,1386
After reading the data what I found was you did not convert the data type of Community Area column to string type as the geojson file contains the key in the form of string. So the data type of both the key and column should match.
Here is a full fledged solution to your answer
# importing libraries
import pandas as pd
from pandas import read_csv
import folium
import os
import webbrowser
# read the data
crimes = read_csv('Dataframe.csv',error_bad_lines=False)
# convert float to int then to string
crimes['Community Area'] = crimes['Community Area'].astype('int').astype('str')
# choropleth map
vis = 'Community_Areas.geojson'
m = folium.Map(location = [41.878113, -87.629799], zoom_start = 10, tiles = "cartodbpositron")
m.choropleth(geo_data=vis, data = crimes, columns = ['Community Area', 'count'], fill_color = 'YlGn', key_on = 'feature.properties.area_num_1')
folium.LayerControl().add_to(m)
m.save('map.html')
webbrowser.open('map.html')
Please comment if you find difficulty in understanding any part of code. For the key_on parameter you can also try feature.properties.area_numbe

Axis text orientation on openpyxl chart

I'm generating a ScatterChart with pyopenxl from a pandas dataframe.
I am trying to change the rotation of the text in the X axis to 270º but I cannot found documentation about how to do it.
This is the code to generate the chart.
import numpy as np
from openpyxl.chart import ScatterChart, Reference, Series
from openpyxl.chart.axis import DateAxis
import pandas as pd
def generate_chart_proyeccion(writer_sheet, col_to_graph, start_row, end_row, title):
"""
Construct a new chart object
:param writer_sheet: Worksheet were is data located
:param col_to_graph: Column of data to be plotted
:param start_row: Row where data starts
:param end_row: Row where data ends
:param title: Chart title
:return: returns a chart object
"""
chart = ScatterChart()
chart.title = title
chart.x_axis.number_format = 'd-mmm HH:MM'
chart.x_axis.majorTimeUnit = "days"
chart.x_axis.title = "Date"
chart.y_axis.title = "Value"
chart.legend.position = "b"
data = Reference(writer_sheet, min_col=col_to_graph, max_col=col_to_graph, min_row=start_row, max_row=end_row)
data_dates = Reference(writer_sheet, min_col=1, max_col=1, min_row=start_row, max_row=end_row) # Corresponde a la columna con la fecha
serie = Series(data, data_dates, title_from_data=True)
chart.series.append(serie)
return chart
# Write data to excel
writer = pd.ExcelWriter("file.xlsx", engine='openpyxl')
df = pd.DataFrame(np.random.randn(10,1), columns=['Value'], index=pd.date_range('20130101',periods=10,freq='T'))
start_row = 1 # Row to start writing df in excel
df.to_excel(writer, sheet_name="Sheet1", startrow=start_row)
end_row = start_row + len(df) # End of the data
chart = generate_chart_proyeccion(writer.sheets["Sheet1"], 2, start_row, end_row, "A title")
# Añado gráfico a excel
writer.sheets["Sheet1"].add_chart(chart, "C2")
writer.save()
This is the output chart that I got.
This is the output chart that I want.
Thanks!
This is unfortunately nothing like as simple as it should be because in the specification this is one of the areas where the schema changes from SpreadsheetDrawingML to DrawingML, which is far more abstract. The best thing to do is actually create two sample files and compare them. In this case this difference is in rot or rotation attribute of the txPr or textProperties of the axis. This is covered in § 21.1.2.1.1 of the OOXML specification.
The following code should work, but might require you to create a TextProperties object:
chart.x_axis.textProperties.bodyProperties.rot = -5400000
I had the same question - this SO post by #oldhumble solved it for me - please see Rotate the axis of an excel chart using openpyxl

Choosing the correct values in excel in Python

General Overview:
I am creating a graph of a large data set, however i have created a sample text document so that it is easier to overcome the problems.
The Data is from an excel document that will be saved as a CSV.
Problem:
I am able to compile the data a it will graph (see below) However how i pull the data will not work for all of the different excel sheet i am going to pull off of.
More Detail of problem:
The Y-Values (Labeled 'Value' and 'Value1') are being pulled for the excel sheet from the numbers 26 and 31 (See picture and Code).
This is a problem because the Values 26 and 31 will not be the same for each graph.
Lets take a look for this to make more sense.
Here is my code
import pandas as pd
import matplotlib.pyplot as plt
pd.read_csv('CSV_GM_NB_Test.csv').T.to_csv('GM_NB_Transpose_Test.csv,header=False)
df = pd.read_csv('GM_NB_Transpose_Test.csv', skiprows = 2)
DID = df['SN']
Value = df['26']
Value1 = df['31']
x= (DID[16:25])
y= (Value[16:25])
y1= (Value1[16:25])
"""
print(x,y)
print(x,y1)
"""
plt.plot(x.astype(int), y.astype(int))
plt.plot(x.astype(int), y1.astype(int))
plt.show()
Output:
Data Set:
Below in the comments you will find the 0bin to my Data Set this is because i do not have enough reputation to post two links.
As you can see from the Data Set
X- DID = Blue
Y-Value = Green
Y-Value1 = Grey
Troublesome Values = Red
The problem again is that the data for the Y-Values are pulled from Row 10&11 from values 26,31 under SN
Let me know if more information is needed.
Thank you
Not sure why you are creating the transposed CSV version. It is also possible to work directly from your original data. For example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('CSV_GM_NB_Test.csv', skiprows=8)
data = df.ix[:,19:].T
data.columns = df['SN']
data.plot()
plt.show()
This would give you:
You can use pandas.DataFrame.ix() to give you a sliced version of your data using integer positions. The [:,19:] says to give you columns 19 onwards. The final .T transposes it. You can then apply the values for the SN column as column headings using .columns to specify the names.

Pandas GroupBy object is not 'serializable' by Plot.ly

I'm trying to create a boxplot using Plotly and I get an error when attempting to use a Pandas DataFrame that's been grouped. Some initial digging produced this chunk of code to convert Pandas to Plotly interface:
def df_to_iplot(df):
'''
Coverting a Pandas Data Frame to Plotly interface
'''
x = df.index.values
lines={}
for key in df:
lines[key]={}
lines[key]["x"]=x
lines[key]["y"]=df[key].values
lines[key]["name"]=key
#Appending all lines
lines_plotly=[lines[key] for key in df]
return lines_plotly
Are there alternatives to this method of converting DataFrame's to a Plotly-compatible series? The above code is for line graphs, but I'd like to iterate over my dimensions to produce a boxplot for each group in my DataFrame. Here is the error message I'm getting:
"TypeError: pandas.core.groupby.SeriesGroupBy object is not JSON serializable"
Here is an example from the Plotly website: https://plot.ly/python/box-plots
import plotly.plotly as py
from plotly.graph_objs import *
py.sign_in("xxxx", "xxxxxxxxxx")
import numpy as np
y0 = np.random.randn(50)
y1 = np.random.randn(50)+1
trace0 = Box(
y=y0
)
trace1 = Box(
y=y1
)
data = Data([trace0, trace1])
unique_url = py.plot(data, filename = 'basic-box-plot')
If I understand right, you want something like this:
data = Data([Box(y=v.values) for k, v in g])
(where g is your grouped object). Then you can use py.plot on that.
Like I said in the comments, I know nothing about plotly; I'm just going based off your example. We'll see if anyone who knows more about plotly replies. Failing that, it would be helpful if you could explain in your question what format you want the data in (i.e., figure out what format plotly wants).

Categories

Resources