Creating chord diagram in Python - python

I want to create a Chord diagram for the following dataset where I have the first two columns as physical locations and a third column showing how many people visited both.
Place1 Place2 Count
US UK 200
FR US 450
UK US 200
NL FR 150
IT FR 500
I tried using Holoviews but couldn't make it work:
nodes = hv.Dataset(df, 'Place1', 'Place2')
chord = hv.Chord((df, nodes), ['Place1', 'Place2'], ['Count'])
graph = chord.select(selection_mode='nodes')
But I get the following error: DataError: None of the available storage backends were able to support the supplied data format.
How can I use this dataframe to create a Chord diagram?

A possible solution is the following. Keep in mind that the data you shared is quite small, so the resulting chord diagram is not very attractive.
import holoviews as hv
# df is the DataFrame from the question (columns Place1, Place2, Count)
chords = df.groupby(by=["Place1", "Place2"]).sum()[["Count"]].reset_index()
chords = chords.sort_values(by="Count", ascending=False)
CChord = hv.Chord(chords)
print(CChord)
hv.extension("bokeh")
CChord
The last part, hv.extension("bokeh"), is essential for the visualization. You could also add labels, starting from something like this:
cities = list(set(chords["Place1"].unique().tolist() + chords["Place2"].unique().tolist()))
cities_dataset = hv.Dataset(pd.DataFrame(cities, columns=["City"]))

The D3Blocks library can help create Chord charts and makes it easy to adjust the colors, weights, opacity, and font size. Let me illustrate it for your case:
Create your dataset:
import pandas as pd
import numpy as np
source=['US','FR','UK','NL','IT']
target=['UK','US','US','FR','FR']
weights=[200,450,200,150,500]
df = pd.DataFrame(data=np.c_[source, target, weights], columns=['source','target','weight'])
Now we can create the Chord chart:
# Install first (run in a shell, not in Python): pip install d3blocks
# Import library
from d3blocks import D3Blocks
# Initialize
d3 = D3Blocks(frame=False)
d3.chord(df, color='source', opacity='source', cmap='Set2')
We can also make adjustments:
# Edit any of the properties you want in the dataframe:
d3.node_properties
d3.node_properties.get('NL')['color']='#000000'
# {'US': {'id': 0, 'label': 'US', 'color': '#1f77b4', 'opacity': 0.8},
# 'UK': {'id': 1, 'label': 'UK', 'color': '#98df8a', 'opacity': 0.8},
# 'FR': {'id': 2, 'label': 'FR', 'color': '#8c564b', 'opacity': 0.8},
# 'NL': {'id': 3, 'label': 'NL', 'color': '#000000', 'opacity': 0.8},
# 'IT': {'id': 4, 'label': 'IT', 'color': '#9edae5', 'opacity': 0.8}}
d3.edge_properties
d3.edge_properties[('FR', 'US')]['color']='#000000'
# {('FR', 'US'): {'source': 'FR',
# 'target': 'US',
# 'weight': 450.0,
# 'opacity': 0.8,
# 'color': '#8c564b'},
# ('IT', 'FR'): {'source': 'IT',
# 'target': 'FR',
# 'weight': 500.0,
# 'opacity': 0.8,
# ...
# ...
# Plot again
d3.show()

Related

Rather than plotting directly, need to plot a smooth line chart in Python

I have three DataFrames for three machines (Machine1/Machine2/Machine3), each with three columns: Day-Shift, Production and Quality.
sample df:
Day-Shift Production Quality
Day 11-01 20 A
Night 11-01 45 A
Day 11-02 65 A
Night 11-02 12 B
Day 11-03 97 B
my code:
import numpy as np
import pandas as pd
from plotly.offline import iplot
import plotly.graph_objects as go
# Machine1: Create numpy arrays of values for the given quality.
b1 = np.where(df1['Quality'] == 'A', df1['Production'], None)
# Machine2: Same as above.
b2 = np.where(df2['Quality'] == 'A', df2['Production'], None)
# Machine3: Same as above.
b3 = np.where(df3['Quality'] == 'A', df3['Production'], None)
# Setup.
t = []
line = ['solid']
Quality = ['A']
# One trace per machine; each trace gets its own name so the legend is unambiguous.
t.append({'x': df1['Day-Shift'],
          'y': b1,
          'name': 'Machine1',
          'line': {'color': 'red',
                   'dash': line[0]}})
t.append({'x': df2['Day-Shift'],
          'y': b2,
          'name': 'Machine2',
          'line': {'color': 'blue',
                   'dash': line[0]}})
t.append({'x': df3['Day-Shift'],
          'y': b3,
          'name': 'Machine3',
          'line': {'color': 'yellow',
                   'dash': line[0]}})
# Plot the graph.
layout = go.Layout(
title='Production meterage of Machine1/Machine2/Machine3 for Quality A',
template='plotly_dark',
xaxis=dict(
autorange=True
),
yaxis=dict(
autorange=True
)
)
fig = go.Figure(data=t, layout=layout)
iplot(fig)
Chart I got:
I created one line chart for all three machines, but it looks messy and needs smoothing. I tried gaussian_filter1d, but it did not work for me.
I think the best way of representing your data is with a histogram. I don't know much about the plotly offline module, but you can do it (easily) with matplotlib.
Here is some documentation from matplotlib
https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist.html
and an example:
https://matplotlib.org/3.1.1/gallery/statistics/hist.html
and an example with multiple datasets in one chart:
https://matplotlib.org/3.1.1/gallery/statistics/histogram_multihist.html
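If a smoothed line is still what you are after, here is a minimal sketch. It assumes (this is a guess about the cause, not something confirmed in the question) that gaussian_filter1d fails because np.where(..., None) produces an object array containing None. The helper name moving_average and the window size are made up for illustration; the sample values are the Quality A rows from the sample df.

```python
def moving_average(values, window=3):
    """Centered moving average; None entries are skipped and left as None."""
    half = window // 2
    smoothed = []
    for i, value in enumerate(values):
        if value is None:
            # Keep gaps as gaps so other qualities are not drawn.
            smoothed.append(None)
            continue
        # Average the non-missing neighbours inside the window.
        neighbours = [v for v in values[max(0, i - half): i + half + 1] if v is not None]
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


# Production for Quality 'A' in the sample df; the 'B' rows are masked as None.
production_a = [20, 45, 65, None, None]
smoothed_a = moving_average(production_a)
print(smoothed_a)
```

You could then pass moving_average(list(b1)) as 'y' in the first trace. Plotly's scatter traces also accept 'shape': 'spline' inside the 'line' dict, which draws curved segments without changing the data.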

Plotly: How to inspect and make changes to a plotly figure?

Related questions have already been asked before, e.g.
How can I search for the options for a particular property of a plotly figure?
Plotly: How to inspect the basic figure structure (version 4)
But the answers to these questions have been limited by the fact that not all parameters have been available through Python, meaning that the real answers were buried somewhere in JavaScript. But for newer versions of plotly, how can you inspect and edit a plotly figure? How could you, for example, find out what the background color of a figure is? And then change it? From the second link above you can see that fig.show and print(fig) will reveal some details about the figure structure. But certainly not all of it. The code snippet below will produce the following plot:
Plot:
Code:
import plotly.graph_objects as go
import plotly.express as px
df = px.data.gapminder().query("country=='Canada'")
fig = px.line(df, x="year", y="lifeExp", title='Life expectancy in Canada')
fig.show()
Running fig.show will now partly reveal the structure of the figure in the form of a dict:
<bound method BaseFigure.show of Figure({
'data': [{'hovertemplate': 'year=%{x}<br>lifeExp=%{y}<extra></extra>',
'legendgroup': '',
'line': {'color': '#636efa', 'dash': 'solid'},
'mode': 'lines',
'name': '',
'orientation': 'v',
'showlegend': False,
'type': 'scatter',
'x': array([1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002, 2007],
dtype=int64),
'xaxis': 'x',
'y': array([68.75 , 69.96 , 71.3 , 72.13 , 72.88 , 74.21 , 75.76 , 76.86 , 77.95 ,
78.61 , 79.77 , 80.653]),
'yaxis': 'y'}],
'layout': {'legend': {'tracegroupgap': 0},
'template': '...',
'title': {'text': 'Life expectancy in Canada'},
'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'year'}},
'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'lifeExp'}}}
})>
But as you can see for yourself, there are a lot of details missing. So how can you perform a more complete figure introspection?
As of version 4.10, the plotly developers have introduced the awesome fig.full_figure_for_development() function, which is described in the plotly documentation. There you'll see that:
fig.full_figure_for_development() function will return a new go.Figure
object, prepopulated with the same values you provided, as well as all
the default values computed by Plotly.js, to allow you to learn more
about what attributes control every detail of your figure and how you
can customize them. This function is named “for development” because
it’s not necessary to use it to produce figures, but it can be really
handy to explore figures while you’re figuring out how to build them.
So building on the example in the question, calling fig.full_figure_for_development() on that figure and printing the result will produce an output of about 180 lines containing, among a plethora of other details, this part about the figure layout:
'margin': {'autoexpand': True, 'b': 80, 'l': 80, 'pad': 0, 'r': 80, 't': 100},
'modebar': {'activecolor': 'rgba(68, 68, 68, 0.7)',
'bgcolor': 'rgba(255, 255, 255, 0.5)',
'color': 'rgba(68, 68, 68, 0.3)',
'orientation': 'h'},
'newshape': {'drawdirection': 'diagonal',
'fillcolor': 'rgba(0,0,0,0)',
'fillrule': 'evenodd',
'layer': 'above',
'line': {'color': '#444', 'dash': 'solid', 'width': 4},
'opacity': 1},
'paper_bgcolor': 'white',
'plot_bgcolor': '#E5ECF6',
'separators': '.,',
'showlegend': False,
'spikedistance': 20,
There you can also see the background color of the plot: 'plot_bgcolor': '#E5ECF6'. You probably know that you can set the background color to, for example, 'grey' using fig.update_layout(plot_bgcolor='grey'). But now you know how to get it as well:
# In:
fig.layout.plot_bgcolor
# Out:
'#E5ECF6'
And in knowing how to do this, you know how to get and set almost any attribute of a plotly figure. And it doesn't matter if you've built the figure using plotly.graph_objects or plotly.express.

How to make a layered bar chart using matplotlib

For this question, I was provided the following information.
Data in code form:
order_data = {'Alice': {5: 'chocolate'},
'Bob': {9: 'vanilla'},
'Clair': {7: 'strawberry'},
'Drake': {10: 'chocolate' },
'Emma': {82: 'vanilla'},
'Alice': {70: 'strawberry'},
'Emma': {42: 'chocolate'},
'Ginger': {64: 'strawberry'} }
I was asked to make a bar graph detailing this data. The bar graph and the code used to make it using Altair is provided below.
import altair
data = altair.Data(customer=['Alice', 'Bob', 'Claire', 'Drake', 'Emma','Alice', 'Emma', 'Ginger'],
cakes=[5,9,7,10,82,70,42,64],
flavor=['chocolate', 'vanilla', 'strawberry','chocolate','vanilla','strawberry','chocolate','strawberry'])
chart = altair.Chart(data)
mark = chart.mark_bar()
enc = mark.encode(x='customer:N',y='cakes',color='flavor:N')
enc.display()
Graph:
My question is: What is the best way to go about constructing this graph using matplotlib?
I know this isn't an unusual graph per se, but it is unusual in the sense that I have not found any examples of this kind of graph. Thank you!
This has already been answered, but you can also plot it with pandas:
import pandas as pd
data = pd.DataFrame({'customer':['Alice', 'Bob', 'Claire', 'Drake', 'Emma','Alice', 'Emma', 'Ginger'],
'cakes':[5,9,7,10,82,70,42,64],
'flavor':['chocolate', 'vanilla', 'strawberry','chocolate','vanilla','strawberry','chocolate','strawberry']})
df = pd.DataFrame(data)
df = df.pivot(index='customer',columns='flavor', values='cakes').fillna(0)
df.plot(kind='bar', stacked=True)
Here is a reproduction of the Altair graph with Matplotlib. Note that I had to modify the order_data dictionary because a dict cannot hold the same key more than once (a repeated key overwrites the earlier entry), so I grouped the orders by customer. Also note that some optional styling statements are included to mimic the style of Altair.
The trick is to use the bottom keyword argument of the ax.bar function. The following image is obtained from the code below.
import matplotlib.pyplot as plt
# data
order_data = {
"Alice": {"chocolate": 5, "strawberry": 70},
"Bob": {"vanilla": 9},
"Clair": {"strawberry": 7},
"Drake": {"chocolate": 10},
"Emma": {"chocolate": 42, "vanilla": 82},
"Ginger": {"strawberry": 64},
}
# init figure
fig, ax = plt.subplots(1, figsize=(2.5, 4))
colors = {"chocolate": "C0", "strawberry": "C1", "vanilla": "C2"}
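# show a bar for each person, stacking each order on top of the previous ones
for person_id, (name, orders) in enumerate(order_data.items()):
    quantities = 0
    for order, quantity in orders.items():
        ax.bar(person_id, quantity, bottom=quantities, color=colors[order])
        quantities += quantity
# add legend with one explicit handle per flavor, so it does not depend on drawing order
handles = [plt.Rectangle((0, 0), 1, 1, color=color) for color in colors.values()]
ax.legend(handles, list(colors), bbox_to_anchor=(2.0, 1.0))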
# remove top/right axes for style match
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.grid(axis="y", zorder=-1)
# ticks
ax.set_xticks(range(len(order_data)))
ax.set_xticklabels([name for name in order_data], rotation="vertical")
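The regrouping of order_data mentioned above was done by hand, but it could also be done in code. This is a minimal sketch that assumes the orders are available as flat (customer, cakes, flavor) records, mirroring the three lists passed to Altair in the question; the names records and grouped are only for illustration.

```python
from collections import defaultdict

# Flat records (customer, cakes, flavor), as in the lists used for the Altair chart.
records = [
    ("Alice", 5, "chocolate"), ("Bob", 9, "vanilla"), ("Claire", 7, "strawberry"),
    ("Drake", 10, "chocolate"), ("Emma", 82, "vanilla"), ("Alice", 70, "strawberry"),
    ("Emma", 42, "chocolate"), ("Ginger", 64, "strawberry"),
]

# Group by customer, then by flavor, summing quantities if a pair repeats.
grouped = defaultdict(dict)
for customer, cakes, flavor in records:
    grouped[customer][flavor] = grouped[customer].get(flavor, 0) + cakes

order_data = dict(grouped)
print(order_data)
```

The resulting order_data has the same nested shape as the dictionary used in the plotting code, so it can be dropped in directly.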

Plotly: How to inspect the basic figure structure (version 4)

For older versions of plotly, for example in Jupyterlab, you could simply run figure to inspect the basics of your figure like this:
Output:
{'data': [{'marker': {'color': 'red', 'size': '10', 'symbol': 104},
'mode': 'markers+lines',
'name': '1st Trace',
'text': ['one', 'two', 'three'],
'type': 'scatter',
'x': [1, 2, 3],
'y': [4, 5, 6]}],
'layout': {'title': 'First Plot',
'xaxis': {'title': 'x1'},
'yaxis': {'title': 'x2'}}}
Code for versions prior to V4:
import plotly.plotly as py
import plotly.graph_objs as go
trace1 = go.Scatter(x=[1,2,3], y=[4,5,6], marker={'color': 'red', 'symbol': 104, 'size': "10"},
mode="markers+lines", text=["one","two","three"], name='1st Trace')
data=go.Data([trace1])
layout=go.Layout(title="First Plot", xaxis={'title':'x1'}, yaxis={'title':'x2'})
figure=go.Figure(data=data,layout=layout)
#py.iplot(figure, filename='pyguide_1')
figure
If you do the same thing now with a similar setup, the same approach will not produce the figure basics, but rather plot the figure itself:
Code:
import pandas as pd
import plotly.graph_objects as go
trace1 = go.Scatter(x=[1,2,3], y=[4,5,6], marker={'color': 'red', 'symbol': 104},
mode="markers+lines", text=["one","two","three"], name='1st Trace')
figure = go.Figure(data=trace1)
figure
Output:
In many ways this is similar to how you would, for example, build and plot a figure with ggplot in R. And since plotly is available for both R and Python, I think this makes sense after all. But I'd really like to know how to access that basic setup.
What I've tried:
I think this change is due to the fact that figure is now a plotly.graph_objs._figure.Figure and used to be a dictionary(?). So figure['data'] and figure['layout'] are still dicts with necessary and interesting content:
Output from figure['data']
(Scatter({
'marker': {'color': 'red', 'symbol': 104},
'mode': 'markers+lines',
'name': '1st Trace',
'text': [one, two, three],
'x': [1, 2, 3],
'y': [4, 5, 6]
}),)
Output from figure['layout']
Layout({
'template': '...'
})
And of course options such as help(figure) and dir(figure) are helpful, but they produce a very different output.
I just found out that 'forgetting' the brackets for figure.show() will give me exactly what I'm looking for. So with a setup similar to the code in the question and with plotly V4, simply running figure.show will give you this:
Output:
<bound method BaseFigure.show of Figure({
'data': [{'marker': {'color': 'red', 'symbol': 104},
'mode': 'markers+lines',
'name': '1st Trace',
'text': [one, two, three],
'type': 'scatter',
'x': [1, 2, 3],
'y': [4, 5, 6]}],
'layout': {'template': '...'}
})>
Code:
import pandas as pd
import plotly.graph_objects as go
trace1 = go.Scatter(x=[1,2,3], y=[4,5,6], marker={'color': 'red', 'symbol': 104},
mode="markers+lines", text=["one","two","three"], name='1st Trace')
figure = go.Figure(data=trace1)
figure.show

Plot.ly Pandas scatter plot text from multiple columns

I have a chart that I'm rendering using Plot.ly from a Pandas DataFrame:
import pandas as pd
import numpy as np
import string
df1 = pd.DataFrame({'x':np.random.rand(10), 'y':np.random.rand(10),
'd':list(range(10)), 'e':list(string.ascii_lowercase[:10])})
df2 = pd.DataFrame({'x':np.random.rand(10), 'y':np.random.rand(10),
'd':list(range(10)), 'e':list(string.ascii_lowercase[:10])})
fig = {
'data': [
{'x': df1.x,
'y': df1.y,
'text': df1.d,
'mode': 'markers',
'name': 'Example 1',
},
{'x': df2.x,
'y': df2.y,
'text': df2.d,
'mode': 'markers',
'name': 'Example 2',
}
]
}
py.iplot(fig, filename='couldbeanything')
And this draws a nice chart with the dataframe's column 'd' used for the data labels.
But actually I want to use a composite of two columns for the data labels (let's say d and e). Is this possible? I've tried passing a list or a dict and neither appear to work.
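One way this might be done: the text attribute of a scatter trace accepts a sequence of strings, so the two columns can be combined element by element before being passed in. The sketch below uses plain lists standing in for columns d and e; the helper name combine_labels and the separator are made up for illustration.

```python
import string


def combine_labels(first, second, sep=" | "):
    """Join two equally long sequences element-wise into label strings."""
    return [f"{a}{sep}{b}" for a, b in zip(first, second)]


# Stand-ins for the DataFrame columns 'd' and 'e' from the question.
d = list(range(10))
e = list(string.ascii_lowercase[:10])

labels = combine_labels(d, e)
print(labels[:3])  # ['0 | a', '1 | b', '2 | c']
```

In the figure dict this would become 'text': combine_labels(df1.d, df1.e). With pandas, df1.d.astype(str) + ' | ' + df1.e should give an equivalent Series.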
