Using plotly, I've learned to plot maps that show things like 'salary per country' or 'number of XX per country'.
Now I'd like to plot the following: say I'm interested in three quantities A, B and C. For each country, I would like to draw small circles whose size grows with the value. For example:
USA: A=10, B=12, C=3. I would have 3 circles in the US zone, with circle(B) > circle(A) > circle(C).
My dataframe has 4 columns: columns=['Country','quantity_A','quantity_B','quantity_C']
How can I plot a map that looks like what I described? I'm willing to use any library that allows that (the simpler the better, of course).
Thanks !
Using matplotlib you can draw a scatter plot as follows, where the size of the scatter point is given by the quantity in the respective column.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
import pandas as pd
countries = ["USA", "Israel", "Tibet"]
columns=['quantity_A','quantity_B','quantity_C']
df = pd.DataFrame(np.random.rand(len(countries),len(columns))+.2,
                  columns=columns, index=countries)
fig, ax=plt.subplots()
for i,c in enumerate(df.columns):
    ax.scatter(df.index, np.ones(len(df))*i, s = df[c]*200, c=range(len(df)), cmap="tab10")
ax.set_yticks(range(len(df.columns)))
ax.set_yticklabels(df.columns)
ax.margins(0.5)
plt.show()
I have a set of 125 x and y data (X-ray absorption spectroscopy data, i.e. energy vs intensity) and I would like to reproduce a plot similar to this one: [contour plot of XANES spectra](https://i.stack.imgur.com/0Kymp.png)
The spectra were taken as a function of time, and my goal is to plot them in a 2D contour plot with the energy as x and the time (or maybe just the index of the spectrum) as y. I would like the z axis to represent the intensity of the spectra with different colors, so that changes in time are easily seen.
My data currently look like this when I plot them all in the same graph with a viridis color map: [image: line plot of the spectra]
I have tried to work with the contour function of matplotlib and got this result:
[image: attempt at a contour plot]
I used the following code :
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_excel('data.xlsx')
energy = df['energy']
df.index = energy
df = df.iloc[:,2:]
df = df.transpose()
X = energy
Y = range(len(df.index))
fig, ax = plt.subplots()
ax.contourf(X,Y,df)
plt.show()
If you have any ideas, I would be grateful. I am in fact not sure that the contour function is the most appropriate for what I want, and I am open to any suggestion.
Thanks,
Yoloco
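One possible direction, sketched on synthetic data because data.xlsx is not available: matplotlib's pcolormesh draws one colored cell per (energy, spectrum index) pair, which tends to read more cleanly than contourf for stacked spectra. The energy range, the sigmoid edge shape and the drift below are all invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the 125 spectra (all numbers here are invented).
energy = np.linspace(7100, 7200, 200)   # hypothetical energy axis
index = np.arange(125)                  # spectrum index, used as a proxy for time
edge = 7140 + 10 * index / len(index)   # absorption edge drifting over time
# One row per spectrum, one column per energy point.
intensity = 1 / (1 + np.exp(-(energy[None, :] - edge[:, None]) / 3))

fig, ax = plt.subplots()
mesh = ax.pcolormesh(energy, index, intensity, shading="auto", cmap="viridis")
fig.colorbar(mesh, ax=ax, label="Intensity")
ax.set_xlabel("Energy")
ax.set_ylabel("Spectrum index")
```

With the dataframe from the question, `intensity` would be the transposed table (rows = spectra, columns = energies); call plt.show() to display the figure.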
I have collected data on an experiment, where I am looking at property A over time, and then making a histogram of property A at a given condition B. Now the deal is that A is collected over an array of B values.
So I have a histogram that corresponds to B=B1, B=B2, ..., B=Bn. What I want to do, is construct a 3D plot, with the z axis being for property B, and the x axis being property A, and y axis being counts.
As an example, I want the plot to look like this (B corresponds to Temperature, A corresponds to Rg):
How do I pull this off in Python?
The Python library joypy can plot graphs like this, but I'm not sure whether you also want these molecules in your graph.
Here is an example:
import joypy
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import cm
%matplotlib inline
temp = pd.read_csv("data/daily_temp.csv",comment="%")
labels=[y if y%10==0 else None for y in list(temp.Year.unique())]
fig, axes = joypy.joyplot(temp, by="Year", column="Anomaly", labels=labels, range_style='own',
                          grid="y", linewidth=1, legend=False, figsize=(6,5),
                          title="Global daily temperature 1880-2014 \n(°C above 1950-80 average)",
                          colormap=cm.autumn_r)
Output:
See this thread as reference.
I am trying to plot FRED's economic data using matplotlib/seaborn, but the values are floating-point numbers and matplotlib, instead of using a numeric range, quite literally uses all the values as distinct y-axis points. I need to plot this in a way where the changes are apparent. I tried to specify the y-axis range using yticks, but it still does not work. Here's my code:
mort30=pd.read_csv('Dataset/MORTGAGE30US.csv')
mort30['DATE']= pd.DateTimeIndex(mort30['DATE']).years # to get only year values on the x-axis
sns.lineplot(data=mort30, x='DATE', y='MORTGAGE30US')
plt.yticks(np.arange(1,11,step=1))
Any other ideas that could work? Here is the dataset link for the graph (P.S. go to edit graph and change frequency to Annual for simplicity)
Your y-data are objects, not numerical values. Take a look at the CSV: the last line contains no number.
mort30['MORTGAGE30US']
47 4.5446153846153846
48 3.9357692307692308
49 3.1116981132075472
50 .
Name: MORTGAGE30US, dtype: object
Next time, please add a runnable example. The code you showed does not run; it should be:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
mort30=pd.read_csv('MORTGAGE30US.csv')
mort30['DATE']= pd.DatetimeIndex(mort30['DATE']).year # to get only year values on the x-axis
sns.lineplot(data=mort30, x='DATE', y='MORTGAGE30US')
plt.yticks(np.arange(1,11,step=1))
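To get rid of the object dtype itself, one option is to coerce the column to numbers and drop the row that only contains ".". A minimal sketch, using an in-memory stand-in for MORTGAGE30US.csv: the three values are the ones printed above, while the dates are placeholders.

```python
import io
import pandas as pd

# Stand-in for MORTGAGE30US.csv; the dates are placeholders, the last row mimics the "." entry.
csv_text = """DATE,MORTGAGE30US
2018-01-01,4.5446153846153846
2019-01-01,3.9357692307692308
2020-01-01,3.1116981132075472
2021-01-01,.
"""
mort30 = pd.read_csv(io.StringIO(csv_text))

# Keep only the year for the x-axis, as in the original code.
mort30["DATE"] = pd.DatetimeIndex(mort30["DATE"]).year
# Non-numeric entries such as "." become NaN instead of forcing an object column.
mort30["MORTGAGE30US"] = pd.to_numeric(mort30["MORTGAGE30US"], errors="coerce")
# Drop the rows that could not be converted.
mort30 = mort30.dropna(subset=["MORTGAGE30US"]).reset_index(drop=True)

print(mort30.dtypes)
```

After this, the column is float64 and seaborn/matplotlib treat the y-axis as a continuous range.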
I am new to data visualization, so please bear with me.
I am trying to create a data plot that describes various different attributes on a data set on blockbuster movies. The x-axis will be year of the movie and the y-axis will be worldwide gross. Now, some movies have made upwards of a billion in this category, and it seems that my y axis is overwhelmed as it completely blocks out the numbers and becomes illegible. Here is what I have thus far:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('blockbusters.csv')
fig, ax = plt.subplots()
ax.set_title('Top Grossing Films')
ax.set_xlabel('Year')
ax.set_ylabel('Worldwide Grossing')
x = df['year'] #xaxis
y = df['worldwide_gross'] #yaxis
plt.show()
Any tips on how to scale this down? Ideally it could be presented on a scale of 10. Thanks in advance!
You could try logarithmic scaling:
ax.set_yscale('log')
You might want to manually set the ticks on the y-axis using
ax.set_yticks([list of values for which you want to have a tick])
ax.set_yticklabels([list of labels you want on each tick]) # optional
Another way to approach this might be to rank the movies (which gross is the highest, second highest, ...), i.e. on the y axis you would plot
df['worldwide_gross'].rank()
Edit: as you indicate, one might also check the dtypes to make sure the data is numerical. If not, use .astype(int) or .astype(float) to convert it.
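Put together, the suggestions above could look roughly like this. The four rows are made up as a stand-in for blockbusters.csv, and the gross values are stored as strings on purpose to show the dtype conversion; the "B" (billions) tick format is just one possible choice.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Made-up stand-in for blockbusters.csv (values are illustrative only).
df = pd.DataFrame({
    "year": [2009, 2012, 2015, 2019],
    "worldwide_gross": ["2800000000", "1500000000", "800000000", "2200000000"],
})
# Make sure the column is numeric before plotting.
df["worldwide_gross"] = df["worldwide_gross"].astype(float)

fig, ax = plt.subplots()
ax.set_title("Top Grossing Films")
ax.set_xlabel("Year")
ax.set_ylabel("Worldwide Grossing")
ax.scatter(df["year"], df["worldwide_gross"])
ax.set_yscale("log")
# Show tick labels in billions instead of raw numbers.
ax.yaxis.set_major_formatter(FuncFormatter(lambda v, pos: f"{v / 1e9:g}B"))

# Alternative y values: rank 1 is the highest-grossing film.
ranks = df["worldwide_gross"].rank(ascending=False)
```

Call plt.show() to display the figure; plotting `ranks` instead of the raw gross gives the ranking variant.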
I have a count table as dataframe in Python and I want to plot my distribution as a boxplot. E.g.:
df=pandas.DataFrame({'Quality': [29,30,31,32,33,34,35,36,37,38,39,40], 'Count': [3,38,512,2646,9523,23151,43140,69250,107597,179374,840596,38243]})
I 'solved' it by repeating each quality value by its count, but I don't think that's a good way, and my dataframe is getting very, very big.
In R it's a one-liner:
ggplot(df, aes(x=1,y=Quality,weight=Count)) + geom_boxplot()
This will output: [image: boxplot from R]
My aim is to compare the distribution of different groups and it should look like
Can Python solve it like this too?
What are you trying to look at here? The boxplot code below returns the following figure.
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
# DataFrame.from_items was removed in pandas 1.0; a dict gives the same frame
df=pd.DataFrame({'Quality': [29,30,31,32,33,34,35,36,37,38,39,40], 'Count': [3,38,512,2646,9523,23151,43140,69250,107597,179374,840596,38243]})
plt.figure()
df_box = df.boxplot(column='Quality', by='Count',return_type='axes')
If you want to look at your Quality distribution weighted by Count, you can try plotting a histogram:
plt.figure()
df_hist = plt.hist(df.Quality, bins=10, range=None, density=False, weights=df.Count)  # 'normed' was replaced by 'density' in newer matplotlib
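For a boxplot that uses the counts as weights without repeating the rows, one possible sketch is to compute the weighted quartiles yourself and hand them to matplotlib's Axes.bxp, which draws a box from precomputed statistics. The helper weighted_quantile below is hypothetical (not a library function) and uses a simple "first value whose cumulative count reaches the quantile" rule, so the result may differ slightly from what ggplot computes.

```python
import numpy as np
import matplotlib.pyplot as plt

quality = np.array([29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40])
count = np.array([3, 38, 512, 2646, 9523, 23151, 43140, 69250,
                  107597, 179374, 840596, 38243])

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches the fraction q of the total."""
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)
    return values[np.searchsorted(cum, q * cum[-1])]

q1, med, q3 = (weighted_quantile(quality, count, q) for q in (0.25, 0.5, 0.75))
iqr = q3 - q1
present = quality[count > 0]  # values that actually occur
# Whiskers reach the most extreme values within 1.5 * IQR of the box.
inside = present[(present >= q1 - 1.5 * iqr) & (present <= q3 + 1.5 * iqr)]
stats = {
    "label": "Quality",
    "q1": q1, "med": med, "q3": q3,
    "whislo": inside.min(), "whishi": inside.max(),
    "fliers": present[(present < inside.min()) | (present > inside.max())],
}

fig, ax = plt.subplots()
ax.bxp([stats])  # one dict per box
```

To compare groups, build one such dict per group and pass them all in the list given to ax.bxp; call plt.show() to display the figure.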