How to plot small floating numbers properly - python

How can I plot a set of numbers like the following (the first column is the x-axis, the second column is the y-axis)?
1 3.4335e-14
2 5.8945e-28
3 6.7462e-42
4 5.7908e-56
5 3.9765e-70
6 2.2756e-84
7 1.1162e-98
8 4.7904e-113
9 1.8275e-127
10 6.2749e-142
11 1.9586e-156
12 5.6041e-171
13 1.4801e-185
14 3.6300e-200
15 8.3091e-215
16 1.7831e-229
17 3.6013e-244
18 6.8694e-259
19 1.2414e-273
For now I get:
I can't figure out how to plot this properly, that is, without the flat line from x = 2 to the end and with correct y-axis values. I read the values from the file with:
x_values.append(line.split(' ')[0])
y_values.append(float(line.split(' ')[1]))

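You may want to switch the y-axis to a log scale, e.g.:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
# x and y are the lists read from the file (x_values and y_values in the question)
_, ax = plt.subplots()
plt.plot(x, y)
plt.xticks(x)
plt.yscale("log")
ax.yaxis.set_major_formatter(mtick.FormatStrFormatter('%.2e'))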
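
To see why the linear axis looks flat from x = 2 onward, it can help to look at the base-10 logarithm of the values, which is what a log-scaled axis effectively positions the points by. The sketch below uses only the standard library and the first four rows from the question; the names `raw`, `ratios`, `log_y` and `steps` are made up for illustration.

```python
import math

# First rows of the question's data, as they would appear in the file.
raw = """1 3.4335e-14
2 5.8945e-28
3 6.7462e-42
4 5.7908e-56"""

x_values, y_values = [], []
for line in raw.splitlines():
    x_str, y_str = line.split()
    x_values.append(int(x_str))    # keep x numeric rather than a string
    y_values.append(float(y_str))

# On a linear axis each point is roughly 1e-14 times the previous one,
# so everything after the first point is drawn practically at zero.
ratios = [b / a for a, b in zip(y_values, y_values[1:])]

# On a log axis the position follows log10(y), which drops by about 14
# per step, so the points fall on a nearly straight line.
log_y = [math.log10(y) for y in y_values]
steps = [b - a for a, b in zip(log_y, log_y[1:])]
print(ratios)
print(steps)
```

Converting x to `int` here is a choice made for this sketch; the question keeps x as strings, which matplotlib treats as categories.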

Related

Add x-axis to matplotlib with multiple y-axis line chart

How do I add the x-axis (Month) to a simple Matplotlib plot?
My Dataset:
Month Views CMA30
0 11 24662 24662.000000
1 11 2420 13541.000000
2 11 11318 12800.000000
3 11 8529 11732.250000
4 10 78861 25158.000000
5 10 1281 21178.500000
6 10 22701 21396.000000
7 10 17088 20857.500000
This is my code:
df[['Views', 'CMA30']].plot(label='Views', figsize=(5, 5))
This is giving me Views and CMA30 on the y-axis. How do I add Month(1-12) on the x-axis?
If you want the values averaged per month, try groupby/mean:
df.groupby('Month')[['Views','CMA30']].mean().plot(label='Views', figsize=(5, 5))
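
As a rough check of what the groupby/mean step produces before plotting, here is a small sketch that rebuilds the eight rows from the question by hand (the `monthly` name is just for illustration). After grouping, Month becomes the index, which is what pandas puts on the x-axis when you call `.plot()`.

```python
import pandas as pd

# The sample rows from the question, typed in by hand.
df = pd.DataFrame({
    "Month": [11, 11, 11, 11, 10, 10, 10, 10],
    "Views": [24662, 2420, 11318, 8529, 78861, 1281, 22701, 17088],
    "CMA30": [24662.0, 13541.0, 12800.0, 11732.25,
              25158.0, 21178.5, 21396.0, 20857.5],
})

# One row per month; the Month column becomes the index.
monthly = df.groupby("Month")[["Views", "CMA30"]].mean()
print(monthly)

# Plotting this frame puts Month on the x-axis:
# monthly.plot(figsize=(5, 5))
```

If you do not want to average, `df.plot(x='Month', y=['Views', 'CMA30'])` is another option, though with repeated month values the lines will double back on themselves.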

dates.YearLocator() does not show years

Unfortunately I cannot upload my dataset, but here is what it looks like:
UMTMVS month
DATE
1992-01-01 209438.0 1
1992-02-01 232679.0 2
1992-03-01 249673.0 3
1992-04-01 239666.0 4
1992-05-01 243231.0 5
1992-06-01 262854.0 6
1992-07-01 222832.0 7
1992-08-01 240299.0 8
1992-09-01 260216.0 9
1992-10-01 252272.0 10
1992-11-01 245261.0 11
1992-12-01 245603.0 12
1993-01-01 223258.0 1
1993-02-01 246941.0 2
1993-03-01 264886.0 3
1993-04-01 249181.0 4
1993-05-01 250870.0 5
1993-06-01 271047.0 6
1993-07-01 224077.0 7
1993-08-01 248963.0 8
1993-09-01 269227.0 9
1993-10-01 263075.0 10
1993-11-01 256142.0 11
1993-12-01 252830.0 12
1994-01-01 234097.0 1
1994-02-01 259041.0 2
1994-03-01 277243.0 3
1994-04-01 261755.0 4
1994-05-01 267573.0 5
1994-06-01 287336.0 6
1994-07-01 239931.0 7
1994-08-01 276947.0 8
1994-09-01 291357.0 9
1994-10-01 282489.0 10
1994-11-01 280455.0 11
1994-12-01 279888.0 12
1995-01-01 260175.0 1
1995-02-01 286290.0 2
1995-03-01 303201.0 3
1995-04-01 283129.0 4
1995-05-01 289257.0 5
1995-06-01 310201.0 6
1995-07-01 255163.0 7
1995-08-01 293605.0 8
1995-09-01 313228.0 9
1995-10-01 301301.0 10
1995-11-01 293164.0 11
1995-12-01 290963.0 12
1996-01-01 263041.0 1
1996-02-01 290317.0 2
I want to set a locator for each year and ran the following code
ax = df.UMTMVS.plot(figsize=(12, 5))
ax.xaxis.set_major_locator(dates.YearLocator())
but it simply gives the following figure without any locator at all
Why does the locator fail to point out the years?
Try applying set_major_locator() to the axis before calling df.plot(), like this:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates
# reading your sample data into dataframe
df = pd.read_clipboard()
# dates should be dates (datetime), not strings
df.index = pd.to_datetime(df.index)
fig, ax = plt.subplots(1,1,figsize=(12, 5))
# set locator before df.plot()
ax.xaxis.set_major_locator(dates.YearLocator())
df.UMTMVS.plot(ax=ax)
Result:
A slightly different result can be achieved by changing the last part of the code above to the following:
fig, ax = plt.subplots(1,1,figsize=(12, 5))
ax.plot(df.UMTMVS)
ax.xaxis.set_major_locator(dates.YearLocator())
plt.xlabel('DATE')
plt.show()
Result_alt (note the "padding" and loss of minor ticks):
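
For reference, here is a self-contained sketch of the second variant that avoids pandas plotting entirely, so that matplotlib handles the dates itself. The values are made up (only the date range matches the question), and the explicit `DateFormatter('%Y')` is my addition. A commonly suggested alternative, which I have not verified against this dataset, is to pass `x_compat=True` to the pandas plot call.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib import dates

# Monthly dates from 1992-01 to 1996-02, like the question's index.
months = [dt.date(1992 + m // 12, m % 12 + 1, 1) for m in range(50)]
values = [200000 + 1500 * m for m in range(50)]  # made-up numbers, not the real series

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(months, values)

# One major tick per year, labelled with the four-digit year.
ax.xaxis.set_major_locator(dates.YearLocator())
ax.xaxis.set_major_formatter(dates.DateFormatter("%Y"))

# Years at which the locator places major ticks.
tick_years = [dates.num2date(t).year for t in ax.xaxis.get_majorticklocs()]
print(tick_years)
```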

Is it possible to generate a clock chart using Plotly?

I'm developing a dataviz project and I came across the report generated by Last.FM, in which there is a clock chart to represent the distribution of records by hours.
The chart in question is this:
It is an interactive graph, so I tried to use the Plotly library to try to replicate the chart, but without success.
Is there any way to replicate this in Plotly? Here are the data I need to represent
listeningHour = df.hour.value_counts().rename_axis('hour').reset_index(name='counts')
listeningHour
hour counts
0 17 16874
1 18 16703
2 16 14741
3 19 14525
4 23 14440
5 22 13455
6 20 13119
7 21 12766
8 14 11605
9 13 11575
10 15 11491
11 0 10220
12 12 7793
13 1 6057
14 9 3774
15 11 3476
16 10 1674
17 8 1626
18 2 1519
19 3 588
20 6 500
21 7 163
22 4 157
23 5 26
The closest chart type Plotly provides is a polar bar chart (Barpolar), and I have written code that uses it with your data. As far as I could find at the time of writing, there is no way to place the ticks inside the doughnut. The main points of the code: the bars start at 0:00 and follow the direction of the angular axis; the clock labels are a list of 24 tick texts that are empty strings except for a label every 6 hours; and the angular grid is aligned with the bars.
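import numpy as np
import plotly.graph_objects as go
# df is the hour/counts frame from the question (listeningHour), sorted so that bar i is hour i
df = listeningHour.sort_values('hour')
r = df['counts'].tolist()
# 24 bar centres: 7.5, 22.5, ..., 352.5 degrees
theta = np.arange(7.5, 360, 15)
width = [15]*24
ticktexts = [rf'$\large{i}$' if i % 6 == 0 else '' for i in np.arange(24)]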
fig = go.Figure(go.Barpolar(
    r=r,
    theta=theta,
    width=width,
    marker_color=df['counts'],
    marker_colorscale='Blues',
    marker_line_color="white",
    marker_line_width=2,
    opacity=0.8
))
fig.update_layout(
    template=None,
    polar=dict(
        hole=0.4,
        bgcolor='rgb(223, 223,223)',
        radialaxis=dict(
            showticklabels=False,
            ticks='',
            linewidth=2,
            linecolor='white',
            showgrid=False,
        ),
        angularaxis=dict(
            tickvals=np.arange(0,360,15),
            ticktext=ticktexts,
            showline=True,
            direction='clockwise',
            period=24,
            linecolor='white',
            gridcolor='white',
            showticklabels=True,
            ticks=''
        )
    )
)
fig.show()
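
The angle bookkeeping is the part that is easiest to get wrong, so here it is on its own as a plain-Python sketch (no Plotly needed). It assumes the counts must be ordered by hour 0 to 23 before they are passed as `r`, since the table in the question is sorted by count; `counts_by_hour` and `SLICE` are names I made up.

```python
# Counts keyed by hour, copied from the table in the question.
counts_by_hour = {
    17: 16874, 18: 16703, 16: 14741, 19: 14525, 23: 14440, 22: 13455,
    20: 13119, 21: 12766, 14: 11605, 13: 11575, 15: 11491, 0: 10220,
    12: 7793, 1: 6057, 9: 3774, 11: 3476, 10: 1674, 8: 1626,
    2: 1519, 3: 588, 6: 500, 7: 163, 4: 157, 5: 26,
}

SLICE = 360 / 24  # each hour covers 15 degrees of the circle

# Bar heights in clock order, so bar h really is hour h.
r = [counts_by_hour[h] for h in range(24)]

# Each bar is centred in the middle of its 15-degree slice.
theta = [SLICE * h + SLICE / 2 for h in range(24)]

# Tick marks sit where each hour starts; only every 6th one gets a label.
tickvals = [SLICE * h for h in range(24)]
ticktext = [str(h) if h % 6 == 0 else "" for h in range(24)]

print(theta[:3], ticktext[:7])
```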

Hide lines from a multiple line plot

I have a dataframe with 12 columns and 30 rows (only the first 5 rows are shown here):
0 1 2 3 4 5 6 7 8 9 10 11
0
10 0.420000 0.724000 0.552000 0.316000 0.176000 0.320000 0.228000 0.552000 0.476000 0.468000 0.560000 0.332000
20 0.387097 0.701613 0.516129 0.338710 0.177419 0.346774 0.217742 0.443548 0.483871 0.435484 0.516129 0.330645
30 0.353659 0.731707 0.365854 0.280488 0.158537 0.243902 0.231707 0.451220 0.524390 0.414634 0.451220 0.329268
40 0.377049 0.557377 0.311475 0.213115 0.213115 0.262295 0.262295 0.459016 0.540984 0.475410 0.377049 0.262295
50 0.285714 0.673469 0.183673 0.183673 0.163265 0.285714 0.204082 0.387755 0.489796 0.367347 0.306122 0.244898
I would like to draw a dot plot with the row indices on the x-axis and the column values on the y-axis (i.e. 12 dots at each x).
I have tried the following:
df.plot()
and I get this plot
I would like to show only the markers (dots) and not the lines
I tried df.plot(linestyle='None') but then I get an empty plot.
How can I change my code to show the dots/markers and hide the lines?
pandas.DataFrame.plot passes **kwargs to matplotlib's .plot method. Thus you can use any of the matplotlib.lines.Line2D properties:
df.plot(ls='', marker='.')
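
The reason `linestyle='None'` on its own gives an empty plot is that lines have no marker by default, so once the line is hidden nothing is left to draw. A small sketch with the first three rows and columns of the question's data (the `styles` list is only there to inspect the result):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import pandas as pd

# First three rows and columns of the question's frame.
df = pd.DataFrame(
    {
        0: [0.420000, 0.387097, 0.353659],
        1: [0.724000, 0.701613, 0.731707],
        2: [0.552000, 0.516129, 0.365854],
    },
    index=[10, 20, 30],
)

# Empty line style hides the lines; the marker keeps the points visible.
ax = df.plot(ls="", marker=".")

# Each column is one Line2D; check what style it ended up with.
styles = [(line.get_linestyle(), line.get_marker()) for line in ax.get_lines()]
print(styles)
```

`df.plot(style='.')` should give much the same picture, if you prefer the format-string form.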

Scatter plot in python with Groups

Below are three columns: VMDensity, correctableCount (servers with correctable errors) and avgVMReboots (VM reboots).
VMDensity correctableCount avgVMReboots
LowDensity 7 5
HighDensity 1 23
LowDensity 5 11
HighDensity 1 23
LowDensity 9 5
HighDensity 1 22
HighDensity 1 22
LowDensity 9 2
LowDensity 9 6
LowDensity 5 3
I tried the following, but I am not sure how to plot the groups in different colors.
import matplotlib.pyplot as plt
import pandas as pd
plt.scatter(df.correctableCount, df.avgVMReboots)
Now I need to generate a scatter plot grouped by VMDensity. The low-density VMs should be in one color and the high-density VMs in another.
If I understand you correctly, you do not need to "group" the data: you want to plot all data points regardless and just color them differently. So try something like
plt.scatter(df.correctableCount, df.avgVMReboots, c=df.VMDensity)
For this to work you will need to map the df.VMDensity strings to numbers or colors first, and/or play with scatter's cmap parameter.
See this example from matplotlib's gallery.
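
One way the mapping step could look, as a sketch: translate each VMDensity string to a color name and pass the resulting list as `c`. The `palette` dictionary and its two colors are arbitrary choices of mine, and the data is the ten rows from the question typed in by hand.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "VMDensity": ["LowDensity", "HighDensity", "LowDensity", "HighDensity",
                  "LowDensity", "HighDensity", "HighDensity", "LowDensity",
                  "LowDensity", "LowDensity"],
    "correctableCount": [7, 1, 5, 1, 9, 1, 1, 9, 9, 5],
    "avgVMReboots": [5, 23, 11, 23, 5, 22, 22, 2, 6, 3],
})

# Arbitrary color per category; any valid matplotlib color names work.
palette = {"LowDensity": "tab:blue", "HighDensity": "tab:orange"}
colors = df["VMDensity"].map(palette).tolist()

fig, ax = plt.subplots()
ax.scatter(df["correctableCount"], df["avgVMReboots"], c=colors)
ax.set_xlabel("correctableCount")
ax.set_ylabel("avgVMReboots")
```

If you also want a legend entry per group, looping over `df.groupby('VMDensity')` and calling `ax.scatter(..., label=name)` once per group is a common alternative.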
