Python matplotlib - setting x-axis scale

I have a graph produced by the following code:
plt.plot(valueX, scoreList)
plt.xlabel("Score number") # Text for X-Axis
plt.ylabel("Score") # Text for Y-Axis
plt.title("Scores for the topic "+progressDisplay.topicName)
plt.show()
where valueX = [1, 2, 3, 4] and scoreList = [5, 0, 0, 2].
I want the x-axis to go up in steps of 1, no matter what values are in scoreList. Currently the x-axis ticks go up in steps of 0.5 instead of 1.
How do I set it so that it only goes up in steps of 1?

Just set the xticks yourself.
plt.xticks([1,2,3,4])
or
plt.xticks(valueX)
Since the range function happens to work with integers, you could use that instead:
plt.xticks(range(1, 5))
Or be even more dynamic and calculate it from the data:
plt.xticks(range(min(valueX), max(valueX)+1))
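If the tick positions should not be hard-coded at all, a tick locator can enforce the spacing instead. The following is a minimal sketch, assuming the same valueX and scoreList as in the question. It uses matplotlib.ticker.MultipleLocator to place a tick at every multiple of 1; the Agg backend is chosen only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

valueX = [1, 2, 3, 4]
scoreList = [5, 0, 0, 2]

fig, ax = plt.subplots()
ax.plot(valueX, scoreList)

# Put a major tick at every multiple of 1 on the x-axis,
# regardless of the data range.
ax.xaxis.set_major_locator(MultipleLocator(1))

fig.canvas.draw()
ticks = ax.get_xticks()  # tick positions chosen by the locator
```

Unlike plt.xticks(valueX), this keeps working if the x-range changes later, because the locator recomputes the ticks from the current axis limits.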

Below is my favorite way to set the range of the axes. Note that this sets the axis limits, not the spacing between ticks:
plt.xlim(-0.02, 0.05)
plt.ylim(-0.04, 0.04)

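It looks like you need to set the x-axis scale.
Try
plt.gca().set_xscale('linear')
set_xscale is a method of an Axes instance and takes the scale type ('linear', 'log', ...), so it has to be called on an axes object rather than as matplotlib.axes.Axes.set_xscale(1, 'linear'). Note that 'linear' is already the default and that this does not change the tick spacing.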
Here's the documentation for that function

Related

Plotly: How to adjust size of markers in a scatter geo map so that differences become more visible?

I used the Plotly library in Python to draw a bubble map that displays the average score of countries. However, the average score only takes values between 2 and 4, so the bubbles in the chart are all very similar in size. How can I get a bigger difference between bubbles for values between 2 and 4?
Here is my piece of code:
plot = px.scatter_geo(df, locations="country_code", color="country_code", size_max=20,hover_name="country_code", size="avg_score", animation_frame="year", projection="natural earth",title="Bubble Map",labels={"country_code": "Country"})
I would raise your raw data to the power of a number that suits the visualization you're aiming to build. This makes larger numbers look disproportionately larger than small numbers. Compare the two plots below, where the first is the px.scatter_geo() example from the docs, and the second is the same plot with the population data df['pop'] replaced by df['pop']**1.6.
1. Raw data for df['pop']
2. Data for df['pop'] raised to the power of 1.6
Of course, the transformed numbers should not show up anywhere else in the figure, so you will have to include the following in order to keep the correct hoverinfo:
fig.update_traces(hovertemplate = 'pop=%{text}<br>iso_alpha=%{location}<extra></extra>', text = df['pop'])
Complete code:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
df['pop_display'] = df['pop']**1.6
fig = px.scatter_geo(df, locations="iso_alpha",
size="pop_display",
)
fig.update_traces(hovertemplate = 'pop=%{text}<br>iso_alpha=%{location}<extra></extra>', text = df['pop'])
fig.show()
What you're asking is not really a Plotly question but a general math question.
Given the inputs [2, 3, 4], which have a step of 1, you want corresponding numbers whose step is greater than 1. There are multiple ways you can accomplish this:
One way is to multiply all items by an integer.
[2,3,4] * 2 = [4, 6, 8] # step=2
[2,3,4] * 3 = [6, 9, 12] # step=3
...
In this case the new values grow linearly, meaning that the step between consecutive values remains constant.
If you want the step to grow in a non-linear way, you can raise the items to a power:
[2,3,4]^2 = [4, 9, 16] # step=5, 7...
[2,3,4]^3= [8, 27, 64] # step=19,37...
...
The possibilities are really endless. It all depends on what kind of difference you want between the bubbles. In code, a quick and dirty solution will look something like this:
plot = px.scatter_geo(df,
locations="country_code",
color="country_code",
size_max=20,
hover_name="country_code",
size=df["avg_score"]**2,
animation_frame="year",
projection="natural earth",
title="Bubble Map",
labels={"country_code": "Country"}
)
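Before plotting, it can help to check how much a given exponent actually widens the spread. The following is a rough sketch, assuming a small made-up DataFrame with the avg_score column from the question; the helper size_ratio and the column size_display are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# Hypothetical sample data with scores in the 2-4 range from the question.
df = pd.DataFrame({
    "country_code": ["AAA", "BBB", "CCC"],
    "avg_score": [2.0, 3.0, 4.0],
})

def size_ratio(sizes):
    """Ratio between the largest and smallest bubble size value."""
    return sizes.max() / sizes.min()

# Raw scores: the largest value is only twice the smallest.
raw_ratio = size_ratio(df["avg_score"])

# Cubed scores: the same data now spans a factor of eight.
df["size_display"] = df["avg_score"] ** 3
cubed_ratio = size_ratio(df["size_display"])

print(raw_ratio, cubed_ratio)
```

The transformed column can then be passed as size="size_display", while the original avg_score stays available for hover text.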

Giving imshow a custom list of yaxis labels

I'm trying to give pyplot.imshow() a list of values that aren't necessarily linear to use as y-axis labels. The truncated list is:
run_numbers = array([815676, 815766, 815767, 815768, 815769, 815770, 815771, 815772,
815773, 815774, 815775, 815776, 815777, 815778, 815779, 815780,
815781, 815783, 815784, 815785, 815786, 815789, 815790, 815792,
815793, 815794, 815795, 815796, 815797, 815798, 815799, 815800,
815801, 815802, 815803, 815804, 815805, 815806, 815807, 815808,
815809, 815811, 815812, 815813, 815814, 815815, 815816, 815817,
815818, 815819, 815820, 815821, 815822, 815823, 815824, 815825,
815826, 815827, 815829, 815830, 815831, 815832, 815833, 815834,
815835, 815836, 815837, 815838, 815839, 815841, 815842, 815843,
815844, 815845, 815846, 815847, 815848, 815849, 815851, 815852,
815853, 815854, 815855, 815856, 815857, 815858, 815859, 815860,
815861, 815863, 815864, 815865, 815866, 815867, 815869, 815870,
815871, 815872, 815873, 815874, 815875, 815876, 815877, 815878])
My image looks like this:
At first glance, this seems fine. But imshow isn't using the values from the list; instead, it's using a linear range from 815676 to the max value of the list. I've tried a few different things:
plt.imshow(np.array(profiles), aspect='auto', vmin=-5, vmax=20, extent=[0,500,max(run_numbers), min(run_numbers)])
The above code gave the image above, which makes sense given what I put in extent.
Is there a way to tell imshow to use the values in the list as the yaxis label? I've tried ax.yticklabels and ax.ytick, but those also give a linear progression of numbers instead of the list values.
Please let me know how I can clarify my question if there's any confusion. I can also provide an example data set if my question isn't clear.
Let the extent run from 0 to the number of labels minus 1, and then set the labels with plt.yticks(range(N), run_numbers). As there are about 100 labels, this would look very crowded, which could be mitigated by setting a large figure size and a small font. Alternatively, the labels could be shown only at regular intervals, e.g. every 10th row:
import numpy as np
import matplotlib.pyplot as plt
run_numbers = np.array([815676, 815766, 815767, 815768, 815769, 815770, 815771, 815772, 815773, 815774, 815775, 815776, 815777, 815778, 815779, 815780, 815781, 815783, 815784, 815785, 815786, 815789, 815790, 815792, 815793, 815794, 815795, 815796, 815797, 815798, 815799, 815800, 815801, 815802, 815803, 815804, 815805, 815806, 815807, 815808, 815809, 815811, 815812, 815813, 815814, 815815, 815816, 815817, 815818, 815819, 815820, 815821, 815822, 815823, 815824, 815825, 815826, 815827, 815829, 815830, 815831, 815832, 815833, 815834, 815835, 815836, 815837, 815838, 815839, 815841, 815842, 815843, 815844, 815845, 815846, 815847, 815848, 815849, 815851, 815852, 815853, 815854, 815855, 815856, 815857, 815858, 815859, 815860, 815861, 815863, 815864, 815865, 815866, 815867, 815869, 815870, 815871, 815872, 815873, 815874, 815875, 815876, 815877, 815878])
N = len(run_numbers)
profiles = np.random.randn(N, 501).cumsum(axis=1)
plt.imshow(profiles, aspect='auto', extent=[0, 500, N-1, 0])
plt.yticks(range(0, N, 10), run_numbers[::10])
plt.show()
To show every label, use a larger figure and a smaller font:
plt.figure(figsize=(10, 16))
plt.imshow(profiles, aspect='auto', extent=[0, 500, N-1, 0])
plt.yticks(range(N), run_numbers, fontsize=8)
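The same idea can be written with the object-oriented API. The sketch below is an illustration under a few assumptions: it uses a shortened, made-up subset of run_numbers and random data for profiles, and it leaves extent at its default, in which case imshow centers row i at y = i, so the tick positions are simply the row indices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np

# Shortened, non-contiguous subset of the run numbers (for illustration only).
run_numbers = np.array([815676, 815766, 815767, 815783, 815789, 815792])
N = len(run_numbers)
profiles = np.random.default_rng(0).normal(size=(N, 50))  # stand-in data

fig, ax = plt.subplots()
ax.imshow(profiles, aspect="auto")  # default extent: row i is centered at y = i

step = 2  # label every 2nd row; a larger step suits longer lists
positions = np.arange(0, N, step)
ax.set_yticks(positions)
ax.set_yticklabels([str(r) for r in run_numbers[::step]])

fig.canvas.draw()
shown = [t.get_text() for t in ax.get_yticklabels()]
```

Because the labels are plain strings attached to row indices, gaps in the run numbers no longer matter.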

How to plot x axis with month?

I want to plot a dataframe df1. The x-axis contains months and the y-axis contains counts. My x-axis is just a black bar because there are too many values. I have tried a lot, but nothing works. Is there a simple way to plot just every 5th date, for example?
I think the problem is that the months are datetimes and I can't compute the minimum and maximum?

df1 = pd.read_csv('hello.csv')
plt.plot(df1['a'],df1['b'])
plt.show()
My data frame df1 is:
a,b
2006-06,211.0
2006-07,212.41176470588235
2006-08,238.26315789473685
2006-09,239.9375
2006-10,266.1111111111111
2006-11,265.22222222222223
2006-12,283.3333333333333
2007-01,290.0
2007-02,307.5
2007-03,325.0
2007-04,343.05882352941177
2007-05,340.42105263157896
2007-06,353.75
2007-07,348.5
2007-08,359.6111111111111
2007-09,346.5625
2007-10,365.57894736842104
2007-11,358.7647058823529
2007-12,372.8333333333333
2008-01,381.8888888888889
2008-02,396.25
2008-03,422.94117647058823
2008-04,428.6666666666667
2008-05,418.5882352941176
2008-06,433.0
2008-07,440.4736842105263
2008-08,470.375
2008-09,481.3529411764706
2008-10,489.44444444444446
2008-11,485.125
2008-12,514.5714285714286
2009-01,515.375
2009-02,535.3125
2009-03,555.0555555555555
2009-04,557.7222222222222
2009-05,533.375
2009-06,567.7222222222222
2009-07,575.1111111111111
2009-08,582.5294117647059
2009-09,569.1666666666666
2009-10,611.1176470588235
2009-11,591.6470588235294
2009-12,634.6428571428571
2010-01,647.9375
2010-02,655.375
2010-03,672.7368421052631
2010-04,678.5882352941177
2010-05,667.8235294117648
2010-06,689.5
2010-07,657.4117647058823
2010-08,679.1111111111111
2010-09,661.2222222222222
2010-10,685.75
2010-11,676.5555555555555
2010-12,692.3571428571429
2011-01,691.9411764705883
2011-02,697.4375
2011-03,720.5263157894736
2011-04,723.5
2011-05,694.7222222222222
2011-06,705.7222222222222
2011-07,677.9375
2011-08,693.7368421052631
2011-09,671.2352941176471
2011-10,685.1176470588235
2011-11,669.9444444444445
2011-12,708.3076923076923
2012-01,674.9444444444445
2012-04,748.0
2012-05,811.0526315789474
2012-06,863.6875
2012-07,843.1666666666666
2012-08,885.5
2012-09,857.75
2012-10,876.8421052631579
2012-11,863.1764705882352
2012-12,917.6666666666666
2013-01,933.4444444444445
2013-03,975.0625
2013-04,994.0
2013-05,1019.6666666666666
2013-06,1063.625
2013-07,1057.8947368421052
2013-08,1102.1764705882354
2013-09,1046.4117647058824
2013-10,1153.1052631578948
2013-11,1107.25
2013-12,1155.3076923076924
2014-01,1191.3529411764705
2014-02,1240.5
2014-03,1272.764705882353
2014-04,1316.9444444444443
2014-05,1310.3529411764705
2014-06,1349.4117647058824
2014-07,1403.8947368421052
2014-08,1412.375
2014-09,1409.0555555555557
2014-10,1472.9444444444443
2014-11,1421.8125
2014-12,1473.2142857142858
2015-01,1476.9375
2015-02,1495.75
2015-03,1546.111111111111
2015-04,1563.7777777777778
2015-05,1499.0
2015-06,1583.111111111111
2015-07,1594.2222222222222
2015-08,1618.1176470588234
2015-09,1595.8333333333333
2015-10,1706.3529411764705
2015-11,1652.8823529411766
2015-12,1691.0714285714287
2016-01,1717.125
2016-02,1746.7058823529412
2016-03,1945.4736842105262
2016-04,2329.375
2016-05,2408.4444444444443
2016-06,2404.222222222222
2016-07,2184.4375
2016-08,2160.6315789473683
2016-09,2402.176470588235
2016-10,2481.823529411765
2016-11,2372.0
2016-12,2153.0
2017-01,2145.777777777778
2017-02,2213.5625
2017-03,2309.6111111111113
2017-04,2295.8125
2017-05,2116.7894736842104
2017-06,2093.8823529411766
To show only every n-th label, set the x-tick positions yourself:
x = df1['a']
plt.xticks(np.arange(0, len(x), 5)) # the last argument is the step; 5 shows every 5th label
Alternatively, to improve readability while keeping every label, you can rotate the x-axis tick labels by passing a rotation argument to xticks.
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 4, 9, 6]
labels = ['Frogs', 'Hogs', 'Bogs', 'Slogs']
plt.plot(x, y)
# You can specify a rotation for the tick labels in degrees or with keywords.
plt.xticks(x, labels, rotation='vertical') # You can input an integer too.
# Pad margins so that markers don't get clipped by the axes
plt.margins(0.2)
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=0.15)
plt.show()
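Another option, since the values in column a are months, is to convert them to real dates and let a date locator pick the ticks. This is a minimal sketch, assuming the CSV has a header line a,b and using only the first twelve rows of the question's data (values rounded); the three-month interval is an arbitrary choice for illustration.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# First twelve rows from the question, values rounded (stand-in for hello.csv).
csv_text = """a,b
2006-06,211.0
2006-07,212.4
2006-08,238.3
2006-09,239.9
2006-10,266.1
2006-11,265.2
2006-12,283.3
2007-01,290.0
2007-02,307.5
2007-03,325.0
2007-04,343.1
2007-05,340.4
"""
df1 = pd.read_csv(io.StringIO(csv_text))

# Parse 'YYYY-MM' strings into datetimes so matplotlib uses a date axis.
df1["a"] = pd.to_datetime(df1["a"], format="%Y-%m")

fig, ax = plt.subplots()
ax.plot(df1["a"], df1["b"])

# One tick every 3 months, labelled as year-month.
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap

fig.canvas.draw()
labels = [t.get_text() for t in ax.get_xticklabels()]
```

With a real date axis, gaps in the data (such as the missing months in 2012) are also drawn at their true distance instead of being squeezed together.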

How to generate image by using python and given data?

I have one data file which is like this:
1, 23%
2, 33%
3, 12%
I want to use Python to generate a histogram that represents the percentages. I used these commands:
from PIL import Image
img = Image.new('RGB', (width, height))
img.putdata(my_data)
img.show()
However, I got this error when I put the data: SystemError: new style getargs format but argument is not a tuple. Do I have to change my data file, and if so, how?
A histogram is usually made in matplotlib by having a set of data points and then assigning them into bins. An example would be this:
import matplotlib.pyplot as plt
data = [1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7]
plt.hist(data, 7)
plt.show()
You already know what percentage of your data fits into each category (although, I might point out your percentages don't add to 100...). A way to represent this is to make a list in which each data value appears a number of times equal to its percentage, as below.
data = [1]*23 + [2]*33 + [3]*12
plt.hist(data, 3)
plt.show()
The second argument to hist() is the number of bins, so this is the number to adjust to make the plot look right.
Documentation for hist() is found here:
http://matplotlib.org/api/pyplot_api.html
Are you graphing only? PIL is an image processing module - if you want histograms and other graphs you should consider matplotlib.
I found an example of a histogram here.
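Since the file already contains one percentage per category, a bar chart can plot those numbers directly, without expanding them into repeated values. The following is a sketch under assumptions: the lines of the data file are given inline as raw_lines, parse_line is a hypothetical helper, and the image is written to an in-memory buffer (a filename such as "percentages.png" would work in savefig as well).

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt

# Lines as they appear in the data file from the question.
raw_lines = ["1, 23%", "2, 33%", "3, 12%"]

def parse_line(line):
    """Split 'category, NN%' into an int category and a float percentage."""
    category, percent = line.split(",")
    return int(category), float(percent.strip().rstrip("%"))

categories, percents = zip(*(parse_line(line) for line in raw_lines))

fig, ax = plt.subplots()
ax.bar(categories, percents)
ax.set_xticks(categories)  # one tick per category
ax.set_xlabel("Category")
ax.set_ylabel("Percentage")

# Write the figure as a PNG image.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

This sidesteps PIL entirely: matplotlib renders the chart and saves it as an image file.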

Pyplot set tick frequency and tick labels

I'm trying to make a plot with matplotlib where I want to specify both the position of the tick marks, and the text of the tick marks. I can individually do both with yticks(np.arange(0,1.1,1/16.)) and gca().set_yticklabels(['1','2','3']). However, for some reason when I do both of them together, the labels do not appear on the graph. Is there a reason for this? How can I get around it? Below is a working example of what I want to accomplish.
x = [-1, -0.2, -0.15, 0.15, 0.2, 7.8, 7.85, 8.15, 8.2, 12]
y = [1, 1, 15/16., 15/16., 1, 1, 15/16., 15/16., 1, 1]
figure(1)
plot(x,y)
xlabel('Time (years)')
ylabel('Brightness')
yticks(np.arange(0,1.1,1/16.))
xticks(np.arange(0,13,2))
ylim(12/16.,16.5/16.)
xlim(-1,12)
gca().set_yticklabels(['12/16', '13/16', '14/16', '15/16', '16/16'])
show(block = False)
Effectively I just wanted to replace the numerical values with fractions, but when I run this, the labels do not appear. It seems that using both yticks() and set_yticklabels together is a problem because if I remove either line, the remaining line works as it should.
If anyone can indicate how to simply force the label to be a fraction, that would also solve my problem.
EDIT:
I found an ugly workaround by using
ylim(12/16., 16.5/16)
gca().yaxis.set_major_locator(FixedLocator([12/16., 13/16., 14/16., 15/16., 16/16.]))
gca().yaxis.set_major_formatter(FixedFormatter(['12/16', '13/16', '14/16', '15/16', '16/16']))
While this may work for this specific example, it does not generalize well and it is cumbersome to specify the exact location and label of every tick mark. If anyone finds another solution, I'm all ears.
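1) Your arange should produce 5 ticks, the same number as the labels you set. np.arange(0, 1.1, 1/16.) produces 18 ticks starting at 0, so the 5 labels are attached to the first 5 ticks (0 to 4/16), which lie below the visible y-range.
arange is not well suited to this. It is better to use linspace.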
2) You can set ticks and labels with the same function
plot(x,y)
xlabel('Time (years)')
ylabel('Brightness')
yticks(np.linspace(12/16., 1, 5), ('12/16', '13/16', '14/16', '15/16', '16/16') )
xticks(np.arange(0,13,2))
ylim(12/16.,16.5/16.)
xlim(-1,12)
3) Note that the tick positions must match the labels, which is why linspace(12/16., 1, 5) is used instead of arange(0, 1.1, 1/16.).
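For a version that does not require listing every label by hand, a formatter function can generate the fraction text from the tick value. This is a sketch rather than the answer's method: it swaps in matplotlib.ticker.MultipleLocator and FuncFormatter, and the helper name sixteenths is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator

def sixteenths(value, pos=None):
    """Format a tick value as a fraction with denominator 16."""
    return f"{round(value * 16)}/16"

x = [-1, -0.2, -0.15, 0.15, 0.2, 7.8, 7.85, 8.15, 8.2, 12]
y = [1, 1, 15/16, 15/16, 1, 1, 15/16, 15/16, 1, 1]

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel("Time (years)")
ax.set_ylabel("Brightness")

# Tick every 1/16 and label each tick from its own value.
ax.yaxis.set_major_locator(MultipleLocator(1 / 16))
ax.yaxis.set_major_formatter(FuncFormatter(sixteenths))
ax.set_ylim(12 / 16, 16.5 / 16)
ax.set_xlim(-1, 12)

fig.canvas.draw()
```

Because each label is computed from the tick position, changing ylim later shows the right fractions without editing any label list.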
