I'm very new to Python/data sets. I am not sure of the terminology to use when describing my question. As a result it's been difficult searching for answers. I have plotted a data set that ranges 0 - 20,000,000,000,000 for the y axis. Here is what I have:
However, for my assignment it should look like so:
I have tried ticklabel_format(style='plain', axis='y') which gives me:
This give me the numbers with no scientific notation and too many zeros! I'm not sure which methods I should be using to scale(?) the numbers. I've thought about operating on my raw data values but that doesn't seem right. I have looked at the documentation for matplotlib.pyplot.yscale but unless I'm missing something it doesn't seem to be what I need. Any help is appreciated. Thanks.
You can scale your data using a numpy.array and then specify the scaling in the y-axis:
import numpy as np
import matplotlib.pyplot as plt
#scale the data appropriately with numpy.array
dataset_1 = np.array(dataset_1)
scaled_dataset_1 = dataset_1/(10**9)
#plot with scaling specified in label
plt.plot(x_axis, scaled_dataset_1)
plt.ylabel('Dataset 1 (10^9)')
To change the Y-axis to exponential, you can use the following settings Please refer to this page.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import ScalarFormatter
x = np.arange(0, 200000000, 1)
fig, ax = plt.subplots(figsize=(6, 6))
ax.plot(x * 1000000, x * 100000)
ax.yaxis.set_major_formatter(ScalarFormatter(useMathText=True))
ax.ticklabel_format(style="sci", axis="y", scilimits=(0,0))
plt.show()
Related
I do have a question with matplotlib in python. I create different figures, where every figure should have the same height to print them in a publication/poster next to each other.
If the y-axis has a label on the very top, this shrinks the height of the box with the plot. So I use MaxNLocator to remove the upper and lower y-tick. In some plots, I want to have the 1.0 as a number on the y-axis, because I have normalized data. So I need a solution, which expands in these cases the y-axis and ensures 1.0 is a y-Tick, but does not corrupt the size of the figure using tight_layout().
Here is a minimal example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
x = np.linspace(0,1,num=11)
y = np.linspace(1,.42,num=11)
fig,axs = plt.subplots(1,1)
axs.plot(x,y)
locator=MaxNLocator(prune='both',nbins=5)
axs.yaxis.set_major_locator(locator)
plt.tight_layout()
fig.show()
Here is a link to a example-pdf, which shows the problems with height of upper boxline.
I tried to work with adjust_subplots() but this is of no use for me, because I vary the size of the figures and want to have same the font size all the time, which changes the margins.
Question is:
How can I use MaxNLocator and specify a number which has to be in the y-axis?
Hopefully someone of you has some advice.
Greetings,
Laenan
Assuming that you know in advance how many plots there will be in 1 row on a page one way to solve this would be to put all those plots into one figure - matplotlib will make sure they are alinged on axes:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
x = np.linspace(0, 1, num=11)
y = np.linspace(1, .42, num=11)
fig, (ax1, ax2) = plt.subplots(1,2, figsize=(8,3), gridspec_kw={'wspace':.2})
ax1.plot(x,y)
ax2.plot(x,y)
locator=MaxNLocator(prune='both', nbins=5)
ax1.yaxis.set_major_locator(locator)
# You don't need to use tight_layout and using it might give an error
# plt.tight_layout()
fig.show()
Lets say I've 2 arrays
x = [1,2,3,4,5,6,7]
y = [1,2,2,2,3,4,5]
its scatter plot looks like this
what I want to do is that I want my x axis to look like this in the plot
0,4,8
as a result of which values of y in each piece of x should come closer .
The similar behavior I've seen is bar plots where this is called clustering , how do I do the same in case of scatter plot , or is there any other plot I should be using ?
I hope my question is clear/understandable .
All the help is appreciated
With you plot, try this, before you display the plot.
plt.xticks([0,4,8]))
or
import numpy as np
plt.xticks(np.arange(0, 8+1, step=4))
Then to change the scale you can try something like this,
plt.xticks([0,4,8]))
plt.rcParams["figure.figsize"] = (10,5)
I got this with my example,
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.xticks([0,4,8])
plt.rcParams["figure.figsize"] = (7,3)
plt.plot(x, y, 'o', color='black')
output
I think what you are looking for is close to swarmplots and stripplots in Seaborn. However, Seaborn's swarmplot and stripplot are purely categorical on one of the axes, which means that they wouldn't preserve the relative x-axis order of your elements inside each category.
One way to do what you want would be to increase the space in your x-axis between categories ([0,4,8]) and modify your xticks accordingly.
Below is an example of this where I assign the data to 3 different categories: [-2,2[, [2,6[, [6,10[. And each bar is dil_k away from its directly neighboring bars.
import matplotlib.pyplot as plt
import numpy as np
#Generating data
x= np.random.choice(8,size=(100))
y= np.random.choice(8,size=(100))
dil_k=20
#Creating the spacing between categories
x[np.logical_and(x<6, x>=2)]+=dil_k
x[np.logical_and(x<10, x>=6)]+=2*dil_k
#Plotting
ax=plt.scatter(x,y)
#Modifying axes accordingly
plt.xticks([0,2,22,24,26,46,48,50],[0,2,2,4,6,6,8,10])
plt.show()
And the output gives:
Alternatively, if you don't care about keeping the order of your elements along the x-axis inside each category, then you can use swarmplot directly.
The code can be seen below:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
#Generating data
x= np.random.choice(8,size=(100))
y= np.random.choice(8,size=(100))
#Creating the spacing between categories
x[np.logical_and(x<2,x>=-2)]=0
x[np.logical_and(x<6, x>=2)]=4
x[np.logical_and(x<10, x>=6)]=8
#Plotting
sns.swarmplot(x=x,y=y)
plt.show()
And the output gives:
I have seen several similar question but none which answers my precise question, so please bear with it before marking it duplicate.
I have to plot n lines on a graph.
The standard value of n is 10 but it can vary. Previously, I used
ax.set_color_cycle([plt.cm.Accent(i) for i in np.linspace(0, 1, limit)])
Which works well for me.
However, this method has depreciated as of late, and my program gives a warning for the same to use set_prop_cycle
I wasn't able to find any example in the documentation http://matplotlib.org/api/axes_api.html that suits my needs of uses a colour bar like accent or hot or cool etc.
Update
Sample code for testing:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
limit=10
ax.set_color_cycle([plt.cm.Accent(i) for i in np.linspace(0, 1, limit)])
for limit in range(1,limit+1):
x=np.random.randint(10, size=10)
y=limit*x
plt.plot(x,y,lw=2, label=limit)
plt.legend(loc='best')
What this shows me is:
The question is how can I use a colormap to chose between different shades of a color or the sort? http://matplotlib.org/users/colormaps.html
You need a slightly different formulation in order to make use of the more generic set_prop_cycle. You have to create a cycler for the argument 'color' with plt.cycler and for the rest do it exactly as you did before:
import matplotlib.pyplot as plt
import numpy as np
limit=10
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_prop_cycle(plt.cycler('color', plt.cm.Accent(np.linspace(0, 1, limit))))
for limit in range(1,limit+1):
x=np.random.randint(10, size=10)
y=limit*x
plt.plot(x,y,lw=2, label=limit)
plt.legend(loc='best')
I do have a question with matplotlib in python. I create different figures, where every figure should have the same height to print them in a publication/poster next to each other.
If the y-axis has a label on the very top, this shrinks the height of the box with the plot. So I use MaxNLocator to remove the upper and lower y-tick. In some plots, I want to have the 1.0 as a number on the y-axis, because I have normalized data. So I need a solution, which expands in these cases the y-axis and ensures 1.0 is a y-Tick, but does not corrupt the size of the figure using tight_layout().
Here is a minimal example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
x = np.linspace(0,1,num=11)
y = np.linspace(1,.42,num=11)
fig,axs = plt.subplots(1,1)
axs.plot(x,y)
locator=MaxNLocator(prune='both',nbins=5)
axs.yaxis.set_major_locator(locator)
plt.tight_layout()
fig.show()
Here is a link to a example-pdf, which shows the problems with height of upper boxline.
I tried to work with adjust_subplots() but this is of no use for me, because I vary the size of the figures and want to have same the font size all the time, which changes the margins.
Question is:
How can I use MaxNLocator and specify a number which has to be in the y-axis?
Hopefully someone of you has some advice.
Greetings,
Laenan
Assuming that you know in advance how many plots there will be in 1 row on a page one way to solve this would be to put all those plots into one figure - matplotlib will make sure they are alinged on axes:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
x = np.linspace(0, 1, num=11)
y = np.linspace(1, .42, num=11)
fig, (ax1, ax2) = plt.subplots(1,2, figsize=(8,3), gridspec_kw={'wspace':.2})
ax1.plot(x,y)
ax2.plot(x,y)
locator=MaxNLocator(prune='both', nbins=5)
ax1.yaxis.set_major_locator(locator)
# You don't need to use tight_layout and using it might give an error
# plt.tight_layout()
fig.show()
I have a data set that has two independent variables and 1 dependent variable. I thought the best way to represent the dataset is by a checkerboard-type plot wherein the color of the cells represent a range of values, like this:
I can't seem to find a code to do this automatically.
You need to use a plotting package to do this. For example, with matplotlib:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
X = 100*np.random.rand(6,6)
fig, ax = plt.subplots()
i = ax.imshow(X, cmap=cm.jet, interpolation='nearest')
fig.colorbar(i)
plt.show()
For those who come across this years later as myself, what Original Poster wants is a heatmap.
Matplotlib has documentation regarding the following example here.