I am making a figure with subplots. The number of subplots is dynamic and depends on df.shape.
This works, but I am not satisfied. I have 3 questions:
1) Is it possible to optimize the plotting part? The "if k == b" check is a bit annoying.
2) How can I delete the last 3 empty subplots?
3) I was thinking of making a static figure (size=4,4) and opening a new one once the figure is full. How can I do this?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math
import string
import random
# got generator from here:
# https://stackoverflow.com/questions/2257441/random-string-generation-with-upper-case-letters-and-digits
def id_generator(size=4, chars=string.ascii_uppercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))
#%% make random data
labels = []
for i in range(0,27):
    labels.append(id_generator())
mat = np.random.rand(20,27)
df = pd.DataFrame(mat,columns=labels)
#%% plot
k = 0 #go left
l=0 #go down
b = 5 #static number for columns
a = math.ceil(len(labels)/5) #round up for 'go down'
fig, axs = plt.subplots(
    a,b, figsize=(10, 10),sharex=True, constrained_layout=True
)
for j in labels:
    axs[l,k].plot(df[j])
    k+=1
    if k == b:
        k = 0
        l+=1
Edit:
With the help of Chris A, the 3 questions above are solved.
I found out how to change the xlabel, xlim and title.
Is it possible to move the legend to the top left corner?
I can't find anything in the documentation. Also, how can I pass a list of ylabels?
df.index.name = 'xlabel'
fig = df.plot(subplots=True, title= 'Make title',y=labels,layout=(-1, 5), figsize=(10, 10),grid=True,xlim=[0,20]) #xticks=[0,5,10,15,20]
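One possible way to handle the legend position and per-subplot ylabels is sketched below. With subplots=True, df.plot returns an array of Axes (not a Figure), so each Axes can be adjusted in a loop. The random data, the column names and the ylabels list are placeholders invented for this sketch, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
labels = [f"S{i:02d}" for i in range(27)]            # placeholder column names
df = pd.DataFrame(rng.random((20, 27)), columns=labels)
ylabels = [f"unit {i}" for i in range(len(labels))]  # placeholder y-labels

# With subplots=True the return value is an array of Axes, one per grid cell.
axes = df.plot(subplots=True, layout=(-1, 5), figsize=(10, 10), grid=True)

# Pair each used Axes with its ylabel; zip stops after the 27 real plots,
# so the 3 unused cells of the 6x5 grid are left untouched.
for ax, ylabel in zip(axes.flat, ylabels):
    ax.legend(loc="upper left")
    ax.set_ylabel(ylabel)
```

Looping over axes.flat also avoids the manual row/column counters from the first version of the code.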
I have a dataset containing 10 features and corresponding labels. I am using scatter plots of each distinct pair of features to see which of them describe the labels best (which means that 45 plots will be created in total). To do that, I used nested loops. The code raises no error and I obtained all the plots. However, there is clearly something wrong with the code, because each new scatter plot that gets created and saved accumulates the points from the previous ones as well. I am attaching the complete code I used. How can I fix this problem? Below is the link to the raw dataset:
https://github.com/IITGuwahati-AI/Learning-Content/raw/master/Phase%203%20-%202020%20(Summer)/Week%201%20(Mar%2028%20-%20Apr%204)/assignment/data.txt
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
data_url ='https://raw.githubusercontent.com/diwakar1412/Learning-Content/master/DiwakarDas_184104503/datacsv.csv'
df = pd.read_csv(data_url)
df.head()
def transform_label(value):
    if value >= 2:
        return "BLUE"
    else:
        return "RED"
df["Label"] = df.Label.apply(transform_label)
df.head()
colors = {'RED':'r', 'BLUE':'b'}
fig, ax = plt.subplots()
for i in range(1,len(df.columns)):
    for j in range(i+1,len(df.columns)):
        for k in range(len(df[str(i)])):
            ax.scatter(df[str(i)][k], df[str(j)][k], color=colors[df['Label'][k]])
        ax.set_title('F%svsF%s' %(i,j))
        ax.set_xlabel('%s' %i)
        ax.set_ylabel('%s' %j)
        plt.savefig('F%svsF%s' %(i,j))
You have to create a new figure each time. Try to put
fig, ax = plt.subplots()
inside your loop:
for i in range(1,len(df.columns)):
    for j in range(i+1,len(df.columns)):
        fig, ax = plt.subplots() # <-------------- here
        for k in range(len(df[str(i)])):
            ax.scatter(df[str(i)][k], df[str(j)][k], color=colors[df['Label'][k]])
        ax.set_title('F%svsF%s' %(i,j))
        ax.set_xlabel('%s' %i)
        ax.set_ylabel('%s' %j)
        plt.savefig('/Users/Alessandro/Desktop/tmp/F%svsF%s' %(i,j))
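A variation on the same idea is sketched below. It creates one figure per feature pair, draws all points with a single scatter call instead of one call per row, and closes each figure after saving so that 45 figures do not stay open in memory. The synthetic DataFrame, its column names and the temporary output directory are assumptions made for this sketch and are not part of the original dataset.

```python
import itertools
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset: 4 features plus a Label column.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((50, 4)), columns=["1", "2", "3", "4"])
df["Label"] = rng.choice(["RED", "BLUE"], size=len(df))

colors = {"RED": "r", "BLUE": "b"}
point_colors = df["Label"].map(colors).tolist()  # one color per row

out_dir = tempfile.mkdtemp()  # placeholder output location
saved = []
features = [c for c in df.columns if c != "Label"]

for fx, fy in itertools.combinations(features, 2):
    fig, ax = plt.subplots()                        # fresh figure per pair
    ax.scatter(df[fx], df[fy], c=point_colors)      # all rows in one call
    ax.set_title(f"F{fx}vsF{fy}")
    ax.set_xlabel(fx)
    ax.set_ylabel(fy)
    path = os.path.join(out_dir, f"F{fx}vsF{fy}.png")
    fig.savefig(path)
    plt.close(fig)                                  # free the figure
    saved.append(path)
```

With 4 features this writes 6 files; with the 10 features from the question the same loop would produce the 45 plots.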
I'm trying to make an animated 3-D scatter plot with the ability to plot a dynamic number of classes as different colors. This is one of the attempts. I've included the whole code in case it is helpful, and marked the trouble spot with a row of stars:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.animation as animation
from random import uniform
x_arr,y_arr,depth_arr,time_arr,ml_arr,cluster_arr = np.loadtxt(data, unpack=5, usecols=(0, 1, 2, 5, 6))
class Point:
    def __init__(self,x,y,depth,time,cluster):
        self.x=x
        self.y=y
        self.depth=depth
        self.time=time
        self.cluster=cluster
points = []
for i in range(0,len(x_arr)):
    points.append(Point(x_arr[i],y_arr[i],depth_arr[i],time_arr[i],cluster_arr[i]))
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlim(min(x_arr), max(x_arr))
ax.set_ylim(min(y_arr), max(y_arr))
ax.set_zlim(min(depth_arr), max(depth_arr))
colors_1 = plt.cm.jet(np.linspace(0,max(cluster_arr),max(cluster_arr)+1))
colors = colors_1.reshape(-1,4)
def plot_points(time):
    x = []
    y = []
    z = []
    clust = []
    points_cp = list(np.copy(points))
    for i in range(0,(int(max(cluster_arr))+1)):
        for event in points_cp:
            if event.cluster == i:
                if event.time < time:
                    points_cp.remove(event)
                elif event.time <= time + 86400:
                    x.append(event.x)
                    y.append(event.y)
                    z.append(event.depth)
                    clust.append(event.cluster)
                    points_cp.remove(event)
    # **************************************************************
    color_ind = 0
    first_ind = 0
    last_ind = 0
    for i in range(0,len(x)):
        if clust[i] != color_ind:
            last_ind = i
    for i in range(0,len(x)):
        ax.scatter(x[first_ind:last_ind],y[first_ind:last_ind],z[first_ind:last_ind],c=colors[int(color_ind)])
        color_ind = clust[i]
        first_ind = i
time = np.linspace(min(time_arr),max(time_arr),100)
ani = animation.FuncAnimation(fig,plot_points,time)
plt.show()
This gives me a plot with the correct colors, but once a point is plotted, it remains throughout the entire animation.
I have also tried set_x, set_color, etc., but this doesn't work with a loop (it is updated with each iteration, so that only the last class is actually plotted), and I need to use a for loop to accommodate a variable number of classes. I've tried using a colormap with a fixed extent, but have been unsuccessful, as colormapping doesn't work with the plot function, and I haven't been able to get the rest of the code to work with a scatter function.
Thanks in advance for your help, and my apologies if the code is a little wonky. I'm pretty new to this.
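One approach that might address the lingering points is sketched here: clear the axes at the start of every frame and redraw only the events in the current time window, letting a colormap assign one color per cluster so no per-class loop is needed. The random data, the 4 clusters, the unit-cube axis limits and the function name draw_frame are all invented for this sketch. The 86400 second window follows the code above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Synthetic stand-in data: positions, event times (seconds), cluster ids.
rng = np.random.default_rng(0)
n = 200
xyz = rng.random((n, 3))
event_time = np.sort(rng.uniform(0, 10 * 86400, n))
cluster = rng.integers(0, 4, n)
window = 86400  # show events within one day after the frame time

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")

def draw_frame(t0):
    ax.cla()  # drop everything drawn in the previous frame
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_zlim(0, 1)
    mask = (event_time >= t0) & (event_time <= t0 + window)
    # c takes the cluster ids; cmap with fixed vmin/vmax keeps each
    # cluster's color the same from frame to frame.
    ax.scatter(xyz[mask, 0], xyz[mask, 1], xyz[mask, 2],
               c=cluster[mask], cmap="jet", vmin=0, vmax=cluster.max())

frames = np.linspace(event_time.min(), event_time.max(), 100)
ani = animation.FuncAnimation(fig, draw_frame, frames=frames)
# In an interactive session, plt.show() or ani.save(...) would follow here.
```

Because the axes are cleared on each call, the axis limits have to be set again inside the frame function.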
I want to create a Python program that can plot multiple graphs into one PDF file, where the number of subplots is variable. I already did this with one plot per page. However, since I sometimes have around 100 plots, that means a lot of scrolling and is not very clear. Therefore I would like to get something like 5x4 subplots per page.
I have already written code for that. The whole code is long, and since I'm very new to Python it probably looks terrible to someone who knows what they are doing, but the plotting part looks like this:
rows = (len(tags))/5
fig = plt.figure()
count = 0
for keyInTags in tags:
    count = count + 1
    ax = fig.add_subplot(int(rows), 5, count)
    ax.set_title("cell" + keyInTags)
    ax.plot(x, y_green, color='k')
    ax.plot(x, y_red, color='k')
plt.subplots_adjust(hspace=0.5, wspace=0.3)
pdf.savefig(fig)
The idea is that I get a PDF with all "cells" (it's for biological research) plotted. The code I wrote works fine so far, but if I have more than 4 rows of subplots I would like to insert a page break. In some cases I get over 21 rows on one page, which makes it impossible to see anything.
So, is there a way to, for example, tell Python to do a page break after 4 rows? In the case with 21 rows I'd like to have 6 pages with clearly visible plots. Or is it done by making 5x4 plots and then iterating somehow over the file?
I would be really happy if someone could help a little or give a hint. I have been sitting here for 4 hours without finding a solution.
A. Loop over pages
You could find out how many pages you need (npages) and create a new figure per page.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
tags = ["".join(np.random.choice(list("ABCDEFG123"), size=5)) for _ in range(53)]
N = len(tags) # number of subplots
nrows = 5 # number of rows per page
ncols = 4 # number of columns per page
# calculate number of pages needed
npages = N // (nrows*ncols)
if N % (nrows*ncols) > 0:
    npages += 1
pdf = PdfPages('out2.pdf')
for page in range(npages):
    fig = plt.figure(figsize=(8,11))
    for i in range(min(nrows*ncols, N-page*(nrows*ncols))):
        # Your plot here
        count = page*ncols*nrows+i
        ax = fig.add_subplot(nrows, ncols, i+1)
        ax.set_title(f"{count} - {tags[count]}")
        ax.plot(np.cumsum(np.random.randn(33)))
        # end of plotting
    fig.tight_layout()
    pdf.savefig(fig)
pdf.close()
plt.show()
B. Loop over data
Or alternatively you could loop over the tags themselves and create a new figure once it's needed:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
tags = ["".join(np.random.choice(list("ABCDEFG123"), size=5)) for _ in range(53)]
nrows = 5 # number of rows per page
ncols = 4 # number of columns per page
pdf = PdfPages('out2.pdf')
for i, tag in enumerate(tags):
    j = i % (nrows*ncols)
    if j == 0:
        fig = plt.figure(figsize=(8,11))
    ax = fig.add_subplot(nrows, ncols,j+1)
    ax.set_title(f"{i} - {tags[i]}")
    ax.plot(np.cumsum(np.random.randn(33)))
    # end of plotting
    if j == (nrows*ncols)-1 or i == len(tags)-1:
        fig.tight_layout()
        pdf.savefig(fig)
pdf.close()
plt.show()
You can use matplotlib's PdfPages as follows.
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt
import numpy as np
pp = PdfPages('multipage.pdf')
x=np.arange(1,10)
y=np.arange(1,10)
fig=plt.figure()
ax1=fig.add_subplot(211)
# ax1.set_title("cell" + keyInTags)
# ax1.plot(x, y, color='k')
# ax.plot(x, y_red, color='k')
ax2=fig.add_subplot(212)
pp.savefig(fig)
fig2=plt.figure()
ax1=fig2.add_subplot(321)
ax1.plot(x, y, color='k')
ax2=fig2.add_subplot(322)
ax2.plot(x, y, color='k')
ax3=fig2.add_subplot(313)
pp.savefig(fig2)
pp.close()
Play with these subplot numbers a little to understand which graph goes where.
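As a small illustration of how the three-digit subplot codes are interpreted, the sketch below reads back the grid geometry of each Axes. A code such as 321 means "3 rows, 2 columns, position 1", with positions counted row by row starting at 1. The helper name geometry is made up for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
top_left = fig.add_subplot(321)   # 3x2 grid, position 1
top_right = fig.add_subplot(322)  # 3x2 grid, position 2
bottom = fig.add_subplot(313)     # 3x1 grid, position 3 (full-width bottom row)

def geometry(ax):
    """Return (n_rows, n_cols, start, stop); start and stop are 0-based cells."""
    return ax.get_subplotspec().get_geometry()

for name, ax in [("321", top_left), ("322", top_right), ("313", bottom)]:
    print(name, geometry(ax))
```

Mixing a 3x2 grid for the top row with a 3x1 grid for the bottom row is what lets the third plot span the full width.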
I'm trying to create a 2x2 grid of graphs in Python and am struggling with the axes. This is what I get so far; the axes on each subplot are messed up.
This is my code:
def plotCarBar(df):
    fig = plt.figure()
    j = 1
    for i in pandaDF.columns[15:18]:
        cat_count = df.groupby(i)[i].count().sort_values().plot(figsize=(12,12), kind = 'line')
        ax = fig.add_subplot(2, 2, j)
        j += 1
    return ax.plot(lw = 1.3)
plotCarBar(pandaDF)
Can someone please help? Thanks in advance!
I am not sure if you need two loops. If you post some sample data, we may be able to make better sense of what your cat_count line is doing. As it stands, I'm not sure if you need two counters (i and j).
Generally, I would also recommend using matplotlib directly, unless you're really just doing some quick and dirty plotting in pandas.
So, something like this might work:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
randoms = np.random.rand(10, 4) # generate some data
print(randoms)
fig = plt.figure()
for i in range(1, randoms.shape[1] + 1): # number of cols
    ax = fig.add_subplot(2, 2, i)
    ax.plot(randoms[:, i - 1]) # plot column i-1 (subplot numbers are 1-based)
plt.show()
Output:
[[0.78436298 0.85009767 0.28524816 0.28137471]
[0.58936976 0.00614068 0.25312449 0.58549765]
[0.24216048 0.13100618 0.76956316 0.66210005]
[0.95156085 0.86171181 0.40940887 0.47077143]
[0.91523306 0.33833055 0.74360696 0.2322519 ]
[0.68563804 0.69825892 0.5836696 0.97711073]
[0.62709986 0.44308186 0.24582971 0.97697002]
[0.04356271 0.01488111 0.73322443 0.04890864]
[0.9090653 0.25895051 0.73163902 0.83620635]
[0.51622846 0.6735348 0.20570992 0.13803589]]
I am a newbie to matplotlib. I am trying to plot a step function and am having some trouble. Right now I am able to read from the file and plot it as shown below. But the graph at the top is not in steps and the one below is not a proper step. I saw examples that plot a step function by giving x and y values. I am not sure how to do it by reading from a file though. Can someone help me?
from pylab import plotfile, show, gca
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
fname = cbook.get_sample_data('sample.csv', asfileobj=False)
plotfile(fname, cols=(0,1), delimiter=' ')
plotfile(fname, cols=(0,2), newfig=False, delimiter=' ')
plt.show()
Sample inputs(3 columns):
27023927 3 0
27023938 2 0
27023949 3 0
27023961 2 0
27023972 3 0
27023984 2 0
27023995 3 0
27024007 2 0
27024008 2 1
27024018 3 1
27024030 2 1
27024031 2 0
27024041 3 0
27024053 2 0
27024054 2 1
27024098 2 0
Note: I have made the y-axis1 values as 3 & 2 so that this graph can occur in the top and another y-axis2 values 0 & 1 so that it comes in the bottom as shown below
Waveform as it looks now
Essentially, your resolution is too low. In the lower plot, each transition (except the last one) occurs over 1 unit in x, while the flat segments between transitions are about an order of magnitude wider. This gives the appearance of steps, but if you zoom in you will see that the vertical lines have a finite gradient (true steps change with an infinite gradient).
This is the same problem for both the top and bottom plots. We can easily remedy this by using the step function. You will generally find it easier to import the data first; in this example I use numpy's genfromtxt, which loads the data as an array named data:
import numpy as np
import matplotlib.pylab as plt
data = np.genfromtxt('test.csv', delimiter=" ")
ax1 = plt.subplot(2,1,1)
ax1.step(data[:,0], data[:,1])
ax2 = plt.subplot(2,1,2)
ax2.step(data[:,0], data[:,2])
plt.show()
If you are new to Python then there are two things to mention. First, we use two subplots (ax1 and ax2) to plot the data rather than plotting on the same plot (this means you don't need to offset the values to separate them spatially). Second, we access the elements of the array through [], which takes [row, column], with : meaning all rows and an index i selecting the ith column.
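As a quick check of the indexing convention, here is a minimal sketch using the first three rows of the sample input typed in by hand. NumPy's two-dimensional indexing is [row, column], so data[:, 0] selects every row of column 0. The variable names are chosen for this sketch only.

```python
import numpy as np

# First three rows of the sample input, entered by hand for illustration.
data = np.array([
    [27023927, 3, 0],
    [27023938, 2, 0],
    [27023949, 3, 0],
])

first_column = data[:, 0]  # all rows, column 0 (the x values)
second_row = data[1, :]    # row 1, all columns

print(first_column)
print(second_row)
```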
I would propose to load the data to a numpy array
import numpy as np
data = np.loadtxt('sample.csv')
And then plot it:
# first point
ax = [data[0,0]]
ay = [data[0,1]]
for i in range(1, data.shape[0]):
    if ay[-1] != data[i,1]: # if y value has changed
        # add current x and old y
        ax.append(data[i,0])
        ay.append(ay[-1])
    # add current x and current y
    ax.append(data[i,0])
    ay.append(data[i,1])
import matplotlib.pyplot as plt
plt.plot(ax,ay)
plt.show()
My solution differs from yours in that I plot two points for every change in y. The two points produce the 90-degree bend. I only plot the first curve; change [?,1] to [?,2] for the second one.
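The two-points-per-change idea can be wrapped in a small function, sketched below on a tiny hand-made input. The function name step_vertices is made up for this sketch. The path it produces should match what matplotlib's step draws with where='post', since each y value is held until the next x.

```python
def step_vertices(x, y):
    """Return (xs, ys) with an extra corner point inserted at every change in y."""
    xs, ys = [x[0]], [y[0]]
    for xi, yi in zip(x[1:], y[1:]):
        if yi != ys[-1]:
            # Corner: move to the new x while keeping the old y ...
            xs.append(xi)
            ys.append(ys[-1])
        # ... then add the actual data point at (xi, yi).
        xs.append(xi)
        ys.append(yi)
    return xs, ys

xs, ys = step_vertices([0, 1, 2, 3], [2, 2, 3, 3])
print(xs)  # [0, 1, 2, 2, 3]
print(ys)  # [2, 2, 2, 3, 3]
```

Passing xs and ys to plt.plot then draws the vertical jump at x = 2 instead of a sloped line between x = 1 and x = 2.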
Thanks for the suggestions. I was able to plot it after some research and here is my code,
import csv
import datetime
import matplotlib.pyplot as plt
import numpy as np
import dateutil.relativedelta as rd
import bisect
import scipy as sp
fname = "output.csv"
portfolio_list = []
x = []
a = []
b = []
portfolio = csv.DictReader(open(fname, "r"))
portfolio_list.extend(portfolio)
for data in portfolio_list:
    # csv.DictReader yields strings, so convert to numbers before plotting
    x.append(float(data['i']))
    a.append(float(data['a']))
    b.append(float(data['b']))
stepList = [0, 1,2,3]
fig = plt.figure(figsize=(20, 10))
ax = fig.add_subplot(111)
plt.step(x, a, 'g', where='post')
plt.step(x, b, 'r', where='post')
plt.show()
and got the image like,