Plotting each column in its own subplot - python

I have a text file that contains 2048 rows and 256 columns. I want to plot only the first 10 columns of data, each in its own subplot.
I tried:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data=np.loadtxt("input_data.txt")
data=data[:,0:10]
print(data.shape)
nrows=5
ncols=10
fig, axs = plt.subplots(nrows,ncols, figsize=(13,10))
count = 0
for i in range(ncols):
    for j in range(nrows):
        axs[i,j].plot(data[count])
        count += 1
        print(count)
plt.show()
But it does not plot each column's values. I hope the experts can help me. Thanks.

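I used random numbers to reproduce your problem. Two things needed fixing: the loops indexed axs[i, j] with i running over the grid's columns and j over its rows, and data[count] selects a row of the array, not a column. A 2 x 5 grid gives exactly one axes per column.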
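import numpy as np
import matplotlib.pyplot as plt

# randint's upper bound is exclusive, so this array contains only zeros
data = np.random.randint(0, 1, size=(2048, 256))
data = data[:, 0:10]          # keep the first 10 columns
print(np.shape(data))
nrows = 2
ncols = 5
fig, axs = plt.subplots(nrows, ncols, figsize=(13, 10))
count = 0
for i in range(nrows):        # i indexes the grid rows
    for j in range(ncols):    # j indexes the grid columns
        print(count)
        axs[i, j].plot(data[:, count])   # plot column number `count`
        count += 1
plt.show()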
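This produces a 2 x 5 grid with one line plot per column. Because np.random.randint(0, 1) only ever returns 0, every line here is flat; with real data you will see variation in each subplot.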
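The manual counter can also be avoided by walking the flattened axes array alongside the columns. The following is a minimal sketch rather than the answer's own code: the random array is a stand-in for the contents of input_data.txt, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show the figure on screen
import matplotlib.pyplot as plt
import numpy as np

# Assumption: random data standing in for np.loadtxt("input_data.txt")
rng = np.random.default_rng(0)
data = rng.normal(size=(2048, 256))
columns = data[:, :10]                      # first 10 columns

fig, axs = plt.subplots(2, 5, figsize=(13, 6), sharex=True)

# axs.flat walks the grid row by row; columns.T yields one column at a time
for k, (ax, col) in enumerate(zip(axs.flat, columns.T)):
    ax.plot(col)
    ax.set_title(f"column {k}")

fig.tight_layout()
```

Pairing the axes and the columns with zip means the loop stops at whichever runs out first, so a grid with more axes than columns does not raise an IndexError.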

Related

Multiple boxplots of all categorical variables in one plotting window using seaborn?

In the code below I'd like to loop through all categorical variables in "variables", and show separate boxplots of "fare" for all of them in a single plotting window. How do I do that? Thanks.
import seaborn as sns
sns.set(style="ticks")
titanic = sns.load_dataset("titanic")
variables = list(titanic.select_dtypes(include="object").columns) # list of categorical variables
# single boxplot of fare vs passenger sex
g = sns.catplot(x="sex", y="fare", kind="box", data=titanic.query("fare>0"))
g.set(yscale="log")
Update: The following looping code seems to work, but I'd like some help with cleaning up the plot (attached below) if possible, namely removing the empty subplot window and interior axes ticks/labels. Thanks again.
fig, axs = plt.subplots(nrows=2, ncols=3)
i = j = 0
for variable in variables:
    g = sns.boxplot(x=variable, y="fare", data=titanic.query("fare>0"), ax=axs[i][j])
    g.set(yscale="log")
    j += 1
    if j>2:
        i += 1; j = 0
Update #2: YOLO's code below does the job. Thanks!
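Here's a way to do it:
import matplotlib.pyplot as plt
# Jupyter notebook magic; omit this line in a plain script
%matplotlib inline
plt.figure(figsize=(15,10))
for i, c in enumerate(variables, 1):
    # subplots are created one at a time, so no empty panel is left over
    plt.subplot(2,3,i)
    g = sns.boxplot(x=c, y="fare",data=titanic.query("fare>0"))
    g.set(yscale="log")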
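If the grid is created up front with plt.subplots, the unused panel can be removed afterwards, and sharey=True hides the interior y tick labels. The sketch below is only an illustration under stated assumptions: it uses a small synthetic DataFrame instead of the titanic dataset, and it draws the boxes with matplotlib's own ax.boxplot rather than seaborn so that it has no extra dependencies. The same clean-up should apply to panels drawn with sns.boxplot(..., ax=ax).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show the figure on screen
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: synthetic stand-in for the titanic data (column names are illustrative)
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "fare": rng.lognormal(mean=3.0, sigma=1.0, size=n),   # strictly positive values
    "sex": rng.choice(["male", "female"], size=n),
    "embarked": rng.choice(["S", "C", "Q"], size=n),
    "who": rng.choice(["man", "woman", "child"], size=n),
    "alive": rng.choice(["yes", "no"], size=n),
    "town": rng.choice(["A", "B", "C"], size=n),
})
variables = ["sex", "embarked", "who", "alive", "town"]   # 5 variables, 6 grid cells

fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(12, 7), sharey=True)
axes = axs.flatten()

for ax, variable in zip(axes, variables):
    groups = df.groupby(variable)["fare"]
    labels = list(groups.groups)                          # category names
    ax.boxplot([groups.get_group(k).to_numpy() for k in labels])
    ax.set_xticks(range(1, len(labels) + 1))              # boxplot draws at positions 1..n
    ax.set_xticklabels(labels)
    ax.set_yscale("log")
    ax.set_xlabel(variable)

# Remove every axes that did not receive a plot
for ax in axes[len(variables):]:
    ax.remove()

fig.tight_layout()
```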

Proper Matplotlib axes construction / reuse

I am currently building a set of scatter plot charts using pandas plot.scatter, all constructed on top of two base axes.
My current construction looks akin to
ax1 = pandas.scatter.plot()
ax2 = pandas.scatter.plot(ax=ax1)
for dataframe in list:
    output_ax = pandas.scatter.plot(ax2)
    output_ax.get_figure().save("outputfile.png")
total_output_ax = total_list.scatter.plot(ax2)
total_output_ax.get_figure().save("total_output.png")
This seems inefficient. For 1...N permutations I want to reuse a base axes that has 50% of the data already plotted. What I am trying to do is:
1. Add base data to scatter plot
2. For item x in y: (save data to base scatter and save image)
3. Add all data to scatter plot and save image
Here's one way to do it with ax.scatter.
I plot column 0 on the x-axis and all other columns on the y-axis, one at a time.
Notice that there is only one axes object and I don't replot existing points; the for loop just adds points to the same axes.
Each iteration saves a corresponding PNG image.
import numpy as np
import pandas as pd
np.random.seed(2)
testdf = pd.DataFrame(np.random.rand(20,4))
testdf.head(5) looks like this:
          0         1         2         3
0  0.435995  0.025926  0.549662  0.435322
1  0.420368  0.330335  0.204649  0.619271
2  0.299655  0.266827  0.621134  0.529142
3  0.134580  0.513578  0.184440  0.785335
4  0.853975  0.494237  0.846561  0.079645
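# The first scatter is drawn outside the loop; it could go inside the loop as well
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
# labels give the legend something to show
ax.scatter(testdf[0], testdf[1], color='red', label='column 1')
ax.legend()
fig.savefig('fig_1.png')
colors = ['pink', 'green', 'black', 'blue']
for i in range(2,4):
    # add one more column to the same axes, then save the cumulative figure
    ax.scatter(testdf[0], testdf[i], color=colors[i], label='column ' + str(i))
    ax.legend()  # replaces the previous legend instead of stacking a new one
    fig.savefig('full_' + str(i) + '.png')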
Then you get these 3 images (fig_1.png, full_2.png, full_3.png).
Axes objects cannot be simply copied or transferred. However, it is possible to set artists to visible/invisible in a plot. Given your ambiguous question, it is not fully clear how your data are stored but it seems to be a list of dataframes. In any case, the concept can easily be adapted to different input data.
import matplotlib.pyplot as plt
#test data generation
import pandas as pd
import numpy as np
rng = np.random.default_rng(123456)
df_list = [pd.DataFrame(rng.integers(0, 100, (7, 2))) for _ in range(3)]
#plot all dataframes into an axis object to ensure
#that all plots have the same scaling
fig, ax = plt.subplots()
patch_collections = []
for i, df in enumerate(df_list):
    pc = ax.scatter(x=df[0], y=df[1], label=str(i))
    pc.set_visible(False)
    patch_collections.append(pc)
#store individual plots
for i, pc in enumerate(patch_collections):
    pc.set_visible(True)
    ax.set_title(f"Dataframe {i}")
    fig.savefig(f"outputfile{i}.png")
    pc.set_visible(False)
#store summary plot
for pc in patch_collections:
    pc.set_visible(True)
ax.set_title("All dataframes")
ax.legend()
fig.savefig(f"outputfile_0_{i}.png")
plt.show()
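A third variation on the same idea, closer to the "base data plus one item at a time" workflow described in the question, is to draw the base points once and then add, save, and remove each item's points in turn. This is a minimal sketch under assumptions: base_df, item_dfs, the column names, and the output file names are all hypothetical, and the images go to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show the figure on screen
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: hypothetical base data and per-item data
rng = np.random.default_rng(0)
base_df = pd.DataFrame(rng.random((20, 2)), columns=["x", "y"])
item_dfs = [pd.DataFrame(rng.random((10, 2)), columns=["x", "y"]) for _ in range(3)]

out_dir = tempfile.mkdtemp()

fig, ax = plt.subplots()
ax.scatter(base_df["x"], base_df["y"], color="gray", label="base")  # drawn only once
ax.set_xlim(0, 1)   # fixed limits so every saved image has the same scaling
ax.set_ylim(0, 1)

saved = []
for i, df in enumerate(item_dfs):
    pts = ax.scatter(df["x"], df["y"], label=f"item {i}")  # add this item's points
    ax.legend()                                            # refresh the legend
    path = os.path.join(out_dir, f"item_{i}.png")
    fig.savefig(path)
    saved.append(path)
    pts.remove()                                           # back to base data only
```

Removing the artist instead of hiding it keeps the axes holding only the base points between iterations, which may matter when the number of items is large.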

Generate multiple plots with for loop; display output in matplotlib subplots

Objective: To generate 100 barplots using a for loop, and display the output as a subplot image
Data format: Datafile with 101 columns. The last column is the X variable; the remaining 100 columns are the Y variables, against which X is plotted.
Desired output: Barplots in a 5 x 20 subplot array.
Current approach: I've been using PairGrid in seaborn, which generates an n x 1 array,
where input == dataframe; input3 == list from which column headers are called:
for i in input3:
    plt.figure(i)
    g = sns.PairGrid(input,
                     x_vars=["key_variable"],
                     y_vars=i,
                     aspect=.75, size=3.5)
    g.map(sns.barplot, palette="pastel")
Does anyone have any ideas how to solve this?
To give an example of how to plot 100 dataframe columns over a grid of 5 x 20 subplots (5 rows, 20 columns):
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
data = np.random.rand(3,101)
data[:,0] = np.arange(2,7,2)
df = pd.DataFrame(data)
fig, axes = plt.subplots(nrows=5, ncols=20, figsize=(21,9), sharex=True, sharey=True)
for i, ax in enumerate(axes.flatten()):
    ax.bar(df.iloc[:,0], df.iloc[:,i+1])
    ax.set_xticks(df.iloc[:,0])
plt.show()
You can try to use matplotlib's subplots to create the plot grid and pass the axis to the barplot. The axis indexing you could do using a nested loop...
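The nested-loop indexing mentioned above could look roughly like the following. It is a sketch under assumptions: the DataFrame is random stand-in data laid out as in the question (100 Y columns followed by the X column), the column names are made up, and the bars are drawn with matplotlib's ax.bar. With seaborn, the same axes object can be handed to sns.barplot through its ax parameter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show the figure on screen
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

nrows, ncols = 5, 20

# Assumption: stand-in data, 100 Y columns (y0..y99) followed by one X column
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((3, nrows * ncols)),
                  columns=[f"y{k}" for k in range(nrows * ncols)])
df["x"] = [2, 4, 6]

x = df["x"].to_numpy()
y_columns = df.columns[:-1]          # every column except the last one (X)

fig, axes = plt.subplots(nrows, ncols, figsize=(21, 9), sharex=True, sharey=True)

for r in range(nrows):
    for c in range(ncols):
        ax = axes[r, c]
        name = y_columns[r * ncols + c]   # flat, row-major index into the Y columns
        ax.bar(x, df[name].to_numpy())
        ax.set_title(name, fontsize=7)
        ax.set_xticks(x)
```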

Multiple titles (suptitle) with subplots

I have a series of 9 subplots in a 3x3 grid, each subplot with a title.
I want to add a title for each row. To do so I thought about using suptitle.
The problem is that if I use 3 suptitles, they seem to overwrite each other and only the last one is shown.
Here is my basic code:
fig, axes = plt.subplots(3,3,sharex='col', sharey='row')
for j in range(9):
    axes.flat[j].set_title('plot '+str(j))
plt1 = fig.suptitle("row 1",x=0.6,y=1.8,fontsize=18)
plt2 = fig.suptitle("row 2",x=0.6,y=1.2,fontsize=18)
plt3 = fig.suptitle("row 3",x=0.6,y=0.7,fontsize=18)
fig.subplots_adjust(right=1.1,top=1.6)
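A figure has only one suptitle, so each call to fig.suptitle replaces the previous text. Instead, you can tinker with the subplot titles and axis labels. Check the following example adapted from your code: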
import matplotlib.pyplot as plt
fig, axes = plt.subplots(3,3,sharex='col', sharey='row')
counter = 0
for j in range(9):
    if j in [0,3,6]:
        # first subplot of each row: use its y-label as the row title
        axes.flat[j].set_ylabel('Row '+str(counter), rotation=0, size='large',labelpad=40)
        counter = counter + 1
    if j in [0,1,2]:
        # top row: put a column heading above the subplot title
        axes.flat[j].set_title('Column '+str(j)+'\n\nplot '+str(j))
    else:
        axes.flat[j].set_title('plot '+str(j))
plt.show()
This puts a row label to the left of each row and a column heading above each column.
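If one centered title per row is really wanted, another option is to give each row its own subfigure, since every subfigure can carry its own suptitle. This is a sketch, not part of the answer above, and it assumes matplotlib 3.4 or newer, where Figure.subfigures is available. Note that axes in different subfigures do not share their x-axis the way sharex='col' does in a single grid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show the figure on screen
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 9), constrained_layout=True)
subfigs = fig.subfigures(nrows=3, ncols=1)      # one subfigure per row

row_titles = []
plot_number = 0
for r, subfig in enumerate(subfigs):
    # each subfigure has its own suptitle, so nothing gets overwritten
    row_titles.append(subfig.suptitle(f"row {r + 1}", fontsize=18))
    row_axes = subfig.subplots(nrows=1, ncols=3, sharey=True)
    for ax in row_axes:
        ax.set_title(f"plot {plot_number}")
        plot_number += 1
```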

Plot pandas Series in separate subplots using matplotlib

Hoping to get some help, please. I'm trying to plot simulation data in separate subplots using pandas and matplotlib. My code so far is:
import matplotlib.pylab as plt
import pandas as pd
fig, ax = plt.subplots(2, 3)
for i in range(2):
    for j in range(50, 101, 10):
        for e in range(3):
            Var=(700* j)/ 100
            Names1 = ['ig','M_GZ']
            Data1 = pd.read_csv('~/File/JTL_'+str(Var)+'/GZ.csv', names=Names1)
            ig = Data1['ig']
            M_GZ=Data1['M_GZ']
            MGZ = Data1[Data1.M_GZ != 0]
            ax[i, e].plot(MGZ['ig'][:4], MGZ['M_GZ'][:4], '--v', linewidth=1.75)
plt.tight_layout()
plt.show()
But the code gives me 6 duplicate copies of the same plot instead of each iteration of Var having its own plot. I've tried changing the loop and using different variations like:
fig = plt.figure()
for i in range(1, 7):
    ax = fig.add_subplot(2, 3, i)
    for j in range(50, 101, 10):
        Var=(700* j)/ 100
        Names1 = ['ig','M_GZ']
        Data1 = pd.read_csv('~/File/JTL_'+str(Var)+'/GZ.csv', names=Names1)
        ig = Data1['ig']
        M_GZ=Data1['M_GZ']
        MGZ = Data1[Data1.M_GZ != 0]
        ax.plot(MGZ['ig'][:4], MGZ['M_GZ'][:4], '--v', linewidth=1.75)
plt.tight_layout()
plt.show()
but that changes nothing; I still get the same plot as above. Any help would be appreciated. I'm hoping that each subplot contains one set of data instead of all six.
One of the data files was linked as an example. Each subdirectory ~/File/JTL_'+str(Var)+'/ contains a copy of this file; there are 6 in total.
The problem is in your loop:
for i in range(2): # Iterating rows of the plot
    for j in range(50, 101, 10): # Iterating your file names
        for e in range(3): # iterating the columns of the plot
The end result is that every file is plotted in every column of every row, so all six subplots end up with the same six curves.
For it to work, you should have only two nesting levels in your loop, with the file chosen from the row and column. Potential code (updated):
import matplotlib.pylab as plt
import pandas as pd
fig, ax = plt.subplots(2, 3)
file_ids = range(50, 101, 10)  # 50, 60, ..., 100: one value per subplot
for row in range(2):
    for col in range(3):
        # flat, row-major index into the 2 x 3 grid (3 columns per row)
        f_index = file_ids[row * 3 + col]
        print(row, col, f_index)
        Var=(700* f_index)/ 100
        Names1 = ['ig','M_GZ']
        Data1 = pd.read_csv('~/File/JTL_'+str(Var)+'/GZ.csv', names=Names1)
        ig = Data1['ig']
        M_GZ=Data1['M_GZ']
        MGZ = Data1[Data1.M_GZ != 0]
        ax[row, col].plot(MGZ['ig'][:4], MGZ['M_GZ'][:4], '--v',linewidth=1.75)
plt.tight_layout()
plt.show()
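The mapping from a (row, col) position to a file depends on the flat index being row * ncols + col. Writing row + 1 * col instead evaluates to row + col because multiplication binds tighter than addition, which repeats some files and skips others. The short sketch below checks both expressions without reading any files; file_ids mirrors range(50, 101, 10) from the code above, and the other names are illustrative.

```python
nrows, ncols = 2, 3
file_ids = list(range(50, 101, 10))   # [50, 60, 70, 80, 90, 100]

# row + 1 * col is parsed as row + (1 * col), so neighbouring rows overlap
overlapping = [[file_ids[row + 1 * col] for col in range(ncols)]
               for row in range(nrows)]

# row * ncols + col walks the grid row by row and uses each file exactly once
row_major = [[file_ids[row * ncols + col] for col in range(ncols)]
             for row in range(nrows)]

print(overlapping)  # [[50, 60, 70], [60, 70, 80]]
print(row_major)    # [[50, 60, 70], [80, 90, 100]]

# divmod inverts the mapping: flat index -> (row, col)
print(divmod(4, ncols))  # (1, 1)
```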
