Dendrogram Label Overlapping


I have a 2D array of relation data with labels (first row and first column).
When I created the dendrogram, my labels overlapped.
How can I space the labels evenly?
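import numpy as np
import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as sch

# Read the header row to get the labels; the first entry is the corner cell, so drop it
with open(fileName) as file:
    line = file.readline().rstrip('\n')
populations = line.split('\t')
del populations[0]
data = np.loadtxt(fileName, delimiter="\t", skiprows=1, usecols=range(1, len(populations) + 1))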
fig, ax = plt.subplots()
Y1 = sch.linkage(data, method='ward',optimal_ordering=True)
Z1 = sch.dendrogram(Y1, orientation='top')
ind1= Z1['leaves']
arr = np.array(populations)
populations = arr[ind1]
ax.set_xticks([])
ax.set_xticks(np.arange(len(populations)))
ax.set_xticklabels(populations )
plt.xticks(rotation=90)
plt.show()

I think it may be easier to simply specify the labels when constructing the dendrogram, since they are already known at that point. Something like the following:
import matplotlib.pyplot as plt
import numpy as np # Only needed for random sample data
import scipy.cluster.hierarchy as sch
np.random.seed(1) # Seeded for reproducibility
populations = np.arange(10) # Sample labels, one per observation
data = np.abs(np.random.randn(10, 2)) # Random sample data: 10 observations, so the label count matches
fig, ax = plt.subplots()
Y1 = sch.linkage(data, method='ward',optimal_ordering=True)
Z1 = sch.dendrogram(Y1, orientation='top', labels=populations)
plt.show()
This gives a dendrogram in which each label sits directly under its leaf.
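As for why the labels in the question overlap: `dendrogram` draws its leaves at x = 5, 15, 25, ... (10 units apart), while `np.arange(len(populations))` places the ticks at 0, 1, 2, .... If you would rather set the tick labels by hand, a minimal sketch could look like the following. The random data and the `pop0`, `pop1`, ... names are made up for illustration, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
import numpy as np
import scipy.cluster.hierarchy as sch

rng = np.random.default_rng(1)
data = rng.random((8, 3))                         # made-up observations (8 rows)
names = np.array([f"pop{i}" for i in range(8)])   # made-up labels, one per row

fig, ax = plt.subplots()
linkage = sch.linkage(data, method="ward")
dendro = sch.dendrogram(linkage, orientation="top", no_labels=True, ax=ax)

# Leaves are drawn at x = 5, 15, 25, ..., in the order given by dendro["leaves"]
leaf_order = dendro["leaves"]
tick_positions = 10 * np.arange(len(leaf_order)) + 5
ax.set_xticks(tick_positions)
ax.set_xticklabels(names[leaf_order], rotation=90)
fig.tight_layout()
# Call plt.show() or fig.savefig(...) here to look at the result
```

Passing `labels=` to `dendrogram` as shown above is still the simpler route; this variant is only useful when the ticks need custom handling.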

How to use dates in this code for y axis?

The author of the original example used dates in the second graph. I was wondering how dates can be used with the scipy.signal.argrelextrema function.
With the line below, nothing is found: peak_x and peak_y are printed as empty arrays.
data_y = np.array('2015-07-04', dtype=np.datetime64) + np.arange(25)
Here's the link for the original code:
https://openwritings.net/pg/python/python-find-peaks-and-valleys-chart-using-scipysignalargrelextrema
import matplotlib
matplotlib.use('Agg') # Bypass the need to install Tkinter GUI framework
from scipy import signal
import numpy as np
import matplotlib.pyplot as plt
# Generate random data.
data_x = np.arange(start = 0, stop = 25, step = 1, dtype='int')
data_y = np.array('2015-07-04', dtype=np.datetime64) + np.arange(25) #edited part
# Find peaks(max).
peak_indexes = signal.argrelextrema(data_y, np.greater)
peak_indexes = peak_indexes[0]
# Find valleys(min).
valley_indexes = signal.argrelextrema(data_y, np.less)
valley_indexes = valley_indexes[0]
# Plot main graph.
(fig, ax) = plt.subplots()
ax.plot(data_x, data_y)
# Plot peaks.
peak_x = peak_indexes
peak_y = data_y[peak_indexes]
ax.plot(peak_x, peak_y, marker='o', linestyle='dashed', color='green', label="Peaks")
print(peak_x,peak_y)
# Plot valleys.
valley_x = valley_indexes
valley_y = data_y[valley_indexes]
ax.plot(valley_x, valley_y, marker='o', linestyle='dashed', color='red', label="Valleys")
# Save graph to file.
plt.title('Find peaks and valleys using argrelextrema()')
plt.legend(loc='best')
plt.savefig('argrelextrema.png')
Here's an example of how it should look: [image not available]

You're going to want to use the xticks method. See below:
import matplotlib.pyplot as plt
names = [str(i) for i in range(20)]
x_data = [x for x in range(20)]
y_data = [x for x in range(20)]
plt.plot(x_data, y_data)
plt.xticks(x_data, labels=names)
plt.show()
What this does is use each integer from 0 to 19, cast to a string, as a tick label on the x axis.
In your case you want to swap out the names for datetime objects cast to strings. For xticks, the first argument (x_data here) prescribes where the ticks will be. You may use any set of points so long as they are within the bounds of the x data.
In your case, replace:
data_y = np.array('2015-07-04', dtype=np.datetime64) + np.arange(25)
with
data_y_ticks = np.array('2015-07-04', dtype=np.datetime64) + np.arange(25)
data_y = list(range(len(data_y_ticks)))
then plot as follows:
plt.plot(data_y, data_x)
plt.xticks(data_y, labels=data_y_ticks.astype(str), rotation=90)
plt.show()
Just a heads-up: your X and Y axis names are flipped in your code. I did not rename them in my example, but I did swap their positions in the plot call so that the plot makes sense.
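One more point on the empty result: a sequence of consecutive dates increases monotonically, so it has no local maxima or minima for argrelextrema to find. The usual arrangement is to search for extrema in the measured values and use the dates only as x coordinates. A minimal sketch, assuming a made-up `values` series (a sine wave) in place of real measurements:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, as in the question
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal

dates = np.array("2015-07-04", dtype=np.datetime64) + np.arange(25)
values = np.sin(np.arange(25) / 2.0)  # made-up measurements, one per date

# The dates alone are strictly increasing, so no local extrema exist
date_numbers = dates.astype("int64")
no_peaks = signal.argrelextrema(date_numbers, np.greater)[0]

# Search the values instead, then look up the matching dates
peak_indexes = signal.argrelextrema(values, np.greater)[0]
valley_indexes = signal.argrelextrema(values, np.less)[0]
peak_dates = dates[peak_indexes]
valley_dates = dates[valley_indexes]

fig, ax = plt.subplots()
ax.plot(dates, values)
ax.plot(peak_dates, values[peak_indexes], "o", color="green", label="Peaks")
ax.plot(valley_dates, values[valley_indexes], "o", color="red", label="Valleys")
ax.legend(loc="best")
fig.autofmt_xdate()  # rotate the date labels so they do not collide
fig.savefig("argrelextrema_dates.png")
```

Here matplotlib handles the `datetime64` values on the x axis directly, so no manual tick labels are needed.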

how to animate an image derived from a 2d histogram

I am trying to create an animation of a scatterplot as well as a 2d Histogram. I can get the scatter plot working. I can also create individual stills of the 2d Histogram but cannot get it to animate with the scatter plot.
I can create some mock data if that would help. Please find code below.
import numpy as np
import matplotlib.pyplot as plt
import csv
import matplotlib.animation as animation
#Create empty lists
visuals = [[],[],[]]
#This dataset contains XY coordinates from 21 different players derived from a match
with open('Heatmap_dataset.csv') as csvfile :
    readCSV = csv.reader(csvfile, delimiter=',')
    n=0
    for row in readCSV :
        if n == 0 :
            n+=1
            continue
        #Append all the X-Coordinates and all the Y-Coordinates, as the data is read across the row, not down.
        #X values are in the odd columns 3..43, Y values in the even columns 2..42.
        visuals[0].append([float(row[k]) for k in range(3, 44, 2)])
        visuals[1].append([float(row[k]) for k in range(2, 43, 2)])
        visuals[2].append([1,2])
#Create a list that contains all the X-Coordinates and all the Y-Coordinates. The 2nd list indicates the row. So visuals[1][100] would be the 100th row.
Y = visuals[1][0]
X = visuals[0][0]
fig, ax = plt.subplots(figsize = (8,8))
plt.grid(False)
# Create scatter plot
scatter = ax.scatter(visuals[0][0], visuals[1][0], c=['white'], alpha = 0.7, s = 20, edgecolor = 'black', zorder = 2)
#Create 2d Histogram
data = (X, Y)
data,x,y,p = plt.hist2d(X,Y, bins = 15, range = np.array([(-90, 90), (0, 140)]))
#Smooth with filter
im = plt.imshow(data.T, interpolation = 'gaussian', origin = 'lower', extent = [-80,80,0,140])
ax.set_ylim(0,140)
ax.set_xlim(-85,85)
#Define animation.
def animate(i) :
    #Move each of the 21 players to its position in row i
    scatter.set_offsets([[visuals[0][i][k], visuals[1][i][k]] for k in range(21)])
    # This is where I'm having trouble... How do I animate the image derived from the 2d histogram?
    im.set_array[i+1]
ani = animation.FuncAnimation(fig, animate, np.arange(0,1000),
                              interval = 100, blit = False)

The image can be updated with im.set_data(data), where you recompute the histogram (for example with np.histogram2d) to get the updated data to pass to im. As a minimal example,
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
X = np.random.randn(100000)
Y = np.random.randn(100000) + 5
fig, ax = plt.subplots(figsize = (8,8))
#Create 2d Histogram
data,x,y = np.histogram2d(X,Y, bins = 15)
#Smooth with filter
im = plt.imshow(data.T, interpolation = 'gaussian', origin = 'lower')
#Define animation.
def animate(i) :
    X = np.random.randn(100000)
    Y = np.random.randn(100000) + 5
    data,x,y = np.histogram2d(X,Y, bins = 15)
    im.set_data(data.T) # transposed, to match the initial imshow(data.T)
ani = animation.FuncAnimation(fig, animate, np.arange(0,1000),
                              interval = 100, blit = False)
plt.show()
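Applied to per-frame player positions like those in the question, the same idea might look as follows. This is only a sketch under assumptions: `xs` and `ys` are mock arrays of shape (frames, players) standing in for the CSV data, and the fixed `vmin`/`vmax` color limits are an arbitrary choice so that all frames share one color scale.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_players = 50, 21
# Mock positions (assumption): one row per frame, one column per player
xs = rng.uniform(-80, 80, size=(n_frames, n_players))
ys = rng.uniform(0, 140, size=(n_frames, n_players))

hist_range = [(-90, 90), (0, 140)]

def frame_histogram(i):
    """Return the 2D histogram of frame i, transposed for imshow."""
    counts, _, _ = np.histogram2d(xs[i], ys[i], bins=15, range=hist_range)
    return counts.T

fig, ax = plt.subplots(figsize=(8, 8))
im = ax.imshow(frame_histogram(0), interpolation="gaussian", origin="lower",
               extent=[-90, 90, 0, 140], aspect="auto", vmin=0, vmax=3)
scatter = ax.scatter(xs[0], ys[0], c="white", edgecolor="black", s=20, zorder=2)

def animate(i):
    # Update both artists from the data of frame i
    im.set_data(frame_histogram(i))
    scatter.set_offsets(np.column_stack((xs[i], ys[i])))
    return im, scatter

ani = animation.FuncAnimation(fig, animate, frames=n_frames,
                              interval=100, blit=False)
# Call plt.show() or ani.save(...) here to render the animation
```

Using the same `range` for every frame keeps the bin edges fixed, so the image and the scatter points stay aligned from frame to frame.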

3D matplotlib: color depending on x axis position

Dear Stackoverflow users,
I'm using matplotlib's 3D plotting to generate 3D envelopes. So far I have almost what I want, but there is one last detail I would like to solve: I would like the envelope to be colored according to the x axis values and not according to the z axis values.
I admit I copied parts of the code to get the graph without understanding each line in detail, and a few lines remain cryptic to me. Each line I don't understand is marked with the comment "Here line I don't understand", so that if the modification I need turns out to be in one of those lines, you know I did not understand it. Here is the working code:
# ----- System libraries and plot parameters-----
import argparse
import re
import glob, os, sys
import subprocess
import math
import copy
import hashlib
import scipy
from scipy import optimize
import time
from decimal import *
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
import matplotlib.colors as colors
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.ticker import MaxNLocator
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from numpy.random import randn, shuffle
from scipy import linspace, meshgrid, arange, empty, concatenate, newaxis, shape
import numpy as np
import numpy
from mpl_toolkits.axes_grid1 import make_axes_locatable
params = {'legend.fontsize' : 70,
'figure.figsize' : (80, 30),
'axes.labelsize' : 70,
'axes.titlesize' : 70,
'xtick.labelsize' : 70,
'ytick.labelsize' : 70}
pylab.rcParams.update(params)
FFMPEG_BIN = r"C:\Users\User\Desktop\ffmpeg-20170125-2080bc3-win64-static\bin\ffmpeg.exe"
parser = argparse.ArgumentParser(description='utility to print 3D sigma profiles', formatter_class=argparse.RawDescriptionHelpFormatter)
parser.add_argument('--name', type=str, help='name of prf and pot files without the extension, example for tempjob1.prf: --name="tempjob1"', default=["all"])
args = parser.parse_args()
#parse sigma profile
name = args.name + ".prf"
with open(name) as f:
    sig_prof_set = f.read().splitlines()
sigma = list()
profile = list()
sigma_set = list()
profile_set = list()
dieze = 0
for l in sig_prof_set:
    if dieze < 2: #the first dummy compound should not be taken into account and once we reach the second compound, it is the first layer so we start the filling
        if "#" in l:
            dieze += 1
            pass
    else:
        if "#" in l:
            if dieze > 1: #each time we reach a dieze, we store the sigma profile gathered into the sigma profile set and empty the list for the next
                sigma_set.append(sigma)
                profile_set.append(profile)
                sigma = list()
                profile = list()
            dieze += 1 #the first dummy compound should not be taken into account
        else:
            splitted = l.split()
            sigma.append(splitted[0])
            profile.append(splitted[1])
#display 3D plot
fig = plt.figure()
#convert data to numpy arrays
sigma_set = numpy.array(sigma_set)
profile_set = numpy.array(profile_set)
potential_set = numpy.array(potential_set)
#shape data for graphs
layer = numpy.array(range(len(sigma_set)))
layer_flatten = list()
sigma_flatten = list()
profile_flatten = list()
potential_flatten = list()
#X is sigma, Y is layer number, Z is profile or potential
for i in layer:
    for j in range(len(sigma_set[0])):
        layer_flatten.append(layer[i])
        sigma_flatten.append(float(sigma_set[i][j]))
        profile_flatten.append(float(profile_set[i][j]))
        potential_flatten.append(float(potential_set[i][j]))
#assign graph data
X = numpy.array(sigma_flatten)
Y = numpy.array(layer_flatten)
Z1 = numpy.array(profile_flatten)
Z2 = numpy.array(potential_flatten)
#actually make 3D plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d') #Here line I don't understand
surf = ax.plot_trisurf(X, Y, Z1, cmap=cm.jet, linewidth=0)
fig.colorbar(surf)
#set title of graph and axes
title = ax.set_title("Z-dependent sigma-profile")
title.set_y(1.01) #Here line I don't understand
ax.xaxis.set_major_locator(MaxNLocator(5)) #Here line I don't understand
ax.yaxis.set_major_locator(MaxNLocator(6)) #Here line I don't understand
ax.zaxis.set_major_locator(MaxNLocator(5)) #Here line I don't understand
ax.set_xlabel('sigma (e/A^2)')
ax.set_ylabel('layer')
ax.set_zlabel('p(sigma)')
ax.xaxis.labelpad = 100
ax.yaxis.labelpad = 70
ax.zaxis.labelpad = 70
fig.tight_layout() #Here line I don't understand
#save the figure
fig.savefig('3D_sig_prf{}.png'.format(args.name))
This generates a figure (not shown here) in which the surface is colored according to its z values.
How can I use the same colors but map them to the x values instead of the z values, which seem to be used automatically?
Thanks in advance! 🙂
Best regards!

Setting the color of a trisurf plot to something other than its Z values is not possible, since unfortunately plot_trisurf ignores the facecolors argument.
However, using a normal plot_surface makes it possible to supply an array of colors to facecolors.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
X,Y = np.meshgrid(np.arange(10), np.arange(10))
Z = np.sin(X) + np.sin(Y)
x = X.flatten()
y = Y.flatten()
z = Z.flatten()
fig = plt.figure(figsize=(9,3.2))
plt.subplots_adjust(0,0.07,1,1,0,0)
ax = fig.add_subplot(121, projection='3d')
ax2 = fig.add_subplot(122, projection='3d')
ax.set_title("trisurf with color acc. to z")
ax2.set_title("surface with color acc. to x")
ax.plot_trisurf(x,y,z , cmap="magma")
colors =plt.cm.magma( (X-X.min())/float((X-X.min()).max()) )
ax2.plot_surface(X,Y,Z ,facecolors=colors, linewidth=0, shade=False )
ax.set_xlabel("x")
ax2.set_xlabel("x")
plt.show()
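Since the data in the question lies on a regular grid (one sigma axis, one layer axis), it can probably be reshaped into 2D arrays and passed to plot_surface in the same way. The sketch below uses mock arrays in place of the parsed `.prf` data, keeps the jet colormap from the question, and adds a colorbar that refers to x through a `ScalarMappable`. The variable names and the mock profile function are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm, colors

# Mock stand-ins (assumption) for the sigma values and layer numbers
sigma = np.linspace(-0.03, 0.03, 41)
layers = np.arange(12)
X, Y = np.meshgrid(sigma, layers)               # both of shape (12, 41)
Z = np.exp(-(X / 0.01) ** 2) * (1 + 0.1 * Y)    # made-up profile values

# Map x values to colors: normalize to [0, 1], then look up the colormap
cmap = plt.get_cmap("jet")
norm = colors.Normalize(vmin=X.min(), vmax=X.max())
face_colors = cmap(norm(X))                     # RGBA array of shape (12, 41, 4)

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(X, Y, Z, facecolors=face_colors, linewidth=0, shade=False)

# The surface no longer carries a colormap itself, so build the colorbar separately
mappable = cm.ScalarMappable(norm=norm, cmap=cmap)
mappable.set_array([])
fig.colorbar(mappable, ax=ax, shrink=0.6, label="sigma")
ax.set_xlabel("sigma")
ax.set_ylabel("layer")
```

With the flattened lists from the question, `numpy.array(sigma_flatten).reshape(len(layer), -1)` would give the 2D form, provided every layer has the same number of sigma points.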

3d plot from two vectors and an array

I have two vectors of lengths 81 and 105 that store my X and Y values, and an (81, 105) array (actually a list of lists) that stores my Z values for those X, Y pairs. What would be the best way to plot this in 3D? This is what I've tried:
Z = np.load('Z.npy')
X = np.load('X.npy')
Y = np.linspace(0, 5, 105)
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z, cmap= 'viridis')
plt.show()
I get the following error : ValueError: shape mismatch: objects cannot be broadcast to a single shape

OK, I got it running. There are some tricks here, which I mention in the code comments.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from random import shuffle
# produce some data.
x = np.linspace(0,1,81)
y = np.linspace(0,1,105)
z = [[i for i in range(81)] for x in range(105)]
array_z = np.array(z)
# Make them randomized.
shuffle(x)
shuffle(y)
shuffle(z)
# Match data in x and y.
data = []
for i in range(len(x)):
    for j in range(len(y)):
        data.append([x[i], y[j], array_z[j][i]])
        # Be careful how your data is stored in your Z array.
# Store in a dataframe
results = pd.DataFrame(data, columns = ['x','y','z'])
# Plot the data.
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111, projection='3d')
# cmap only has an effect when c is given, so color the points by z
ax.scatter(results.x, results.y, results.z, c=results.z, cmap= 'viridis')
plt.show()
The picture looks odd because the data is made up. Hope it helps.
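If a surface rather than a scatter plot is wanted, the shape mismatch from the question can be avoided by expanding the two vectors into 2D coordinate arrays that match Z. A minimal sketch, assuming made-up `x`, `y` and `z` in place of the loaded `.npy` files:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 81)
y = np.linspace(0, 5, 105)
# Made-up Z values with the same layout as in the question: shape (81, 105)
z = np.sin(2 * np.pi * x)[:, None] * np.cos(y)[None, :]

# indexing="ij" gives arrays of shape (len(x), len(y)), matching z directly
X, Y = np.meshgrid(x, y, indexing="ij")

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(X, Y, z, cmap="viridis")
# Call plt.show() or fig.savefig(...) here to look at the result
```

With the default `indexing="xy"`, meshgrid would return (105, 81) arrays instead, in which case `z.T` would be the matching array. A list of lists can be converted first with `np.array(Z)`.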

add new plot to existing figure

I have a script with some plots (see example code). After some other steps I want to add a new plot to an existing figure. But when I try that, the plot is added to the last created figure (now fig2).
I can't figure out how to change that...
import matplotlib.pylab as plt
import numpy as np
n = 10
x1 = np.arange(n)
y1 = np.arange(n)
fig1 = plt.figure()
ax1 = fig1.add_subplot(111)
ax1.plot(x1,y1)
fig1.show()
x2 = np.arange(10)
y2 = n/x2
# add new data and create new figure
fig2 = plt.figure()
ax2 = fig2.add_subplot(111)
ax2.plot(x2,y2)
fig2.show()
# do something with data to compare with new data
y1_geq = y1 >= y2
y1_a = y1**2
ax1.plot(y1_geq.nonzero()[0],y1[y1_geq],'ro')
fig1.canvas.draw

Since your code does not run without errors, I'll provide a sample snippet showing how to plot several datasets in the same plot:
import matplotlib.pyplot as plt
xvals = [i for i in range(0, 10)]
yvals1 = [i**2 for i in range(0, 10)]
yvals2 = [i**3 for i in range(0, 10)]
f, ax = plt.subplots(1)
ax.plot(xvals, yvals1)
ax.plot(xvals, yvals2)
plt.show()
So the basic idea is to call ax.plot() on the same Axes object for every dataset you want to appear in the same plot.
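For the two-figure situation in the question, the same principle applies: keep a reference to each Axes and plot on it explicitly, no matter which figure was created last. A minimal sketch with made-up data (x starts at 1 to avoid dividing by zero); note that `fig1.canvas.draw()` needs its parentheses to actually be called:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 11)

fig1, ax1 = plt.subplots()
ax1.plot(x, x)

fig2, ax2 = plt.subplots()   # fig2 is now the "current" figure
ax2.plot(x, 10 / x)

# Later: add to the first figure through its own Axes object
mask = x >= 10 / x
ax1.plot(x[mask], x[mask], "ro")
fig1.canvas.draw()           # redraw fig1 so the new points appear
```

Calls such as `plt.plot(...)` always target the current figure, which is why going through `ax1` avoids the problem.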
