Bar plot cutting off axis title - python

I'm trying to draw a bar plot with vertical axis labels and an axis title.
The script below makes the graph, but it cuts off the x-axis label/title. Even if I make the picture bigger on my screen, it is still cut off a bit. Also, I have to run the script twice: the first time I get an error about the fontdict property, but the next time it works.
Does anyone know how to keep it from being cut off? Also, I am just saving the figure that pops up on the screen, because saving from the script is not working for some reason.
Thanks!
import numpy
import matplotlib
import matplotlib.pylab as pylab
import matplotlib.pyplot
import pdb
from collections import Counter
phenos = [128, 20, 0, 144, 4, 16, 160, 136, 192, 128, 20, 0, 4, 16, 144, 130, 136, 132, 22,
128, 160, 4, 0, 36, 132, 136, 130, 128, 22, 4, 0, 144, 160, 130, 132,
128, 4, 0, 136, 132, 68, 130, 192, 8, 128, 4, 0, 20, 22, 132, 144, 192, 130, 2,
128, 4, 0, 132, 20, 136, 144, 192, 64, 130, 128, 4, 0, 144, 132, 192, 20, 16, 136,
128, 4, 0, 130, 160, 132, 192, 2, 128, 4, 0, 132, 68, 160, 192, 36, 64,
128, 4, 0, 136, 192, 8, 160, 36, 128, 4, 0, 22, 20, 144, 132, 160,
128, 4, 0, 132, 20, 192, 144, 160, 68, 64, 128, 4, 0, 132, 160, 144, 136, 192, 68, 20]
from collections import Counter
import numpy as np
import matplotlib.pyplot as plt
from operator import itemgetter
c = Counter(phenos).items()
c.sort(key=itemgetter(1))
font = {'family' : 'sanserif',
'color' : 'black',
'weight' : 'normal',
'size' : 22,
}
font2 = {'family' : 'sansserif',
'color' : 'black',
'weight' : 'normal',
'size' : 18,
}
labels, values = zip(*c)
labels = ("GU", "IT", "AA", "SG", "A, IGI", "A, SG", "GU, A, AA", "D, GU", "D, IT", "A, AA", "D, IGI", "D, AA", "192", "D, A", "D, H", "H", "A")
pylab.show()
pylab.draw()
indexes = np.arange(0, 2*len(labels), 2)
width = 2
plt.bar(indexes, values, width=2, color="blueviolet")
plt.xlabel("Phenotype identifier", fontdict=font)
plt.ylabel("Number of occurances in top 10 \n phenotypes for cancerous tumours", fontdict=font)
#plt.title("Number of occurances for different phenotypes \n in top 10 subclones of a tumour", fontdict=font2)
plt.xticks(indexes + width * 0.5, labels, rotation='vertical', fontdict=font2)
plt.figure(figsize=(8.0, 7.0))
pictureFileName2 = "..\\Stats\\" + "Phenos2.png"
pylab.savefig(pictureFileName2, dpi=800)
#fig.set_size_inches(18.5,10.5)
#plt.savefig('test2png.png',dpi=100)

Three problems:
1. It is not really the case that the code fails on the first run and works on the second. The reason is that you call .show() before making the plot. On the first run, execution stops at the line named in the exception message. On the second run, .show() is executed first and displays the partially built plot left over from the previous run.
2. fontdict=font2 etc. is unnecessary and in fact wrong. You just need **font2 etc.
3. The truncated tick labels. There are many ways to fix this, but the basic idea is to increase the white space around the plot. Alternatives are:
plt.gcf().subplots_adjust(bottom=0.35, top=0.7) #adjusting the plotting area
plt.tight_layout() #may raise an exception, depends on which backend is in use
plt.savefig('test.png', bbox_inches='tight', pad_inches = 0.0) #use bbox and pad, if you only want to change the saved figure.
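Putting these points together, here is a minimal sketch, not the asker's full script. The labels and counts are a shortened, made-up subset, the font family is assumed to be "sans-serif", and the Agg backend and in-memory buffer are only there so the example runs without a display or a file path. The figure is created before plotting, because calling plt.figure() after the bars are drawn starts a new, empty figure, and that empty figure is what savefig then writes.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is required
import matplotlib.pyplot as plt

# Hypothetical, shortened data standing in for the question's phenotype counts
labels = ["GU", "IT", "A, IGI", "GU, A, AA"]
values = [3, 5, 8, 13]
font = {"family": "sans-serif", "color": "black", "weight": "normal", "size": 18}

# Create the figure first, then draw into it
fig, ax = plt.subplots(figsize=(8.0, 7.0))
positions = range(len(labels))
ax.bar(positions, values, color="blueviolet")
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation="vertical", fontsize=18)
ax.set_xlabel("Phenotype identifier", **font)  # text properties unpacked as keywords

# Make room for the rotated tick labels and the axis title
fig.tight_layout()

# Save to memory here; pass a file name instead to write a PNG to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
png_bytes = buffer.getvalue()
plt.close(fig)
```

Either tight_layout() or bbox_inches='tight' is usually enough on its own; both are shown only to illustrate where each call goes.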

How to read plot points from a text file, reading each line of text individually?

So I am very new to coding, and I have to draw an image in python using multiple separate polygons with turtle with the plot points all contained in a single .txt file. I have my file set up, so each set of points is on a different line of text, but I'm just not sure how to call each line of text individually to the program.
Like this:
-30,9,108,5,110,16,33,72,-42,75,-30,9
-171,15,-56,10,-64,78,-161,77,-171,15
-201,17,-182,75,-322,75,-340,18,-201,17
-378,-32,-366,-31,-361,17,-335,84,-345,84,-372,18,-378,-32
366,-24,355,-27,355,-45,372,-45,366,-24
-149,-2,-187,0,-187,-7,-150,-6,-149,-2
-1,-8,-37,-4,-37,-10,-2,-11,-1,-8
Here is the code that I have so far, not including the code that involves actually drawing the image:
import turtle
import os
import re
file_directory = os.path.dirname(__file__)
movements = ""
with open(file_directory + '\\plotpoints.txt', "r") as plotme:
    movements = movements + plotme.readlines()
    plotme.close()
pointlist = movements.split(",")
for counter in range(0, len(pointlist)):
    pointlist[counter] = int(pointlist[counter])
Like I said, I'm very new to coding, so anything at all to help me understand this better would be greatly appreciated.
Your attempt was good, but to get the result you need to:
remove plotme.close(), as closing the file is handled by the with block (context manager);
use os.path.join() when joining paths;
loop over the lines inside the with block and collect the values into a list.
import os
file_directory = os.path.dirname(__file__)
full_path = os.path.join(file_directory,'plotpoints.txt')
movements = []
with open(full_path, "r") as plotme:
    for line in plotme.readlines():
        movements.extend(line.strip().split(","))
# convert strings to ints
movements = [int(x) for x in movements]
print(movements)
result:
[-30, 9, 108, 5, 110, 16, 33, 72, -42, 75, -30, 9, -171, 15, -56, 10, -64, 78, -161, 77, -171, 15, -201, 17, -182, 75, -322, 75, -340, 18, -201, 17, -378, -32, -366, -31, -361, 17, -335, 84, -345, 84, -372, 18, -378, -32, 366, -24, 355, -27, 355, -45, 372, -45, 366, -24, -149, -2, -187, 0, -187, -7, -150, -6, -149, -2, -1, -8, -37, -4, -37, -10, -2, -11, -1, -8]
you can then use this result for plotting...
Here is an alternative option for producing a list of lists for each shape:
import os
file_directory = os.path.dirname(__file__)
full_path = os.path.join(file_directory,'plotpoints.txt')
movements = []
with open(full_path, "r") as plotme:
    for line in plotme.readlines():
        shape = line.strip().split(",")
        shape = [int(x) for x in shape]
        movements.append(shape)
print(movements)
And the result looks like this:
[[-30, 9, 108, 5, 110, 16, 33, 72, -42, 75, -30, 9],
[-171, 15, -56, 10, -64, 78, -161, 77, -171, 15],
[-201, 17, -182, 75, -322, 75, -340, 18, -201, 17],
[-378, -32, -366, -31, -361, 17, -335, 84, -345, 84, -372, 18, -378, -32],
[366, -24, 355, -27, 355, -45, 372, -45, 366, -24],
[-149, -2, -187, 0, -187, -7, -150, -6, -149, -2],
[-1, -8, -37, -4, -37, -10, -2, -11, -1, -8]]
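Each line alternates x and y values, so before drawing it may help to group the numbers into coordinate pairs. The sketch below assumes that layout (x0, y0, x1, y1, ...); the helper name to_points is made up for illustration.

```python
def to_points(flat):
    """Group [x0, y0, x1, y1, ...] into [(x0, y0), (x1, y1), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Even positions are x values, odd positions are y values
    return list(zip(flat[0::2], flat[1::2]))


# One line taken from the sample file in the question
line = "-171,15,-56,10,-64,78,-161,77,-171,15"
shape = [int(value) for value in line.strip().split(",")]
points = to_points(shape)
print(points)
# [(-171, 15), (-56, 10), (-64, 78), (-161, 77), (-171, 15)]
```

Each pair can then be passed to turtle.goto(x, y), lifting the pen with turtle.penup() before moving to the first point of a shape and lowering it again with turtle.pendown().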

Python find convolution kernel if input image and output image is known

I have a problem with a convolution kernel in Python. It concerns a simple convolution operation. I have an input matrix and an output matrix, and I want to find a possible convolution kernel of size 5x5. How can I solve this with Python, NumPy, or TensorFlow?
import numpy as np
import scipy.signal as ss
input_img = np.array([[94, 166, 76, 106, 152, 232],
[48, 242, 30, 98, 46, 210],
[52, 60, 86, 60, 216, 248],
[52, 236, 116, 240, 224, 184],
[138, 160, 146, 254, 236, 252],
[94, 100, 224, 246, 152, 74]], dtype=float)
output_img = np.array([[15, 49, 23, 105, 0, 0],
[43,30, 108, 124, 0, 0],
[58, 120, 112, 92, 0, 0],
[73, 127, 118, 126, 0, 0],
[112, 123, 76, 37, 0, 0],
[0, 0, 0, 0, 0, 0]], dtype=float)
# I want to find this kernel
conv = np.zeros((5,5), dtype=int)
# Applying the convolution with that kernel should reproduce output_img as defined above
output_img = ss.convolve2d(input_img, conv, mode='same')
As far as I understand, you need to reconstruct the window weights from the given input array, output array, and window size. I think this is possible, especially if the input array (image) is sufficiently large.
Look at the code below:
import scipy.signal as ss
import numpy as np

source_dataset = np.random.rand(20, 10)
sample_convolution = np.diag([1, 1, 1])
output_dataset = ss.convolve2d(source_dataset, sample_convolution, mode='same')
conv_size = sample_convolution.shape[0]

# Given output_dataset, source_dataset, and conv_size we need to reconstruct
# window weights.
def reconstruct(data, output, csize):
    half_size = csize // 2
    min_row_ind = half_size
    max_row_ind = data.shape[0] - half_size
    min_col_ind = half_size
    max_col_ind = data.shape[1] - half_size
    A = []
    b = []
    # Every interior pixel gives one linear equation: window . weights = output
    for i in range(min_row_ind, max_row_ind):
        for j in range(min_col_ind, max_col_ind):
            A.append(data[(i - half_size):(i + half_size + 1), (j - half_size):(j + half_size + 1)].ravel().tolist())
            b.append(output[i, j])
            # Solve as soon as csize*csize independent equations are collected
            if len(A) == csize * csize and np.linalg.matrix_rank(A) == csize * csize:
                return (np.linalg.pinv(A) @ np.array(b)[:, np.newaxis]).reshape(csize, csize)
    if len(A) < csize * csize:
        raise Exception("Insufficient data")

result = reconstruct(source_dataset, output_dataset, conv_size)
I got the following result
array([[ 1.00000000e+00, -1.77635684e-15, -1.11022302e-16],
[ 0.00000000e+00, 1.00000000e+00, -8.88178420e-16],
[ 0.00000000e+00, -1.22124533e-15, 1.00000000e+00]])
So it works as expected, but it definitely needs to be improved to take into account edge effects, the case when the window size is even, etc.
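One possible refinement, sketched here as an assumption rather than as part of the answer above, is to use every interior pixel and solve the overdetermined system with np.linalg.lstsq instead of stopping at the first csize*csize equations. The helper names and the synthetic data are made up for illustration, and an odd window size is assumed. The function returns the weights in window order; for an output produced by scipy.signal.convolve2d, the kernel itself is that array flipped in both axes, weights[::-1, ::-1].

```python
import numpy as np


def interior_windows(data, csize):
    """Return one flattened csize x csize window per interior pixel (row-major order)."""
    half = csize // 2
    rows, cols = data.shape
    return np.array([
        data[i - half:i + half + 1, j - half:j + half + 1].ravel()
        for i in range(half, rows - half)
        for j in range(half, cols - half)
    ])


def estimate_window_weights(data, output, csize):
    """Least-squares estimate of the weights applied to each window."""
    half = csize // 2
    A = interior_windows(data, csize)
    # Interior outputs, flattened in the same row-major order as the windows
    b = output[half:data.shape[0] - half, half:data.shape[1] - half].ravel()
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    return weights.reshape(csize, csize)


# Synthetic check with made-up weights
rng = np.random.default_rng(0)
data = rng.random((20, 10))
true_weights = np.array([[1.0, 2.0, 0.0],
                         [0.0, 1.0, 0.0],
                         [3.0, 0.0, 1.0]])
output = np.zeros_like(data)
output[1:-1, 1:-1] = (interior_windows(data, 3) @ true_weights.ravel()).reshape(18, 8)

estimated = estimate_window_weights(data, output, 3)
print(np.round(estimated, 6))
```

Using all equations makes the estimate less sensitive to noise or rounding in the output than solving with the bare minimum of rows, although border pixels are still ignored.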

OpenCV local pixel average generating strange output

I am trying to use python to just compute a local pixel color average, however my output is not at all that.
Image:
Output:
Code:
image = cv2.imread('perspective.jpeg')
for i in range(image.shape[1]):
    for j in range(image.shape[0]):
        up = image[min(j + 1, image.shape[0]-1), i]
        down = image[max(j - 1, 0), i]
        right = image[j, min(i + 1, image.shape[1]-1)]
        left = image[j, max(i - 1, 0)]
        average = (up + down + left + right + image[j, i]) / 5
        image[j, i] = average
The issue you are observing is due to integer overflow while computing the average. The pixels are of type np.uint8, and adding them together produces a result that is also of type np.uint8, which is not large enough to hold the sum.
The solution is to cast the pixels to a larger data type before adding them, and then cast the final value back to np.uint8 before storing it in the result image.
In fact, casting only one of the values (say up) to a larger data type suffices, as the rest are automatically promoted during the addition.
The corrected code may look like this:
import cv2
import numpy as np

image = cv2.imread('perspective.jpeg')
for i in range(image.shape[1]):
    for j in range(image.shape[0]):
        # Casting one operand to float32 promotes the whole sum to float32
        up = np.float32(image[min(j + 1, image.shape[0]-1), i])
        down = image[max(j - 1, 0), i]
        right = image[j, min(i + 1, image.shape[1]-1)]
        left = image[j, max(i - 1, 0)]
        average = (up + down + left + right + image[j, i]) / 5
        image[j, i] = np.uint8(average)
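The wrap-around described above can be seen in isolation with two small arrays. This is only an illustration with made-up values, not part of the original code.

```python
import numpy as np

a = np.array([200, 100], dtype=np.uint8)
b = np.array([100, 100], dtype=np.uint8)

# Both operands are uint8, so the sum is uint8 and 300 wraps around modulo 256
wrapped = a + b
print(wrapped, wrapped.dtype)   # [ 44 200] uint8

# Casting one operand first promotes the whole addition to float32
widened = a.astype(np.float32) + b
print(widened, widened.dtype)   # [300. 200.] float32
```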
You can easily do this with filter2D as shown in the example below. It will work on any number of channels.
import cv2
import numpy as np

im = np.random.randint(0, 256, (5, 5), np.uint8)
kernel = np.array([[0, 1./5, 0], [1./5, 1./5, 1./5], [0, 1./5, 0]])
filt = cv2.filter2D(im, cv2.CV_8U, kernel)
For example:
im
array([[ 14, 127, 221, 74, 2],
[132, 251, 88, 19, 215],
[183, 140, 17, 60, 76],
[208, 144, 182, 11, 64],
[183, 89, 217, 131, 23]], dtype=uint8)
filt
array([[106, 173, 120, 67, 116],
[166, 148, 119, 91, 66],
[161, 147, 97, 37, 95],
[172, 153, 114, 90, 37],
[155, 155, 160, 79, 83]], dtype=uint8)
You can choose the border type; I've used the default.
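For comparison, the same five-point average can be sketched in plain NumPy. This version is an assumption-laden illustration: the function name cross_average is made up, it replicates edge pixels the way the min/max clamping in the question does (which differs from the default border handling of filter2D), it rounds to the nearest integer instead of truncating, and it reads only original pixel values rather than values already overwritten earlier in the loop.

```python
import numpy as np


def cross_average(image):
    """Average each pixel with its four direct neighbours, replicating edges."""
    # Pad only the two spatial axes; leave a channel axis (if any) untouched
    pad_width = ((1, 1), (1, 1)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image.astype(np.float32), pad_width, mode="edge")
    total = (padded[1:-1, 1:-1]    # centre
             + padded[:-2, 1:-1]   # neighbour above
             + padded[2:, 1:-1]    # neighbour below
             + padded[1:-1, :-2]   # neighbour to the left
             + padded[1:-1, 2:])   # neighbour to the right
    return np.rint(total / 5).astype(np.uint8)


small = np.array([[0, 10],
                  [20, 30]], dtype=np.uint8)
averaged = cross_average(small)
print(averaged)
# [[ 6 12]
#  [18 24]]
```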

Find polynomial function through 30 points with polyfit

I need to find the polynomial function of degree 29 that exactly fits thirty data points. We can be sure, that such a function exists. However, the error of numpy.polyfit increases dramatically after only three points.
import numpy as np
y = [126, 34, 78, 120, 83, 62, 104, 6, 70, 142, 147, 63, 35, 126, 9, 84, 7, 122, 93, 29, 95, 141, 42, 102, 38, 96, 130, 83, 138, 148]
print(len(y))
x = np.arange(len(y))
f = np.polyfit(x,y,30)
def eval_polynom(f, x):
    res = 0
    for i in range(len(f)):
        res += f[i] * x**(len(f)-i-1)
    return res
for i in range(len(y)):
    print(y[i], " -- ", eval_polynom(f, x[i]))
My data points are (x,y) with x = [0,1,2,3,4,...,29]
The output is
126 -- 125.941598976
34 -- 34.7366402172
78 -- 73.703669116
120 -- 134.514176467
83 -- 51.6471546864
62 -- 105.143046704
104 -- 70.1470309453
6 -- 13.808372367
70 -- 347.425617622
142 -- -1281.11122538
...
Is there a way to get the exact polynomial function such that the error is 0?
There's almost certainly an integer overflow issue (due to large exponents) in your eval_polynom function, because the values in x are all integers. Try to replace
res += f[i] * x**(len(f)-i-1)
with
res += f[i] * float(x)**(len(f)-i-1)
You'll probably end up with values that still don't match perfectly, but remember that floating-point operations have limited precision, even more so when the numbers become large, as is the case here.
[Figure: y in green, polynomial in red, error in blue; degree-140 polynomial]
I need to find the polynomial function of degree 29 that exactly fits thirty data points. We can be sure, that such a function exists
Why are you sure of this? I tried some tweaks and visualizations, and I think your data points can't be fitted by such a polynomial.
I've tried Chebyshev polynomials; they do better, but still can't fit these values even with a degree-140 polynomial.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from numpy.polynomial.chebyshev import chebfit,chebval
%matplotlib inline
y = [126, 34, 78, 120, 83, 62, 104, 6, 70, 142, 147, 63, 35, 126, 9, 84, 7, 122, 93, 29, 95, 141, 42, 102, 38, 96, 130, 83, 138, 148]
print(len(y))
x = np.arange(len(y))
c = chebfit(x, y, 30)
p = []
for i in np.arange(len(y)):
    p.append(chebval(i, c))
df = pd.DataFrame(data={'x': x, 'y': y, 'p': p})
df['diff'] = df['y'] - df['p']
sns.pointplot(x = 'x', y = 'y', data=df, color='green')
sns.pointplot(x = 'x', y = 'p', data=df, color='red')
sns.pointplot(x = 'x', y = 'diff', data=df, color='blue')
While not exact, you get much better results if you use NumPy's polyval:
import numpy as np
y = [126, 34, 78, 120, 83, 62, 104, 6, 70, 142, 147, 63, 35, 126, 9, 84, 7, 122, 93, 29, 95, 141, 42, 102, 38, 96, 130, 83, 138, 148]
x = np.arange(len(y))
f = np.polyfit(x ,y, 30)
for i in range(len(y)):
    print(y[i], " -- ", np.polyval(f, x[i]))
which gives
(126, ' -- ', 125.94427340268774)
(34, ' -- ', 34.674505165214924)
(78, ' -- ', 73.961360153890183)
(120, ' -- ', 133.96863767482208)
(83, ' -- ', 52.113307162099574)
(62, ' -- ', 105.65069882437891)
(104, ' -- ', 68.588480573695762)
(6, ' -- ', 14.814788499822299)
(70, ' -- ', 76.373263353880958)
(142, ' -- ', 149.39793233756134)
...
Note that you should be using a degree 29 polynomial to fit 30 points.
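On the original question of getting an error of exactly 0: because the thirty x values are distinct, a unique interpolating polynomial of degree at most 29 does exist, and the residuals seen with polyfit come from floating-point conditioning. One way to sidestep that, sketched here as an alternative technique rather than something from the answers above, is Lagrange interpolation in exact rational arithmetic with fractions.Fraction. The function name lagrange_eval is made up for illustration.

```python
from fractions import Fraction

y = [126, 34, 78, 120, 83, 62, 104, 6, 70, 142, 147, 63, 35, 126, 9, 84, 7,
     122, 93, 29, 95, 141, 42, 102, 38, 96, 130, 83, 138, 148]
x = list(range(len(y)))


def lagrange_eval(xs, ys, t):
    """Evaluate the interpolating polynomial through (xs, ys) at t, exactly."""
    t = Fraction(t)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial i equals 1 at xi and 0 at every other node
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= (t - xj) / Fraction(xi - xj)
        total += term
    return total


# At the nodes the interpolant reproduces the data with no rounding error
exact = [lagrange_eval(x, y, xi) for xi in x]
print(all(value == target for value, target in zip(exact, y)))   # True

# Between the nodes the result is an exact rational number
midpoint_value = lagrange_eval(x, y, Fraction(1, 2))
```

Converting such values to float afterwards is fine for display, but the exactness is lost as soon as the computation itself is done in floating point.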

fault in defining numpy array as tensorflow variable

I have a NumPy array x:
[0, 6, 3513, 7, 155, 794, 25, 223, 8, 32, 20, 202, 5025, 350, 91, 6, 66, 207, 5, 2]
I want to define it as a tensorflow variable as the following:
tf.Variable(x)
And I get the following error:
TypeError: Expected binary or unicode string, got [0, 6, 3513, 7, 155,
794, 25, 223, 8, 32, 20, 202, 5025, 350, 91, 6, 66, 207, 5, 2]
What the hell?
Can you share what you are trying to do? TensorFlow only defines the variable here; you can use that variable only while executing a session.
I hope the code below helps.
import tensorflow as tf
import numpy as np
x =[0, 6, 3513, 7, 155, 794, 25, 223, 8, 32, 20, 202, 5025, 350, 91, 6,
66, 207, 5, 2]
# convert it into numpy array
w = np.array(x)
# this creates a tensor variable
q = tf.Variable(x)
# create an interactive session
sess = tf.InteractiveSession()
# now you can perform operations on that tensor variable
tf.add(q,q)
x = np.array(...)
v = tf.Variable(tf.constant(x))
