I'm trying to create a 3D plot with the Python library "ipyvolume" in which every point has its own colour. Several points can share the same colour. There is a problem when it paints the points in the plot. Any idea how to fix this?
Import the libraries:
import pandas as pd
import numpy as np
import ipyvolume as ipv
Load the data:
dataframe = pd.read_csv("C:/Users/j/Desktop/K - Means/test.csv",sep=",")
dataframe.head()
[Image: output of dataframe.head()]
Build the coordinate array and the category labels:
X = np.array(dataframe[["op","ex","ag"]])
y = np.array(dataframe['categoria'])
Contents of X:
array([[34.297953, 41.948819, 29.370315],
[44.986842, 37.938947, 24.279098],
[41.733854, 38.999896, 34.645521],
[40.377154, 52.337538, 31.082154],
[36.664677, 48.530806, 31.138871],
[33.531771, 43.211667, 25.786667],
[31.851102, 47.182362, 19.594331],
[31.865118, 55.377559, 36.258346],
[46.393488, 39.93031 , 16.658062],
[39.436667, 32.966288, 32.291591],
[52.750992, 41.698855, 17.057176],
[41.328182, 39.173333, 21.070505],
[54.407727, 34.104318, 18.771818],
[47.610076, 39.439545, 21.438409],
[39.435149, 41.479403, 21.004104],
[48.617348, 43.617955, 19.263258],
[40.073543, 44.194724, 33.921417],
[43.37292 , 43.792263, 21.067737],
[49.792403, 41.435581, 16.433953],
[30.020465, 44.29969 , 39.117984],
[36.909459, 51.947297, 34.687568],
[50.594462, 41.383154, 17.896538],
[34.186667, 18.693542, 9.682292],
[31.215455, 44.180909, 32.87 ],
[47.27686 , 41.973372, 12.40186 ],
[45.369773, 35.925909, 23.478258],
[35.943438, 45.519531, 28.02125 ],
[36.272348, 40.065152, 28.706894],
[44.501603, 46.598931, 29.535038],
[49.028308, 38.450462, 19.791538],
[34.235923, 41.231615, 14.153692],
[53.11048 , 39.00608 , 17.2064 ],
[49.28542 , 42.117786, 21.008931],
[52.895725, 38.620229, 19.972748],
[30.691797, 59.824844, 33.395938],
[34.949528, 50.177402, 36.325276],
[41.76596 , 49.865253, 30.071414],
[30.825938, 55.912578, 29.489922],
[38.948976, 44.460866, 27.345827],
[46.955854, 35.376179, 23.747561],
[45.053969, 48.950992, 24.374427],
[45.088504, 50.765276, 25.71252 ],
[42.444615, 45.780231, 24.745615],
[40.046439, 37.722197, 30.568258],
[52.535221, 35.290973, 15.793009],
[56.691163, 31.135698, 20.439651],
[48.709282, 44.728513, 19.387538],
[53.453713, 38.522321, 16.655907],
[31.450855, 45.490983, 40.583162],
[31.891474, 53.373368, 24.296316],
[49.077731, 45.670798, 17.449202],
[36.196989, 42.358817, 24.191613],
[38.91342 , 46.979524, 28.669524],
[60.225087, 28.902609, 14.337043],
[35.545054, 30.295484, 39.422796],
[56.815859, 38.419375, 13.961641],
[49.47 , 30.96626 , 23.053053],
[47.811742, 41.36447 , 20.816439],
[35.779512, 31.227724, 27.689919],
[55.974031, 33.09 , 21.330698],
[40.502021, 34.040957, 16.767979],
[38.78828 , 36.947204, 24.048172],
[52.082462, 39.402308, 16.628231],
[57.427596, 33.121827, 12.412404],
[39.528547, 42.353077, 23.810769],
[39.36155 , 40.205116, 26.27124 ],
[66.665564, 26.855564, 15.602331],
[48.587099, 26.988702, 9.948168],
[52.675729, 35.32625 , 16.510208],
[45.813043, 53.54587 , 30.403261],
[44.765313, 43.954375, 24.824609],
[42.643386, 33.345984, 14.643386],
[44.512578, 37.723594, 15.144922],
[51.830571, 44.304667, 10.049524],
[42.202857, 38.628681, 21.68989 ],
[57.241308, 33.237462, 16.194154],
[36.353298, 39.223723, 26.603617],
[35.566589, 48.679535, 29.923023],
[33.422105, 56.539263, 32.230842],
[31.7503 , 44.3443 , 39.1499 ],
[33.332362, 46.603622, 37.348898],
[41.929385, 41.960077, 17.815385],
[57.145227, 31.194545, 16.385 ],
[46.137348, 43.874697, 15.843258],
[49.331231, 34.458231, 23.982462],
[44.171154, 43.299846, 27.451538],
[49.322373, 41.494915, 14.199153],
[46.158281, 47.806719, 23.341641],
[48.355859, 35.778281, 15.101563],
[47.143474, 40.162316, 20.52 ],
[48.403333, 36.152326, 12.157829],
[40.281616, 35.341515, 20.805657],
[49.049323, 32.918647, 22.447594],
[47.737462, 41.528077, 19.694385],
[48.743333, 42.93187 , 17.984797],
[38.766702, 42.88383 , 22.15266 ],
[38.471406, 41.289922, 39.664375],
[54.911368, 42.269895, 11.263263],
[37.240989, 46.254286, 31.804286],
[46.319462, 38.176692, 14.143846],
[53.331333, 33.349333, 18.497333],
[51.006406, 36.351563, 22.484609],
[47.646364, 39.943939, 23.249848],
[32.683125, 54.681667, 35.906667],
[65.067447, 25.46617 , 14.787447],
[54.431756, 37.019847, 19.690305],
[35.834375, 44.595625, 23.930625],
[39.546441, 45.188475, 25.213644],
[41.114 , 41.884769, 19.713231],
[50.898163, 38.136837, 19.937347],
[45.669015, 44.523106, 20.548864],
[37.411719, 43.379531, 33.332422],
[31.541828, 47.688172, 28.897527],
[41.483701, 50.352283, 30.561496],
[36.813721, 52.722403, 14.703256],
[43.81828 , 42.931613, 17.494624],
[39.31561 , 30.73935 , 13.23122 ],
[63.995606, 26.921818, 9.305985],
[44.541328, 45.529453, 33.89125 ],
[35.420439, 41.05807 , 24.249737],
[45.162043, 34.678602, 22.719355],
[38.499688, 46.513828, 34.344766],
[55.293566, 49.822326, 20.592791],
[46.21 , 35.002222, 19.006667],
[54.151721, 32.722131, 11.041475],
[43.443893, 23.982901, 17.032443],
[40.120985, 27.149545, 23.975758],
[53.95 , 42.411488, 16.108347],
[48.796045, 46.014478, 14.642985],
[43.805615, 36.315846, 21.608308],
[51.161 , 44.074 , 17.386154],
[58.380294, 45.653922, 12.822843],
[40.345769, 37.003923, 17.285538],
[40.808939, 43.961591, 18.982424],
[57.962308, 33.373538, 17.684 ],
[35.569389, 38.904885, 31.624351],
[31.960417, 48.533125, 40.096458],
[71.696129, 27.57121 , 19.093548],
[51.537405, 36.465344, 23.008168],
[36.258913, 45.225652, 39.427283]])
Contents of y:
array([7, 7, 4, 2, 4, 7, 7, 5, 7, 7, 3, 1, 1, 2, 8, 3, 4, 6, 2, 4, 2, 3,
3, 7, 2, 4, 8, 1, 4, 3, 8, 1, 2, 7, 4, 5, 1, 2, 2, 1, 6, 2, 6, 1,
1, 2, 6, 3, 1, 7, 2, 8, 6, 2, 8, 2, 1, 3, 8, 2, 8, 4, 2, 1, 8, 9,
1, 1, 2, 4, 6, 8, 8, 4, 9, 2, 8, 4, 4, 9, 5, 2, 4, 1, 2, 7, 2, 3,
2, 1, 2, 7, 2, 2, 1, 7, 7, 2, 4, 6, 1, 1, 1, 4, 2, 4, 2, 8, 7, 5,
9, 9, 8, 9, 7, 1, 8, 2, 4, 8, 8, 2, 2, 1, 2, 1, 6, 2, 4, 2, 1, 1,
1, 7, 3, 7, 4, 2, 1, 1], dtype=int64)
In this piece of code I am trying to give every point in the plot a colour according to its category:
fig = ipv.figure()
colores=['blue','red','green','cyan','yellow','orange','black','pink','brown','purple']
asignar=[]
for row in y:
    asignar.append(colores[row])
scatter=ipv.scatter(X[:, 0], X[:, 1], X[:, 2],marker="sphere", color=asignar, size=2)
ipv.selector_default
ipv.show()
This last piece of code never finishes executing.
If I pass a single colour to the scatter instead, the plot is created:
fig = ipv.figure()
colores=['blue','red','green','cyan','yellow','orange','black','pink','brown','purple']
asignar=[]
for row in y:
    asignar.append(colores[row])
scatter=ipv.scatter(X[:, 0], X[:, 1], X[:, 2],marker="sphere", color="red", size=2)
ipv.selector_default
ipv.show()
[Image: the resulting plot, with every point drawn in red]
The suggestion from https://github.com/maartenbreddels/ipyvolume/issues/12#issuecomment-284685146 might work: pass an array of colour values instead of a list of colour names. Something like:
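import numpy as np
import matplotlib.cm
import ipyvolume as ipv
# one RGBA row per point, sampled along the afmhot colormap (by point order, not by category)
c = matplotlib.cm.afmhot(np.linspace(0, 1, len(y)))
ipv.quickscatter(X[:, 0], X[:, 1], X[:, 2], marker="sphere", color=c, size=2)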
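If the colour should follow the category in y rather than the order of the points, the same idea can be adapted by turning the colour names into RGB rows and indexing them with y. This is only a sketch: it assumes that ipyvolume accepts an (N, 3) array of RGB values for the color argument, as the linked comment suggests, and the names palette and point_colors are introduced here for illustration. Only the first five labels of y are used to keep it short.

```python
import numpy as np
from matplotlib.colors import to_rgb

# Same colour names as in the question; index 0 is unused because y runs from 1 to 9.
colores = ['blue', 'red', 'green', 'cyan', 'yellow',
           'orange', 'black', 'pink', 'brown', 'purple']

# First five category labels from the question, as a small sample.
y = np.array([7, 7, 4, 2, 4])

# One RGB row per named colour, shape (10, 3).
palette = np.array([to_rgb(name) for name in colores])

# Fancy indexing picks one RGB row per point, shape (len(y), 3).
point_colors = palette[y]

# Assumed usage with the full X and y from the question (not run here):
# ipv.scatter(X[:, 0], X[:, 1], X[:, 2], marker="sphere", color=point_colors, size=2)
```

Building the array with NumPy indexing also replaces the explicit loop that filled asignar.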
I'm doing bioinformatics and we map small RNAs onto mRNAs. We have the mapping coordinate of a protein on each mRNA, and we calculate the relative distance between the place where the protein binds the mRNA and the site that is bound by a small RNA.
I obtain the following dataset:
dist eff
-69 3
-68 2
-67 1
-66 1
-60 1
-59 1
-58 1
-57 2
-56 1
-55 1
-54 1
-52 1
-50 2
-48 3
-47 1
-46 3
-45 1
-43 1
0 1
1 2
2 12
3 18
4 18
5 13
6 9
7 7
8 5
9 3
10 1
13 2
14 3
15 2
16 2
17 2
18 2
19 2
20 2
21 3
22 1
24 1
25 1
26 1
28 2
31 1
38 1
40 2
When I plot the data, I see 3 peaks: one at around 3-4, another one around 20 and a last one around -50.
I tried cubic spline interpolation, but it doesn't work very well for my data.
My idea was to do curve fitting with a sum of Gaussians.
For example, in my case, estimate 3 Gaussian curves centred at 5, 20 and -50.
How can I do so?
I looked at scipy.optimize.curve_fit(), but how can I fit each curve within a precise interval?
How can I add the curves together to get one single curve?
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats
import scipy.optimize
data = np.array([-69,3, -68, 2, -67, 1, -66, 1, -60, 1, -59, 1,
-58, 1, -57, 2, -56, 1, -55, 1, -54, 1, -52, 1,
-50, 2, -48, 3, -47, 1, -46, 3, -45, 1, -43, 1,
0, 1, 1, 2, 2, 12, 3, 18, 4, 18, 5, 13, 6, 9,
7, 7, 8, 5, 9, 3, 10, 1, 13, 2, 14, 3, 15, 2,
16, 2, 17, 2, 18, 2, 19, 2, 20, 2, 21, 3, 22, 1,
24, 1, 25, 1, 26, 1, 28, 2, 31, 1, 38, 1, 40, 2])
x, y = data.reshape(-1, 2).T
def tri_norm(x, *args):
    m1, m2, m3, s1, s2, s3, k1, k2, k3 = args
    ret = k1*scipy.stats.norm.pdf(x, loc=m1 ,scale=s1)
    ret += k2*scipy.stats.norm.pdf(x, loc=m2 ,scale=s2)
    ret += k3*scipy.stats.norm.pdf(x, loc=m3 ,scale=s3)
    return ret
params = [-50, 3, 20, 1, 1, 1, 1, 1, 1]
fitted_params,_ = scipy.optimize.curve_fit(tri_norm,x, y, p0=params)
plt.plot(x, y, 'o')
xx = np.linspace(np.min(x), np.max(x), 1000)
plt.plot(xx, tri_norm(xx, *fitted_params))
plt.show()
>>> fitted_params
array([ -60.46845528, 3.801281 , 13.66342073, 28.26485602,
1.63256981, 10.31905367, 110.51392765, 69.11867159,
63.2545624 ])
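So you can see that your idea of a three-peaked function doesn't agree very well with your real data: only the peak near 3-4 is fitted sharply (mean about 3.8, standard deviation about 1.6), while the other two components come out very broad (standard deviations of about 28 and 10).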
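On the question of keeping each Gaussian within a precise interval: curve_fit takes a bounds argument, so each mean can be restricted to a range. The sketch below is one way to try that, not a recommended model. The limits in lower and upper and the starting values in p0 are my own guesses based on where the peaks appear, and the function signature is written out explicitly so that the parameter count is clear.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

data = np.array([-69, 3, -68, 2, -67, 1, -66, 1, -60, 1, -59, 1,
                 -58, 1, -57, 2, -56, 1, -55, 1, -54, 1, -52, 1,
                 -50, 2, -48, 3, -47, 1, -46, 3, -45, 1, -43, 1,
                 0, 1, 1, 2, 2, 12, 3, 18, 4, 18, 5, 13, 6, 9,
                 7, 7, 8, 5, 9, 3, 10, 1, 13, 2, 14, 3, 15, 2,
                 16, 2, 17, 2, 18, 2, 19, 2, 20, 2, 21, 3, 22, 1,
                 24, 1, 25, 1, 26, 1, 28, 2, 31, 1, 38, 1, 40, 2], dtype=float)
x, y = data.reshape(-1, 2).T

def tri_norm(x, m1, m2, m3, s1, s2, s3, k1, k2, k3):
    """Sum of three scaled normal densities: the single combined curve."""
    return (k1 * norm.pdf(x, loc=m1, scale=s1)
            + k2 * norm.pdf(x, loc=m2, scale=s2)
            + k3 * norm.pdf(x, loc=m3, scale=s3))

# Order: means, standard deviations, scale factors.
# All numbers below are assumptions chosen for illustration.
p0 = [-50, 4, 20, 5, 2, 5, 10, 50, 10]
lower = [-70, 0, 10, 0.5, 0.5, 0.5, 0, 0, 0]
upper = [-40, 10, 30, 20, 20, 20, np.inf, np.inf, np.inf]

# With bounds given, curve_fit uses a bounded least-squares solver
# and keeps every parameter inside [lower, upper].
fitted, _ = curve_fit(tri_norm, x, y, p0=p0,
                      bounds=(lower, upper), max_nfev=10000)
print(fitted)
```

The combined curve can then be drawn as before with plt.plot(xx, tri_norm(xx, *fitted)). Bounding the means forces one component into each interval, but it does not by itself make the three-peak model a good description of the data.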