Original arguments get overwritten - python

So I have this function in Python:
def newk(kor, flds):
    field=0.5*flds
    knw=[]
    for i in range(flds):
        ktemp=kor
        if ktemp[2]+i>field:
            ktemp[2]-=(i-1)
        else:
            ktemp[2]+=i
        knw+=[ktemp]
        print knw
        print ktemp
        print kor, '\n'
    return knw
which is called by:
knew=newk(kvals, folds)
My original kvals gets overwritten for some reason. kvals is a list.
Also, ktemp keeps accumulating changes across iterations, which only knw is
supposed to do, and that breaks everything. My output looks like this:
[[0.05, 0.05, 0.166667]] [0.05, 0.05, 0.166667] [0.05, 0.05, 0.166667]
[[0.05, 0.05, 1.166667], [0.05, 0.05, 1.166667]] [0.05, 0.05,
1.166667] [0.05, 0.05, 1.166667]
[[0.05, 0.05, -0.8333330000000001], [0.05, 0.05, -0.8333330000000001],
[0.05, 0.05, -0.8333330000000001]] [0.05, 0.05, -0.8333330000000001]
[0.05, 0.05, -0.8333330000000001]
K point values are: [0.05, 0.05, -0.8333330000000001] (original kvals was [0.05,0.05,0.166667])
But I need my output to look like this: knw should be [[0.05, 0.05, 0.166667], [0.05, 0.05, 1.166667], [0.05, 0.05, -0.833333]] and kvals should still be [0.05, 0.05, 0.166667].
Also, when I replace ktemp=kor in the loop with the constant ktemp=[0.05, 0.05, 0.166667], everything works.

When you write ktemp = kor, you end up with two names pointing at the same list object, so modifying ktemp also modifies kor. The same aliasing is why every entry of knw ends up identical: each entry is that one list. If you want a copy of the list, write ktemp = kor[:] (this is enough when kor holds only numbers; a 'deep copy' of a list containing mutable objects is a different issue).
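To illustrate the copy fix, here is a minimal Python 3 sketch of the function from the question with ktemp = kor[:] in place of ktemp = kor. The value 3 for the second argument is an assumption (the question does not state what folds was); it is chosen because it produces the three desired rows.

```python
def newk(kor, flds):
    """Return shifted copies of kor without modifying kor itself."""
    field = 0.5 * flds
    knw = []
    for i in range(flds):
        ktemp = kor[:]  # shallow copy: a new list, so kor stays untouched
        if ktemp[2] + i > field:
            ktemp[2] -= (i - 1)
        else:
            ktemp[2] += i
        knw.append(ktemp)
    return knw


kvals = [0.05, 0.05, 0.166667]
knew = newk(kvals, 3)  # 3 is an assumed value for folds
print(knew)
print(kvals)  # still holds the original values
```

With the copy in place, kvals keeps its original values and each row of knew is a separate list.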

Related

Python display numbers without a comma in a list

Below is my code. I would like to get a result like 0.1, 0.2, 0.3, 0.4 ..., but I get this result: [0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9]. How can I remove the extra digits after the decimal point?
squares = []
for i in range(10):
    squares.append(i * (0.1))
print(squares)
You can use something like this:
>>> ['{:.2}'.format(i * 0.1) for i in range(10)]
Use the str method format to specify how many decimals to display.
squares = []
for i in range(10):
    squares.append(i * (0.1))
print(*["{:.1f}".format(s) for s in squares], sep=', ')
0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
Sup, Kozinski. Hope you're having a great time.
squares = []
for i in range(10):
    squares.append(round(i * (0.1), 1)) # each float is rounded to one decimal place before it is stored
print(squares)
Check out this round function
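The approaches above differ in what they change: format only affects how the numbers are displayed, while round changes the stored values. A small sketch of the contrast, reusing the squares list from the question (the names formatted and rounded are only illustrative):

```python
squares = [i * 0.1 for i in range(10)]

# Formatting: the floats in squares are untouched, only the text is shortened.
formatted = ", ".join("{:.1f}".format(s) for s in squares)

# Rounding: builds a new list whose floats are rounded to one decimal place.
rounded = [round(s, 1) for s in squares]

print(formatted)
print(rounded)
```

Formatting is usually the safer choice when the full-precision values are still needed for later calculations.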

Append in an array results in a list Python

I have the following code
points=candies
K=5
centers = []
for i in range(K):
    centers.append(random.choice(points))
centers
which results in basically a list of arrays
[array([0.6 , 0.92, 0.29]),
array([0.99, 0.23, 0.45]),
array([0.65, 0.6 , 0.03]),
array([0.21, 0.22, 0.55]),
array([0.62, 0.84, 0.83])]
What I want would be a single array like
array([[0.6 , 0.92, 0.29],
       [0.99, 0.23, 0.45],
       [0.65, 0.6 , 0.03],
       [0.21, 0.22, 0.55],
       [0.62, 0.84, 0.83]])
What do I have to change?
Either convert the list of arrays to a 2D array:
np.array(centers)
Or start right from an empty array and populate it:
centers = np.empty((K,3))
for i in range(K):
    centers[i] = random.choice(points)

Finding the probability of a variable in collection of lists

I have a selection of lists of variables
import numpy.random as npr
w = [0.02, 0.03, 0.05, 0.07, 0.11, 0.13, 0.17]
x = 1
y = False
z = [0.12, 0.2, 0.25, 0.05, 0.08, 0.125, 0.175]
v = npr.choice(w, x, y, z)
I want to find the probability of the value v being one of a selection of values, e.g. False or 0.12.
How do I do this?
Here's what I've tried:
import numpy.random as npr
import math
w = [0.02, 0.03, 0.05, 0.07, 0.11, 0.13, 0.17]
x = 1
y = False
z = [0.12, 0.2, 0.25, 0.05, 0.08, 0.125, 0.175]
v = npr.choice(w, x, y, z)
from collections import Counter
c = Counter(0.02, 0.03, 0.05, 0.07, 0.11, 0.13, 0.17,1,False,0.12, 0.2, 0.25, 0.05, 0.08, 0.125, 0.175)
def probability(0.12):
    return float(c[v]/len(w,x,y,z))
For this I get an error saying that 0.12 is invalid syntax.
There are several issues in the code; I think you want the following:
import numpy.random as npr
import math
from collections import Counter
def probability(v=0.12):
    return float(c[v]/len(combined))
w = [0.02, 0.03, 0.05, 0.07, 0.11, 0.13, 0.17]
x = [1]
y = [False]
z = [0.12, 0.2, 0.25, 0.05, 0.08, 0.125, 0.175]
combined = w + x + y + z
v = npr.choice(combined)
c = Counter(combined)
print(probability())
print(probability(v=0.05))
1) def probability(0.12) does not make sense; a parameter has to be a name, which can also have a default value (above I use v=0.12).
2) len(w, x, y, z) does not make sense either, since len takes a single argument; you are probably looking for a list that combines all the elements of w, x, y and z. I put all of those in the list combined.
3) One would also have to add a check in case the user passes e.g. v=12345, which is not included in combined (I leave this to you).
The above will print
0.0625
0.125
which gives the expected outcome.
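For point 3, one possible approach (a sketch, not the only way) relies on the fact that a Counter returns 0 for a key it has never seen, so a value that is not in combined simply gets probability 0. The numpy draw is left out here because it is not needed to compute the probabilities, and the name counts is just an illustrative rename of c.

```python
from collections import Counter

w = [0.02, 0.03, 0.05, 0.07, 0.11, 0.13, 0.17]
x = [1]
y = [False]
z = [0.12, 0.2, 0.25, 0.05, 0.08, 0.125, 0.175]
combined = w + x + y + z
counts = Counter(combined)


def probability(v=0.12):
    """Relative frequency of v among all values in combined."""
    # Counter returns 0 for missing keys, so unknown values give 0.0
    # instead of raising a KeyError.
    return counts[v] / len(combined)


print(probability())
print(probability(0.05))
print(probability(12345))
```

If an unknown value should be treated as an error rather than as probability 0, an explicit `if v not in counts` check that raises an exception would be the alternative.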

searching k nearest neighbors in numpy

I'm new to Python. I want to use numpy and sklearn to do KNN. However, there's a nan in my data. I set the dtype argument of genfromtxt to None, but the resulting array looks like this:
[('ADT1_YEAST', 0.58, 0.61, 0.47, 0.13, 0.5, 0.0, 0.48, 0.22, 'MIT')
('ADT2_YEAST', 0.43, 0.67, 0.48, 0.27, 0.5, 0.0, 0.53, 0.22, 'MIT')
('ADT3_YEAST', 0.64, 0.62, 0.49, 0.15, 0.5, 0.0, 0.53, 0.22, 'MIT') ...,
('ZNRP_YEAST', 0.67, 0.57, 0.36, 0.19, 0.5, 0.0, 0.56, 0.22, 'ME2')
('ZUO1_YEAST', 0.43, 0.4, 0.6, 0.16, 0.5, 0.0, 0.53, 0.39, 'NUC')
('G6PD_YEAST', 0.65, 0.54, 0.54, 0.13, 0.5, 0.0, 0.53, 0.22, 'CYT')]
Then I get a "data type not understood" error from the NearestNeighbors function.
Here is my code:
npGem = np.genfromtxt('temp.data', dtype=None)
X = np.array(npGem)
nbrs = NearestNeighbors(n_neighbors=5, algorithm='ball_tree').fit(X)
Can anyone show me how to get this data read in properly? Thanks in advance.
If I understand the problem, you're really asking how to encode the categorical variables so that they can be properly interpreted by the nearest neighbors algorithm. You can do this with sklearn as explained in 4.2.4. Encoding categorical features. On the other hand, if you have incomplete features, see 4.2.6. Imputation of missing values.
I think you need to get the data into a matrix properly. I typically do something like this:
import numpy as np
features = [] # list of lists of the feature variables
classes = [] # list of the target variables
with open('temp.data') as f: # file name taken from the question
    for line in f:
        line = line.strip().split() # split the line into pieces on any whitespace
        features.append(line[1:-1]) # or whatever indices your features are in
        classes.append(line[-1]) # or whatever index your target variable is in
classes = np.array(classes)
features = np.array(features, dtype=float) # np.float was removed in NumPy 1.24; the builtin float works

Creating and accessing list of lists in python

I am a bit new to Python; I started today. My code looks like this:
testcases=[
(([0.5,0.4,0.3],'HHTH'),[0.4166666666666667, 0.432, 0.42183098591549295, 0.43639398998330553]),
(([0.14,0.32,0.42,0.81,0.21],'HHHTTTHHH'),[0.5255789473684211, 0.6512136991788505, 0.7295055220497553, 0.6187139453483192, 0.4823974597714815, 0.3895729901052968, 0.46081730193074644, 0.5444108434105802, 0.6297110187222278]),
(([0.14,0.32,0.42,0.81,0.21],'TTTHHHHHH'),[0.2907741935483871, 0.25157009005730924, 0.23136284577678012, 0.2766575695593804, 0.3296000585271367, 0.38957299010529806, 0.4608173019307465, 0.5444108434105804, 0.6297110187222278]),
(([0.12,0.45,0.23,0.99,0.35,0.36],'THHTHTTH'),[0.28514285714285714, 0.3378256513026052, 0.380956725493104, 0.3518717367468537, 0.37500429586037076, 0.36528605387582497, 0.3555106542906013, 0.37479179323540324]),
(([0.03,0.32,0.59,0.53,0.55,0.42,0.65],'HHTHTTHTHHT'),[0.528705501618123, 0.5522060353798126, 0.5337142767315369, 0.5521920592821695, 0.5348391689038525, 0.5152373451083692, 0.535385450497415, 0.5168208803156963, 0.5357708613431963, 0.5510509656933194, 0.536055356823069])]
print 'Inputs'
print '======'
for inputs,output in testcases:
    print inputs[0]
print 'Outputs'
print '======='
for inputs,output in testcases:
    print output[0]
The above code gives the following output:
Inputs
======
[0.5, 0.4, 0.3]
[0.14, 0.32, 0.42, 0.81, 0.21]
[0.14, 0.32, 0.42, 0.81, 0.21]
[0.12, 0.45, 0.23, 0.99, 0.35, 0.36]
[0.03, 0.32, 0.59, 0.53, 0.55, 0.42, 0.65]
Outputs
=======
0.416666666667
0.525578947368
0.290774193548
0.285142857143
0.528705501618
But I need to access each whole row of the testcases list, so that the output looks like the following. What should the code be?
([0.5,0.4,0.3],'HHTH'),[0.4166666666666667, 0.432, 0.42183098591549295, 0.43639398998330553])
This will print what you're after (if that's what you mean).
>>> for inputs,outputs in testcases:
...     print '%r, %r' % (inputs, outputs)
([0.5, 0.4, 0.3], 'HHTH'), [0.4166666666666667, 0.432, 0.42183098591549295, 0.43639398998330553]
([0.14, 0.32, 0.42, 0.81, 0.21], 'HHHTTTHHH'), [0.5255789473684211, 0.6512136991788505, 0.7295055220497553, 0.6187139453483192, 0.4823974597714815, 0.3895729901052968, 0.46081730193074644, 0.5444108434105802, 0.6297110187222278]
([0.14, 0.32, 0.42, 0.81, 0.21], 'TTTHHHHHH'), [0.2907741935483871, 0.25157009005730924, 0.23136284577678012, 0.2766575695593804, 0.3296000585271367, 0.38957299010529806, 0.4608173019307465, 0.5444108434105804, 0.6297110187222278]
([0.12, 0.45, 0.23, 0.99, 0.35, 0.36], 'THHTHTTH'), [0.28514285714285714, 0.3378256513026052, 0.380956725493104, 0.3518717367468537, 0.37500429586037076, 0.36528605387582497, 0.3555106542906013, 0.37479179323540324]
([0.03, 0.32, 0.59, 0.53, 0.55, 0.42, 0.65], 'HHTHTTHTHHT'), [0.528705501618123, 0.5522060353798126, 0.5337142767315369, 0.5521920592821695, 0.5348391689038525, 0.5152373451083692, 0.535385450497415, 0.5168208803156963, 0.5357708613431963, 0.5510509656933194, 0.536055356823069]
And if you really require the extra ) at the end of the line, change the print statement to:
print '%r, %r)' % (inputs, outputs)
It seems you need a nested loop for that, e.g.:
for inputs,outputs in testcases:
    for output in outputs:
        print output
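In Python 3, the same iteration can also unpack the inner tuple directly in the for statement. The sketch below uses only the first row of testcases from the question to stay short; the names coins, flips and rows are hypothetical labels, not anything from the original code.

```python
# Only the first row of the question's testcases, to keep the sketch short.
testcases = [
    (([0.5, 0.4, 0.3], 'HHTH'),
     [0.4166666666666667, 0.432, 0.42183098591549295, 0.43639398998330553]),
]

rows = []
# Nested unpacking: each row is ((list, string), list).
for (coins, flips), outputs in testcases:
    rows.append((coins, flips, outputs))
    print(coins, flips, outputs)
    for output in outputs:  # visit every expected value in the row
        print(output)
```

Unpacking in the loop header avoids indexing with inputs[0] and inputs[1] inside the body.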
