I want to change a parameter of the elements QF and QD in a list, as follows:
The lattice is a list containing these elements, for example:
lattice = [QF, QD, QF, QD]
The indexes of these elements in the list are given in quad_indexes.
QF and QD have two parameters (K and FamName):
QF.K = 1.7 and QD.K = -2.1
QF.FamName = 'QF'
QD.FamName = 'QD'
I want to give a random value to the parameter K of each element individually.
I tried:
i = 0
Quad_strength_err = []
while(i < len(quad_indexes)):
    if(lattice[quad_indexes[i]].FamName == 'QF'):
        lattice[quad_indexes[i]].K *= (1 + errorQF*random())
        i += 1
    elif(lattice[quad_indexes[i]].FamName == 'QD'):
        lattice[quad_indexes[i]].K *= (1 + errorQD * random())
        i += 1

for j in quad_indexes:
    quad_strength_err = lattice[j].K
    Quad_strength_err.append(quad_strength_err)
The problem is that when I print Quad_strength_err I get the same value for every QF and the same value for every QD, for example:
[1.8729820159805597, -2.27910323371567, 1.8729820159805597, -2.27910323371567]
I am looking for a result such as:
[1.7729820159805597, -2.17910323371567, 1.8729820159805597, -2.27910323371567]
TL;DR
You need to make copies of QF and QD - you're making aliases.
The problem
The problem is almost certainly due to aliasing.
When you initialize lattice with this line:
lattice = [QF, QD, QF, QD]
what you are doing is creating a structure with two pointers to QF and two pointers to QD.
In your loop, you then modify QF twice, once via lattice[0] and again via lattice[2], and ditto for QD.
The solution
What you need to do is create copies, maybe shallow, maybe deep.
Try this option:
from copy import copy
lattice = [copy(QF), copy(QD), copy(QF), copy(QD)]
and if that still doesn't work, you might need a deepcopy:
from copy import deepcopy
lattice = [deepcopy(QF), deepcopy(QD), deepcopy(QF), deepcopy(QD)]
Or, a more compact version of the same code, just because I like comprehensions:
from copy import deepcopy
lattice = [deepcopy(x) for x in (QF, QD, QF, QD)]
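With copies in place, a loop like the one in the question then produces a different K for every slot. A minimal, self-contained sketch (the Element class and the errorQF/errorQD values are stand-ins I'm assuming, not your real types):
from copy import deepcopy
from random import random

class Element:                        # stand-in for the real element type (assumed)
    def __init__(self, K, FamName):
        self.K = K
        self.FamName = FamName

QF, QD = Element(1.7, 'QF'), Element(-2.1, 'QD')
errorQF = errorQD = 0.1               # assumed error amplitudes
lattice = [deepcopy(x) for x in (QF, QD, QF, QD)]

for el in lattice:
    if el.FamName == 'QF':
        el.K *= (1 + errorQF * random())
    elif el.FamName == 'QD':
        el.K *= (1 + errorQD * random())

print([el.K for el in lattice])       # four distinct values now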
If your code is otherwise correct, you can't expect different results for the first and third elements of lattice (respectively the second and fourth), since they are (to simplify) the same element.
Using id, you can easily show that lattice[0] and lattice[2] share the same id, so modifying lattice[0] will also modify lattice[2].
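For instance, a minimal sketch with SimpleNamespace stand-ins for the question's elements (attribute names assumed from the question):
from types import SimpleNamespace

QF = SimpleNamespace(K=1.7, FamName='QF')
QD = SimpleNamespace(K=-2.1, FamName='QD')
lattice = [QF, QD, QF, QD]

print(id(lattice[0]) == id(lattice[2]))   # True: both slots hold the same object
print(lattice[0] is lattice[2])           # True

lattice[0].K = 9.9
print(lattice[2].K)                       # 9.9: the "other" element changed too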
You can duplicate each object QF and QD to get rid of this behavior.
I tried to build a working sample of your problem, starting with simple QF and QD classes:
from random import random

class QF():
    def __init__(self):
        self.K = 1.7
        self.FamName = "QF"

class QD():
    def __init__(self):
        self.K = -2.1
        self.FamName = "QD"
Then I create the lattice with different instances of the classes by calling them with ():
lattice = [QF(), QD(), QF(), QD()]
I think your mistake comes from this step: QF refers to the class itself, while QF() creates a brand-new instance that you can modify separately from the other ones.
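For example, two separate instances of the QF class defined above can be modified independently:
a = QF()
b = QF()
a.K = 3.0
print(a.K, b.K)   # 3.0 1.7, modifying one instance does not touch the other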
Finally, I apply some randomness using the random imported previously (errorQF and errorQD being the error amplitudes from the question):
for i in lattice:
    if i.FamName == "QF":
        i.K = (1 + errorQF*random())
    elif i.FamName == "QD":
        i.K = (1 + errorQD*random())
from which I get:
print(*[i.K for i in lattice])
>>>> 1.148989048860669 0.9324164812782919 1.0652187255939742 0.6860911849022292
Related
I am new to Python, and am struggling with a task that I assume is an extremely simple one for an experienced programmer.
I am trying to create a list of lists of coordinates for different lines. For instance:
list = [ [(x,y), (x,y), (x,y)], [Line 2 Coordinates], ....]
I have the following code:
masterlist_x = list(range(-5,6))
oneline = []
data = []
numberoflines = list(range(2))
i = 1
for i in numberoflines:
    slope = randint(-5,5)
    y_int = randint(-10,10)
    for element in masterlist_x:
        oneline.append((element,slope * element + y_int))
    data.append(oneline)
Instead of the coordinates of one line, the variable oneline ends up holding the coordinates of both lines.
I know this is an issue with the outer looping mechanism, but I am not sure how to proceed.
Any and all help is much appreciated. Thank you very much!
@khuynh is right: you simply had the oneline = [] in the wrong place, so you put all the coordinates in one list.
Also, you have a couple unnecessary things in your code:
you don't need to wrap range() in list(), you can iterate it directly with for
you also don't need to declare i before the for loop; the loop creates it itself
that i is not actually used, which is fine; the Python convention for unused variables is _
Fixed version:
from random import randint

masterlist_x = range(-5,6)
data = []
numberoflines = range(2)

for _ in numberoflines:
    oneline = []
    slope = randint(-5,5)
    y_int = randint(-10,10)
    for element in masterlist_x:
        oneline.append((element,slope * element + y_int))
    data.append(oneline)

print(data)
You can also run it online here: https://repl.it/repls/GreedyRuralProduct
I suspect the whole thing could also be written with much less code, and in a simpler fashion, as a list comprehension.
UPDATE: the inner loop is indeed very suitable for a list comprehension. Maybe the outer loop could be turned into one as well, so that the whole thing becomes two nested list comprehensions, but I only got confused when I tried that. This, however, is clear:
from random import randint

masterlist_x = range(-5,6)
data = []
numberoflines = range(2)

for _ in numberoflines:
    slope = randint(-5,5)
    y_int = randint(-10,10)
    oneline = [(element, slope * element + y_int)
               for element in masterlist_x]
    data.append(oneline)

print(data)
Again on repl.it too: https://repl.it/repls/SoupyIllustriousApplicationsoftware
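For completeness, here is one way the whole thing might be collapsed into a single nested comprehension (a sketch; whether it is actually clearer is debatable):
from random import randint

masterlist_x = range(-5, 6)

# one inner list of (x, y) points per randomly chosen line
data = [
    [(x, slope * x + y_int) for x in masterlist_x]
    for slope, y_int in ((randint(-5, 5), randint(-10, 10)) for _ in range(2))
]
print(data)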
I am implementing a hierarchical clustering algorithm (with similarity) using Python 3.6. The code below basically builds a new empty graph and then recursively keeps connecting the groups (represented by lists here) with the largest similarity in the original graph.
At position 1 in the code I want to store the best partition, but the function returns exactly the same thing as comminity_list; it looks like best_partition = comminity_list makes best_partition point to the same object as comminity_list. How does that happen, what did I get wrong here, and how should I fix it?
def pearson_clustering(G):
    H = nx.create_empty_copy(G)  # build an empty copy of G (no edges)
    best = 0      # best modularity so far
    current = 0   # current modularity
    A = nx.adj_matrix(G)  # get adjacency matrix
    org_deg = deg_dict(A, G.nodes())  # degrees of G
    org_E = G.number_of_edges()  # number of edges of G
    comminity_list = intial_commnity_list(G)  # function returns a list of lists here
    best_partition = None
    p_table = pearson_table(G)  # pearson_table returns a dictionary with the Pearson correlation of each pair
    l = len(comminity_list)
    while True:
        if(l == 2): break
        current = modualratiry(H, org_deg, org_E)  # find current modularity
        l = len(comminity_list)
        p_build_cluster(p_table, H, G, comminity_list)  # building clustering on H
        if(best < current):
            best_partition = comminity_list  # position 1
            best = current  # keep the clustering with the largest modularity
    return best_partition  # position 2
it looks like best_partition = comminity_list makes best_partition point to the same object as comminity_list; how does that happen, what did I get wrong here, and how should I fix it?
That is just Python's assignment behaviour: when you write best_partition = comminity_list you simply make best_partition refer to the same object as comminity_list.
If you want to explicitly copy the list you can use this (which replaces the contents of best_partition with those of comminity_list):
best_partition[:] = comminity_list
or the list's copy method. If comminity_list contains sublists you will need deepcopy from the copy module instead (otherwise you get a copy of the outer list, but the sublists are still shared references):
best_partition = comminity_list.copy()
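A small sketch of the difference, using a stand-in nested list in place of comminity_list:
from copy import deepcopy

comminity_list = [[1, 2], [3, 4]]

shallow = comminity_list.copy()   # new outer list, but the same inner lists
deep = deepcopy(comminity_list)   # new outer list and new inner lists

comminity_list[0].append(99)
print(shallow[0])   # [1, 2, 99], the shallow copy still sees the change
print(deep[0])      # [1, 2], the deep copy is fully independent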
I'm starting with numba and my first goal is to try and accelerate a not so complicated function with a nested loop.
Given the following class:
class TestA:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def get_mult(self):
        return self.a * self.b
and a numpy ndarray that contains TestA objects, with shape (N,) where N is usually around 3 million.
Now given the following function:
def test_no_jit(custom_class_obj_container):
    container_length = len(custom_class_obj_container)
    sum = 0
    for i in range(container_length):
        for j in range(i + 1, container_length):
            obj_i = custom_class_obj_container[i]
            obj_j = custom_class_obj_container[j]
            sum += (obj_i.get_mult() + obj_j.get_mult())
    return sum
I've tried to play around with numba to get it to work with the function above, but I cannot seem to get it to work with the nopython=True flag, and if it is set to False the runtime is higher than for the no-jit function.
Here is my latest attempt at jitting the function (also using nb.prange):
@nb.jit(nopython=False, parallel=True)
def test_jit(custom_class_obj_container):
    container_length = len(custom_class_obj_container)
    sum = 0
    for i in nb.prange(container_length):
        for j in nb.prange(i + 1, container_length):
            obj_i = custom_class_obj_container[i]
            obj_j = custom_class_obj_container[j]
            sum += (obj_i.get_mult() + obj_j.get_mult())
    return sum
I've tried to search around but I cannot seem to find a tutorial on how to define a custom class in the signature, or how to accelerate a function of this sort and get it to run on the GPU, possibly with the CUDA libraries, which are installed and ready to use (previously used with TensorFlow). Any info regarding that would be highly appreciated.
The numba docs give an example of creating a custom type, even for nopython mode: https://numba.pydata.org/numba-doc/latest/extending/interval-example.html
In your case though, unless this is a really slimmed down version of what you actually want to do, it seems like the easiest approach would be to re-use existing types. Additionally, the construction of a 3M length object array is going to be slow, and produce fragmented memory (as the objects are not being stored in contiguous blocks).
An example of how record arrays might be used to solve the problem:
import numpy as np
import numba

x_dt = np.dtype([('a', np.float64),
                 ('b', np.float64)])

n = 30000
buf = np.arange(n*2).reshape((n, 2)).astype(np.float64)
vec3 = np.recarray(n, dtype=x_dt, buf=buf)

@numba.njit
def mult(a):
    return a.a * a.b

@numba.jit(nopython=True, parallel=True)
def sum_of_prod(vector):
    sum = 0
    vector_len = len(vector)
    for i in numba.prange(vector_len):
        for j in numba.prange(i + 1, vector_len):
            sum += mult(vector[i]) + mult(vector[j])
    return sum

sum_of_prod(vec3)
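If the data currently lives in a list of TestA objects, it could be converted to that record-array layout before calling sum_of_prod. A sketch, reusing the example above (TestA as defined in the question, sizes reduced for illustration):
objs = [TestA(float(i), float(i) + 1.0) for i in range(1000)]

vec = np.recarray(len(objs), dtype=x_dt)
vec['a'] = [o.a for o in objs]   # copy the attributes into contiguous storage
vec['b'] = [o.b for o in objs]

print(sum_of_prod(vec))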
FWIW, I'm no numba expert. I found this question when searching for how to implement a custom type in numba for non-numerical stuff. In your case, because this is highly numerical, I think a custom type is probably overkill.
I am writing a scientific code in python to calculate the energy of a system.
Here is my function: cte1, cte2, cte3, cte4 are constants previously computed; pii is np.pi (computed beforehand, since it slows the loop otherwise). I calculate the 3 components of the total energy, then sum them up.
def calc_energy(diam):
    Energy1 = cte2*((pii*diam**2/4)*t)
    Energy2 = cte4*(pii*diam)*t
    d = diam/t
    u = np.sqrt((d)**2/(1+d**2))
    cc = u**2
    E = sp.special.ellipe(cc)
    K = sp.special.ellipk(cc)
    Id = cte3*d*(d**2+(1-d**2)*E/u-K/u)
    Energy3 = cte*t**3*Id
    total_energy = Energy1+Energy2+Energy3
    return (total_energy, Energy1)
My first idea was to simply loop over all values of the diameter:
start_diam, stop_diam, step_diam = 1e-10, 500e-6, 1e-9  # Diametre
diametres = np.arange(start_diam, stop_diam, step_diam)

for d in diametres:
    res1, res2 = calc_energy(d)
    totalEnergy.append(res1)
    Energy1.append(res2)
In an attempt to speed up calculations, I decided to use numpy to vectorize, as shown below:
diams = diametres.reshape(-1,1) #If not reshaped, calculations won't run
r1 = np.apply_along_axis(calc_energy,1,diams)
However, the "vectorized" solution does not work properly. When timing it, I get 5 seconds for the first solution and 18 seconds for the second one.
I guess I'm doing something the wrong way but can't figure out what.
With your current approach, you're applying a Python function to each element of your array, which carries additional overhead. Instead, you can pass the whole array to your function and get an array of answers back. Your existing function appears to work fine without any modification.
import numpy as np
from scipy import special
cte = 2
cte1 = 2
cte2 = 2
cte3 = 2
cte4 = 2
pii = np.pi
t = 2
def calc_energy(diam):
    Energy1 = cte2*((pii*diam**2/4)*t)
    Energy2 = cte4*(pii*diam)*t
    d = diam/t
    u = np.sqrt((d)**2/(1+d**2))
    cc = u**2
    E = special.ellipe(cc)
    K = special.ellipk(cc)
    Id = cte3*d*(d**2+(1-d**2)*E/u-K/u)
    Energy3 = cte*t**3*Id
    total_energy = Energy1+Energy2+Energy3
    return (total_energy, Energy1)

start_diam, stop_diam, step_diam = 1e-10, 500e-6, 1e-9  # Diametre
diametres = np.arange(start_diam, stop_diam, step_diam)

a = calc_energy(diametres)  # Pass the whole array
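The result is then a tuple of two arrays, one entry per diameter, which can be unpacked directly:
total_energy, energy1 = calc_energy(diametres)
print(total_energy.shape, energy1.shape)   # both hold one value per element of diametres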
Edit: I've found what the problem boils down to:
If you run this code:
A = ones((10,4))
view = A[:,1]
view.fill(7)
A
or
A = ones((10,4))
view = A[:,1:3]
view.fill(7)
A
You'll see that the columns of A change
If you run this:
A = ones((10,4))
view = A[:,(1,2)]
view.fill(7)
A
There are no side effects on A. Is this behavior intentional or a bug?
I have a function that calculates the amount I have to rotate certain columns of x,y points in a matrix. The function only takes one input - a matrix mat:
def rotate(mat):
In the function, I create views to make working with each section easier:
rot_mat = mat[:,(col,col+1)]
Then, I calculate a rotation angle and apply it back on the view that I had created before:
rot_mat[row,0] = cos(rot)*x - sin(rot)*y
rot_mat[row,1] = sin(rot)*x + cos(rot)*y
If I perform this in the main body of my program, the changes to my rot_mat view would propagate to the original matrix mat. When I turned it into a function, the views stopped having side effects on the original matrix. What's the reasoning for this and is there any way to get around it? I should also note that it isn't changing mat within the function itself. At the end, I just try to return mat but no changes have been made.
Full code for function:
def rotate(mat):
    # Get a reference shape
    ref_sh = 2*random.choice(range(len(filelist)))
    print 'Reference shape is '
    print (ref_sh/2)
    # Create a copy of the reference point matrix
    ref_mat = mat.take([ref_sh,ref_sh+1],axis=1)
    # Calculate rotation for each set of points
    for col in range(len(filelist)):
        col = col * 2 # Account for the two point columns
        rot_mat = mat[:,(col,col+1)]
        # Numerator = sum of wi*yi - zi*xi
        numer = inner(ref_mat[:,0],rot_mat[:,1]) - inner(ref_mat[:,1],rot_mat[:,0])
        # Denominator = sum of wi*xi + zi*yi
        denom = inner(ref_mat[:,0],rot_mat[:,0]) + inner(ref_mat[:,1],rot_mat[:,1])
        rot = arctan(numer/denom)
        # Rotate the points in rot_mat. As it's a view of mat, the effects are
        # propagated.
        for row in range(num_points):
            x = rot_mat[row,0]
            y = rot_mat[row,1]
            rot_mat[row,0] = cos(rot)*x - sin(rot)*y
            rot_mat[row,1] = sin(rot)*x + cos(rot)*y
    return mat
When you do view = A[:,(1,2)] you are using advanced indexing (Numpy manual: Advanced Indexing), which means that the array returns a copy, not a view. It's advanced because your indexing object is a tuple "containing at least one sequence" (the sequence being the tuple (1,2)). The total explicit selection object obj in your case would equal (slice(None), (1,2)), i.e. A[(slice(None), (1,2))] returns the same thing as A[:,(1,2)].
As larsmans suggests above, it seems that __getitem__ and __setitem__ behave differently for advanced indexing, which makes sense, because assigning values to a copy would have no use (the copy would not be stored).
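A quick way to see both effects (the copy on read, and __setitem__ still writing into A) is, for example:
import numpy as np

A = np.ones((10, 4))
view = A[:, 1:3]          # basic slicing: a view
view.fill(7)
print(A[0])               # [1. 7. 7. 1.], A changed

A = np.ones((10, 4))
fancy = A[:, (1, 2)]      # advanced indexing: a copy
fancy.fill(7)
print(A[0])               # [1. 1. 1. 1.], A untouched

A[:, (1, 2)] = 7          # __setitem__ with advanced indexing writes into A
print(A[0])               # [1. 7. 7. 1.]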