Creating a Running Total column in a matrix? - python

Related to this question I recently posted & resolved.
If the 3rd column of a matrix is to be changed to be the running total of the sums, how would I adjust my code to do so?
This is where I'm at so far:
def dice(n):
    rolls = np.empty(shape=(n, 3), dtype=int)
    for i in range(n):
        x = random.randint(1, 6)
        y = random.randint(1, 6)
        if x in [1, 3, 5] or y in [1, 3, 5]:
            sum_2_dice = x + y
            z = np.cumsum(sum_2_dice)
            rolls[i, :] = x, y, z
        else:
            sum_2_dice = -(x + y)  # meaning you "lose the sum" basically
            z = np.cumsum(sum_2_dice)
            rolls[i, :] = x, y, z
    return rolls
So for example, dice(3) returns
array([[2, 6, -8],
       [1, 3, 4],
       [5, 2, 7]])
when it should really be returning:
array([[2, 6, -8],
       [1, 3, -4],
       [5, 2, 3]])
I thought np.cumsum would do this, but I'm not sure. Is a while loop needed (and if so, where would it go)? I've tried various adjustments: for example, instead of z = np.cumsum(sum_2_dice) I wrote sum_2_dice += sum_2_dice, followed by rolls[i,:] = x, y, sum_2_dice. That was terribly wrong: all it did was double the sum values, not compute any sort of running total.

For your purposes, an easy way to keep track of z would be to initialise it outside of the loop, then keep adding the value of sum_2_dice.
import random
import numpy as np

def dice(n):
    z = 0  # running total, kept across loop iterations
    rolls = np.empty(shape=(n, 3), dtype=int)
    for i in range(n):
        x = random.randint(1, 6)
        y = random.randint(1, 6)
        if x in [1, 3, 5] or y in [1, 3, 5]:
            sum_2_dice = x + y
            z += sum_2_dice
            rolls[i, :] = x, y, z
        else:
            sum_2_dice = -(x + y)  # meaning you "lose the sum" basically
            z += sum_2_dice
            rolls[i, :] = x, y, z
    return rolls

print(dice(3))
# example output (the rolls are random):
#[[ 6 2 -8]
# [ 4 5 1]
# [ 1 5 7]]
For reference, numpy.cumsum is normally used to get the cumulative sum of the elements of arrays, for example:
test = np.arange(0,5,1)
print (test)
# [0 1 2 3 4]
print (np.cumsum(test))
# [0 1 3 6 10]
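If you do want np.cumsum to do the work, it has to see the whole column of signed sums at once rather than one number per loop iteration. The following is only a sketch of that idea: the helper name add_running_total is made up for illustration, and it takes the rolled (x, y) pairs as input so that the result can be checked against the example in the question.

```python
import numpy as np

def add_running_total(xy):
    """Append a running-total column to an (n, 2) array of dice rolls.

    Hypothetical helper, not part of the original code.
    """
    xy = np.asarray(xy, dtype=int)
    sums = xy.sum(axis=1)
    # A roll counts positively if at least one die is odd, else negatively.
    any_odd = (xy % 2 == 1).any(axis=1)
    signed = np.where(any_odd, sums, -sums)
    # cumsum over the whole column gives the running total in one call.
    return np.column_stack((xy, np.cumsum(signed)))

print(add_running_total([[2, 6], [1, 3], [5, 2]]))
```

For the rolls used in the question, the third column comes out as -8, -4, 3.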

Related

python appending does not work while using multiprocessing

The problem is that appending e to my list ok = [] has no effect. That is strange, because print(e), one line above ok.append(e), prints the value of e as it should.
There is no need to understand what the program does; the main issue is just that appending a value to my list has no effect, even though the value is real.
I tried defining ok = [] inside if __name__=='__main__':, but that gave me the error NameError: name 'ok' is not defined. I then tried using global ok inside some_function, but that gave me the same result.
import time
import multiprocessing as mp
ratios1 = [1/x for x in range(1,11)]
ratios2 = [y/1 for y in range(1,11)]
x = 283
y = 436
ok = []
def some_function(x_, y_):
    list_ = [[a, b] for a in range(1, 1980 + 1) for b in range(1, 1980 + 1) if a / b == x_ / y_]
    for e in list_:
        if not e[0] in [h[0] for h in ok]:
            if not e[1] in [u[1] for u in ok]:
                print(e)
                ok.append(e)
if __name__=='__main__':
    processes = []
    if x / y in ratios1 or x / y in ratios2:
        some_function(x_=x, y_=y)
    else:
        for X_, Y_ in [
            [x, y],
            [x - 1, y], [x, y - 1], [x + 1, y], [x, y + 1],
            [x - 2, y], [x, y - 2], [x + 2, y], [x, y + 2],
            [x - 3, y], [x, y - 3], [x + 3, y], [x, y + 3]
        ]:
            p = mp.Process(target=some_function, args=(X_,Y_))
            processes.append(p)
        start = time.time()
        for p_ in processes:
            p_.start()
        for p_ in processes:
            p_.join()
        end = time.time()
        print(f"finished in {end - start} sec")
    print(ok)
When I run this, the output is:
[...] # other values of "e"
[283, 433] # some random "e" value
[566, 866] # some random "e" value
[849, 1299] # some random "e" value
[1132, 1732] # some random "e" value
finished in 0.8476874828338623 sec # execution time
[] # the "ok" list being printed out at the end
After adding print(id(ok)) both in some_function and at the end, I get the following output.
Note: I removed print(e) for this run.
2489040444480
3014871358528
2324227431488
2471301880896
1803966487616
2531583073344
1665411652672
2149818113088
2330038901824
1283883998272
2498472320064
2147028311104
2509405887552
finished in 0.8341867923736572 sec
2589544128640
[]
You need a list that can be accessed from more than one process. Create one with multiprocessing.Manager().list() and pass it to the worker as an argument; you cannot rely on it being a global, because whether child processes inherit globals is OS-specific.
A managed list is slower than a normal list, because IPC is an expensive process. If you find the performance unacceptable, you should really try to work with only local variables and forget about using globals.
import time
import multiprocessing as mp
ratios1 = [1/x for x in range(1,11)]
ratios2 = [y/1 for y in range(1,11)]
x = 283
y = 436
def some_function(x_, y_, ok_list):
    list_ = [[a, b] for a in range(1, 1980 + 1) for b in range(1, 1980 + 1) if a / b == x_ / y_]
    for e in list_:
        if not e[0] in [h[0] for h in ok_list]:
            if not e[1] in [u[1] for u in ok_list]:
                print(e)
                ok_list.append(e)
if __name__=='__main__':
    manager = mp.Manager()
    ok_list = manager.list()  # shared between processes via the manager
    processes = []
    if x / y in ratios1 or x / y in ratios2:
        # the list must be passed here too, since it is no longer a global
        some_function(x_=x, y_=y, ok_list=ok_list)
    else:
        for X_, Y_ in [
            [x, y],
            [x - 1, y], [x, y - 1], [x + 1, y], [x, y + 1],
            [x - 2, y], [x, y - 2], [x + 2, y], [x, y + 2],
            [x - 3, y], [x, y - 3], [x + 3, y], [x, y + 3]
        ]:
            p = mp.Process(target=some_function, args=(X_,Y_,ok_list))
            processes.append(p)
        start = time.time()
        for p_ in processes:
            p_.start()
        for p_ in processes:
            p_.join()
        end = time.time()
        print(f"finished in {end - start} sec")
    print(ok_list)
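The difference between a plain list and a managed list can be seen in isolation with a much smaller program. This is only a sketch: the names append_to and run_demo are invented for the illustration and are not part of the code above.

```python
import multiprocessing as mp

def append_to(target, value):
    # Runs in the child process.
    target.append(value)

def run_demo():
    plain = []  # ordinary list: each child works on its own copy
    with mp.Manager() as manager:
        shared = manager.list()  # proxy to a list living in the manager process
        workers = [
            mp.Process(target=append_to, args=(plain, 1)),
            mp.Process(target=append_to, args=(shared, 1)),
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Copy the contents out before the manager shuts down.
        return plain, list(shared)

if __name__ == "__main__":
    plain, shared = run_demo()
    print(plain)   # []  : the child's append never reached the parent
    print(shared)  # [1] : the append went through the manager
```

The plain list stays empty in the parent for the same reason that ok stayed empty in the question: the child only ever modified its own copy.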
This should work. The problem was that when you start a process, the objects it uses are not really passed to it so much as cloned, so changes made in the child never reach the parent. multiprocessing.Pool.starmap lets us return values from the worker processes, which circumvents this issue.
We use starmap rather than map so that we can pass multiple parameters to some_function. Additionally, the Pool replaces the for X_, Y_ in ... loop and runs it across multiple processes.
import time
from multiprocessing import Pool
ratios1 = [1/x for x in range(1,11)]
ratios2 = [y/1 for y in range(1,11)]
x = 283
y = 436
def some_function(x_, y_):
    ok = []  # local to each call, so the result does not depend on which worker runs it
    list_ = [[a, b] for a in range(1, 1980 + 1) for b in range(1, 1980 + 1) if a / b == x_ / y_]
    for e in list_:
        if not e[0] in [h[0] for h in ok]:
            if not e[1] in [u[1] for u in ok]:
                print(e)
                ok.append(e)
    return ok
if __name__=='__main__':
    ok = []
    if x / y in ratios1 or x / y in ratios2:
        ok = some_function(x_=x, y_=y)
    else:
        start = time.time()
        with Pool(13) as p:
            # starmap unpacks each [x_, y_] pair into the two arguments
            res = p.starmap(some_function, [[x, y],
                [x - 1, y], [x, y - 1], [x + 1, y], [x, y + 1],
                [x - 2, y], [x, y - 2], [x + 2, y], [x, y + 2],
                [x - 3, y], [x, y - 3], [x + 3, y], [x, y + 3]])
        ok = res  # one list of pairs per input pair
        end = time.time()
        print(f"finished in {end - start} sec")
    print(ok)

z3 giving a surprising answer

I tried this:
import z3
n,x,y,z = z3.Ints('n x y z')
s = z3.Solver()
s.add(4/n==1/x+1/y+1/z)
s.add(x>0)
s.add(n>0)
s.add(y>0)
s.add(z>0)
s.check()
s.model()
and I get:
[x = 1, n = 2, z = 3, y = 1, div0 = [(1, 1) → 1, (4, 2) → 2, else → 0], mod0 = [(1, 3) → 1, else → 0]]
But 4/2 does not equal 1/1+1/1+1/3.
What am I doing wrong?
You've declared n, x, y, z as integers, so each division is an integer division: 4/2 = 2, 1/1 = 1 and 1/3 = 0. The model therefore satisfies the equality as 2 = 1 + 1 + 0.
The obvious thing to do is to use Real values for this problem.
Changing the declaration to:
n,x,y,z = z3.Reals('n x y z')
produces:
[z = 1, y = 1, n = 4/3, x = 1]
which does satisfy the equation you posed, if trivially: 4/(4/3) = 3 = 1 + 1 + 1.
If you actually do want n, x, y, z to be integers, then you should convert them to reals before you divide, like this:
import z3
n,x,y,z = z3.Ints('n x y z')
s = z3.Solver()
s.add(4/z3.ToReal(n)==1/z3.ToReal(x)+1/z3.ToReal(y)+1/z3.ToReal(z))
s.add(x>0)
s.add(n>0)
s.add(y>0)
s.add(z>0)
print(s.check())
print(s.model())
This prints:
sat
[n = 4, x = 3, z = 3, y = 3]
again satisfying your constraints.
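The integer-division effect described above can be reproduced without z3. The sketch below uses plain Python, where // is floor division (it agrees with z3's integer division here because all divisors are positive) and fractions.Fraction gives exact rational arithmetic. The two helper names are made up for the illustration.

```python
from fractions import Fraction

def holds_with_integer_division(n, x, y, z):
    # Every quotient is truncated to an integer before comparing.
    return 4 // n == 1 // x + 1 // y + 1 // z

def holds_exactly(n, x, y, z):
    # Exact rational arithmetic, no truncation.
    return Fraction(4, n) == Fraction(1, x) + Fraction(1, y) + Fraction(1, z)

# The first model z3 returned: n = 2, x = 1, y = 1, z = 3
print(holds_with_integer_division(2, 1, 1, 3))  # True:  2 == 1 + 1 + 0
print(holds_exactly(2, 1, 1, 3))                # False: 2 != 7/3

# The model found after converting to reals: n = 4, x = y = z = 3
print(holds_exactly(4, 3, 3, 3))                # True:  1 == 1/3 + 1/3 + 1/3
```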

Plotting a function of two variables based on data from a file

Suppose the values of a function are written line by line in the file in the form x y f(x, y) and I read this file into the list of lists [ [x1, y1, f(x1, y1)], ..., [xN, yN, f(xN, yN)] ]:
with open('data.txt', 'r') as file:
    data = [[float(x) for x in line.split()] for line in file]
The question is how to plot this function.
I managed to write a program (see below) that implements the task (the values for data are taken as an example), but it looks too complicated and, it seems to me, there is a more elegant solution.
import numpy as np
import matplotlib.pyplot as plt
data = [[0, 0, 1], [0, 1, 1], [1, 0, 2], [1, 1, 3]]
x, y, _ = np.array(data).T.tolist()
x = sorted(set(x))  # sorted, because a set has no guaranteed order
y = sorted(set(y))
def f(x, y):
    for val in data:
        if x == val[0] and y == val[1]:
            return val[2]
X, Y = np.meshgrid(x, y)
Z = [[f(x_, y_) for x_ in x] for y_ in y]
cp = plt.contourf(Y, X, Z)
plt.colorbar(cp)
plt.show()
Therefore, I think it is more correct to ask how to solve the problem gracefully, elegantly.
I found a way to significantly speed up the filling of Z. The solution, of course, is not elegant.
import numpy as np
import matplotlib.pyplot as plt
data = [[0, 0, 1], [0, 1, 1], [1, 0, 2], [1, 1, 3]]
x, y, _ = np.array(data).T.tolist()
x = sorted(set(x))  # sorted, because a set has no guaranteed order
y = sorted(set(y))
X, Y = np.meshgrid(x, y)
#####
# Rows follow y and columns follow x, matching the shape of X and Y.
Z = np.zeros((len(y), len(x)))
x_ind = {xx[1] : xx[0] for xx in enumerate(x)}
y_ind = {yy[1] : yy[0] for yy in enumerate(y)}
for d in data:
    xx, yy, zz = d
    Z[y_ind[yy], x_ind[xx]] = zz
#####
cp = plt.contourf(Y, X, Z)
plt.colorbar(cp)
plt.show()
I compared the two approaches with a simple script:
import numpy as np
import sys
import time

def f1(data, x, y):
    # Linear search through all points for every grid cell.
    for val in data:
        if x == val[0] and y == val[1]:
            return val[2]

def init1(data, x, y):
    Z = [[f1(data, x_, y_) for x_ in x] for y_ in y]
    return Z

def init2(data, x, y):
    # One pass over the data, using dictionaries to look up grid indices.
    Z = np.zeros((len(y), len(x)))
    x_ind = {xx[1] : xx[0] for xx in enumerate(x)}
    y_ind = {yy[1] : yy[0] for yy in enumerate(y)}
    for d in data:
        xx, yy, zz = d
        Z[y_ind[yy], x_ind[xx]] = zz
    return Z

def test(n):
    data = []
    x = y = [nn / n for nn in range(n+1)]
    for xx in x:
        for yy in y:
            data.append([xx, yy, 0.])
    t1 = time.time()
    init1(data, x, y)
    dt1 = time.time() - t1
    t2 = time.time()
    init2(data, x, y)
    dt2 = time.time() - t2
    print(f'n = {n:5d} ; t1 = {dt1:10.3f} s ; t2 = {dt2:10.3f} s')

def main():
    n = 10
    if len(sys.argv) > 1:
        try:
            n = int(sys.argv[1])
        except ValueError:
            print(f'Can not convert "{sys.argv[1]}" to integer')
            sys.exit(1)
    if n <= 0:
        print('n is negative or equal zero')
        sys.exit(1)
    test(n)

if __name__ == '__main__':
    main()
Without specifying the characteristics of the machine on which the program was run, I will only give its output for n = 100 and n = 200:
$ python test.py 100 ; python test.py 200
n = 100 ; t1 = 1.092 s ; t2 = 0.001 s
n = 200 ; t1 = 19.312 s ; t2 = 0.005 s
Of course, this is still not an efficient way: for example, it takes about 2 seconds for a 4000 by 4000 grid.
I want to note that the new method works acceptably on small and medium amounts of data, where matplotlib itself takes significantly longer than filling Z. On large amounts of data, the problems are primarily related to matplotlib.
I think that, although the solution is not elegant, it solves the problem at an acceptable speed. To be honest, I'm not even sure that it can be sped up significantly.
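A possibly more compact variant of the same idea lets NumPy build the index lookup: np.unique with return_inverse=True returns the sorted distinct coordinates together with, for each data row, the position of its coordinate in that sorted array. This is a sketch under the assumption that the points lie on a rectangular grid; the function name grid_from_points is made up, and grid cells with no data point are left as NaN.

```python
import numpy as np

def grid_from_points(data):
    """Turn rows of (x, y, f(x, y)) into sorted axes and a 2-D array Z.

    Hypothetical helper; Z[j, i] holds the value at (x[i], y[j]).
    """
    data = np.asarray(data, dtype=float)
    # Sorted unique coordinates plus, per row, the index into them.
    x, ix = np.unique(data[:, 0], return_inverse=True)
    y, iy = np.unique(data[:, 1], return_inverse=True)
    Z = np.full((len(y), len(x)), np.nan)
    Z[iy, ix] = data[:, 2]  # vectorised fill, no Python loop
    return x, y, Z

x, y, Z = grid_from_points([[0, 0, 1], [0, 1, 1], [1, 0, 2], [1, 1, 3]])
print(Z)
```

With this layout the arrays can be passed to matplotlib directly as plt.contourf(x, y, Z), since contourf accepts 1-D coordinate arrays when len(x) equals the number of columns of Z and len(y) the number of rows.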

Why does the program print 4 for both x and y?

This is the code that I wrote. I also want to know why x and y didn't swap values.
x = 3
y = 4
x = y
y = x
print (y)
print (x)
Output:
4
4
Your code is essentially doing this:
x = 3
y = 4
x = y # y is 4
y = x # x is now 4 also
To swap the values stored in two variables, you usually need a "temporary" variable:
temp = x
x = y
y = temp
Your code assigns the value of y to x, then assigns x (which is now 4) back to y.
x = 3 #x=3
y = 4 #y=4
x = y # x = y = 4 -> x=4
y = x # y = x = 4 -> y=4
If you want to swap the values, just do this:
x = 3
y = 4
x, y = y, x
It also works with any number of values:
x = 3
y = 4
z = 5
# you can also assign like this
x, y, z = 3, 4, 5
x, y, z = y, z, x
# this makes x=4, y=5, z=3

Python code for Lagrange interpolation - determining the equation of the polynomial

The following code takes in a single value, x, and a list of points, X, and determines the value of the Lagrange polynomial through the list of points at the given x value.
import numpy as np

def chunkIt(seq, num):
    # Split seq into num consecutive chunks of (nearly) equal length.
    avg = len(seq) / float(num)
    out = []
    last = 0.0
    while last < len(seq):
        out.append(seq[int(last):int(last + avg)])
        last += avg
    return out

def product(values):
    p = 1
    for i in values:
        p *= i
    return p

def Lagrange(x, X):
    T = np.zeros((2, len(X)))
    # Factors (x - x_j) / (x_i - x_j) for every pair i != j, grouped by i.
    factors = []
    for i in range(len(X)):
        for j in range(len(X)):
            if i != j:
                factors.append((x - X[j][0]) / (X[i][0] - X[j][0]))
    # One basis-polynomial value per point: the product of its factors.
    p = []
    for i in chunkIt(factors, len(X)):
        p.append(product(i))
    for i in range(len(X)):
        T[0][i] = p[i]
        T[1][i] = X[i][1]
    list2 = []
    for i in range(len(X)):
        list2.append(T[0][i] * T[1][i])
    return sum(list2)
For example:
x, X = 3, [[0,0],[1,1],[2,0.5]]
gives a value of -1.5.
How do I modify this code to determine the equation of the polynomial through the list of points? i.e. if I put x = 'x' as the input, I want it to return -0.75x**2 + 1.75x [for the given example]
import numpy as np
from pypoly import Polynomial
x, X = 3, [[0, 0], [1, 1], [2, 0.5]]
order = len(X)
This is the number of coefficients of the resulting Lagrange polynomial (its degree is order - 1). For your example, order is 3.
equations = np.array([[point[0] ** i for i in range(order)] for point in X])
values = np.array([point[1] for point in X])
coefficients = np.linalg.solve(equations, values)
This sets up simultaneous equations by substituting the points into a general polynomial. For order 3, the general polynomial is:
a * x ** 2 + b * x ** 1 + c * x ** 0 = y
Solving the system of simultaneous equations gives the coefficients; for order 3, these are the values of a, b and c. Note that the code stores them in ascending order of power, i.e. as [c, b, a].
print('coefficients', list(coefficients))
coefficients [0.0, 1.75, -0.75]
p = Polynomial(*coefficients)
Here, the * operator splits the elements of the array-like into individual values to be passed as arguments to Polynomial().
print(p)
1.75 * X - 0.75 * X**2
print(p(x))
-1.5
To install PyPolynomial with pip, use:
for Python 2:
pip install PyPolynomial
for Python 3:
pip3 install PyPolynomial
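If you would rather not install an extra package, the same coefficients can be obtained with NumPy alone. This is a sketch of an alternative and not the approach used above: np.vander builds the matrix of powers, and numpy.polynomial.Polynomial takes coefficients in ascending order of power, which matches the [c, b, a] layout. The function name lagrange_coefficients is made up for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial

def lagrange_coefficients(points):
    """Coefficients (ascending powers) of the polynomial through the points.

    Hypothetical helper built on a Vandermonde system.
    """
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    # Row i is [1, x_i, x_i**2, ...], one column per coefficient.
    powers = np.vander(xs, increasing=True)
    return np.linalg.solve(powers, ys)

coefficients = lagrange_coefficients([[0, 0], [1, 1], [2, 0.5]])
p = Polynomial(coefficients)
print(coefficients)  # approximately [0, 1.75, -0.75]
print(p(3))          # approximately -1.5
```

The printed values may differ from the exact ones in the last decimal places because of floating-point rounding.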
