PROBLEM: I was wondering if it is possible to replace this for loop with a list comprehension. Trying to learn idiomatic Python.
STATEMENT:
1. Take row #0 of X and c
2. Multiply corresponding elements (you get 5 numbers for 5 columns)
3. Sum these 5 numbers
4. Subtract it from y[0]
5. Square it for the final result
6. Repeat for all rows of X, c and columns of y
7. Find the index of c with the smallest result in step 5
import numpy as np
X = np.array([[66, 5, 15, 2, 500],
              [21, 3, 50, 1, 100],
              [120, 15, 5, 2, 1200]])
y = np.array([250000, 60000, 525000])
c = np.array([[3000, 200, -50, 5000, 100],
              [2000, -250, -100, 150, 250],
              [3000, -100, -150, 0, 150]])
def find_best(X, y, c):
    smallest_error = np.inf
    best_index = -1
    idx = 0
    sq_error_list = []
    for coeff in c:
        # Squared error of coefficient set idx, summed over all rows of X
        sq_error = sum((y - (X @ c[idx]))**2)
        sq_error_list.append(sq_error)
        idx += 1
    best_index = sq_error_list.index(min(sq_error_list))
    print("the best set is set %d" % best_index)
find_best(X, y, c)
You can solve it with a single vectorized expression (@ is matrix multiplication):
np.argmin(np.sum((y[:, None] - X @ c.T)**2, axis=0))
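Since the question asks specifically for a list comprehension, here is a minimal sketch that builds the per-set errors with one and checks the result against the vectorized expression. The names sq_errors, best_index and best_vectorized are illustrative and not from the original posts; the sketch assumes NumPy is installed and Python 3.5+ for the @ operator.

```python
import numpy as np

X = np.array([[66, 5, 15, 2, 500],
              [21, 3, 50, 1, 100],
              [120, 15, 5, 2, 1200]])
y = np.array([250000, 60000, 525000])
c = np.array([[3000, 200, -50, 5000, 100],
              [2000, -250, -100, 150, 250],
              [3000, -100, -150, 0, 150]])

# One total squared error per coefficient set, built with a list comprehension.
sq_errors = [np.sum((y - X @ coeff) ** 2) for coeff in c]
best_index = int(np.argmin(sq_errors))

# Vectorized equivalent: column j of X @ c.T holds the predictions for c[j],
# and y[:, None] broadcasts y down every column.
best_vectorized = int(np.argmin(np.sum((y[:, None] - X @ c.T) ** 2, axis=0)))

print(best_index, best_vectorized)  # 1 1
```

Both forms do the same arithmetic. The comprehension keeps the per-set loop visible, while the vectorized form hands it to NumPy.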
Suppose there are 4 unsorted arrays as given below:
A = [0, 100, -100, 50, 200]
B = [30, 100, 20, 0]
C = [0, 20, -1, 80]
D = [50, 0, -200, 1]
Suppose X is 0; a few of the possible outputs (pick one element from each array so that the four picked elements sum to X) are:
0, 0, 0, 0
-100, 100, 0, 0
-100, 30, 20, 50 ... etc.
I was able to devise an algorithm that does this in O(n^3 log n). Is there a better way to achieve the same?
My solution:
1- Sort each array.
2- Fix an element from array A.
3- Run three loops over the rest of the arrays and take the sum of the elements:
if sum > 0 (return -1, no such elements exist)
if sum == 0 (return the current elements)
if sum < 0 (advance the pointer of the array whose current element is the minimum)
Any suggestions?
A kind of dynamic programming approach.
Initialize sums (a dict of the form {possible_sum0: [way_to_get_sum0, ...]}) with the first list A. This results in
sums = {0: [[0]], 100: [[100]], -100: [[-100]], 50: [[50]], 200: [[200]]}
Then update that dictionary with the lists B and C. sums will now contain entries like
sums = {...,
        30: [[0, 30, 0]],
        50: [[0, 30, 20], [50, 0, 0]],
        29: [[0, 30, -1]], ...}
Then, in find_sum, I sort the last list D and the sums for some speedup, and break as soon as the given sum X can no longer be reached.
Here is the code:
from collections import defaultdict
A = [0, 100, -100, 50, 200]
B = [30, 100, 20, 0]
C = [0, 20, -1, 80]
D = [50, 0, -200, 1]
def initialize_sums(lst):
    return {item: [[item]] for item in lst}
def update_sums(sums, lst):
    new_sums = defaultdict(list)
    for sm, ways in sums.items():
        for item in lst:
            new_sum = sm + item
            for way in ways:
                new_sums[new_sum].append(way + [item])
    return new_sums
def find_sum(sums, last_lst, X):
    last_lst = sorted(last_lst)
    ret = []
    for sm, ways in sorted(sums.items()):
        for item in last_lst:
            x = sm + item
            if x > X:
                break
            if x == X:
                for way in ways:
                    ret.append(way + [item])
                break
    return ret
sums = initialize_sums(lst=A)
sums = update_sums(sums, lst=B)
sums = update_sums(sums, lst=C)
ret = find_sum(sums, last_lst=D, X=0)
print(ret)
# [[-100, 30, 20, 50], [0, 0, -1, 1], [-100, 100, -1, 1], ...]
I did not analyze the overall complexity, though.
We can get O(n^2) by hashing the pair sums of A and B and then checking, for each such sum sum_AB[i], whether X - sum_AB[i] is hashed among the pair sums of C and D.
In some circumstances it could be more efficient to enumerate those sums by treating each list as a polynomial (the exponents are the values, the coefficients count how often each value occurs) and multiplying the polynomials pairwise with an FFT, for O(m log m) complexity, where m is the range of the values.
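The pair-sum hashing idea can be sketched as follows. This is a minimal illustration, not the answerer's own code: the names four_sum and pairs_cd are made up here, and the sketch hashes the C/D pair sums and scans the A/B pairs, which is symmetric to the description above. Building the table takes O(n^2) time and space; listing every solution can take longer when many pairs share the same sum.

```python
from collections import defaultdict
from itertools import product

A = [0, 100, -100, 50, 200]
B = [30, 100, 20, 0]
C = [0, 20, -1, 80]
D = [50, 0, -200, 1]

def four_sum(A, B, C, D, X):
    """Return every (a, b, c, d), one element per list, with a + b + c + d == X."""
    # Hash all pair sums of C and D: sum -> list of (c, d) pairs producing it.
    pairs_cd = defaultdict(list)
    for c, d in product(C, D):
        pairs_cd[c + d].append((c, d))
    # For every (a, b) pair, look up the complement X - (a + b).
    result = []
    for a, b in product(A, B):
        for c, d in pairs_cd.get(X - a - b, ()):
            result.append((a, b, c, d))
    return result

solutions = four_sum(A, B, C, D, 0)
print(len(solutions))  # 9
```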
Assuming your arrays all have the same length n (+/- some constant value) you can get O(n^3) by using a set for the fourth array:
from itertools import product
ds = set(D)
for a, b, c in product(A, B, C):
    d = X - a - b - c
    if d in ds:
        print(a, b, c, d)
If one or multiple arrays contain (many) extreme values you can also take shortcuts by checking the running sum against the min and max of subsequent arrays to see if X can still be reached. For example:
ds = set(D)
c_min, c_max = min(C), max(C)
d_min, d_max = min(ds), max(ds)
for a in A:
    for b in B:
        s = a + b
        if s + c_min + d_min > X or s + c_max + d_max < X:
            continue  # Shortcut here.
        for c in C:
            d = X - a - b - c
            if d in ds:
                print(a, b, c, d)
You can further extend this by storing solutions that have already been found for a running sum (of the first two arrays for example) and hence taking a shortcut whenever such a sum is encountered again (by reordering with the min/max check one can avoid repeated computation of s + min/max values):
ds = set(D)
c_min, c_max = min(C), max(C)
d_min, d_max = min(ds), max(ds)
shortcuts = {}
for a in A:
    for b in B:
        s = a + b
        if s in shortcuts:
            for c, d in shortcuts[s]:
                print(a, b, c, d)
            continue
        shortcuts[s] = []
        if s + c_min + d_min > X or s + c_max + d_max < X:
            continue
        for c in C:
            d = X - a - b - c
            if d in ds:
                print(a, b, c, d)
                shortcuts[s].append((c, d))
A = [0, 100, -100, 50, 200]
B = [30, 100, 20, 0]
C = [0, 20, -1, 80]
D = [50, 0, -200, 1]
solutions = [(x1, x2, x3, x4) for x1 in A for x2 in B for x3 in C for x4 in D if x1 + x2 + x3 + x4 == 0]
print(solutions)
Output:
[(0, 0, 0, 0), (0, 0, -1, 1), (100, 100, 0, -200), (100, 20, 80, -200), (-100, 30, 20, 50), (-100, 100, 0, 0), (-100, 100, -1, 1), (-100, 20, 80, 0), (200, 0, 0, -200)]
This does exactly what you listed in your steps and works for lists of any size. I don't know whether finding all solutions for different list sizes can get any simpler.
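The same brute-force search can be written for any number of lists with itertools.product, which iterates in the same order as the nested for clauses. A minimal sketch; the function name all_solutions is illustrative and not from the original answer.

```python
from itertools import product

A = [0, 100, -100, 50, 200]
B = [30, 100, 20, 0]
C = [0, 20, -1, 80]
D = [50, 0, -200, 1]

def all_solutions(lists, target):
    """Brute force: try every combination that takes one element from each list."""
    return [combo for combo in product(*lists) if sum(combo) == target]

solutions = all_solutions([A, B, C, D], 0)
print(len(solutions))  # 9
```

This still examines every combination, so the running time grows with the product of the list lengths.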
Find all combinations for an array:
import itertools
def dOfSums(li):
    return {sum(x): x for x in sum([list(itertools.combinations(li, i)) for i in range(2, len(li))], [])}
Find sums for a number in an array:
def findSums(li, num):
    return [(namestr(l), dOfSums(l)[num]) for l in li if num in dOfSums(l)]
Name the array:
def namestr(obj):
    return [name for name in globals() if globals()[name] is obj].pop()
Test:
for el in findSums([A, B, C, D], 50):
    print(el)
('A', (0, 100, -100, 50))
('B', (30, 20, 0))
('D', (50, 0))
for el in findSums([A, B, C, D], 100):
    print(el)
('A', (0, -100, 200))
('B', (100, 0))
('C', (0, 20, 80))
for el in findSums([A, B, C, D], 0):
    print(el)
('A', (0, 100, -100))
I have four separate lists of integers that I need to use concurrently in an equation:
h = [160, 193, 162, 17, 0]
d = [32, 1, 34, 35, 4]
t = [1, 2, 3, 4, 5]
r = [2, 5, 1, 3, 4]
s = h - (d + t + r)
I am trying to create one function to which I can pass each separate list as an argument to use in the function. I want to be able to take the value at each successive index on each list and then use them in the correct place in the equation. I would then take the value of s at each index and populate a new list.
So for example the equation at index[0] should read:
s = 160 - (32 + 1 + 2)
How can I take each integer value from list? I have tried to use the enumerate function and I have read about the * function, but I am not sure that I am supposed to be unpacking the lists - should I not just be iterating over them with a for loop?
def getSingles(h, d, r, t):
    singles = []
    for n, val in enumerate(h):
        hit = val
    for n, val in enumerate(d):
        double = val
    for n, val in enumerate(t):
        triple = val
    for n, val in enumerate(r):
        run = val
I am basically stuck here. Is this even possible? Thank you!
You could zip them together. Something like this should work:
>>> H = [160, 193, 162, 17, 0]
>>> D = [32, 1, 34, 35, 4]
>>> T = [1, 2, 3, 4, 5]
>>> R = [2, 5, 1, 3, 4]
>>>
>>> for h, d, t, r in zip(H, D, T, R):
...     s = h - (d + t + r)
...     print(s)
...
125
185
124
-25
-13
Note that if you're using Python 2.x and using very large lists, you might want to use itertools.izip instead.
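Since the question asks for a function that collects the results in a new list, the zip loop can be wrapped in a list comprehension. A minimal sketch; the name get_singles and the per-element variable names are illustrative, not from the original posts.

```python
def get_singles(h, d, t, r):
    """Return h[i] - (d[i] + t[i] + r[i]) for every index i."""
    # zip stops at the shortest input, so unequal lengths are silently truncated.
    return [hv - (dv + tv + rv) for hv, dv, tv, rv in zip(h, d, t, r)]

h = [160, 193, 162, 17, 0]
d = [32, 1, 34, 35, 4]
t = [1, 2, 3, 4, 5]
r = [2, 5, 1, 3, 4]

singles = get_singles(h, d, t, r)
print(singles)  # [125, 185, 124, -25, -13]
```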
You can use the zip function and a lambda in map (in Python 3, map returns an iterator, so wrap it in list() to see the values):
>>> list(map(lambda x: x[0] - (x[1] + x[2] + x[3]), zip(h, d, t, r)))
[125, 185, 124, -25, -13]
A pandas Series makes this easy:
>>> import pandas as pd
>>> h = pd.Series([160, 193, 162, 17, 0])
>>> d = pd.Series([32, 1, 34, 35, 4])
>>> t = pd.Series([1, 2, 3, 4, 5])
>>> r = pd.Series([2, 5, 1, 3, 4])
>>> s = h - (d + t + r)
>>> s
0 125
1 185
2 124
3 -25
4 -13
dtype: int64
If you have the h, d, t, r data in a CSV file, you can use pandas.read_csv() to read it into a pandas DataFrame. A DataFrame is like an array of Series, and new columns can be calculated in a similar fashion.
I am having some problems resizing a list in Python. I have a vector (A) in which a few of the elements are -9999999. I want to find those elements, remove them, and remove the corresponding elements in B.
I have tried to index the non -9999999 values like this:
i = [i for i in range(len(press)) if press[i] !=-9999999]
But I get an error when I try to use the index to reshape press and my other vector:
TypeError: list indices must be integers, not list
The vectors have a length of about 26000.
Basically, given the vectors A and B below, I want to remove the -9999999 elements from A and the corresponding elements 65 and 32 from B.
A = [33,55,-9999999,44,78,22,-9999999,10,34]
B = [22,33,65,87,43,87,32,77,99]
Since you mentioned vectors, I think you're looking for a NumPy-based solution:
>>> import numpy as np
>>> a = np.array(A)
>>> b = np.array(B)
>>> b[a!=-9999999]
array([22, 33, 87, 43, 87, 77, 99])
Pure Python solution using itertools.compress:
>>> from itertools import compress
>>> list(compress(B, (x != -9999999 for x in A)))
[22, 33, 87, 43, 87, 77, 99]
Timing comparisons:
>>> A = [33,55,-9999999,44,78,22,-9999999,10,34]*10000
>>> B = [22,33,65,87,43,87,32,77,99]*10000
>>> a = np.array(A)
>>> b = np.array(B)
>>> %timeit b[a!=-9999999]
100 loops, best of 3: 2.78 ms per loop
>>> %timeit list(compress(B, (x != -9999999 for x in A)))
10 loops, best of 3: 22.3 ms per loop
A = [33,55,-9999999,44,78,22,-9999999,10,34]
B = [22,33,65,87,43,87,32,77,99]
A1, B1 = (list(x) for x in zip(*((a, b) for a, b in zip(A, B) if a != -9999999)))
print(A1)
print(B1)
This yields:
[33, 55, 44, 78, 22, 10, 34]
[22, 33, 87, 43, 87, 77, 99]
c = [j for i, j in zip(A, B) if i != -9999999]
zip pairs up the elements of the two lists as (x, y) tuples. The list comprehension then filters out the pairs whose element from A is -9999999.
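The index-list attempt from the question can also be made to work. The TypeError arises because a plain Python list cannot be indexed with another list (NumPy arrays can); with lists, each kept index has to be applied one at a time. A minimal sketch, in which the names SENTINEL, keep, A_clean and B_clean are illustrative:

```python
SENTINEL = -9999999  # placeholder value marking the entries to drop

A = [33, 55, -9999999, 44, 78, 22, -9999999, 10, 34]
B = [22, 33, 65, 87, 43, 87, 32, 77, 99]

# Positions in A that hold real data.
keep = [i for i, value in enumerate(A) if value != SENTINEL]

# A[keep] would raise TypeError for a list; index element by element instead.
A_clean = [A[i] for i in keep]
B_clean = [B[i] for i in keep]

print(A_clean)  # [33, 55, 44, 78, 22, 10, 34]
print(B_clean)  # [22, 33, 87, 43, 87, 77, 99]
```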
I'm new to Python and I'm writing a program for matrices, but I don't know how to get the right output and I need help with it.
This is the question: given an n x n matrix A and a k x n matrix B, find AB.
Here is what I have so far. Thank you in advance.
def matrixmult(A, B):
    rows_A = len(A)
    cols_A = len(A[0])
    rows_B = len(B)
    cols_B = len(B[0])
    if cols_A != rows_B:
        print("Cannot multiply the two matrices. Incorrect dimensions.")
        return
    # Create the result matrix
    # Dimensions would be rows_A x cols_B
    C = [[0 for row in range(cols_B)] for col in range(rows_A)]
    print(C)
    for i in range(rows_A):
        for j in range(cols_B):
            for k in range(cols_A):
                C[i][j] += A[i][k]*B[k][j]
    return C
Your function:
def matrixmult(A, B):
    rows_A = len(A)
    cols_A = len(A[0])
    rows_B = len(B)
    cols_B = len(B[0])
    if cols_A != rows_B:
        print("Cannot multiply the two matrices. Incorrect dimensions.")
        return
    # Create the result matrix
    # Dimensions would be rows_A x cols_B
    C = [[0 for row in range(cols_B)] for col in range(rows_A)]
    print(C)
    for i in range(rows_A):
        for j in range(cols_B):
            for k in range(cols_A):
                C[i][j] += A[i][k]*B[k][j]
    return C
Which appears to be the same as this function.
If I run this:
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(matrixmult(matrix, matrix))  # that is your function...
It returns:
[[30, 36, 42], [66, 81, 96], [102, 126, 150]]
This is the same as NumPy:
import numpy as np
a = np.array(matrix)
b = np.array(matrix)
print(np.dot(a, b))
# [[ 30  36  42]
#  [ 66  81  96]
#  [102 126 150]]
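And the same as matrix multiplication stated more tersely:
def mult(mtx_a, mtx_b):
    # Materialize the columns of mtx_b once; in Python 3 a bare zip() is a
    # one-shot iterator and would be exhausted after the first row.
    tpos_b = list(zip(*mtx_b))
    rtn = [[sum(ea*eb for ea, eb in zip(a, b)) for b in tpos_b] for a in mtx_a]
    return rtn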
So -- it is probably your input data that is the issue.
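Use the NumPy library to solve your problem. Note that * on NumPy arrays multiplies element by element; for the matrix product use np.dot (or the @ operator in Python 3.5+).
import numpy as np
x = np.array(((2, 3), (3, 5)))
y = np.array(((1, 2), (5, -1)))
print(x * y)  # element-wise product, not matrix multiplication
# [[ 2  6]
#  [15 -5]]
print(np.dot(x, y))  # matrix product
# [[17  1]
#  [28  1]]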
More examples:
http://www.python-course.eu/matrix_arithmetic.php
Download NumPy:
http://scipy.org/Download
One liner:
def matrixmult(m1, m2):
    return [
        [sum(x * y for x, y in zip(m1_r, m2_c)) for m2_c in zip(*m2)] for m1_r in m1
    ]
Explanation:
zip(*m2) - yields the columns of the second matrix
zip(m1_r, m2_c) - pairs up the elements of an m1 row and an m2 column
sum(...) - sums the element-wise products of that row and column
Test:
m1 = [[1, 2, 3], [4, 5, 6]]
m2 = [[7, 8], [9, 10], [11, 12]]
result = matrixmult(m1, m2)
assert result == [[58, 64], [139, 154]]
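The one-liner above performs no dimension check, and the loop version in the question only prints a message and returns None. A minimal sketch that combines the zip-based product with an explicit check; the name matmul_checked and the choice to raise ValueError are assumptions made for illustration, not part of the original answers.

```python
def matmul_checked(a, b):
    """Multiply two matrices given as lists of rows, or raise ValueError."""
    # The product is defined only when the columns of a match the rows of b.
    if not a or not b or len(a[0]) != len(b):
        raise ValueError("cannot multiply: columns of a must equal rows of b")
    cols_b = list(zip(*b))  # columns of b, materialized once
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

print(matmul_checked([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]]))
# [[58, 64], [139, 154]]

try:
    matmul_checked([[1, 2]], [[1, 2]])  # 1x2 times 1x2 is not defined
except ValueError as exc:
    print(exc)
```

Raising an exception keeps a dimension mismatch from silently turning into a None result further down the program.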