How can I do some matrix addition in rethinkDB? - python

So essentially I have this variable question[1]
where question[1] is: [[1, 0, 0], [1, 0, 0], [0,1,0] ...]
I want to be able to add them vertically so I get one array like so
[1,0,0]+[1,0,0]=[2,0,0] + [0,1,0] = [2,1,0] + ....
Additionally, the arrays might be longer or shorter (but will be at least two long)
How could I do this?
The API Doc has the following example:
sequence1 = [100, 200, 300, 400]
sequence2 = [10, 20, 30, 40]
sequence3 = [1, 2, 3, 4]
r.map(sequence1, sequence2, sequence3,
      lambda val1, val2, val3: (val1 + val2 + val3)).run(conn)
with result:
[111, 222, 333, 444]
But this won't account for a variable amount of inputs as I want. Answer in python please!

From #mglukov
r.expr([[100, 200, 300, 400],[10, 20, 30, 40],[1, 2, 3, 4]]).reduce((left,right) => {
  return left.map(right, (leftVal, rightVal) => { return leftVal.add(rightVal); });
})
Good question!
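The answer above is written in JavaScript. As a rough illustration of the same reduce-then-map idea in Python, here is a minimal sketch in plain Python (not ReQL, so it runs on the client after the rows have been fetched). The helper name add_rows is hypothetical, and all rows are assumed to have the same length:

```python
from functools import reduce


def add_rows(rows):
    """Return the element-wise sum of any number of equal-length rows."""
    # reduce() folds the rows pairwise; zip() pairs up the values column by column.
    return reduce(
        lambda left, right: [l + r for l, r in zip(left, right)],
        rows,
    )


print(add_rows([[1, 0, 0], [1, 0, 0], [0, 1, 0]]))
print(add_rows([[100, 200, 300, 400], [10, 20, 30, 40], [1, 2, 3, 4]]))
```

Because the fold works on whatever number of rows it is given, it does not need a fixed number of arguments the way the three-sequence r.map example does.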

Related

Clustering a list with nearest values without sorting

I have a list like this
tst = [1,3,4,6,8,22,24,25,26,67,68,70,72,0,0,0,0,0,0,0,4,5,6,36,38,36,31]
I want to group the elements from above list into separate groups/lists based on the difference between the consecutive elements in the list (differing by 1 or 2 or 3).
I have tried following code
def slice_when(predicate, iterable):
    i, x, size = 0, 0, len(iterable)
    while i < size-1:
        if predicate(iterable[i], iterable[i+1]):
            yield iterable[x:i+1]
            x = i + 1
        i += 1
    yield iterable[x:size]
tst = [1,3,4,6,8,22,24,25,26,67,68,70,72,0,0,0,0,0,0,0,4,5,6,36,38,36,31]
slices = slice_when(lambda x,y: (y - x > 2), tst)
whola=(list(slices))
I got these results:
[[1, 3, 4, 6, 8], [22, 24, 25, 26], [67, 68, 70, 72, 0, 0, 0, 0, 0, 0, 0], [4, 5, 6], [36, 38, 36, 31]]
The third list does not separate the run of zeros into its own list. Any kind of help is highly appreciated. Thank you.
I guess this is what you want?
tst = [1,3,4,6,8,22,24,25,26,67,68,70,72,0,0,0,0,0,0,0,4,5,6,36,38,36,31]
slices = slice_when(lambda x,y: (abs(y - x) > 2), tst) # Use abs!
whola=(list(slices))
print(whola)
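The reason abs is needed: going from 72 to 0 gives y - x = -72, which is not greater than 2, so the original predicate never splits on a downward jump. A self-contained sketch of the whole thing follows; this slice_when is a rewritten variant with the same behaviour as the one in the question, not the asker's exact code:

```python
def slice_when(predicate, iterable):
    """Yield consecutive slices, starting a new one whenever
    predicate(previous, current) is true."""
    items = list(iterable)
    start = 0
    for i in range(1, len(items)):
        if predicate(items[i - 1], items[i]):
            yield items[start:i]
            start = i
    yield items[start:]


tst = [1, 3, 4, 6, 8, 22, 24, 25, 26, 67, 68, 70, 72,
       0, 0, 0, 0, 0, 0, 0, 4, 5, 6, 36, 38, 36, 31]

# Split wherever neighbours differ by more than 2 in either direction.
groups = list(slice_when(lambda x, y: abs(y - x) > 2, tst))
print(groups)
```

With this predicate the zeros end up in their own group, and the final 31 is also split off from 36 because the two differ by 5.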

How to add padding in a dataset to fill up to 50 items in a list and replace NaN with 0?

I have the following encoded text column in my dataset:
[182, 4]
[14, 2, 31, 42, 72]
[362, 685, 2, 399, 21, 16, 684, 682, 35, 7, 12]
Somehow I want this column to be filled up to 50 items on each row, assuming no row is larger than 50 items. And where there is no numeric value I want a 0 to be placed.
In the example the wanted outcome would be:
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,182, 4]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,14, 2, 31, 42, 72]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,362, 685, 2, 399, 21, 16, 684, 682, 35, 7, 12]
Try this:
>>> y=[182,4]
>>> ([0]*(50-len(y))+y)
Assuming you parsed the lists from the string columns already, a very basic approach could be as follows:
a = [182, 4]
b = [182, 4, 'q']
def check_numeric(element):
    # assuming only integers are valid numeric values
    try:
        element = int(element)
    except ValueError:
        element = 0
    return element
def replace_nonnumeric(your_list):
    return [check_numeric(element) for element in your_list]
# change the desired length to your needs (change 15 to 50)
def fill_zeros(your_list, desired_length=15):
    prepend = (desired_length - len(your_list)) * [0]
    result = prepend + your_list
    return result
aa = replace_nonnumeric(a)
print(fill_zeros(aa))
bb = replace_nonnumeric(b)
print(fill_zeros(bb))
This code outputs:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 182, 4] # <-- aa
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 182, 4, 0] # <-- bb
However, I suggest using this code as a basis and adapting it to your needs.
Especially when parsing a lot of entries from the "list as strings" column, writing a parsing function and calling it via pandas' .apply() would be a nice approach.
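Such a parsing function could look roughly like the sketch below. It assumes each cell holds the list as a string such as "[182, 4]"; the name parse_and_pad and the default length of 50 are illustrative choices, not something from the question's dataset:

```python
import ast


def parse_and_pad(cell, desired_length=50):
    """Parse a list stored as a string, replace non-numeric entries
    with 0, and left-pad the result with zeros to desired_length."""
    values = ast.literal_eval(cell)  # safely turns "[182, 4]" into [182, 4]
    cleaned = []
    for value in values:
        try:
            cleaned.append(int(value))
        except (TypeError, ValueError):
            cleaned.append(0)
    # A negative repeat count yields an empty list, so longer rows are left as they are.
    return [0] * (desired_length - len(cleaned)) + cleaned


print(parse_and_pad("[182, 4]", desired_length=5))
print(parse_and_pad("[182, 4, 'q']", desired_length=5))
```

With pandas this would be applied per row, for example df["encoded"].apply(parse_and_pad), where the column name "encoded" is a placeholder.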

How can I multiply list items in a dict with another list in Python

I have a dictionary with player names and their points, and what I need to do is multiply each list item with coefficients from another list resulting in a new array with multiplied points:
points = {mark : [650, 400, 221, 0, 3], bob : ([240, 300, 5, 0, 0], [590, 333, 20, 30, 0]), james : [789, 201, 0, 0, 1]}
coefficients = [5, 4, 3, 2, 1]
So for example for Mark:
player_points = [650*5, 400*4, 221*3, 0*2, 3*1]
And for Bob:
player_points = [240*5, 300*4, 5*3, 0*2, 0*1], [590*5, 333*4, 20*3, 30*2, 0*1]
What I tried was the following but it didn't work whatsoever:
def calculate_points(points, coefficients):
    i = 0
    for coefficient in coefficients:
        player_points = coefficient * points[i]
        i += 1
    return player_points
def main():
    points = {"mark": [650, 400, 221, 0, 3],
              "bob": ([240, 300, 5, 0, 0], [590, 333, 20, 30, 0]),
              "james": [789, 201, 0, 0, 1]}
    coefficients = [5, 4, 3, 2, 1]
    player_points = calculate_points(points, coefficients)
    print(player_points)
main()
For list multiplication you can do
player_point = [i*j for i,j in zip(points['mark'], coefficients)]
So if you want a player_points dictionary (this assumes every value is a single flat list, which is not the case for bob in your data):
player_points = {}
for name in points.keys():
    player_points[name] = [i*j for i,j in zip(points[name], coefficients)]
Here is code that works using a for loop:
points = {"mark" : [650, 400, 221, 0, 3], "bob" : [240, 300, 5, 0, 0],"joe" : [590, 333, 20, 30, 0], "james" : [789, 201, 0, 0, 1]}
coefficients = [5, 4, 3, 2, 1]
for element in points:
    player_points= []
    for i in range(len(points.get(element))):
        player_points.append(points.get(element)[i]*coefficients[i])
    print(player_points)
This will give the output of
[3250,1600,663,0,3]
[1200,1200,15,0,0]
[2950,1332,60,60,0]
[3945,804,0,0,1]
Your data structure is irregular, which makes processing it much harder than it needs to be. If all the dictionary values were tuples, a simple dictionary comprehension could be used. As it is, you sometimes have a list and sometimes a tuple of lists, which requires the code to detect the type and handle each case.
Here's how it would work if the structure was consistent (i.e. tuples for all values)
points = { "mark"  : ([650, 400, 221, 0, 3],),
           "bob"   : ([240, 300, 5, 0, 0], [590, 333, 20, 30, 0]),
           "james" : ([789, 201, 0, 0, 1],)
         }
coefficients = [5, 4, 3, 2, 1]
player_points = { pl:tuple([p*c for p,c in zip(pt,coefficients)] for pt in pts)
                  for pl,pts in points.items() }
print(player_points)
{
'mark' : ([3250, 1600, 663, 0, 3],),
'bob' : ([1200, 1200, 15, 0, 0], [2950, 1332, 60, 60, 0]),
'james': ([3945, 804, 0, 0, 1],)
}
If you don't want to adjust your structure, you'll need a function that handles the inconsistency:
points = { "mark" : [650, 400, 221, 0, 3],
"bob" : ([240, 300, 5, 0, 0], [590, 333, 20, 30, 0]),
"james" : [789, 201, 0, 0, 1]
}
coefficients = [5, 4, 3, 2, 1]
def applyCoeffs(pts,coeffs):
    if isinstance(pts,list):
        return [p*c for p,c in zip(pts,coeffs)]
    else:
        return tuple(applyCoeffs(pt,coeffs) for pt in pts)
player_points = { pl: applyCoeffs(pts,coefficients) for pl,pts in points.items() }
print(player_points)
{
'mark' : [3250, 1600, 663, 0, 3],
'bob' : ([1200, 1200, 15, 0, 0], [2950, 1332, 60, 60, 0]),
'james': [3945, 804, 0, 0, 1]
}

How to convert python dict to 3D numpy array?

I have a python dict with n_keys where each value is a 2D array (dim1,dim2).
I want to transfer this into a 3D numpy array of (dim1,dim2,n_keys).
How can I do it fast without a lot of nested loops?
EDIT:
Example:
featureMatrix = np.empty((len(featureDict.values()[0]),
                          len(featureDict.values()[0][0,:]),
                          len(featureDict.keys())))
for k,keys in enumerate(featureDict.keys()):
    value=featureDict[keys]
    for i in range(0,len(value[:,0]),1):
        for j in range(0,len(value[0,:]),1):
            featureMatrix[i,j,k]=value[i,j]
Dictionaries are unordered (before Python 3.7; since then they keep insertion order), so you probably don't want to simply stack them, but you can stack the values nevertheless with array3d = np.dstack(list(somedict.values())). The list() call matters on Python 3, where values() returns a view rather than a list and NumPy expects a sequence such as a list or tuple.
Here is an example case:
>>> somedict = dict(a = np.arange(4).reshape(2,2),
...                 b = np.arange(4).reshape(2,2) + 10,
...                 c = np.arange(4).reshape(2,2) + 100,
...                 d = np.arange(4).reshape(2,2) + 1000)
>>> array3d = np.dstack(list(somedict.values()))
>>> array3d.shape
(2, 2, 4)
>>> array3d # on Python versions before 3.7 the order is arbitrary because dicts are unordered there
array([[[  10,    0, 1000,  100],
        [  11,    1, 1001,  101]],

       [[  12,    2, 1002,  102],
        [  13,    3, 1003,  103]]])
or, in case you want to stack it sorted by the key of the dictionary (passing a list rather than a generator, since NumPy expects a sequence):
>>> array3d = np.dstack([somedict[i] for i in sorted(somedict.keys())])
>>> array3d # sorted by the keys!
array([[[   0,   10,  100, 1000],
        [   1,   11,  101, 1001]],

       [[   2,   12,  102, 1002],
        [   3,   13,  103, 1003]]])
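For reference, here is a small self-contained sketch of the sorted-by-key variant that also keeps track of which key ended up at which index along the last axis. It uses np.stack with axis=-1, which for 2D inputs gives the same result as np.dstack; the dictionary contents and the name feature_dict are made up for illustration:

```python
import numpy as np

# Illustrative data: three keys, each mapping to a 2x2 array.
feature_dict = {
    "b": np.arange(4).reshape(2, 2) + 10,
    "a": np.arange(4).reshape(2, 2),
    "c": np.arange(4).reshape(2, 2) + 100,
}

# Fix the key order explicitly so index k along the last axis is reproducible.
keys = sorted(feature_dict)
array3d = np.stack([feature_dict[k] for k in keys], axis=-1)

print(array3d.shape)                    # (dim1, dim2, n_keys)
print(array3d[:, :, keys.index("b")])   # the 2D array that belonged to key "b"
```

Keeping the keys list around makes it possible to map a slice of the 3D array back to its dictionary key later.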

Python - Print array of array as table

Given an array of arrays A defined as
A = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]],
if the print function is called
for i in range(0,3):
    print(A[i])
the following is the output
[1, 2, 3, 4]
[10, 20, 30, 40]
[100, 200, 300, 400].
How can I get a "prettier" output like this:
[  1,   2,   3,   4]
[ 10,  20,  30,  40]
[100, 200, 300, 400]
???
Thank you
All you need to know is the maximum number of digits that there could be. If the number of digits is three as in your example, do this:
for i in A:
    print(", ".join([str(l).rjust(3) for l in i]))
Using str(l).rjust(3) puts l right-justified in a field of width 3 where the extra characters are spaces. You could make them zeros with str(l).zfill(3), or you could make them anything you want with str(l).rjust(3, "&") for example.
Output:
  1,   2,   3,   4
 10,  20,  30,  40
100, 200, 300, 400
Of course, to make it applicable for more situations, you could use len(str(max(map(max, A)))) instead of hardcoding the 3.
This code is more dynamic: it finds the length of the longest number and right-justifies every entry to max_len.
A = [[1, 2, 3, 4], [10, 20, 30, 40], [11100, 20033, 300, 400]]
max_len = len(str(max( max(i) for i in A)))
for i in A:
    print(", ".join([str(l).rjust(max_len) for l in i]))
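To get the bracketed layout shown in the question, the same width computation can be combined with a format specification. This is only a sketch; format_table is a hypothetical helper name, and the width is taken over the string form of every value so that negative numbers are covered as well:

```python
def format_table(rows):
    """Return one formatted string per row, with every value
    right-aligned to the width of the longest value."""
    width = max(len(str(value)) for row in rows for value in row)
    return [
        "[" + ", ".join(f"{value:>{width}}" for value in row) + "]"
        for row in rows
    ]


A = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
for line in format_table(A):
    print(line)
```

Returning the lines instead of printing them directly makes the helper easy to reuse, for instance when writing the table to a file.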
Why not use a NumPy array for this?
import numpy as np
print(np.array(A))
[[  1   2   3   4]
 [ 10  20  30  40]
 [100 200 300 400]]
You can't have those spaces in there if the record is an integer, but this code seems to split up the array nicely.
A = [1999999, 2, 3, 47678], [10, 20, 30, 40], [100, 200, 300, 400]
MaxLength=0
for i in range(0,3):
    for x in A[i]:
        MaxLength =len(str(x)) if MaxLength<len(str(x)) else MaxLength
for i in range(0,3):
    for x in range(0,len(A[i])):
        Length=MaxLength-len(str(A[i][x]))
        print((" "*Length)+str(A[i][x]),end="|")
    print()
If you want, you can wrap this in a function (using len(A) instead of the hardcoded 3 so that it works for any number of rows):
def TableFormat(A):
    MaxLength=0
    for i in range(0,len(A)):
        for x in A[i]:
            MaxLength =len(str(x)) if MaxLength<len(str(x)) else MaxLength
    for i in range(0,len(A)):
        for x in range(0,len(A[i])):
            Length=MaxLength-len(str(A[i][x]))
            print((" "*Length)+str(A[i][x]),end="|")
        print()
Then you can print the table neatly by calling TableFormat(A) with A as the array. The end="|" can be swapped for anything you want to separate the records with.
The easiest way is to use two for loops:
for row in A:
    for element in row:
        print(element, end='\t')
    print()
Try this.
