Converting a list into a matrix in Python

If you have a list of elements, let's say:
res =
['(18,430)', '(19,430)', '(19,429)', '(19,428)', '(19,427)', '(18,426)', '(17,426)', '(17,425)', '(17,424)', '(17,423)', '(17,422)', '(17,421)', '(17,420)', '(16,421)', '(14,420)', '(11,419)', '(9,417)', '(7,416)', '(4,414)', '(3,414)', '(2,412)', '(1,412)', '(-1,410)', '(-2,409)', '(-2,408)', '(-3,407)', '(-3,406)', '(-3,405)', '(-3,404)', '(-3,403)', '(-3,402)', '(-3,401)', '(-3,400)', '(-4,399)', '(-4,398)', '(-5,398)', '(-6,398)', '(-7,397)', '(-7,396)', '(-6,395)', '(-5,395)', '(-4,393)', '(-3,391)', '(6,384)', '(12,378)', '(24,370)', '(42,358)', '(107,304)', '(151,255)', '(207,196)', '(259,121)', '(389,-28)', '(456,-84)', '(515,-134)', '(569,-182)', '(650,-260)', '(688,-294)', '(723,-317)', '(740,-328)', '(762,-342)', '(767,-347)', '(768,-349)', '(769,-352)', '(769,-357)', '(769,-359)', '(768,-361)', '(768,-364)', '(766,-370)', '(765,-371)', '(764,-374)', '(763,-376)', '(761,-378)', '(760,-381)', '(758,-385)', '(752,-394)', '(747,-401)', '(742,-407)', '(735,-413)', '(724,-421)', '(719,-424)', '(718,-425)', '(717,-425)'], ['(18,430)', '(19,430)', '(19,429)', '(19,428)', '(19,427)', '(18,426)', '(17,426)', '(17,425)', '(17,424)', '(17,423)', '(17,422)', '(17,421)', '(17,420)', '(16,421)', '(14,420)', '(11,419)', '(9,417)', '(7,416)', '(4,414)', '(3,414)', '(2,412)', '(1,412)', '(-1,410)', '(-2,409)', '(-2,408)', '(-3,407)', '(-3,406)', '(-3,405)', '(-3,404)', '(-3,403)', '(-3,402)', '(-3,401)', '(-3,400)', '(-4,399)', '(-4,398)', '(-5,398)', '(-6,398)', '(-7,397)', '(-7,396)', '(-6,395)', '(-5,395)', '(-4,393)', '(-3,391)', '(6,384)', '(12,378)', '(24,370)', '(42,358)', '(107,304)', '(151,255)', '(207,196)', '(259,121)', '(389,-28)', '(456,-84)', '(515,-134)', '(569,-182)', '(650,-260)', '(688,-294)', '(723,-317)', '(740,-328)', '(762,-342)', '(767,-347)', '(768,-349)', '(769,-352)', '(769,-357)', '(769,-359)', '(768,-361)', '(768,-364)', '(766,-370)', '(765,-371)', '(764,-374)', '(763,-376)', '(761,-378)', '(760,-381)', '(758,-385)', '(752,-394)', 
'(747,-401)', '(742,-407)', '(735,-413)', '(724,-421)', '(719,-424)', '(718,-425)', '(717,-425)']
and we want to turn these values into a matrix whose cells we can update.
Every value in the list should label both a row and a column of the matrix.
Basically:
row1 = '(18,430)', row2 = '(19,430)', row3 = '(19,429)', ..., rown = '(717,-425)', column1 = '(18,430)', column2 = '(19,430)', column3 = '(19,429)', ..., columnn = '(717,-425)'
How can we do that in Python so that I can later update values in the rows and columns? I tried repeating the list and turning it into a matrix,
but it does not give me what I want:
import numpy as np
Res_List = [res, res]
print(np.array(Res_List))
So I am still wondering how to do this in Python.
I also tried:
mat = np.array([res,res]).T
print(mat)
and it gives me something close to what I want, but not quite.
This gives me:
[['(18,430)' '(18,430)']
['(19,430)' '(19,430)']
['(19,429)' '(19,429)']
['(19,428)' '(19,428)']
['(19,427)' '(19,427)']
['(18,426)' '(18,426)']
['(17,426)' '(17,426)']
['(17,425)' '(17,425)']
['(17,424)' '(17,424)']
['(17,423)' '(17,423)']
['(17,422)' '(17,422)']
['(17,421)' '(17,421)']
['(17,420)' '(17,420)']
['(16,421)' '(16,421)']
['(14,420)' '(14,420)']
['(11,419)' '(11,419)']
['(9,417)' '(9,417)']
['(7,416)' '(7,416)']
['(4,414)' '(4,414)']
['(3,414)' '(3,414)']
['(2,412)' '(2,412)']
['(1,412)' '(1,412)']
['(-1,410)' '(-1,410)']
['(-2,409)' '(-2,409)']
['(-2,408)' '(-2,408)']
['(-3,407)' '(-3,407)']
['(-3,406)' '(-3,406)']
['(-3,405)' '(-3,405)']
['(-3,404)' '(-3,404)']
['(-3,403)' '(-3,403)']
['(-3,402)' '(-3,402)']
['(-3,401)' '(-3,401)']
['(-3,400)' '(-3,400)']
['(-4,399)' '(-4,399)']
['(-4,398)' '(-4,398)']
['(-5,398)' '(-5,398)']
['(-6,398)' '(-6,398)']
['(-7,397)' '(-7,397)']
['(-7,396)' '(-7,396)']
['(-6,395)' '(-6,395)']
['(-5,395)' '(-5,395)']
['(-4,393)' '(-4,393)']
['(-3,391)' '(-3,391)']
['(6,384)' '(6,384)']
['(12,378)' '(12,378)']
['(24,370)' '(24,370)']
['(42,358)' '(42,358)']
['(107,304)' '(107,304)']
['(151,255)' '(151,255)']
['(207,196)' '(207,196)']
['(259,121)' '(259,121)']
['(389,-28)' '(389,-28)']
['(456,-84)' '(456,-84)']
['(515,-134)' '(515,-134)']
['(569,-182)' '(569,-182)']
['(650,-260)' '(650,-260)']
['(688,-294)' '(688,-294)']
['(723,-317)' '(723,-317)']
['(740,-328)' '(740,-328)']
['(762,-342)' '(762,-342)']
['(767,-347)' '(767,-347)']
['(768,-349)' '(768,-349)']
['(769,-352)' '(769,-352)']
['(769,-357)' '(769,-357)']
['(769,-359)' '(769,-359)']
['(768,-361)' '(768,-361)']
['(768,-364)' '(768,-364)']
['(766,-370)' '(766,-370)']
['(765,-371)' '(765,-371)']
['(764,-374)' '(764,-374)']
['(763,-376)' '(763,-376)']
['(761,-378)' '(761,-378)']
['(760,-381)' '(760,-381)']
['(758,-385)' '(758,-385)']
['(752,-394)' '(752,-394)']
['(747,-401)' '(747,-401)']
['(742,-407)' '(742,-407)']
['(735,-413)' '(735,-413)']
['(724,-421)' '(724,-421)']
['(719,-424)' '(719,-424)']
['(718,-425)' '(718,-425)']
['(717,-425)' '(717,-425)']]
but what I want is to keep the columns as they are, have the rows be the same as the columns, and be able to update and put values into the matrix.

Maybe what you want is a dict of dicts:
matrix = {
    k: {l: 0 for l in res}
    for k in res
}
All the values are initialized to 0.
You can easily update values in matrix; for example, you can increase the value of a 'cell' by one:
matrix['(18,430)']['(19,430)'] += 1
or set it to a specific value:
matrix['(18,430)']['(19,430)'] = 10
and retrieve it:
val = matrix['(18,430)']['(19,430)']

You can use NumPy.
To convert a list into a matrix-like array, it should be a list of lists (or tuples). Your list contains strings, so first convert the strings to tuples as follows:
new_list = [eval(i) for i in res]
I used eval because your strings are written in tuple form, so Python can evaluate them as code. Note that eval executes arbitrary code, so only use it on input you trust.
Then convert new_list to an array:
import numpy as np
matrix = np.array(new_list)
Now you can access the matrix elements as matrix[i, j], where i and j are the row and column indices. To change the value at a given location, just assign to it as usual:
matrix[i, j] = new_value
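If the goal is a square matrix with one row and one column per entry of res, a minimal sketch could look like the following. It parses the strings with ast.literal_eval instead of eval, assumes the labels are unique, and uses a short sample in place of the full res. The names labels, index, and set_cell are hypothetical and introduced only for illustration.

```python
import ast
import numpy as np

# Short sample standing in for the full `res` list (labels assumed unique).
res = ['(18,430)', '(19,430)', '(19,429)']

# Parse each string into a tuple of ints without executing arbitrary code.
labels = [ast.literal_eval(s) for s in res]

# Map each label to its row/column position.
index = {label: pos for pos, label in enumerate(labels)}

# Square matrix of zeros: one row and one column per label.
matrix = np.zeros((len(labels), len(labels)))

def set_cell(row_label, col_label, value):
    """Hypothetical helper: write `value` at the cell addressed by two labels."""
    matrix[index[row_label], index[col_label]] = value

set_cell((18, 430), (19, 430), 10)
matrix[index[(19, 429)], index[(18, 430)]] += 1
print(matrix)
```

Keeping the label-to-position mapping in a separate dict leaves the matrix purely numeric, so the usual NumPy operations still apply to it.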

Related

Sort given 2d array in order of ascending by python3.x

I am a Python beginner; please help me with this case.
Flatten the given 2D array and sort it in ascending order, so that the first sort key is the first number and the second sort key is the second number. Example:
input_arr = [
['55-29', '55-32', '62-3', '84-38'],
['36-84', '23-53', '22-58', '48-15'],
['72-80', '48-6', '11-86', '73-23'],
['93-51', '55-11', '93-49', '72-10'],
['93-66', '71-32', '16-75', '55-9']
]
output_arr = ['11-86', '16-75', '22-58', '23-53', '36-84', '48-6', '48-15', '55-9', '55-11', '55-29', '55-32', '62-3', '71-32', '72-10', '72-80', '73-23', '84-38', '93-49', '93-51', '93-66']
def sort_2d_array(input_arr=input_arr) -> list:
    # TODO
    pass
Try this.
First flatten your list of lists into a single list:
tmp = [t for x in input_arr for t in x]
Then sort on the number before the -:
print(sorted(tmp, key=lambda x: int(x.split('-')[0])))
This sorts by the first number only, so entries with the same first number keep their original relative order:
['11-86', '16-75', '22-58', '23-53', '36-84', '48-15', '48-6', '55-29', '55-32', '55-11', '55-9', '62-3', '71-32', '72-80', '72-10', '73-23', '84-38', '93-51', '93-49', '93-66']
For the second condition, sort on both numbers:
print(sorted(tmp, key=lambda x: (int(x.split('-')[0]), int(x.split('-')[1]))))
This gives the desired output:
['11-86', '16-75', '22-58', '23-53', '36-84', '48-6', '48-15', '55-9', '55-11', '55-29', '55-32', '62-3', '71-32', '72-10', '72-80', '73-23', '84-38', '93-49', '93-51', '93-66']
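Putting the two steps together, the function from the question could be filled in roughly as follows. This is a sketch that assumes every entry has the form 'a-b' with non-negative integers; the default argument from the original stub is dropped here.

```python
def sort_2d_array(input_arr):
    """Flatten a list of lists of 'a-b' strings and sort by (a, b) numerically."""
    flat = [item for row in input_arr for item in row]
    # '48-6' becomes the tuple (48, 6); tuples compare number by number.
    return sorted(flat, key=lambda s: tuple(map(int, s.split('-'))))

input_arr = [
    ['55-29', '55-32', '62-3', '84-38'],
    ['36-84', '23-53', '22-58', '48-15'],
    ['72-80', '48-6', '11-86', '73-23'],
    ['93-51', '55-11', '93-49', '72-10'],
    ['93-66', '71-32', '16-75', '55-9'],
]
result = sort_2d_array(input_arr)
print(result)
```

Converting both parts to int matters: comparing the raw strings would place '48-15' before '48-6'.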

Python - masking in a for loop?

I have three arrays, r_vals, Tgas_vals, and n_vals. They are all NumPy arrays of shape (9998,). The arrays have repeated values, and I want to iterate over the unique values of r_vals and find the corresponding values of Tgas_vals and n_vals, so I can use the last two arrays to calculate the weighted average. This is what I have right now:
def calc_weighted_average(r_vals, Tgas_vals, n_vals):
    for r in r_vals:
        mask = r == r_vals
        count = 0
        count += 1
        for t in Tgas_vals[mask]:
            print(count, np.average(Tgas_vals[mask]*n_vals[mask]))
weighted_average = calc_weighted_average(r_vals, Tgas_vals, n_vals)
The problem I am running into is that the function is only looping through once. Did I implement mask incorrectly, or is the problem somewhere else in the for loop?
I'm not sure exactly what you plan to do with all the averages, so I'll toss this out there and see if it's helpful. The following code calculates one weighted average per unique value of r_vals and stores them in a dictionary (which is then printed out).
def calc_weighted_average(r_vals, z_vals, Tgas_vals, n_vals):
    weighted_vals = {}  # maps each r value to its weighted average
    for r in np.unique(r_vals):
        mask = r_vals == r  # boolean mask selecting the entries equal to r
        weighted_vals[r] = np.average(Tgas_vals[mask] * n_vals[mask])
    return weighted_vals
weighted_averages = calc_weighted_average(r_vals, z_vals, Tgas_vals, n_vals)
for rval in weighted_averages:
    print('%i : %0.4f' % (rval, weighted_averages[rval]))  # assuming rval is an integer
Alternatively, you may want to factor in z_vals somehow; your question was not clear on this.
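Note that np.average(Tgas_vals[mask] * n_vals[mask]) is the plain mean of the products. If the intent is an average of Tgas_vals weighted by n_vals (an assumption, since the question does not spell out the formula), np.average accepts a weights argument. A minimal sketch with made-up sample data, where weighted_average_by_r is a hypothetical name:

```python
import numpy as np

def weighted_average_by_r(r_vals, Tgas_vals, n_vals):
    """Return {r: sum(T * n) / sum(n)} over the entries sharing each unique r."""
    result = {}
    for r in np.unique(r_vals):
        mask = r_vals == r
        result[float(r)] = float(np.average(Tgas_vals[mask], weights=n_vals[mask]))
    return result

# Made-up data for illustration only.
r_vals = np.array([1.0, 1.0, 2.0, 2.0, 2.0])
Tgas_vals = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
n_vals = np.array([1.0, 3.0, 1.0, 1.0, 2.0])

averages = weighted_average_by_r(r_vals, Tgas_vals, n_vals)
for r, avg in averages.items():
    print(r, avg)
```

Looping over np.unique(r_vals) rather than r_vals itself means each distinct radius is processed once.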

How to efficiently mutate certain num of values in an array?

Given an initial 2-D array:
initial = [
[0.6711999773979187, 0.1949000060558319],
[-0.09300000220537186, 0.310699999332428],
[-0.03889999911189079, 0.2736999988555908],
[-0.6984000205993652, 0.6407999992370605],
[-0.43619999289512634, 0.5810999870300293],
[0.2825999855995178, 0.21310000121593475],
[0.5551999807357788, -0.18289999663829803],
[0.3447999954223633, 0.2071000039577484],
[-0.1995999962091446, -0.5139999985694885],
[-0.24400000274181366, 0.3154999911785126]]
The goal is to multiply some randomly chosen values in the array by a random percentage. Let's say only 3 random values get multiplied by a random multiplier; we should get something like this:
output = [
[0.6711999773979187, 0.52],
[-0.09300000220537186, 0.310699999332428],
[-0.03889999911189079, 0.2736999988555908],
[-0.6984000205993652, 0.6407999992370605],
[-0.43619999289512634, 0.5810999870300293],
[0.84, 0.21310000121593475],
[0.5551999807357788, -0.18289999663829803],
[0.3447999954223633, 0.2071000039577484],
[-0.1995999962091446, 0.21],
[-0.24400000274181366, 0.3154999911785126]]
I've tried doing this:
def mutate(array2d, num_changes):
    for _ in range(num_changes):
        row, col = array2d.shape
        rand_row = np.random.randint(row)
        rand_col = np.random.randint(col)
        cell_value = array2d[rand_row][rand_col]
        array2d[rand_row][rand_col] = random.uniform(0, 1) * cell_value
    return array2d
That works for 2D arrays, but there's a chance that the same value is mutated more than once.
I also don't think it's efficient, and it only works on 2D arrays.
Is there a way to do such a "mutation" for an array of any shape, and more efficiently?
There's no restriction on which values the "mutation" can pick, but the number of mutations should be exactly the number the user specifies.
One fairly simple way would be to work with a raveled view of the array. You can generate all your numbers at once that way, and it becomes easier to guarantee that you won't process the same index twice in one call:
def mutate(array_anyd, num_changes):
    raveled = array_anyd.reshape(-1)
    indices = np.random.choice(raveled.size, size=num_changes, replace=False)
    values = np.random.uniform(0, 1, size=num_changes)
    raveled[indices] *= values
I use array_anyd.reshape(-1) in favor of array_anyd.ravel() because according to the docs, the former is less likely to make an inadvertent copy.
There is of course still such a possibility. You can add an extra check to write back if you need to. A more efficient way would be to use np.unravel_index to avoid creating a view to begin with:
def mutate(array_anyd, num_changes):
    indices = np.random.choice(array_anyd.size, size=num_changes, replace=False)
    indices = np.unravel_index(indices, array_anyd.shape)
    values = np.random.uniform(0, 1, size=num_changes)
    array_anyd[indices] *= values
There is no need to return anything because the modification is done in-place. Conventionally, such functions do not return anything. See for example list.sort vs sorted.
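As a quick check of the np.unravel_index approach, the sketch below runs it on a made-up 3-D array and counts how many entries changed. The sample values are nonzero on purpose, so that every selected entry visibly changes.

```python
import numpy as np

def mutate(array_anyd, num_changes):
    """Scale `num_changes` distinct entries of `array_anyd` in place by factors drawn from [0, 1)."""
    flat_idx = np.random.choice(array_anyd.size, size=num_changes, replace=False)
    idx = np.unravel_index(flat_idx, array_anyd.shape)  # tuple of per-axis index arrays
    array_anyd[idx] *= np.random.uniform(0, 1, size=num_changes)

# Made-up 3-D sample with no zero entries.
sample = np.arange(1.0, 25.0).reshape(2, 3, 4)
before = sample.copy()
mutate(sample, 5)
changed = int(np.count_nonzero(sample != before))
print(changed)
```

Because replace=False draws distinct flat indices, the number of changed entries matches num_changes as long as num_changes does not exceed the array size.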
Here is a different solution that uses shuffle instead of np.random.choice. It works on an array of any shape. Note that it returns a new array rather than modifying the input, and that it adds mult*arrayIn to the selected entries, so they are scaled by 1 + u rather than by u.
def mutate(arrayIn, num_changes):
    mult = np.zeros(arrayIn.ravel().shape[0])
    mult[:num_changes] = np.random.uniform(0, 1, num_changes)
    np.random.shuffle(mult)
    mult = mult.reshape(arrayIn.shape)
    arrayIn = arrayIn + mult*arrayIn
    return arrayIn

python fastest way to match strings with huge data size

I have a huge table (a record array) tbdata with fields:
tbdata[i]['a'], tbdata[i]['b'], tbdata[i]['c']
which are all integers, where i is an index between 0 and 1 million (the size of the table).
I also have a list called Name whose elements are file names (900 in total), such as '/Users/Desktop/Data/spe-3588-55184-0228.jpg' (modified), each containing three numbers.
Now I want to select the rows of tbdata whose three fields above all match the three numbers in one of the names in Name. Here's the code I originally wrote:
Data = []
for k in range(0, len(tbdata)):
    for i in range(0, len(Name)):
        if Name[i][43:47] == str(tbdata[k]['a']) and\
           Name[i][48:53] == str(tbdata[k]['b']) and\
           Name[i][55:58] == str(tbdata[k]['c']):
            Data.append(tbdata[k])
Python ran for the whole night and still hasn't finished, either because the data is huge or because my algorithm is too slow. What's the fastest way to complete such a task? Thanks!
You can construct a lookup tree like this:
a2b2c2name = {}
for name in Name:
    a = int(name[43:47])
    b = int(name[48:53])
    c = int(name[55:58])
    if a not in a2b2c2name:
        a2b2c2name[a] = {}
    if b not in a2b2c2name[a]:
        a2b2c2name[a][b] = {}
    a2b2c2name[a][b][c] = True
Data = []
for k in range(len(tbdata)):
    a = tbdata[k]['a']
    b = tbdata[k]['b']
    c = tbdata[k]['c']
    if a in a2b2c2name and b in a2b2c2name[a] and c in a2b2c2name[a][b]:
        Data.append(tbdata[k])
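The same idea can be expressed with a flat set of (a, b, c) tuples, which gives constant-time membership tests on average. The sketch below is an illustration under assumptions: it extracts the three numbers with a regular expression instead of fixed slice offsets, the second file name is made up, and a small list of dicts stands in for tbdata. The names parse_key, names, wanted, and rows are hypothetical.

```python
import re

def parse_key(path):
    """Hypothetical helper: pull the three integers out of a name like 'spe-3588-55184-0228.jpg'."""
    match = re.search(r'(\d+)-(\d+)-(\d+)\.jpg$', path)
    return tuple(int(group) for group in match.groups())

# The second name is made up for illustration.
names = ['/Users/Desktop/Data/spe-3588-55184-0228.jpg',
         '/Users/Desktop/Data/spe-3589-55186-0012.jpg']
wanted = {parse_key(name) for name in names}

# Small stand-in for the record array `tbdata`.
rows = [{'a': 3588, 'b': 55184, 'c': 228},
        {'a': 3588, 'b': 55184, 'c': 229},
        {'a': 3589, 'b': 55186, 'c': 12}]

# One pass over the table, one set lookup per row.
data = [row for row in rows if (row['a'], row['b'], row['c']) in wanted]
print(data)
```

Converting the name parts to int also avoids mismatches from leading zeros, such as '0228' versus 228.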

Divide a array into multiple (individual) arrays based on a bin size in python

I am reading a file that contains values like this:
-0.68285 -6.919616
-0.7876 -14.521115
-0.64072 -43.428411
-0.05368 -11.561341
-0.43144 -34.768892
-0.23268 -10.793603
-0.22216 -50.341101
-0.41152 -90.083377
-0.01288 -84.265557
-0.3524 -24.253145
How do I split this into individual arrays based on the value in column 1, with a bin width of 0.1?
I want my output to look something like this:
array1=[[-0.05368, -11.561341],[-0.01288, -84.265557]]
array2=[[-0.23268, -10.793603],[-0.22216, -50.341101]]
array3=[[-0.3524, -24.253145]]
array4=[[-0.43144, -34.768892], [-0.41152, -90.083377]]
array5=[[-0.68285, -6.919616],[-0.64072, -43.428411]]
array6=[[-0.7876, -14.521115]]
Here's a simple solution using Python's round function and a dictionary:
lines = open('some_file.txt').readlines()
dictionary = {}
for line in lines:
    nums = line.split()  # split the columns on whitespace (this also drops the newline)
    k = round(float(nums[0]), 1)  # round the first column to get the bucket
    if k not in dictionary:
        dictionary[k] = []  # add an empty bucket
    # add the numbers to the bucket
    dictionary[k].append([float(nums[0]), float(nums[1])])
print(dictionary)
Note that round assigns each value to the nearest multiple of 0.1, so the buckets are centered on -0.1, -0.2, and so on rather than starting there.
To get a particular bucket (like -0.4, since the values here are negative), just do:
x = dictionary[-0.4]
or
x = dictionary.get(-0.4, [])
if you just want an empty list returned for empty buckets.
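The desired output in the question groups values into intervals such as [-0.1, 0) and [-0.2, -0.1), which corresponds to flooring rather than rounding. A minimal sketch of that variant follows; it assumes the data is already loaded as a list of pairs (the file reading is left out), and bin_rows is a hypothetical name.

```python
import math
from collections import defaultdict

def bin_rows(rows, width=0.1):
    """Group rows by floor(first_column / width); key k covers [k*width, (k+1)*width)."""
    bins = defaultdict(list)
    for first, second in rows:
        bins[math.floor(first / width)].append([first, second])
    return dict(bins)

rows = [
    [-0.68285, -6.919616], [-0.7876, -14.521115], [-0.64072, -43.428411],
    [-0.05368, -11.561341], [-0.43144, -34.768892], [-0.23268, -10.793603],
    [-0.22216, -50.341101], [-0.41152, -90.083377], [-0.01288, -84.265557],
    [-0.3524, -24.253145],
]
bins = bin_rows(rows)
for key in sorted(bins, reverse=True):
    print(key, bins[key])
```

Integer keys sidestep float-key lookups, although a value lying exactly on a bin edge may still land on either side because of floating-point rounding in the division.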
