compare vector lists

compare vector lists - python

I need to compare two lists of vectors and take their equal elements, like:
veclist1 = [(0.453 , 0.232 , 0.870), (0.757 , 0.345 , 0.212), (0.989 , 0.232 , 0.543)]
veclist2 = [(0.464 , 0.578 , 0.870), (0.327 , 0.335 , 0.562), (0.757 , 0.345 , 0.212)]
equalelements = [(0.757 , 0.345 , 0.212)]
Note: the order of the elements doesn't matter!
Also, if possible, I want to consider only the first two decimal places in the comparison,
but without rounding or shortening the stored values. Is that possible?
Thanks in advance!

# Get new lists with rounded values
veclist1_rounded = [tuple(round(val, 2) for val in vec) for vec in veclist1]
veclist2_rounded = [tuple(round(val, 2) for val in vec) for vec in veclist2]
# Convert to sets and calculate intersection (&)
slct_rounded = set(veclist1_rounded) & set(veclist2_rounded)
# Pick original elements from veclist1:
# - get index of the element from the rounded list
# - get original element from the original list
equalelements = [veclist1[veclist1_rounded.index(el)] for el in slct_rounded]
In this case the entries of veclist1 are selected when the two lists agree only after rounding. If the entries of veclist2 are wanted instead, the last line needs to be adjusted.
If the original elements from both lists are needed, the final list can be calculated using both original lists:
equalelements = ([veclist1[veclist1_rounded.index(el)] for el in slct_rounded]
+ [veclist2[veclist2_rounded.index(el)] for el in slct_rounded])
Note: round might behave unexpectedly for some values, which should be solved in current Python versions. Nevertheless, it might be better to compare formatted strings instead:
get_rounded = lambda veclist: [tuple(f'{val:.2f}' for val in vec) for vec in veclist]
veclist1_rounded, veclist2_rounded = get_rounded(veclist1), get_rounded(veclist2)
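The approach above can also be sketched as one self-contained function. This is only an illustration: the helper name `common_vectors` and its `ndigits` parameter are hypothetical, and a set of rounded keys replaces `list.index`. The second list is slightly perturbed here (an assumption for the demo) to show that the stored values themselves are never rounded.

```python
def common_vectors(veclist1, veclist2, ndigits=2):
    """Return the vectors of veclist1 whose rounded form also occurs in veclist2."""
    def key(vec):
        # Rounded copy used only for comparison; the original tuple is kept.
        return tuple(round(val, ndigits) for val in vec)

    keys2 = {key(vec) for vec in veclist2}
    seen = set()
    result = []
    for vec in veclist1:
        k = key(vec)
        if k in keys2 and k not in seen:
            seen.add(k)
            result.append(vec)
    return result


veclist1 = [(0.453, 0.232, 0.870), (0.757, 0.345, 0.212), (0.989, 0.232, 0.543)]
veclist2 = [(0.464, 0.578, 0.870), (0.327, 0.335, 0.562), (0.7571, 0.345, 0.2118)]
equalelements = common_vectors(veclist1, veclist2)
print(equalelements)  # [(0.757, 0.345, 0.212)]
```

The result keeps the order of veclist1 and returns its original, unrounded tuples.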

Related

converting list into a matrix in Python

If you have a list of elements, let's say:
res =
['(18,430)', '(19,430)', '(19,429)', '(19,428)', '(19,427)', '(18,426)', '(17,426)', '(17,425)', '(17,424)', '(17,423)', '(17,422)', '(17,421)', '(17,420)', '(16,421)', '(14,420)', '(11,419)', '(9,417)', '(7,416)', '(4,414)', '(3,414)', '(2,412)', '(1,412)', '(-1,410)', '(-2,409)', '(-2,408)', '(-3,407)', '(-3,406)', '(-3,405)', '(-3,404)', '(-3,403)', '(-3,402)', '(-3,401)', '(-3,400)', '(-4,399)', '(-4,398)', '(-5,398)', '(-6,398)', '(-7,397)', '(-7,396)', '(-6,395)', '(-5,395)', '(-4,393)', '(-3,391)', '(6,384)', '(12,378)', '(24,370)', '(42,358)', '(107,304)', '(151,255)', '(207,196)', '(259,121)', '(389,-28)', '(456,-84)', '(515,-134)', '(569,-182)', '(650,-260)', '(688,-294)', '(723,-317)', '(740,-328)', '(762,-342)', '(767,-347)', '(768,-349)', '(769,-352)', '(769,-357)', '(769,-359)', '(768,-361)', '(768,-364)', '(766,-370)', '(765,-371)', '(764,-374)', '(763,-376)', '(761,-378)', '(760,-381)', '(758,-385)', '(752,-394)', '(747,-401)', '(742,-407)', '(735,-413)', '(724,-421)', '(719,-424)', '(718,-425)', '(717,-425)'], ['(18,430)', '(19,430)', '(19,429)', '(19,428)', '(19,427)', '(18,426)', '(17,426)', '(17,425)', '(17,424)', '(17,423)', '(17,422)', '(17,421)', '(17,420)', '(16,421)', '(14,420)', '(11,419)', '(9,417)', '(7,416)', '(4,414)', '(3,414)', '(2,412)', '(1,412)', '(-1,410)', '(-2,409)', '(-2,408)', '(-3,407)', '(-3,406)', '(-3,405)', '(-3,404)', '(-3,403)', '(-3,402)', '(-3,401)', '(-3,400)', '(-4,399)', '(-4,398)', '(-5,398)', '(-6,398)', '(-7,397)', '(-7,396)', '(-6,395)', '(-5,395)', '(-4,393)', '(-3,391)', '(6,384)', '(12,378)', '(24,370)', '(42,358)', '(107,304)', '(151,255)', '(207,196)', '(259,121)', '(389,-28)', '(456,-84)', '(515,-134)', '(569,-182)', '(650,-260)', '(688,-294)', '(723,-317)', '(740,-328)', '(762,-342)', '(767,-347)', '(768,-349)', '(769,-352)', '(769,-357)', '(769,-359)', '(768,-361)', '(768,-364)', '(766,-370)', '(765,-371)', '(764,-374)', '(763,-376)', '(761,-378)', '(760,-381)', '(758,-385)', '(752,-394)', 
'(747,-401)', '(742,-407)', '(735,-413)', '(724,-421)', '(719,-424)', '(718,-425)', '(717,-425)']
and we want to make all these values into a matrix where we can update values.
All these values in the list are going to be the values of the rows and columns of a matrix?
Basically:
row1 = '(18,430)', row2 = '(19,430)', row3 = '(19,429)',.....,rown='(717,-425)', column1 = '(18,430)', column2 = '(19,430)', column3 = '(19,429)', ..... ,columnn= '(717,-425)'
How can we do that in Python? Later I want to update values in the rows and columns. I tried repeating the list and making it into a matrix,
but it does not give me what I want.
Res_List = [res,res]
print(np.array(Res_List))
So I am still wondering how we can do this in Python.
I also tried:
mat = np.array([res,res]).T
print(mat)
and it kind of gives me what I want but not quite.
This gives me:
[['(18,430)' '(18,430)']
['(19,430)' '(19,430)']
['(19,429)' '(19,429)']
['(19,428)' '(19,428)']
['(19,427)' '(19,427)']
['(18,426)' '(18,426)']
['(17,426)' '(17,426)']
['(17,425)' '(17,425)']
['(17,424)' '(17,424)']
['(17,423)' '(17,423)']
['(17,422)' '(17,422)']
['(17,421)' '(17,421)']
['(17,420)' '(17,420)']
['(16,421)' '(16,421)']
['(14,420)' '(14,420)']
['(11,419)' '(11,419)']
['(9,417)' '(9,417)']
['(7,416)' '(7,416)']
['(4,414)' '(4,414)']
['(3,414)' '(3,414)']
['(2,412)' '(2,412)']
['(1,412)' '(1,412)']
['(-1,410)' '(-1,410)']
['(-2,409)' '(-2,409)']
['(-2,408)' '(-2,408)']
['(-3,407)' '(-3,407)']
['(-3,406)' '(-3,406)']
['(-3,405)' '(-3,405)']
['(-3,404)' '(-3,404)']
['(-3,403)' '(-3,403)']
['(-3,402)' '(-3,402)']
['(-3,401)' '(-3,401)']
['(-3,400)' '(-3,400)']
['(-4,399)' '(-4,399)']
['(-4,398)' '(-4,398)']
['(-5,398)' '(-5,398)']
['(-6,398)' '(-6,398)']
['(-7,397)' '(-7,397)']
['(-7,396)' '(-7,396)']
['(-6,395)' '(-6,395)']
['(-5,395)' '(-5,395)']
['(-4,393)' '(-4,393)']
['(-3,391)' '(-3,391)']
['(6,384)' '(6,384)']
['(12,378)' '(12,378)']
['(24,370)' '(24,370)']
['(42,358)' '(42,358)']
['(107,304)' '(107,304)']
['(151,255)' '(151,255)']
['(207,196)' '(207,196)']
['(259,121)' '(259,121)']
['(389,-28)' '(389,-28)']
['(456,-84)' '(456,-84)']
['(515,-134)' '(515,-134)']
['(569,-182)' '(569,-182)']
['(650,-260)' '(650,-260)']
['(688,-294)' '(688,-294)']
['(723,-317)' '(723,-317)']
['(740,-328)' '(740,-328)']
['(762,-342)' '(762,-342)']
['(767,-347)' '(767,-347)']
['(768,-349)' '(768,-349)']
['(769,-352)' '(769,-352)']
['(769,-357)' '(769,-357)']
['(769,-359)' '(769,-359)']
['(768,-361)' '(768,-361)']
['(768,-364)' '(768,-364)']
['(766,-370)' '(766,-370)']
['(765,-371)' '(765,-371)']
['(764,-374)' '(764,-374)']
['(763,-376)' '(763,-376)']
['(761,-378)' '(761,-378)']
['(760,-381)' '(760,-381)']
['(758,-385)' '(758,-385)']
['(752,-394)' '(752,-394)']
['(747,-401)' '(747,-401)']
['(742,-407)' '(742,-407)']
['(735,-413)' '(735,-413)']
['(724,-421)' '(724,-421)']
['(719,-424)' '(719,-424)']
['(718,-425)' '(718,-425)']
['(717,-425)' '(717,-425)']]
but what I want is to keep the columns as they are, have the rows be the same as the columns, and be able to update and put values into the matrix.

Maybe what you want is a dict:
matrix = {
    k: {l: 0 for l in res}
    for k in res
}
All the values are initialized to 0.
You can easily update values in matrix; for example, you can increment the value of a 'cell' by one:
matrix['(18,430)']['(19,430)'] += 1
or set it to a specific value:
matrix['(18,430)']['(19,430)'] = 10
and retrieve it:
val = matrix['(18,430)']['(19,430)']

You can use NumPy.
To convert a list to a matrix-like array, write it as a list of lists (or tuples). Your list contains strings, so first convert the strings to tuples as follows:
new_list = [eval(i) for i in res]
I used eval because your strings are in tuple form, so we can tell Python to treat them as a chunk of code. Note that eval executes arbitrary code, so use it only on trusted input.
Then convert new_list to an array as follows:
import numpy as np
matrix = np.array(new_list)
Now you can access the matrix elements as matrix[i, j], where i and j are the row and column respectively. To change the value at a specific location, just assign to it as usual:
matrix[i, j] = new_value
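If the goal is a square table whose rows and columns are both labelled by the points, one possible sketch follows. It is an assumption about the intent, not part of the answer above: each string is parsed with `ast.literal_eval` (which accepts only literals), and a label-to-index dictionary sits next to a NumPy array of zeros. The names `labels`, `index`, and `set_cell` are hypothetical, and `res` is shortened to three entries.

```python
from ast import literal_eval

import numpy as np

res = ['(18,430)', '(19,430)', '(19,429)']  # shortened sample of the list

# Parse each string into a tuple of ints.
points = [literal_eval(s) for s in res]

# Map each point to its row/column position, dropping duplicates in order.
labels = list(dict.fromkeys(points))
index = {p: i for i, p in enumerate(labels)}

# Square matrix: one row and one column per point, initialised to zero.
matrix = np.zeros((len(labels), len(labels)), dtype=int)


def set_cell(row_point, col_point, value):
    """Store value in the cell addressed by two points."""
    matrix[index[row_point], index[col_point]] = value


set_cell((18, 430), (19, 430), 10)
matrix[index[(19, 429)], index[(19, 429)]] += 1
print(matrix)
```

Keeping the labels in a dictionary lets the numeric array stay a plain integer matrix while cells are still addressed by point.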

Why is this Python array not slicing?

Initial data is:
array([[0.0417634 ],
[0.04493844],
[0.04932728],
[0.04601787],
[0.04511007],
[0.04312284],
[0.0451733 ],
[0.04560687],
[0.04263394],
[0.04183227],
[0.048634 ],
[0.05198746],
[0.05615724],
[0.05787913]], dtype=float32)
Then I transformed it into a 2D array:
array2d = np.reshape(dataset, (-1, 2))
Now I have:
array([[0.0417634 , 0.04493844],
[0.04932728, 0.04601787],
[0.04511007, 0.04312284],
[0.0451733 , 0.04560687],
[0.04263394, 0.04183227],
[0.048634 , 0.05198746],
[0.05615724, 0.05787913],
[0.05989346, 0.0605077 ]], dtype=float32)
Now I'm going to calculate the mean of each row of the array:
paa = []
paa.append(array2d.mean(axis=1))
Now I want a list of intervals from this list:
intervals = paa[::10]
intervals
but the result is the same list (paa). Why? I already tried converting it with np.array(paa).
I expected a new list with fewer elements. Since 10 is the step size, I'm expecting [0.0417634, ... paa[11], .... paa[21] .... ]

np.mean will return a np.array. You are taking the result and appending it into a list. When you are slicing it, you're getting the 0th (and only) element in paa, which is an entire np.array.
Get rid of the list and append and slice directly into the result of mean.
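A small sketch of the difference, using made-up data (`np.arange` stands in for the real dataset, which is an assumption for illustration):

```python
import numpy as np

# Stand-in data: 40 values in a single column, like the original dataset.
dataset = np.arange(40, dtype=np.float32).reshape(-1, 1)
array2d = np.reshape(dataset, (-1, 2))

# What the question does: a list holding ONE array of 20 means.
wrapped = []
wrapped.append(array2d.mean(axis=1))
print(len(wrapped), len(wrapped[::10]))  # 1 1

# Slicing the array returned by mean directly gives every 10th mean.
paa = array2d.mean(axis=1)
intervals = paa[::10]
print(intervals.tolist())  # [0.5, 20.5]
```

Slicing the wrapper list with a step of 10 can only ever return its single element, which is why the result looked unchanged.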

Can't calculate two values replacing certain sign

How can I add two values that are separated by /? I tried item.replace("/","+"), but it only puts a + sign in place of the / and does nothing else. For example, if I try the logic on 2/1.5 and -3/-4.5, I get 2+1.5 and -3+-4.5.
My intention is to add the two values on either side of the / and divide the sum by 2, so that the results are 1.75 and -3.75 respectively for 2/1.5 and -3/-4.5.
This is my try so far:
for item in ['2/1.5','-3/-4.5']:
    print(item.replace("/","+"))
What I'm having now:
2+1.5
-3+-4.5
Expected output (adding the two values replacing / with + and then divide result by two):
1.75
-3.75

Since / is only a separator, you don't really need to replace it with +, but use it with split, and then sum up the parts:
for item in ['2/1.5', '-3/-4.5']:
    result = sum(map(float, item.split('/'))) / 2
    print(result)
or in a more generalized form:
from statistics import mean
for item in ['2/1.5', '-3/-4.5']:
    result = mean(map(float, item.split('/')))
    print(result)

You can do it using eval like this:
for item in ['2/1.5','-3/-4.5']:
    print((eval(item.replace("/","+")))/2)

My answer is not that different from others, except I don't understand why everyone is using lists. A list is not required here because it won't be altered, a tuple is fine and more efficient:
for item in '2/1.5','-3/-4.5': # Don't need a list here
    num1, num2 = item.split('/')
    print((float(num1) + float(num2)) / 2)

A further elaboration of @daniel's answer:
[sum(map(float, item.split('/'))) / 2 for item in ('2/1.5','-3/-4.5')]
Result:
[1.75, -3.75]

You can do it like this (by splitting the strings into two floats):
for item in ['2/1.5','-3/-4.5']:
    itemArray = item.split("/")
    itemResult = float(itemArray[0]) + float(itemArray[1])
    print(itemResult/2)

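from ast import literal_eval
l = ['2/1.5','-3/-4.5']
# Note: this relies on literal_eval accepting '2+1.5'. Python 3.7 and later
# reject arithmetic on plain numbers there and raise ValueError.
print([literal_eval(i.replace('/','+'))/2 for i in l])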

Get the max value of float numbers on a list on tuples?

I have the following list:
erra_eus_repo = [(u'RHSA-2017:2796', u'6.7'), (u'RHSA-2017:2796', u'6.8'), (u'RHSA-2017:2794', u'7.2'), (u'RHSA-2017:2793', u'7.3')]
I am trying to take the floating-point numbers from each tuple:
6.7, 6.8, 7.2, 7.3
and get the max number for each major version (the part before the dot), i.e.:
new_list = [ 6.8, 7.3 ]
Note that max() will not work here, since if I have 5.9 and 5.11, I will get the max as 5.9, I want the result to be 5.11 since 11 > 9.
What I have tried:
eus_major = []
eus_minor = []
for major in erra_eus_repo:
    minor = (major[1][2])
    major = (major[1][0])
    if major not in eus_major:
        eus_major.append(major)
    if minor not in eus_minor:
        eus_minor.append(minor)
print(eus_major, eus_minor)
Currently I am getting:
[u'6', u'7'] [u'7', u'2', u'3']

You can achieve this for instance with a combination of groupby and sorting:
from itertools import groupby
srt_list = sorted(erra_eus_repo, key=lambda x: x[1])
max_list = []
for key, group in groupby(srt_list, lambda x: x[1].split('.')[0]):
    max_el = max(list(group), key=lambda y: int(y[1].split('.')[1]))
    max_list.append(float(max_el[1]))
First the list is sorted on the second element of each tuple, so that elements with the same integer part end up next to each other for groupby. groupby then groups them: each group is a sequence of versions X.Z with a common X. Within each group the program finds the element whose decimal part, treated as a standalone number, is the largest. That version is then appended to the list of max values as a float.

Do not treat the version numbers as floating point; treat them as '.'-separated strings. Split each version string on '.' and compare the resulting tuples of integers, like this:
def normalize_version(v):
    return tuple(map(int, v.split('.')))
Then you can see:
>>> u'5.11' > u'5.9'
False
>>> normalize_version(u'5.11') > normalize_version(u'5.9')
True
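Building on that comparison, here is a hedged sketch of how the highest version per major number could be collected. The dictionary name `best` and the extra `6.11` entry are assumptions added for illustration. The results are kept as strings, because turning `6.11` back into a float would bring the original ordering problem back.

```python
def normalize_version(v):
    return tuple(map(int, v.split('.')))


erra_eus_repo = [('RHSA-2017:2796', '6.7'), ('RHSA-2017:2796', '6.8'),
                 ('RHSA-2017:2796', '6.11'),  # extra entry, added for the demo
                 ('RHSA-2017:2794', '7.2'), ('RHSA-2017:2793', '7.3')]

best = {}  # major number -> highest version tuple seen so far
for _, version in erra_eus_repo:
    v = normalize_version(version)
    major = v[0]
    if major not in best or v > best[major]:
        best[major] = v

# Rebuild the dotted strings, ordered by major number.
new_list = ['.'.join(map(str, best[m])) for m in sorted(best)]
print(new_list)  # ['6.11', '7.3']
```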

Here is another take on the problem, which provides the highest value for each RHSA value (which is what I think you're after):
erra_eus_repo = [(u'RHSA-2017:2796', u'6.7'), (u'RHSA-2017:2796', u'6.8'), (u'RHSA-2017:2794', u'7.2'), (u'RHSA-2017:2793', u'7.3')]
eus_major = {}
for r in erra_eus_repo:
    if r[0] not in eus_major.keys():
        eus_major[r[0]] = 0
    if float(r[1]) > float(eus_major[r[0]]):
        eus_major[r[0]] = r[1]
print(eus_major)
output:
{'RHSA-2017:2796': '6.8', 'RHSA-2017:2794': '7.2', 'RHSA-2017:2793': '7.3'}
I left the value as a string, but it could easily be cast as a float.

The following simply uses the built-in min and max functions:
erra_eus_repo = [(u'RHSA-2017:2796', u'6.7'),
(u'RHSA-2017:2796', u'6.8'),
(u'RHSA-2017:2794', u'7.2'),
(u'RHSA-2017:2793', u'7.3')]
eus_major = max(float(major[1]) for major in erra_eus_repo)
eus_minor = min(float(major[1]) for major in erra_eus_repo)
newlist = [eus_minor, eus_major]
print(newlist) # -> [6.7, 7.3]

This may look like you're trying to compare decimal values, but you really aren't. It goes without saying (but I will) that while 9<11, .9>.11. So the idea of splitting the number into two separate values is really the only way to get a valid comparison.
The list is a list of tuples: each entry pairs an RHSA identifier with a value. Apparently you want to discard the first item of each tuple and keep only the (I assume) version. Here's some code that, while crude, will give you an idea of what to do. (I'd welcome comments on how to clean it up...) I've taken the versions out of the tuples, split each one into major and minor, and compared them against the maxima found so far; if no entry with that major exists in the list yet, the value is added. For the sake of testing I also added a 6.11 version number.
lstVersion = []
lstMaxVersion = []
erra_eus_repo = [(u'RHSA-2017:2796', u'6.7'), (u'RHSA-2017:2796', u'6.8'), (u'RHSA-2017:2796', u'6.11'), (u'RHSA-2017:2794', u'7.2'), (u'RHSA-2017:2793', u'7.3')]
for strItem in erra_eus_repo:
    lstVersion.append(strItem[1])
for strVersion in lstVersion:
    blnAdded = False
    intMajor = int(strVersion.split('.')[0])
    intMinor = int(strVersion.split('.')[1])
    print('intMajor: ', intMajor)
    print('intMinor:', intMinor)
    for strMaxItem in lstMaxVersion:
        intMaxMajor = int(strMaxItem.split('.')[0])
        intMaxMinor = int(strMaxItem.split('.')[1])
        print('strMaxitem: ', strMaxItem)
        print('intMaxMajor: ', intMaxMajor)
        print('intMaxMinor: ', intMaxMinor)
        if intMajor == intMaxMajor:
            blnAdded = True
            if intMinor > intMaxMinor:
                lstMaxVersion.remove(strMaxItem)
                lstMaxVersion.append(str(intMajor)+'.'+str(intMinor))
    if not blnAdded:
        lstMaxVersion.append(str(intMajor)+'.'+str(intMinor))

Python, complex looping calculations with lists or arrays

I am converting old pseudo-Fortran code into python and am struggling to create a framework within which I can perform some complex iterative calculations.
As a beginner, my first instinct is to use lists as I find them easier to work with, but I understand that arrays would probably be a more suitable method.
I already have all the input channels as lists and am hoping for a good explanation of how to set up loops for such calculations.
This is an example of the pseudo-Fortran I am replicating. Each (t) indicates a 'time-series channel' that I currently have stored as a list (i.e. ECART2(t) and NNNN(t) are lists). All lists have the same number of entries.
do while ( ecart2(t) > 0.0002 .and. nnnn(t) < 2000. ) ;
mmm(t)=nnnn(t)+1.;
if YRPVBPO(t).ge.0.1 .and. YRPVBPO(t).le.0.999930338 .and. YAEVBPO(t).ge.0.000015 .and. YAEVBPO(t).le.0.000615 then do;
YM5(t) = customFunction(YRPVBPO,YAEVBPO);*
end;
YUEVBO(t) = YU0VBO(t) * YM5(t) ;*m/s
YHEVBO(t) = YCPEVBO(t)*TPO_TGETO1(t)+0.5*YUEVBO(t)*YUEVBO(t);*J/kg
YAVBO(t) = ddnn2(t)*(YUEVBO(t)**2);*
YDVBO(t) = YCPEVBO(t)**2 + 4*YHEVBO(t)*YAVBO(t) ;*
YTSVBPO(t) = (sqrt(YDVBO(t))-YCPEVBO(t))/2./YAVBO(t);*K
YUSVBO(t) = ddnn(t)*YUEVBO(t)*YTSVBPO(t);*m/s
YM7(t) = YUSVBO(t)/YU0VBO(t);*
YPHSVBPOtot(t) = (YPHEVBPO(t) - YPDHVBPO(t))/(1.+((YGAMAEVBO(t)-1)/2)*(YM7(t)**2))**(YGAMAEVBO(t)/(1-YGAMAEVBO(t)));*bar
YPHEVBPOtot(t) = YPHEVBPO(t) / (1.+rss0(t)*YM5(t)*YM5(t))**rss1(t);*bar
YDPVBPOtot(t) = YPHEVBPOtot(t) - YPHSVBPOtot(t) ;*bar
iter(t) = (YPHEVBPOtot(t) - YDPVBPOtot(t))/YPHEVBPOtot(t);*
ecart2(t)= ABS(iter(t)-YRPVBPO(t));*
aa(t)=YRPVBPO(t)+0.0001;
YRPVBPO(t)=aa(t);*
nnnn(t)=mmm(t);*
end;
Understanding the pseudo-Fortran: with 'time-series data' there is an implicit loop iterating through the individual values in each list, as well as a loop over each of those values that runs until the conditions are no longer met.
It carries out the loop calculations on the first list values until the conditions are no longer met. It then moves on to the second value in the lists and performs the same looping calculations, and so on.
ECART2 = [2,0,3,5,3,4]
NNNN = [6,7,5,8,6,7]
do while ( ecart2(t) > 0.0002 .and. nnnn(t) < 2000. )
MMM = NNNN + 1
this looks at the first values in each list (2 and 6). Because the conditions are met, subsequent calculations are performed on the first values in the new lists such as MMM = [6+1,...]
Once the rest of the calculations have been performed (looping multiple times if the conditions are not met) only then does the second value in every list get considered. The second values (0 and 7) do not meet the conditions and therefore the second entry for MMM is 0.
MMM=[6+1, 0...]
Because 0 must be entered if the conditions are not met, I am considering setting up all the 'new lists' in advance and populating them with 0s.
NB: 'customFunction()' is a separate function that is called, returning a value from two input values

MY CURRENT SOLUTION
Set up all the empty lists:
nPts = len(ECART2)
MMM = [0]*nPts
YM5 = [0]*nPts
etc...
Then start performing calculations:
for i in range(nPts):
    while (ECART2[i] > 0.0002) and (NNNN[i] < 2000):
        MMM[i] = NNNN[i]+1
        if YRPVBPO[i]>=0.1 and YRPVBPO[i]<=0.999930338 and YAEVBPO[i]>=0.000015 and YAEVBPO[i]<=0.000615:
            YM5[i] = MACH_LBP_DIA30(YRPVBPO[i],YAEVBPO[i])
        YUEVBO[i] = YU0VBO[i]*YM5[i]
        YHEVBO[i] = YCPEVBO[i]*TGETO1[i] + 0.5*YUEVBO[i]**2
        YAVBO[i] = DDNN2[i]*YUEVBO[i]**2
        YDVBO[i] = YCPEVBO[i]**2 + 4*YHEVBO[i]*YAVBO[i]
        etc etc...
but I'm guessing that there are better ways of doing this, such as the suggestion to use numpy arrays (something I plan on learning in the near future).
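The looping structure described above (an outer loop over time samples, an inner convergence loop per sample) can be sketched with plain lists as follows. This is a minimal illustration under stated assumptions: the function name `iterate_samples` is hypothetical, and the `update` callback is a stand-in for the whole chain of physics formulas, which is not reproduced here.

```python
def iterate_samples(ecart2, nnnn, yrpvbpo, update, tol=0.0002, max_iter=2000):
    """Run the convergence loop independently for each time sample."""
    # Work on copies so the input channels are left untouched.
    ecart2, nnnn, yrpvbpo = list(ecart2), list(nnnn), list(yrpvbpo)
    n_pts = len(ecart2)
    mmm = [0] * n_pts  # stays 0 where the loop never runs

    for i in range(n_pts):  # the implicit loop over (t)
        while ecart2[i] > tol and nnnn[i] < max_iter:
            mmm[i] = nnnn[i] + 1
            # Hypothetical stand-in for the chain of physics formulas.
            ecart2[i] = update(yrpvbpo[i])
            yrpvbpo[i] += 0.0001
            nnnn[i] = mmm[i]
    return mmm, ecart2, nnnn, yrpvbpo


# Toy run: the stand-in update converges after a single pass.
mmm, ecart2, nnnn, yrpvbpo = iterate_samples(
    [2, 0, 3], [6, 7, 5], [0.5, 0.5, 0.5], update=lambda r: 0.0)
print(mmm)   # [7, 0, 6]
print(nnnn)  # [7, 7, 6]
```

The second sample never enters the while loop, so its entry in `mmm` keeps the pre-filled 0, which matches the behaviour described for the pseudo-Fortran.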
