python: shuffle list with respect to another attribute

I have two lists, and I want to shuffle the values in one with respect to the attributes in the other. For example:
list1 = np.array([1,1,1, 2,2,2, 3,3,3]) # spaces for better understanding
list2 = np.array([1,2,3, 4,5,6, 7,8,9])
result = [4,5,6, 1,2,3, 7,8,9]
I solved this problem with:
y = np.split(list2, len(np.unique(list1)))
np.random.shuffle(y)
result = np.array(y).flatten()
I also want it to work when equal attributes in list1 are not adjacent. Example:
list1 = np.array([1,2,3,1,2,3,1,2,3])
list2 = np.array([1,2,3,4,5,6,7,8,9])
result = [2,1,3,5,4,6,8,7,9]

Solved it:
uniques = np.unique(list1)
shuffled = uniques.copy()
np.random.shuffle(shuffled)
result = list2.copy()
for orig, new in zip(uniques, shuffled):
    result[np.where(list1==orig)] = list2[np.where(list1==new)]
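The same idea can be wrapped in a small function. This is only a sketch: the name shuffle_by_group and the optional rng parameter are made up for illustration, and it assumes every attribute value occurs the same number of times, as in the examples above.

```python
import numpy as np

def shuffle_by_group(labels, values, rng=None):
    """Move whole groups of `values` around according to a shuffled label order.

    Assumes every label occurs equally often; otherwise the assignment
    below fails with a shape mismatch.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    values = np.asarray(values)
    uniques = np.unique(labels)
    shuffled = rng.permutation(uniques)  # shuffled copy of the unique labels
    result = values.copy()
    for orig, new in zip(uniques, shuffled):
        # positions labelled `orig` receive the values that were labelled `new`
        result[labels == orig] = values[labels == new]
    return result

list1 = np.array([1, 2, 3, 1, 2, 3, 1, 2, 3])
list2 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(shuffle_by_group(list1, list2))
```

Passing a seeded generator, e.g. np.random.default_rng(0), makes the result reproducible.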

Related

Python - filter list from another list with condition

list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists, and I want to find all the elements of list1 which do not have a matching entry in list2 under a filtering condition.
The condition is that the resolution folder (1m, 2m, ...) and the name of the geojson file, excluding the 'pre' or 'post' substring, must both match.
For example, in list1 '/mnt/1m/a_pre.geojson' has been processed but '/mnt/2m/b_pre.geojson' has not, so the output should be the list ['/mnt/2m/b_pre.geojson'].
I am using two for loops and then splitting the strings, but I am sure this is not the only way and there might be an easier one.
for i in list1:
    for j in list2:
        pre_tile = i.split("/")[-1].split('_pre', 1)[0]
        post_tile = j.split("/")[-1].split('_post', 1)[0]
        if pre_tile == post_tile:
            ...
I assume your file paths share a similar leading part. If so, you can compare just that prefix (the first 7 characters, e.g. '/mnt/1m'). Note that this ignores the file name:
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, when doing many membership tests against list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
pattern = re.compile(r'(.*?/[0-9]m/.*?)_pre\.geojson')
set2 = set(list2)
result = [
    m.string
    for m in map(pattern.fullmatch, list1)
    if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
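Another way to express the condition from the question is to reduce every path to a (resolution, name) key and compare keys. The following is a minimal sketch; tile_key and unprocessed are hypothetical helper names, and the pattern assumes paths end in `<digits>m/<name>_pre.geojson` or `<digits>m/<name>_post.geojson`.

```python
import re

# Captures the resolution folder (e.g. '1m') and the file name without
# its '_pre' / '_post' suffix. The path layout is an assumption.
_KEY = re.compile(r"/(\d+m)/([^/]+)_(?:pre|post)\.geojson$")

def tile_key(path):
    """Return (resolution, name) for a path, or None if it does not match."""
    m = _KEY.search(path)
    return m.groups() if m else None

def unprocessed(pre_paths, post_paths):
    """Paths from pre_paths that have no counterpart in post_paths."""
    done = {k for k in map(tile_key, post_paths) if k is not None}
    # Paths that do not match the pattern have key None and are kept.
    return [p for p in pre_paths if tile_key(p) not in done]

list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
print(unprocessed(list1, list2))
```

Because the keys are stored in a set, each lookup is a hash test instead of a scan over list2.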

Comparing 2 lists and printing the differences

I am trying to compare two lists and find the differences between them. Say, for example, I have list 1, which consists of cat,dog,whale,hamster, and list 2, which consists of dog,whale,hamster. How would I compare these two and then assign the difference, which in this case is cat, to a variable? Order does not matter; however, if there is more than one difference, each difference should be assigned to its own variable.
In my actual code I'm comparing HTML consisting of thousands of lines, so I would prefer something as fast as possible, but any approach is appreciated :)
str1 = 'cat,dog,whale,hamster'
str2 = 'dog,whale,hamster'
Convert the strings into Python sets:
set1 = set(str1.split(','))
set2 = set(str2.split(','))
Get the difference:
result = set1 - set2
Printing result gives:
{'cat'}
You can convert it to a list or a string:
result_as_list = list(result)
result_as_string = ','.join(result)
If your lists can contain duplicates or if you need to know the elements that are only in one of the two lists, you can use Counter (from the collections module):
list1 = ['cat','dog','whale','hamster','dog']
list2 = ['dog','whale','hamster','cow','horse']
from collections import Counter
c1,c2 = Counter(list1),Counter(list2)
differences = [*((c1-c2)+(c2-c1)).elements()]
print(differences) # ['cat', 'dog', 'cow', 'horse']
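If you also need to know which list each leftover item came from, the two Counter subtractions can be kept separate. A small sketch (split_differences is a made-up name):

```python
from collections import Counter

def split_differences(first, second):
    """Return (items left over in first, items left over in second).

    Duplicates are respected: an item counts as left over as many times
    as it appears more often in one list than in the other.
    """
    c1, c2 = Counter(first), Counter(second)
    only_first = list((c1 - c2).elements())
    only_second = list((c2 - c1).elements())
    return only_first, only_second

list1 = ['cat', 'dog', 'whale', 'hamster', 'dog']
list2 = ['dog', 'whale', 'hamster', 'cow', 'horse']
print(split_differences(list1, list2))
```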
This is how you can do it. The function defined here returns the items that appear in only one of the two lists:
def Diff(list1, list2):
    li_dif = [i for i in list1 + list2 if i not in list1 or i not in list2]
    return li_dif
# Driver Code
list1 = ['cat','dog','whale','hamster']
list2 = ['dog','whale','hamster']
diff = Diff(list1, list2)
print(diff)
output:
['cat']
Here, 'cat' is stored in the variable diff.
Now, if there is more than one difference, as follows:
def Diff(list1, list2):
    li_dif = [i for i in list1 + list2 if i not in list1 or i not in list2]
    return li_dif
# Driver Code
list1 = ['cat','dog','whale','hamster','ostrich','yak','sheep','lion','tiger']
list2 = ['dog','whale','hamster']
diff = Diff(list1, list2)
print(diff)
the output will be:
['cat','ostrich','yak','sheep','lion','tiger']
Your question also asks that, if there is more than one difference, each difference be assigned to an individual variable.
For that, we treat the returned result as a list; let's name it list3:
list3 = diff
Here, list3=['cat','ostrich','yak','sheep','lion','tiger']
There are only 6 items in this list, so we can assign a variable to each of them as follows:
v1=list3[0]
v2=list3[1]
v3=list3[2]
v4=list3[3]
v5=list3[4]
v6=list3[5]
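Numbered variables only work when the number of differences is known in advance. As a sketch of two alternatives (the names diff and named are illustrative): tuple unpacking when the count is fixed, and a dictionary when it is not. The sets below only speed up the membership tests; the result is the same as Diff above.

```python
def diff(list1, list2):
    """Items found in exactly one of the two lists, in order of appearance."""
    s1, s2 = set(list1), set(list2)  # O(1) membership tests
    return [i for i in list1 + list2 if i not in s1 or i not in s2]

differences = diff(['cat', 'dog', 'whale', 'hamster', 'ostrich'],
                   ['dog', 'whale', 'hamster'])

# Tuple unpacking: raises ValueError unless there are exactly two items.
first, second = differences

# A dictionary scales to any number of differences.
named = {f"v{n}": item for n, item in enumerate(differences, 1)}
print(first, second, named)
```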

remove element from python list based on match from another list

I have list of s3 objects like this:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
result_list = ['uid=0987/2019/03/01/323dc.parquet']
Is there an efficient way to achieve this without an explicit loop, considering the large number of elements in list1?
You could build a set from list2 for a faster lookup and use a list comprehension to check for membership using the substring of interest:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
set2 = set(list2)
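[i for i in list1 if i[len('uid='):].split('/',1)[0] not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
The substring is obtained by slicing off the 'uid=' prefix (str.lstrip('uid=') would instead strip any leading run of the characters u, i, d and =, not the prefix as such):
s = 'uid=123/2020/06/01/625e2ghvh.parquet'
s[len('uid='):].split('/',1)[0]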
# '123'
This does the job. For different patterns though, or to also cover slight variations, you could go for a regex. For this example you'd need something like:
import re
[i for i in list1 if re.search(r'^uid=(\d+).*?', i).group(1) not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
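This is one way to do it without an explicit loop (note that list2 holds integers here, so leading zeros in the uid are dropped by int()):
def filter_function(item):
    uid = int(item[4:].split('/')[0])
    if uid not in list2:
        return True
    return False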
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = [123, 876]
result_list = list(filter(filter_function, list1))
How about this one:
_list2 = tuple(f'uid={number}/' for number in list2)  # trailing '/' so that 'uid=123' does not also match 'uid=1234'
result = [item for item in list1 if not item.startswith(_list2)] # ['uid=0987/2019/03/01/323dc.parquet']
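The answers above all do the same two things: extract the uid from each key, then test it against a set. A minimal sketch that separates those steps (uid_of and drop_matching are hypothetical names, and the 'uid=<id>/...' key layout is assumed from the question):

```python
def uid_of(key):
    """Return the text between 'uid=' and the first '/', or None if absent."""
    prefix = 'uid='
    head, _, _ = key.partition('/')  # everything before the first '/'
    return head[len(prefix):] if head.startswith(prefix) else None

def drop_matching(keys, uids):
    """Keep only the keys whose uid is not listed in uids."""
    blocked = set(uids)  # set lookup instead of scanning a list
    return [k for k in keys if uid_of(k) not in blocked]

list1 = ['uid=123/2020/06/01/625e2ghvh.parquet',
         'uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123', '876']
print(drop_matching(list1, list2))
```

The list comprehension is still a loop internally; the gain for large inputs comes from the set lookup, which avoids comparing every key against every uid.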

python: Sum of products of values in two nested lists

I want the sum of products (sop) of the values of two nested lists.
Shift the second list to the left by one after each iteration.
Store the result of each sop in a list.
I have two lists like:
List1 = [[A,1],[B,2],[C,3]]
List2 = [[A,4],[B,5],[C,6]]
I am expecting this:
iteration1 ->
List1 = [[A,1],[B,2],[C,3]]
List2 = [[A,4],[B,5],[C,6]]
sop = (1*4)+(2*5)+(3*6) = 32
iteration2 ->
List1 = [[A,1],[B,2],[C,3]]
List2 = [[B,5],[C,6],[A,4]] #only second list shifts by one to the left
sop = (1*5)+(2*6)+(3*4) = 29
iteration3 ->
List1 = [[A,1],[B,2],[C,3]]
List2 = [[C,6],[A,4],[B,5]] #only second list shifts by one to the left
sop = (1*6)+(2*4)+(3*5) = 29
Resulting list should show the following:
resultlist = [32,29,29]
I am unable to figure out how to code this in Python. Can anyone please help me with this?
You should use a variable (offset) that determines how much B is rotated, rather than physically shifting a list. The modulus operation on the index can be used to simulate a circular list.
def products(A, B):
    out = []
    n = len(A)
    for offset in range(n):
        out.append(sum( A[i] * B[ (i + offset) % n ] for i in range(n)))
    return out
The input is assumed to be arrays of numbers e.g. products([1,2,3],[4,5,6])
You can use itertools.cycle over List2, skipping one element at the end of each loop:
from itertools import cycle
List1 = [['A',1],['B',2],['C',3]]
List2 = cycle([['A',4],['B',5],['C',6]])
resultlist = []
for _ in List1:
    resultlist.append(sum(a[1]*b[1] for a,b in zip(List1, List2)))
    next(List2) # skip one of the cycle
print(resultlist)
Output:
[32, 29, 29]
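If you do want to shift the second list physically, as described in the question, collections.deque has a rotate method for that. A short sketch (rotating_sums is a made-up name; both lists are assumed to have the same length):

```python
from collections import deque

def rotating_sums(list1, list2):
    """Sum of products of the second items, rotating list2 left each round."""
    rotating = deque(list2)
    sums = []
    for _ in range(len(list2)):
        sums.append(sum(a[1] * b[1] for a, b in zip(list1, rotating)))
        rotating.rotate(-1)  # negative argument rotates to the left
    return sums

List1 = [['A', 1], ['B', 2], ['C', 3]]
List2 = [['A', 4], ['B', 5], ['C', 6]]
print(rotating_sums(List1, List2))
```

With the lists from the question this yields the three sums 32, 29 and 29.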

divide list and generate series of new lists. one from each list and rest into other

I have three lists and want to generate two new lists from them. Can anyone please tell me how this can be done?
list1=[12,25,45], list2=[14,69], list3=[54,98,68,78,48]
I want to print the output like
chosen1=[12,14,54], rest1=[25,45,69,98,68,78,48]
chosen2=[12,14,98], rest2=[25,45,69,54,68,78,48]
and so on
(every possible combination for chosen list)
I have tried to write this, but I don't know how to continue:
list1=[12,25,45]
list2=[14,69]
list3=[54,98,68,78,48]
for i in xrange (list1[0],list1[2]):
    for y in xrange(list2[0], list2[1]):
        for z in xrange(list[0],list[4])
            for a in xrange(chosen[0],[2])
                chosed1.append()
            for a in xrange(chosen[0],[7])
                rest1.append()
Print rest1
Print chosen1
itertools.product generates every combination of selecting one item from each of several collections (the Cartesian product):
import itertools
list1 = [12,25,45]
list2 = [14,69]
list3 = [54,98,68,78,48]
for i,(a,b,c) in enumerate(itertools.product(list1,list2,list3),1):
    # Note: Computing rest this way will *not* work if there are duplicates
    # in any of the lists.
    rest1 = [n for n in list1 if n != a]
    rest2 = [n for n in list2 if n != b]
    rest3 = [n for n in list3 if n != c]
    rest = ','.join(str(n) for n in rest1+rest2+rest3)
    print('chosen{0}=[{1},{2},{3}], rest{0}=[{4}]'.format(i,a,b,c,rest))
Output:
chosen1=[12,14,54], rest1=[25,45,69,98,68,78,48]
chosen2=[12,14,98], rest2=[25,45,69,54,68,78,48]
chosen3=[12,14,68], rest3=[25,45,69,54,98,78,48]
chosen4=[12,14,78], rest4=[25,45,69,54,98,68,48]
chosen5=[12,14,48], rest5=[25,45,69,54,98,68,78]
chosen6=[12,69,54], rest6=[25,45,14,98,68,78,48]
chosen7=[12,69,98], rest7=[25,45,14,54,68,78,48]
chosen8=[12,69,68], rest8=[25,45,14,54,98,78,48]
chosen9=[12,69,78], rest9=[25,45,14,54,98,68,48]
chosen10=[12,69,48], rest10=[25,45,14,54,98,68,78]
chosen11=[25,14,54], rest11=[12,45,69,98,68,78,48]
chosen12=[25,14,98], rest12=[12,45,69,54,68,78,48]
chosen13=[25,14,68], rest13=[12,45,69,54,98,78,48]
chosen14=[25,14,78], rest14=[12,45,69,54,98,68,48]
chosen15=[25,14,48], rest15=[12,45,69,54,98,68,78]
chosen16=[25,69,54], rest16=[12,45,14,98,68,78,48]
chosen17=[25,69,98], rest17=[12,45,14,54,68,78,48]
chosen18=[25,69,68], rest18=[12,45,14,54,98,78,48]
chosen19=[25,69,78], rest19=[12,45,14,54,98,68,48]
chosen20=[25,69,48], rest20=[12,45,14,54,98,68,78]
chosen21=[45,14,54], rest21=[12,25,69,98,68,78,48]
chosen22=[45,14,98], rest22=[12,25,69,54,68,78,48]
chosen23=[45,14,68], rest23=[12,25,69,54,98,78,48]
chosen24=[45,14,78], rest24=[12,25,69,54,98,68,48]
chosen25=[45,14,48], rest25=[12,25,69,54,98,68,78]
chosen26=[45,69,54], rest26=[12,25,14,98,68,78,48]
chosen27=[45,69,98], rest27=[12,25,14,54,68,78,48]
chosen28=[45,69,68], rest28=[12,25,14,54,98,78,48]
chosen29=[45,69,78], rest29=[12,25,14,54,98,68,48]
chosen30=[45,69,48], rest30=[12,25,14,54,98,68,78]
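The note about duplicates can be addressed by taking the product over positions rather than values, so that only the chosen position is left out of rest. This is a sketch of that variation, not part of the answer above; choose_one_from_each is a hypothetical name.

```python
from itertools import product

def choose_one_from_each(*lists):
    """Yield (chosen, rest) for every way of picking one item per list.

    Works on indices, so duplicate values inside a list are handled.
    """
    index_ranges = [range(len(lst)) for lst in lists]
    for picks in product(*index_ranges):
        chosen = [lst[i] for lst, i in zip(lists, picks)]
        rest = [x
                for lst, i in zip(lists, picks)
                for j, x in enumerate(lst)
                if j != i]  # drop only the picked position
        yield chosen, rest

list1 = [12, 25, 45]
list2 = [14, 69]
list3 = [54, 98, 68, 78, 48]
for n, (chosen, rest) in enumerate(choose_one_from_each(list1, list2, list3), 1):
    print('chosen{0}={1}, rest{0}={2}'.format(n, chosen, rest))
```

Since it is a generator, the 30 combinations are produced one at a time instead of being stored up front.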
If you instead need every ordered selection of three values from all the lists merged together, along with the remaining values, then this would be a solution:
import itertools
list1 = [12,25,45]
list2 = [14,69]
list3 = [21,34,56,32]
chosen = []
leftover = []
mergedlist = list(set(list1 + list2 + list3))
mergedNewList = [x for x in itertools.permutations(mergedlist,3)]
for i,value in enumerate(mergedNewList):
    chosen.append(list(value))
    leftover.append([j for j in mergedlist if j not in chosen[i]])
    print(chosen[i])
    print(leftover[i])
I have collected the chosen values in a single list, chosen, and the remaining values in leftover, since storing them in lists is more Pythonic than creating a separate variable for each result.
