Sort list based on pattern

I'd like to know how I can easily generate a list based on the values/order of two other lists:
list_a = ['web1','web2','web3','web1','web4']
list_b = ['web2','web4','web1','web5','web1']
I'd like to retrieve the "list_b" list ordered by value from "list_a":
final = ['web1','web2','web1','web4','web5']
If an entry exists in list_b but not in list_a, the value is appended to the end of the list.
I'm not sure where to start. My initial thinking was to retrieve all the indexes with enumerate, [i for i, x in enumerate(mylist) if x==value], and then sort the list, but I'm having a hard time managing entries with multiple indexes (e.g. web1). Can anyone think of an easy way to achieve this?

An extremely simplistic way would be to iterate over list_a and, whenever an element is also found in list_b, remove it from list_b and append it to a new list. After the loop, all that remains in list_b are the elements that need to be added to the end of the result.
list_a = ['web1','web2','web3','web1','web4']
list_b = ['web2','web4','web1','web5','web1']
front = []
for ele in list_a:
    if ele in list_b:
        front.append(ele)
        list_b.remove(ele)
final = front + list_b
print(final)
Outputs:
['web1', 'web2', 'web1', 'web4', 'web5']
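Another, trickier way would be to use collections.Counter and a couple of list comprehensions, using the intersection and difference of the counters. Note that list_b must still hold its original contents here; the loop above removes the matched elements from it.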
from collections import Counter
cnt_a, cnt_b = Counter(list_a), Counter(list_b)
intersct = (cnt_a & cnt_b)
diff = (cnt_b - cnt_a)
final = [a for a in list_a if a in intersct] + [b for b in list_b if b in diff]
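The Counter version reproduces the expected output for the sample lists, but it keeps every occurrence in list_a of a shared value, even when list_b contains fewer copies. The following is a minimal sketch of a count-aware variant, assuming the result should contain exactly the elements of list_b; the function name order_by_pattern is made up for illustration.

```python
from collections import Counter


def order_by_pattern(pattern, items):
    """Return the elements of `items`, ordered by their appearance in `pattern`.

    Elements of `items` not consumed by `pattern` are appended at the end
    in their original order. `items` itself is not modified.
    """
    remaining = Counter(items)  # copies of each value still to be placed
    front = []
    for value in pattern:
        if remaining[value] > 0:
            front.append(value)
            remaining[value] -= 1
    tail = []
    for value in items:
        if remaining[value] > 0:
            tail.append(value)
            remaining[value] -= 1
    return front + tail


list_a = ['web1', 'web2', 'web3', 'web1', 'web4']
list_b = ['web2', 'web4', 'web1', 'web5', 'web1']
final = order_by_pattern(list_a, list_b)
print(final)  # ['web1', 'web2', 'web1', 'web4', 'web5']
```

Unlike the loop with list_b.remove, this leaves list_b untouched and avoids the repeated linear scans that `in` and `remove` perform on a list.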

Related

Compare two list of dictionaries and get difference

I am new to Python and want to compare two lists of dictionaries.
Below are the two lists of dictionaries I want to compare, based on the key "zrepcode" and the id, which is the number "1", "3", "4", ...
The code snippet is as follows:
List1 = [{"3":[{"period":"P13","value":10,"year":2022}],"zrepcode":"55"},{"1":[{"period":"P10","value":5,"year":2023}],"zrepcode":"55"}]
List2 = [{"1":[{"period":"P1","value":10,"year":2023},{"period":"P2","value":5,"year":2023}],"zrepcode":"55"},{"3":[{"period":"P1","value":4,"year":2023},{"period":"P2","value":7,"year":2023}],"zrepcode":"55"},{"4":[{"period":"P1","value":10,"year":2023}],"zrepcode":"55"}]
After the comparison, we need the dictionaries that are unique to List2:
res = [{"4":[{"period":"P1","value":10,"year":2023}],"zrepcode":"55"}]
This is the expected output, but I don't know how to get it.

Here is my solution:
list_1 = [
{"3":[{"period":"P13","value":10,"year":2022}],"zrepcode":"55"},
{"1":[{"period":"P10","value":5,"year":2023}],"zrepcode":"55"}
]
list_2 = [
{"1":[{"period":"P1","value":10,"year":2023},{"period":"P2","value":5,"year":2023}],"zrepcode":"55"},
{"3":[{"period":"P1","value":4,"year":2023},{"period":"P2","value":7,"year":2023}],"zrepcode":"55"},
{"4":[{"period":"P1","value":10,"year":2023}],"zrepcode":"55"}]
list_1_keys = [sorted(element.keys())[0] for element in list_1]
res = [element for element in list_2 if sorted(element.keys())[0] not in list_1_keys]
I don't think you need any check on the key zrepcode, because it is always the same.
Let me know if you need more explanation or details about the solution.
I hope this helps.
EDIT
Here is the solution if we take zrepcode into account:
list_1_couple = []
for element in list_1:
    keys = sorted(element.keys())
    list_1_couple.append([keys[0], element[keys[1]]])
res = []
for element in list_2:
    keys = sorted(element.keys())
    if [keys[0], element[keys[1]]] not in list_1_couple:
        res.append(element)
print(res)
You can probably clean up the code a bit, but at least it should work 😉
EDIT 2
If you prefer one-liners:
list_1_couple = [[sorted(element.keys())[0], element[sorted(element.keys())[1]]] for element in list_1 ]
res = [element for element in list_2 if [sorted(element.keys())[0], element[sorted(element.keys())[1]]] not in list_1_couple]
This will do the trick too.
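The same comparison can be sketched with a set of (id, zrepcode) tuples, which avoids relying on the id key sorting before "zrepcode". This assumes every dictionary has exactly one key besides "zrepcode"; the helper name record_key is made up for illustration, and the sample data is abbreviated from the question.

```python
def record_key(record):
    """Return (id_key, zrepcode) for a record with one id key plus "zrepcode"."""
    id_key = next(k for k in record if k != "zrepcode")
    return id_key, record["zrepcode"]


list_1 = [
    {"3": [{"period": "P13", "value": 10, "year": 2022}], "zrepcode": "55"},
    {"1": [{"period": "P10", "value": 5, "year": 2023}], "zrepcode": "55"},
]
list_2 = [
    {"1": [{"period": "P1", "value": 10, "year": 2023}], "zrepcode": "55"},
    {"3": [{"period": "P1", "value": 4, "year": 2023}], "zrepcode": "55"},
    {"4": [{"period": "P1", "value": 10, "year": 2023}], "zrepcode": "55"},
]

# Tuples are hashable, so membership tests use a set instead of scanning a list.
seen = {record_key(record) for record in list_1}
res = [record for record in list_2 if record_key(record) not in seen]
print(res)
```

With these sample lists, res holds only the dictionary whose id is "4".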

Relationship between elements of two list: how to exploit it in Python?

So here is my minimal working example:
import numpy as np
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# Then I want to have these items all shuffled around, so for instance
list2 = np.random.permutation(list2)
# Now here is my problem: I want to find out which element of the new list2 came from which element of list1. I am looking for something like this:
list1.index(something)
# Basically, given an element of list2, I want to find out where it came from in list1. I can't think of a simple way of doing this, but there must be one!
Can you please suggest an easy solution? This is a minimal working example; the main point is that I have a list, I apply some operation to its elements and assign the results to a new list. The items then get shuffled around, and I need to work out where they came from.

enumerate, as everyone said, is the best option, but there is an alternative if you know the mapping relation: write a function that does the opposite of the mapping (e.g. one that decodes if the original function encodes).
Then use decoded_list = list(map(decode_function, encoded_list)) to get a new list (in Python 3, map returns an iterator, hence the list call). By comparing this list with the original list, you can achieve your goal.
enumerate is better if you are certain that the encoded list was produced from the same list by encode_function within your own code.
However, if you are importing the new list from elsewhere, e.g. from a table on a website, my approach is the way to go.
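As a rough illustration of the decoding idea, here is a sketch for the 2**j mapping from the question. The names encode and decode, and the shuffled sample values, are assumptions for the example; the inverse uses int.bit_length, which only works for exact powers of two.

```python
list1 = [1, 2, 3, 4]


def encode(j):
    """The mapping applied to list1 in the question."""
    return 2 ** j


def decode(y):
    """Inverse of encode for exact powers of two (no floating-point maths)."""
    return y.bit_length() - 1


# Shuffled encoded values, e.g. imported from somewhere else.
encoded_list = [16, 2, 8, 4]

decoded_list = list(map(decode, encoded_list))
positions = [list1.index(x) for x in decoded_list]
print(decoded_list)  # [4, 1, 3, 2]
print(positions)     # [3, 0, 2, 1]
```

Each entry of positions is the index in list1 of the element that produced the corresponding entry of encoded_list.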

You could use a permutation list/index:
import numpy as np
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# Then I want to have these items all shuffled around, so for instance
index_list = range(len(list2))
index_list = np.random.permutation(index_list)
list3 = [list2[i] for i in index_list]
Then, given input_element:
answer = index_list[list3.index(input_element)]

Based on your code:
import numpy as np
# I have a list
list1 = [1,2,3,4]
# I do some operation on the elements of the list
list2 = [2**j for j in list1]
# make a record of index and value
index_list2 = list(enumerate(list2))
# Then I want to have these items all shuffled around, so for instance
index_list3 = np.random.permutation(index_list2)
idx, list3 = zip(*index_list3)
# get the position of element_input in list3, then look up that position in idx; that is the answer you want.
answer = idx[list3.index(element_input)]

def index3_to_1(index):
    y = list3[index]
    x = int(round(np.log2(y)))  # inverse of y = f(x) = 2**x; rounded to avoid floating-point error
    return list1.index(x)
This assumes that the operation used to build list2 is invertible. It also assumes that each element of list1 is unique.
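When the operation is not easily invertible, one alternative is to record where each result came from before shuffling. This is only a sketch: it assumes the computed values are unique, uses the standard library's random.shuffle in place of np.random.permutation, and the name source_index is made up for illustration.

```python
import random

list1 = [1, 2, 3, 4]
list2 = [2 ** j for j in list1]

# Remember the originating index of every result before shuffling.
# Assumes the values in list2 are unique (they are dictionary keys).
source_index = {value: i for i, value in enumerate(list2)}

list3 = list2[:]        # shuffle a copy so list2 stays intact
random.shuffle(list3)

for value in list3:
    i = source_index[value]
    print(value, "came from list1[%d] = %d" % (i, list1[i]))
```

The lookup works regardless of how list3 ends up ordered, because it goes through the value rather than the position.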

Matching elements between lists in Python - keeping location

I have two lists, both fairly long. List A contains a list of integers, some of which are repeated in list B. I can find which elements appear in both by using:
idx = set(list_A).intersection(list_B)
This returns a set of all the elements appearing in both list A and list B.
However, I would like to find a way to find the matches between the two lists and also retain information about the elements' positions in both lists. Such a function might look like:
def match_lists(list_A,list_B):
    .
    .
    .
    return match_A,match_B
where match_A would contain the positions of elements in list_A that had a match somewhere in list_B and vice-versa for match_B.
I can see how to construct such lists using a for-loop, however this feels like it would be prohibitively slow for long lists.
Regarding duplicates: list_B has no duplicates. If there is a duplicate in list_A, all of its matched positions should be returned as a list, so match_A would be a list of lists.

That should do the job :)
def match_list(list_A, list_B):
    intersect = set(list_A).intersection(list_B)
    interPosA = [[i for i, x in enumerate(list_A) if x == dup] for dup in intersect]
    interPosB = [i for i, x in enumerate(list_B) if x in intersect]
    return interPosA, interPosB
(Thanks to machine yearning for duplicate edit)

Use dicts or defaultdicts to store the unique values as keys that map to the indices they appear at, then combine the dicts:
from collections import defaultdict
def make_offset_dict(it):
    ret = defaultdict(list)  # Or set, the values are unique indices either way
    for i, x in enumerate(it):
        ret[x].append(i)
    return ret  # without this the function would return None
dictA = make_offset_dict(A)
dictB = make_offset_dict(B)
for k in dictA.keys() & dictB.keys():  # .viewkeys() on Python 2
    print(k, dictA[k], dictB[k])
This iterates A and B exactly once each so it works even if they're one-time use iterators, e.g. from a file-like object, and it works efficiently, storing no more data than needed and sticking to cheap hashing based operations instead of repeated iteration.
This isn't the solution to your specific problem, but it preserves all the information needed to solve your problem and then some (e.g. it's cheap to figure out where the matches are located for any given value in either A or B); you can trivially adapt it to your use case or more complicated ones.
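One possible adaptation to the match_lists signature from the question is sketched below. It assumes, as the question states, that list_B has no duplicates, and it relies on dicts preserving insertion order (Python 3.7+); the sample inputs are made up.

```python
from collections import defaultdict


def make_offset_dict(it):
    """Map each value to the list of positions where it occurs."""
    ret = defaultdict(list)
    for i, x in enumerate(it):
        ret[x].append(i)
    return ret


def match_lists(list_A, list_B):
    dictA = make_offset_dict(list_A)
    dictB = make_offset_dict(list_B)
    match_A, match_B = [], []
    # dicts keep insertion order (Python 3.7+), so results follow list_B order.
    for value, positions_B in dictB.items():
        if value in dictA:
            match_A.append(dictA[value])      # every position in list_A
            match_B.append(positions_B[0])    # list_B is assumed duplicate-free
    return match_A, match_B


match_A, match_B = match_lists([5, 7, 5, 9, 3], [3, 5, 8])
print(match_A)  # [[4], [0, 2]]
print(match_B)  # [0, 1]
```

Here match_A[k] lists the positions in list_A of the value found at position match_B[k] in list_B.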

How about this:
def match_lists(list_A, list_B):
    idx = set(list_A).intersection(list_B)
    A_indexes = []
    for i, element in enumerate(list_A):
        if element in idx:
            A_indexes.append(i)
    B_indexes = []
    for i, element in enumerate(list_B):
        if element in idx:
            B_indexes.append(i)
    return A_indexes, B_indexes

This only runs through each list once (requiring only one dict) and also works with duplicates in list_B:
def match_lists(list_A,list_B):
    da=dict((e,i) for i,e in enumerate(list_A))  # for duplicates in list_A, only the last position is kept
    for bi,e in enumerate(list_B):
        try:
            ai=da[e]
            yield (e,ai,bi) # element e is in position ai in list_A and bi in list_B
        except KeyError:
            pass

Try this:
def match_lists(list_A, list_B):
    match_A = {}
    match_B = {}
    for elem in list_A:
        if elem in list_B:
            match_A[elem] = list_A.index(elem)
            match_B[elem] = list_B.index(elem)
    return match_A, match_B

I need to make two lists the same

I have two quite long lists and I know that all of the elements of the shorter are contained in the longer, yet I need to isolate the elements in the longer list which are not in the shorter so that I can remove them individually from the dictionary I got the longer list from.
What I have so far is:
for e in range(len(lst_ck)):
    if lst_ck[e] not in lst_rk:
        del currs[lst_ck[e]]
        del lst_ck[e]
lst_ck is the longer list and lst_rk is the shorter, currs is the dictionary from which came lst_ck. If it helps, they are both lists of 3 digit keys from dictionaries.

Use sets to find the difference:
l1 = [1,2,3,4]
l2 = [1,2,3,4,6,7,8]
print(set(l2).difference(l1))
set([6, 7, 8]) # in l2 but not in l1
Then remove the elements:
diff = set(l2).difference(l1)
your_list[:] = [ele for ele in your_list if ele not in diff]
If your lists are very big, you may prefer a generator expression:
your_list[:] = (ele for ele in your_list if ele not in diff)
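Applied to the names in the question, the set difference could be used as sketched below. The contents of currs and lst_rk are made-up sample data; only the variable names come from the question.

```python
# Made-up sample data: currs is the dictionary the longer key list came from.
currs = {"101": "USD", "102": "EUR", "103": "GBP", "104": "JPY"}
lst_ck = list(currs)       # the longer list of keys
lst_rk = ["101", "103"]    # the shorter list of keys to keep

# Keys that are in the longer list but not in the shorter one.
diff = set(lst_ck).difference(lst_rk)

for key in diff:
    del currs[key]

# Rebuild the list in place instead of deleting while iterating over it.
lst_ck[:] = [key for key in lst_ck if key not in diff]

print(currs)   # {'101': 'USD', '103': 'GBP'}
print(lst_ck)  # ['101', '103']
```

Collecting the keys first and deleting afterwards avoids the index shifting that happens when items are removed from a list while looping over range(len(lst_ck)).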

If you don't care about multiple occurrences of the same item, use a set:
diff = set(lst_ck) - set(lst_rk)
If you do care, try this:
diff = [e for e in lst_ck if e not in lst_rk]

Modifying list elements based on key word of the element

I have many lists, and I want to perform some operations on specific elements of them. So if I have something like:
list1 = ['list1_itemA', 'list1_itemB', 'list1_itemC', 'list1_itemD']
list2 = ['list2_itemA', 'list2_itemC','list2_itemB']
What interests me is the item 'itemC' wherever it occurs in any of the lists; I need to isolate each element that contains itemC for further manipulation. I thought about sorting the lists so that itemC occupies the first index, which I could then access with list[0].
But in my case itemA, itemB, itemC and itemD are biological species names, and I don't know how to force a given list element to occupy the first index (that would be an element containing a certain string, e.g. 'cow' in my analysis or 'itemC' here). Is this possible with Python?

You can extract items containing "itemC" without ordering, or worrying how many there are, with a "generator expression":
itemCs = []
for lst in (list1, list2):
    itemCs.extend(item for item in lst if "itemC" in item)
This gives itemCs == ['list1_itemC', 'list2_itemC'].

If you're trying to save the lists with a specific string contained in the text, you can use:
parse_lists = [list1, list2, list3]
matching_lists = []
search_str = "itemC"
for thisList in parse_lists:
    if any(search_str in item for item in thisList):
        matching_lists.append(thisList)
This has the advantage that you don't need to hard-code your list name in all your list item strings, which I'm assuming you're doing now.
Also worth noting: changing elements of matching_lists changes the original (referenced) lists as well.
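The aliasing behaviour can be checked with a short sketch. The list contents are sample data in the style of the question, with list2 deliberately lacking an itemC entry.

```python
list1 = ['list1_itemA', 'list1_itemB', 'list1_itemC', 'list1_itemD']
list2 = ['list2_itemA', 'list2_itemB']   # sample list without itemC

matching_lists = [lst for lst in (list1, list2)
                  if any("itemC" in item for item in lst)]

# matching_lists holds references to the original lists, not copies.
matching_lists[0][0] = 'changed'

print(list1[0])                     # changed
print(matching_lists[0] is list1)   # True
```

If independent copies are wanted, appending list(thisList) instead of thisList would be one way to get them.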

>>> [x for y in [list1, list2] for x in y if "itemC" in x]
['list1_itemC', 'list2_itemC']
or
>>> [x for y in [list1, list2] for x in y if any(search_term in x for search_term in ["itemC"])]
['list1_itemC', 'list2_itemC']
