How to match and replace list elements in Python? - python

I have a list in Python with certain elements. I want to replace these elements with their corresponding elements from another list.
I want to have another list that relates elements in a list like:
x = ['red','orange','yellow','green','blue','purple','pink']
y = ['cherry','orange','banana','apple','blueberry','eggplant','grapefruit']
so that a new list is created from the elements of y that correspond to the given elements of x. So
r = ['green','red','purple']
will become
r = ['apple','cherry','eggplant']
Thanks

First create a mapping from one list to another:
my_mapping = dict(zip(x, y))
The zip part is mainly a formality: dict expects the argument to be a sequence of pairs, rather than a pair of sequences.
Then apply this mapping to every member of r using a list comprehension:
new_r = [my_mapping[elem] for elem in r]

Dictionaries are the best option for this. They are maps of key-value pairs so that if I index a dictionary with a key, it will return the value under that key.
Let's make a new dictionary using the zip function that takes two lists:
mapper = dict(zip(x,y))
Now, if we want to change a list r to its counterpart with elements from the new list:
r = [mapper[i] for i in r]
This takes each element in r and uses the dictionary to turn it into its counterpart.
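Putting both answers together, a minimal runnable sketch using the lists from the question could look like this. The `.get` fallback at the end is an assumption about how elements missing from x should be handled (here they are passed through unchanged); the question does not say what should happen in that case.

```python
x = ['red', 'orange', 'yellow', 'green', 'blue', 'purple', 'pink']
y = ['cherry', 'orange', 'banana', 'apple', 'blueberry', 'eggplant', 'grapefruit']
r = ['green', 'red', 'purple']

# Pair each element of x with the element of y at the same position.
mapper = dict(zip(x, y))

# Look up every element of r; raises KeyError if an element is not in x.
new_r = [mapper[colour] for colour in r]
print(new_r)  # ['apple', 'cherry', 'eggplant']

# Assumption: unknown elements should be kept as they are.
mixed = ['green', 'black']
safe = [mapper.get(colour, colour) for colour in mixed]
print(safe)  # ['apple', 'black']
```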

Related

Using bisect.insort to insert an item into a list, which is a part of a list of lists

I'm trying to manage a list of several lists and each one should be sorted at all times, but I can't insert items into each list separately.
I had a list of n lists
arr = [[]]*n
I was trying to use
bisect.insort(arr[b],c)
to insert element c into the b-th list while keeping it sorted, but it inserts c into every list in arr.
Later on I discovered that a regular insert behaves the same way if I use
arr[b].insert(idx,c)
Is there a possibility to change the structure of arr to continue using bisect.insort?
Apparently, when I do
arr = [[]]*n
it creates a list of n references to one empty list.
Instead I should have used
arr = [[] for i in range(n)]
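The difference between the two constructions can be checked directly. This is a small sketch; the names `shared` and `separate` and the value n = 3 are only illustrative.

```python
import bisect

n = 3

# [[]] * n repeats a reference to the same inner list n times.
shared = [[]] * n
bisect.insort(shared[0], 5)
print(shared)  # [[5], [5], [5]]

# A list comprehension builds a new inner list on every iteration.
separate = [[] for _ in range(n)]
bisect.insort(separate[1], 5)
bisect.insort(separate[1], 2)
bisect.insort(separate[0], 9)
print(separate)  # [[9], [2, 5], []]
```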

Problems with ordering elements according to their occurrences

I have to write a function that takes a string and a list of strings as input and returns a list of the indices of the strings that contain the given string. I have done that, but I also need to order the indices by the number of occurrences of the string in each of the strings. How can I do that? This is my code:
I added the 'count' under the 'if' to count the occurrences; how can I use it to order the indices accordingly?
You can add a list of counts in each string to your function,
def function(s,lst):
    l=[]
    counts = []
    for i in range(len(lst)):
        if s in lst[i]:
            counts += [lst[i].count(s)]
            l += [i]
    return l, counts
Here counts is a list in which each entry is the number of occurrences of s in the corresponding matching string from your input list. The function now returns two lists in a tuple: the first element is l and the second is counts. Note that i=-1 is redundant here, as i takes its values from the iterable made with range, and assigning a value to it before the loop doesn't change its loop value.
You can now sort the first list based on the second list using a line modified from this post,
out_fun = function(s,inp)
out = [x for x,_ in sorted(zip(out_fun[0],out_fun[1]), key = lambda x: x[1], reverse=True)]
inp is the list of strings, for example inp = ["hello", "cure", "access code"]. out_fun is the return tuple of two lists from the function function. s is the string of interest - here as in your original example it is 'c'.
This line first creates a sequence of tuples using zip, where the first element of each tuple comes from the list of indices and the second from the list of occurrences. It then sorts the tuples by their second element in reverse order (largest first). The list comprehension keeps only the first element of each tuple in the sorted result, which gives the index list in the new order.
If you have questions about this solution, feel free to ask. You have a Python 2.7 tag - in Python 3.X zip returns a zip object rather than a list, but sorted accepts any iterable, so the line above works unchanged; wrap the call in list(zip(...)) only if you need an actual list.
This is a more concise version of your program:
def function(s,lst):
    t = [(i,x.count(s)) for i,x in enumerate(lst) if s in x]
    return t
It uses a list comprehension to create and return a list of tuples t, where the first element of each tuple is the index of a string that contains s and the second is the count. This is not necessarily more efficient; that would need to be checked. But it's a clean one-liner that, at least to me, is more readable.
The list of tuples can then be sorted in a similar way to the previous program, based on the second tuple element, i.e. the count,
out_fun = function(s,inp)
out = [x for x,_ in sorted(out_fun, key = lambda x: x[1], reverse=True)]
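For reference, the counting and the sorting can be folded into one function. This is only a sketch; the name `indices_by_count` is made up for illustration and is not part of the original code.

```python
def indices_by_count(s, lst):
    """Return the indices of the strings in lst that contain s,
    ordered by how often s occurs in each string (most first)."""
    # (index, count) pairs for the strings that contain s at least once
    counted = [(i, text.count(s)) for i, text in enumerate(lst) if s in text]
    # Sort by the count; the sort is stable, so ties keep their input order.
    counted.sort(key=lambda pair: pair[1], reverse=True)
    return [i for i, _ in counted]


inp = ["hello", "cure", "access code"]
print(indices_by_count("c", inp))  # [2, 1]
```

Here "access code" contains 'c' three times and "cure" once, so index 2 comes before index 1, and "hello" is left out because it does not contain 'c'.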

Python: replace values of sublist, with values looked up from another sublist without indexing

Description
I have two lists of lists which are derived from CSVs (minimal working example below). The real dataset for this too large to do this manually.
mainlist = [["MH75","QF12",0,38], ["JQ59","QR21",105,191], ["JQ61","SQ48",186,284], ["SQ84","QF36",0,123], ["GA55","VA63",80,245], ["MH98","CX12",171,263]]
replacelist = [["MH75","QF12","BA89","QR29"], ["QR21","JQ59","VA51","MH52"], ["GA55","VA63","MH19","CX84"], ["SQ84","QF36","SQ08","JQ65"], ["SQ48","JQ61","QF87","QF63"], ["MH98","CX12","GA34","GA60"]]
mainlist contains pairs of identifiers (mainlist[x][0], mainlist[x][1]), and each pair is associated with two integers (mainlist[x][2] and mainlist[x][3]).
replacelist is a second list of lists which also contains the same pairs of identifiers (but not in the same order within a pair, or across rows). All sublist pairs are unique. Importantly, replacelist[x][2],replacelist[x][3] corresponds to a replacement for replacelist[x][0],replacelist[x][1], respectively.
I need to create a new third list, newlist which copies mainlist but replaces the identifiers with those from replacelist[x][2],replacelist[x][3]
For example, given:
mainlist[2] is: [JQ61,SQ48,186,284]
The matching pair in replacelist is
replacelist[4]: [SQ48,JQ61,QF87,QF63]
Therefore the expected output is
newlist[2] = [QF87,QF63,186,284]
More clearly put:
if replacelist = [[A, B, C, D]]
A is replaced with C, and B is replaced with D.
but it may appear in mainlist as [[B, A]]
Note that newlist keeps the same row order as mainlist.
Attempt
What has me totally stumped on a simple problem is I feel I can't use basic list comprehension [i for i in replacelist if i in mainlist] as the order within a pair changes, and if I sorted(list) then I lose information about what to replace the lists with. Current solution (with commented blanks):
newlist = []
for k in replacelist:
    for i in mainlist:
        if k[0] and k[1] in i:
            # retrieve mainlist order, then use some kind of indexing to check a series of nested if statements to work out positional replacement.
As you can see, this solution is clearly inefficient and I can't work out the best way to perform the final step in a few lines.
I can add more information if this is not clear
It helps to turn replacelist into a dict:
mainlist = [["MH75","QF12",0,38], ["JQ59","QR21",105,191], ["JQ61","SQ48",186,284], ["SQ84","QF36",0,123], ["GA55","VA63",80,245], ["MH98","CX12",171,263]]
replacelist = [["MH75","QF12","BA89","QR29"], ["QR21","JQ59","VA51","MH52"], ["GA55","VA63","MH19","CX84"], ["SQ84","QF36","SQ08","JQ65"], ["SQ48","JQ61","QF87","QF63"], ["MH98","CX12","GA34","GA60"]]
# key: the unordered pair of identifiers; value: identifier -> replacement
replacements = {frozenset(r[:2]):dict(zip(r[:2], r[2:])) for r in replacelist}
newlist = []
for *ids, val1, val2 in mainlist:
    reps = replacements[frozenset(ids)]
    newlist.append([reps[ids[0]], reps[ids[1]], val1, val2])
The first thing to do is transform both lists into dictionaries:
from collections import OrderedDict
maindct = OrderedDict((frozenset(item[:2]),item[2:]) for item in mainlist)
replacedct = {frozenset(item[:2]):item[2:] for item in replacelist}
# Now it is trivial to build the desired output list:
output_list = [replacedct[key] + maindct[key] for key in maindct]
The big win here is that a dictionary removes the cost of searching the replacement list for each pair. With a list you have to scan the whole list for every item, so the running time grows with the square of the list length. With Python dictionaries the lookup time is constant on average and does not depend on the size of the data.
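The two frozenset-based answers differ in one detail: the order of the replacement identifiers inside a row. The sketch below runs both variants on the question's data so the difference is visible. The names `per_identifier` and `row_order` are illustrative only, and which variant is wanted is an assumption the reader has to settle: the question's A/B/C/D rule matches the first, while its worked example for newlist[2] matches the second.

```python
mainlist = [["MH75", "QF12", 0, 38], ["JQ59", "QR21", 105, 191],
            ["JQ61", "SQ48", 186, 284], ["SQ84", "QF36", 0, 123],
            ["GA55", "VA63", 80, 245], ["MH98", "CX12", 171, 263]]
replacelist = [["MH75", "QF12", "BA89", "QR29"], ["QR21", "JQ59", "VA51", "MH52"],
               ["GA55", "VA63", "MH19", "CX84"], ["SQ84", "QF36", "SQ08", "JQ65"],
               ["SQ48", "JQ61", "QF87", "QF63"], ["MH98", "CX12", "GA34", "GA60"]]

# Variant 1: replace each identifier individually (A -> C, B -> D),
# keeping the order in which the identifiers appear in mainlist.
by_pair = {frozenset(row[:2]): dict(zip(row[:2], row[2:])) for row in replacelist}
per_identifier = []
for id1, id2, val1, val2 in mainlist:
    reps = by_pair[frozenset((id1, id2))]
    per_identifier.append([reps[id1], reps[id2], val1, val2])

# Variant 2: copy the replacement pair in the order it has in replacelist.
pair_to_new = {frozenset(row[:2]): row[2:] for row in replacelist}
row_order = [pair_to_new[frozenset(row[:2])] + row[2:] for row in mainlist]

print(per_identifier[2])  # ['QF63', 'QF87', 186, 284]
print(row_order[2])       # ['QF87', 'QF63', 186, 284]
```

Both variants use a frozenset as the dictionary key because it is hashable and ignores the order of the two identifiers.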

Put average of nested list values into new list

I have the following list:
x = [(27.3703703703704, 2.5679012345679, 5.67901234567901,
6.97530864197531, 1.90123456790123, 0.740740740740741,
0.440136054421769, 0.867718446601942),
(25.2608695652174, 1.73913043478261, 6.07246376811594,
7.3768115942029, 1.57971014492754, 0.710144927536232,
0.4875, 0.710227272727273)]
I'm looking for a way to get the average of each of the lists nested within the main list, and to create a new list of the averages. So in the case of the above list, the output would be something like:
[[26.315],[2.145],[5.87],etc...]
I would like to apply this formula regardless of the number of lists nested within the main list.
I assume you have a list of tuples of one-element lists and want the mean of the unpacked elements of each tuple, collected in a list. If that's not what you're looking for, this won't work.
result = [sum([sublst[0] for sublst in tup])/len(tup) for tup in x]
EDIT to match changed question
result = [sum(tup)/len(tup) for tup in x]
EDIT to match your even-further changed question
result = [[sum(tup)/len(tup)] for tup in x]
An easy way to achieve this is:
means = [] # Defines a new empty list
for sublist in x: # iterates over the tuples in your list
    means.append([sum(sublist)/len(sublist)]) # Put the mean of the sublist in the means list
This will work no matter how many sublists are in your list.
I would advise you read a bit on list comprehensions:
https://docs.python.org/2/tutorial/datastructures.html
It looks like you're looking for the zip function:
[sum(l)/len(l) for l in zip(*x)]
zip combines a collection of tuples or lists element by element: it groups the first items of all inputs, then the second items, and so on, which looks like what you want for your averages. Then you just use sum()/len() to compute the average of each group.
The *x notation means pass the list as though its items were individual arguments, i.e. as if you called: zip(x[0], x[1], ..., x[len(x)-1])
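The answers in this thread compute two different things, so a short side-by-side sketch may help. `column_means` and `row_means` are illustrative names; judging from the sample output in the question (a first value near 26.3), the position-wise version is probably the one being asked for, but that is an assumption.

```python
x = [(27.3703703703704, 2.5679012345679, 5.67901234567901,
      6.97530864197531, 1.90123456790123, 0.740740740740741,
      0.440136054421769, 0.867718446601942),
     (25.2608695652174, 1.73913043478261, 6.07246376811594,
      7.3768115942029, 1.57971014492754, 0.710144927536232,
      0.4875, 0.710227272727273)]

# Position-wise: average the first items of all tuples, then the second, ...
column_means = [sum(col) / len(col) for col in zip(*x)]

# Per tuple: one average for each nested tuple.
row_means = [sum(row) / len(row) for row in x]

print(len(column_means))  # 8
print(len(row_means))     # 2
print(round(column_means[0], 3))  # 26.316
```

To get one-element lists as in the question's sample output, wrap each value: `[[m] for m in column_means]`.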
r = [[sum(i)/len(i)] for i in x]

Fill a list in python with values from a smaller list by indexes contained in another list with the same size?

I have a list of indexes named g_index and another list of the same size, named fill_list, full of values. I also have a third list, P1, that is longer than g_index and fill_list. I want to create a new list the size of P1 that has the values of fill_list at the indexes given in g_index, with all the remaining values being None. Note that the items in fill_list fill the indexes in g_index in sequential order: the first item of fill_list goes to index 3 of the new list (I call it finalList in the code), and so on.
Here is what I have:
g_index=[3,6,2,83,7,100,5,1]
fill_list=["lk",3,6,9,"gh",4,7,34]
finalList=[]
for index in range(len(P1)):
    for j in range(len(g_index)):
        if index== g_index[j]:
            finalList.append(fill_list)
print(finalList)
In my code, the final list just keeps appending itself with values, which is not what I am looking for.
You can create a list with None (or anything else) and then map the values like this:
g_index=[3,6,2,83,7,100,5,1]
fill_list=["lk",3,6,9,"gh",4,7,34]
p1 = [None] * (max(g_index) + 1)
for k, v in enumerate(g_index):
    p1[v] = fill_list[k]
print(p1)
What about something like this:
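final_list = []
for i in range(len(P1)):
    if i in g_index:
        # append the item of fill_list that belongs to this index
        final_list.append(fill_list[g_index.index(i)])
    else:
        final_list.append(None)
In the above example, the appended value is a single item from fill_list, namely the one at the same position as i in g_index, rather than the whole fill_list.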
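A variant that avoids searching g_index on every iteration is to pre-fill the result with None and then assign by index. This is a sketch: the question never shows P1, so the `P1` below is a hypothetical stand-in whose only relevant property is its length, which must be greater than the largest index in g_index.

```python
g_index = [3, 6, 2, 83, 7, 100, 5, 1]
fill_list = ["lk", 3, 6, 9, "gh", 4, 7, 34]

# Hypothetical stand-in for P1; only len(P1) is used.
P1 = list(range(101))

# Start with None everywhere, then place each value at its target index.
final_list = [None] * len(P1)
for position, value in zip(g_index, fill_list):
    final_list[position] = value

print(final_list[:8])  # [None, 34, 6, 'lk', None, 7, 3, 'gh']
```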
