Python mapping values in a list by group

I have the following:
mylist = ['A','A','A','B','B','C']
colors = ['r','g','b','w','y']
I want all the same elements in mylist to get the same color, from the beginning of the color list, so the result would be like this:
result = ['r','r','r','g','g','b']
The colors w and y would be ignored. I can't seem to get the mapping working correctly.
I have tried:
result = [[y for y in colors if set(mylist) == x] for x in mylist]
Edit: to make it more clear, ['r','g','b','w','y'] doesn't always need to be mapped to ABCDEF... mylist could have been ['cat','cat','cat','dog','dog','bird']

You may first create the mapping as a dict, then use it to get the result
mylist = ['A', 'A', 'A', 'B', 'B', 'C']
colors = ['r', 'g', 'b', 'w', 'y']
mapping = dict(zip(
    sorted(set(mylist)),
    colors
))
print(mapping) # {'A': 'r', 'B': 'g', 'C': 'b'}
result = [mapping[l] for l in mylist]
print(result) # ['r', 'r', 'r', 'g', 'g', 'b']

If you don't care about the order of colours:
color_map = dict(zip(set(mylist), colors))
result = [color_map[item] for item in mylist]
If you care about the order of colours:
from collections import OrderedDict
color_map = OrderedDict(zip(OrderedDict((item, True) for item in mylist), colors))
result = [color_map[item] for item in mylist]
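On Python 3.7 and later, where plain dicts preserve insertion order, the order-preserving variant above can probably be written without OrderedDict. This is only a sketch under that version assumption; the helper name color_by_group is made up for illustration and is not from the answers.

```python
def color_by_group(items, colors):
    # dict.fromkeys keeps the first occurrence of each item, in order
    # (guaranteed for plain dicts from Python 3.7 on).
    groups = dict.fromkeys(items)
    # zip stops at the shorter input, so unused colors are ignored.
    color_map = dict(zip(groups, colors))
    return [color_map[item] for item in items]


colors = ['r', 'g', 'b', 'w', 'y']
print(color_by_group(['A', 'A', 'A', 'B', 'B', 'C'], colors))
# ['r', 'r', 'r', 'g', 'g', 'b']
print(color_by_group(['cat', 'cat', 'cat', 'dog', 'dog', 'bird'], colors))
# ['r', 'r', 'r', 'g', 'g', 'b']
```

If there are more distinct items than colors, the lookup raises KeyError for the items that received no color.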

You could use Counter to count how many times a value appears in your list.
Then use that mapping to fill your result list.
from collections import Counter
mylist = ['A','A','A','B','B','C']
colors = ['r','g','b','w','y']
result = []
for idx, (_, v) in enumerate(Counter(mylist).items()):
    result.extend(colors[idx] * v)
print(result)
Output:
['r', 'r', 'r', 'g', 'g', 'b']
Note: Requires Python 3.7 or later, otherwise the order of the dict is not guaranteed. This also applies to the other answers here that rely on dict ordering.
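One caveat with colors[idx] * v: it repeats a string, and extend then splits that string into characters, so it only works while every color is a single character. It also emits each group as one block, so the input has to be grouped already. A hedged sketch that repeats a one-element list instead (the multi-character color names are illustrative, not from the question):

```python
from collections import Counter

mylist = ['A', 'A', 'A', 'B', 'B', 'C']
colors = ['red', 'green', 'blue', 'white', 'yellow']  # illustrative names

result = []
for idx, count in enumerate(Counter(mylist).values()):
    # [color] * count repeats the whole string, not its characters.
    result.extend([colors[idx]] * count)
print(result)
# ['red', 'red', 'red', 'green', 'green', 'blue']
```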

For me the easiest way would be:
mylist = ['A', 'A', 'A', 'B', 'B', 'C']
colors = ['r', 'g', 'b', 'w', 'y']
result = []
for i, item in enumerate(sorted(set(mylist))):  # sets don't maintain order, so the items are sorted alphabetically
    result.extend(colors[i] * mylist.count(item))
print(result)

I would suggest using a dictionary instead to keep the mapping:
result = []
color_map = {}
idx = 0
for elt in mylist:
    if elt not in color_map:
        color_map[elt] = colors[idx]
        idx += 1
    result.append(color_map[elt])
This also avoids iterating over the colors list separately.

Related

python - form new list from n elements to the right of a reoccurring word

Given a list of strings:
haystack = ['hay','hay','hay','needle','x','y','z','hay','hay','hay','hay','needle','a','b','c']
Question
How would I form a new list of strings that contain, say, only the three adjacent elements (to the right) of every 'needle' occurrence within haystack?

Find all the indices of "needle" and take the 3 values to the right of each index.
# Get all indices of "needle"
idx = [idx for idx, val in enumerate(haystack) if val=="needle"]
#idx -> [3, 11]
# Take 3 values right of each index in `idx`.
[val for i in idx for val in haystack[i: i+4]]
# ['needle', 'x', 'y', 'z', 'needle', 'a', 'b', 'c']
# want it to be a list of list
[haystack[i: i+4] for i in idx]
# [['needle', 'x', 'y', 'z'], ['needle', 'a', 'b', 'c']]
# Want to exclude the "needle"
[val for i in idx for val in haystack[i+1: i+4]]
# ['x', 'y', 'z', 'a', 'b', 'c']

This is a kind of hacky solution, but it works with only one pass through the list.
it = iter(haystack)
output = [[next(it), next(it), next(it)] for s in it if s == 'needle']
# [['x', 'y', 'z'], ['a', 'b', 'c']]
This is essentially the short-form of the following:
it = iter(haystack)
output = []
while True:
    try:
        elem = next(it)
        if elem == 'needle':
            output.append([next(it), next(it), next(it)])
    except StopIteration:
        break
Note that, in the short form, you'll get a StopIteration error if there are fewer than three elements following a 'needle'.
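If a short tail after the last 'needle' should be tolerated rather than raise, itertools.islice can take "up to n" items from the same iterator. This is a sketch of that variation, not the answer's own code; the function name after_needle and its parameters are assumptions made for the example.

```python
from itertools import islice


def after_needle(haystack, needle='needle', n=3):
    it = iter(haystack)
    # islice pulls at most n further items from the shared iterator and
    # simply stops early if the input runs out.
    return [list(islice(it, n)) for s in it if s == needle]


haystack = ['hay', 'hay', 'hay', 'needle', 'x', 'y', 'z',
            'hay', 'hay', 'hay', 'hay', 'needle', 'a', 'b', 'c']
print(after_needle(haystack))
# [['x', 'y', 'z'], ['a', 'b', 'c']]
print(after_needle(['hay', 'needle', 'a']))
# [['a']]
```

As with the original one-pass version, a 'needle' that falls inside the n items following another 'needle' is consumed as data and does not start a new group.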

A simple list comprehension with list slicing seems to work as well:
out = [haystack[i+1:i+4] for i, x in enumerate(haystack) if x == 'needle']
Output:
[['x', 'y', 'z'], ['a', 'b', 'c']]

If I understood correctly, then you want this:
Out = []
for i in [i for i, ele in enumerate(haystack) if ele == "needle"]:
    Out.extend(haystack[i+1:i+4])
print(Out)
Output
['x', 'y', 'z', 'a', 'b', 'c']

complete list if the first and last element is equal

I have a problem trying to transform a list.
The original list is like this:
[['a','b','c',''],['c','e','f'],['c','g','h']]
now I want to make the output like this:
[['a','b','c','e','f'],['a','b','c','g','h']]
When a list ends with the blank ( '' ), it should be merged with each list that starts with the element before the blank, so the three lists become two.
I need to write a function to do this for me.
Here is what I tried:
for x in mylist:
    if x[len(x) - 1] == '':
        m = x[len(x) - 2]
        for y in mylist:
            if y[0] == m:
                combine(x, y)
def combine(x, y):
    for m in y:
        if not m in x:
            x.append(m)
    return(x)
but it's not working the way I want.

Try this:
mylist = [['a','b','c',''],['c','e','f'],['c','g','h']]
def combine(x, y):
    for m in y:
        if m not in x:
            x.append(m)
    return x
result = []
for x in mylist:
    if x[len(x) - 1] == '':
        m = x[len(x) - 2]
        for y in mylist:
            if y[0] == m:
                result.append(combine(x[0:len(x)-2], y))
print(result)
Your problem was with the call to combine: pass a slice of x without its last two elements, and keep the value it returns:
combine(x[0:len(x)-2], y)
Output:
[['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]

So you basically want to merge 2 lists? If so, you can do it in one of two ways: either use the + operator or use the extend() method. Then put that into a function.

I made it with the standard library only, with comments. Please refer to it.
mylist = [['a','b','c',''],['c','e','f'],['c','g','h']]
# I can't be sure whether only one list ends with '',
# so this collects all of them.
# It also shows how to get the last value of a list with [-1].
xlist = [x for x in mylist if x[-1] == '']
ylist = [x for x in mylist if x[-1] != '']
result = []
# combine every x with every y
for x in xlist:
    for y in ylist:
        c = x + y                 # merge
        c = [i for i in c if i]   # drop ''
        c = list(set(c))          # drop duplicates
        c.sort()                  # sort
        result.append(c)          # add to result
print(result)
The result is
[['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]

Your code almost works, except you never do anything with the result of combine (print it, or add it to some result list), and you do not remove the '' element. However, for a longer list, this might be a bit slow, as it has quadratic complexity O(n²).
Instead, you can use a dictionary to map first elements to the remaining elements of the lists. Then you can use a loop or list comprehension to combine the lists with the right suffixes:
lst = [['a','b','c',''],['c','e','f'],['c','g','h']]
import collections
replacements = collections.defaultdict(list)
for first, *rest in lst:
    replacements[first].append(rest)
# l[:-1] drops only the trailing '' and keeps the key element l[-2]
result = [l[:-1] + c for l in lst if l[-1] == "" for c in replacements[l[-2]]]
# [['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]
If the list can have more than one placeholder '', and if those can appear in the middle of the list, then things get a bit more complicated. You could make this a recursive function. (This could be made more efficient by using an index instead of repeatedly slicing the list.)
def replace(lst, last=None):
    if lst:
        first, *rest = lst
        if first == "":
            for repl in replacements[last]:
                yield from replace(repl + rest)
        else:
            for res in replace(rest, first):
                yield [first] + res
    else:
        yield []
for l in lst:
    for x in replace(l):
        print(x)
Output for lst = [['a','b','c','','b',''],['c','b','','e','f'],['c','g','b',''],['b','x','y']]:
['a', 'b', 'c', 'b', 'x', 'y', 'e', 'f', 'b', 'x', 'y']
['a', 'b', 'c', 'g', 'b', 'x', 'y', 'b', 'x', 'y']
['c', 'b', 'x', 'y', 'e', 'f']
['c', 'g', 'b', 'x', 'y']
['b', 'x', 'y']
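Since the question asks for a function, the dictionary-based idea above (map each list's first element to its remaining elements) could be wrapped up for the simple case of a single trailing placeholder. This is a minimal sketch; the name complete_lists and the placeholder parameter are assumptions, and every inner list is assumed to be non-empty.

```python
from collections import defaultdict


def complete_lists(lists, placeholder=''):
    # Map each list's first element to the rest of that list.
    suffixes = defaultdict(list)
    for first, *rest in lists:
        suffixes[first].append(rest)

    result = []
    for lst in lists:
        if len(lst) >= 2 and lst[-1] == placeholder:
            prefix = lst[:-1]  # everything except the trailing placeholder
            # The last real element is the key that selects the suffixes.
            for rest in suffixes.get(prefix[-1], []):
                result.append(prefix + rest)
    return result


mylist = [['a', 'b', 'c', ''], ['c', 'e', 'f'], ['c', 'g', 'h']]
print(complete_lists(mylist))
# [['a', 'b', 'c', 'e', 'f'], ['a', 'b', 'c', 'g', 'h']]
```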

Try my solution.
It changes the order of the list, but the code is quite simple:
lst = [['a', 'b', 'c', ''], ['c', 'e', 'f'], ['c', 'g', 'h']]
lst[0].pop(-1)
print([list(set(lst[0]+lst[1])), list(set(lst[0]+lst[2]))])

How to delete items repeated less than k in a list

In a Python list, I want to delete all elements that occur fewer than k times.
For example, if k == 3 and our list is:
l = [a,b,c,c,c,a,d,e,e,d,d]
then the output must be:
[c,c,c,d,d,d]
What is a fast way to do that (my data is large)? Any good Pythonic suggestion?
This is what I coded, but I don't think it is the fastest or most Pythonic way:
from collections import Counter
l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']
counted = Counter(l)
temp = []
for i in counted:
    if counted[i] < 3:
        temp.append(i)
new_l = []
for i in l:
    if i not in temp:
        new_l.append(i)
print(new_l)

You can use collections.Counter to construct a dictionary mapping values to counts. Then use a list comprehension to keep only the values whose count is at least a specified value.
from collections import Counter
L = list('abcccadeedd')
c = Counter(L)
res = [x for x in L if c[x] >=3]
# ['c', 'c', 'c', 'd', 'd', 'd']

A brute-force option would be to get the number of occurrences per item, then filter that output. The collections.Counter object works nicely here:
from collections import Counter
l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']
c = Counter(l)
# Counter looks like {'a': 2, 'b': 1, 'c': 3...}
l = [item for item in l if c[item] >= 3]
Under the hood, Counter acts as a dictionary, which you can build yourself like so:
c = {}
for item in l:
    # This will check if item is in the dictionary
    # if it is, add to current count, if it is not, start at 0
    # and add 1
    c[item] = c.get(item, 0) + 1
# And the rest of the syntax follows from here
l = [item for item in l if c[item] >= 3]

I would use a Counter from collections:
from collections import Counter
count_dict = Counter(l)
[el for el in l if count_dict[el]>2]

Any drawback with this option?
l = ['a','b','c','c','c','a','d','e','e','d','d']
res = [ e for e in l if l.count(e) >= 3]
#=> ['c', 'c', 'c', 'd', 'd', 'd']
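One drawback worth noting: l.count(e) scans the whole list once for every element, so the comprehension does quadratic work, which matters for the large data mentioned in the question. Counter tallies everything in a single pass. A small sketch (keep_frequent is a hypothetical helper name) that also checks both approaches give the same result:

```python
from collections import Counter


def keep_frequent(items, k):
    counts = Counter(items)  # one pass over the data
    return [x for x in items if counts[x] >= k]


l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']
# Same result as the count()-based version, without rescanning the list.
assert keep_frequent(l, 3) == [e for e in l if l.count(e) >= 3]
print(keep_frequent(l, 3))
# ['c', 'c', 'c', 'd', 'd', 'd']
```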

Removing every third item from list (Deleting entries at regular interval in list)

I want to remove every 3rd item from list.
For Example:
list1 = list(['a','b','c','d','e','f','g','h','i','j'])
After removing indexes which are multiple of three the list will be:
['a','b','d','e','g','h','j']
How can I achieve this?

You may use enumerate():
>>> x = ['a','b','c','d','e','f','g','h','i','j']
>>> [i for j, i in enumerate(x) if (j+1)%3]
['a', 'b', 'd', 'e', 'g', 'h', 'j']
Alternatively, you may create the copy of list and delete the values at interval. For example:
>>> y = list(x) # where x is the list mentioned in above example
>>> del y[2::3] # y[2::3] = ['c', 'f', 'i']
>>> y
['a', 'b', 'd', 'e', 'g', 'h', 'j']

[v for i, v in enumerate(list1) if (i + 1) % 3 != 0]
It seems like you want the third item in the list, which is actually at index 2, gone. This is what the +1 is for.
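Both approaches generalize from "every third" to "every n-th" item. A sketch using the slice deletion on a copy; drop_every_nth is an illustrative name, and n is assumed to be at least 1:

```python
def drop_every_nth(items, n):
    result = list(items)  # copy, so the caller's list is left untouched
    # Positions n-1, 2n-1, ... are the n-th, 2n-th, ... items (1-based).
    del result[n - 1::n]
    return result


list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
print(drop_every_nth(list1, 3))
# ['a', 'b', 'd', 'e', 'g', 'h', 'j']
```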

Python Remove SOME duplicates from a list while maintaining order?

I want to remove certain duplicates in my python list.
I know there are ways to remove all duplicates, but I wanted to remove only consecutive duplicates, while maintaining the list order.
For example, I have a list such as the following:
list1 = [a,a,b,b,c,c,f,f,d,d,e,e,f,f,g,g,c,c]
However, I want to remove the duplicates, and maintain order, but still keep the 2 c's and 2 f's, such as this:
wantedList = [a,b,c,f,d,e,f,g,c]
So far, I have this:
z = 0
j=0
list2=[]
for i in list1:
    if i == "c":
        z = z+1
        if (z==1):
            list2.append(i)
        if (z==2):
            list2.append(i)
        else:
            pass
    elif i == "f":
        j = j+1
        if (j==1):
            list2.append(i)
        if (j==2):
            list2.append(i)
        else:
            pass
    else:
        if i not in list2:
            list2.append(i)
However, this method gives me something like:
wantedList = [a,b,c,c,d,e,f,f,g]
Thus, not maintaining the order.
Any ideas would be appreciated! Thanks!

Not completely sure if c and f are special cases, or if you want to compress consecutive duplicates only. If it is the latter, you can use itertools.groupby():
>>> import itertools
>>> list1
['a', 'a', 'b', 'b', 'c', 'c', 'f', 'f', 'd', 'd', 'e', 'e', 'f', 'f', 'g', 'g', 'c', 'c']
>>> [k for k, g in itertools.groupby(list1)]
['a', 'b', 'c', 'f', 'd', 'e', 'f', 'g', 'c']

To remove consecutive duplicates from a list, you can use the following generator function:
def remove_consecutive_duplicates(a):
    last = None
    for x in a:
        if x != last:
            yield x
        last = x
With your data, this gives:
>>> list1 = ['a','a','b','b','c','c','f','f','d','d','e','e','f','f','g','g','c','c']
>>> list(remove_consecutive_duplicates(list1))
['a', 'b', 'c', 'f', 'd', 'e', 'f', 'g', 'c']
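One edge case with starting from last = None: if the input itself begins with None, that first element compares equal to the initial value and is dropped. A possible variation, sketched here with a private sentinel object instead (the sentinel is an addition for illustration, not part of the answer above):

```python
def remove_consecutive_duplicates(items):
    sentinel = object()  # unique marker that cannot appear in the input
    last = sentinel
    for x in items:
        # Always yield the first element; afterwards only on a change.
        if last is sentinel or x != last:
            yield x
        last = x


print(list(remove_consecutive_duplicates([None, None, 1, 1, None])))
# [None, 1, None]
```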

If you want to ignore certain items when removing duplicates...
list2 = []
for item in list1:
    if item not in list2 or item in ('c','f'):
        list2.append(item)
EDIT: Note that this doesn't remove consecutive duplicates of the exempted items ('c' and 'f').

EDIT
Never mind, I read your question wrong. I thought you were wanting to keep only certain sets of doubles.
I would recommend something like this. It allows a general form to keep certain doubles once.
list1 = ['a','a','b','b','c','c','f','f','d','d','e','e','f','f','g','g','c','c']
doubleslist = ['c', 'f']
def remove_duplicate(firstlist, doubles):
    doubles = list(doubles)  # work on a copy so the caller's list is not modified
    newlist = []
    for x in firstlist:
        if x not in newlist:
            newlist.append(x)
        elif x in doubles:
            newlist.append(x)
            doubles.remove(x)
    return newlist
print(remove_duplicate(list1, doubleslist))

The simple solution is to compare this element to the next or previous element
a=1
b=2
c=3
d=4
e=5
f=6
g=7
list1 = [a,a,b,b,c,c,f,f,d,d,e,e,f,f,g,g,c,c]
output_list = [list1[0]]
for ctr in range(1, len(list1)):
    if list1[ctr] != list1[ctr-1]:
        output_list.append(list1[ctr])
print(output_list)

list1 = ['a', 'a', 'b', 'b', 'c', 'c', 'f', 'f', 'd', 'd', 'e', 'e', 'f', 'f', 'g', 'g', 'c', 'c']
wantedList = []
for item in list1:
    if len(wantedList) == 0:
        wantedList.append(item)
    elif len(wantedList) > 0:
        if wantedList[-1] != item:
            wantedList.append(item)
print(wantedList)
Fetch each item from the main list (list1).
If wantedList is empty, add the item.
If not, check whether the last item in wantedList differs from the item fetched from list1.
If the items are different, append the item to wantedList.
