I have a list that looks like this:
lst = [(1,'X1', 256),(1,'X2', 356),(2,'X3', 223)]
The first item of each tuple is an ID, and I want to merge the items of the tuples that share the same ID.
For example, I want the list to look like this:
lst = [(1,('X1','X2'),(256,356)),(2,'X3',223)]
What is the easiest way to do this?
I have tried some solutions based on my own logic, but they did not work out.
Use a dictionary whose keys are the IDs, so you can combine all the elements with the same ID.
from collections import defaultdict
lst = [(1,'X1', 256),(1,'X2', 356),(2,'X3', 223)]
result_dict = defaultdict(lambda: [[], []])
for id, item1, item2 in lst:
    result_dict[id][0].append(item1)
    result_dict[id][1].append(item2)
result = [(id, *map(tuple, vals)) for id, vals in result_dict.items()]
print(result)
Output is:
[(1, ('X1', 'X2'), (256, 356)), (2, ('X3',), (223,))]
This can be done with a single-line list comprehension (after obtaining a set of ids), and for a general case of having multiple fields other than the id (and not just 2 other fields):
lst = [(1,'X1', 256),(1,'X2', 356),(2,'X3', 223)]
ids = {x[0] for x in lst}
result = [(id, list(zip(*[x[1:] for x in lst if x[0] == id]))) for id in ids]
print(result)
# [(1, [('X1', 'X2'), (256, 356)]), (2, [('X3',), (223,)])]
So there's no need to go through a dictionary stage which is then turned into a list, and there's also no need to hardcode indexing of two elements and other such limitations.
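For comparison, here is a minimal sketch of how the dictionary approach can also handle any number of fields in a single pass over the list. The comprehension above rescans the whole list once per ID and iterates over a set, whose order is not guaranteed. The names group_by_id and unwrap_singletons are made up for illustration. The flag only exists to reproduce the asker's desired output, where an ID with a single row keeps its bare values.

```python
from collections import defaultdict

def group_by_id(rows, unwrap_singletons=False):
    """Group tuples by their first element (hypothetical helper)."""
    grouped = defaultdict(list)
    for key, *fields in rows:
        grouped[key].append(fields)

    result = []
    for key, members in grouped.items():
        # zip(*members) transposes the rows into one tuple per field
        columns = list(zip(*members))
        if unwrap_singletons and len(members) == 1:
            # assumption: a lone row should keep its bare values
            columns = [column[0] for column in columns]
        result.append((key, *columns))
    return result

lst = [(1, 'X1', 256), (1, 'X2', 356), (2, 'X3', 223)]
print(group_by_id(lst))
# [(1, ('X1', 'X2'), (256, 356)), (2, ('X3',), (223,))]
print(group_by_id(lst, unwrap_singletons=True))
# [(1, ('X1', 'X2'), (256, 356)), (2, 'X3', 223)]
```

Because a defaultdict keeps insertion order (Python 3.7+), the IDs come out in the order they first appear in the input.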
I have a question about grouping repeated list values into a single value. For example, I have this list:
data_list = [A,A,B,B,B,C,C,C,C]
then I want to make it into this
data_list = [A, B, C]
I have tried using itertools.groupby, but I still cannot get the result I want:
from itertools import groupby
data_list = [A,A,B,B,B,C,C,C,C]
data_group = [(key, len(list(group))) for key, group in groupby(data_list)]
print(data_group)
the expected output is data_group = [A, B, C]
the actual result is data_group = [(A, 2), (B, 3), (C, 4)]
Method 1
You can use NumPy to get the unique values (np.unique returns them sorted):
import numpy as np
data_list = np.array(['A','A','B','B','B','C','C','C','C'])
np.unique(data_list)
Method 2
You can use a set to get the unique values, but a set is unordered, so the result may not keep the original order.
new_list = list( set(data_list) )
new_list
I hope it may help you.
Try with this code
mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)
You can also use OrderedDict to keep the original order (plain dicts only guarantee insertion order from Python 3.7 on):
from collections import OrderedDict
mylist = ['A','A','B','B','B','C','C','C','C']
mylist = list(OrderedDict.fromkeys(mylist))
print(mylist)
Have you tried looking into sets?
You can first convert your original data_list into a set using set(data_list), then convert that back into a list.
data_list = [A,A,B,B,B,C,C,C,C]
print(list(set(data_list)))
# Output (the order may vary, since sets are unordered):
['A', 'B', 'C']
A set only holds unique values, which is why running set() on your data_list leaves only the unique values. Sets in Python are written with curly brackets { }, like dicts, but they do not contain key:value pairs. The list() function converts the set back into a list so you can treat it as a list afterwards.
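One small caveat about the curly-bracket notation, shown as a quick sketch: an empty pair of braces creates a dict, not a set, so an empty set has to be written as set(). The variable names below are only for illustration.

```python
empty_braces = {}              # this is an empty dict
empty_set = set()              # this is an empty set
letters = {'A', 'A', 'B'}      # duplicates are dropped on creation

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(letters == {'A', 'B'})        # True
```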
A good idea is to use python sets.
Per documentation, a part of the description is:
"A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries."
For example:
my_list = [1,1,2,2,3,3]
my_set = set(my_list)
print(my_set)
print(type(my_set))
This will output:
{1, 2, 3}
<class 'set'>
Note that the resulting data type is set.
So, if you want your result to be a list, you can cast it back into one:
unique_values = list(set(my_list))
And if you are planning to use that a lot in your code, a function would help:
def giveUnique(x):
    return list(set(x))
my_list = giveUnique(my_list)
This replaces my_list with a list containing only the unique values.
Just adapt the itertools.groupby solution you have (found?) to only use the key:
>>> data_list = [A, A, B, B, B, C, C, C, C] # with A, B, C = "ABC"
>>> [(key, len(list(group))) for key, group in groupby(data_list)]
[('A', 2), ('B', 3), ('C', 4)]
>>> [key for key, group in groupby(data_list)]
['A', 'B', 'C']
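One thing to keep in mind with this approach: groupby only merges runs of equal neighbouring items, so it matches the expected output here because the equal values are adjacent. The sketch below uses a made-up input where 'A' appears again later, to contrast it with dict.fromkeys, which keeps the first occurrence of every value.

```python
from itertools import groupby

data = ['A', 'A', 'B', 'A', 'C', 'C']  # illustrative input, 'A' is not contiguous

# groupby collapses runs of equal neighbours only
consecutive = [key for key, _group in groupby(data)]

# dict.fromkeys keeps the first occurrence of each value, in order
unique = list(dict.fromkeys(data))

print(consecutive)  # ['A', 'B', 'A', 'C']
print(unique)       # ['A', 'B', 'C']
```

If the data is not already grouped, sorting it first or using dict.fromkeys is probably closer to what the question asks for.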
I have a list of tuples (x, ind) where x is the item and ind is its target index in the resulting list. The list is in random order, but it can be assumed that if there are N items in the list, the values of ind in the tuples will be in [0,N) without repetition (i.e. all the valid indices will exist exactly once). How do I get a list where each tuple's position is ind?
Please do not confuse with the many existing answers of how to sort by key.
Obviously, sorting by the ind key is easy, but that adds an unnecessary O(n*logn) cost to what should be an O(n) operation, given the aforementioned assumption about the ind values.
So:
l = [('item1',1), ('item0',0), ('item2',2), ('item4',4), ('item3',3)]
l2 = magic_rearrange(l, key=lambda x: x[1])
print(l2)
Should give:
[('item0',0), ('item1',1), ('item2',2), ('item3',3), ('item4',4)]
Assuming your indices are unique, here's one way. You can initialise a new list and just insert elements in their right position.
def magic_rearrange(l1):
    l2 = [None] * len(l1)
    for i in l1:
        l2[i[1]] = i
    return l2
And a demo:
>>> l = [('item1',1), ('item0',0), ('item2',2), ('item4',4), ('item3',3)]
>>> magic_rearrange(l)
[('item0', 0), ('item1', 1), ('item2', 2), ('item3', 3), ('item4', 4)]
There's a quicker way to do this, if you use numpy's fancy indexing.
import numpy as np
def magic_rearrange(l1):
    l2 = np.repeat(None, len(l1))
    l2[[x[1] for x in l1]] = l1
    return l2
And a demo:
>>> magic_rearrange(l)
array([('item0', 0), ('item1', 1), ('item2', 2), ('item3', 3), ('item4', 4)], dtype=object)
Create the list first and then replace:
def magic_rearrange(l, key):
    # copy the list so the original is not changed
    new_list = list(l)
    # place each element at its target index in the copy
    for original_index, new_index in enumerate(map(key, l)):
        new_list[new_index] = l[original_index]
    return new_list
Here you go.
def magic_rearrange(input_list, key=lambda x: x):
    result_list = [None] * len(input_list)
    for p in input_list:
        result_list[key(p)] = p
    return result_list
We just create a list of the desired length, and then put each element in its place.
The order of operations can be arbitrary, but each element will eventually get to its position in the resulting list.
This is O(N) if copying a single list element and obtaining the key are both O(1).
The key = lambda x: x is for the default order, which is comparing the whole elements (however useless since the result is just list(range(N))).
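All of these versions rely on the question's assumption that the indices form a permutation of 0..N-1. As a hedged sketch, the variant below (the name rearrange_checked is hypothetical) keeps the same O(N) placement but raises ValueError if an index is out of range or used twice, instead of silently overwriting an element or leaving a None behind.

```python
def rearrange_checked(items, key):
    """Place each item at index key(item), validating the indices."""
    n = len(items)
    result = [None] * n
    filled = [False] * n  # tracks which slots are already taken
    for item in items:
        i = key(item)
        if not 0 <= i < n or filled[i]:
            raise ValueError(f"index out of range or duplicated: {i!r}")
        result[i] = item
        filled[i] = True
    return result

items = [('item1', 1), ('item0', 0), ('item2', 2), ('item4', 4), ('item3', 3)]
print(rearrange_checked(items, key=lambda x: x[1]))
# [('item0', 0), ('item1', 1), ('item2', 2), ('item3', 3), ('item4', 4)]
```

The extra list of flags costs O(N) memory. If the assumption is guaranteed by construction, the unchecked versions above are simpler.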
I have read several posts on the question "how to flatten lists of lists of lists ....". And I came up with this solution:
points = [[[(6,3)],[]],[[],[]]]
from itertools import chain
list(chain.from_iterable(points))
However my list looks sometimes like this:
[[[(6,3)],[]],[[],[]]]
Not sure if it is correct but I hope you understand.
The point is that the leaf element is a tuple, and when I call the code above it also unpacks the tuple and just returns [6,3].
So what could I do to get just [(6,3)]?
How about this,
lists = [[[(6,3)],[]],[[],[]]]
r = [t for sublist in lists for l in sublist for t in l]
print(r)
# [(6, 3)]
Maybe it's not the best solution, but it works fine:
def flat(array):
    result = []
    for item in array:
        if isinstance(item, list):
            # recurse into nested lists; tuples are kept as leaves
            result.extend(flat(item))
        else:
            result.append(item)
    return result
print(flat([[[(6,3)],[]],[[],[]]]))
and the result is:
[(6, 3)]
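The same recursive idea can also be written as a generator, which avoids building intermediate lists at every level. This is only a sketch: the name flatten is made up, and it assumes that only list instances should be descended into, so tuples (and strings) are yielded whole.

```python
def flatten(nested):
    """Yield the leaves of arbitrarily nested lists, leaving tuples intact."""
    for element in nested:
        if isinstance(element, list):
            yield from flatten(element)  # descend into sub-lists only
        else:
            yield element

points = [[[(6, 3)], []], [[], []]]
print(list(flatten(points)))  # [(6, 3)]
```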
If I have the list:
list1 = [(12, "AB", "CD"), (13, "EF", "GH"), (14, "IJ", "KL")]
I want to get the index of the group that has the value 13 in it:
if 13 in list1[0]:
    idx = list1.index(13)
    item = list1[idx]
    print str(item)
[13, EF, GH]
When I try this, I keep getting "Index not in list", even though it is passing the if statement because it is finding the value 13 within the list.
You can use next and enumerate:
>>> list1 = [(12, "AB", "CD"), (13, "EF", "GH"), (14, "IJ", "KL")]
>>> next(i for i,x in enumerate(list1) if 13 in x)
1
With a simple for-loop:
>>> for i, item in enumerate(list1):
...     if 13 in item:
...         print(i)
...         break
...
1
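For context, list1.index(13) fails because index compares 13 against each whole tuple, and no tuple is equal to 13. Note also that next raises StopIteration when nothing matches. A small sketch that wraps the same generator expression and supplies a default instead (index_containing is a hypothetical helper name):

```python
list1 = [(12, "AB", "CD"), (13, "EF", "GH"), (14, "IJ", "KL")]

def index_containing(rows, value, default=None):
    """Return the index of the first row containing value, else default."""
    return next((i for i, row in enumerate(rows) if value in row), default)

print(index_containing(list1, 13))  # 1
print(index_containing(list1, 99))  # None
```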
Update:
If the first item in each tuple is unique and you're doing this multiple times, create a dict first. Dicts provide O(1) lookup, while searching a list is O(N).
>>> list1 = [(12, "AB", "CD"), (13, "EF", "GH"), (14, "IJ", "KL")]
>>> dic = {x[0]:x[1:] for x in list1}
Accessing items:
>>> dic[12]
('AB', 'CD')
>>> dic[14]
('IJ', 'KL')
>>> # checking key existence
>>> if 17 in dic:
...     pass  # the key exists in dic, so do something with it
Given the added criterion from the comment, "I really don't care where they are in the list", the task becomes much easier and far more obvious:
def get_ids(id, tuple_list):
    """Return the members of tuple_list whose first element is id."""
    return [x for x in tuple_list if x[0] == id]
This isn't as expensive as one might expect if you recall that tuples are immutable objects. When the interpreter builds the new list, it only contains the internal id (reference) of the tuples of interest. This is in keeping with the original question asking for a list of indices. List comprehensions as used here are an efficient way of constructing new lists as much of the work is done internal to the interpreter. In short, many intuitions from C-like languages about performance don't apply well to Python.
As Ashwini noted, if the id numbers in the tuples are unique, and you are making multiple queries, then a dictionary might be a more suitable structure. Even if the id numbers aren't unique, you could use a dictionary of lists of tuples, but it is best to do the clearest thing first and not guess at the performance in advance.
As with the dictionary example, because an empty list is "falsey" in Python, you can use the same sort of conditional:
hits = get_ids(13, list1)
if hits:
    pass  # we got at least one tuple back
else:
    pass  # no 13s to be had
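The "dictionary of lists of tuples" mentioned above could look roughly like the sketch below. The helper name build_index and the sample rows with a repeated id are assumptions for illustration. It is only worth building when many lookups are made against the same list.

```python
from collections import defaultdict

def build_index(tuple_list):
    """Map each first element to the list of tuples that start with it."""
    index = defaultdict(list)
    for t in tuple_list:
        index[t[0]].append(t)
    return index

rows = [(12, "AB", "CD"), (13, "EF", "GH"), (13, "IJ", "KL")]
index = build_index(rows)

# .get avoids creating an empty entry for ids that are not present
print(index.get(13, []))  # [(13, 'EF', 'GH'), (13, 'IJ', 'KL')]
print(index.get(99, []))  # []
```

As with get_ids, an empty result is falsey, so the same `if hits:` test works on the returned list.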
Suppose I have the following dictionary and list:
my_dictionary = {1:"hello", 2:"goodbye", 3:"World", "sand":"box"}
my_list = [1,2,3]
Is there a direct (Pythonic) way to get the key-value pairs out of the dictionary for which the keys are elements in the list, in an order defined by the list order?
The naive approach is simply to iterate over the list and pull out the values in the map one by one, but I wonder if python has the equivalent of list slicing for dictionaries.
I don't know if this is Pythonic enough, but it works:
res = [(x, my_dictionary[x]) for x in my_list]
This is a list comprehension, but, if you need to iterate that list only once, you can also turn it into a generator expression, e.g. :
for el in ((x, my_dictionary[x]) for x in my_list):
    print(el)
Of course the previous methods work only if all elements in the list are present in the dictionary; to account for the key-not-present case you can do this:
res = [(x, my_dictionary[x]) for x in my_list if x in my_dictionary]
>>> import operator
>>> list(zip(my_list, operator.itemgetter(*my_list)(my_dictionary)))
[(1, 'hello'), (2, 'goodbye'), (3, 'World')]
How about this? Take every item in my_list and pass it to the dictionary's get method. Missing keys do not raise an exception; get returns None for them. In Python 3, map and zip return iterators, so wrap them in list() if you need a list.
map(my_dictionary.get, my_list)
If you want tuples, zip it:
zip(my_list, map(my_dictionary.get, my_list))
If you want a new dict, pass the tuples to dict:
dict(zip(my_list, map(my_dictionary.get, my_list)))
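To make the missing-key behaviour concrete, here is a small sketch with a made-up my_list that contains a key (42) not present in the dictionary. It contrasts the get-based version, which pads with None, with a comprehension that skips the missing key.

```python
my_dictionary = {1: "hello", 2: "goodbye", 3: "World", "sand": "box"}
my_list = [3, 1, 42]  # 42 is deliberately absent from the dictionary

# dict.get returns None for missing keys instead of raising KeyError
with_none = list(zip(my_list, map(my_dictionary.get, my_list)))

# filtering on membership drops missing keys entirely
skipped = [(k, my_dictionary[k]) for k in my_list if k in my_dictionary]

print(with_none)  # [(3, 'World'), (1, 'hello'), (42, None)]
print(skipped)    # [(3, 'World'), (1, 'hello')]
```

Which one is appropriate depends on whether a stored value of None has to be distinguished from a missing key.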
A straightforward way would be to pick each item from the dictionary and check if the key is present in the list
>>> [e for e in my_dictionary.items() if e[0] in my_list]
[(1, 'hello'), (2, 'goodbye'), (3, 'World')]
The membership test above is linear in the length of the list, so you might gain some performance by converting the list to a set once, before the comprehension
>>> keys = set(my_list)
>>> [e for e in my_dictionary.items() if e[0] in keys]
[(1, 'hello'), (2, 'goodbye'), (3, 'World')]
And finally if you need a dictionary instead of a list of key,value pair tuples you can use dictionary comprehension
>>> dict(e for e in my_dictionary.items() if e[0] in set(my_list))
{1: 'hello', 2: 'goodbye', 3: 'World'}
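One caveat with filtering my_dictionary.items(): the result follows the dictionary's order, not the list's, which matters because the question asks for the order defined by the list. A quick sketch with a made-up my_list of [3, 1], assuming Python 3.7+ where dicts keep insertion order:

```python
my_dictionary = {1: "hello", 2: "goodbye", 3: "World", "sand": "box"}
my_list = [3, 1]  # illustrative: not in the dictionary's insertion order

wanted = set(my_list)  # built once, outside the comprehension

# iterating the dictionary yields pairs in dictionary order
dict_order = [item for item in my_dictionary.items() if item[0] in wanted]

# iterating the list yields pairs in list order
list_order = [(k, my_dictionary[k]) for k in my_list if k in my_dictionary]

print(dict_order)  # [(1, 'hello'), (3, 'World')]
print(list_order)  # [(3, 'World'), (1, 'hello')]
```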