Flatten tuple containing list to multiple tuples - python

I have a list of tuples that each contain a "key" and a list "value":
[('ABE', ['ORD', 'ATL', 'DTW'])]
Here, ABE is the "key" and the list ['ORD', 'ATL', 'DTW'] is the "value".
How can I "flatten" this RDD structure by mapping each original key-value tuple to three tuples, all with the same "key" and each with a different element of the value list?
My desired output is
[('ABE', 'ORD'), ('ABE','ATL'), ('ABE','DTW')]

This can be accomplished in a single list comprehension:
data = [('ABE', ['ORD', 'ATL', 'DTW'])]
flattened = [
    (key, elem)
    for key, value in data
    for elem in value
]
print(flattened)
outputs
[('ABE', 'ORD'), ('ABE', 'ATL'), ('ABE', 'DTW')]

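Alternatively, use itertools.zip_longest with the key as fillvalue, so that the one-element key list is padded to the length of the sublist:
from itertools import zip_longest
lst = [('ABE', ['ORD', 'ATL', 'DTW']), ('1', ['A', 'B', 'C'])]
res = []
for key, sublist in lst:
    # extend (rather than appending a tuple of pairs) keeps the result flat
    res.extend(zip_longest([key], sublist, fillvalue=key))
print(res)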
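The same key/value pairing can arguably be expressed more directly with zip and itertools.repeat. The following is only a sketch of that alternative, not the answer's own method; the function name flatten_pairs is hypothetical and introduced here for illustration.

```python
from itertools import repeat


def flatten_pairs(pairs):
    """Expand (key, [v1, v2, ...]) tuples into a flat list of (key, v) tuples.

    flatten_pairs is a hypothetical helper name, not part of any library.
    """
    flat = []
    for key, values in pairs:
        # repeat(key) yields the key indefinitely; zip stops when values runs out
        flat.extend(zip(repeat(key), values))
    return flat


lst = [('ABE', ['ORD', 'ATL', 'DTW']), ('1', ['A', 'B', 'C'])]
print(flatten_pairs(lst))
```

Because zip stops at the shorter input, a key with an empty value list contributes no tuples at all.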

I tested Brian's solution on an RDD and got "'PipelinedRDD' object is not iterable".
I therefore called collect() on the RDD before iterating over it.
# data is an RDD here (holding [('ABE', ['ORD', 'ATL', 'DTW'])]), not a plain Python list
flattened = [
    (key, elem)
    for key, value in data.collect()
    for elem in value
]
print(flattened)

Related

How can I turn a list of values to a list of dictionaries with the same key added to each value?

Let's say I have this list of IDs/values:
Input:
['aaa','bbb','ccc']
I've been stuck figuring out how to get the following list of dictionaries. Each dictionary contains the key "id" paired with one of the IDs/values from the list.
Desired Output:
[{'id': 'aaa'}, {'id': 'bbb'}, {'id': 'ccc'}]
You can create a list of dictionaries with a list comprehension:
list_ = ['aaa','bbb','ccc']
output = [{'id': elem} for elem in list_]
You can use a list comprehension here to get your result.
list1 = ['aaa','bbb','ccc']
finalOutput = [{'id': value} for value in list1]
You can loop and append those dictionaries to a list like this:
some_list = ['aaa', 'bbb', 'ccc']
dictionary_list = []
for i in some_list:
    dictionary_list.append({'id': i})
You will get a list like the one you wanted.
['aaa','bbb','ccc']
# Prefix each element with 'ID: ' (note that this yields strings, not dictionaries)
['ID: '+x for x in ['aaa','bbb','ccc']]

How to manipulate the dictionary keys and values

I have a dictionary and I want to change the key of the dictionary into a unique value.
final_dict = {"name1":['raj','raj','raj'],"name2":['Rahul','Thor','max','Rahul'],"name3":['Jhon','Jhon'], "name4":['raj','raj'], "name5":['Rahul','Thor','max']}
First of all, I need unique values for each key like this
final_dict = {"name1":['raj'],"name2":['Thor','max','Rahul'],"name3":['Jhon'], "name4":['raj'], "name5":['Rahul','Thor','max']}
and then I need to swap the keys and values, so that the values become keys and the keys become values.
The final output I need:
output = {"raj":['name1','name4'], "('Thor','max','Rahul')":['name2','name5'], "Jhon":['name3']}
I tried this but I got only the unique values:
mtype=[]
for key_name in final_dict:
    a = set(final_dict[key_name])
    #print(tuple(a))
    mtype.append(tuple(a))
print(mtype)
u = set(mtype)
print(u)
Here's a first shot at the problem. I'd carefully read through each line and make sure you understand what is going on. Feel free to ask follow ups in the comments!
from collections import defaultdict
input = { ... }
output = defaultdict(list)
for key, values in input.items():
    unique_values = tuple(sorted(set(values)))
    output[unique_values].append(key)
output = dict(output) # Transform defaultdict to a dict
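To see what the grouping above produces, here is a runnable sketch that applies the same defaultdict technique to the final_dict from the question. The function name invert_unique is hypothetical and used only for illustration. Note that the keys of the result are sorted tuples of the unique values, so single values show up as one-element tuples rather than bare strings.

```python
from collections import defaultdict

final_dict = {
    "name1": ['raj', 'raj', 'raj'],
    "name2": ['Rahul', 'Thor', 'max', 'Rahul'],
    "name3": ['Jhon', 'Jhon'],
    "name4": ['raj', 'raj'],
    "name5": ['Rahul', 'Thor', 'max'],
}


def invert_unique(mapping):
    """Group the keys of mapping by the sorted tuple of their unique values.

    invert_unique is a hypothetical helper name, not part of any library.
    """
    grouped = defaultdict(list)
    for key, values in mapping.items():
        # Sorting makes the tuple independent of the original value order
        unique_values = tuple(sorted(set(values)))
        grouped[unique_values].append(key)
    return dict(grouped)


inverted = invert_unique(final_dict)
print(inverted)
```

Sorting before building the tuple is what lets "name2" and "name5" land under the same key even though their lists differ in order and repetition.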
Here is a straightforward way:
set() takes care of uniqueness, and the list comprehension below turns the dict items into (value, key) tuples.
dict.setdefault() then starts each new key with an empty list [] and appends to it, so tuples that share the same value are collected into one list of keys.
d = {}
l = [(i,k) for k,v in final_dict.items() for i in set(v)]
#print(l)
#[('raj', 'name1'), ('Thor', 'name2'), ('max', 'name2'), ('Rahul', 'name2'),
#('Jhon', 'name3'), ('raj', 'name4'), ('Thor', 'name5'), ('max', 'name5'),
#('Rahul', 'name5')]
for x, y in l:
    d.setdefault(x, []).append(y)
print(d)
{'Jhon': ['name3'],
'Rahul': ['name2', 'name5'],
'Thor': ['name2', 'name5'],
'max': ['name2', 'name5'],
'raj': ['name1', 'name4']}

Sort list in a dict that has key as a list of dicts

I am trying to sort a list of dicts by a key that sits inside a nested list of dicts.
big_list = [{"Key1": 1, "Key2" :2, "Key3": [{"sortable_key": value1, value2, ..}]}]
The goal is to sort the big_list by sortable_key. I am trying to accomplish sorting by using lambda:
big_list = list(sorted(big_list, key=lambda x: x["Key3"]["sortable_key"]))
But this is not working: the list is not sorted (I assumed it would be sorted in some particular order, such as alphabetical). Am I accessing sortable_key the wrong way, or is this not achievable with a lambda?
Thanks!
Full example of big_list:
big_list = [{'number': '7',
             'code': '1106',
             'name': 'John',
             'det': [{'subcode': 'AA11','subname': 'Alan','age': 11},
                     {'subcode': 'CC11','subname': 'Hugo','age': 22},
                     {'subcode': 'BB11','subname': 'Walt','age': 18}]}]
In this case I need to sort the list by 'subcode'.
You forgot a [0], since x["Key3"] is a list:
big_list = list(sorted(big_list, key=lambda x: x["Key3"][0]["sortable_key"]))
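The full example in the question suggests a different reading: that the goal may be to sort the inner 'det' list of each record by 'subcode', rather than to reorder big_list itself. Under that assumption, which the question does not confirm, a minimal sketch looks like this:

```python
from operator import itemgetter

big_list = [{'number': '7',
             'code': '1106',
             'name': 'John',
             'det': [{'subcode': 'AA11', 'subname': 'Alan', 'age': 11},
                     {'subcode': 'CC11', 'subname': 'Hugo', 'age': 22},
                     {'subcode': 'BB11', 'subname': 'Walt', 'age': 18}]}]

# Sort each record's inner 'det' list in place by its 'subcode' value
for record in big_list:
    record['det'].sort(key=itemgetter('subcode'))

subcodes = [entry['subcode'] for entry in big_list[0]['det']]
print(subcodes)
```

list.sort() modifies the inner lists in place; use sorted() and reassign if the original order needs to be kept elsewhere.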

Merge tuples with the same key

How to merge a tuple with the same key
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
and turn them into
list_2 = [("AAA", [123, 456]), ("AAW", [147, 124])]
An efficient approach is to use a collections.defaultdict to accumulate the values for each key in a growing list, then convert back to a list of tuples if needed:
import collections
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
c = collections.defaultdict(list)
for a,b in list_1:
    c[a].extend(b) # add to existing list or create a new one
list_2 = list(c.items())
result:
[('AAW', [147, 124]), ('AAA', [123, 456])]
Note that the converted data is probably better left as a dictionary. Converting back to a list loses the keyed lookup that the dictionary provides.
On the other hand, if you want to retain the order of the "keys" from the original list of tuples, and you are not on Python 3.7+ (where dicts preserve insertion order; in CPython 3.6 this was an implementation detail), you'd have to build a list of the original "keys" (ordered, unique) and then rebuild the list from the dictionary. Alternatively, use an OrderedDict, but then you cannot use defaultdict directly (or use a recipe that combines the two).
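The order-preserving variant described above (tracking the original keys in a separate list and rebuilding from the dictionary) might be sketched as follows. The names merged and ordered_keys are illustrative only.

```python
import collections

list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]

merged = collections.defaultdict(list)
ordered_keys = []  # keys in first-seen order, each appearing once

for key, values in list_1:
    # "in" does not trigger the defaultdict factory, so this check is safe
    if key not in merged:
        ordered_keys.append(key)
    merged[key].extend(values)

# Rebuild the list of tuples in the original key order
list_2 = [(key, merged[key]) for key in ordered_keys]
print(list_2)
```

This does not rely on the dictionary's own ordering, so it behaves the same on Python versions where dicts are unordered.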
You can use a dict to keep track of the indices of each key to keep the time complexity O(n):
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
list_2 = []
i = {}
for k, s in list_1:
    if k not in i:
        list_2.append((k, list(s)))  # copy, so extending later does not mutate list_1
        i[k] = len(i)
    else:
        list_2[i[k]][1].extend(s)
list_2 would become:
[('AAA', [123, 456]), ('AAW', [147, 124])]
You can create a dictionary and loop through the list. If the key is already present in the dictionary, extend its existing list with the new values; otherwise assign a copy of the values to the key.
dict_1 = {}
for item in list_1:
    if item[0] in dict_1:
        dict_1[item[0]].extend(item[1])
    else:
        dict_1[item[0]] = list(item[1])
list_2 = list(dict_1.items())
Similarly to other answers, you can use a dictionary to associate each key with a list of values. This is implemented in the function merge_by_key in the code snippet below.
import pprint
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
def merge_by_key(ts):
    d = {}
    for t in ts:
        key = t[0]
        values = t[1]
        if key not in d:
            d[key] = values[:]  # copy so the input lists are not mutated
        else:
            d[key].extend(values)
    return list(d.items())
result = merge_by_key(list_1)
pprint.pprint(result)

extract specific value from dictionary of list of dictionary

I have a dataset like this:
{'project-1': [{'id':'1','name':'john'},{'id':'20','name':'steve'}],
 'project-2': [{'id':'6','name':'jack'},{'id':'42','name':'anna'}]}
What I want to extract is the names of all the people:
['john','steve','jack','anna']
How can I get this list using Python?
my_dict = {
    'project-1': [{'id':'1','name':'john'},{'id':'20','name':'steve'}],
    'project-2': [{'id':'6','name':'jack'},{'id':'42','name':'anna'}]
}
You can use a list comprehension to get the name field from each dictionary contained within the sublists (i.e. within the values of the original dictionary).
>>> [d.get('name') for sublists in my_dict.values() for d in sublists]
['john', 'steve', 'jack', 'anna']
Iterate over the values of the dict, then over the items of each list:
# d is the dictionary from the question
for d_ in d.values():
    for item in d_:
        print(item['name'])
Or as a comprehension:
names = [item['name'] for d_ in d.values() for item in d_]
print(names)
['john', 'steve', 'jack', 'anna']
This should do it.
d = {'project-1': [{'id':'1','name':'john'},{'id':'20','name':'steve'}],
'project-2': [{'id':'6','name':'jack'},{'id':'42','name':'anna'}]}
result = list()
for key in d:
    for x in d[key]:
        result.append(x['name'])
Many of the solutions here take the same approach using two loops.
Here is a different approach: a one-line solution without an explicit loop, using lambda functions with map:
data={'project-1': [{'id':'1','name':'john'},{'id':'20','name':'steve'}],
'project-2': [{'id':'6','name':'jack'},{'id':'42','name':'anna'}]}
print(list(map(lambda x:list(map(lambda y:y['name'],x)),data.values())))
output (note that the names are grouped per project rather than flattened into one list):
[['john', 'steve'], ['jack', 'anna']]
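The map-based output above is nested per project, whereas the question asks for a single flat list. One way to close that gap, sketched here as an addition rather than as part of the original answer, is to feed the mapped results through itertools.chain.from_iterable:

```python
from itertools import chain

data = {'project-1': [{'id': '1', 'name': 'john'}, {'id': '20', 'name': 'steve'}],
        'project-2': [{'id': '6', 'name': 'jack'}, {'id': '42', 'name': 'anna'}]}

# Map each project's list of people to their names, then chain the
# per-project results into one flat list
names = list(chain.from_iterable(
    map(lambda people: map(lambda person: person['name'], people), data.values())
))
print(names)
```

The order of the names follows the insertion order of the dictionary, which is guaranteed on Python 3.7 and later.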
You can also index into the nested structure directly:
name_id = {'project-1': [{'id':'1','name':'john'},{'id':'20','name':'steve'}], 'project-2': [{'id':'6','name':'jack'},{'id':'42','name':'anna'}]}
name_id['project-1'][0]['name']  # 'john'
name_id['project-1'][1]['name']  # 'steve'
name_id['project-2'][0]['name']  # 'jack'
name_id['project-2'][1]['name']  # 'anna'
The ['project-1'] gets the value corresponding to the project-1 key in the dictionary name_id. [0] is the list index of the first element in that value, which is a list. ['name'] is also a key, but of the dictionary stored in that list element. It gives you the final value that you want.
