Converting list to dictionary with list elements as index - Python

According to the question Python: create dict from list and auto-gen/increment the keys (list is the actual key values)?, it's possible to create a dict from a list by using enumerate to generate tuples made up of incremental keys and the list elements, i.e.:
>>> x = ['a', 'b', 'c']
>>> list(enumerate(x))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> dict(enumerate(x))
{0: 'a', 1: 'b', 2: 'c'}
It is also possible to swap the keys and values by iterating through every item in the dict (assuming that there is a one-to-one mapping between keys and values):
>>> x = ['a', 'b', 'c']
>>> d = dict(enumerate(x))
>>> {v:k for k,v in d.items()}
{'a': 0, 'c': 2, 'b': 1}
Given the input list ['a', 'b', 'c'], how can I build the dictionary with the elements as keys and the incremental indices as values, without looping an additional time to reverse the dictionary?

How about simply:
>>> x = ['a', 'b', 'c']
>>> {j:i for i,j in enumerate(x)}
{'a': 0, 'c': 2, 'b': 1}
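One caveat worth illustrating: the comprehension relies on the one-to-one assumption mentioned in the question, i.e. that the list elements are unique. The following is a minimal sketch of what happens with duplicates and one way to keep the first index instead. The helper name `index_map`, its `keep` parameter, and the sample lists are made up for illustration.

```python
def index_map(items, keep="last"):
    """Map each element to its index; `keep` picks which index wins for duplicates."""
    if keep == "last":
        # In a dict comprehension, later duplicates overwrite earlier ones.
        return {value: index for index, value in enumerate(items)}
    result = {}
    for index, value in enumerate(items):
        # setdefault only stores the index the first time a value is seen.
        result.setdefault(value, index)
    return result


print(index_map(['a', 'b', 'c']))           # {'a': 0, 'b': 1, 'c': 2}
print(index_map(['a', 'b', 'a']))           # {'a': 2, 'b': 1}
print(index_map(['a', 'b', 'a'], "first"))  # {'a': 0, 'b': 1}
```

The printed key order shown here assumes Python 3.7 or later, where dicts preserve insertion order.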

Related

Sorting list based on dictionary keys in python

Is there a short way to sort a list based on the order of another dictionary's keys?
Suppose I have:
lst = ['b', 'c', 'a']
dic = { 'a': "hello" , 'b': "bar" , 'c': "foo" }
I want to sort the list to be ['a','b','c'] based on the order of dic keys.
You can create a lookup of keys versus their insertion order in dic. To do so you can write:
>>> lst = ['b', 'c', 'a']
>>> dic = {"a": "hello", "b": "bar", "c": "foo"}
>>> order = {k: i for i, k in enumerate(dic)}
>>> order
{'a': 0, 'b': 1, 'c': 2}
Using this you can write a simple lookup for the key argument of sorted to rank items based on order.
>>> sorted(lst, key=order.get)
['a', 'b', 'c']
If there are values in lst that are not found in dic, order.get returns None for them (which Python 3 cannot compare with integers), so you should call get through a lambda that provides a default index. You'll have to choose whether to rank unknown items at the start or at the end.
Default to the start:
>>> lst = ['d', 'b', 'c', 'a']
>>> sorted(lst, key=lambda k: order.get(k, -1))
['d', 'a', 'b', 'c']
Default to the end:
>>> lst = ['d', 'b', 'c', 'a']
>>> sorted(lst, key=lambda k: order.get(k, len(order)))
['a', 'b', 'c', 'd']
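The two variants above can be folded into one small helper. This is only a sketch: the function name `sort_by_key_order` and the `unknown_last` flag are hypothetical, and it assumes Python 3.7+ so that iterating over the dictionary follows insertion order.

```python
def sort_by_key_order(items, reference, unknown_last=True):
    """Sort `items` by the position of each item among the keys of `reference`."""
    order = {key: index for index, key in enumerate(reference)}
    # Items missing from `reference` get a rank before or after all known keys.
    default = len(order) if unknown_last else -1
    return sorted(items, key=lambda item: order.get(item, default))


dic = {"a": "hello", "b": "bar", "c": "foo"}
print(sort_by_key_order(['d', 'b', 'c', 'a'], dic))
# ['a', 'b', 'c', 'd']
print(sort_by_key_order(['d', 'b', 'c', 'a'], dic, unknown_last=False))
# ['d', 'a', 'b', 'c']
```

Because sorted is stable, several unknown items keep their original relative order.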

Creating a dictionary using indices of list

What is the most efficient way to create a dictionary from a string/list? For example, if I have a list ['a', 'b', 'c', 'd'], how would I create the dictionary for which the elements of the list are the keys, and the indices are the values? For the above list it would look like this: {'a': 0, 'b': 1, 'c': 2, 'd': 3}
enumerate() yields each element together with its index; you can use this in a dictionary comprehension.
l = ['a', 'b', 'c', 'd']
d = {value: index for index, value in enumerate(l)}
You can use this:
lista = ['a', 'b', 'c', 'd']
dictionary = {}
n = 0
for el in lista:
    dictionary[el] = n
    n += 1
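For comparison, the same mapping can also be built by zipping the list with a counter. This is a sketch of an alternative, not a claim about which variant is fastest; if efficiency matters, it is probably best to measure with timeit on your own data. The variable names are illustrative.

```python
from itertools import count

letters = ['a', 'b', 'c', 'd']

# Pair each element with a running counter; zip stops when the list is exhausted.
by_count = dict(zip(letters, count()))

# Equivalent result with an explicit range.
by_range = dict(zip(letters, range(len(letters))))

print(by_count)  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```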

Convert a list of str and lists to dict

In Python, how do I convert a list that contains strings and two-element lists into a dictionary in which each key is one of the strings and each value is the list of those sublists whose first element is that key?
For example, the current list I have is:
['A', ['A', 1], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2], ['C',3]]
and I want the dictionary:
{'A': [['A', 1]], 'B': [['B',1], ['B',2]], 'C': [['C',1], ['C',2], ['C',3]]}
Thank you.
EDIT: The number of lists that follow a string is arbitrary.
With this, no matter the order of the list, it selects exactly what you're looking for.
def new(list_):
    new_dic = {x: [y for y in list_ if type(y) == list and y[0] == x] for x in list_ if type(x) == str}
    print(new_dic)
new(['A', ['A', 1], ['A',2], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2]])
d = {l: [] for l in mylist if type(l) is str}
for l in mylist:
    if type(l) is list:
        d[l[0]].append(l)
You can try defaultdict
from collections import defaultdict
my_dict = defaultdict(list)
my_list = ['A', ['A', 1], ['A',2], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2]]
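for item in my_list:
    # Check the type rather than the length, so that strings longer than
    # one character are not mistaken for sublists.
    if isinstance(item, list):
        my_dict[item[0]].append(item)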
It seems like it doesn't matter what the string values are in your list. Based on the current structure of the list provided, and the required output, you can just check for the lists inside the list, and by using the defaultdict construct, you can simply just craft your dictionary accordingly:
from collections import defaultdict
l = ['A', ['A', 1], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2], ['C',3]]
d = defaultdict(list)
for data in l:
    if type(data) is list:
        d[data[0]].append(data)
Output:
defaultdict(<class 'list'>, {'A': [['A', 1]], 'C': [['C', 1], ['C', 2], ['C', 3]], 'B': [['B', 1], ['B', 2]]})
So, here, the defaultdict takes list as its default factory. Therefore, when a new key is accessed, its default value is an empty list. As you iterate over the input, simply check the type of each item. When you find a list, you use its first element as the dictionary key and append the whole list to that key's value. It should give you the output you are looking for.
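One difference between the answers above: the defaultdict versions only create a key when a sublist is seen, so a string that is not followed by any sublist would be missing from the result. The question does not say which behavior is wanted, so the following is only a sketch under the assumption that such keys should appear with an empty list. The function name `group_sublists` and the sample data are made up.

```python
def group_sublists(items):
    """Group the sublists in `items` under the string keys found in `items`."""
    # Create every string key up front so keys with no sublists still appear.
    grouped = {item: [] for item in items if isinstance(item, str)}
    for item in items:
        if isinstance(item, list):
            # setdefault also covers a sublist whose key string never appeared.
            grouped.setdefault(item[0], []).append(item)
    return grouped


data = ['A', ['A', 1], 'B', ['B', 1], ['B', 2], 'C', 'D', ['D', 1]]
print(group_sublists(data))
# {'A': [['A', 1]], 'B': [['B', 1], ['B', 2]], 'C': [], 'D': [['D', 1]]}
```

The result is a plain dict, so it prints without the defaultdict wrapper.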

Can I reverse a Counter into a list of lists without multiples?

Using collections.Counter:
l1 = ['a', 'b', 'b', 'c', 'c', 'b', 'e']
l2 = ['a', 'b', 'b', 'c', 'c', 'b','d']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
# Intersection
c1 & c2
>>> Counter({'b': 3, 'c': 2, 'a': 1})
What idiom could distribute a collections.Counter into a list of lists where each element appears only once in each list?
[['a', 'b', 'c'],['b', 'c'],['b']]
I don't know if you were looking for a one-liner, but here is one (written for Python 3; in Python 2 the function is named izip_longest):
Code:
[sorted(y for y in z if y is not None)
 for z in it.zip_longest(*[[k] * l for k, l in c.items()])]
How?
Two key things here:
[k] * l gives a list that repeats the counter key k as many times as its count l
zip_longest() zips the lists together and pads the shorter lists with None
Test Code:
from collections import Counter
import itertools as it
c = Counter({'b': 3, 'c': 2, 'a': 1})
print([sorted(y for y in z if y is not None)
       for z in it.zip_longest(*[[k] * l for k, l in c.items()])])
Results:
[['a', 'b', 'c'], ['b', 'c'], ['b']]
You can try this:
import itertools
the_dict = {'b': 3, 'c': 2, 'a': 1}
# One list per key, repeating the key as many times as its count.
the_frequencies = [[a] * b for a, b in the_dict.items()]
# Zip the lists row by row; shorter lists are padded with None.
the_final = map(list, itertools.zip_longest(*the_frequencies))
print([[i for i in b if i is not None] for b in the_final])
This solution builds the_frequencies by repeating each key as many times as its corresponding value, then uses itertools.zip_longest (izip_longest in Python 2) to zip those lists into rows. A row stores None wherever one of the lists is shorter than the current row index, and the final comprehension filters those None values out.
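If padding with None feels indirect, another option is to peel off one "layer" of keys at a time using Counter arithmetic. This is a sketch of an alternative technique rather than a rewrite of the answers above; the function name `layers` is made up, and it relies on unary plus and in-place subtraction on Counter, which exist in Python 3.3+.

```python
from collections import Counter


def layers(counter):
    """Split a Counter into lists in which each key appears at most once."""
    remaining = +Counter(counter)  # copy, dropping zero or negative counts
    result = []
    while remaining:
        result.append(sorted(remaining))
        # Subtract one from every key; in-place subtraction drops keys
        # whose count reaches zero.
        remaining -= Counter(remaining.keys())
    return result


print(layers(Counter({'b': 3, 'c': 2, 'a': 1})))
# [['a', 'b', 'c'], ['b', 'c'], ['b']]
```

The loop runs once per unit of the largest count, so it terminates as soon as the most frequent key is used up.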

Python - count and group items in list stored in dictionary

I have seen examples of how to count items in a dictionary or a list. My dictionary stores multiple lists, and each list stores multiple items.
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}
1. I want to count the frequency of each letter, i.e. the results should be
A - 4
B - 1
C - 2
D - 1
E - 1
F - 1
2. I want to group by each letter, i.e. the results should be
A - text1, text2, text4, text5
B - text4
C - text1, text3
D - text3
E - text1
F - text1
How can I achieve both using existing Python libraries, without many for loops?
To get to (2), you first have to invert the dictionary into a list of (letter, key) pairs. Once you are there, sort the pairs and use groupby with a key function to get to the structure of (2).
from itertools import groupby
arr = [(x,t) for t, a in d.items() for x in a]
# [('A', 'text2'), ('C', 'text3'), ('D', 'text3'), ('A', 'text1'), ('C', 'text1'), ('E', 'text1'), ('F', 'text1'), ('A', 'text4'), ('B', 'text4'), ('A', 'text5')]
res = {g: [x[1] for x in items] for g, items in groupby(sorted(arr), key=lambda x: x[0])}
#{'A': ['text1', 'text2', 'text4', 'text5'], 'C': ['text1', 'text3'], 'B': ['text4'], 'E': ['text1'], 'D': ['text3'], 'F': ['text1']}
res2 = {x: len(y) for x, y in res.items()}
#{'A': 4, 'C': 2, 'B': 1, 'E': 1, 'D': 1, 'F': 1}
PS: I am hoping you'd use meaningful variable names in your real code.
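The sorted() call in the groupby answer matters: itertools.groupby only merges adjacent items with equal keys. A small sketch with made-up pairs shows the difference; the variable names are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

pairs = [('A', 'text1'), ('C', 'text1'), ('A', 'text2'), ('C', 'text3')]
first = itemgetter(0)

# Without sorting, a new group starts every time the key changes.
unsorted_groups = [(key, [text for _, text in group])
                   for key, group in groupby(pairs, key=first)]
print(unsorted_groups)
# [('A', ['text1']), ('C', ['text1']), ('A', ['text2']), ('C', ['text3'])]

# After sorting, equal keys are adjacent and each letter forms one group.
sorted_groups = {key: [text for _, text in group]
                 for key, group in groupby(sorted(pairs), key=first)}
print(sorted_groups)
# {'A': ['text1', 'text2'], 'C': ['text1', 'text3']}
```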
There are a few ways to accomplish this, but if you'd like to handle things without worrying about importing additional modules or installing external ones, this method will work cleanly 'out of the box.'
With d as your starting dictionary:
d = {'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']}
Create a new dict, called letters, for your results to live in, and populate it by looping over d.keys(): if a letter isn't present yet, create it as a key whose value is a list holding the count and a list containing the current key from d. If it's already there, increment the count and append the current key from d to the key list in its value.
letters = {}
for item in d.keys():
    for letter in d[item]:
        if letter not in letters.keys():
            letters[letter] = [1,[item]]
        else:
            letters[letter][0] += 1
            letters[letter][1] += [item]
This leaves you with a dict called letters containing values of the counts and the keys from d that contain the letter, like this:
{'E': [1, ['text1']], 'C': [2, ['text3', 'text1']], 'F': [1, ['text1']], 'A': [4, ['text2', 'text4', 'text1', 'text5']], 'B': [1, ['text4']], 'D': [1, ['text3']]}
Now, to print your first list, do:
for letter in sorted(letters):
    print(letter, letters[letter][0])
This prints each letter and the first, or 'count', element of the list in its value, using the built-in sorted() function to put things in order.
To print the second list, likewise using sorted(), do the same with the second, or 'key', element of the list in its value, joined with ', ' into a string:
for letter in sorted(letters):
    print(letter, ', '.join(letters[letter][1]))
To ease Copy/Paste, here's the code unbroken by my ramblings:
d = {'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']}
letters = {}
for item in d.keys():
    for letter in d[item]:
        if letter not in letters.keys():
            letters[letter] = [1,[item]]
        else:
            letters[letter][0] += 1
            letters[letter][1] += [item]
print(letters)
for letter in letters:
    print(letter, letters[letter][0])
print()
for letter in letters:
    print(letter, ', '.join(letters[letter][1]))
Hope this helps!
from collections import Counter, defaultdict
from itertools import chain
d = {'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']}
counter = Counter(chain.from_iterable(d.values()))
group = defaultdict(list)
for k, v in d.items():
    for i in v:
        group[i].append(k)
out:
Counter({'A': 4, 'B': 1, 'C': 2, 'D': 1, 'E': 1, 'F': 1})
defaultdict(list,
{'A': ['text2', 'text4', 'text1', 'text5'],
'B': ['text4'],
'C': ['text1', 'text3'],
'D': ['text3'],
'E': ['text1'],
'F': ['text1']})
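To turn those two structures into the "A - 4" style listing from the question, something like the following should work. It is a sketch that repeats the setup so it runs on its own; the names `count_lines` and `group_lines` are made up, and the text keys are sorted so the output does not depend on dictionary order.

```python
from collections import Counter, defaultdict
from itertools import chain

d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}

# Count every letter across all lists.
counter = Counter(chain.from_iterable(d.values()))

# Collect, for each letter, the keys of the lists that contain it.
group = defaultdict(list)
for key, letters in d.items():
    for letter in letters:
        group[letter].append(key)

count_lines = ["{} - {}".format(letter, counter[letter])
               for letter in sorted(counter)]
group_lines = ["{} - {}".format(letter, ", ".join(sorted(group[letter])))
               for letter in sorted(group)]

print("\n".join(count_lines))   # first line: A - 4
print("\n".join(group_lines))   # first line: A - text1, text2, text4, text5
```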
For your first task:
from collections import Counter
d = {
'text1': ['A', 'C', 'E', 'F'],
'text2': ['A'],
'text3': ['C', 'D'],
'text4': ['A', 'B'],
'text5': ['A']
}
occurrences = Counter(''.join(''.join(values) for values in d.values()))
print(sorted(occurrences.items(), key=lambda l: l[0]))
Now let me explain it:
''.join(values) turns each list into a string (e.g. ['A', 'B', 'C', 'D'] into 'ABCD')
Then the outer ''.join() joins the strings from all the lists in the dictionary into one string
Counter is a class from the standard-library module collections, which counts the elements in the iterable (a string in this case); its items() are (key, value) pairs (e.g. ('A', 4))
Finally, I sort the Counter items (it's just like a dictionary) alphabetically (key=lambda l: l[0], where l[0] is the letter from the (key, value) pair).
As I saw, you already have the solution for your second problem.
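One thing to keep in mind about the ''.join() approach: it counts characters, so it only matches the list items when every item is a single character. The sketch below uses made-up data with a two-character item to show the difference from counting the items directly with itertools.chain.

```python
from collections import Counter
from itertools import chain

# Hypothetical data in which one item has two characters.
d = {'text1': ['AB', 'C'], 'text2': ['AB']}

# Joining flattens everything into the string 'ABCAB' and counts characters.
joined = Counter(''.join(''.join(values) for values in d.values()))

# Chaining keeps each list item intact and counts items.
chained = Counter(chain.from_iterable(d.values()))

print(joined)   # Counter({'A': 2, 'B': 2, 'C': 1})
print(chained)  # Counter({'AB': 2, 'C': 1})
```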
This is a way to achieve this:
from collections import defaultdict
alphabets = defaultdict(list)
for text, letters in d.items():
    for letter in letters:
        alphabets[letter].append(text)
for letter, texts in sorted(alphabets.items()):
    print(letter, texts)
for letter, texts in sorted(alphabets.items()):
    print(letter, len(texts))
Note that if you have A - text1, text2, text4, text5, getting to A - 4 is just a matter of counting the texts.
