I have a JSON object:
{ 'a': ['c','d','e'],
  'b': ['a','c','e'],
  'c': ['a','b','d'],
....and so on
I also have a dictionary:
{'a':'abc','b':'cdf'}
Now I want to replace every 'a' with 'abc' and every 'b' with 'cdf' in the JSON, both in the keys and inside the lists, so that it ends up looking like this:
{ 'abc': ['c','d','e'],
  'cdf': ['abc','c','e'],
  'c': ['abc','cdf','d'],
....and so on
A = {'a' : ['b', 'c', 'd'],
     'b' : ['a', 'c', 'd'],
     'c' : ['a', 'b', 'd']}
B = {'a' : 'abc', 'b' : 'bcd'}
C = {}
for key in A.keys():
    # Check whether the key has a new value
    if key in B.keys():
        new_key = B[key]
    else:
        # There is no new value for this key
        new_key = key
    C[new_key] = []
    for element in A[key]:
        # Check whether the element has a new value
        if element in B.keys():
            new_element = B[element]
        else:
            # There is no new value for this element
            new_element = element
        C[new_key].append(new_element)
C is the modified version of A
You can also try this, which renames the top-level keys only (the values inside the lists are left unchanged):
dict_1 = {
'a': ['c','d','e'],
'b': ['a','c','e'],
'c': ['a','b','d']
}
dict_2 = {'a':'abc','b':'cdf'}
for key in list(dict_1):
    if key in dict_2:
        dict_1[dict_2[key]] = dict_1.pop(key)
print(dict_1)
Output:
{'c': ['a', 'b', 'd'], 'abc': ['c', 'd', 'e'], 'cdf': ['a', 'c', 'e']}
{ dic.get(key, key): [dic.get(i, i) for i in value] for key, value in js.items() }
Where js is a dictionary representing your json and dic represents your dictionary {'a':'abc','b':'cdf'}
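As a minimal runnable sketch of the comprehension above, using the sample data from the question. The function name rename_everywhere is made up for illustration; js and dic are the names used in the answer.

```python
def rename_everywhere(data, mapping):
    """Return a copy of `data` with keys and list items renamed via `mapping`."""
    return {
        # dict.get(x, x) falls back to the original name when there is no mapping
        mapping.get(key, key): [mapping.get(item, item) for item in items]
        for key, items in data.items()
    }

js = {'a': ['c', 'd', 'e'], 'b': ['a', 'c', 'e'], 'c': ['a', 'b', 'd']}
dic = {'a': 'abc', 'b': 'cdf'}
renamed = rename_everywhere(js, dic)
print(renamed)
```

Because the comprehension builds a new dictionary, js itself is not modified.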
I need to create a dictionary structure in the below format.
list_1 = [1,2,3,4,5]
list_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
list_3 = random.sample(list_2, random.randint(0, len(list_2)))
I want to loop over list_1 and, for each item, draw a new random sample (list_3) from list_2, then assign each value of list_3 as a value in the inner dictionary shown below.
Needed dictionary format:
my_dict = { 1: { 1: 'a',
                 2: 'b',
                 3: 'c'},
            2: { 1: 'd',
                 2: 'g'},
            3: { 1: 'e',
                 2: 'f',
                 3: 'g'}
            .....
          }
My code:
list_1 = [1,2,3,4,5]
list_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
list_4 = []
j = 1
while i <= len(list_1):
    list_3 = random.sample(list_2, random.randint(0, len(list_2)))
    for k in list_3:
        my_dict = { i: { j: k,
                       }
                  }
        j += 1
    i += 1
    list_3 = random.sample(list_2, random.randint(0, len(list_2)))
    list_4.append(my_dict)
j should increment on every iteration over list_3, adding a new key j with value k each time.
After the loop over list_3 ends, a new sample (list_3) should be created and the same steps repeated under the next key i, which is then added to the dictionary.
I am not getting the required result. Can anyone help fix the code?
Thank you!
You're overwriting my_dict on every iteration of the loop. You only need to add a new sub-dictionary under key i:
import random
list_1 = [1,2,3,4,5]
list_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
my_dict = {}
for i in list_1:
    list_3 = random.sample(list_2, random.randint(0, len(list_2)))
    my_dict[i] = dict(enumerate(list_3, 1))
So, you have a list of keys and a list of values, and you want a dictionary with one entry per key. The value for each key is itself a dictionary whose keys are sequential integers starting at 1 and whose values are a random selection from values.
import random
keys = [1, 2, 3, 4, 5]
values = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
result = {
    k: {n: v
        for n, v in enumerate(random.sample(values, random.randint(0, len(values))), 1)
    }
    for k in keys }
This:
defines a dictionary;
result= {}
with an entry for each key k in keys;
result= {k: {} for k in keys}
with each dictionary having keys and values from an enumeration starting at 1;
{n: v for n, v in enumerate([], 1)}
with the values being a random sample from values;
random.sample(values, _)
with a length between 0 and the number of items in values.
random.sample(values, random.randint(0, len(values)))
And Python allows you to just turn the enumeration into a dict directly:
result = {
    k: dict(enumerate(random.sample(values, random.randint(0, len(values))), 1))
    for k in keys }
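To make the result repeatable while experimenting, the same comprehension can be wrapped in a function that takes its own random generator. This is only a sketch: the name build_nested, the rng parameter and the seed 42 are assumptions for illustration, not part of the answers above.

```python
import random

def build_nested(keys, values, rng):
    """Map each key to {1: v1, 2: v2, ...} built from a random sample of `values`."""
    return {
        k: dict(enumerate(rng.sample(values, rng.randint(0, len(values))), 1))
        for k in keys
    }

rng = random.Random(42)  # fixed seed, chosen arbitrarily so reruns give the same dict
values = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
result = build_nested([1, 2, 3, 4, 5], values, rng)
for k, inner in result.items():
    print(k, inner)
```

Each inner dictionary has keys 1..n for a sample of size n, and an empty sample yields an empty inner dictionary.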
You can try this:
import random
list_1 = [1,2,3,4,5]
list_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
my_dict = {}
for i in list_1:
    list_3 = random.sample(list_2, random.randint(0, len(list_2)))
    my_dict[i] = dict(enumerate(list_3, 1))
print(my_dict)
I need to make sure the dictionary doesn't contain the same key more than once. If a key is repeated, I want to keep the first occurrence and add the other values to it, turning the value into a list.
This is what I tried:
my_dict = {1: "A", 2: "B", 1: "C"}
new_dict={}
list_keys = list(my_dict.keys())
list_values = list(my_dict.values())
for i in range(len(list_values)):
    if list_keys[i] in new_dict.keys():
        new_dict[list_keys[i]].append(list(list_values[i]))
    else:
        new_dict.update({list_keys[i]: list_values[i]})
return new_dict
The result required:
{1: ["A", "C"], 2: ["B"]}
The most concise way of reversing a dict like that uses a defaultdict:
from collections import defaultdict
d = {"A": 1, "B": 2, "C": 1}
rev = defaultdict(list)
for k, v in d.items():
    rev[v].append(k)
rev
# defaultdict(<class 'list'>, {1: ['A', 'C'], 2: ['B']})
That first line doesn't make sense. A dictionary cannot have two values for the same key, so when you run that first line, the resulting dictionary is:
my_dict = {1: 'A', 2: 'B', 1: 'C'}
print(my_dict)
# {1: 'C', 2: 'B'}
What you could do is iterate over lists of desired keys and values, and build the dictionary that you want that way:
my_keys = [1, 2, 1]
my_vals = ['A', 'B', 'C']
my_dict = {}
for k, v in zip(my_keys, my_vals):
    if k in my_dict.keys():
        if not isinstance(my_dict[k], list):
            my_dict[k] = [my_dict[k]] # convert to a list
        my_dict[k].append(v)
    else:
        my_dict[k] = v
print(my_dict)
# {1: ['A', 'C'], 2: 'B'}
Based on the comments, you originally had a dictionary in_dict = {'A': 1, 'B': 2, 'C':1}. Given this in_dict, you can get the desired result by setting my_keys = in_dict.values() and my_vals = in_dict.keys() in the code above.
You can also use a comprehension if you prefer:
a = {"A":1, "B":2, "C":1}
{value: [item_[0] for item_ in a.items() if item_[1] == value] for value in set(a.values())}
Output
{1: ['A', 'C'], 2: ['B']}
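The comprehension above rescans all items once per distinct value. A single-pass alternative that returns a plain dict can be sketched with dict.setdefault; the function name group_keys_by_value is made up for this example, and in_dict is the dictionary mentioned in the earlier answer.

```python
def group_keys_by_value(d):
    """Invert `d`, collecting every key that shares a value into one list."""
    grouped = {}
    for key, value in d.items():
        # setdefault creates the list the first time a value is seen
        grouped.setdefault(value, []).append(key)
    return grouped

in_dict = {"A": 1, "B": 2, "C": 1}
grouped = group_keys_by_value(in_dict)
print(grouped)  # {1: ['A', 'C'], 2: ['B']}
```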
Considering that I have two lists like:
l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = [
'x','q','we','da','po',
'a', 'el1', 'el2', 'el3', 'el4',
'b', 'some_other_el_1', 'some_other_el_2',
'c', 'another_element_1', 'another_element_2',
'd', '', '', 'another_element_3', 'd4'
]
I need to create a dictionary whose keys are the elements of the second list that also appear in the first, and whose values are lists of the elements found between consecutive "keys", like this:
result = {
'a': ['el1', 'el2', 'el3', 'el4'],
'b': ['some_other_el_1', 'some_other_el_2'],
'c': ['another_element_1', 'another_element_2'],
'd': ['', '', 'another_element_3', 'd4']
}
What's a more Pythonic way to do this?
Currently I'm doing this:
# I'm not sure that the first element in the second list
# will also be in the first, so I have to create a key
d = {}
k = ''
d[k] = []
for x in l2:
    if x in l1:
        k = x
        d[k] = []
    else:
        d[k].append(x)
But I'm fairly sure this is not the best way to do it, and it doesn't look nice either :)
Edit:
I should also mention that neither list is necessarily ordered, and the second list does not have to start with an element from the first one.
I don't think you'll do much better if this is the most specific statement of the problem. I mean I'd do it this way, but it's not much better.
import collections
d = collections.defaultdict(list)
s = set(l1)
k = ''
for x in l2:
    if x in s:
        k = x
    else:
        d[k].append(x)
For fun, you can also do this with itertools and 3rd party numpy:
import numpy as np
from itertools import zip_longest, islice
arr = np.where(np.isin(l2, l1))[0]  # np.isin replaces np.in1d, which is deprecated as of NumPy 2.0
res = {l2[i]: l2[i+1: j] for i, j in zip_longest(arr, islice(arr, 1, None))}
print(res)
{'a': ['el1', 'el2', 'el3', 'el4'],
'b': ['some_other_el_1', 'some_other_el_2'],
'c': ['another_element_1', 'another_element_2'],
'd': ['', '', 'another_element_3', 'd4']}
Here is a version using itertools.groupby. It may or may not be more efficient than the plain version from your post, depending on how groupby is implemented, because the for loop has fewer iterations.
from itertools import groupby
from collections import defaultdict, deque
def group_by_keys(keys, values):
    """
    >>> sorted(group_by_keys('abcdef', [
    ...     1, 2, 3,
    ...     'b', 4, 5,
    ...     'd',
    ...     'a', 6, 7,
    ...     'c', 8, 9,
    ...     'a', 10, 11, 12
    ... ]).items())
    [('a', [6, 7, 10, 11, 12]), ('b', [4, 5]), ('c', [8, 9])]
    """
    keys = set(keys)
    result = defaultdict(list)
    current_key = None
    for is_key, items in groupby(values, key=lambda x: x in keys):
        if is_key:
            current_key = deque(items, maxlen=1).pop()  # last of items
        elif current_key is not None:
            result[current_key].extend(items)
    return result
This doesn't distinguish between keys that don't occur in values at all (like e and f), and keys for which there are no corresponding values (like d). If this information is needed, one of the other solutions might be better suited.
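If that distinction matters, one possible sketch is a plain loop that registers a key as soon as it is seen, so a key like d ends up with an empty list while e and f stay absent. This swaps groupby for a simple loop; the name group_between_keys is made up for illustration.

```python
def group_between_keys(keys, values):
    """Group the items that follow each key; every key seen in `values` gets an entry."""
    key_set = set(keys)
    result = {}
    current_key = None
    for item in values:
        if item in key_set:
            current_key = item
            # Register the key even if nothing follows it
            result.setdefault(current_key, [])
        elif current_key is not None:
            result[current_key].append(item)
    return result

grouped = group_between_keys('abcdef', [
    1, 2, 3, 'b', 4, 5, 'd', 'a', 6, 7, 'c', 8, 9, 'a', 10, 11, 12,
])
print(grouped)
```

Here 'd' in grouped is true with an empty list, whereas 'e' in grouped is false.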
Updated ... Again
I misinterpreted the question. If you are using large lists then list comprehensions are the way to go and they are fairly simple once you learn how to use them.
I am going to use two list comprehensions.
idxs = [i for i, val in enumerate(l2) if val in l1] + [len(l2)+1]
res = {l2[idxs[i]]: list(l2[idxs[i]+1: idxs[i+1]]) for i in range(len(idxs)-1)}
print(res)
Results:
{'a': ['el1', 'el2', 'el3', 'el4'],
'b': ['some_other_el_1', 'some_other_el_2'],
'c': ['another_element_1', 'another_element_2'],
'd': ['', '', 'another_element_3', 'd4']}
Speed Testing for large lists:
import collections
l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = [
'x','q','we','da','po',
'a', 'el1', 'el2', 'el3', 'el4', *(str(i) for i in range(300)),
'b', 'some_other_el_1', 'some_other_el_2', *(str(i) for i in range(100)),
'c', 'another_element_1', 'another_element_2', *(str(i) for i in range(200)),
'd', '', '', 'another_element_3', 'd4'
]
def run_comp():
    idxs = [i for i, val in enumerate(l2) if val in l1] + [len(l2)+1]
    res = {l2[idxs[i]]: list(l2[idxs[i]+1: idxs[i+1]]) for i in range(len(idxs)-1)}
def run_other():
    d = collections.defaultdict(list)
    k = ''
    for x in l2:
        if x in l1:
            k = x
        else:
            d[k].append(x)
import timeit
print('For Loop:', timeit.timeit(run_other, number=1000))
print("List Comprehension:", timeit.timeit(run_comp, number=1000))
Results:
For Loop: 0.1327093063242541
List Comprehension: 0.09343156142774986
old stuff below
This is rather simple with list comprehensions.
{key: [val for val in l2 if key in val] for key in l1}
Results:
{'a': ['a', 'a1', 'a2', 'a3', 'a4'],
'b': ['b', 'b1', 'b2', 'b3', 'b4'],
'c': ['c', 'c1', 'c2', 'c3', 'c4'],
'd': ['d', 'd1', 'd2', 'd3', 'd4'],
'e': [],
'f': []}
The code below shows what is happening above.
d = {}
for key in l1:
d[key] = []
for val in l2:
if key in val:
d[key].append(val)
The list comprehension / dictionary comprehension (the first piece of code) is usually somewhat faster than the explicit loop. A comprehension builds the list with dedicated bytecode, whereas the loop has to look up and call d[key].append on every iteration. Appending itself is cheap (amortized constant time, because lists over-allocate as they grow), so the difference is a constant factor rather than a change in complexity.
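A rough way to check the comprehension-versus-append difference on your own machine is a small timeit comparison. This is only a sketch: the test data and the function names with_comprehension and with_append are made up, and the timings depend on the Python version and hardware, so no numbers are quoted here.

```python
import timeit

data = list(range(10_000))  # arbitrary test data

def with_comprehension():
    return [x * 2 for x in data]

def with_append():
    out = []
    for x in data:
        out.append(x * 2)
    return out

# Both variants must build the same list; only the timing differs.
same_result = with_comprehension() == with_append()
print('Same result:', same_result)
print('Comprehension:', timeit.timeit(with_comprehension, number=200))
print('Append loop:  ', timeit.timeit(with_append, number=200))
```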
References:
http://www.pythonforbeginners.com/basics/list-comprehensions-in-python
https://docs.python.org/3.6/tutorial/datastructures.html#list-comprehensions
You can use itertools.groupby:
import itertools
l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = ['x', 'q', 'we', 'da', 'po', 'a', 'el1', 'el2', 'el3', 'el4', 'b', 'some_other_el_1', 'some_other_el_2', 'c', 'another_element_1', 'another_element_2', 'd', '', '', 'another_element_3', 'd4']
groups = [[a, list(b)] for a, b in itertools.groupby(l2, key=lambda x:x in l1)]
final_dict = {groups[i][-1][-1]:groups[i+1][-1] for i in range(len(groups)-1) if groups[i][0]}
Output:
{'a': ['el1', 'el2', 'el3', 'el4'], 'b': ['some_other_el_1', 'some_other_el_2'], 'c': ['another_element_1', 'another_element_2'], 'd': ['', '', 'another_element_3', 'd4']}
Your code is readable, does the job and is reasonably efficient. There's no need to change much!
You could use more descriptive variable names and replace l1 with a set for faster lookup:
keys = {'a', 'c', 'b', 'e', 'f', 'd'}
keys_and_values = [
'x','q','we','da','po',
'a', 'el1', 'el2', 'el3', 'el4',
'b', 'some_other_el_1', 'some_other_el_2',
'c', 'another_element_1', 'another_element_2',
'd', '', '', 'another_element_3', 'd4'
]
current_key = None
result = {}
for x in keys_and_values:
    if x in keys:
        current_key = x
        result[current_key] = []
    elif current_key:
        result[current_key].append(x)
print(result)
# {'a': ['el1', 'el2', 'el3', 'el4'],
#  'b': ['some_other_el_1', 'some_other_el_2'],
#  'c': ['another_element_1', 'another_element_2'],
#  'd': ['', '', 'another_element_3', 'd4']}
def find_index():
    idxs = [l2.index(i) for i in set(l1).intersection(set(l2))]
    idxs.sort()
    idxs += [len(l2)+1]
    res = {l2[idxs[i]]: list(l2[idxs[i]+1: idxs[i+1]]) for i in range(len(idxs)-1)}
    return res
Comparison of methods, using justengel's test (timeit results in seconds):
justengel
    run_comp: .455
    run_other: .244
mkrieger1
    group_by_keys: .160
me
    find_index: .068
Note that my method ignores keys that don't appear in l2, and doesn't handle cases where keys appear more than once in l2. Adding in empty lists for keys that don't appear in l2 can be done by {**res, **{key: [] for key in set(l1).difference(set(l2))}}, which raises the time to .105.
Even cleaner than turning l1 into a set: use the keys of the dictionary you're building, like this:
d = {x: [] for x in l1}
k = None
for x in l2:
    if x in d:
        k = x
    elif k is not None:
        d[k].append(x)
This is because (in the worst case) your code would be iterating over all the values in l1 for every value in l2 on the if x in l1: line, because checking if a value is in a list takes linear time. Checking if a value is in a dictionary's keys is constant time in the average case (same with sets, as already suggested by Eric Duminil).
I set k to None and check for it because your code would've returned d with '': ['x','q','we','da','po'], which is presumably not what you want. This assumes l1 can't contain None.
My solution also assumes it's okay for the resulting dictionary to contain keys with empty lists if there are items in l1 that never appear in l2. If that's not okay, you can remove them at the end with
final_d = {k: v for k, v in d.items() if v}
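Put together as one function and run on the lists from the question, this approach might look as follows. It is a sketch only: the function name split_by_keys is made up, and the final filtering step is the optional one described above.

```python
def split_by_keys(keys, items):
    """Collect the items that follow each key; items before the first key are dropped."""
    d = {key: [] for key in keys}
    current = None
    for item in items:
        if item in d:          # average constant-time lookup in the dict's keys
            current = item
        elif current is not None:
            d[current].append(item)
    # Drop keys that never collected anything (the optional final_d step)
    return {key: found for key, found in d.items() if found}

l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = ['x', 'q', 'we', 'da', 'po',
      'a', 'el1', 'el2', 'el3', 'el4',
      'b', 'some_other_el_1', 'some_other_el_2',
      'c', 'another_element_1', 'another_element_2',
      'd', '', '', 'another_element_3', 'd4']
result = split_by_keys(l1, l2)
print(result)
```

The keys of the result follow the order of l1, since the dictionary is created from l1 before l2 is scanned.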
Here is a list containing duplicates:
l1 = ['a', 'b', 'c', 'a', 'a', 'b']
Here is the desired result:
l1 = ['a', 'b', 'c', 'a_1', 'a_2', 'b_1']
How can the duplicates be renamed by appending a count number?
Here is an attempt to achieve this goal; however, is there a more Pythonic way?
for index in range(len(l1)):
    counter = 1
    list_of_duplicates_for_item = [dup_index for dup_index, item in enumerate(l1) if item == l1[index] and l1.count(l1[index]) > 1]
    for dup_index in list_of_duplicates_for_item[1:]:
        l1[dup_index] = l1[dup_index] + '_' + str(counter)
        counter = counter + 1
In Python, generating a new list is usually much easier than changing an existing list. We have generators to do this efficiently. A dict can keep count of occurrences.
l = ['a', 'b', 'c', 'a', 'a', 'b']
def rename_duplicates(old):
    seen = {}
    for x in old:
        if x in seen:
            seen[x] += 1
            yield "%s_%d" % (x, seen[x])
        else:
            seen[x] = 0
            yield x
print(list(rename_duplicates(l)))
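The same idea can be written a little more compactly with collections.defaultdict, which removes the explicit membership test. This is a sketch, not the answer's own code; the name rename_duplicates_counted and the sep parameter are assumptions added for illustration.

```python
from collections import defaultdict

def rename_duplicates_counted(items, sep="_"):
    """Yield items, suffixing the 2nd, 3rd, ... occurrence with sep + running count."""
    seen = defaultdict(int)  # item -> number of earlier occurrences
    for item in items:
        count = seen[item]
        seen[item] += 1
        yield item if count == 0 else f"{item}{sep}{count}"

renamed = list(rename_duplicates_counted(['a', 'b', 'c', 'a', 'a', 'b']))
print(renamed)  # ['a', 'b', 'c', 'a_1', 'a_2', 'b_1']
```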
I would do something like this:
a1 = ['a', 'b', 'c', 'a', 'a', 'b']
a2 = []
d = {}
for i in a1:
    d.setdefault(i, -1)
    d[i] += 1
    if d[i] >= 1:
        a2.append('%s_%d' % (i, d[i]))
    else:
        a2.append(i)
print(a2)
Based on your comment to @mathmike, if your ultimate goal is to create a dictionary from a list with duplicate keys, I would use a defaultdict from the collections module.
>>> from collections import defaultdict
>>> multidict = defaultdict(list)
>>> multidict['a'].append(1)
>>> multidict['b'].append(2)
>>> multidict['a'].append(11)
>>> multidict
defaultdict(<class 'list'>, {'a': [1, 11], 'b': [2]})
I think the output you're asking for is messy itself, and so there is no clean way of creating it.
How do you intend to use this new list? Would a dictionary of counts like the following work instead?
{'a':3, 'b':2, 'c':1}
If so, I would recommend:
from collections import defaultdict
d = defaultdict(int) # values default to 0
for key in l1:
    d[key] += 1
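The standard library also has collections.Counter, which does this counting in one call. A short sketch using the list from the question:

```python
from collections import Counter

l1 = ['a', 'b', 'c', 'a', 'a', 'b']
counts = Counter(l1)           # counts each distinct item
print(dict(counts))            # {'a': 3, 'b': 2, 'c': 1}
print(counts.most_common(1))   # [('a', 3)]
```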
I wrote this approach for renaming duplicates in a list with any separator and a numeric or alphabetical postfix (e.g. _1, _2 or _a, _b, _c etc.). It might not be the most efficient solution, but I like it as clean, readable code that is also easy to extend.
def rename_duplicates(label_list, seperator="_", mode="numeric"):
    """
    options for 'mode': numeric, alphabet
    """
    import string
    if not isinstance(label_list, list) or not isinstance(seperator, str):
        raise TypeError("label_list and seperator must be of type list and str, respectively")
    for item in label_list:
        l_count = label_list.count(item)
        if l_count > 1:
            if mode == "alphabet":
                postfix_str = string.ascii_lowercase
                if len(postfix_str) < l_count:
                    # do something
                    pass
            elif mode == "numeric":
                # a list of strings, so postfixes above 9 are not split into single digits
                postfix_str = [str(i+1) for i in range(l_count)]
            else:
                raise ValueError("the 'mode' could be either 'numeric' or 'alphabet'")
            postfix_iter = iter(postfix_str)
            # rename each occurrence in place, first to last
            for i in range(l_count):
                item_index = label_list.index(item)
                label_list[item_index] += seperator + next(postfix_iter)
    return label_list
label_list = ['a', 'b', 'c', 'a', 'a', 'b']
Use the function:
rename_duplicates(label_list)
Result:
['a_1', 'b_1', 'c', 'a_2', 'a_3', 'b_2']
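The function above calls list.count and list.index repeatedly and modifies the list it is given. If that matters, a single-pass variant that returns a new list can be sketched with collections.Counter. It covers only the numeric mode, and the names rename_all_duplicates and sep are made up for this example.

```python
from collections import Counter

def rename_all_duplicates(labels, sep="_"):
    """Return a new list where every label occurring more than once gets a numeric suffix."""
    totals = Counter(labels)   # how often each label occurs overall
    seen = Counter()           # how often each label has occurred so far
    renamed = []
    for label in labels:
        if totals[label] > 1:
            seen[label] += 1
            renamed.append(f"{label}{sep}{seen[label]}")
        else:
            renamed.append(label)
    return renamed

labels = ['a', 'b', 'c', 'a', 'a', 'b']
new_labels = rename_all_duplicates(labels)
print(new_labels)  # ['a_1', 'b_1', 'c', 'a_2', 'a_3', 'b_2']
```

The input list labels is left unchanged, so the original names remain available afterwards.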