How to write Python to replace the following Perl code?

I have just encountered Perl code similar to the following:
my @keys = qw/ 1 2 3 /;
my @vals = qw/ a b c /;
my %hash;
@hash{@keys} = @vals;
This code populates an associative array given a list of keys and a list of values. For example, the above code creates the following data structure (expressed as JSON):
{
"1": "a",
"2": "b",
"3": "c"
}
How would one go about doing this in Python?

Like this:
import json
keys = [1, 2, 3]
vals = ['a', 'b', 'c']
hash = dict(zip(keys, vals))
json.dumps(hash)
=> '{"1": "a", "2": "b", "3": "c"}'
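Perl's hash slice assignment can also fill a hash that already holds entries. A rough Python counterpart, sketched here with made-up sample data, is dict.update fed by zip. One difference to keep in mind: zip stops at the shorter input, whereas Perl assigns undef to keys that have no value. itertools.zip_longest gets closer to that behaviour by padding with None. The names table and padded are illustrative only.

```python
from itertools import zip_longest

keys = ["1", "2", "3"]   # qw// yields strings in Perl, so strings are used here
vals = ["a", "b"]        # deliberately one value short (hypothetical data)

table = {"0": "z"}       # an existing dictionary (hypothetical data)
table.update(zip(keys, vals))   # zip stops at the shorter list, so "3" is skipped

# zip_longest pads the shorter input with None instead of stopping early
padded = dict(zip_longest(keys, vals))

print(table)
print(padded)
```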

That JSON is also valid Python: the same text is a dict literal. Once you assign it to a name, though, it is no longer valid JSON.
hf = {
"1": "a",
"2": "b",
"3": "c"
}
You can also iteratively align items into a dictionary.
letters = ('a', 'b', 'c', )
numbers = ('1', '2', '3', )
hf = { n : l for n, l in zip(numbers, letters) }

You can do:
>>> keys='123'
>>> vals='abc'
>>> dict(zip(keys,vals))
{'1': 'a', '3': 'c', '2': 'b'}
(Python note: strings are iterable, so list('abc') is the rough equivalent of my @vals = qw/ a b c /; in Perl)
Then if you want JSON:
>>> import json
>>> json.dumps(dict(zip(keys,vals)))
'{"1": "a", "3": "c", "2": "b"}'

Related

comparing inner dictionaries in Python

I am trying to create a Python function that receives a dictionary whose values are inner dictionaries. If the keys of the inner dictionaries are the same, it should return 1, if not it should return 0.
This is the code I tried:
def f(dct: dict) -> int:
    for i in range(len(dct)):
        for j in range(len(dct)):
            dct1 = list(dct.values())
            if dct1[i].keys() == dct1[j].keys():
                return 1
            else:
                return 0
It actually worked when the input dictionary had only two inner dictionaries, but it did not work for three.
For example:
f(
{
"A": {1: "a", 2: "b"},
"B": {2: "c", 3: "d"},
}
)
returned 0 (which is the result I wanted)
but
f(
{
"A": {1: "a", 2: "b"},
"B": {2: "c", 3: "d"},
"C": {1: "c", 2: "d"},
}
)
returned 1, which is not the result I wanted.
How do I fix it, please?
So you want to ensure all of the dictionaries that are the values of dct have the same keys (ignoring the values)?
def all_key_sets_equal(dct: dict) -> bool:
    key_sets = [set(nd) for nd in dct.values()]
    return all(key_set == key_sets[0] for key_set in key_sets)
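If the function should keep the 1/0 return value from the question, a minimal sketch could look like the following. It compares every inner dictionary against the first one instead of returning on the first pair it sees. Treating an empty input as "all equal" is an assumption, since the question does not cover that case.

```python
def f(dct: dict) -> int:
    """Return 1 if every inner dictionary has the same keys, else 0."""
    inner = iter(dct.values())
    first = next(inner, None)
    if first is None:
        # Empty input: treated as "all equal" (an assumption, not from the question)
        return 1
    # dict.keys() views compare like sets, so key order does not matter
    return int(all(d.keys() == first.keys() for d in inner))


two = {"A": {1: "a", 2: "b"}, "B": {2: "c", 3: "d"}}
three = {"A": {1: "a", 2: "b"}, "B": {2: "c", 3: "d"}, "C": {1: "c", 2: "d"}}
same = {"A": {1: "a", 2: "b"}, "B": {2: "x", 1: "y"}}
print(f(two), f(three), f(same))
```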

Converting a dataset from one form to another

I have one dataset in a list form and I want to convert it into another dataset under certain conditions.
Conditions
"a" = 1
"b" = 2
"c" = 3
input_list = ["a", "b", "c"]
# something happens
output_list = [1, 2, 3]
What to do?
Represent your set of conditions in a dictionary:
conditions = {
"a": 1,
"b": 2,
"c": 3
}
Use that dictionary in order to generate the output:
input_list = ["a", "b", "c"]
output_list = [conditions[x] for x in input_list]
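The comprehension above raises KeyError if the input contains an item with no condition. Two possible ways to handle that are sketched below; the extra item "x" and the default value 0 are assumptions made for illustration.

```python
conditions = {"a": 1, "b": 2, "c": 3}
input_list = ["a", "b", "x", "c"]   # "x" has no condition (hypothetical input)

# Option 1: substitute a default (0 here, an arbitrary choice) for unknown items
with_default = [conditions.get(x, 0) for x in input_list]

# Option 2: skip unknown items entirely
known_only = [conditions[x] for x in input_list if x in conditions]

print(with_default)
print(known_only)
```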
What you want to achieve is a mapping.
Your conditions are a map, dictionary (Python lingo) or hash-table.
Each value there (like 1) corresponds to a key in the dataset (like "a").
To represent this mapping you can use a table-like datastructure that maps a key to a value. In Python this datastructure is called a dictionary:
mapping = {"a": 1, "b": 2, "c": 3} # used to translate from key to value (e.g. "a" to 1)
input_list = ["a", "b", "c"]
# something happens
output_list = []
for v in input_list:
    mapped_value = mapping[v] # get the corresponding value, translate or map it
    print(v + " -> " + str(mapped_value)) # str() is needed because mapped_value is an int
    output_list.append(mapped_value)
print(output_list) # [1, 2, 3]
See also
Map (mathematics) - Wikipedia
Convert numbers into corresponding letter using Python

Create a nested tree from list

From a list of lists, I would like to create a nested dictionary of which the keys would point to the next value in the sublist. In addition, I would like to count the number of times a sequence of sublist values occurred.
Example:
From a list of lists as such:
[['a', 'b', 'c'],
['a', 'c'],
['b']]
I would like to create a nested dictionary as such:
{
'a': {
{'b':
{
'c':{}
'count_a_b_c': 1
}
'count_a_b*': 1
},
{'c': {},
'count_a_c': 1
}
'count_a*': 2
},
{
'b':{},
'count_b':1
}
}
Please note that the names of the keys for counts do not matter, they were named as such for illustration.
I was curious how I would do this and came up with this:
lst = [['a', 'b', 'c'],
       ['a', 'c'],
       ['b']]
tree = {}
for branch in lst:
    count_str = 'count_*'
    last_node = branch[-1]
    cur_tree = tree
    for node in branch:
        if node == last_node:
            count_str = count_str[:-2] + f'_{node}'
        else:
            count_str = count_str[:-2] + f'_{node}_*'
        cur_tree[count_str] = cur_tree.get(count_str, 0) + 1
        cur_tree = cur_tree.setdefault(node, {})
Nothing special is happening here.
For your example:
import json
print(json.dumps(tree, sort_keys=True, indent=4))
produces:
{
    "a": {
        "b": {
            "c": {},
            "count_a_b_c": 1
        },
        "c": {},
        "count_a_b_*": 1,
        "count_a_c": 1
    },
    "b": {},
    "count_a_*": 2,
    "count_b": 1
}
It does not exactly reproduce what you had in mind, but that is partly because your desired result is not a valid Python dictionary.
It may still be a starting point for solving your problem.
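Since the question says the names of the count keys do not matter, another possible layout keeps each node's count under a fixed key next to its children. The sketch below is one way to do that; the function name build_count_tree and the keys "count" and "children" are illustrative choices. Note that the count here is the number of sublists passing through a prefix, which is a simpler scheme than the one above (it does not distinguish sequences that end at a node from those that continue).

```python
def build_count_tree(branches):
    """Build a nested dict where each node stores a count and its children."""
    root = {"count": 0, "children": {}}
    for branch in branches:
        node = root
        for label in branch:
            # Create the child node on first sight, then step into it
            node = node["children"].setdefault(label, {"count": 0, "children": {}})
            node["count"] += 1   # one more sublist passes through this prefix
    return root["children"]


lst = [['a', 'b', 'c'],
       ['a', 'c'],
       ['b']]
tree = build_count_tree(lst)
print(tree['a']['count'])
```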

Merge values of a dictionary by key based on custom function

Assume you have two dictionaries and you want to merge them by applying a function to the values that have matching keys. Here I use the + operator as the binary function.
x = { 1: "a", 2: "b", 3: "c" }
y = { 1: "A", 2: "B", 3: "C" }
result = { t[0][0]: t[0][1] + t[1][1] for t in zip(sorted(x.items()), sorted(y.items())) }
print result # gives { 1: "aA", 2: "bB", 3: "cC" }
I would prefer a self contained expression instead of statements, but this is unreadable.
So far I'm doing:
def dzip(f, a, b):
    least_keys = set.intersection(set(a.keys()), set(b.keys()))
    copy_dict = dict()
    for i in least_keys: # a set is iterated directly; it has no .keys() method
        copy_dict[i] = f(a[i], b[i])
    return copy_dict
print dzip(lambda a,b: a+b,x,y)
Is there a more readable solution to this than the expression I gave?
In the first case, you can directly use a dict comprehension:
>>> x = { 1: "a", 2: "b", 3: "c" }
>>> y = { 1: "A", 2: "B", 3: "C" }
>>> {key: x.get(key, "") + y.get(key, "") for key in set.intersection(set(x.keys()), set(y.keys()))}
{1: 'aA', 2: 'bB', 3: 'cC'}
Your second piece of code can then be reduced to a one-liner:
def dzip(f, a, b):
    return {key: f(a.get(key, ""), b.get(key, "")) for key in set.intersection(set(a.keys()), set(b.keys()))}
You can even define dzip as a lambda:
dzip = lambda f, a, b: {key: f(a.get(key, ""), b.get(key, ""))
for key in set.intersection(set(a.keys()), set(b.keys()))}
In a single run, this becomes:
>>> dzip = lambda f, a, b: {key: f(a.get(key, ""), b.get(key, ""))
... for key in set.intersection(set(a.keys()), set(b.keys()))}
>>>
>>> print dzip(lambda a,b: a+b,x,y)
{1: 'aA', 2: 'bB', 3: 'cC'}
Note that this works even if x and y have different sets of keys: only keys present in both are kept. Your first expression can break in that case, because it pairs items by sorted position rather than by key.
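On Python 3, dict.keys() returns a set-like view, so the intersection can be written with the & operator. The sketch below assumes Python 3 and uses operator.add in place of the lambda; the sample dictionary y with a non-shared key is made up for illustration.

```python
import operator


def dzip(f, a, b):
    """Combine the values of keys present in both a and b using f."""
    # Key views support set operations, so & yields the shared keys directly
    return {key: f(a[key], b[key]) for key in a.keys() & b.keys()}


x = {1: "a", 2: "b", 3: "c"}
y = {1: "A", 2: "B", 4: "D"}   # hypothetical data: keys 3 and 4 are not shared
merged = dzip(operator.add, x, y)
print(merged)
```

Because only shared keys are visited, the a.get(key, "") defaults are no longer needed.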
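You can use Counter for this type of dict merging when the values are numeric counts. With string values, as below, it only works on Python 2 and only for keys present in both dictionaries; on Python 3, Counter addition compares each result with 0 and raises TypeError for strings.
>>> from collections import Counter
>>> Counter(x) + Counter(y)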
Counter({3: 'cC', 2: 'bB', 1: 'aA'})

Comparing Python dictionaries and finding the difference between the two

I'm trying to write a Python program that takes two .json files, compares their contents, and displays the differences between the two. So far my program takes user input to select two files and compares them just fine. I have hit a wall trying to figure out how to print what the actual differences between the two files are.
My program:
#!/usr/bin/env python2
import json
# get_json() requests user input to select a .json file
# and creates a Python dict with the information
def get_json():
    file_name = raw_input("Enter name of JSON File: ")
    with open(file_name) as json_file:
        json_data = json.load(json_file)
    return json_data
# compare_json(x,y) takes 2 dicts and compares the contents
# prints match if equal, or not a match if there are differences
def compare_json(x,y):
    for x_values, y_values in zip(x.iteritems(), y.iteritems()):
        if x_values == y_values:
            print 'Match'
        else:
            print 'Not a match'
def main():
    json1 = get_json()
    json2 = get_json()
    compare_json(json1, json2)
if __name__ == "__main__":
    main()
example of my .json:
{
"menu": {
"popup": {
"menuitem": [
{
"onclick": "CreateNewDoc()",
"value": "New"
},
{
"onclick": "OpenDoc()",
"value": "Open"
},
{
"onclick": "CloseDoc()",
"value": "Close"
}
]
},
"id": "file",
"value": "File"
}
}
Your problem stems from comparing the dictionaries by position. zip(x.iteritems(), y.iteritems()) pairs the first item of one dictionary with the first item of the other, and so on. Both dictionaries compute their item order with the same algorithm, but when they hold different keys, the same key may not sit at the same position in both lists, so it can end up paired with an unrelated key. As a result, you are much better off checking whether each key exists in the other dictionary and comparing the associated values.
def compare_json(x,y):
    for x_key in x:
        if x_key in y and x[x_key] == y[x_key]:
            print 'Match'
        else:
            print 'Not a match'
    if any(k not in x for k in y):
        print 'Not a match'
If you want to print out the actual differences:
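def printDiffs(x,y):
    diff = False
    for x_key in x:
        if x_key not in y:
            diff = True
            print "key %s in x, but not in y" %x_key
        elif x[x_key] != y[x_key]:
            diff = True
            print "key %s in x and in y, but values differ (%s in x and %s in y)" %(x_key, x[x_key], y[x_key])
    # also report keys that exist only in y, otherwise they go unnoticed
    for y_key in y:
        if y_key not in x:
            diff = True
            print "key %s in y, but not in x" %y_key
    if not diff:
        print "both files are identical"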
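The functions above compare top-level keys only, while the sample file is nested. A possible recursive variant is sketched below in Python 3 (not the Python 2 used in the question). The name iter_diffs, the MISSING sentinel, and the slash-separated path format are illustrative choices, not part of the original answer, and the sample data is a trimmed, hypothetical version of the menu file.

```python
MISSING = object()   # marks a key or index that exists on one side only


def iter_diffs(left, right, path=""):
    """Yield (path, left_value, right_value) for every difference found."""
    if isinstance(left, dict) and isinstance(right, dict):
        for key in sorted(set(left) | set(right), key=str):
            child = f"{path}/{key}"
            if key not in left:
                yield (child, MISSING, right[key])
            elif key not in right:
                yield (child, left[key], MISSING)
            else:
                yield from iter_diffs(left[key], right[key], child)
    elif isinstance(left, list) and isinstance(right, list):
        # Lists are compared position by position
        for index in range(max(len(left), len(right))):
            child = f"{path}/{index}"
            if index >= len(left):
                yield (child, MISSING, right[index])
            elif index >= len(right):
                yield (child, left[index], MISSING)
            else:
                yield from iter_diffs(left[index], right[index], child)
    elif left != right:
        yield (path, left, right)


a = {"menu": {"id": "file", "popup": {"menuitem": [{"value": "New"}, {"value": "Open"}]}}}
b = {"menu": {"id": "edit", "popup": {"menuitem": [{"value": "New"}]}}}
diffs = list(iter_diffs(a, b))
for where, old, new in diffs:
    print(where, old if old is not MISSING else "<missing>", new if new is not MISSING else "<missing>")
```

Comparing lists by index is the simplest choice; it reports every later element as changed when an item is inserted in the middle, which a dedicated diff library handles more gracefully.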
You might want to try out the jsondiff library in python.
https://pypi.python.org/pypi/jsondiff/0.1.0
The examples referenced from the site are below.
>>> from jsondiff import diff
>>> diff({'a': 1}, {'a': 1, 'b': 2})
{<insert>: {'b': 2}}
>>> diff({'a': 1, 'b': 3}, {'a': 1, 'b': 2})
{<update>: {'b': 2}}
>>> diff({'a': 1, 'b': 3}, {'a': 1})
{<delete>: ['b']}
>>> diff(['a', 'b', 'c'], ['a', 'b', 'c', 'd'])
{<insert>: [(3, 'd')]}
>>> diff(['a', 'b', 'c'], ['a', 'c'])
{<delete>: [1]}
# Similar items get patched
>>> diff(['a', {'x': 3}, 'c'], ['a', {'x': 3, 'y': 4}, 'c'])
{<update>: [(1, {<insert>: {'y': 4}})]}
# Special handling of sets
>>> diff({'a', 'b', 'c'}, {'a', 'c', 'd'})
{<add>: set(['d']), <discard>: set(['b'])}
# Parse and dump JSON
>>> print diff('["a", "b", "c"]', '["a", "c", "d"]', parse=True, dump=True, indent=2)
{
  "$delete": [
    1
  ],
  "$insert": [
    [
      2,
      "d"
    ]
  ]
}
