How to match two dictionaries by comparing their values - Python

I want each key in the resulting dict to be a key from dict1 (k1), and its value to be the key from dict2 (k2) whose value v2 equals the value v1 stored under k1 in dict1 (v2 == v1).
The first dict is -
dict1 = {
"a": "121",
"b": "132",
"c": "312",
"d": "434",
"e": "564",
"f": "663",
}
The second one is -
dict2 = {
"a": "312",
"b": "121",
"c": "564",
"d": "663",
"e": "434",
"f": "132",
}
The result should look like this -
Results = {
"a": "b",
"b": "f",
"c": "a",
"d": "e",
"e": "c",
"f": "d",
}
A dict holds key-value pairs. I would like to compare each value of dict1 with the values of dict2 and print the matching key of dict2.

You need to create a new dict that swaps the keys and values of dict2. Note that your requirement implies the values in dict2 are unique and can serve as keys.
Then look up each value of dict1 in that inverted dict:
dict1 = {
"0": "1",
"1": "4",
"2": "5",
"3": "6",
"4": "7",
"5": "8",
}
dict2 = {
"0": "6",
"1": "8",
"2": "4",
"3": "1",
"4": "5",
"5": "7",
}
dict3 = {val:key for key, val in dict2.items()}
result = {key:dict3.get(val) for key, val in dict1.items()}
print(result)
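Applied to the dictionaries from the question, the same inversion can be sketched as below. The helper name match_keys is hypothetical, and the sketch assumes the values in dict2 are unique and hashable; a key of dict1 whose value has no match in dict2 maps to None.

```python
def match_keys(dict1, dict2):
    """Map each key of dict1 to the key of dict2 that holds the same value."""
    # Invert dict2 once so each lookup is a single dict access.
    # Assumption: the values of dict2 are unique, otherwise later keys win.
    inverted = {val: key for key, val in dict2.items()}
    return {key: inverted.get(val) for key, val in dict1.items()}


dict1 = {"a": "121", "b": "132", "c": "312", "d": "434", "e": "564", "f": "663"}
dict2 = {"a": "312", "b": "121", "c": "564", "d": "663", "e": "434", "f": "132"}

result = match_keys(dict1, dict2)
print(result)
```

This reproduces the Results dict shown in the question.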

Related

How to save a multi-level dict with one entry per line?

I have this dict
dd = {
"A": {"a": {"1": "b", "2": "f"}, "z": ["z", "q"]},
"B": {"b": {"1": "c", "2": "g"}, "z": ["x", "p"]},
"C": {"c": {"1": "d", "2": "h"}, "z": ["y", "o"]},
}
and I want to write it to a file with each top-level entry on a single line. I used
with open('file.json', 'w') as file: json.dump(dd, file, indent=1)
# result
{
"A": {
"a": {
"1": "b",
"2": "f"
},
"z": [
"z",
"q"
]
},
"B": {
"b": {
"1": "c",
"2": "g"
},
"z": [
"x",
"p"
]
},
"C": {
"c": {
"1": "d",
"2": "h"
},
"z": [
"y",
"o"
]
}
}
I also tried this, but it turned each entry into a single string and wrote the nested dicts and lists with the wrong quoting:
with open('file.json', 'w') as file: file.write('{\n' +',\n'.join(json.dumps(f"{i}: {dd[i]}") for i in dd) +'\n}')
# result
{
"A: {'a': {'1': 'b', '2': 'f'}, 'z': ['z', 'q']}",
"B: {'b': {'1': 'c', '2': 'g'}, 'z': ['x', 'p']}",
"C: {'c': {'1': 'd', '2': 'h'}, 'z': ['y', 'o']}"
}
The result I want is
{
"A": {"a": {"1": "b", "2": "f"}, "z": ["z", "q"]},
"B": {"b": {"1": "c", "2": "g"}, "z": ["x", "p"]},
"C": {"c": {"1": "d", "2": "h"}, "z": ["y", "o"]},
}
How do I write the JSON content with one line per top-level dict, with everything inside each entry on that one line too?
I plan to read it back using json.load.
The stdlib json module does not really support that, but you should be able to write a function that does something similar pretty easily. Something like:
import json
def my_dumps(dd):
    lines = []
    for k, v in dd.items():
        lines.append(json.dumps({k: v})[1:-1])
    return "{\n" + ",\n".join(lines) + "\n}"
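As a quick check that output built this way can still be read back, here is a minimal round-trip sketch of the same idea. The function name dumps_one_entry_per_line is hypothetical, and json.loads on the string stands in for json.load on the file.

```python
import json


def dumps_one_entry_per_line(dd):
    """Serialize a dict as JSON with one top-level entry per line."""
    # json.dumps({k: v}) yields '{"k": ...}'; slicing [1:-1] drops the outer braces.
    lines = [json.dumps({k: v})[1:-1] for k, v in dd.items()]
    return "{\n" + ",\n".join(lines) + "\n}"


dd = {
    "A": {"a": {"1": "b", "2": "f"}, "z": ["z", "q"]},
    "B": {"b": {"1": "c", "2": "g"}, "z": ["x", "p"]},
    "C": {"c": {"1": "d", "2": "h"}, "z": ["y", "o"]},
}

text = dumps_one_entry_per_line(dd)
print(text)

# The result is still valid JSON, so it parses back to the original dict.
restored = json.loads(text)
```

Note that the last entry carries no trailing comma; JSON does not allow one, so the trailing comma in the desired output shown in the question would not load.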
If all you wanted was to wrap the JSON to a more human-friendly line width, without spacing everything out the way the indent option does, then another option might be using textwrap:
>>> import textwrap
>>> print("\n".join(textwrap.wrap(json.dumps(dd), 51)))
{"A": {"a": {"1": "b", "2": "f"}, "z": ["z", "q"]},
"B": {"b": {"1": "c", "2": "g"}, "z": ["x", "p"]},
"C": {"c": {"1": "d", "2": "h"}, "z": ["y", "o"]}}
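Alternatively, build the lines by hand:
# Note: str() plus quote replacement only yields valid JSON while the data
# contains no apostrophes and no values such as None, True or False.
x = ['{\n']
for i in dd:
    x.append('"'+i+'": '+str(dd[i]).replace("'",'"')+",\n")
x[-1] = x[-1][:-2]
x.append("\n}")
with open('file.json', 'w') as file:
    file.writelines(x)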

ETL by parsing JSON dynamically, Python

I am new to Python.
I want to read the auth column from PostgreSQL, which contains JSON. I need to parse it and extract the relevant API credentials. Based on these, I want to fetch the data, which is again JSON, but this time deeply nested, and the number of objects can vary from one response to another. From this JSON, I want to get all the keys and insert them as rows of the sourceColumnNames column in the source table. The target may have fewer columns than the source, say only a and d from the source, stored as name and PostalCode.
I am wondering how I can achieve this. It looks like something that would be done with Scala case classes (target and source model classes), but it needs to be done in Python. How?
Data in AuthColumn is
{ "url": "https://api.myUrl.com/v2",
"headers": {
"Authorization": "TheSecretAccessToken2022",
"Content-Type": "application/json"
},
"data": {
"query": "{ boards{ items{ name column_values {a b c d} } } }"
} }
I need to parse it to get credentials and execute the query.
Then it will return some JSON which I need to parse.
This JSON could be like this
{
"data": {
"boards": [{
"name": "DP",
"id": "123",
"description": null,
"items": [{
"name": "TheColumn",
"column_values": [{
"a": "PDs",
"b": "PDs",
"c": "CI",
"d": "PV"
}, {
"a": "SLUD",
"b": "SLUD",
"c": "d",
"d": "MFO"
}, {
"a": "ST",
"b": "ST",
"c": "CI",
"d": "UC"
}, {
"a": "c",
"b": "c",
"c": "CI",
"d": "NC"
}, {
"a": "OP",
"b": "op",
"c": "CI",
"d": "0 days"
}, {
"a": "OPd",
"b": "OPd",
"c": "CI",
"d": "2022-02-25"
}, {
"a": "CD",
"b": "cd",
"c": "d",
"d": "2022-02-25"
}, {
"a": "cld",
"b": "cld",
"c": "d",
"d": "2022-04-22"
}, {
"a": "SoDce",
"b": "soDce",
"c": "CI",
"d": ""
}, {
"a": "MOD",
"b": "MOD",
"c": "date",
"d": ""
}, {
"a": "PP",
"b": "PP",
"c": "nuUDic",
"d": "625000"
}, {
"a": "UD",
"b": "UD",
"c": "nuUDic",
"d": ""
}, {
"a": "PAVSP",
"b": "PAVSP",
"c": "neUDic",
"d": ""
}, {
"a": "LendeUD",
"b": "lendeUD",
"c": "CI",
"d": "TBD"
}, {
"a": "ESP",
"b": "ESP",
"c": "CI",
"d": ""
}, {
"a": "ac",
"b": "ac",
"c": "CI",
"d": "Chicago"
}, {
"a": "SLd",
"b": "SLd",
"c": "CI",
"d": ""
}, {
"a": "UA",
"b": "UA",
"c": "CI",
"d": ""
}, {
"a": "UD",
"b": "UD",
"c": "CI",
"d": ""
}, {
"a": "R?",
"b": "R",
"c": "CI",
"d": ""
}, {
"a": "DDE",
"b": "DDE",
"c": "CI",
"d": ""
}, {
"a": "SOD",
"b": "SOD",
"c": "CI",
"d": ""
}, {
"a": "NOS",
"b": "NOS",
"c": "d",
"d": ""
}]
}, {
"name": "BBB",
"column_values": [{
"a": "PeUDs",
"b": "PeUDs",
"c": "CI",
"d": "PV"
}, {
"a": "SLUD",
"b": "SLUD",
"c": "d",
"d": "Ddd"
}, {
"a": "ST",
"b": "ST",
"c": "CI",
"d": "UC"
}, {
"a": "c",
"b": "c",
"c": "CI",
"d": "NC"
}, {
"a": "OP",
"b": "op",
"c": "CI",
"d": "0 days"
}, {
"a": "OPd",
"b": "OPd",
"c": "CI",
"d": "2022-02-23"
}, {
"a": "CD",
"b": "cd",
"c": "d",
"d": "2022-02-23"
}, {
"a": "cld",
"b": "cld",
"c": "d",
"d": "2022-03-04"
}, {
"a": "SoDce",
"b": "soDce",
"c": "CI",
"d": ""
}, {
"a": "MOD",
"b": "MOD",
"c": "date",
"d": ""
}, {
"a": "PP",
"b": "PP",
"c": "nuUDic",
"d": "3200"
}, {
"a": "UD",
"b": "UD",
"c": "numeic",
"d": ""
}, {
"a": "PDVSP",
"b": "PDVSP",
"c": "nueUDic",
"d": ""
}, {
"a": "ESP",
"b": "ESP",
"c": "CI",
"d": ""
}, {
"a": "ac",
"b": "ac",
"c": "CI",
"d": "Chicago a"
}, {
"a": "SLd",
"b": "SLd",
"c": "CI",
"d": ""
}, {
"a": "UA",
"b": "UA",
"c": "CI",
"d": ""
}, {
"a": "UD",
"b": "UD",
"c": "CI",
"d": ""
}, {
"a": "R?",
"b": "R",
"c": "CI",
"d": ""
}, {
"a": "DDE",
"b": "DDE",
"c": "CI",
"d": "DooU"
}, {
"a": "SOD",
"b": "SOD",
"c": "CI",
"d": ""
}, {
"a": "IU",
"b": "IU",
"c": "CI",
"d": ""
},{ "a": "DD",
"b": "DD",
"c": "CI",
"d": ""
}, {
"a": "LOS",
"b": "LOS",
"c": "num",
"d": ""
}, {
"a": "NOS",
"b": "NOS",
"c": "d",
"d": ""
}] }] }] }}
Now I want to parse this JSON, get the keys, and insert them into the columnNames column of the metadata table
as
sourceColumnNames
name
id
description
items_name
a
b
c
d
Then I will query auth, get the credentials, and fetch values based on these source columns.
So far,
I have parsed the JSON with Python's json module, using hard-coded indexes.
import json
with open('path/file.json') as myJson:
    read_myjson = json.load(myJson)
read_data = read_myjson['data']
read_board = read_myjson['data']['boards']
board_name = read_myjson['data']['boards'][0]['name']
board_id = read_myjson['data']['boards'][0]['id']
board_description = read_myjson['data']['boards'][0]['description']
board_items = read_myjson['data']['boards'][0]['items']
board_items_name = read_myjson['data']['boards'][0]['items'][0]['name']
board_items_columnValues = read_myjson['data']['boards'][0]['items'][0]['column_values']
board_items_columnValues_title = read_myjson['data']['boards'][0]['items'][0]['column_values'][0]['a']
board_items_columnValues_id = read_myjson['data']['boards'][0]['items'][0]['column_values'][0]['b']
board_items_columnValues_type = read_myjson['data']['boards'][0]['items'][0]['column_values'][0]['c']
board_items_columnValues_text = read_myjson['data']['boards'][0]['items'][0]['column_values'][0]['d']
# for loop on Header
print("printing Header loop : ")
for key, val in read_myjson.items():
    print(key, ":::", val)
    headerKey = key
    headerValue = val
print("printing data loop : it gives board key and its value")
for key, val in read_data.items():
    # print(key, ":::", val)
    datakey = key
    dataValue = val
    # print(datakey, "::::", dataValue)
print(" items loop")
# for key, val in read_board.items():
for item in board_items:
    for key, val in item.items():
        # print(key, ":::", val)
        compDataAsKey = key
        compDataAsValue = val
print(" Items_column_values loop")
columnKeys = []
columnValues = []
for items in board_items_columnValues:
    for key, val in items.items():
        # print(key, ":", val)
        # compColumnKey = key
        # compColumnValue = val
        columnKeys.append(key)
        columnValues.append(val)
I have also tried dataclasses in Python, but I can't work out how to map the class onto the parsed JSON.
import json
import orjson, dataclasses
with open('path/AuthJsonSample.json') as myJson:
    read_myjson = json.load(myJson)
#dataclasses.dataclass
class AuthData:
    url: str
    headers: str
    data: str
How can I build this ETL pipeline?
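For the key-collection step, one possible starting point is a recursive walk over the parsed JSON instead of hard-coded indexes. The sketch below is only an illustration: collect_key_paths is a hypothetical helper, the shortened sample stands in for the real API response, and the dotted path naming is an assumption that would still need mapping to the sourceColumnNames convention (for example items_name). Fetching the response and writing to PostgreSQL are not covered.

```python
import json


def collect_key_paths(node):
    """Return the unique leaf key paths of a parsed JSON value, in first-seen order."""
    paths = {}  # a dict used as an ordered set

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            for child in value:
                walk(child, path)  # list indexes are left out of the path
        else:
            paths.setdefault(path, None)  # scalar or null: record the path once

    walk(node, "")
    return list(paths)


# Shortened stand-in for the API response shown in the question.
sample = json.loads("""
{"data": {"boards": [{
    "name": "DP", "id": "123", "description": null,
    "items": [
        {"name": "TheColumn", "column_values": [{"a": "PDs", "b": "PDs", "c": "CI", "d": "PV"}]},
        {"name": "BBB", "column_values": [{"a": "ST", "b": "ST", "c": "CI", "d": "UC"}]}
    ]
}]}}
""")

source_columns = collect_key_paths(sample)
for column in source_columns:
    print(column)
```

Because list indexes are skipped, items with more or fewer objects produce the same set of paths, which suits responses whose size varies. Empty lists and empty dicts contribute no path in this sketch.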

Get column's max value from a nested dictionary

The output below is a pretty-printed snapshot of a portion of a dictionary that I am trying to work with. I'm looking to output the highest value of all entries in column p, as well as its main dictionary key.
In the example output below, the value for p in GRTEUR is higher than any other values of p from any of the other main keys so I would like to return the main key and the value, so GRTEUR and -0.1752234098475558.
I've read about Pandas and using pandas.DataFrame.max() but I'm not finding any examples on how to evaluate the values from a key (p) of a nested dictionary (1h).
Any pointers?
data = {
"LUNAEUR": {
"1h": {
"ot": "2021-07-09 08:00:00",
"o": 6.033,
"h": 6.551,
"l": 5.983,
"ct": "2021-07-09 08:59:59.999000",
"p": -1.660459342023591
},
"stream0": {
"c": 6.444,
"v": 1393.808,
"ct": "2021-07-09 09:59:59.999000"
},
"stream1": {
"c": 6.446,
"v": 1171.177,
"ct": "2021-07-09 09:59:59.999000"
}
},
"THETAEUR": {
"1h": {
"ot": "2021-07-09 08:00:00",
"o": 4.992,
"h": 5.076,
"l": 4.956,
"ct": "2021-07-09 08:59:59.999000",
"p": -0.2963841138114934
},
"stream0": {
"c": 5.061,
"v": 492.138,
"ct": "2021-07-09 09:59:59.999000"
},
"stream1": {
"c": 5.067,
"v": 423.079,
"ct": "2021-07-09 09:59:59.999000"
}
},
"GRTEUR": {
"1h": {
"ot": "2021-07-09 08:00:00",
"o": 0.5616,
"h": 0.5717,
"l": 0.5523,
"ct": "2021-07-09 08:59:59.999000",
"p": -0.1752234098475558
},
"stream0": {
"c": 0.5707,
"v": 105.17,
"ct": "2021-07-09 09:59:59.999000"
},
"stream1": {
"c": 0.571,
"v": 19.71,
"ct": "2021-07-09 09:59:59.999000"
}
}
}
Use Python's max(..., key=...) to pick the entry with the largest p:
key, value = max(data.items(), key=lambda x: x[1]["1h"]["p"])
print(key, value["1h"]["p"])
To ignore entries whose "1h" dict doesn't contain "p", you could either provide a very small default value
import sys
max(data.items(), key=lambda x: x[1]["1h"].get("p", -sys.float_info.max))
or filter before finding the max:
max(((key, val) for key, val in data.items() if "p" in val["1h"]),
    key=lambda x: x[1]["1h"]["p"])
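The filtering variant can be wrapped in a small function that returns the main key together with its p value. This is a sketch under assumptions: max_p is a hypothetical name, the data is trimmed to the fields that matter, and the NOPEUR and EMPTYEUR entries are made up to show the filtering. It also skips entries that lack "1h" altogether.

```python
def max_p(data, interval="1h", field="p"):
    """Return (main_key, value) for the largest field value, or None if there is none."""
    candidates = (
        (name, entry[interval][field])
        for name, entry in data.items()
        if field in entry.get(interval, {})  # skip entries missing "1h" or "p"
    )
    # default=None avoids a ValueError when no entry qualifies.
    return max(candidates, key=lambda pair: pair[1], default=None)


data = {
    "LUNAEUR": {"1h": {"p": -1.660459342023591}},
    "THETAEUR": {"1h": {"p": -0.2963841138114934}},
    "GRTEUR": {"1h": {"p": -0.1752234098475558}},
    "NOPEUR": {"1h": {"o": 1.0}},  # made-up entry without "p"
    "EMPTYEUR": {},                # made-up entry without "1h"
}

best = max_p(data)
print(best)
```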
Another option is a small helper built on functools.reduce that walks a sequence of nested keys and returns the value found there. Maybe you could try this:
from functools import reduce
def deep_get(dictionary, *keys):
    # Follow the keys one level at a time; return None if any level is missing.
    return reduce(lambda d, key: d.get(key) if isinstance(d, dict) else None, keys, dictionary)
val_list = []
key_list = ["LUNAEUR", "THETAEUR", "GRTEUR"]
for item in key_list:
    val_list.append(deep_get(data, item, "1h", "p"))
print(max(val_list))
Output:
-0.1752234098475558

Python - Create a list of objects based on counts of another list of objects

I have a list of objects and I want to create another list, grouped by "Name", with two fields that count how many items of each "Type" belong to that name.
I have this :
result = [
{"Name": "Foo", "Type": "A", "RandomColumn1": "1"},
{"Name": "Bar", "Type": "B", "RandomColumn2": "2"},
{"Name": "Foo", "Type": "A", "RandomColumn3": "3"},
{"Name": "Bar", "Type": "A", "RandomColumn4": "4"},
{"Name": "Foo", "Type": "B", "RandomColumn5": "5"},
]
I am trying to get a count of the number of different "Type" columns whilst discarding any other column - RandomColumnX in this case.
I want the above to come out like this:
[{"Name": "Foo", "A": 2, "B": 1}, {"Name": "Bar", "A": 1, "B": 1}]
I tried doing something like this :
group_requests = [{
"Name": key,
"A": len([d for d in list(value) if d.get('Type') == 'A']),
"B": len([y for y in list(value) if y['Type'] == 'B']),
} for key, value in groupby(result, key=lambda x: x['Name'])]
However, it does not count the values in the "B" column and the count for this key is always 0.
Can anybody help me?
Mistake 1.
For itertools.groupby to work, your input iterable needs to already be sorted on the same key function.
result = sorted(result, key=lambda x: x["Name"])
Mistake 2.
The returned group, i.e. value, is itself an iterator, so you need to save its contents in a list in order to iterate over it multiple times.
import itertools
group_requests = []
for key, value in itertools.groupby(result, key=lambda x: x["Name"]):
    value = list(value)  # save the output
    temp = {
        "Name": key,
        "A": len([d for d in value if d.get("Type") == "A"]),
        "B": len([y for y in value if y["Type"] == "B"]),
    }
    group_requests.append(temp)
If someone wants a version without list comprehensions, it can be achieved like this:
from collections import defaultdict
result = [{"Name": "Foo", "Type": "A", "RandomColumn1": "1"},
          {"Name": "Bar", "Type": "B", "RandomColumn2": "2"},
          {"Name": "Foo", "Type": "A", "RandomColumn3": "3"},
          {"Name": "Bar", "Type": "A", "RandomColumn4": "4"},
          {"Name": "Foo", "Type": "B", "RandomColumn5": "5"}]
group_requests = []
counterA = defaultdict(int)
counterB = defaultdict(int)
names = set()
for val in result:
    name = val["Name"]
    item_type = val["Type"]  # avoid shadowing the built-in type()
    if item_type == "A":
        counterA[name] += 1
    else:
        counterB[name] += 1
    names.add(name)
for name in names:
    group_requests.append({
        "Name": name,
        "A": counterA[name],
        "B": counterB[name]
    })
print(group_requests)
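A further option that needs neither sorting nor groupby is to keep one collections.Counter per name. This is a sketch, not part of the answers above: count_types is a hypothetical helper, and the tuple of types to report ("A", "B") is an assumption taken from the desired output in the question.

```python
from collections import Counter


def count_types(rows, types=("A", "B")):
    """Count how often each Type occurs per Name, keeping first-seen name order."""
    counts = {}  # Name -> Counter of Type values
    for row in rows:
        counts.setdefault(row["Name"], Counter())[row["Type"]] += 1
    # A Counter returns 0 for types it has never seen.
    return [
        {"Name": name, **{t: counter[t] for t in types}}
        for name, counter in counts.items()
    ]


result = [
    {"Name": "Foo", "Type": "A", "RandomColumn1": "1"},
    {"Name": "Bar", "Type": "B", "RandomColumn2": "2"},
    {"Name": "Foo", "Type": "A", "RandomColumn3": "3"},
    {"Name": "Bar", "Type": "A", "RandomColumn4": "4"},
    {"Name": "Foo", "Type": "B", "RandomColumn5": "5"},
]

group_requests = count_types(result)
print(group_requests)
```

Since plain dicts preserve insertion order, the names come out in the order they first appear, which matches the output requested in the question.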

Parsing a json nested dictionary in python

I need to reach the value "y" of "B" in Result.
{
"Response": {
"Result": [2]
0: {
"A": "x"
"B": "y"
"C": "z"
}
1: {
"A": "d"
"B": "e"
"C": "f"
"D": "g"
}
}
}
My attempt ['Response']['Result'][0]['B'] produces the following error
IndexError: list index out of range
Any help will be appreciated. Thanks.
The key 0 is not under "Result"; you should use ['Response'][0]['B'].
