How to accept any key in python json dictionary? - python

Here's a simplified version of the JSON I am working with:
{
"libraries": [
{
"library-1": {
"file": {
"url": "foobar.com/.../library-1.bin"
}
}
},
{
"library-2": {
"application": {
"url": "barfoo.com/.../library-2.exe"
}
}
}
]
}
Using json, I can json.loads() this file. I need to be able to find the 'url', download it, and save it to a local folder called library. In this case, I'd create two folders within libraries/, one called library-1, the other library-2. Within these folder's would be whatever was downloaded from the url.
The issue, however, is being able to get to the url:
my_json = json.loads(...) # get the json
for library in my_json['libraries']:
file.download(library['file']['url']) # doesn't access ['application']['url']
Since the JSON I am using uses a variety of accessors, sometimes 'file', other times 'dll' etc, I can't use one specific dictionary key. How can I use multiple. Would there be a modular way to do this?
Edit: There are numerous accessors, 'file', 'application' and 'dll' are only some examples.

You can just iterate through each level of the dictionary and download the files if you find a url.
urls = []
for library in my_json['libraries']:
for lib_name, lib_data in library.items():
for module_name, module_data in lib_data.items():
url = module_data.get('url')
if url is not None:
# create local directory with lib_name
# download files from url to local directory
urls.append(url)
# urls = ['foobar.com/.../library-1.bin', 'barfoo.com/.../library-2.exe']

This should work:
for library in my_json['libraries']:
for value in library.values():
for url in value.values():
file.download(url)

I would suggest doing it like this:
for library in my_json['libraries']:
library_data = library.popitem()[1].popitem()[1]
file.download(library_data['url'])

Try this
for library in my_json['libraries']:
if 'file' in library:
file.download(library['file']['url'])
elif 'dll' in library:
file.download(library['dll']['url'])
It just sees if your dict(created by parsing JSON) has a key named 'file'. If so, then use 'url' of the dict corresponds to the 'file' key. If not, try the same with 'dll' keyword.
Edit: If you don't know the key to access the dict containing the url, try this.
for library in my_json['libraries']:
for key in library:
if 'url' in library['key']:
file.download(library[key]['url'])
This iterates over all the keys in your library. Then, whichever key contains an 'url', downloads using that.

Related

Unable to parse JSON from file using Python3

I'm trying to get the value of ["pooled_metrics"]["vmaf"]["harmonic_mean"] from a JSON file I want to parse using python. This is the current state of my code:
for crf in crf_ranges:
vmaf_output_crf_list_log = job['config']['output_dir'] + '/' + build_name(stream) + f'/vmaf_{crf}.json'
# read the vmaf_output_crf_list_log file and get value from ["pooled_metrics"]["vmaf"]["harmonic_mean"]
with open(vmaf_output_crf_list_log, 'r') as json_vmaf_file:
# load the json_string["pooled_metrics"] into a python dictionary
vm = json.loads(json_vmaf_file.read())
vmaf_values.append((crf, vm["pooled_metrics"]["vmaf"]["harmonic_mean"]))
This will give me back the following error:
AttributeError: 'dict' object has no attribute 'loads'
I always get back the same AttributeError not matter if I use "load" or "loads".
I validated the contents of the JSON, which is valid using various online validators, but still, I am not able to load the JSON for further parsing operations.
I expect that I can load a file that contains valid JSON data. The content of the file looks like this:
{
"frames": [
{
"frameNum": 0,
"metrics": {
"integer_vif_scale2": 0.997330,
}
},
],
"pooled_metrics": {
"vmaf": {
"min": 89.617207,
"harmonic_mean": 99.868023
}
},
"aggregate_metrics": {
}
}
Can somebody provide me some advice onto this behavior, what does it seem so absolutely impossible to load this JSON file?
loads is a method for the json library as the docs say https://docs.python.org/3/library/json.html#json.loads. In this case you are having a AttributeError this means that probably you have created another variable named "json" and when you call json.loads is calling that variable hence it won't have a loads method.

Expanding a Scribunto module that doesn't have a function

I want to get the return value of this Wikimedia Scribunto module in Python. Its source code is roughly like this:
local Languages = {}
Languages = {
["aa"] = {
name = "afarština",
dir = "ltr",
name_attr_gen_pl = "afarských"
},
-- More languages...
["zza"] = {
name = "zazaki",
dir = "ltr"
}
}
return Languages
In the Wiktextract library, there is already Python code to accomplish similar tasks:
def expand_template(sub_domain: str, text: str) -> str:
import requests
# https://www.mediawiki.org/wiki/API:Expandtemplates
params = {
"action": "expandtemplates",
"format": "json",
"text": text,
"prop": "wikitext",
"formatversion": "2",
}
r = requests.get(f"https://{sub_domain}.wiktionary.org/w/api.php",
params=params)
data = r.json()
return data["expandtemplates"]["wikitext"]
This works for languages like French because there the Scribunto module has a well-defined function that returns a value, as an example here:
Scribunto module:
p = {}
function p.affiche_langues_python(frame)
-- returns the needed stuff here
end
The associated Python function:
def get_fr_languages():
# https://fr.wiktionary.org/wiki/Module:langues/analyse
json_text = expand_template(
"fr", "{{#invoke:langues/analyse|affiche_langues_python}}"
)
json_text = json_text[json_text.index("{") : json_text.index("}") + 1]
json_text = json_text.replace(",\r\n}", "}") # remove tailing comma
data = json.loads(json_text)
lang_data = {}
for lang_code, lang_name in data.items():
lang_data[lang_code] = [lang_name[0].upper() + lang_name[1:]]
save_json_file(lang_data, "fr")
But in our case we don't have a function to call.
So if we try:
def get_cs_languages():
# https://cs.wiktionary.org/wiki/Modul:Languages
json_text = expand_template(
"cs", "{{#invoke:Languages}}"
)
print(json_text)
we get <strong class="error"><span class="scribunto-error" id="mw-scribunto-error-0">Chyba skriptu: Musíte uvést funkci, která se má zavolat.</span></strong> usage: get_languages.py [-h] sub_domain lang_code get_languages.py: error: the following arguments are required: sub_domain, lang_code. (Translated as "You have to specify a function you want to call. But when you enter a function name as a parameter like in the French example, it complains that that function does not exist.)
What could be a way to solve this?
The easiest and most general way is to get the return value of the module as JSON and parse it in Python.
Make another module that exports a function dump_as_json that takes the name of the first module as a frame argument and returns the first module as JSON. In Python, expand {{#invoke:json module|dump_as_json|Module:module to dump}} using the expandtemplates API and parse the return value of the module invocation as JSON with json.loads(data["expandtemplates"]["wikitext"]).
Text of Module:json module (call it what you want):
return {
dump_as_json = function(frame)
local module_name = frame.args[0]
local json_encode = mw.text.jsonEncode
-- json_encode = require "Module:JSON".toJSON
return json_encode(require(module_name))
end
}
With pywikibot:
from pywikibot import Site
site = Site(code="cs", fam="wiktionary")
languages = json.loads(site.expand_text("{{#invoke:json module|dump_as_json|Module:module to dump}}")
If you get the error Lua error: Cannot pass circular reference to PHP, this means that at least one of the tables in Module:module to dump is referenced by another table more than once, like if the module was
local t = {}
return { t, t }
To handle these tables, you will have to get a pure-Lua JSON encoder function to replace mw.text.jsonEncode, like the toJSON function from Module:JSON on English Wiktionary.
One warning about this method that is not relevant for the module you are trying to get: string values in the JSON will only be accurate if they were NFC-normalized valid UTF-8 with no special ASCII control codes (U+0000-U+001F excluding tab U+0009 and LF U+000A) when they were returned from Module:module to dump. As on a wiki page, the expandtemplates API will replace ASCII control codes and invalid UTF-8 with the U+FFFD character, and will NFC-normalize everything else. That is, "\1\128e" .. mw.ustring.char(0x0301) would be modified to the equivalent of mw.ustring.char(0xFFFD, 0xFFFD, 0x00E9). This doesn't matter in most cases (like if the table contains readable text), but if it did matter, the JSON-encoding module would have to output JSON escapes for non-NFC character sequences and ASCII control codes and find some way to encode invalid UTF-8.
If, like the module you are dumping, Module:module to dump is a pure table of literal values with no references to other modules or to Scribunto-only global values, you could also get its raw wikitext with the Revisions API and parse it in Lua on your machine and pass it to Python. I think there is a Python extension that allows you to directly use a Lua state in Python.
Running a module with dependencies on the local machine is not possible unless you go to the trouble of setting up the full Scribunto environment on your machine, and figuring out a way to download the module dependencies and make them available to the Lua state. I have sort of done this myself, but it isn't necessary for your use case.

Get a Json file that represents device(object) from a specific folder and generating kafka event

I need make tool using Python that takes JSON file(device) from a folder(devices) and generates kafka events for the device.events topic. before generating event it has to check if device exists in folder.
I wrote this but i still dont know how to open file inside the folder.
import json
test_devices = r'''
{
"data": ["{\"mac_address\": null, \"serial_number\": \"BRUCE03\", \"device_type\": \"STORAGE\", \"part_number\": \"STCHPRTffOM01\", \"extra_attributes\": [], \"platform_customer_id\": \"a44a5c7c65ae11eba916dc31c0b885\", \"application_customer_id\": \"4b232de87dcf11ebbe01ca32d32b6b77\"}"],
"id": "b139937e-5107-4125-b9b0-d05d17ad2bea",
"source":"CCS",
"specversion":"1.0",
"time": "2021-01-13T14:44:18.972181+00:00",
"type":"` `",
"topic":"` `"
}
'''
data = json.loads(test_devices)
print(type(test_devices))
print(data)
Kafka is an implementation detail and doesn't relate to your questions.
how to open file inside the folder
import json
with open("/path/to/device.json") as f:
content = json.load(f)
print(content)
has to check if device exists in folder
import os.path
print("Exists? :", os.path.exists("/path/to/device.json"))

How to print attributes of a json file

I would appreciate some help: how could I print just the country from the info obtained via this API call? Thanks!
import requests
import json
url = "https://randomuser.me/api/"
data = requests.get(url).json()
print(data)
You should play a little more with the json in order to learn how to use it, a helpful way to understand them is to go layer by layer printing the keys dict.keys() to see where you should go next if you dont have a documentation
in this particular case it returns a dictionary with the following first layer structure:
{
"results": [ ... ]
"info": { ... }
}
where results contains a single dictionary inside, therefore we can take
data['results'][0] to wok with
there is 'location', and there is a 'country', you can access this in that order to print the country:
print(data['results'][0]['location']['country'])

Python: Pretty print json file with array ID's

I would like to pretty print a json file where i can see the array ID's. Im working on a Cisco Nexus Switch with NX-OS that runs Python (2.7.11). Looking at following code:
cmd = 'show interface Eth1/1 counters'
out = json.loads(clid(cmd))
print (json.dumps(out, sort_keys=True, indent=4))
This gives me:
{
"TABLE_rx_counters": {
"ROW_rx_counters": [
{
"eth_inbytes": "442370508663",
"eth_inucast": "76618907",
"interface_rx": "Ethernet1/1"
},
{
"eth_inbcast": "4269",
"eth_inmcast": "49144",
"interface_rx": "Ethernet1/1"
}
]
},
"TABLE_tx_counters": {
"ROW_tx_counters": [
{
"eth_outbytes": "217868085254",
"eth_outucast": "66635610",
"interface_tx": "Ethernet1/1"
},
{
"eth_outbcast": "1137",
"eth_outmcast": "557815",
"interface_tx": "Ethernet1/1"
}
]
}
}
But i need to access the field by:
rxuc = int(out['TABLE_rx_counters']['ROW_rx_counters'][0]['eth_inucast'])
rxmc = int(out['TABLE_rx_counters']['ROW_rx_counters'][1]['eth_inmcast'])
rxbc = int(out['TABLE_rx_counters']['ROW_rx_counters'][1]['eth_inbcast'])
txuc = int(out['TABLE_tx_counters']['ROW_tx_counters'][0]['eth_outucast'])
txmc = int(out['TABLE_tx_counters']['ROW_tx_counters'][1]['eth_outmcast'])
txbc = int(out['TABLE_tx_counters']['ROW_tx_counters'][1]['eth_outbcast'])
So i need to know the array ID (in this example zeros and ones) to access the information for this interface. It seems pretty easy with only 2 arrays, but imagine 500. Right now, i always copy the json code to jsoneditoronline.org where i can see the ID's:
Is there an easy way to make the IDs visible within python itself?
You posted is valid JSON.
The image is from a tool that takes the data from JSON and displays it. You can display it in any way you want, but the contents in the file will need to be valid JSON.
If you do not need to load the JSON later, you can do with it whatever you like, but json.dumps() will give you JSON only.

Categories

Resources