How to print attributes of a json file - python

I would appreciate some help: how could I print just the country from the info obtained via this API call? Thanks!
import requests
import json
url = "https://randomuser.me/api/"
data = requests.get(url).json()
print(data)

You should play with the JSON a little more in order to learn how to use it. A helpful way to understand a structure is to go layer by layer, printing the keys with dict.keys() to see where you should go next if you don't have any documentation.
In this particular case, the API returns a dictionary with the following first-layer structure:
{
    "results": [ ... ],
    "info": { ... }
}
where results contains a single dictionary inside, so we can take
data['results'][0] to work with.
Inside it there is 'location', and within that a 'country'; access them in that order to print the country:
print(data['results'][0]['location']['country'])
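For example, exploring the response layer by layer might look like this (a quick sketch; the exact keys depend on what the API returns on a given call):
data = requests.get(url).json()
print(data.keys())  # dict_keys(['results', 'info'])
print(data['results'][0].keys())  # includes 'location' among other keys
print(data['results'][0]['location'].keys())  # includes 'country'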

Related

Find all unique values for field in Elasticsearch through python

I've been scouring the web for some good Python documentation for Elasticsearch. I've got a query that I know returns the information I need, but I'm struggling to convert the raw string into something Python can interpret.
This query, which I have taken from a dashboarding tool that accesses the same data, returns a list of all unique 'VALUE's in the dataset:
{"find": "terms", "field": "hierarchy1.hierarchy2.VALUE"}
But I don't seem to be able to convert this into correct Python.
I've tried this:
body_test = {"find": "terms", "field": "hierarchy1.hierarchy2.VALUE"}
es = Elasticsearch(SETUP CONNECTION)
es.search(
    index="INDEX_NAME",
    body=body_test
)
but it doesn't like the find value. I can't find anything in the documentation about find.
RequestError: RequestError(400, 'parsing_exception', 'Unknown key for a VALUE_STRING in [find].')
The only way I've gotten it to partially work is with
es_search = (
    Search(
        using=es,
        index=db_index
    ).source(['hierarchy1.hierarchy2.VALUE'])
)
But I think this is pulling the entire dataset and then filtering (which I obviously don't want to do each time I run this code). This needs to be done through Python, so I cannot simply POST the query I know works.
I am completely new to ES, so this is all a little confusing. Thanks in advance!
So it turns out that the find in this case was specific to Grafana (the dashboarding tool I took the query from).
In the end I used this site and used the code from there. It's a LOT more complicated than I thought it was going to be. But it works very quickly and doesn't put a strain on the database (which my alternative method was doing).
In case the link dies in future years, here's the code I used:
from elasticsearch import Elasticsearch

es = Elasticsearch()

def iterate_distinct_field(es, fieldname, pagesize=250, **kwargs):
    """
    Helper to get all distinct values from ElasticSearch
    (ordered by number of occurrences)
    """
    compositeQuery = {
        "size": pagesize,
        "sources": [{
            fieldname: {
                "terms": {
                    "field": fieldname
                }
            }
        }]
    }
    # Iterate over pages
    while True:
        result = es.search(**kwargs, body={
            "aggs": {
                "values": {
                    "composite": compositeQuery
                }
            }
        })
        # Yield each bucket
        for aggregation in result["aggregations"]["values"]["buckets"]:
            yield aggregation
        # Set "after" field
        if "after_key" in result["aggregations"]["values"]:
            compositeQuery["after"] = \
                result["aggregations"]["values"]["after_key"]
        else:  # Finished!
            break

# Usage example
for result in iterate_distinct_field(es, fieldname="pattern.keyword", index="strings"):
    print(result)  # e.g. {'key': {'pattern': 'mypattern'}, 'doc_count': 315}
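The composite aggregation is the key design choice here: instead of pulling documents and deduplicating client-side, Elasticsearch returns one bucket per distinct value and pages through them via after_key, so the load on the cluster stays small. If the field has only a modest number of distinct values, a plain terms aggregation is a simpler alternative (a minimal sketch, assuming the same es client and index names as above; the 1000 cap is an illustrative value, not from the original):
body = {
    "size": 0,  # return no documents, only the aggregation buckets
    "aggs": {
        "unique_values": {
            "terms": {"field": "hierarchy1.hierarchy2.VALUE", "size": 1000}
        }
    }
}
response = es.search(index="INDEX_NAME", body=body)
values = [bucket["key"] for bucket in response["aggregations"]["unique_values"]["buckets"]]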

How to use json API information from python requests

I want to be able to read JSON API information and, depending on what it contains, make something happen. For example:
I get this information from a Streamelements API.
{
    "donation": {
        "user": {
            "username": "StreamElements",
            "geo": null,
            "email": "streamelements@streamelements.com"
        },
        "message": "This is a test",
        "amount": 100,
        "currency": "USD"
    },
    "provider": "paypal",
    "status": "success",
    "deleted": false,
    "_id": "5c0aab85de9a4c6756a14e0d",
    "channel": "5b2e2007760aeb7729487dab",
    "transactionId": "IMPORTED",
    "createdAt": "2018-12-07T17:19:01.957Z",
    "approved": "allowed",
    "updatedAt": "2018-12-07T17:19:01.957Z"
}
I then want to check whether the amount on that specific tip is $10, and if so, make something happen.
This is what I have so far but I do not know how to get the right variable:
data = json.loads(url.text)
if (data[0]['amount'] == 10):
    DoTheThing();
The amount field is nested under the inner donation object:
data = json.loads(url.text)
donation_amount = data["donation"]["amount"]
if donation_amount == 10:
    DoTheThing()  # do something
Verified using the Stream Elements Tips API documentation.
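A minimal end-to-end sketch of the same check (API_URL is a placeholder, not the real Stream Elements endpoint; DoTheThing is the question's stand-in function; requests.get(...).json() gives the same result as json.loads on the response text):
import requests

API_URL = "https://example.com/tips"  # placeholder endpoint
data = requests.get(API_URL).json()  # equivalent to json.loads(response.text)
if data["donation"]["amount"] == 10:
    DoTheThing()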

Python: Pretty print json file with array ID's

I would like to pretty print a JSON file in a way that lets me see the array IDs. I'm working on a Cisco Nexus switch with NX-OS, which runs Python (2.7.11). Looking at the following code:
cmd = 'show interface Eth1/1 counters'
out = json.loads(clid(cmd))
print (json.dumps(out, sort_keys=True, indent=4))
This gives me:
{
    "TABLE_rx_counters": {
        "ROW_rx_counters": [
            {
                "eth_inbytes": "442370508663",
                "eth_inucast": "76618907",
                "interface_rx": "Ethernet1/1"
            },
            {
                "eth_inbcast": "4269",
                "eth_inmcast": "49144",
                "interface_rx": "Ethernet1/1"
            }
        ]
    },
    "TABLE_tx_counters": {
        "ROW_tx_counters": [
            {
                "eth_outbytes": "217868085254",
                "eth_outucast": "66635610",
                "interface_tx": "Ethernet1/1"
            },
            {
                "eth_outbcast": "1137",
                "eth_outmcast": "557815",
                "interface_tx": "Ethernet1/1"
            }
        ]
    }
}
But I need to access the fields like this:
rxuc = int(out['TABLE_rx_counters']['ROW_rx_counters'][0]['eth_inucast'])
rxmc = int(out['TABLE_rx_counters']['ROW_rx_counters'][1]['eth_inmcast'])
rxbc = int(out['TABLE_rx_counters']['ROW_rx_counters'][1]['eth_inbcast'])
txuc = int(out['TABLE_tx_counters']['ROW_tx_counters'][0]['eth_outucast'])
txmc = int(out['TABLE_tx_counters']['ROW_tx_counters'][1]['eth_outmcast'])
txbc = int(out['TABLE_tx_counters']['ROW_tx_counters'][1]['eth_outbcast'])
So I need to know the array ID (in this example, the zeros and ones) to access the information for this interface. It seems pretty easy with only 2 arrays, but imagine 500. Right now, I always copy the JSON code to jsoneditoronline.org, where I can see the IDs.
Is there an easy way to make the IDs visible within Python itself?
What you posted is valid JSON.
The image is from a tool that takes the data from the JSON and displays it. You can display the data any way you want, but the contents of the file will need to remain valid JSON.
If you do not need to load the JSON later, you can do with it whatever you like; json.dumps(), however, will give you JSON only.
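That said, if the goal is just to see the list indices while reading the output, you can print them yourself instead of copying the JSON into an external viewer (a small sketch against the structure above; it assumes every table is a dict of lists of row dicts, as in the example, and the format string keeps it Python 2.7 compatible):
for table_key, table in out.items():  # e.g. "TABLE_rx_counters"
    for row_key, rows in table.items():  # e.g. "ROW_rx_counters"
        for idx, row in enumerate(rows):
            print("%s %s [%d] %s" % (table_key, row_key, idx, json.dumps(row, sort_keys=True)))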

How to accept any key in python json dictionary?

Here's a simplified version of the JSON I am working with:
{
    "libraries": [
        {
            "library-1": {
                "file": {
                    "url": "foobar.com/.../library-1.bin"
                }
            }
        },
        {
            "library-2": {
                "application": {
                    "url": "barfoo.com/.../library-2.exe"
                }
            }
        }
    ]
}
Using json, I can json.loads() this file. I need to be able to find the 'url', download it, and save it to a local folder called libraries. In this case, I'd create two folders within libraries/, one called library-1, the other library-2. Within these folders would be whatever was downloaded from the url.
The issue, however, is being able to get to the url:
my_json = json.loads(...)  # get the json
for library in my_json['libraries']:
    file.download(library['file']['url'])  # doesn't access ['application']['url']
Since the JSON I am using has a variety of accessors, sometimes 'file', other times 'dll', etc., I can't use one specific dictionary key. How can I handle multiple keys? Is there a modular way to do this?
Edit: There are numerous accessors; 'file', 'application' and 'dll' are only some examples.
You can just iterate through each level of the dictionary and download the files if you find a url.
urls = []
for library in my_json['libraries']:
    for lib_name, lib_data in library.items():
        for module_name, module_data in lib_data.items():
            url = module_data.get('url')
            if url is not None:
                # create local directory with lib_name
                # download files from url to local directory
                urls.append(url)

# urls = ['foobar.com/.../library-1.bin', 'barfoo.com/.../library-2.exe']
This should work:
for library in my_json['libraries']:
    for value in library.values():  # e.g. {'file': {'url': ...}}
        for module in value.values():  # e.g. {'url': ...}
            file.download(module['url'])
I would suggest doing it like this:
for library in my_json['libraries']:
    # popitem() removes and returns an arbitrary (key, value) pair
    library_data = library.popitem()[1].popitem()[1]
    file.download(library_data['url'])
Try this
for library in my_json['libraries']:
    for module in library.values():  # the dict under 'library-1', 'library-2', ...
        if 'file' in module:
            file.download(module['file']['url'])
        elif 'dll' in module:
            file.download(module['dll']['url'])
It checks whether the inner dict (created by parsing the JSON) has a key named 'file'. If so, it uses the 'url' of the dict corresponding to the 'file' key. If not, it tries the same with the 'dll' key.
Edit: If you don't know the key to access the dict containing the url, try this.
for library in my_json['libraries']:
    for key in library:
        for accessor in library[key]:
            if 'url' in library[key][accessor]:
                file.download(library[key][accessor]['url'])
This iterates over all the keys in your library, and then over each inner dict; whichever one contains a 'url', it downloads using that.
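If the nesting depth itself can vary, a small recursive walk is a more general option (a sketch, not from any of the answers above; find_urls is a hypothetical helper name):
def find_urls(obj):
    # Recursively yield every value stored under a 'url' key,
    # however deeply it is nested in dicts and lists.
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == 'url':
                yield value
            else:
                yield from find_urls(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_urls(item)

urls = list(find_urls(my_json))
# ['foobar.com/.../library-1.bin', 'barfoo.com/.../library-2.exe']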

Best way to change type of column in dynamoDB

I have a DynamoDB table that is filled from different services/sources. The table has the following schema:
{
    "Id": 14782,
    "ExtId": 1478240974, // pay attention: it is a Number
    "Name": "name1"
}
Some time after the services started working, I found that one service sends data in an incorrect format. It looks like:
{
    "Id": 14782,
    "ExtId": "1478240974", // pay attention: it is a String
    "Name": "name1"
}
DynamoDB is a NoSQL database, so I now have millions of mixed records that are difficult to query or scan. I understand that my main fault was the missing validation.
Now I have to go through all the records and, for each one with the inappropriate type, remove it and re-add the same data in the correct format. Is there a more graceful way to do this?
So it was pretty easy: it is possible to do this with the attribute_type condition method.
First of all, I added the imports:
import boto3
from boto3.dynamodb.conditions import Attr
And my code:
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('TABLE_NAME')  # placeholder table name

attr = Attr('ExtId').attribute_type('S')  # match items where ExtId is a String
response = table.scan(FilterExpression=attr)
items = response['Items']
while 'LastEvaluatedKey' in response:  # keep paginating until the scan is complete
    response = table.scan(FilterExpression=attr, ExclusiveStartKey=response['LastEvaluatedKey'])
    items.extend(response['Items'])
More condition customizations can be found in this article: DynamoDB Customization Reference.
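To actually repair the records the scan finds, one option is to rewrite each matched item with the value converted back to a number (a sketch, assuming the scanned items include the table's full primary key; put_item with the same key overwrites the existing record):
for item in items:
    item['ExtId'] = int(item['ExtId'])  # convert the String back to a Number
    table.put_item(Item=item)  # overwrite the old record in place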
