Python-JSON - How to parse API output? - python

I'm pretty new.
I wrote this python script to make an API call from blockr.io to check the balance of multiple bitcoin addresses.
The contents of btcaddy.txt are bitcoin addresses seperated by commas. For this example, let it parse this.
import urllib2
import json
btcaddy = open("btcaddy.txt","r")
urlRequest = urllib2.Request("http://btc.blockr.io/api/v1/address/info/" + btcaddy.read())
data = urllib2.urlopen(urlRequest).read()
json_data = json.loads(data)
balance = float(json_data['data''address'])
print balance
raw_input()
However, it gives me an error. What am I doing wrong? For now, how do I get it to print the balance of the addresses?

You've done multiple things wrong in your code. Here's my fix. I recommend a for loop.
import json
import urllib
addresses = open("btcaddy.txt", "r").read()
base_url = "http://btc.blockr.io/api/v1/address/info/"
request = urllib.urlopen(base_url+addresses)
result = json.loads(request.read())['data']
for balance in result:
print balance['address'], ":" , balance['balance'], "BTC"
You don't need an input at the end, too.

Your question is clear, but your tries not.
You said, you have a file, with at least, more than registry. So you need to retrieve the lines of this file.
with open("btcaddy.txt","r") as a:
addresses = a.readlines()
Now you could iterate over registries and make a request to this uri. The urllib module is enough for this task.
import json
import urllib
base_url = "http://btc.blockr.io/api/v1/address/info/%s"
for address in addresses:
request = urllib.request.urlopen(base_url % address)
result = json.loads(request.read().decode('utf8'))
print(result)
HTTP sends bytes as response, so you should to us decode('utf8') as approach to handle with data.

Related

How to print selected text from JSON file using Python

I'm new to python and have undertaken my first project to automate something for my role (I'm in the network space, so forgive me if this is terrible!).
I'm required to to download a .json file from the below link:
https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519
My script goes through and retrieves the manual download link.
The reason I'm getting the URL in this way, is that the download link changes every fortnight when MS update the file.
My preference is to extract the "addressPrefixes" contents from the names of "AzureCloud.australiacentral", "AzureCloud.australiacentral2", "AzureCloud.australiaeast", "AzureCloud.australiasoutheast".
I'm then wanting to strip out characters of " & ','.
Each of the subnet ranges should then reside on a new line and be placed in a text file.
If I perform the below, I'm able to get the output that I am wanting.
Am I correct in thinking that I can use a for loop to achieve this? If so, would it be better to use a Python dictionary as opposed to using JSON formatted output?
# Script to check Azure IPs
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Import Modules for script
import requests
import re
import json
import urllib.request
search = 'https://download.*?\.json'
ms_dl_centre = "https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519"
requests_get = requests.get(ms_dl_centre)
json_url_search = re.search(search, requests_get.text)
json_file = json_url_search.group(0)
with urllib.request.urlopen(json_file) as url:
contents = json.loads(url.read().decode())
print(json.dumps(contents['values'][1]['properties']['addressPrefixes'], indent = 0)) #use this to print contents from json entry 1
I'm not convinced that using re to parse HTML is a good idea. BeautifulSoup is more suited to the task. Upon inspection of the HTML response I note that there's a span element of class file-link-view1 that seems to uniquely identify the URL to the JSON download. Assuming that to be a robust approach (i.e. Microsoft don't change the way the download URL is presented) then this is how I'd do it:-
import requests
from bs4 import BeautifulSoup
namelist = ["AzureCloud.australiacentral", "AzureCloud.australiacentral2",
"AzureCloud.australiaeast", "AzureCloud.australiasoutheast"]
baseurl = 'https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519'
with requests.Session() as session:
response = session.get(baseurl)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
downloadurl = soup.find('span', class_='file-link-view1').find('a')['href']
response = session.get(downloadurl)
response.raise_for_status()
json = response.json()
for n in json['values']:
if n['name'] in namelist:
print(n['name'])
for ap in n['properties']['addressPrefixes']:
print(ap)
#andyknight, thanks for your direction. I'd up vote you but as I'm a noob, it doesn't permit from doing so.
I've taken the basis of your python script and added in some additional components.
I removed the print statement for the region name in the .txt file, as this is file is referenced by a firewall, which is looking for IP addresses.
I've added in Try/Except/Else for portion of the script, to identify if there is ever an error with reaching the URL, or other unspecified error. I've leveraged logging to send an email based on the status of the script. If an exception is thrown I get an email with traceback information, otherwise I receive an email advising the script was successful.
This writes out the specific prefixes for AU regions into a .txt file.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import requests
import logging
import logging.handlers
from bs4 import BeautifulSoup
smtp_handler = logging.handlers.SMTPHandler(mailhost=("sanitised.smtp[.]xyz", 25),
fromaddr="UpdateIPs#sanitised[.]xyz",
toaddrs="FriendlyAdmin#sanitised[.]xyz",
subject=u"Check Azure IP Script completion status.")
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()
logger.addHandler(smtp_handler)
namelist = ["AzureCloud.australiacentral", "AzureCloud.australiacentral2",
"AzureCloud.australiaeast", "AzureCloud.australiasoutheast"]
baseurl = 'https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519'
with requests.Session() as session:
response = session.get(baseurl)
try:
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
downloadurl = soup.find('span', class_='file-link-view1').find('a')['href']
response = session.get(downloadurl)
response.raise_for_status()
json = response.json()
for n in json['values']:
if n['name'] in namelist:
for ap in n['properties']['addressPrefixes']:
with open('Check_Azure_IPs.txt', 'a') as file:
file.write(ap + "\n")
except requests.exceptions.HTTPError as e:
logger.exception(
"URL is no longer valid, please check the URL that's defined in this script with MS, as this may have changed.\n\n")
except Exception as e:
logger.exception("Unknown error has occured, please review script")
else:
logger.info("Script has run successfully! Azure IPs have been updated.")
Please let me know if you think there is a better way to handle this, otherwise this is marked as answered. I appreciate your help greatly!

Parsing Json in python 3, get email from API

I'm trying to do a little code that gets the emails (and other things in the future) from an API. But I'm getting "TypeError: list indices must be integers or slices, not str" and I don't know what to do about it. I've been looking at other questions here but I still don't get it. I might be a bit slow when it comes to this.
I've also been watching some tutorials on the tube, and done the same as them, but still getting different errors. I run Python 3.5.
Here is my code:
from urllib.request import urlopen
import json, re
# Opens the url for the API
url = 'https://jsonplaceholder.typicode.com/posts/1/comments'
r = urlopen(url)
# This should put the response from API in a Dict
result= r.read().decode('utf-8')
data = json.loads(result)
#This shuld get all the names from the the Dict
for name in data['name']: #TypeError here.
print(name)
I know that I could regex the text and get the result that I want.
Code for that:
from urllib.request import urlopen
import re
url = 'https://jsonplaceholder.typicode.com/posts/1/comments'
r = urlopen(url)
result = r.read().decode('utf-8')
f = re.findall('"email": "(\w+\S\w+)', result)
print(f)
But that seems like the wrong way to do this.
Can someone please help me understand what I'm doing wrong here?
data is a list of dicts, that's why you are getting TypeError while iterating on it.
The way to go is something like this:
for item in data: # item is {"name": "foo", "email": "foo#mail..."}
print(item['name'])
print(item['email'])
#PiAreSquared's comment is correct, just a bit more explanation here:
from urllib.request import urlopen
import json, re
# Opens the url for the API
url = 'https://jsonplaceholder.typicode.com/posts/1/comments'
r = urlopen(url)
# This should put the response from API in a Dict
result= r.read().decode('utf-8')
data = json.loads(result)
# your data is a list of elements
# and each element is a dict object, so you can loop over the data
# to get the dict element, and then access the keys and values as you wish
# see below for some example
for element in data: #TypeError here.
name = element['name']
email = element['email']
# if you want to get all names, you should do
names = [element['name'] for element in data]
# same to get all emails
emails = [email['email'] for email in data]

Unable to extract the table from API using python

I am trying to extract a table using an API but I am unable to do so. I am pretty sure that I am not using it correctly, and any help would be appreciated.
Actually I am trying to extract a table from this API but unable to figure out the right way on how to do it. This is what is mentioned in the website. I want to extract Latest_full_data table.
This is my code to get the table but I am getting error:
import urllib
import requests
import urllib.request
locu_api = 'api_Key'
def locu_search(query):
api_key = locu_api
url = 'https://www.quandl.com/api/v3/databases/WIKI/metadata?api_key=' + api_key
response = urllib.request.urlopen(url).read()
json_obj = str(response, 'utf-8')
datanew = json.loads(json_obj)
return datanew
When I do print(datanew). Update: Even if I change it to return data new, error is still the same.
I am getting this below error:
name 'datanew' is not defined
I had the same issues with urrlib before. If possible, try to use requests it's a better designed and working library in my opinion. Also, it is capable of reading JSON with a single function so no need to run it through multiple lines Sample code here:
import requests
locu_api = 'api_Key'
def locu_search():
url = 'https://www.quandl.com/api/v3/databases/WIKI/metadata?api_key=' + api_key
return requests.get(url).json()
locu_search()
Edit:
The endpoint that you are calling might not be the correct one. I think you are looking for the following one:
import requests
api_key = 'your_api_key_here'
def locu_search(dataset_code):
url = f'https://www.quandl.com/api/v3/datasets/WIKI/{dataset_code}/metadata.json?api_key={api_key}'
req = requests.get(url)
return req.json()
data = locu_search("FB")
This will return with all the metadata regarding a company. In this case Facebook.
Maybe it doesn't apply to your specific problem, but what I normally do is the following:
import requests
def get_values(url):
response = requests.get(url).text
values = json.loads(response)
return values

Can't extract JSON from an http request

I'm having problems getting data from an HTTP response. The format unfortunately comes back with '\n' attached to all the key/value pairs. JSON says it must be a str and not "bytes".
I have tried a number of fixes so my list of includes might look weird/redundant. Any suggestions would be appreciated.
#!/usr/bin/env python3
import urllib.request
from urllib.request import urlopen
import json
import requests
url = "http://finance.google.com/finance/info?client=ig&q=NASDAQ,AAPL"
response = urlopen(url)
content = response.read()
print(content)
data = json.loads(content)
info = data[0]
print(info)
#got this far - planning to extract "id:" "22144"
When it comes to making requests in Python, I personally like to use the requests library. I find it easier to use.
import json
import requests
r = requests.get('http://finance.google.com/finance/info?client=ig&q=NASDAQ,AAPL')
json_obj = json.loads(r.text[4:])
print(json_obj[0].get('id'))
The above solution prints: 22144
The response data had a couple unnecessary characters at the head, which is why I am only loading the relevant (json) portion of the response: r.text[4:]. This is the reason why you couldn't load it as json initially.
Bytes object has method decode() which converts bytes to string. Checking the response in the browser, seems there are some extra characters at the beginning of the string that needs to be removed (a line feed character, followed by two slashes: '\n//'). To skip the first three characters from the string returned by the decode() method we add [3:] after the method call.
data = json.loads(content.decode()[3:])
print(data[0]['id'])
The output is exactly what you expect:
22144
JSON says it must be a str and not "bytes".
Your content is "bytes", and you can do this as below.
data = json.loads(content.decode())

How do I fix a "JSONDecodeError: No JSON object could be decoded: line 1 column 0 (char 0)"?

I'm trying to get Twitter API search results for a given hashtag using Python, but I'm having trouble with this "No JSON object could be decoded" error. I had to add the extra % towards the end of the URL to prevent a string formatting error. Could this JSON error be related to the extra %, or is it caused by something else? Any suggestions would be much appreciated.
A snippet:
import simplejson
import urllib2
def search_twitter(quoted_search_term):
url = "http://search.twitter.com/search.json?callback=twitterSearch&q=%%23%s" % quoted_search_term
f = urllib2.urlopen(url)
json = simplejson.load(f)
return json
There were a couple problems with your initial code. First you never read in the content from twitter, just opened the url. Second in the url you set a callback (twitterSearch). What a call back does is wrap the returned json in a function call so in this case it would have been twitterSearch(). This is useful if you want a special function to handle the returned results.
import simplejson
import urllib2
def search_twitter(quoted_search_term):
url = "http://search.twitter.com/search.json?&q=%%23%s" % quoted_search_term
f = urllib2.urlopen(url)
content = f.read()
json = simplejson.loads(content)
return json

Categories

Resources