req = urllib.request.Request(url)
data = urllib.request.urlopen(req).read().decode()
Once I get the AWS URL string I can load it in Python, but how do I extract it from the JSON in the first place?
The .json is actually a web response with the structure below. The AWS URL serves a CSV when you open it.
Is there something in the json library that can help with this?
Structure
You can get the URL like this:
import json
data = json.load(file_object)
url = data.get('export').get('url')
If it is a string use:
json.loads(string)
If it is a JSON file use:
json.load(file_object)
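Putting it together, a minimal sketch, assuming the JSON lives in a local file (the name export.json is just a placeholder) and has the {"export": {"url": ...}} structure described above; parsing the downloaded CSV with the csv module is my own assumption about how you want to "catch" it:
import json
import csv
import io
import urllib.request

# Extract the AWS URL from the JSON file (placeholder file name).
with open('export.json') as f:
    data = json.load(f)
url = data.get('export').get('url')

# Download the CSV that the AWS URL serves.
req = urllib.request.Request(url)
csv_text = urllib.request.urlopen(req).read().decode()

# Parse it into rows, assuming the first line is a header.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)
print(rows[:3])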
This is my code:
from os import rename, write
import requests
import json
url = "https://api.github.com/search/users?q=%7Bquery%7D%7B&page,per_page,sort,order%7D"
data = requests.get(url).json()
print(data)
outfile = open("C:/Users/vladi/Desktop/json files Vlad/file structure first attemp.json", "r")
json_object = json.load(outfile)
with open(data,'w') as endfile:
    endfile.write(json_object)
print(endfile)
I want to make an API request.
I want to take the data from this URL: https://api.github.com/search/users?q=%7Bquery%7D%7B&page,per_page,sort,order%7D,
replace it with my own data, which is in my file called file structure first attemp.json,
and update this URL with my own data.
import requests
url = "https://api.github.com/search/users?q=%7Bquery%7D%7B&page,per_page,sort,order%7D"
data = requests.get(url)
with open("file structure first attemp.json", 'w') as endfile:  # path to the file you want to overwrite
    endfile.write(data.text)
json.loads() returns a Python dictionary, which cannot be written to a file directly. Simply write the string returned from the URL.
response.json() is a built-in method that requests provides to parse the JSON returned from the URL, so you are parsing the JSON twice.
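For completeness, a sketch of the whole round trip under those assumptions (the output file name is just an example): fetch the JSON once, save the raw text, and only call response.json() if you also need it as a dict:
import requests

url = "https://api.github.com/search/users?q=%7Bquery%7D%7B&page,per_page,sort,order%7D"
response = requests.get(url)

# Save the raw JSON text exactly as the API returned it.
with open("github_search.json", "w") as endfile:  # example output path
    endfile.write(response.text)

# Parse it once only if you need a Python dict as well.
data = response.json()
print(list(data.keys()))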
I am trying to use an OCR API with Python to convert a PDF to text. The API I'm using is https://www.convertapi.com/pdf-to-txt . When I upload the file through the website it works perfectly, but the API call has the following issue:
Python code:
import requests
url ='https://v2.convertapi.com/convert/pdf/to/txt?Secret=mykey'
files = {'file': open('C:\<some_url>\filename.pdf', 'rb')}
r = requests.post(url, files=files)
The API call works fine, but when I try to access the response through
r.text
it returns gibberish (notice the FileData section):
'{"ConversionCost":4,"Files":[{"FileName":"stateoftheartKWextraction.txt","FileExt":"txt","FileSize":60179,"FileData":"QXV0b21hdGljIEtleXBocmFzZSBFeHRyYWN0aW9uOiBBIFN1cnZleSBvZiB0aGUgU3RhdGUgb2YgdGhlIEFydA0KDQpLYXppIFNhaWR1bCBIYXNhbiAgYW5kICBWaW5jZW50IE5nDQpIdW1hbiBMYW5ndWFnZSBUZWNobm9sb2d5IFJlc2VhcmNoIEluc3RpdHV0ZSBVbml2ZXJzaXR5IG9mIFRleGFzIGF0IERhbGxhcyBSaWNoYXJkc29uLCBUWCA3NTA4My0wNjg4DQp7c2FpZHVsLHZpbmNlfUBobHQudXRkYWxsYXMuZW...
Even if I use json.load to convert it into a dict, it still prints the text as gibberish.
I've tried to upload the file as non-binary, but that doesn't work (it throws an exception).
I've tried many PDF files and they were all in English.
Thank you.
The text is base64-encoded, so you need to decode it. Let's take the first file as an example.
import base64
r = r.json()
text = r['Files'][0]['FileData']
print(base64.b64decode(text))
By the way, they seem to have a Python library as well, you might want to check that out: https://github.com/ConvertAPI/convertapi-python
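For completeness, a sketch that decodes every returned file and writes it to disk; the Files / FileName / FileData structure is taken from the response shown in the question, and the local PDF path is a placeholder:
import base64
import requests

url = 'https://v2.convertapi.com/convert/pdf/to/txt?Secret=mykey'
files = {'file': open('filename.pdf', 'rb')}  # placeholder path
r = requests.post(url, files=files)

# FileData is base64-encoded text, so decode it before saving.
for f in r.json()['Files']:
    text = base64.b64decode(f['FileData']).decode('utf-8')
    with open(f['FileName'], 'w', encoding='utf-8') as out:
        out.write(text)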
So I am trying to load multiple JSON files with Python HTTP requests, but I can't figure out how to do it correctly.
Loading one JSON file with Python is pretty simple:
response = requests.get(url)
te = response.content.decode()
da = json.loads(te[te.find("{"):te.rfind("}")+1])
But how can I load multiple JSON files?
I have a list of URLs and I tried to request every URL with a loop and then load every line of the result, but it seems this does not work.
This is the code I am using:
t = []
for url in urls:
    re = requests.get(url)
    te = re.content.decode()
    daten = json.loads(te[te.find("{"):te.rfind("}")+1])
    t.append(daten)
But I am getting this error:
JSONDecodeError: Expecting value: line 1 column 1 (char 0).
I am pretty new to JSON, but I do understand that I can't read it line by line with a loop, because that destroys the JSON structure(?).
So how can I read multiple JSON files?
EDIT: Found the error.
Some links do not return valid JSON.
With the requests library, if the endpoint you are requesting returns a well-formed JSON response, all you need to do is call the .json() method on the response object:
t = []
for url in urls:
    re = requests.get(url)
    t.append(re.json())
Then, if you want to handle bad responses, wrap the code above in a try/except block.
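A minimal sketch of that error handling, assuming urls is the list from the question:
import requests

t = []
for url in urls:
    try:
        response = requests.get(url)
        response.raise_for_status()   # raise on HTTP errors (4xx/5xx)
        t.append(response.json())     # raises ValueError if the body is not JSON
    except (requests.RequestException, ValueError) as err:
        print(f"Skipping {url}: {err}")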
Assuming every site returns valid JSON, the problem is that you didn't construct the resulting JSON correctly.
You might write something like
t = []
for url in urls:
    t.append(requests.get(url).content.decode('utf-8'))
result = json.loads('{{"data": [{}]}}'.format(','.join(t)))
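An equivalent and arguably safer variant is to parse each response and let the json module do the serialization instead of assembling the JSON string by hand (same assumption that urls holds the endpoints):
import json
import requests

# Parse each response, then serialize the combined structure in one step.
parts = [requests.get(url).json() for url in urls]
result = {"data": parts}
print(json.dumps(result)[:200])  # peek at the combined JSON text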
I am trying to make a GHE API call and convert the returned data into JSON. I am sure this is fairly simple (my current code writes the data into a .txt file), but I am incredibly new to Python.
I am having a hard time understanding how to use json.dumps.
import requests
import json
GITHUB_ENTERPRISE_TOKEN = 'token xxx'
SEARCH_QUERY = "Evidence+locker+Seed+in:readme"
headers = {
    'Authorization': GITHUB_ENTERPRISE_TOKEN,
}
url = "https://github.ibm.com/api/v3/search/repositories?q=" + SEARCH_QUERY
#Setup url to include GHE api endpoint and the search query
response = requests.get(url, headers=headers)
with open('./evidencelockerevidence.txt', 'w') as file:
    file.write(response.text)
# writes to a .txt file the evidence fetched from GHE
Rather than having the last two lines of code write the data into a .txt file, I would like to save it as a JSON object (a .json file) in the same directory.
json.dumps simply stringifies, i.e. serializes, your JSON object so you can store it as a plain text file. Its counterpart is json.loads.
f = open('a.jsonl', 'wt')
f.write(json.dumps(jobj))
People usually write one JSON object per line, a.k.a. the jsonl format.
json.dump stores your JSON object directly to a file. Its counterpart is json.load.
json.dump(jobj, open('a.json', 'wt'))
A .json file contains a single JSON object, either on one line or spread across multiple lines.
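Applied to the code in the question, a minimal sketch (the .json output name is just an example):
import json
import requests

GITHUB_ENTERPRISE_TOKEN = 'token xxx'  # same placeholder token as in the question
SEARCH_QUERY = "Evidence+locker+Seed+in:readme"
url = "https://github.ibm.com/api/v3/search/repositories?q=" + SEARCH_QUERY
headers = {'Authorization': GITHUB_ENTERPRISE_TOKEN}

response = requests.get(url, headers=headers)

# Parse the response once, then write it out as a .json file.
data = response.json()
with open('./evidencelockerevidence.json', 'w') as file:
    json.dump(data, file, indent=2)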
I'm trying to use Python to go to this page, https://comtrade.un.org/data/, fill in the form, and "click" the download button, then get the CSV file that is generated.
Anyone have some sample code for automating the download in Python?
Thanks.
You might be interested in trying out pywinauto. I have not had much experience with it, but I do believe it could do the job.
Good luck!
The site you are accessing has an exposed API: you can use that form to generate the API URL and simply call it to get a JSON or CSV response. To do this with Python you can use requests, plus the core json module to parse the data if you want to work with it inside Python:
CSV File
import requests
api_url = 'https://comtrade.un.org/api/get?max=500&type=C&freq=A&px=HS&ps=2017&r=all&p=0&rg=all&cc=TOTAL&fmt=csv'
response = requests.get(api_url)
data = response.content
with open('output.csv', 'wb') as output:
    output.write(data)
Note the fmt=csv parameter in the URL.
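If you want the rows in Python rather than just a file on disk, the same CSV response can be parsed in memory; a sketch using the csv module (the column names depend on what the API returns):
import csv
import io
import requests

api_url = 'https://comtrade.un.org/api/get?max=500&type=C&freq=A&px=HS&ps=2017&r=all&p=0&rg=all&cc=TOTAL&fmt=csv'
response = requests.get(api_url)

# Read the CSV text straight from the response body.
reader = csv.DictReader(io.StringIO(response.text))
rows = list(reader)
print(len(rows), "rows")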
Python Dictionary
import requests, json
api_url = 'https://comtrade.un.org/api/get?max=500&type=C&freq=A&px=HS&ps=2017&r=all&p=0&rg=all&cc=TOTAL'
response = requests.get(api_url)
data = json.loads(response.content)
print(data)
Note that the API URL in the example came from submitting the default form and clicking 'View API Call' under the generated table.
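To pull the individual records out of that dict, something like the following should work; the 'dataset' key is an assumption about this endpoint's JSON layout, so inspect data.keys() if it is missing:
import requests

api_url = 'https://comtrade.un.org/api/get?max=500&type=C&freq=A&px=HS&ps=2017&r=all&p=0&rg=all&cc=TOTAL'
data = requests.get(api_url).json()

# 'dataset' is assumed; check data.keys() if this comes back empty.
records = data.get('dataset', [])
print(len(records), "records")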