Python: print only .jpg URLs from JSON

How can I print only the .jpg/.png URLs from an API's JSON response?
`
import requests
import json
r = requests.get("https://random.dog/woof.json")
print("Kod:", r.status_code)
def jprint(obj):
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)
jprint(r.json())
`
Results:
`
{
    "fileSizeBytes": 78208,
    "url": "https://random.dog/24141-29115-27188.jpg"
}
`
I tried .endswith(), but without success.
I'm a beginner.

This example uses str.endswith to check whether the value stored under the key url ends with .jpg or .png.
It makes 10 requests and uses time.sleep to wait 1 second between them:
import requests
from time import sleep
url = "https://random.dog/woof.json"
for i in range(10):
    print("Attempt {}".format(i + 1))
    data = requests.get(url).json()
    # endswith() accepts a tuple, so one call covers both extensions
    if data["url"].endswith((".jpg", ".png")):
        print(data["url"])
    sleep(1)
Prints (for example, data returned from the server is random):
Attempt 1
https://random.dog/aa8e5e24-5c58-4963-9809-10f4aa695cfc.jpg
Attempt 2
https://random.dog/5b2a4e74-58da-4519-a67b-d0eed900b676.jpg
Attempt 3
https://random.dog/56217498-0e6b-4c24-bdd1-cc5dbb2201bb.jpg
Attempt 4
https://random.dog/l6CIQaS.jpg
Attempt 5
Attempt 6
Attempt 7
https://random.dog/3b5eae93-b3bd-4012-b789-64eb6cdaac65.png
Attempt 8
https://random.dog/oq9izk0057hy.jpg
Attempt 9
Attempt 10

Try calling .lower() first: endswith is case-sensitive, so a plain .endswith() check fails whenever the file extension in the data is upper case.
def jprint(obj):
    # Lower-case the URL before comparing so ".JPG" and ".PNG" also match
    url = obj.get('url')
    if url and url.lower().endswith((".jpg", ".png")):
        print(url)
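A slightly more defensive variant of the same idea is sketched below. It looks only at the path part of the URL, so a trailing query string does not hide the extension, and it compares in lower case. The names is_image_url, IMAGE_EXTENSIONS and the sample responses are made up for illustration; they are not part of the API.

```python
from os.path import splitext
from urllib.parse import urlparse

# Assumed set of extensions to accept; adjust as needed
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_image_url(url):
    """Return True if the URL path ends with one of IMAGE_EXTENSIONS."""
    path = urlparse(url).path        # ignores any ?query or #fragment
    extension = splitext(path)[1]    # e.g. ".JPG"
    return extension.lower() in IMAGE_EXTENSIONS


# Hypothetical responses shaped like the JSON shown in the question
samples = [
    {"fileSizeBytes": 78208, "url": "https://random.dog/24141-29115-27188.jpg"},
    {"fileSizeBytes": 90210, "url": "https://random.dog/example.JPG"},
    {"fileSizeBytes": 2896573, "url": "https://random.dog/example.mp4"},
]

image_urls = [item["url"] for item in samples if is_image_url(item.get("url", ""))]
for url in image_urls:
    print(url)
```

Only the first two sample URLs are printed; the .mp4 entry is skipped.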

This code should do it:
import requests
import json
r = requests.get("https://random.dog/woof.json")
print("Kod:", r.status_code)
def jprint(obj):
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)
jprint(r.json())
to_remove = []
for key, item in r.json().items():
    try:
        if not (item.endswith(".png") or item.endswith(".gif") or item.endswith(".jpg") or item.endswith(".mp4")):
            to_remove.append([key, item]) # The key is useless- delete it later
    except AttributeError: # the value is not a string!
        to_remove.append([key, item]) # The key is useless- delete it later
new_r = {}
for key, item in r.json().items():
    if [key, item] not in to_remove:
        new_r[key] = item # item was not in to_remove, so add it to the new filtered dict.
jprint(new_r)
Explanation
After we get the response, we loop through the r.json() dict. If a value is not a string (for example an int) or does not end with a photo/video extension, we add its key and value to a list. Then we create a new dict and copy into it every entry of the original dictionary (r.json()) that is not in that list.
Example Output
Kod: 200
{
    "fileSizeBytes": 2896573,
    "url": "https://random.dog/2fc649b9-f688-4e65-a1a7-cdbc6228e0c0.mp4"
}
{
    "url": "https://random.dog/2fc649b9-f688-4e65-a1a7-cdbc6228e0c0.mp4"
}
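The two-pass filtering described in the explanation can probably be written more compactly as a dict comprehension. This is only a sketch: keep_media_values and MEDIA_EXTENSIONS are illustrative names, the response dict is hard-coded instead of fetched, and the added .lower() call makes the check case-insensitive, which the code above is not.

```python
# Same four extensions the answer checks for
MEDIA_EXTENSIONS = (".png", ".gif", ".jpg", ".mp4")


def keep_media_values(data):
    """Keep only entries whose value is a string ending in a media extension."""
    return {
        key: value
        for key, value in data.items()
        if isinstance(value, str) and value.lower().endswith(MEDIA_EXTENSIONS)
    }


# Hard-coded stand-in for r.json()
response = {
    "fileSizeBytes": 2896573,
    "url": "https://random.dog/2fc649b9-f688-4e65-a1a7-cdbc6228e0c0.mp4",
}

filtered = keep_media_values(response)
print(filtered)
```

The isinstance check replaces the try/except AttributeError, so non-string values such as fileSizeBytes are dropped without raising.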

Related

Changing variable for parameters for http json get requests

Pretty new to Python, so go easy on me :). The code below works, but I was wondering if there is a way to change the indcode parameter in a loop so that I do not have to repeat the requests.get call.
paraD = dict()
paraD["area"] = "123"
paraD["periodtype"] = "2"
paraD["indcode"] = "722"
paraD["$limit"]=1000
#Open URL and get data for business indcode 722
document_1 = requests.get(dataURL, params=paraD)
bizdata_1 = document_1.json()
#Open URL and get data for business indcode 445
paraD["indcode"] = "445"
document_2 = requests.get(dataURL, params=paraD)
bizdata_2 = document_2.json()
#Open URL and get data for business indcode 311
paraD["indcode"] = "311"
document_3 = requests.get(dataURL, params=paraD)
bizdata_3 = document_3.json()
#Combine the three lists
output = bizdata_1 + bizdata_2 + bizdata_3

Since indcode is the only parameter that changes between requests, put its values in a list and make the web requests inside a loop.
import requests
data_url = ""  # set this to the API endpoint (dataURL in the question)
post_params = dict()
post_params["area"] = "123"
post_params["periodtype"] = "2"
post_params["$limit"] = 1000
# The list of indcode values
ind_codes = ["722", "445", "311"]
output = []
# Loop on indcode values
for code in ind_codes:
    # Change indcode parameter value in the loop
    post_params["indcode"] = code
    try:
        response = requests.get(data_url, params=post_params)
        data1 = response.json()
        # extend (not append) keeps one flat list, like bizdata_1 + bizdata_2 + bizdata_3
        output.extend(data1)
    except (requests.RequestException, ValueError):
        # ValueError covers a response body that is not valid JSON
        print("web request failed")
        # More error handling / retry if required
print(output)
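The question combines its three results with +, i.e. into one flat list. A minimal sketch of a loop that keeps that shape is shown below. To make it runnable without network access, the HTTP call is passed in as a function; collect_rows and fake_fetch are hypothetical names, and fake_fetch merely stands in for something like requests.get(data_url, params=params).json().

```python
def collect_rows(fetch, base_params, ind_codes):
    """Call fetch(params) once per indcode and concatenate the returned lists."""
    rows = []
    for code in ind_codes:
        # Build a fresh dict per request so base_params is never modified
        params = {**base_params, "indcode": code}
        rows.extend(fetch(params))
    return rows


def fake_fetch(params):
    """Hypothetical stand-in for the real HTTP request; returns made-up rows."""
    return [{"indcode": params["indcode"], "area": params["area"]}]


base = {"area": "123", "periodtype": "2", "$limit": 1000}
output = collect_rows(fake_fetch, base, ["722", "445", "311"])
print(output)
```

Passing the fetch function in also makes the loop easy to test, and swapping in a real request function does not change collect_rows.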

Assuming you're using Python 3.9+, you can combine dictionaries using the | operator. However, be sure you understand exactly what this does: when both dictionaries contain the same key, the value from the right-hand one wins. It's likely that the code needed to combine your dictionaries will be more complex.
When using the requests module, it is very important to check the HTTP status code of the response returned by the function (HTTP verb) you're calling.
Here's an approach to the stated problem that may work (depending on how the dictionary merge should be carried out).
from sys import stderr
from requests import get as GET
from requests.exceptions import RequestException
# base parameters
params = {'area': '123', 'periodtype': '2', '$limit': 1000}
# the indcodes
indcodes = ('722', '445', '311')
# gets the JSON response (as a Python dictionary)
def getjson(url, params):
    try:
        (r := GET(url, params, timeout=1.0)).raise_for_status()
        return r.json() # all good
    # RequestException is the base class of requests' HTTPError, ConnectionError,
    # Timeout and TooManyRedirects; report to stderr and return an empty dictionary
    except RequestException as e:
        print(e, file=stderr)
        return {}
    # any exception here is not associated with requests. Report and raise
    except Exception as f:
        print(f, file=stderr)
        raise
# an empty dictionary
target = {}
# build the target dictionary
# May not produce desired results depending on how the dictionary merge should be carried out
for indcode in indcodes:
    target |= getjson('https://httpbin.org/json', params | {'indcode' : indcode})
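The caveat about the merge is worth seeing concretely. The small sketch below (Python 3.9+, with invented response dictionaries) shows that | keeps only the right-hand value for duplicate keys, and one possible alternative that stores each response under its indcode instead.

```python
# Two invented responses that share the same keys
first = {"indcode": "722", "rows": [1, 2]}
second = {"indcode": "445", "rows": [3]}

# With |, the right-hand operand wins for every duplicate key,
# so the data from `first` is lost entirely
merged = first | second
print(merged)

# One alternative: key each response by its indcode so nothing is overwritten
by_code = {}
for response in (first, second):
    by_code[response["indcode"]] = response["rows"]
print(by_code)
```

Whether a keyed dictionary like by_code or a flat list is the better target depends on what the real API returns.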

Pytest - Get one item from multiple returned values

I have an e2e_te_data.json file that contains my 2 test points. That means I have data for 2 test cases, and I want to pass it to pytest so that it executes 2 different test cases.
e2e_te_data.json:
[
    {
        "dataSource": "dataSource1",
        "machineName": "MachineName_X"
    },
    {
        "dataSource": "dataSource2",
        "machineName": "MachineName_Y"
    }
]
This is my code:
def read_test_data_from_json():
    JsonFile = open('..\\e2eTestData.json','r')
    h=[]
    convertedJsonStr=[]
    json_input = JsonFile.read()
    parsedJsonStr = json.loads(json_input) # Parse JSON string to Python dict
    for i in range(0, len(parsedJsonStr)):
        convertedJsonStr.append(json.dumps(parsedJsonStr[i]))
        h.append(parsedJsonStr[i]['machineName'])
    return convertedJsonStr,h
#pytest.mark.parametrize("convertedJsonStr,h", (read_test_data_from_json()[0],read_test_data_from_json()[1]))
def test_GetFrequencyOfAllToolUsage(convertedJsonStr,h):
    objAPI=HTTPMethods()
    frequencyOfToolResultFromAPIRequest=objAPI.getFrequencyOfTools(read_test_data_from_json[0])
    print(h)
Value of convertedJsonstr variable
I want to receive a single item of convertedJsonStr and h (as returned by read_test_data_from_json) each time test_GetFrequencyOfAllToolUsage runs. Instead, I see all items of convertedJsonStr and h, as in the image above.

First item
def read_test_data_from_json():
    JsonFile = json.load(open('..\\e2eTestData.json','r'))
    # First item
    return JsonFile[0], JsonFile[0]["machineName"]
Last item (use this return statement instead)
    return JsonFile[-1], JsonFile[-1]["machineName"]
Random item (needs import random; use these two lines instead)
    item = random.choice(JsonFile)
    return item, item["machineName"]
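If the goal is for each test run to receive exactly one test point, another option is to build a list of (JSON string, machine name) pairs and hand that list to pytest.mark.parametrize. The sketch below only builds the pairs, so it runs without pytest; the decorator usage is shown as a comment. SAMPLE and build_cases are illustrative names, and SAMPLE stands in for the contents of the JSON file.

```python
import json

# Hypothetical stand-in for the contents of the JSON test-data file
SAMPLE = """[
    {"dataSource": "dataSource1", "machineName": "MachineName_X"},
    {"dataSource": "dataSource2", "machineName": "MachineName_Y"}
]"""


def build_cases(raw_json):
    """Return one (json_string, machine_name) tuple per test point."""
    return [(json.dumps(item), item["machineName"]) for item in json.loads(raw_json)]


CASES = build_cases(SAMPLE)

# With pytest installed, each tuple in CASES becomes its own test invocation:
#
# @pytest.mark.parametrize("convertedJsonStr, h", CASES)
# def test_GetFrequencyOfAllToolUsage(convertedJsonStr, h):
#     ...

for converted_json_str, machine_name in CASES:
    print(machine_name, converted_json_str)
```

The key point is the shape of the argument: parametrize expects one tuple per test case, not one list per argument name.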

Cannot read json file properly

I want to read a JSON file from https://api.myjson.com/bins/y2k4y and print values based on a parameter such as 'ipv4' or 'ipv6'. I have written the code in Python, but it shows the error below. Please advise me on how to solve this problem.
Python code
import requests
import json
# Method To Get REST Data In JSON Format
def getResponse(url,choice):
    response = requests.get(url)
    if(response.ok):
        jData = json.loads(response.content)
        if(choice=="deviceInfo"):
            print("working")
            deviceInformation(jData)
    else:
        print("NOT working")
        response.raise_for_status()
# Parses JSON Data To Find Switch Connected To H4
def deviceInformation(data):
    global switch
    global deviceMAC
    global hostPorts
    switchDPID = ""
    print ( data)
    for i in data:
        print ("i: ", i['ipv4'])
deviceInfo = "https://api.myjson.com/bins/y2k4y"
getResponse(deviceInfo,"deviceInfo")
Error
  File "E:/aaa/e.py", line 27, in deviceInformation
    print ("i: ", i['ipv4'])
TypeError: string indices must be integers
json file
https://api.myjson.com/bins/y2k4y

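jData is a dict with only one key, devices, whose value holds all the other information you need. Iterating over the dict itself yields its keys (strings), which is why indexing with 'ipv4' raises the TypeError.
Change for i in data: to for i in data['devices']:
for i in data['devices']:
    print ("i: ", i['ipv4'])
    print ("i: ", i['ipv6'])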

Try using for i in data['devices']: instead of for i in data:
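To see why the original loop fails, here is a small offline sketch. The data dictionary is invented to match the shape described above (a single devices key holding a list of dicts); the real response may contain other fields.

```python
# Invented data with the shape described in the answers
data = {
    "devices": [
        {"ipv4": ["10.0.0.1"], "ipv6": ["fe80::1"]},
        {"ipv4": ["10.0.0.2"], "ipv6": ["fe80::2"]},
    ]
}

# Iterating over a dict yields its keys, which are strings here
keys = [item for item in data]
print(keys)

# Indexing a string with another string is what raises the TypeError
try:
    keys[0]["ipv4"]
except TypeError as exc:
    print("TypeError:", exc)

# Iterating over the list stored under "devices" yields the device dicts
addresses = [device["ipv4"] for device in data["devices"]]
print(addresses)
```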

Python json dumps syntax error when appending list of dict

I have two functions that each return a list of dictionaries, and I'm trying to get json to encode them. It works with just the first function, but when I add the second one I get a syntax error, ": expected". I will eventually be combining a total of 7 functions that each output a list of dicts. Is there a better way of accomplishing this?
import dmidecode
import simplejson as json
def get_bios_specs():
    BIOSdict = {}
    BIOSlist = []
    for v in dmidecode.bios().values():
        if type(v) == dict and v['dmi_type'] == 0:
            BIOSdict["Name"] = str((v['data']['Vendor']))
            BIOSdict["Description"] = str((v['data']['Vendor']))
            BIOSdict["BuildNumber"] = str((v['data']['Version']))
            BIOSdict["SoftwareElementID"] = str((v['data']['BIOS Revision']))
            BIOSdict["primaryBIOS"] = "True"
            BIOSlist.append(BIOSdict)
    return BIOSlist
def get_board_specs():
    MOBOdict = {}
    MOBOlist = []
    for v in dmidecode.baseboard().values():
        if type(v) == dict and v['dmi_type'] == 2:
            MOBOdict["Manufacturer"] = str(v['data']['Manufacturer'])
            MOBOdict["Model"] = str(v['data']['Product Name'])
            MOBOlist.append(MOBOdict)
    return MOBOlist
def get_json_dumps():
    jsonOBJ = json
    #Syntax error is here, i can't use comma to continue adding more, nor + to append.
    return jsonOBJ.dumps({'HardwareSpec':{'BIOS': get_bios_specs()},{'Motherboard': get_board_specs()}})

Use multiple items within your nested dictionary.
jsonOBJ.dumps({
    'HardwareSpec': {
        'BIOS': get_bios_specs(),
        'Motherboard': get_board_specs()
    }
})
And if you want multiple BIOS items or Motherboard items, just use a list.
...
'HardwareSpec': {
    'BIOS': [
        get_bios_specs(),
        get_uefi_specs()
    ]
    ...
}

If you want a more convenient lookup of specs, you can just embed a dict:
jsonOBJ.dumps({'HardwareSpec': {'BIOS': get_bios_specs(),
                                'Motherboard': get_board_specs()
                                }
               })
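Since the question mentions that seven such functions will eventually be combined, one way to keep get_json_dumps short is to map section names to the functions and build the nested dict in a loop. This is a sketch under assumptions: the two spec functions are stubs returning invented data (no dmidecode needed), SPEC_SOURCES is an illustrative name, and the standard json module is used in place of simplejson.

```python
import json


def get_bios_specs():
    """Stub with invented data, standing in for the dmidecode-based function."""
    return [{"Name": "ExampleVendor", "BuildNumber": "1.0"}]


def get_board_specs():
    """Stub with invented data, standing in for the dmidecode-based function."""
    return [{"Manufacturer": "ExampleBoardCo", "Model": "X1"}]


# Section name -> function that returns a list of dicts; add more entries here
SPEC_SOURCES = {
    "BIOS": get_bios_specs,
    "Motherboard": get_board_specs,
}


def get_json_dumps():
    """Call every registered function and encode the results as one JSON object."""
    hardware = {name: collect() for name, collect in SPEC_SOURCES.items()}
    return json.dumps({"HardwareSpec": hardware}, indent=2)


print(get_json_dumps())
```

Adding an eighth section then means adding one line to SPEC_SOURCES rather than editing the dumps call.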

Fetching language detection from Google api

I have a CSV with keywords in one column and the number of impressions in a second column.
I'd like to loop over the keywords, pass each one in a URL, and have the Google language API return the language each keyword is in.
I have it working manually. If I enter (with the correct api key):
http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&key=myapikey&q=merde
I get:
{"responseData": {"language":"fr","isReliable":false,"confidence":6.213709E-4}, "responseDetails": null, "responseStatus": 200}
which is correct: 'merde' is French.
So far I have this code, but I keep getting server unreachable errors:
import time
import csv
from operator import itemgetter
import sys
import fileinput
import urllib2
import json
E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2
#not working
def parse_result(result):
    """Parse a JSONP result string and return a list of terms"""
    # Deserialize JSON to Python objects
    result_object = json.loads(result)
    #Get the rows in the table, then get the second column's value
    # for each row
    return row in result_object
#not working
def retrieve_terms(seedterm):
    print(seedterm)
    """Retrieves and parses data and returns a list of terms"""
    url_template = 'http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&key=myapikey&q=%(seed)s'
    url = url_template % {"seed": seedterm}
    try:
        with urllib2.urlopen(url) as data:
            data = perform_request(seedterm)
            result = data.read()
    except:
        sys.stderr.write('%s\n' % 'Could not request data from server')
        exit(E_OPERATION_ERROR)
    #terms = parse_result(result)
    #print terms
    print result
def main(argv):
    filename = argv[1]
    csvfile = open(filename, 'r')
    csvreader = csv.DictReader(csvfile)
    rows = []
    for row in csvreader:
        rows.append(row)
    sortedrows = sorted(rows, key=itemgetter('impressions'), reverse = True)
    keys = sortedrows[0].keys()
    for item in sortedrows:
        retrieve_terms(item['keywords'])
    try:
        outputfile = open('Output_%s.csv' % (filename),'w')
    except IOError:
        print("The file is active in another program - close it first!")
        sys.exit()
    dict_writer = csv.DictWriter(outputfile, keys, lineterminator='\n')
    dict_writer.writer.writerow(keys)
    dict_writer.writerows(sortedrows)
    outputfile.close()
    print("File is Done!! Check your folder")
if __name__ == '__main__':
    start_time = time.clock()
    main(sys.argv)
    print("\n")
    print time.clock() - start_time, "seconds for script time"
Any idea how to finish the code so that it will work? Thank you!

Try adding a referrer and userip, as described in the docs:
An area to pay special attention to relates to correctly identifying yourself in your requests. Applications MUST always include a valid and accurate http referer header in their requests. In addition, we ask, but do not require, that each request contains a valid API Key. By providing a key, your application provides us with a secondary identification mechanism that is useful should we need to contact you in order to correct any problems. Read more about the usefulness of having an API key
Developers are also encouraged to make use of the userip parameter (see below) to supply the IP address of the end-user on whose behalf you are making the API request. Doing so will help distinguish this legitimate server-side traffic from traffic which doesn't come from an end-user.
Here's an example based on the answer to the question "access to google with python":
#!/usr/bin/python
# -*- coding: utf-8 -*-
import json
import urllib, urllib2
from pprint import pprint
api_key, userip = None, None
query = {'q' : 'матрёшка'}
referrer = "https://stackoverflow.com/q/4309599/4279"
if userip:
    query.update(userip=userip)
if api_key:
    query.update(key=api_key)
url = 'http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&%s' %(
    urllib.urlencode(query))
request = urllib2.Request(url, headers=dict(Referer=referrer))
json_data = json.load(urllib2.urlopen(request))
pprint(json_data['responseData'])
Output
{u'confidence': 0.070496580000000003, u'isReliable': False, u'language': u'ru'}
Another issue might be that seedterm is not properly quoted:
if isinstance(seedterm, unicode):
    value = seedterm
else: # bytes
    value = seedterm.decode(put_encoding_here)
url = 'http://...q=%s' % urllib.quote_plus(value.encode('utf-8'))
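The snippets above are Python 2. In Python 3 the quoting step can be sketched with urllib.parse, where urlencode percent-encodes non-ASCII text as UTF-8 by default. The helper name build_detect_url is made up for this illustration, and the code only builds and inspects the URL; it does not send a request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint taken from the question
BASE_URL = "http://ajax.googleapis.com/ajax/services/language/detect"


def build_detect_url(term, api_key=None, userip=None):
    """Return the detect URL with every query value safely percent-encoded."""
    query = {"v": "1.0", "q": term}
    if api_key:
        query["key"] = api_key
    if userip:
        query["userip"] = userip
    return BASE_URL + "?" + urlencode(query)


url = build_detect_url("матрёшка")
print(url)

# Decoding the query string again recovers the original term
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"])
```

Building the query from a dict also avoids hand-written % formatting, which is where unquoted keywords containing spaces or non-ASCII characters tend to break the URL.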
