Related
I am using the http://api.carmd.com/v3.0/decode?vin= api. Whenever I hard-code the vin as per example below:
response = requests.get("http://api.carmd.com/v3.0/decode?vin=WDC0J4KB9JF321921", headers=headers)
print(response.json())
it works fine:
{'message': {'code': 0, 'message': 'ok', 'credentials': 'valid', 'version': 'v3.0.0', 'endpoint': 'decode', 'counter': 9}, 'data': {'year': 2018, 'make': 'MERCEDES-BENZ', 'model': 'GLC COUPE', 'manufacturer': 'CHRYSLER', 'engine': 'L4, 2.0L; DOHC; 16V; DI; Turbo', 'trim': '300 4MATIC', 'transmission': 'AUTOMATIC'}}
But whenever I use the following example:
text = pytesseract.image_to_string(img)
text = remove(text)
text = str(text)
response = requests.get("http://api.carmd.com/v3.0/decode?vin={vin}".format(vin=text), headers=headers)
The following gets returned:
{'message': {'code': 1001, 'message': 'Invalid vin format', 'credentials': 'valid', 'version': 'v3.0.0', 'endpoint': 'decode', 'counter': 10}, 'data': ''}
Any ideas on why this is happening?
Update 1:
"text" is being generator through OCR. It is the text being read from the image. It is being read correctly as it is generating the same VIN number as the hard coded one,
I want to be able to GET information from API 1 and match it with API 2 and be able to update API 2's information with API 1. I am trying to figure out the most efficient/automated way to accomplish this as it also needs to be updated at a interval of every 10 minutes
I can query and get the results from API 1 this is my code and what my code looks like.
import json
import requests
myToken = '52c32f6588004cb3ab33b0ff320b8e4f'
myUrl = 'https://api1.com/api/v1/devices.json'
head = {'Authorization': 'Token {}'.format(myToken)}
response = requests.get(myUrl, headers=head)
r = json.loads(response.content)
r
The payload looks like this from API 1
{ "device" : {
"id": 153,
"battery_status" : 61,
"serial_no": "5QBYGKUI05",
"location_lat": "-45.948917",
"location_lng": "29.832179",
"location_address": "800 Laurel Rd, Lansdale, PA 192522,USA"}
}
I want to be able to take this information and match by "serial_no" and update all the other pieces of information for the corresponding device in API 2
I query the data for API 2 and this is what my code looks like
params = {
"location":'cf6707e3-f0ae-4040-a184-737b21a4bbd1',
"dateAdded":'ge:11/23/2020'}
url = requests.get('https://api2.com/api/assets',auth=('api2', '123456'), params=params)
r = json.loads(url.content)
r['items']
The JSON payload looks like this
[{'id': '064ca857-3783-460e-a7a2-245e054dcbe3',
'name': 'Apple Laptop 1',
'model': {'id': '50f5993e-2abf-49c8-86e0-8743dd58db6f',
'name': 'MacBook Pro'},
'manufacturer': {'id': 'f56244e2-76e3-46da-97dd-f72f92ca0779',
'name': 'APPLE'},
'room': {'id': '700ff2dc-0118-46c6-936a-01f0fa88c620',
'name': 'Storage Room 1',
'thirdPartyId': ''},
'location': {'id': 'cf6707e3-f0ae-4040-a184-737b21a4bbd1',
'name': 'Iron Mountain',
'thirdPartyId': ''},
'position': 'NonMounted',
'containerAsset': {'id': '00000000-0000-0000-0000-000000000000',
'name': None},
'baseAsset': {'id': '064ca857-3783-460e-a7a2-245e054dcbe3',
'name': 'Apple Laptop 1'},
'description': None,
'status': {'id': 'df9906d8-2856-45e3-9cba-bd7a1ac4971f',
'name': 'Production'},
'serialNumber': '5QBYGKUI06',
'tagNumber': None,
'alternateTagNumber': None,
'verificationStatus': {'id': 'cb3560a9-eef5-47b9-b033-394d3a09db18',
'name': 'Verified'},
'requiresRFID': False,
'requiresHangTag': False,
'bottomPosition': 0.0,
'leftPosition': 0.0,
'rackPosition': 'Front',
'labelX': None,
'labelY': None,
'verifyNameInRear': False,
'verifySerialNumberInRear': False,
'verifyBarcodeInRear': False,
'isNonDataCenter': False,
'rotate': False,
'customer': {'id': '00000000-0000-0000-0000-000000000000', 'name': None},
'thirdPartyId': '',
'temperature': None,
'dateLastScanned': None,
'placement': 'Floor',
'lastScannedLabelX': None,
'lastScannedLabelY': None,
'userDefinedValues': [{'userDefinedKeyId': '79e77a1e-4030-4308-a8ff-9caf40c04fbd',
'userDefinedKeyName': 'Longitude ',
'value': '-75.208917'},
{'userDefinedKeyId': '72c8056e-9b7d-40ac-9270-9f5929097e82',
'userDefinedKeyName': 'Address',
'value': '800 Laurel Rd, New York ,NY 19050, USA'},
{'userDefinedKeyId': '31aeeb91-daef-4364-8dd6-b0e3436d6a51',
'userDefinedKeyName': 'Battery Level',
'value': '67'},
{'userDefinedKeyId': '22b7ce4f-7d3d-4282-9ecb-e8ec2238acf2',
'userDefinedKeyName': 'Latitude',
'value': '35.932179'}]}
The documentation provided by API 2 tells me they only support PUT for updates as of right now but I would also want to know how I would do this using PATCH as it will be available in the future. So the data payload that I need to successful PUT is this
payload = {'id': '064ca857-3783-460e-a7a2-245e054dcbe3',
'name': 'Apple Laptop 1',
'model': {'id': '50f5993e-2abf-49c8-86e0-8743dd58db6f',
'name': 'MacBook Pro'},
'manufacturer': {'id': 'f56244e2-76e3-46da-97dd-f72f92ca0779',
'name': 'APPLE'},
'room': {'id': '700ff2dc-0118-46c6-936a-01f0fa88c620',
'name': 'Storage Room 1',
'thirdPartyId': ''},
'status': {'id': 'df9906d8-2856-45e3-9cba-bd7a1ac4971f',
'name': 'Production'},
'serialNumber': '5QBYGKUI06',
'verificationStatus': {'id': 'cb3560a9-eef5-47b9-b033-394d3a09db18',
'name': 'Verified'},
'requiresRFID': 'False',
'requiresHangTag': 'False',
'userDefinedValues': [{'userDefinedKeyId': '79e77a1e-4030-4308-a8ff-9caf40c04fbd',
'userDefinedKeyName': 'Longitude ',
'value': '-75.248920'},
{'userDefinedKeyId': '72c8056e-9b7d-40ac-9270-9f5929097e82',
'userDefinedKeyName': 'Address',
'value': '801 Laurel Rd, New York, Ny 192250, USA'},
{'userDefinedKeyId': '31aeeb91-daef-4364-8dd6-b0e3436d6a51',
'userDefinedKeyName': 'Battery Level',
'value': '67'},
{'userDefinedKeyId': '22b7ce4f-7d3d-4282-9ecb-e8ec2238acf2',
'userDefinedKeyName': 'Latitude',
'value': '29.782177'}]}
So apart of this is figuring out how I can query the json data portions that I need for the update
I am able to update the information using this line
requests.put('https://api2.com/api/assets/064ca857-3783-460e-a7a2-245e054dcbe3',auth=('API2', '123456'), data=json.dumps(payload))
but I need for it to dynamically update so I don't think the hard coded id parameter in the line will be efficient in a automation/efficiency standpoint. If anybody has any ideas, resources to point me in the right direction to know more about this process (I don't really know what it is even called) would be greatly appreciated.
Not entirely sure what you are trying to do here, but if you want to pull information nested in the responses you can do this.
Serial number from API 1
r['device']['serial_no']
Serial number for API 2
either r[0]['serialNumber'] or r['items'][0]['serialNumber'] depending on what you are showing
To modify the payload serial number, for example
payload['serialNumber'] = '123456abcdef'
This is my first time dealing with json data. So I'm not that familiar with the structure of json.
I got some data through "we the people" e-petition sites with following code:
url = "https://api.whitehouse.gov/v1/petitions.json?limit=3&offset=0&createdBefore=1573862400"
jdata_2 = requests.get(url).json()
Yet, I realize this is something different from... the ordinary json structure since I got some error while I tried to convert it into excel file with pandas
df = pandas.read_json(jdata_2)
Obviously, I must miss something which I must have done before using pandas.read_json() code.
I have searched for the answer but most of questions are "How can I convert json data into excel data", which needs json data. For my case, I scraped it from the url, so I thought I could make that strings into json data, and then try to convert it into excel data as well. So I tried to use json.dump() as well, but it didn't work as well.
I know it must be the naive question. But I'm not sure where I can start with this naive question. If anyone can instruct me how to deal with it, I would really appreciate it. Or link me some references that I can study as well.
Thank you for your help in advance.
This is the json data with the requests, and I pprint it with indent=4.
Input:
url = "https://api.whitehouse.gov/v1/petitions.json?limit=3&offset=0&createdBefore=1573862400"
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(jdata_2)
Output :
{ 'metadata': { 'requestInfo': { 'apiVersion': 1,
'query': { 'body': None,
'createdAfter': None,
'createdAt': None,
'createdBefore': '1573862400',
'isPublic': 1,
'isSignable': None,
'limit': '3',
'mock': 0,
'offset': '0',
'petitionsDefaultLimit': '1000',
'publicThreshold': 149,
'responseId': None,
'signatureCount': None,
'signatureCountCeiling': None,
'signatureCountFloor': 0,
'signatureThreshold': None,
'signatureThresholdCeiling': None,
'signatureThresholdFloor': None,
'sortBy': 'DATE_REACHED_PUBLIC',
'sortOrder': 'ASC',
'status': None,
'title': None,
'url': None,
'websiteUrl': 'https://petitions.whitehouse.gov'},
'resource': 'petitions'},
'responseInfo': { 'developerMessage': 'OK',
'errorCode': '',
'moreInfo': '',
'status': 200,
'userMessage': ''},
'resultset': {'count': 1852, 'limit': 3, 'offset': 0}},
'results': [ { 'body': 'Please save kurdish people in syria \r\n'
'pleaee save north syria',
'created': 1570630389,
'deadline': 1573225989,
'id': '2798897',
'isPublic': True,
'isSignable': False,
'issues': [ { 'id': 326,
'name': 'Homeland Security & '
'Defense'}],
'petition_type': [ { 'id': 291,
'name': 'Call on Congress to '
'act on an issue'}],
'reachedPublic': 0,
'response': [],
'signatureCount': 149,
'signatureThreshold': 100000,
'signaturesNeeded': 99851,
'status': 'closed',
'title': 'Please save rojava north syria\r\n'
'please save kurdish people\r\n'
'please stop erdogan\r\n'
'plaease please',
'type': 'petition',
'url': 'https://petitions.whitehouse.gov/petition/please-save-rojava-north-syria-please-save-kurdish-people-please-stop-erdogan-plaease-please'},
{ 'body': 'Kane Friess was a 2 year old boy who was '
"murdered by his mom's boyfriend, Gyasi "
'Campbell. Even with expert statements from '
'forensic anthropologists, stating his injuries '
'wete the result of homicide. Mr. Campbell was '
'found guilty of involuntary manslaughter. This '
"is an outrage to Kane's Family and our "
'community.',
'created': 1566053365,
'deadline': 1568645365,
'id': '2782248',
'isPublic': True,
'isSignable': False,
'issues': [ { 'id': 321,
'name': 'Criminal Justice Reform'}],
'petition_type': [ { 'id': 281,
'name': 'Change an existing '
'Administration '
'policy'}],
'reachedPublic': 0,
'response': [],
'signatureCount': 149,
'signatureThreshold': 100000,
'signaturesNeeded': 99851,
'status': 'closed',
'title': "Kane's Law. Upon which the murder of a child, "
'regardless of circumstances, be seen as 1st '
'degree murder. A Federal Law.',
'type': 'petition',
'url': 'https://petitions.whitehouse.gov/petition/kanes-law-upon-which-murder-child-regardless-circumstances-be-seen-1st-degree-murder-federal-law'},
{ 'body': "Schumer and Pelosi's hatred and refusing to "
'work with President Donald J. Trump is holding '
'America hostage. We the people know securing '
'our southern border is a priority which will '
'not happen with these two in office. Lets '
'build the wall NOW!',
'created': 1547050064,
'deadline': 1549642064,
'id': '2722358',
'isPublic': True,
'isSignable': False,
'issues': [ {'id': 306, 'name': 'Budget & Taxes'},
{ 'id': 326,
'name': 'Homeland Security & '
'Defense'},
{'id': 29, 'name': 'Immigration'}],
'petition_type': [ { 'id': 291,
'name': 'Call on Congress to '
'act on an issue'}],
'reachedPublic': 0,
'response': [],
'signatureCount': 149,
'signatureThreshold': 100000,
'signaturesNeeded': 99851,
'status': 'closed',
'title': 'Remove Chuck Schumer and Nancy Pelosi from '
'office',
'type': 'petition',
'url': 'https://petitions.whitehouse.gov/petition/remove-chuck-schumer-and-nancy-pelosi-office'}]}
And this is the Error message I got
Input :
df = pandas.read_json(jdata_2)
Output :
ValueError: Invalid file path or buffer object type: <class 'dict'>
You can try the below code as well, it is working fine
URL = "https://api.whitehouse.gov/v1/petitions.json?limit=3&offset=0&createdBefore=1573862400"
// fetching the json response from the URL
req = requests.get(URL)
text_data= req.text
json_dict= json.loads(text_data)
//converting json dictionary to python dataframe for results object
df = pd.DataFrame.from_dict(json_dict["results"])
Finally, saving the dataframe to excel format i.e xlsx
df.to_excel("output.xlsx")
I currently have this code in Python using feedparser:
import feedparser
RSS_FEEDS = {'cnn': 'http://rss.cnn.com/rss/edition.rss'}
def get_news_test(publication="cnn"):
feed = feedparser.parse(RSS_FEEDS[publication])
articles_cnn = feed['entries']
for article in articles_cnn:
print(article)
get_news_test()
The above code returns all the current articles. Here is a sample of one of the articles it returned:
{'title': "China's internet shutdowns tactics are spreading worldwide", 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'http://rss.cnn.com/rss/edition.rss', 'value': "China's internet shutdowns tactics are spreading worldwide"}, 'summary': 'When Hong Kong police fired tear gas at peaceful pro-democracy protesters in 2014, the news moved swiftly through social media. Photos and videos of mostly student demonstrators being gassed helped fuel the outrage that ultimately drove hundreds of thousands of people into the streets.', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'http://rss.cnn.com/rss/edition.rss', 'value': 'When Hong Kong police fired tear gas at peaceful pro-democracy protesters in 2014, the news moved swiftly through social media. Photos and videos of mostly student demonstrators being gassed helped fuel the outrage that ultimately drove hundreds of thousands of people into the streets.'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.cnn.com/2019/01/17/africa/internet-shutdown-zimbabwe-censorship-intl/index.html'}], 'link': 'https://www.cnn.com/2019/01/17/africa/internet-shutdown-zimbabwe-censorship-intl/index.html', 'id': 'https://www.cnn.com/2019/01/17/africa/internet-shutdown-zimbabwe-censorship-intl/index.html', 'guidislink': False, 'published': 'Fri, 18 Jan 2019 07:40:48 GMT', 'published_parsed': time.struct_time(tm_year=2019, tm_mon=1, tm_mday=18, tm_hour=7, tm_min=40, tm_sec=48, tm_wday=4, tm_yday=18, tm_isdst=0), 'media_content': [{'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-super-169.jpg', 'height': '619', 'width': '1100'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-large-11.jpg', 'height': '300', 'width': '300'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-vertical-large-gallery.jpg', 'height': '552', 'width': '414'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-video-synd-2.jpg', 'height': '480', 'width': '640'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-live-video.jpg', 'height': '324', 'width': '576'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-t1-main.jpg', 'height': '250', 'width': '250'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-vertical-gallery.jpg', 'height': '360', 'width': '270'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-story-body.jpg', 'height': '169', 'width': '300'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-t1-main.jpg', 'height': '250', 'width': '250'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-assign.jpg', 'height': '186', 'width': '248'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-hp-video.jpg', 'height': '144', 'width': '256'}]}
Now I know I can return some portions of this for instance the title by calling:
print(article.title)
But, I am stumped as to how to get the image data from the feed.
Each article entry has a list of assets in media_content. Each asset node contains the media type (I only saw 'image'), size, url, etc.
To simply list the media type and url for each asset, you can use the following:
import feedparser
feed = feedparser.parse("http://rss.cnn.com/rss/edition.rss")
for article in feed["entries"]:
for media in article.media_content:
print(f"medium: {media['medium']}")
print(f" url: {media['url']}")
Output:
medium: image
url: https://cdn.cnn.com/cnnnext/dam/assets/190107112254-01-game-of-thrones-spain-castle-of-zafra-t1-main.jpg
medium: image
url: https://cdn.cnn.com/cnnnext/dam/assets/190107112254-01-game-of-thrones-spain-castle-of-zafra-assign.jpg
medium: image
url: https://cdn.cnn.com/cnnnext/dam/assets/190107112254-01-game-of-thrones-spain-castle-of-zafra-hp-video.jpg
...
If you want to request and save assets of type 'image', you can use requests:
import feedparser
import os
import requests
feed = feedparser.parse("http://rss.cnn.com/rss/edition.rss")
for article in feed["entries"]:
for media in article.media_content:
if media["medium"] == "image":
img_data = requests.get(media["url"]).content
with open(os.path.basename(media["url"]), "wb") as handler:
handler.write(img_data)
I am trying to ge around with APIs in general. To test this I coded this little snippet of code to get a list of all the channels on the Swedish national public service radio, and I want to print the ID and NAME of the channels:
import requests as rq
import json
from pprint import pprint
resp = rq.get('http://api.sr.se/api/v2/channels?
format=json&indent=TRUE')
respjson = json.loads(resp.text)
pprint (respjson['id'])
And I get the error
File "sr-api.py", line 9, in <module>
pprint (respjson['id']['name'])
KeyError: 'id'
The (abbreviated) 'respjson' looks like this
{'channels': [{'channeltype': 'Rikskanal',
'color': '31a1bd',
'id': 132,
'image': 'http://static-cdn.sr.se/sida/images/132/2186745_512_512.jpg?preset=api-default-square',
'imagetemplate': 'http://static-cdn.sr.se/sida/images/132/2186745_512_512.jpg',
'liveaudio': {'id': 132,
'statkey': '/app/direkt/p1[k(132)]',
'url': 'http://sverigesradio.se/topsy/direkt/srapi/132.mp3'},
'name': 'P1',
'scheduleurl': 'http://api.sr.se/v2/scheduledepisodes?channelid=132',
'siteurl': 'http://sverigesradio.se/p1',
'xmltvid': 'p1.sr.se'},
{'channeltype': 'Lokal kanal',
'color': 'c31eaa',
'id': 200,
'image': 'http://static-cdn.sr.se/sida/images/200/2186775_512_512.jpg?preset=api-default-square',
'imagetemplate': 'http://static-cdn.sr.se/sida/images/200/2186775_512_512.jpg',
'liveaudio': {'id': 200,
'statkey': '/app/direkt/p4 jämtland[k(200)]',
'url': 'http://sverigesradio.se/topsy/direkt/srapi/200.mp3'},
'name': 'P4 Jämtland',
'scheduleurl': 'http://api.sr.se/v2/scheduledepisodes?channelid=200',
'siteurl': 'http://sverigesradio.se/jamtland/',
'xmltvid': 'p4jmtl.sr.se'}],
'copyright': 'Copyright Sveriges Radio 2017. All rights reserved.',
'pagination': {'nextpage': 'http://api.sr.se/v2/channelsformat=json&indent=true&page=2',
'page': 1,
'size': 10,
'totalhits': 55,
'totalpages': 6}}
Channels is a list. You have to iterate on it to get all channels and print their ids.
# starting from respjson
respjson = {
'channels': [
{
'channeltype': 'Rikskanal',
'color': '31a1bd',
'id': 132,
'image': 'http://static-cdn.sr.se/sida/images/132/2186745_512_512.jpg?preset=api-default-square',
'imagetemplate': 'http://static-cdn.sr.se/sida/images/132/2186745_512_512.jpg',
'liveaudio': {'id': 132,
'statkey': '/app/direkt/p1[k(132)]',
'url': 'http://sverigesradio.se/topsy/direkt/srapi/132.mp3'},
'name': 'P1',
'scheduleurl': 'http://api.sr.se/v2/scheduledepisodes?channelid=132',
'siteurl': 'http://sverigesradio.se/p1',
'xmltvid': 'p1.sr.se'},
{
'channeltype': 'Lokal kanal',
'color': 'c31eaa',
'id': 200,
'image': 'http://static-cdn.sr.se/sida/images/200/2186775_512_512.jpg?preset=api-default-square',
'imagetemplate': 'http://static-cdn.sr.se/sida/images/200/2186775_512_512.jpg',
'liveaudio': {'id': 200,
'statkey': '/app/direkt/p4 jämtland[k(200)]',
'url': 'http://sverigesradio.se/topsy/direkt/srapi/200.mp3'},
'name': 'P4 Jämtland',
'scheduleurl': 'http://api.sr.se/v2/scheduledepisodes?channelid=200',
'siteurl': 'http://sverigesradio.se/jamtland/',
'xmltvid': 'p4jmtl.sr.se'
}
],
'copyright': 'Copyright Sveriges Radio 2017. All rights reserved.',
'pagination': {
'nextpage': 'http://api.sr.se/v2/channelsformat=json&indent=true&page=2',
'page': 1,
'size': 10,
'totalhits': 55,
'totalpages': 6
}
}
for channel in respjson['channels']:
print(channel['id'])
What you want to do is look thru the dictionaries presented inside the channels, you can do that with the following...
for dic in respjson['channels']:
pprint(dic['id'])