Python trouble with JSON objects schema to parse iTunes id lookup - python

I am trying to parse information about applications from the iTunes lookup tool, for example https://itunes.apple.com/lookup?id=880047117.
Right now I am trying to open a connection using requests, JSON, and a JSON objects schema. However, the schema keeps failing with the error:
Traceback (most recent call last):
File "parse_and_query.py", line 33, in <module>
details = get_app_details(880047117)
File "/usr/lib/python3.5/site-packages/jsonobjects/schema.py", line 87, in wrapper
return self.parse(func(*args, **kwargs))
File "/usr/lib/python3.5/site-packages/jsonobjects/fields.py", line 169, in parse
return self.run_validation(value)
File "/usr/lib/python3.5/site-packages/jsonobjects/fields.py", line 136, in run_validation
is_empty, value = self.validate_empty_values(value)
File "/usr/lib/python3.5/site-packages/jsonobjects/fields.py", line 105, in validate_empty_values
self.fail('required')
File "/usr/lib/python3.5/site-packages/jsonobjects/fields.py", line 165, in fail
raise ValidationError(msg.format(**kwargs), self.field_name)
jsonobjects.exceptions.ValidationError: ['This field is required.']
The schema is declared as its own class and instantiated, but it continually fails. I have the IDs of the apps whose JSON information I would like to look up. If there is an easier way that I am missing, please let me know; I don't have access to the iTunes API.
#!/usr/local/bin/python3.5
import json
import requests
import jsonobjects as jo
from jsonschema import Draft4Validator

class iTunesAppSchema(jo.Schema):
    id = jo.IntegerField('trackId')
    url = jo.Field('trackViewUrl')
    name = jo.StringField('trackName')
    rating = jo.FloatField('averageUserRating')
    reviews = jo.IntegerField('userRatingCountForCurrentVersion')
    version = jo.StringField('version')
    bundle_id = jo.StringField('bundleId')
    publisher_id = jo.IntegerField('artistId')
    publisher_url = jo.Field('artistViewUrl')
    publisher_name = jo.StringField('artistName')
    categories = jo.ListField('genres', child=jo.StringField())

parser = iTunesAppSchema('results[0]')

#parser.as_decorator
def get_app_details(app_id):
    url = 'https://itunes.apple.com/lookup?id={}'
    return requests.get(url.format(app_id)).json()

# https://itunes.apple.com/lookup?id=880047117
details = get_app_details(880047117)
print(details)
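On the request for an easier way: one possible approach, sketched here under assumptions, is to skip the schema library and read the fields straight from the decoded JSON. The sketch assumes the response is a dict with a `results` list (as the `results[0]` path in the schema suggests) and that the rating fields may be absent for some apps, which is one unconfirmed explanation for the "This field is required." error. The function name `extract_app_details` and the sample payload are hypothetical.

```python
def extract_app_details(payload):
    """Pull selected fields out of a decoded iTunes lookup response (a dict)."""
    results = payload.get('results') or []
    if not results:
        return None  # nothing was returned for this id
    app = results[0]
    return {
        'id': app['trackId'],
        'name': app['trackName'],
        'version': app.get('version'),
        'bundle_id': app.get('bundleId'),
        # Assumption: rating fields can be missing, so read them with .get()
        # and fall back to None instead of raising.
        'rating': app.get('averageUserRating'),
        'reviews': app.get('userRatingCountForCurrentVersion'),
        'categories': app.get('genres', []),
    }


# Hand-written sample payload for illustration, not real API output.
sample = {'results': [{'trackId': 880047117,
                       'trackName': 'Example App',
                       'genres': ['Games']}]}
details = extract_app_details(sample)
print(details)
```

With a live request, the dict returned by `requests.get(url).json()` would be passed in place of `sample`.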

Related

Can we convert google-cloud-dialogflow api returned types to json?

I am having trouble converting Dialogflow types such as ListIntentsResponse and EntityType to JSON. I have researched this a lot. Converting every entry one by one is a headache, which is why I want a workaround.
I have tried using the google.protobuf.json_format methods, but they don't work. The error says "Unknown field for ListIntentsResponse: DESCRIPTOR".
from google.protobuf.json_format import *
client = dialogflow.IntentsClient()
request = dialogflow.ListIntentsRequest(
    parent=f'projects/{DIALOGFLOW_PROJECT_ID}/agent'
)
response = client.list_intents(request)
# print(response)
print(MessageToJson(response, descriptor_pool=None))
Error:
Traceback (most recent call last):
File "c:\Users\1150-Bilal\Desktop\chatbot\intents.py", line 12, in <module>
intentlist()
File "c:\Users\1150-Bilal\Desktop\chatbot\intents.py", line 10, in intentlist
print(MessageToJson(response ,descriptor_pool=None))
File "C:\Users\1150-Bilal\Desktop\chatbot\botenv\lib\site-packages\google\protobuf\json_format.py", line 130, in MessageToJson
return printer.ToJsonString(message, indent, sort_keys, ensure_ascii)
File "C:\Users\1150-Bilal\Desktop\chatbot\botenv\lib\site-packages\google\protobuf\json_format.py", line 197, in ToJsonString
js = self._MessageToJsonObject(message)
File "C:\Users\1150-Bilal\Desktop\chatbot\botenv\lib\site-packages\google\protobuf\json_format.py", line 203, in _MessageToJsonObject
message_descriptor = message.DESCRIPTOR
File "C:\Users\1150-Bilal\Desktop\chatbot\botenv\lib\site-packages\google\cloud\dialogflow_v2\services\intents\pagers.py", line 74, in __getattr__
return getattr(self._response, name)
File "C:\Users\1150-Bilal\Desktop\chatbot\botenv\lib\site-packages\proto\message.py", line 747, in __getattr__
raise AttributeError(
AttributeError: Unknown field for ListIntentsResponse: DESCRIPTOR

How to post data to the jobs API in Elasticsearch

I'm trying to post data to a machine learning API using Elasticsearch. What format do the JSON docs need to be in?
I've attempted to send data as JSON docs separated by newlines in a txt file. I've also tried converting back and forth to JSON using dump and load, to no avail. The documentation states that the documents can be separated by whitespace, but no matter what I try it won't accept them.
https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-post-data.html
Here is an example of a json doc saved as file_name.json:
[{"myid": "id1", "client": "client1", "submit_date": 1514764857},
{"my_id": "id2", "client": "client_2", "submit_date": 1514764857}]
Here is the basic code needed to post data:
import json
import os
from elasticsearch import Elasticsearch
from elasticsearch.client.xpack import MlClient

# elastic_connection() and job_id are defined elsewhere in my project
es = elastic_connection()
es_ml = MlClient(es)

def post_training_data(directory='Training Data', file_name='file_name.json'):
    with open(os.path.join(directory, file_name), mode='r') as train_file:
        train_data = json.load(train_file)
    es_ml.post_data(job_id=job_id, body=train_data)

post_training_data()
This is the specific error I am getting with this:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "..\train_model.py", line 218, in post_training_data
self.es_ml.post_data(job_id=self.job_id, body=train_data)
File "..\inc_anamoly\lib\site-packages\elasticsearch\client\utils.py", line 76, in _wrapped
return func(*args, params=params, **kwargs)
File "..\inc_anamoly\lib\site-packages\elasticsearch\client\xpack\ml.py", line 81, in post_data
body=self._bulk_body(body))
AttributeError: 'MlClient' object has no attribute '_bulk_body'
This turned out to be a bug in the client library. The issue has been reported:
https://github.com/elastic/elasticsearch-py/issues/959
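On the format question itself: the question notes that the documentation allows documents separated by whitespace. A minimal sketch of building such a body from an already parsed list is shown here, writing one JSON document per line. The helper name `to_ndjson` and the sample documents are hypothetical, and whether `post_data` accepts this string once the client bug is fixed is an assumption that has not been verified.

```python
import json


def to_ndjson(docs):
    """Serialize an iterable of dicts as newline-separated JSON documents."""
    return '\n'.join(json.dumps(doc) for doc in docs) + '\n'


# Sample documents modeled on the ones in the question.
docs = [
    {"my_id": "id1", "client": "client1", "submit_date": 1514764857},
    {"my_id": "id2", "client": "client_2", "submit_date": 1514764857},
]
body = to_ndjson(docs)
print(body)
```

Unlike the JSON array in file_name.json, this body has no enclosing brackets or separating commas; each line parses as a standalone document.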

Appending input to the end of an API call in Python

def i(bot, update, args):
    coin = args
    infoCall = requests.get("https://api.coingecko.com/api/v3/coins/").json()
    coinId = infoCall['categories']
    update.message.reply_text(coinId)
I would like to append the args (assigned to coin above) to the end of the API request URL so that it retrieves the info my user requests, but this is the error I get:
coinId = infoCall ['categories']
KeyError: 'categories'
My guess is that the request is not formatted correctly, so the API returns a 404 instead of the requested info. I then tried this:
def i(bot, update, args):
    coin = args
    infoCall = requests.get("https://api.coingecko.com/api/v3/coins/").json()
    infoCall = json.loads(infoCall)+str(coins)
    coinId = infoCall['categories']
    update.message.reply_text(str(coinId))
After adding this, this is the new error I get:
Traceback (most recent call last):
File "C:\Users\Matthew\AppData\Local\Programs\Python\Python37-32\lib\site-packages\telegram\ext\dispatcher.py", line 279, in process_update
handler.handle_update(update, self)
File "C:\Users\Matthew\AppData\Local\Programs\Python\Python37-32\lib\site-packages\telegram\ext\commandhandler.py", line 173, in handle_update
return self.callback(dispatcher.bot, update, **optional_args)
File "C:/Users/Matthew/Desktop/coding_crap/CryptoBotBetav2.py", line 78, in i
infoCall = json.loads(infoCall)+str(coins)
File "C:\Users\Matthew\AppData\Local\Programs\Python\Python37-32\lib\json\__init__.py", line 341, in loads
raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not list
Basically, you are not appending the args parameter to the API endpoint, which is why you were getting the error. You need to append the coin id (for example 'bitcoin') to the endpoint URL before you make the request, rather than to the output.
A typical example would be as follows. I have removed update and the other unused variables; you can add them back as you need.
import requests

def i(args):
    coin = args
    infoCall = requests.get("https://api.coingecko.com/api/v3/coins/" + args).json()
    coinId = infoCall['categories']
    print(coinId)
    # update.message.reply_text(coinId)

i('bitcoin')
Output:
['Cryptocurrency']
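In the bot handler, args may not be a plain string. The sketch here assumes the Telegram command handler passes args as a list of strings and that coin ids in the URL are lowercase; both are assumptions, and `build_coin_url` and `COINS_ENDPOINT` are hypothetical names. It only builds the URL, so it can be checked without making a request.

```python
COINS_ENDPOINT = "https://api.coingecko.com/api/v3/coins/"


def build_coin_url(args):
    """Return the endpoint URL for the first coin id found in args."""
    # Accept a single string, or a list of strings (first item is used).
    coin = args if isinstance(args, str) else (args[0] if args else "")
    coin = coin.strip().lower()  # assumption: ids are lowercase
    if not coin:
        raise ValueError("no coin id given")
    return COINS_ENDPOINT + coin


url = build_coin_url(['Bitcoin'])
print(url)
```

The resulting URL would then be passed to `requests.get(url).json()` as in the example above.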

How to add a Quandl API key to my request in Python

I created a private Quandl account and received my API key, but how do I add the key to my request in Python?
My Code:
import pandas as pd
import quandl
df = quandl.get('EOD/V')
print(df.head())
Error:
Traceback (most recent call last):
File "C:\Users\qasim\Documents\python_machine_learning\regression.py", line 4, in <module>
df = quandl.get('EOD/V')
File "C:\Python27\lib\site-packages\quandl\get.py", line 48, in get
data = Dataset(dataset_args['code']).data(params=kwargs, handle_column_not_found=True)
File "C:\Python27\lib\site-packages\quandl\model\dataset.py", line 47, in data
return Data.all(**updated_options)
File "C:\Python27\lib\site-packages\quandl\operations\list.py", line 14, in all
r = Connection.request('get', path, **options)
File "C:\Python27\lib\site-packages\quandl\connection.py", line 36, in request
return cls.execute_request(http_verb, abs_url, **options)
File "C:\Python27\lib\site-packages\quandl\connection.py", line 44, in execute_request
cls.handle_api_error(response)
File "C:\Python27\lib\site-packages\quandl\connection.py", line 85, in handle_api_error
raise klass(message, resp.status_code, resp.text, resp.headers, code)
quandl.errors.quandl_error.ForbiddenError: (Status 403) (Quandl Error QEPx05) You have attempted to view a premium database in anonymous mode, i.e., without providing a Quandl key. Please register for a free Quandl account, and then include your API key with your requests.
[Finished in 6.0s]
According to the Configuration section of the docs, you can set it with ApiConfig.api_key:
import quandl
quandl.ApiConfig.api_key = 'tEsTkEy123456789'
Also, according to the quandl Python docs, you can pass additional optional arguments to the get method:
:param str api_key: Downloads are limited to 50 unless api_key is specified
So you could also use:
df = quandl.get('EOD/V', api_key='tEsTkEy123456789')
Alternatively, simply pass the API key value as authtoken:
import quandl
df = quandl.get('EOD/V', authtoken='tEsTkEy123456789')
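To avoid hard-coding the key in the script, one option is to read it from an environment variable. This is only a sketch: `load_api_key` is a hypothetical helper and `QUANDL_API_KEY` is a variable name chosen for illustration, not something the quandl package is known to read by itself.

```python
import os


def load_api_key(env=None, var_name="QUANDL_API_KEY"):
    """Return the API key stored in an environment variable.

    QUANDL_API_KEY is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError("set the %s environment variable first" % var_name)
    return key


# A plain dict stands in for os.environ so the example runs anywhere.
api_key = load_api_key({"QUANDL_API_KEY": "tEsTkEy123456789"})
print(api_key)
```

The returned value could then be assigned to quandl.ApiConfig.api_key or passed as api_key to quandl.get, as in the answers above.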

Memory error when retrieving data from Songkick

I have built a scraper to retrieve concert data from Songkick using their API. However, it takes a lot of time to retrieve all the data for these artists. After scraping for approximately 15 hours, the script was still running but the JSON file no longer changed. I interrupted the script and checked whether I could access my data with TinyDB. Unfortunately, I get the following error. Does anybody know why this is happening?
Error:
('cannot fetch url', 'http://api.songkick.com/api/3.0/artists/8689004/gigography.json?apikey=###########&min_date=2015-04-25&max_date=2017-03-01')
8961344
Traceback (most recent call last):
File "C:\Users\rmlj\Dropbox\Data\concerts.py", line 42, in <module>
load_events()
File "C:\Users\rmlj\Dropbox\Data\concerts.py", line 27, in load_events
print(artist)
File "C:\Python27\lib\idlelib\PyShell.py", line 1356, in write
return self.shell.write(s, self.tags)
KeyboardInterrupt
>>> mydat = db.all()
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
mydat = db.all()
File "C:\Python27\lib\site-packages\tinydb\database.py", line 304, in all
return list(itervalues(self._read()))
File "C:\Python27\lib\site-packages\tinydb\database.py", line 277, in _read
return self._storage.read()
File "C:\Python27\lib\site-packages\tinydb\database.py", line 31, in read
raw_data = (self._storage.read() or {})[self._table_name]
File "C:\Python27\lib\site-packages\tinydb\storages.py", line 105, in read
return json.load(self._handle)
File "C:\Python27\lib\json\__init__.py", line 287, in load
return loads(fp.read(),
MemoryError
Below you can find my script:
import urllib2
import requests
import json
import csv
import codecs
from tinydb import TinyDB, Query

db = TinyDB('events.json')

def load_events():
    MIN_DATE = "2015-04-25"
    MAX_DATE = "2017-03-01"
    API_KEY= "###############"
    with open('artistid.txt', 'r') as f:
        for a in f:
            artist = a.strip()
            print(artist)
            url_base = 'http://api.songkick.com/api/3.0/artists/{}/gigography.json?apikey={}&min_date={}&max_date={}'
            url = url_base.format(artist, API_KEY, MIN_DATE, MAX_DATE)
            # url = u'http://api.songkick.com/api/3.0/search/artists.json?query='+artist+'&apikey=WBmvXDarTCEfqq7h'
            try:
                r = requests.get(url)
                resp = r.json()
                if(resp['resultsPage']['totalEntries']):
                    results = resp['resultsPage']['results']['event']
                    for x in results:
                        print(x)
                        db.insert(x)
            except:
                print('cannot fetch url',url);

load_events()
db.close()
print ("End of script")
MemoryError is a built-in Python exception (https://docs.python.org/3.6/library/exceptions.html#MemoryError), so it looks like the process ran out of memory, and this isn't really related to Songkick.
This question probably has the information you need to debug this: How to debug a MemoryError in Python? Tools for tracking memory use?
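The traceback shows TinyDB's JSON storage reading the whole file through a single json.load call, so memory use grows with the file size. One possible workaround, which replaces TinyDB with a different technique rather than fixing it, is to store one JSON document per line and read the file back lazily. This is a sketch assuming Python 3; `append_event`, `iter_events`, and the sample events are hypothetical.

```python
import json
import os
import tempfile


def append_event(path, event):
    """Append one event as a single JSON line; the file is never re-read."""
    with open(path, 'a', encoding='utf-8') as handle:
        handle.write(json.dumps(event) + '\n')


def iter_events(path):
    """Yield events one at a time, so only one line is held in memory."""
    with open(path, 'r', encoding='utf-8') as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo with made-up events written to a temporary file.
path = os.path.join(tempfile.mkdtemp(), 'events.jsonl')
for number in range(3):
    append_event(path, {'id': number, 'type': 'Concert'})

count = sum(1 for _ in iter_events(path))
print(count)
```

In the scraper, `append_event(path, x)` would take the place of `db.insert(x)`; queries would then have to be written as filters over `iter_events(path)`.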
