This is my first time asking something here. I've been trying to access the YouTube API to get some data for an experiment I'm doing. Everything's working so far. I just wanted to ask about a very inconsistent error that I'm getting.
-----------
1
Title: All Movie Trailers of New York Comic-Con (2016) Power Rangers, John Wick 2...
Uploaded by: KinoCheck International
Uploaded on: 2016-10-12T14:43:42.000Z
Video ID: pWOH-OZQUj0
2
Title: Movieclips Trailers
Uploaded by: Movieclips Trailers
Uploaded on: 2011-04-01T18:43:14.000Z
Video ID: Traceback (most recent call last):
File "scrapeyoutube.py", line 24, in <module>
print "Video ID:\t", search_result['id']['videoId']
KeyError: 'videoId'
I tried getting the video ID ('videoId', as per the documentation). For some reason, the code works for the 1st result and then completely fails on the 2nd one. It's weird because it's only happening for this particular field. Everything else ('description', 'publishedAt', etc.) works. Here's my code:
from apiclient.discovery import build
import json
import pprint
import sys
APINAME = 'youtube'
APIVERSION = 'v3'
APIKEY = 'secret teehee'
service = build(APINAME, APIVERSION, developerKey = APIKEY)
#volumes source ('public'), search query ('androide')
searchrequest = service.search().list(q ='movie trailers', part ='id, snippet', maxResults = 25).execute()
searchcount = 0
print "-----------"
for search_result in searchrequest.get("items", []):
    searchcount += 1
    print searchcount
    print "Title:\t", search_result['snippet']['title']
    # print "Description:\t", search_result['snippet']['description']
    print "Uploaded by:\t", search_result['snippet']['channelTitle']
    print "Uploaded on:\t", search_result['snippet']['publishedAt']
    print "Video ID:\t", search_result['id']['videoId']
Hope you guys can help me. Thanks!
Use the get method on the result:
result['id'].get('videoId')
Some elements don't have this key. If you use square brackets, Python throws a KeyError exception, but if you use the get method, Python returns None for elements that don't have the 'videoId' key.
The search() method returns channels and playlists together with videos in the search results. That might be the cause of your problem.
I use their interactive playgrounds to learn the structure of the returned JSON, the functions, etc. For your question, I suggest visiting https://developers.google.com/youtube/v3/docs/search/list .
Make sure the kind of an item is "youtube#video" before accessing the videoId of that item.
Sample of code:
...
for index in response["items"]:  # response is the JSON I got from the API
    tmp = {}  # temporary dict to insert into my custom JSON
    if index["id"]["kind"] == "youtube#video":
        tmp["videoID"] = index["id"]["videoId"]
...
This is a part of code from my personal project I am currently working on.
Because for the key "id", some results return:
{u'kind': u'youtube#playlist', u'playlistId': u'PLd0_QArxznVHnlvJp0ki5bpmBj4f64J7P'}
As you can see, there is no key "videoId".
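Putting the two suggestions together, here is a minimal sketch of how the original loop could skip non-video items (it reuses the searchrequest variable from the question):

for search_result in searchrequest.get("items", []):
    # Channels and playlists have no 'videoId', so only handle videos
    if search_result['id'].get('kind') != 'youtube#video':
        continue
    print "Video ID:\t", search_result['id'].get('videoId')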
I tried the steps mentioned in this article.
https://matthewbilyeu.com/blog/2022-09-01/responding-to-recruiter-emails-with-gpt-3
There is a screenshot that says: Here's an example from the OpenAI Playground.
I typed all the text into the Playground but do not get a response similar to the one shown in that image. I expected text like {"name":"William", "company":"BillCheese"}. I am not sure how to configure the parameters in the OpenAI web interface.
Update:
I used this code:
import json
import re, textwrap
import openai
openai.api_key = 'xxx'
prompt = f"""
Hi Matt! This is Steve Jobs with Information Edge Limited! I'm interested in having you join our team here.
"""
completion = openai.Completion.create(
    model="text-davinci-002",
    prompt=textwrap.dedent(prompt),
    max_tokens=20,
    temperature=0,
)
try:
    json_str_response = completion.choices[0].text
    json_str_response_clean = re.search(r".*(\{.*\})", json_str_response).groups()[0]
    print(json.loads(json_str_response_clean))
except (AttributeError, json.decoder.JSONDecodeError) as exception:
    print("Could not decode completion response from OpenAI:")
    print(completion)
    raise exception
and got this error:
Could not decode completion response from OpenAI:
AttributeError: 'NoneType' object has no attribute 'groups'
You're running into this problem: Regex: AttributeError: 'NoneType' object has no attribute 'groups'
Take a look at this line:
json_str_response_clean = re.search(r".*(\{.*\})", json_str_response).groups()[0]
The regex can't find anything matching the pattern, so re.search returns None. None does not have a .groups() method, so you get an error. I don't have enough details to go much further, but the link above might get you there.
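As a minimal sketch, one way to make that failure explicit is to keep the match object and test it before calling .groups() (json_str_response here is a stand-in for completion.choices[0].text from the question's code):

import json
import re

# Stand-in for completion.choices[0].text from the question's code
json_str_response = 'Sure! {"name": "William", "company": "BillCheese"}'

match = re.search(r".*(\{.*\})", json_str_response)
if match is None:
    # The completion contained nothing that looks like a JSON object
    print("No JSON object found in completion text:", json_str_response)
else:
    print(json.loads(match.groups()[0]))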
I don't know why both the questioner and one reply above me are using regex. According to the OpenAI documentation, a Completion returns a JSON object.
No need to catch things in a complicated way - just load the response into a dictionary and access the fields you need:
import json
# ...
# Instead of the try ... except block, just load it into a dictionary.
response = json.loads(completion.choices[0].text)
# Access whatever field you need
response["..."]
This worked for me:
question = "Write a python function to detect anomalies in a given time series"
response = openai.Completion.create(
    model="text-davinci-003",
    prompt=question,
    temperature=0.9,
    max_tokens=150,
    top_p=1,
    frequency_penalty=0.0,
    presence_penalty=0.6,
    stop=[" Human:", " AI:"]
)
print(response)
print("==========Python Code=========")
print(response["choices"][0]["text"])
I trust all is well with you and yours. Thank you for taking a moment to read through this, and I apologize if this is a repeat (if it is, point me to the right spot and I will read through that!).
I am trying to hit the Twitter API via Tweepy (because I'm too new to figure out Python and the official Twitter API) and return a result in a usable format.
import tweepy
import Auth_Codes
import json

twitter_auth_keys = {
    "consumer_key": Auth_Codes.consumer_key,
    "consumer_secret": Auth_Codes.consumer_secret,
    "access_token": Auth_Codes.access_token,
    "access_token_secret": Auth_Codes.access_token_secret
}
auth = tweepy.OAuthHandler(
    twitter_auth_keys["consumer_key"],
    twitter_auth_keys["consumer_secret"]
)
auth.set_access_token(
    twitter_auth_keys["access_token"],
    twitter_auth_keys["access_token_secret"]
)
api = tweepy.API(auth)
#api.search_tweets(q = "Aztar")
searched_tweets = [tweet for tweet in tweepy.Cursor(api.search_tweets,
                                                    q="What you want to search",
                                                    lang='en',
                                                    result_type='recent',
                                                    count=1).items(1)]
print(searched_tweets)
print(type(searched_tweets))
When this is executed, I get a very large response that I cannot fully post here.
It is also of type <class 'list'>.
I hope that added the spoiler button as intended. My issue is that I have tried several different ways to convert this into actual JSON, and I am struggling, as every guide I follow online leads me to a dead end (granted, I am learning lots!). In Node.js, I would normally leverage a map and sort it that way. Is there something similar I can do here? Not all the data is relevant to me.
Thanks in advance, and really sorry about not knowing how to add a spoiler button if it is at all possible.
I have added the following to it:
searched_tweets_dict = json.loads(searched_tweets)
print(searched_tweets_dict)
and the result is the following error code:
Traceback (most recent call last):
File "E:\Dropbox\Backup\Github\Python\Mid_Journey\Search.py", line 33, in <module>
searched_tweets_dict = json.loads(searched_tweets)
File "C:\Pthyon_3.10\lib\json\__init__.py", line 339, in loads
raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not list
Why are you using a Cursor if you are only requesting one tweet?
And why don't you just use the generator instead of creating that list?
Anyway, the JSON object is already included in the Tweepy objects (._json).
cursor = tweepy.Cursor(
    api.search_tweets,
    q="What you want to search",
    lang='en',
    result_type='recent',
    count=1
)
for tweet in cursor.items(1):
    print(tweet._json)
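If you still want the whole result as JSON text (what the json.loads attempt was aiming for), a small sketch going the other way with json.dumps, building on the cursor above:

import json

# Each tweet._json is already a plain dict parsed from the API response
tweets = [tweet._json for tweet in cursor.items(1)]

# Serialize the list of dicts back into a JSON string (the inverse of json.loads)
print(json.dumps(tweets, indent=2))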
I am using the API from this site: https://dev.whatismymmr.com. I want to request just the closestRank, but I get a KeyError: 'ranked.closestRank'. I can get the entire ['ranked'] object (which contains closestRank), but then I end up with a lot of information I don't need.
How can I end up with just the closest rank?
My code
import requests

LeagueName = input("Summoner name")
base = "https://eune.whatismymmr.com/api/v1/summoner?name="
Thething = base + LeagueName
print(Thething)
response = requests.get(Thething)
print(response.status_code)
MMR = response.json()
print(MMR['ranked.closestRank'])
The API documentation describes the field as:
<queue>.closestRank (the queue is the game mode; it can be normal or ranked)
You can use the summoner name babada27 for testing.
Hope this is what you are looking for - change the last line to:
print(MMR["ranked"]["closestRank"])
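The dotted name in the docs describes a path into nested JSON, not a literal key, so each level is indexed separately. A small defensive sketch (assuming the response layout from the question; .get() returns None instead of raising a KeyError when a queue has no data for that summoner):

import requests

response = requests.get("https://eune.whatismymmr.com/api/v1/summoner?name=babada27")
MMR = response.json()

# Index one level at a time; 'ranked' may be missing or incomplete
print(MMR.get("ranked", {}).get("closestRank"))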
I am playing around with the Zillow API, but I am having trouble retrieving the rent data. Currently I am using a Python Zillow wrapper, but I am not sure if it works for pulling the rent data.
This is the help page I am using for the Zillow API:
https://www.zillow.com/howto/api/GetSearchResults.htm
import pyzillow
from pyzillow.pyzillow import ZillowWrapper, GetDeepSearchResults
import pandas as pd
house = pd.read_excel('Housing_Output.xlsx')
### Login to Zillow API
address = ['123 Test Street City, State Abbreviation'] # Fill this in with an address
zip_code = ['zip code'] # fill this in with a zip code
zillow_data = ZillowWrapper('API_KEY') # fill this in with your API key
deep_search_response = zillow_data.get_deep_search_results(address, zip_code)
result = GetDeepSearchResults(deep_search_response)
# These API calls work, but I am not sure how to retrieve the rent data
print(result.zestimate_amount)
print(result.tax_value)
ADDING ADDITIONAL INFO:
Chapter 2 of the paper below explains how to pull rent data by creating an XML function called zillowProperty. My XML skills aren't great, but I think I need to either:
a) import some XML package to help read it, or
b) save the code as an XML file and use the open function to read the file.
https://www.amherst.edu/system/files/media/Comprehensive_Evaluation_-_Ningyue_Christina_Wang.pdf
I am trying to provide the code in here, but it won't let me break to the next line for some reason.
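For option (a), this minimal sketch is roughly what I have in mind, using the standard library XML parser (the rentzestimate parameter and the result path are assumptions based on the API help page; the credentials and address are placeholders):

import requests
import xml.etree.ElementTree as ET

params = {
    'zws-id': 'YOUR_API_KEY',      # placeholder
    'address': '123 Test Street',  # placeholder
    'citystatezip': '12345',       # placeholder
    'rentzestimate': 'true',       # ask the API to include the Rent Zestimate
}
resp = requests.get('http://www.zillow.com/webservice/GetSearchResults.htm', params=params)

root = ET.fromstring(resp.content)
# Assumed location of the rent amount in the response tree
amount = root.find('.//result/rentzestimate/amount')
if amount is not None:
    print(amount.text)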
We can see that rent is not a field one can get using the pyzillow package, by looking at the attributes of your result (run dir(result)) as well as the code here: Pyzillow source code.
However, thanks to the beauty of open source, you can edit the source code of this package and get the functionality you are looking for. Here is how:
First, locate where the code sits on your hard drive. Import pyzillow, and run:
pyzillow?
The File field shows this for me:
c:\programdata\anaconda3\lib\site-packages\pyzillow\__init__.py
Hence go to c:\programdata\anaconda3\lib\site-packages\pyzillow (or whatever it shows for you) and open the pyzillow.py file with a text editor.
Now we need to do two changes.
One: Inside the get_deep_search_results function, you'll see params. We need to edit that to turn the rentzestimate feature on. So change that function to:
def get_deep_search_results(self, address, zipcode):
    """
    GetDeepSearchResults API
    """
    url = 'http://www.zillow.com/webservice/GetDeepSearchResults.htm'
    params = {
        'address': address,
        'citystatezip': zipcode,
        'zws-id': self.api_key,
        'rentzestimate': True  # This is the only line we add
    }
    return self.get_data(url, params)
Two: Go to class GetDeepSearchResults(ZillowResults), and add the following into the attribute_mapping dictionary:
'rentzestimate_amount': 'result/rentzestimate/amount'
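For orientation, after the edit the dictionary would look roughly like this (the surrounding entries are illustrative, modeled on the existing zestimate mapping; only the last line is new):

attribute_mapping = {
    # ... existing entries, for example:
    'zestimate_amount': 'result/zestimate/amount',
    # The new entry that exposes the Rent Zestimate:
    'rentzestimate_amount': 'result/rentzestimate/amount',
}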
Voila! The customized and updated Python package now returns the Rent Zestimate. Let's try:
from pyzillow.pyzillow import ZillowWrapper, GetDeepSearchResults
address = ['11 Avenue B, Johnson City, NY']
zip_code = ['13790']
zillow_data = ZillowWrapper('X1-ZWz1835knufc3v_38l6u')
deep_search_response = zillow_data.get_deep_search_results(address, zip_code)
result = GetDeepSearchResults(deep_search_response)
print(result.rentzestimate_amount)
This correctly returns the Rent Zestimate of $1200, which can be validated on the Zillow page for that address.
I have the following Python code that works OK to use Reddit's API and look up the front pages of different subreddits and their rising submissions.
from pprint import pprint
import requests
import json
import datetime
import csv
import time
subredditsToScan = ["Arts", "AskReddit", "askscience", "aww", "books", "creepy", "dataisbeautiful", "DIY", "Documentaries", "EarthPorn", "explainlikeimfive", "food", "funny", "gaming", "gifs", "history", "jokes", "LifeProTips", "movies", "music", "pics", "science", "ShowerThoughts", "space", "sports", "tifu", "todayilearned", "videos", "worldnews"]
ofilePosts = open('posts.csv', 'wb')
writerPosts = csv.writer(ofilePosts, delimiter=',')
ofileUrls = open('urls.csv', 'wb')
writerUrls = csv.writer(ofileUrls, delimiter=',')
for subreddit in subredditsToScan:
    front = requests.get(r'http://www.reddit.com/r/' + subreddit + '/.json')
    rising = requests.get(r'http://www.reddit.com/r/' + subreddit + '/rising/.json')
    front.text
    rising.text
    risingData = rising.json()
    frontData = front.json()
    print(len(risingData['data']['children']))
    print(len(frontData['data']['children']))
    for i in range(0, len(risingData['data']['children'])):
        author = risingData['data']['children'][i]['data']['author']
        score = risingData['data']['children'][i]['data']['score']
        subreddit = risingData['data']['children'][i]['data']['subreddit']
        gilded = risingData['data']['children'][i]['data']['gilded']
        numOfComments = risingData['data']['children'][i]['data']['num_comments']
        linkUrl = risingData['data']['children'][i]['data']['permalink']
        timeCreated = risingData['data']['children'][i]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
    for j in range(0, len(frontData['data']['children'])):
        author = frontData['data']['children'][j]['data']['author'].encode('utf-8').strip()
        score = frontData['data']['children'][j]['data']['score']
        subreddit = frontData['data']['children'][j]['data']['subreddit'].encode('utf-8').strip()
        gilded = frontData['data']['children'][j]['data']['gilded']
        numOfComments = frontData['data']['children'][j]['data']['num_comments']
        linkUrl = frontData['data']['children'][j]['data']['permalink'].encode('utf-8').strip()
        timeCreated = frontData['data']['children'][j]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
It works well and scrapes the data accurately, but it constantly gets interrupted, seemingly at random, and crashes at runtime, saying:
Traceback (most recent call last):
File "dataGather1.py", line 27, in <module>
for i in range(0, len(risingData['data']['children'])):
KeyError: 'data'
I have no idea why this error occurs on and off rather than consistently. I thought maybe I was calling the API too often and it was blocking my access, so I added a sleep to my code, but that did not help. Any ideas?
When there is no data in the response from the API, there is no 'data' key in the dictionary, so you get a KeyError for some subreddits. You need to use a try/except.
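A minimal sketch of that guard, using the variable names from the question:

try:
    children = risingData['data']['children']
except KeyError:
    # The API answered without a 'data' key (e.g. a rate-limit error payload),
    # so skip this subreddit instead of crashing
    print('No data for subreddit:', subreddit)
    children = []
for child in children:
    author = child['data']['author']
    # ... process the remaining fields as in the original loop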
The JSON you are parsing doesn't contain the 'data' element, thus you get an error. I think your hunch is correct, though: it is probably rate limiting, or you're asking for hidden/deleted entries.
Reddit is very strict about accessing their API without playing nice. That means you should register your app and send a meaningful user-agent with your requests, and you should probably use the Python library for this kind of thing: https://praw.readthedocs.io/en/latest/
In my experience, without registering, the direct REST Reddit API is even stricter than the 1-request-per-2-seconds rule they have (had?).
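For illustration, a small sketch of the same rising lookup via PRAW (the client ID, secret, and user agent are placeholders you get when registering your app):

import praw

# Placeholder credentials from registering an app at reddit.com/prefs/apps
reddit = praw.Reddit(
    client_id='YOUR_CLIENT_ID',
    client_secret='YOUR_CLIENT_SECRET',
    user_agent='myDataGatherer/0.1 by u/your_username',
)

for submission in reddit.subreddit('aww').rising(limit=5):
    # PRAW handles authentication and rate limiting for you
    print(submission.author, submission.score, submission.num_comments, submission.permalink)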
Python raises a KeyError whenever a dict object is indexed (using the form a = adict[key]) and the key is not in the dictionary. It seems that when you are getting this error, your data value is empty.
You might just check the length of the dictionary before you execute the for loop. If it's empty, the loop will just not run. Some error checking here might help.
size = len(risingData)
if size:
    for i in range(0, size):
        …