I trust all is well with you and yours. Thank you for taking a moment to read through this, and I apologize if this is a repeat (if it is, point me to the right spot and I will read through that!)
I am trying to hit the Twitter API via Tweepy (because I'm too new to figure out Python and the official Twitter API) and return a result in a usable format.
import tweepy
import Auth_Codes
import json

twitter_auth_keys = {
    "consumer_key": Auth_Codes.consumer_key,
    "consumer_secret": Auth_Codes.consumer_secret,
    "access_token": Auth_Codes.access_token,
    "access_token_secret": Auth_Codes.access_token_secret
}

auth = tweepy.OAuthHandler(
    twitter_auth_keys["consumer_key"],
    twitter_auth_keys["consumer_secret"]
)
auth.set_access_token(
    twitter_auth_keys["access_token"],
    twitter_auth_keys["access_token_secret"]
)
api = tweepy.API(auth)

# api.search_tweets(q = "Aztar")
searched_tweets = [tweet for tweet in tweepy.Cursor(api.search_tweets,
                                                    q="What you want to search",
                                                    lang='en',
                                                    result_type='recent',
                                                    count=1)
                   .items(1)]
print(searched_tweets)
print(type(searched_tweets))
When this is executed, I get a very large response that I cannot fully post here.
It is also of type <class 'list'>.
I hope that added the spoiler button as intended. My issue is that I have tried several different ways to convert this into actual JSON, and I am struggling, as every guide I follow online leads me to a dead end (granted, I am learning lots!). In Node.js, I would normally leverage a map and sort it that way. Is there something similar I can do here? Not all of the data is relevant to me.
Thanks in advance, and I'm really sorry about not knowing how to add a spoiler button, if that is at all possible.
I have added the following to it:
searched_tweets_dict = json.loads(searched_tweets)
print(searched_tweets_dict)
and the result is the following error:
Traceback (most recent call last):
File "E:\Dropbox\Backup\Github\Python\Mid_Journey\Search.py", line 33, in <module>
searched_tweets_dict = json.loads(searched_tweets)
File "C:\Pthyon_3.10\lib\json\__init__.py", line 339, in loads
raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not list
Why are you using a Cursor if you are only requesting one tweet?
And why don't you just use the generator instead of creating that list?
Anyway, the JSON object is already included in the Tweepy objects (._json).
cursor = tweepy.Cursor(
    api.search_tweets,
    q="What you want to search",
    lang='en',
    result_type='recent',
    count=1
)
for tweet in cursor.items(1):
    print(tweet._json)
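Since not all of the data is relevant to you, you can do the equivalent of a Node.js map here: pull out just the fields you care about from each tweet's ._json dict. A rough sketch reusing the cursor above; the field names (id_str, text, created_at, user.screen_name) are the standard v1.1 tweet payload keys, and the item count is arbitrary:
import json

trimmed = []
for tweet in cursor.items(10):
    t = tweet._json  # plain dict parsed from the API response
    trimmed.append({
        "id": t["id_str"],
        "text": t["text"],
        "user": t["user"]["screen_name"],
        "created_at": t["created_at"],
    })

print(json.dumps(trimmed, indent=2))  # a real JSON document containing only the fields you need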
Related
I'm trying to write a program that will stream tweets from Twitter using their Stream API and Tweepy. Here's the relevant part of my code:
def on_data(self, data):
    if data.user.id == "25073877" or data.in_reply_to_user_id == "25073877":
        self.filename = 'trump.csv'
    elif data.user.id == "30354991" or data.in_reply_to_user_id == "30354991":
        self.filename = 'harris.csv'
    if not 'RT @' in data.text:
        csvFile = open(self.filename, 'a')
        csvWriter = csv.writer(csvFile)
        print(data.text)
        try:
            csvWriter.writerow([data.text, data.created_at, data.user.id, data.user.screen_name, data.in_reply_to_status_id])
        except:
            pass

def on_error(self, status_code):
    if status_code == 420:
        return False
What the code should be doing is streaming the tweets and writing the text of the tweet, the creation date, the user ID of the tweeter, their screen name, and the reply ID of the status they're replying to if the tweet is a reply. However, I get the following error:
File "test.py", line 13, in on_data
if data.user.id == "25073877" or data.in_reply_to_user_id == "25073877":
AttributeError: 'unicode' object has no attribute 'user'
Could someone help me out? Thanks!
EDIT: Sample of what is being read into "data"
{"created_at":"Fri Feb 15 20:50:46 +0000 2019","id":1096512164347760651,"id_str":"1096512164347760651","text":"#realDonaldTrump \nhttps:\/\/t.co\/NPwSuJ6V2M","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":25073877,"in_reply_to_user_id_str":"25073877","in_reply_to_screen_name":"realDonaldTrump","user":{"id":1050189031743598592,"id_str":"1050189031743598592","name":"Lauren","screen_name":"switcherooskido","location":"United States","url":null,"description":"Concerned citizen of the USA who would like to see Integrity restored in the US Government. Anti-marxist!\nSigma, INTP\/J\nREJECT PC and Identity Politics #WWG1WGA","translator_type":"none","protected":false,"verified":false,"followers_count":1459,"friends_count":1906,"listed_count":0,"favourites_count":5311,"statuses_count":8946,"created_at":"Thu Oct 11 00:59:11 +0000 2018","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"FF691F","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/1068591478329495558\/ng_tNAXx_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/1068591478329495558\/ng_tNAXx_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/1050189031743598592\/1541441602","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/NPwSuJ6V2M","expanded_url":"https:\/\/www.conservativereview.com\/news\/5-insane-provisions-amnesty-omnibus-bill\/","display_url":"conservativereview.com\/news\/5-insane-\u2026","indices":[18,41]}],"user_mentions":[{"screen_name":"realDonaldTrump","name":"Donald J. Trump","id":25073877,"id_str":"25073877","indices":[0,16]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"und","timestamp_ms":"1550263846848"}
So I suppose the revised question is: how do I tell the program to write only parts of this JSON output to the CSV file? I've been using the references Twitter's Stream API provides for the attributes of "data".
As stated in your comment, the tweet data is in "JSON format". I believe what you mean by this is that it is a string (unicode) in JSON format, not a parsed JSON object. In order to access the fields the way you want to in your code, you need to parse the data string using the json module.
e.g.
import json
json_data_object = json.loads(data)
You can then access the fields as you would with a dictionary, e.g.
json_data_object['some_key']['some_other_key']
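For the revised question (writing only parts of the payload to the CSV), here is a rough sketch of what the handler could look like once the string is parsed. The key names are taken from the sample payload posted above, and the filenames are the ones from your code; treat it as an outline, not a drop-in replacement:
import csv
import json

def on_data(self, data):
    tweet = json.loads(data)  # parse the raw JSON string into a dict

    user_id = tweet["user"]["id_str"]
    reply_to = tweet.get("in_reply_to_user_id_str")

    if user_id == "25073877" or reply_to == "25073877":
        filename = "trump.csv"
    elif user_id == "30354991" or reply_to == "30354991":
        filename = "harris.csv"
    else:
        return  # not a tweet we care about

    if "RT @" not in tweet["text"]:
        with open(filename, "a", newline="") as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow([
                tweet["text"],
                tweet["created_at"],
                user_id,
                tweet["user"]["screen_name"],
                tweet.get("in_reply_to_status_id_str"),
            ])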
This is a very late answer, but I'm answering here because this is the first search hit when you search for this error. I was also using Tweepy and found that the JSON response object had attributes that could not be accessed.
'Response' object has no attribute 'text'
Through lots of tinkering and research, I found that when looping over the Tweepy response you must access '.data' in the loop statement itself, not on each item inside the loop.
For example:
tweets = client.search_recent_tweets(query="covid", tweet_fields=['text'])
for tweet in tweets:
    print(tweet.text)  # or print(tweet.data.text)
That will not work, because the Response object itself does not expose the attributes within the JSON response object. Instead, you do something like:
tweets = client.search_recent_tweets(query="covid", tweet_fields=['text'])
for tweet in tweets.data:
    print(tweet.text)
Basically, this was a long-winded way to fix a problem I had been stuck on for a long time. Cheers; hopefully other noobs like me won't have to struggle as long as I did!
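For completeness, a minimal end-to-end sketch of that working pattern with the v2 client; the bearer token and query string here are placeholders, and created_at is only available because it is requested via tweet_fields:
import tweepy

# Placeholder credentials; substitute your own v2 bearer token.
client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

response = client.search_recent_tweets(query="covid", tweet_fields=["created_at"])
for tweet in response.data:  # .data holds the list of Tweet objects
    print(tweet.id, tweet.created_at, tweet.text)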
I have generated the Python client and server from https://editor.swagger.io/, and the server runs correctly with no editing, but I can't seem to get the client to communicate with it, or with anything.
I suspect I'm doing something really silly, but the examples I've found on the Internet either don't work or appear to expect that I already understand how to craft the object. Here's my code (I've also tried sending nothing, a string, etc.):
import time
import swagger_client
import json
from swagger_client.rest import ApiException
from pprint import pprint
# Configure OAuth2 access token for authorization: petstore_auth
swagger_client.configuration.access_token = 'special-key'
# create an instance of the API class
api_instance = swagger_client.PetApi()
d = '{"id": 0,"category": {"id": 0,"name": "string"},"name": "doggie","photoUrls": ["string"], "tags": [ { "id": 0, "name": "string" } ], "status": "available"}'
python_d = json.loads(d)
print( json.dumps(python_d, sort_keys=True, indent=4) )
body = swagger_client.Pet(python_d) # Pet | Pet object that needs to be added to the store
try:
    # Add a new pet to the store
    api_instance.add_pet(body)
except ApiException as e:
    print("Exception when calling PetApi->add_pet: %s\n" % e)
I'm using python 3.6.4 and when the above runs I get:
Traceback (most recent call last):
File "petstore.py", line 14, in <module>
body = swagger_client.Pet(python_d) # Pet | Pet object that needs to be added to the store
File "/Users/bombcar/mef/petstore/python-client/swagger_client/models/pet.py", line 69, in __init__
self.name = name
File "/Users/bombcar/mef/petstore/python-client/swagger_client/models/pet.py", line 137, in name
raise ValueError("Invalid value for `name`, must not be `None`") # noqa: E501
ValueError: Invalid value for `name`, must not be `None`
I feel I'm making an incredibly basic mistake; I've literally copied the JSON from https://editor.swagger.io/, but since I can't find an actually working example, I don't know what I'm missing.
The Python client generator produces object-oriented wrappers for the API. You cannot post a dict or a JSON string directly; you need to create a Pet object using the generated wrapper:
api_instance = swagger_client.PetApi()
pet = swagger_client.Pet(name="doggie", status="available",
                         photo_urls=["http://img.example.com/doggie.png"],
                         category=swagger_client.Category(id=42))
response = api_instance.add_pet(pet)
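If you do want to start from the JSON you pasted, one option is to load it into a dict and map its keys onto the constructor's parameters before building the object; a rough sketch, assuming the usual snake_case parameter names that swagger-codegen produces (photo_urls instead of photoUrls):
import json

raw = json.loads('{"name": "doggie", "photoUrls": ["string"], "status": "available"}')

# The JSON uses camelCase keys, while the generated model expects snake_case
# constructor arguments, so translate explicitly rather than unpacking blindly.
pet = swagger_client.Pet(
    name=raw["name"],
    status=raw["status"],
    photo_urls=raw["photoUrls"],
)
api_instance.add_pet(pet)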
I got a similar issue recently and finally fixed it by upgrading the Python version on my machine. I generated the python-flask-server zip file from https://editor.swagger.io/ and set up the environment locally with Python 3.6.10. I got the same error when parsing input using "Model_Name.from_dict()", telling me
raise ValueError("Invalid value for `name`, must not be `None`")  # noqa: E501
ValueError: Invalid value for `name`, must not be `None`
Then I upgraded to Python 3.7.x and the issue was resolved. I know your problem is not related to the version; however, just in case anyone gets a similar issue and is looking for help, they might see this answer.
I have the following Python code that works OK using reddit's API to look up the front page and rising submissions of different subreddits.
from pprint import pprint
import requests
import json
import datetime
import csv
import time
subredditsToScan = ["Arts", "AskReddit", "askscience", "aww", "books", "creepy", "dataisbeautiful", "DIY", "Documentaries", "EarthPorn", "explainlikeimfive", "food", "funny", "gaming", "gifs", "history", "jokes", "LifeProTips", "movies", "music", "pics", "science", "ShowerThoughts", "space", "sports", "tifu", "todayilearned", "videos", "worldnews"]
ofilePosts = open('posts.csv', 'wb')
writerPosts = csv.writer(ofilePosts, delimiter=',')
ofileUrls = open('urls.csv', 'wb')
writerUrls = csv.writer(ofileUrls, delimiter=',')
for subreddit in subredditsToScan:
    front = requests.get(r'http://www.reddit.com/r/' + subreddit + '/.json')
    rising = requests.get(r'http://www.reddit.com/r/' + subreddit + '/rising/.json')
    front.text
    rising.text
    risingData = rising.json()
    frontData = front.json()
    print(len(risingData['data']['children']))
    print(len(frontData['data']['children']))
    for i in range(0, len(risingData['data']['children'])):
        author = risingData['data']['children'][i]['data']['author']
        score = risingData['data']['children'][i]['data']['score']
        subreddit = risingData['data']['children'][i]['data']['subreddit']
        gilded = risingData['data']['children'][i]['data']['gilded']
        numOfComments = risingData['data']['children'][i]['data']['num_comments']
        linkUrl = risingData['data']['children'][i]['data']['permalink']
        timeCreated = risingData['data']['children'][i]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
    for j in range(0, len(frontData['data']['children'])):
        author = frontData['data']['children'][j]['data']['author'].encode('utf-8').strip()
        score = frontData['data']['children'][j]['data']['score']
        subreddit = frontData['data']['children'][j]['data']['subreddit'].encode('utf-8').strip()
        gilded = frontData['data']['children'][j]['data']['gilded']
        numOfComments = frontData['data']['children'][j]['data']['num_comments']
        linkUrl = frontData['data']['children'][j]['data']['permalink'].encode('utf-8').strip()
        timeCreated = frontData['data']['children'][j]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
It works well and scrapes the data accurately, but it constantly gets interrupted, seemingly at random, with a runtime crash saying:
Traceback (most recent call last):
File "dataGather1.py", line 27, in <module>
for i in range(0, len(risingData['data']['children'])):
KeyError: 'data'
I have no idea why this error occurs on and off rather than consistently. I thought maybe I was calling the API too often and it was blocking me, so I threw a sleep into my code, but that did not help. Any ideas?
When the API response contains no data, there is no 'data' key in the dictionary, so you get a KeyError on some subreddits. You need to use a try/except.
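A minimal sketch of what that try/except could look like inside the question's loop (variable names taken from the code above; the rest of the row-building code stays as it was):
for subreddit in subredditsToScan:
    rising = requests.get(r'http://www.reddit.com/r/' + subreddit + '/rising/.json')
    risingData = rising.json()

    try:
        children = risingData['data']['children']
    except KeyError:
        # The response had no 'data' key (e.g. a rate-limit or error payload),
        # so skip this subreddit instead of crashing.
        print('skipping', subreddit)
        continue

    for child in children:
        author = child['data']['author']
        # ... build and write the rest of the row as in the original loop ...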
The JSON you are parsing doesn't contain the 'data' element, hence the error. I think your hunch is correct, though: it is probably rate limiting, or you're asking for hidden/deleted entries.
Reddit is very strict about accessing their API without playing nice. That means you should register your app and use a meaningful user-agent for your requests, and you should probably use the Python library for this kind of thing: https://praw.readthedocs.io/en/latest/
Without registering, in my experience the direct REST reddit API is even stricter than the 1 request per 2 seconds rule they have (had?).
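As a hedge against throttling, a small sketch of what a "nicer" request loop might look like against the plain REST endpoint; the User-Agent string here is just an example, and the two-second sleep reflects the rule mentioned above:
import time
import requests

# A descriptive User-Agent; the default python-requests agent is throttled far
# more aggressively by reddit.
headers = {"User-Agent": "my-subreddit-scraper/0.1 by <your reddit username>"}

for subreddit in subredditsToScan:
    front = requests.get("https://www.reddit.com/r/" + subreddit + "/.json",
                         headers=headers)
    time.sleep(2)  # stay at or under roughly one request every two seconds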
Python raises a KeyError whenever a dict is indexed (using the form a = adict[key]) and the key is not in the dictionary.
It seems like when you are getting this error, your data value is empty.
You might just try to get the length of the dictionary before you execute the for loop. If it’s empty, it will just not run. Some interesting error checking here might help.
size = len(risingData)
if size:
    for i in range(0, size):
        ...
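Alternatively, chained dict.get() calls with defaults avoid the length check entirely; a small sketch using the writer and field names from the question:
children = risingData.get('data', {}).get('children', [])
for child in children:
    post = child['data']
    writerPosts.writerow([post['author'], post['score'], post['subreddit'],
                          post['gilded'], post['num_comments'],
                          post['permalink'], post['created_utc']])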
This is my first time asking something here. I've been trying to access the YouTube API to get something for an experiment I'm doing. Everything's working so far; I just wanted to ask about a very inconsistent error that I'm getting.
-----------
1
Title: All Movie Trailers of New York Comic-Con (2016) Power Rangers, John Wick 2...
Uploaded by: KinoCheck International
Uploaded on: 2016-10-12T14:43:42.000Z
Video ID: pWOH-OZQUj0
2
Title: Movieclips Trailers
Uploaded by: Movieclips Trailers
Uploaded on: 2011-04-01T18:43:14.000Z
Video ID: Traceback (most recent call last):
File "scrapeyoutube.py", line 24, in <module>
print "Video ID:\t", search_result['id']['videoId']
KeyError: 'videoId'
I tried getting the video ID ('videoId' as per the documentation), but for some reason the code works for the 1st result and then totally flops for the 2nd one. It's weird because it only happens for this particular element; everything else ('description', 'publishedAt', etc.) works. Here's my code:
from apiclient.discovery import build
import json
import pprint
import sys
APINAME = 'youtube'
APIVERSION = 'v3'
APIKEY = 'secret teehee'
service = build(APINAME, APIVERSION, developerKey = APIKEY)
#volumes source ('public'), search query ('androide')
searchrequest = service.search().list(q ='movie trailers', part ='id, snippet', maxResults = 25).execute()
searchcount = 0
print "-----------"
for search_result in searchrequest.get("items", []):
    searchcount += 1
    print searchcount
    print "Title:\t", search_result['snippet']['title']
    # print "Description:\t", search_result['snippet']['description']
    print "Uploaded by:\t", search_result['snippet']['channelTitle']
    print "Uploaded on:\t", search_result['snippet']['publishedAt']
    print "Video ID:\t", search_result['id']['videoId']
Hope you guys can help me. Thanks!
Use the 'get' method on the result:
result['id'].get('videoId')
Some elements do not have this key. If you use square brackets, Python throws a KeyError, but if you use the 'get' method, Python returns None for elements that do not have a videoId key.
Using the search() method returns channels and playlists together with videos in the search results. That might be the cause of your problem.
I use their interactive playgrounds to learn the structure of the returned JSON, the functions, etc. For your question, I suggest visiting https://developers.google.com/youtube/v3/docs/search/list .
Make sure the kind of an item is "youtube#video" before accessing that item's videoId.
Sample of code:
...
for index in response["items"]:  # response is the JSON I got from the API
    tmp = {}  # temporary dict to insert into my custom JSON
    if index["id"]["kind"] == "youtube#video":
        tmp["videoID"] = index["id"]["videoId"]
...
This is a part of code from my personal project I am currently working on.
Because for some results the key "id" returns:
{u'kind': u'youtube#playlist', u'playlistId': u'PLd0_QArxznVHnlvJp0ki5bpmBj4f64J7P'}
You can see there is no "videoId" key.
I'm working on a twitch irc bot and one of the components I wanted to have available was the ability for the bot to save quotes to a pastebin paste on close, and then retrieve the same quotes on start up.
I've started with the saving part and have hit a roadblock: I can't seem to get a valid post, and I can't figure out a method that works.
#!/usr/bin/env python3
import urllib.parse
import urllib.request
# --------------------------------------------- Pastebin Requisites --------------------------------------------------
pastebin_key = 'my pastebin key' # developer api key, required. GET: http://pastebin.com/api
pastebin_password = 'password' # password for pastebin_username
pastebin_postexp = 'N' # N = never expire
pastebin_private = 0 # 0 = Public 1 = unlisted 2 = Private
pastebin_url = 'http://pastebin.com/api/api_post.php'
pastebin_username = 'username' # user corresponding with key
# --------------------------------------------- Value clean up --------------------------------------------------
pastebin_password = urllib.parse.quote(pastebin_password, safe='/')
pastebin_username = urllib.parse.quote(pastebin_username, safe='/')
# --------------------------------------------- Pastebin Functions --------------------------------------------------
def post(title, content):  # used for posting a new paste
    pastebin_vars = {'api_option': 'paste', 'api_user_key': pastebin_username, 'api_paste_private': pastebin_private,
                     'api_paste_name': title, 'api_paste_expire_date': pastebin_postexp, 'api_dev_key': pastebin_key,
                     'api_user_password': pastebin_password, 'api_paste_code': content}
    try:
        str_to_paste = ', '.join("{!s}={!r}".format(key, val) for (key, val) in pastebin_vars.items())  # dict to str :D
        str_to_paste = str_to_paste.replace(":", "")    # remove :
        str_to_paste = str_to_paste.replace("'", "")    # remove '
        str_to_paste = str_to_paste.replace(")", "")    # remove )
        str_to_paste = str_to_paste.replace(", ", "&")  # replace dividers with &
        urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
        print('did that work?')
    except:
        print("post submit failed :(")
        print(pastebin_url + "?" + str_to_paste)  # print the output for test

post("test", "stuff")
I'm open to importing more libraries and such; I'm just not really sure what I'm doing wrong after working on this for two days straight :S
import urllib.parse
import urllib.request

PASTEBIN_KEY = 'xxx'
PASTEBIN_URL = 'https://pastebin.com/api/api_post.php'
PASTEBIN_LOGIN_URL = 'https://pastebin.com/api/api_login.php'
PASTEBIN_LOGIN = 'my_login_name'
PASTEBIN_PWD = 'yyy'

def pastebin_post(title, content):
    login_params = dict(
        api_dev_key=PASTEBIN_KEY,
        api_user_name=PASTEBIN_LOGIN,
        api_user_password=PASTEBIN_PWD
    )
    data = urllib.parse.urlencode(login_params).encode("utf-8")
    req = urllib.request.Request(PASTEBIN_LOGIN_URL, data)
    with urllib.request.urlopen(req) as response:
        pastebin_vars = dict(
            api_option='paste',
            api_dev_key=PASTEBIN_KEY,
            api_user_key=response.read(),
            api_paste_name=title,
            api_paste_code=content,
            api_paste_private=2,
        )
        return urllib.request.urlopen(PASTEBIN_URL, urllib.parse.urlencode(pastebin_vars).encode('utf8')).read()

rv = pastebin_post("This is my title", "These are the contents I'm posting")
print(rv)
Combining two different answers above gave me this working solution.
First, your try/except block is throwing away the actual error. You should almost never use a "bare" except clause without capturing or re-raising the original exception. See this article for a full explanation.
Once you remove the try/except, you will see the underlying error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "paste.py", line 42, in post
urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 461, in open
req = meth(req)
File "/usr/lib/python3.4/urllib/request.py", line 1112, in do_request_
raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
This means you're trying to pass a unicode string into a function that's expecting bytes. When you do I/O (like reading/writing files on disk, or sending/receiving data over HTTP) you typically need to encode any unicode strings as bytes. See this presentation for a good explanation of unicode vs. bytes and when you need to encode and decode.
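In this code, that simply means encoding the url-encoded string before handing it to urlopen (using the pastebin_vars and pastebin_url from your post), e.g.:
params = urllib.parse.urlencode(pastebin_vars)  # str
data = params.encode("utf-8")                   # bytes, which urlopen accepts
urllib.request.urlopen(pastebin_url, data).read()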
Next, this line:
urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
is throwing away the response, so you have no way of knowing the result of your API call. Assign it to a variable or return it from your function so you can inspect the value; it will be either a URL to the paste or an error message from the API.
Next, I think your code is sending a lot of unnecessary parameters to the API, and your str_to_paste statements aren't necessary at all.
I was able to make a paste using the following, much simpler, code:
import urllib.parse
import urllib.request
PASTEBIN_KEY = 'my-api-key' # developer api key, required. GET: http://pastebin.com/api
PASTEBIN_URL = 'http://pastebin.com/api/api_post.php'
def post(title, content):  # used for posting a new paste
    pastebin_vars = dict(
        api_option='paste',
        api_dev_key=PASTEBIN_KEY,
        api_paste_name=title,
        api_paste_code=content,
    )
    return urllib.request.urlopen(PASTEBIN_URL, urllib.parse.urlencode(pastebin_vars).encode('utf8')).read()
Here it is in use:
>>> post("test", "hello\nworld.")
b'http://pastebin.com/v8jCkHDB'
I didn't know about Pastebin until now. I read their API docs and tried it for the first time, and it worked perfectly fine.
Here's what I did:
I logged in to fetch the api_user_key.
Included that in the posting along with api_dev_key.
Checked the website, and the post was there.
Here's the code:
import urllib.parse
import urllib.request

def post(url, params):
    data = urllib.parse.urlencode(params).encode("utf-8")
    req = urllib.request.Request(url, data)
    with urllib.request.urlopen(req) as response:
        return response.read()

# Logging in to fetch api_user_key
login_url = "http://pastebin.com/api/api_login.php"
login_params = {"api_dev_key": "<the dev key they gave you>",
                "api_user_name": "<username goes here>",
                "api_user_password": "<password goes here>"}
api_user_key = post(login_url, login_params)

# Posting some random text
post_url = "http://pastebin.com/api/api_post.php"
post_params = {"api_dev_key": "<the dev key they gave you>",
               "api_option": "paste",
               "api_paste_code": "<head>Testing</head>",
               "api_paste_private": "0",
               "api_paste_name": "testing.html",
               "api_paste_expire_date": "10M",
               "api_paste_format": "html5",
               "api_user_key": api_user_key}
response = post(post_url, post_params)
Only the first three parameters are needed for posting something; the rest are optional.
FYI: the API doesn't seem to accept plain HTTP requests as of writing this, so make sure your URLs start with https:// (e.g. https://pastebin.com/...).