There's an issue with my code: no matter what I try, every time I reply to a tweet it just posts as a regular status update on my timeline.
Here is a snippet of the code:
class StreamListener(tweepy.StreamListener):
    def on_status(self, status):
        tweetid = status.id
        tweetnouser = status.text.replace("#CarlWheezerBot", "")
        username = '@' + status.user.screen_name
        user_tweet = gTTS(text=tweetnouser, lang='en', slow=False)
        # Saving the converted audio
        user_tweet.save("useraudio/text2speech.mp3")
        # importing the audio and getting the audio all mashed up
        text2speech = AudioFileClip("useraudio/text2speech.mp3")
        videoclip = VideoFileClip("original_video/original_cut.mp4")
        editedAudio = videoclip.audio
        # splicing the original audio with the text2speech
        compiledAudio = CompositeAudioClip([editedAudio.set_duration(3.8), text2speech.set_start(3.8)])
        videoclip.audio = compiledAudio
        # saving the completed video file
        videoclip.write_videofile("user_video/edited.mp4", audio_codec='aac')
        upload_result = api.media_upload("user_video/edited.mp4")
        api.update_status(status='#CarlWheezerBot', in_reply_to_status_id=[tweetid], media_ids=[upload_result.media_id_string], auto_populate_reply_metadata=True)
I have also tried it without any status, as well as using status.id_str. Nothing seems to work; I have tried it without the metadata parameter as well. I am following the documentation word for word.
OKAY, for everyone reading this in the future:
use in_reply_to_status_id=tweetid
Do not use the square brackets. Everything works perfectly now.
While playing around with it, I also noticed that you should also mention the author of the tweet you're replying to, especially if you're replying to an existing reply, because otherwise it will still post as a status update. Line from the documentation:
in_reply_to_status_id – The ID of an existing status that the update is in reply to. Note: This parameter will be ignored unless the author of the Tweet this parameter references is mentioned within the status text. Therefore, you must include @username, where username is the author of the referenced Tweet, within the update.
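Putting the fix together, here is a minimal sketch of the corrected call, reusing the variables from the snippet above (username holds '@' plus the author's screen name):
api.update_status(
    status=username + ' #CarlWheezerBot',  # mention the author so the reply is threaded
    in_reply_to_status_id=tweetid,  # pass the ID itself, NOT [tweetid]
    media_ids=[upload_result.media_id_string],
    auto_populate_reply_metadata=True,
)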
I am using the Twitter API's StreamingClient from the Python module Tweepy. I am currently doing a short stream where I collect tweets, save the entire ID and text from each tweet inside a JSON object, and write it to a file.
My goal is to be able to collect the Twitter handle from each specific tweet and save it to a json file (preferably print it in the output terminal as well).
This is what the current code looks like:
import json
import time

import tweepy

KEY_FILE = './keys/bearer_token'
DURATION = 10

def on_data(json_data):
    json_obj = json.loads(json_data.decode())
    # print('Received tweet:', json_obj)
    print(f'Tweet Screen Name: {json_obj.user.screen_name}')
    with open('./collected_tweets/tweets.json', 'a') as out:
        json.dump(json_obj, out)

bearer_token = open(KEY_FILE).read().strip()
streaming_client = tweepy.StreamingClient(bearer_token)
streaming_client.on_data = on_data
streaming_client.sample(threaded=True)
time.sleep(DURATION)
streaming_client.disconnect()
I have no idea how to do this; the only thing I found is that someone did this:
json_obj.user.screen_name
However, this did not work at all, and I am completely stuck.
So, a couple of things:
Firstly, I'd recommend using on_response rather than on_data, because StreamingClient already defines an on_data function that parses the JSON (it then fires on_tweet, on_response, on_error, etc.).
Secondly, json_obj.user.screen_name is part of API v1, I believe, which is why it doesn't work.
To get extra data using Twitter API v2, you'll want to use Expansions and Fields (Tweepy Documentation, Twitter Documentation).
For your case, you'll probably want to use "username", which is under user_fields.
def on_response(response: tweepy.StreamResponse):
    tweet: tweepy.Tweet = response.data
    users: list = response.includes.get("users")
    # response.includes is a dictionary holding all the requested fields
    # (user_fields, media_fields, etc.)
    # response.includes["users"] is a list of `tweepy.User`
    # the first user in the list is the author (at least from what I've tested)
    # the rest of the users in that list are anyone who is mentioned in the tweet
    author_username = users and users[0].username
    print(tweet.text, author_username)

streaming_client = tweepy.StreamingClient(bearer_token)
streaming_client.on_response = on_response
streaming_client.sample(threaded=True, expansions="author_id", user_fields=["id", "name", "username"])  # user_fields is only populated when the author_id expansion is requested
time.sleep(DURATION)
streaming_client.disconnect()
Hope this helped. (Also, the Tweepy documentation definitely needs more examples for API v2.)
Here is an on_data-based version using expansions and fields:
import json
import time

import tweepy

KEY_FILE = './keys/bearer_token'
DURATION = 10

def on_data(json_data):
    json_obj = json.loads(json_data.decode())
    print('Received tweet:', json_obj)
    with open('./collected_tweets/tweets.json', 'a') as out:
        json.dump(json_obj, out)

def on_finish(response):
    # called once the stream has been closed
    print('Stream closed:', response)

bearer_token = open(KEY_FILE).read().strip()
streaming_client = tweepy.StreamingClient(bearer_token)
streaming_client.on_data = on_data
streaming_client.on_closed = on_finish
streaming_client.sample(threaded=True, expansions="author_id", user_fields="username", tweet_fields="created_at")
time.sleep(DURATION)
streaming_client.disconnect()
I am completely stuck. While dabbling in Reddit's API (aka PRAW), I wanted to learn how to save the number-one hottest post as an mp4. However, Reddit saves all of its gifs on Imgur, which converts all gifs to gifv. How would I go about converting the gifv to mp4 so I can read them? By the way, simply renaming the file seems to lead to corruption.
This is my code so far (details have been xxxx'd for confidentiality):
import praw
import requests

reddit = praw.Reddit(client_id="xxxx", client_secret="xxxx", username="xxxx", password="xxxx", user_agent="xxxx")

subreddit = reddit.subreddit("dankmemes")
hot_dm = subreddit.hot(limit=1)

for sub in hot_dm:
    print(sub)
    url = sub.url
    print(url)
    print(sub.permalink)
    meme = requests.get(url)
    newF = open("{}.mp4".format(sub), "wb")  # here the file is created but when played is corrupted
    newF.write(meme.content)
    newF.close()
Some posts already have an mp4 conversion inside the preview > variants portion of the JSON response.
Therefore, to download only those posts that have a gif (and therefore an mp4 version), you could do something like this:
subreddit = reddit.subreddit("dankmemes")
hot_dm = subreddit.hot(limit=10)

for sub in hot_dm:
    if sub.selftext != "":  # skip self (text) posts; we want links to some content (image/video/link)
        continue
    try:  # try to access variants and catch the exception thrown
        has_variants = sub.preview['images'][0]['variants']  # variants contain both gif and mp4 versions (if available)
    except AttributeError:
        continue  # no conversion available as variants doesn't exist
    if 'mp4' not in has_variants:  # check that there is an mp4 conversion available
        continue
    mp4_video = has_variants['mp4']['source']['url']
    print(sub, sub.url, sub.permalink)
    meme = requests.get(mp4_video)
    with open(f"{sub}.mp4", "wb") as newF:
        newF.write(meme.content)
You will most likely want to increase the limit of posts you look through when searching through hot, as the first post may be a pinned post (usually some rules about the subreddit); this is why I initially checked the selftext. In addition, there may be other posts that are only images, so with a small limit you might not return any posts that can be converted to mp4s.
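As an alternative to the selftext check, PRAW submissions also expose a stickied attribute; here is a minimal sketch (reusing the reddit instance from above) that skips pinned posts directly:
for sub in reddit.subreddit("dankmemes").hot(limit=10):
    if sub.stickied:  # pinned rules/announcement posts
        continue
    print(sub, sub.url)  # apply the same variants check as above here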
F.W. This isn't just a PRAW question; it leans toward Python more than PRAW. Python people are welcome to contribute, and please note that English is not my mother language xD!
Essentially, I'm writing a Reddit bot using the PRAW that does the following:
Loop through "unsaved" posts
Loop through the comments of said posts (targeting subcomments)
If the comment contains "!completed", is written by the submitter OR by a moderator, and the parent comment is not by the submitter:
Do etc., e.g. print("Hey")
I didn't explain that too well; examples are better, so here xD:
Use cases:
- Post by @dudeOne
  - Comment by @dudeTwo
    - Comment with "!completed" by @dudeOne
- Post by @dudeOne
  - Comment by @dudeTwo
    - Comment with "!completed" by @moderatorOne
print("Hey"), and:
- Post by @dudeOne
  - Comment by @dudeOne
    - Comment with "!completed" by @dudeOne
... does nothing, maybe even removes + messages @dudeOne.
Here's my messy code (xD):
import praw
import os
import re

sub = "RedditsQuests"

client_id = os.environ.get('client_id')
client_secret = os.environ.get('client_secret')
password = os.environ.get('pass')

reddit = praw.Reddit(client_id=client_id,
                     client_secret=client_secret,
                     password=password,
                     user_agent='r/RedditsQuests bot',
                     username='TheQuestMaster')

for submission in reddit.subreddit(sub).new(limit=None):
    submission.comments.replace_more(limit=None)
    if submission.saved is False:
        for comment in submission.comments.list():
            if ((("!completed" in comment.body)) and ((comment.is_submitter) or ('RedditsQuests' in comment.author.moderated())) and (comment.parent().author.name is not submission.author.name)):
                print("etc...")
There's a decently-sized stack trace, so I've added it to this bin for your reference. To me it looks like PRAW is timing out because the if-in-for loop is taking too long. I could be wrong though!
The issue (as you've said) is somewhat sporadic, but I've narrowed it down. As it turns out, trying to fetch the subreddits moderated by /u/AutoModerator will sometimes time out (presumably because the list is long).
Figuring out the issue
Here's how I found the issue. Skip this section if you're only interested in the solution.
First, I modified your script to use try and except to catch the exception when it happened. Your traceback told me that it was happening on the line that starts with if ((("!completed" in comment.body)), specifically when fetching the subreddits that a user moderates. Here was my modified script:
for submission in reddit.subreddit(sub).new(limit=None):
    submission.comments.replace_more(limit=None)
    if submission.saved is False:
        for comment in submission.comments.list():
            try:
                if (
                    (("!completed" in comment.body))
                    and (
                        (comment.is_submitter)
                        or ("RedditsQuests" in comment.author.moderated())
                    )
                    and (comment.parent().author.name is not submission.author.name)
                ):
                    print("etc...")
            except Exception:
                print(f'Author: {comment.author} ({type(comment.author)})')
And the output:
etc...
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
etc...
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
etc...
etc...
etc...
etc...
etc...
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
etc...
With this in mind, I wrote a very simple three-line script to reproduce the issue:
import praw
reddit = praw.Reddit(...)
print(reddit.redditor("AutoModerator").moderated())
Sometimes this script would succeed but sometimes it would fail with the same socket read timeout. Presumably the timeout happens because AutoModerator moderates so many subreddits (at least 10,000), and the Reddit API takes too long to process the request.
Fixing the issue
Your script tries to determine whether the redditor in question is a moderator of the subreddit. You're doing this by checking if the subreddit is in the list of the user's moderated subreddits, but you can switch this to checking if the user is in the list of the subreddit's moderators. Not only should this not time out, but you'll be saving a lot of network requests because you can just fetch the list of moderators once.
The PRAW documentation of Subreddit shows how we can get a list of moderators of a subreddit. In your case, we can do
moderators = list(reddit.subreddit(sub).moderator())
Then, instead of checking "RedditsQuests" in comment.author.moderated(), we check
comment.author in moderators
Your code then becomes
import praw
import os
import re

sub = "RedditsQuests"

client_id = os.environ.get("client_id")
client_secret = os.environ.get("client_secret")
password = os.environ.get("pass")

reddit = praw.Reddit(
    client_id=client_id,
    client_secret=client_secret,
    password=password,
    user_agent="r/RedditsQuests bot",
    username="TheQuestMaster",
)

moderators = list(reddit.subreddit(sub).moderator())

for submission in reddit.subreddit(sub).new(limit=None):
    submission.comments.replace_more(limit=None)
    if submission.saved is False:
        for comment in submission.comments.list():
            if (
                (("!completed" in comment.body))
                and ((comment.is_submitter) or (comment.author in moderators))
                and (comment.parent().author.name != submission.author.name)  # != rather than `is not`: compare string values, not identity
            ):
                print("etc...")
In my brief testing, this script runs many times faster, since we only fetch the list of moderators once rather than fetching all the subreddits moderated by every user who commented.
As an unrelated style note, instead of if submission.saved is False you should write if not submission.saved, which is the conventional way to check that a condition is false. Similarly, comment.parent().author.name is not submission.author.name compares string identity rather than equality; use != instead (as in the revised code above).
I'm trying to write a program that will stream tweets from Twitter using their Stream API and Tweepy. Here's the relevant part of my code:
def on_data(self, data):
    if data.user.id == "25073877" or data.in_reply_to_user_id == "25073877":
        self.filename = 'trump.csv'
    elif data.user.id == "30354991" or data.in_reply_to_user_id == "30354991":
        self.filename = 'harris.csv'
    if 'RT @' not in data.text:
        csvFile = open(self.filename, 'a')
        csvWriter = csv.writer(csvFile)
        print(data.text)
        try:
            csvWriter.writerow([data.text, data.created_at, data.user.id, data.user.screen_name, data.in_reply_to_status_id])
        except:
            pass

def on_error(self, status_code):
    if status_code == 420:
        return False
What the code should be doing is streaming the tweets and writing the text of the tweet, the creation date, the user ID of the tweeter, their screen name, and the reply ID of the status they're replying to if the tweet is a reply. However, I get the following error:
File "test.py", line 13, in on_data
if data.user.id == "25073877" or data.in_reply_to_user_id == "25073877":
AttributeError: 'unicode' object has no attribute 'user'
Could someone help me out? Thanks!
EDIT: Sample of what is being read into "data"
{"created_at":"Fri Feb 15 20:50:46 +0000 2019","id":1096512164347760651,"id_str":"1096512164347760651","text":"#realDonaldTrump \nhttps:\/\/t.co\/NPwSuJ6V2M","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":25073877,"in_reply_to_user_id_str":"25073877","in_reply_to_screen_name":"realDonaldTrump","user":{"id":1050189031743598592,"id_str":"1050189031743598592","name":"Lauren","screen_name":"switcherooskido","location":"United States","url":null,"description":"Concerned citizen of the USA who would like to see Integrity restored in the US Government. Anti-marxist!\nSigma, INTP\/J\nREJECT PC and Identity Politics #WWG1WGA","translator_type":"none","protected":false,"verified":false,"followers_count":1459,"friends_count":1906,"listed_count":0,"favourites_count":5311,"statuses_count":8946,"created_at":"Thu Oct 11 00:59:11 +0000 2018","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"FF691F","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/1068591478329495558\/ng_tNAXx_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/1068591478329495558\/ng_tNAXx_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/1050189031743598592\/1541441602","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/NPwSuJ6V2M","expanded_url":"https:\/\/www.conservativereview.com\/news\/5-insane-provisions-amnesty-omnibus-bill\/","display_url":"conservativereview.com\/news\/5-insane-\u2026","indices":[18,41]}],"user_mentions":[{"screen_name":"realDonaldTrump","name":"Donald J. Trump","id":25073877,"id_str":"25073877","indices":[0,16]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"und","timestamp_ms":"1550263846848"}
So I suppose the revised question is how to tell the program to only write parts of this JSON output to the CSV file? I've been using the references Twitter's Stream API provides for the attributes of "data".
As stated in your comment, the tweet data is in "JSON format". I believe what you mean by this is that it is a string (unicode) in JSON format, not a parsed JSON object. In order to access the fields like you want to in your code, you need to parse the data string using json.
e.g.
import json
json_data_object = json.loads(data)
you can then access the fields like you would a dictionary e.g.
json_data_object['some_key']['some_other_key']
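Applied to the handler in the question, here is a minimal sketch (dictionary keys taken from the sample payload above; the CSV columns are shortened for illustration):
import csv
import json

def on_data(self, data):
    tweet = json.loads(data)  # parse the JSON string into a dict
    if tweet['user']['id_str'] == "25073877" or tweet['in_reply_to_user_id_str'] == "25073877":
        self.filename = 'trump.csv'
    if not tweet['text'].startswith('RT @'):
        with open(self.filename, 'a') as csvFile:
            csv.writer(csvFile).writerow([tweet['text'], tweet['created_at'], tweet['user']['id_str']])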
This is a very late answer, but I'm answering here because this is the first search hit for this error. I was also using Tweepy and found that the JSON response object had attributes that could not be accessed:
'Response' object has no attribute 'text'
Through lots of tinkering and research, I found that when you access the Twitter API using Tweepy, you must specify .data on the Response object you loop over, not on the items inside the loop.
For example:
tweets = client.search_recent_tweets(query="covid", tweet_fields=['text'])
for tweet in tweets:
    print(tweet.text)  # or print(tweet.data.text)
Will not work because the Response variable doesn't have access to the attributes within the JSON response object. Instead, you do something like:
tweets = client.search_recent_tweets(query="covid", tweet_fields=['text'])
for tweet in tweets.data:
    print(tweet.text)
Basically, this was a long-winded way to fix a problem I was having for a long time. Cheers! Hopefully other noobs like me won't have to struggle as long as I did!
I am trying to view another user's tweets. The other user is following me and I am following the user on Twitter. But when I try this, I only see my own tweets, no matter what name I enter as the argument for GetUserTimeline.
What should I do?
import twitter

api = twitter.Api(consumer_key='', consumer_secret='', access_token_key='', access_token_secret='')
statuses = api.GetUserTimeline('chooimooi')
for tweet in statuses:
    print(tweet)
Also, how can I export this data to a text file?
Take a look at pydoc for twitter.Api.GetUserTimeline
pydoc twitter.Api.GetUserTimeline
which states:
twitter.Api.GetUserTimeline = GetUserTimeline(self, user_id=None, screen_name=None,
since_id=None, max_id=None, count=None, include_rts=True, trim_user=None,
exclude_replies=None) unbound twitter.Api method
I think, therefore, that passing screen_name='usernamerequired' will work. For example:
statuses = api.GetUserTimeline(screen_name='chooimooi')
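For the second part of the question (exporting to a text file), here is a minimal sketch; tweets.txt is just an assumed output path, and each status returned by python-twitter exposes a .text attribute:
statuses = api.GetUserTimeline(screen_name='chooimooi')
with open('tweets.txt', 'w') as f:  # assumed output file name
    for tweet in statuses:
        f.write(tweet.text + '\n')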