Tweepy taking forever to write JSON data - Python

I ran this code last week in a Jupyter Notebook and it worked rather quickly. However, this week it's taking forever (more than an hour) to write the JSON data to a file. The code works, but I'm curious whether the way I've written it is causing it to run slowly.
import tweepy
import json

consumer_key = 'hidden'
consumer_secret = 'hidden'
access_token = 'hidden'
access_secret = 'hidden'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

# wait (and print a notice) whenever Twitter's rate limit is reached
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

# write the queried JSON data into tweet_json.txt, one tweet per line
with open('tweet_json.txt', 'a', encoding='utf8') as f:
    for tweet_id in twitter_archive['tweet_id']:
        try:
            tweet = api.get_status(tweet_id, tweet_mode='extended')  # extended mode returns the full text
            json.dump(tweet._json, f)
            f.write('\n')
        except tweepy.TweepError as e:
            print('error on {}: {}'.format(tweet_id, e))
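One way to tell whether the time is going into Tweepy's rate-limit sleeps rather than the file writes is to time the loop, as the instructor-provided snippet in the related question below does. A minimal sketch wrapped around the loop above:

from timeit import default_timer as timer

start = timer()
# ... the download loop above runs here ...
end = timer()
# With wait_on_rate_limit=True, Tweepy sleeps until the 15-minute request
# window resets whenever the limit is hit, so most of a long elapsed time is
# usually spent in those sleeps rather than in json.dump / f.write.
print(end - start)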

Related

Tweepy API Failing For Every ID

I'm running the code below, which was given to me by an instructor, to grab the status based on the tweet_id column of a dataframe I've already imported. When I run it, everything comes back as "Fail". I don't receive any errors, so I'm not sure what I'm missing. When I requested my Twitter developer access I didn't have to answer a ton of questions like I've seen other people say they had to, so I'm curious whether I just don't have enough access.
import tweepy
from tweepy import OAuthHandler
import json
from timeit import default_timer as timer
# Query Twitter API for each tweet in the Twitter archive and save JSON in a text file
# These are hidden to comply with Twitter's API terms and conditions
consumer_key = 'HIDDEN'
consumer_secret = 'HIDDEN'
access_token = 'HIDDEN'
access_secret = 'HIDDEN'
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
# NOTE TO STUDENT WITH MOBILE VERIFICATION ISSUES:
# df_1 is a DataFrame with the twitter_archive_enhanced.csv file. You may have to
# change line 17 to match the name of your DataFrame with twitter_archive_enhanced.csv
# NOTE TO REVIEWER: this student had mobile verification issues so the following
# Twitter API code was sent to this student from a Udacity instructor
# Tweet IDs for which to gather additional data via Twitter's API
tweet_ids = twitter_archive.tweet_id.values
len(tweet_ids)
# Query Twitter's API for JSON data for each tweet ID in the Twitter archive
count = 0
fails_dict = {}
start = timer()
# Save each tweet's returned JSON as a new line in a .txt file
with open('tweet_json.txt', 'w') as outfile:
    # This loop will likely take 20-30 minutes to run because of Twitter's rate limit
    for tweet_id in tweet_ids:
        count += 1
        print(str(count) + ": " + str(tweet_id))
        try:
            tweet = api.get_status(tweet_id, tweet_mode='extended')
            print("Success")
            json.dump(tweet._json, outfile)
            outfile.write('\n')
        except tweepy.TweepError as e:
            print("Fail")
            fails_dict[tweet_id] = e
            pass
end = timer()
print(end - start)
print(fails_dict)
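Since the loop only prints "Fail", the exceptions it stores in fails_dict are the quickest way to see the actual reason for the failures (for example an authentication or access-level error). A minimal check after the loop, using only what the snippet already collects:

# Look at a few of the stored exceptions to see why every lookup failed
for tweet_id, err in list(fails_dict.items())[:5]:
    print(tweet_id, err)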

Tweepy Geolocation Search Not Returning Results

I've pulled this code together from a few other posts on here focusing on the topic of searching for Tweets in a certain geographical area. Unfortunately, all I receive from this code is a blank spreadsheet. I have tried a few different iterations with additional parameters added to no avail. Is there something I am missing here?
import tweepy
import csv

consumer_key = 'XXXXX'
consumer_secret = 'XXXXX'
access_token = 'XXXXX'
access_token_secret = 'XXXXX'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

search_area = tweepy.Cursor(api.search, count=100, geocode="37.5407,77.4360,5km").items()

output = [[tweet.id_str, tweet.created_at, tweet.text.encode("utf-8"), tweet.favorite_count, tweet.retweet_count,
           tweet.entities.get('hashtags'), tweet.entities.get('user_mentions'), tweet.entities.get('media'),
           tweet.entities.get('urls')] for tweet in search_area]

with open('city_tweets.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(["id", "created_at", "text", "likes", "retweets", "hashtags",
                     "user mentions", "media", "links"])
    writer.writerows(output)
37.5407, 77.4360 is 37°32'26.5"N 77°26'09.6"E, which is in a relatively unpopulated area in Western China, where Twitter is blocked, so it makes sense for there to be no Tweets from there in the past week.
Did you mean 37.5407, -77.4360, which is pretty much the center of Richmond, Virginia?
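If Richmond was the intent, a minimal sketch of the corrected search line (only the sign of the longitude changes) would be:

# Negative longitude puts the point in the western hemisphere (Richmond, VA)
search_area = tweepy.Cursor(api.search, count=100, geocode="37.5407,-77.4360,5km").items()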

Get list of followers and following for group of users tweepy

I was just wondering if anyone knew how to list the usernames that a Twitter user is following, and their followers, in two separate .csv cells.
This is what I have tried so far.
import tweepy
import csv

consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

csvFile = open('ID.csv', 'w')
csvWriter = csv.writer(csvFile)

users = ['AindriasMoynih1', 'Fiona_Kildare', 'daracalleary', 'CowenBarry', 'BillyKelleherTD', 'BrendanSmithTD']

for user_name in users:
    user = api.get_user(screen_name=user_name, count=200)
    csvWriter.writerow([user.screen_name, user.id, user.followers_count, user.followers_id, user.friends_id, user.description.encode('utf-8')])
    print(user.id)

csvFile.close()
Tweepy is a wrapper around the Twitter API.
According to the Twitter API documentation, you'll need to call GET friends/ids to get a list of their friends (the people they follow), and GET followers/ids to get their followers.
Using the wrapper, you'll invoke those API calls indirectly by calling the corresponding method in Tweepy.
Since there will be a lot of results, you should use the Tweepy Cursor to handle scrolling through the pages of results for you.
Try the code below. I'll leave it to you to handle the CSV aspect, and to apply it to multiple users.
import tweepy

access_token = "1234"
access_token_secret = "1234"
consumer_key = "1234"
consumer_secret = "1234"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

for user in tweepy.Cursor(api.get_friends, screen_name="TechCrunch").items():
    print('friend: ' + user.screen_name)

for user in tweepy.Cursor(api.get_followers, screen_name="TechCrunch").items():
    print('follower: ' + user.screen_name)
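To cover the CSV part the answer leaves as an exercise, here is a minimal sketch, assuming the api object and users list from the question, one row per user, and screen names joined with spaces inside the two cells (the follow_lists helper is just for illustration):

import csv
import tweepy

def follow_lists(api, screen_name):
    # Screen names the user follows, and screen names of their followers
    friends = [u.screen_name for u in
               tweepy.Cursor(api.get_friends, screen_name=screen_name).items()]
    followers = [u.screen_name for u in
                 tweepy.Cursor(api.get_followers, screen_name=screen_name).items()]
    return friends, followers

with open('ID.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['user', 'following', 'followers'])
    for user_name in users:
        friends, followers = follow_lists(api, user_name)
        # One cell listing who they follow, one cell listing their followers
        writer.writerow([user_name, ' '.join(friends), ' '.join(followers)])

Both endpoints are paginated and tightly rate-limited, so constructing the API with wait_on_rate_limit=True is worth doing for accounts with many followers.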

Tweepy: Auto Tweet Images From Folder?

New here, first post as well.
I'm currently trying to use Tweepy. I've successfully set it up so far and I'm able to tweet single images, so the code runs fine.
The reason for this is that I run an account that tweets images only, no actual text tweets.
I have a folder of hundreds of images that I go through every day to tweet, and then I found out about Tweepy. Is it possible to tell Tweepy to go into the folder of images and pick one, or any one at random? I've done extensive searching and couldn't find anything at all.
All help is greatly, greatly appreciated!
Here's the code I've got at the moment (python-2).
import tweepy
from time import sleep
consumer_key = 'Removed'
consumer_secret = 'Removed'
access_token = 'Removed'
access_token_secret = 'Removed'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
api.update_with_media('Image')
sleep(900)
print 'Tweeted!'
I'm assuming that you're iterating 100 times, given that you have 100 photos in your dir. I hope you don't mind, I took the liberty of placing your Twitter API instantiation/auth in a function (for reusability's sake :) ). For the getPathsFromDir() function, the idea is adapted from GoToLoop's solution on processing.org (written here with plain os.listdir so it runs outside Processing); you might want to check out that reference for more details. Also, practice placing your api.update[_with_media,_status]() calls in try/except blocks. You never know when an odd exception will be raised by the API. I hope my implementation works for you!
import os
import tweepy
from time import sleep

folderpath = "/path/to/your/directory/"

def tweepy_creds():
    consumer_key = 'Removed'
    consumer_secret = 'Removed'
    access_token = 'Removed'
    access_token_secret = 'Removed'
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)

def getPathsFromDir(folder, exts=('.png', '.jpg', '.jpeg', '.gif', '.tif', '.tiff', '.tga', '.bmp')):
    # return the full path of every image file in the folder
    return [os.path.join(folder, name) for name in sorted(os.listdir(folder))
            if name.lower().endswith(exts)]

def tweet_photos(api):
    imagePaths = getPathsFromDir(folderpath)
    for x in imagePaths:
        status = "tweet text here"
        try:
            api.update_with_media(filename=x, status=status)
            print "Tweeted!"
            sleep(900)
        except Exception as e:
            print "encountered error! error deets: %s" % str(e)
            break

if __name__ == "__main__":
    tweet_photos(tweepy_creds())
/ogs
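If you only want to post a single image per run, chosen at random as the question asks, a small variation on tweet_photos() could look like the sketch below (it reuses folderpath, getPathsFromDir() and tweepy_creds() from above; the function name is just for illustration):

import random

def tweet_random_photo(api):
    imagePaths = getPathsFromDir(folderpath)
    if not imagePaths:
        print "no images found in %s" % folderpath
        return
    # pick one image at random and tweet only that one
    x = random.choice(imagePaths)
    try:
        api.update_with_media(filename=x, status="tweet text here")
        print "Tweeted %s!" % x
    except Exception as e:
        print "encountered error! error deets: %s" % str(e)

Call tweet_random_photo(tweepy_creds()) in place of tweet_photos(...) and schedule the script externally if you want one tweet per interval.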

Using tweepy to get old tweets

I've put together this short script to search Twitter. The majority of the tweets from this search date back to about a year ago; it was in connection with a Kickstarter campaign. When I run this script, though, I only get newer tweets that aren't relevant to that term anymore. Could anybody tell me what I need to do to get it the way I want? When I search for the terms on Twitter itself, it gives me the right results.
import tweepy
import csv

consumer_key = 'x'
consumer_secret = 'x'
access_token = 'x'
access_token_secret = 'x'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Open/create a file to append data
csvFile = open('tweets.csv', 'a')
# Use csv writer
csvWriter = csv.writer(csvFile)

api = tweepy.API(auth)
results = api.search(q="kickstarter campaign")

for result in results:
    csvWriter.writerow([result.created_at, result.text.encode('utf-8')])
    print result.created_at, result.text

csvFile.close()
