I am trying to gather a list of the people I follow, then reverse that list and unfollow the first 50. I have seen similar answers on how to gather the list of all friends, but I am still stuck and unsure whether the documentation has changed, since those questions are a little old. I am getting Twitter error response: status code = 431.
Below is the relevant code:
import logging
import time

import tweepy

from config import *

logger = logging.getLogger(__name__)

api = initialize_api()

# collect the IDs of everyone I follow, one page at a time
ids = []
for page in tweepy.Cursor(api.friends_ids, screen_name="xxxxxxx").pages():
    ids.extend(page)
    time.sleep(60)

# resolve the IDs to screen names, oldest follow first
screen_names = [user.screen_name for user in api.lookup_users(user_ids=ids)]
screen_names.reverse()

logger.info("Starting Unfollow")
i = 0
while i < 50:
    api.destroy_friendship(screen_names[i])
    i += 1
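From the error name (431 is "Request Header Fields Too Large") I suspect the problem is passing the entire ids list to lookup_users in one call, since that endpoint appears to accept at most 100 user IDs per request. This is an untested sketch of the batching I have in mind:

# Untested sketch: look up users in batches of 100, since lookup_users
# appears to cap each request at 100 user IDs; a huge single request
# would explain the 431 response.
screen_names = []
for start in range(0, len(ids), 100):
    batch = ids[start:start + 100]
    screen_names.extend(user.screen_name for user in api.lookup_users(user_ids=batch))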
I am using instaloader to scrape Instagram posts as part of a study project.
To avoid getting shut down by Instagram, I use the sleep function to pause between 1 and 20 seconds between each round. This works well.
I don't want to go through all posts each time I scrape, so I want the loop to run 5 times, which will give me 5 posts. But I can't seem to get it to do that.
I wrote the following function to try to scrape the profile and return the first 5 posts:
## importing and creating instance
from instaloader import Instaloader
from instaloader import Profile
import instaloader
import time
from random import randint

L = instaloader.Instaloader()

# random time for sleep
vent = randint(1, 20)

# function:
def get2posts(profile_name):
    profile = Profile.from_username(L.context, profile_name)
    POSTS = profile.get_posts()
    for post in POSTS:
        for i in range(2):
            L.download_post(post, profile_name)
            time.sleep(vent)
        break
    print('scrape done')
This code returns 5 of the same post, though, and I simply can't figure out a way to get it to return the first 5 posts of an account.
The working function, which harvests all posts of a profile, is:
# the original function (without range)
def get_posts(profile_name):
    profile = Profile.from_username(L.context, profile_name)
    POSTS = profile.get_posts()
    for post in POSTS:
        L.download_post(post, profile_name)
        time.sleep(vent)
    print('I am done')
Hope you can help :)
The problem is that the inner for loop runs download_post twice (range(2)) on the same post, and then the outer loop breaks. If POSTS is a list, you can use slicing to loop over only the first 5 items, like so: for post in POSTS[:5]:. A safer method, though, is to count the posts as you go, which works for most types of iterables (not just lists), like so:
def get2posts(profile_name):
    profile = Profile.from_username(L.context, profile_name)
    POSTS = profile.get_posts()
    for i, post in enumerate(POSTS):
        L.download_post(post, profile_name)
        if i == 4:
            break
        time.sleep(vent)
    print('scrape done')
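If you'd rather not manage the counter yourself, itertools.islice from the standard library caps any iterable (not just lists) the same way; this variant should be equivalent:

from itertools import islice

def get2posts(profile_name):
    profile = Profile.from_username(L.context, profile_name)
    # islice stops the iteration after the first 5 posts,
    # whatever kind of iterable get_posts() returns
    for post in islice(profile.get_posts(), 5):
        L.download_post(post, profile_name)
        time.sleep(vent)
    print('scrape done')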
I'm using Spotipy and LyricsGenius to open lyrics in a web browser from a terminal.
I can open a URL for one song, but I have to rerun the script for each new song. What are some ways to detect the end of a song using Spotipy?
import spotipy
import webbrowser
import lyricsgenius as lg
...

# Create our spotifyObject
spotifyObject = spotipy.Spotify(auth=token)

# Create our geniusObject
geniusObject = lg.Genius(access_token)
...

while True:
    currently_playing = spotifyObject.currently_playing()
    artist = currently_playing['item']['artists'][0]['name']
    title = currently_playing['item']['name']
    search_query = artist + " " + title
    # if (currently_playing has changed):
    song = geniusObject.search_songs(search_query)
    song_url = song['hits'][0]['result']['url']
    webbrowser.open(song_url)
I was reading relevant threads such as this and this, and read through the documentation, but could not find an answer to whether this could be handled by Spotipy. I would appreciate any suggestions, thank you.
I used time.sleep(length), with the argument length standing for the remaining duration of the current track.
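In outline (a minimal sketch, assuming the currently_playing() payload exposes progress_ms and the track's duration_ms, which it did for me):

import time

while True:
    currently_playing = spotifyObject.currently_playing()
    duration_ms = currently_playing['item']['duration_ms']
    progress_ms = currently_playing['progress_ms']

    # ... look up and open the lyrics for the current track here ...

    # sleep until the current track should be over, then loop
    # around and fetch whatever is playing next
    length = (duration_ms - progress_ms) / 1000
    time.sleep(length)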
I have the following Python program that reads in a list of 1,390,680 URLs of Instagram accounts and gets the follower count for each user. It utilizes instaloader. Here's the code:
import pandas as pd
from instaloader import Instaloader, Profile

# 1. Loading in the data
# Reading the data from the csv
data = pd.read_csv('IG_Audience.csv')

# Getting the profile urls
urls = data['Profile URL']

def getFollowerCount(PROFILE):
    # using the instaloader module to get follower counts from this programmer
    # https://stackoverflow.com/questions/52225334/webscraping-instagram-follower-count-beautifulsoup
    try:
        L = Instaloader()
        profile = Profile.from_username(L.context, PROFILE)
        print(PROFILE, 'has', profile.followers, 'followers')
        return profile.followers
    except Exception as exception:
        print(exception, False)
        return 0

# Follower count List
followerCounts = []

# This loop will fetch the follower count for each user
for url in urls:
    # Getting the profile username from the URL by removing the instagram.com
    # portion and the backslash at the end of the url
    url_dirty = url.replace('https://www.instagram.com/', '')
    url_clean = url_dirty[:-1]
    followerCounts.append(getFollowerCount(url_clean))

# Converting the list to a series, adding it to the dataframe, and writing it to
# a csv
data['Follower Count'] = pd.Series(followerCounts)
data.to_csv('IG_Audience.csv')
The main issue I have with this is that it is taking a very long time to read through the entire list. It took 14 hours just to get the follower counts for 3035 users. Is there any way to speed up this process?
First, I want to say I'm sorry for being VERY late, but hopefully this can help someone in the future. I'm having a similar issue, and I believe I found out why: when you ask for the follower count, instaloader doesn't just go to the profile's page and read the number. It fetches the URL and profile ID of each follower, and can only get so many at a time. The best way I can think of to get around this would be to request the profile page directly and just read the follower count from it. The issue with this, however, is that above 9,999 followers the page starts saying "10k" or "10.1k", so you'll be off by up to 100, and it only gets worse if the person has over a million followers, because then it's off by even more.
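A rough sketch of what I mean, assuming Instagram still puts the (rounded) count in the page's og:description meta tag; the URL pattern and regex here are assumptions and may break whenever Instagram changes its markup:

import re
import requests

def rough_follower_count(username):
    # Assumption: the profile page contains something like
    # <meta property="og:description" content="10.1k Followers, ...">
    # This is NOT a stable API, and the count is rounded above 9,999.
    html = requests.get(f'https://www.instagram.com/{username}/').text
    match = re.search(r'content="([\d.,kKmM]+) Followers', html)
    return match.group(1) if match else None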
I am new to the Facebook API. Currently, I am trying to print out ALL the comments that have been posted on the Facebook page 'leehsienloong'. However, I can only print out a total of 700+ comments. I'm sure there are more than that.
I found out that the problem is that I am not requesting the next page of comments. I read about paging in the Facebook API, but I still do not understand how to write the code for it.
Is there anyone out there who can help? I really need it. Thank you.
Here is my code, without paging:
import facebook  # sudo pip install facebook-sdk
import itertools
import json
import re
import requests

access_token = "XXX"
user = 'leehsienloong'

graph = facebook.GraphAPI(access_token)
profile = graph.get_object(user)
posts = graph.get_connections(profile['id'], 'posts')

Jstr = json.dumps(posts)
JDict = json.loads(Jstr)

count = 0
for i in JDict['data']:
    allID = i['id']
    try:
        allComments = i['comments']
        for a in allComments['data']:
            count += 1
            print(a['message'])
    except UnicodeEncodeError:
        pass

print(count)
You can use the limit parameter to increase the number of comments to be fetched. The default is 25. You can increase it like this:
posts = graph.get_connections(profile['id'], 'posts', limit=100)
But a more convenient way would be to get the previous and next pages from paging and make multiple requests.
To get all the comments of a post, the logic should be something like this:
comments = []
for post in posts["data"]:
    post_comments = graph.get_connections(id=post["id"], connection_name="comments")
    comments.extend(post_comments["data"])
    while True:
        try:
            # follow the "next" link until there are no more pages
            post_comments = requests.get(post_comments["paging"]["next"]).json()
            comments.extend(post_comments["data"])
        except KeyError:
            break
I am trying to create a little Python script to follow Twitter user IDs from a text file (one per line, in numeric format, e.g. 217275660, 30921943, etc.). I took a look at this answer on Stack Exchange to write the code below using the try/except approach, but I am getting the error "NameError: name 'TwitterError' is not defined"...
Does anyone know how to clear this issue up and fix the code? I feel like it should be pretty simple, but I haven't used the Twitter API before.
# Script to follow Twitter users from text file containing user IDs (one per line)
# Header stuff I've just thrown in from another script to authenticate
import json
import time
import tweepy
import pprint
from tweepy.parsers import RawParser
from auth import TwitterAuth
from datetime import datetime

auth = tweepy.OAuthHandler(TwitterAuth.consumer_key, TwitterAuth.consumer_secret)
auth.set_access_token(TwitterAuth.access_token, TwitterAuth.access_token_secret)

rawParser = RawParser()
api = tweepy.API(auth_handler=auth, parser=rawParser)

# Follow everyone from list?!
with open('to_follow.txt') as f:
    for line in f:
        try:
            api.CreateFriendship(userID)
        except TwitterError:
            continue

print("Done.")
That may be because tweepy throws errors of type TweepError, so you need to catch TweepError instead of TwitterError:
from tweepy import TweepError

for line in f:
    try:
        # tweepy's method is create_friendship; pass the ID read from the file
        api.create_friendship(user_id=line.strip())
    except TweepError:
        continue
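Side note, depending on which version you have installed: tweepy 4.x renamed the base exception, so if TweepError itself raises a NameError, try the newer name:

# tweepy >= 4.0 renamed the base exception
from tweepy.errors import TweepyException

for line in f:
    try:
        api.create_friendship(user_id=line.strip())
    except TweepyException:
        continue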