I want to be able to use urllib2.urlopen() or requests.get() with http://plus.google.com/* URLs.
Using Python, how would I go about doing that? I need to log in first, but how?
The following code returns something along the lines of:
"Your browser's cookie functionality is turned off. Please turn it on."
The cookie itself is created, and I checked robots.txt: there are no disallows. I also tried switching user agents, with no luck.
cookie_filename = "google.cookie"
email = raw_input("Enter your Google username: ")
password = getpass.getpass("Enter your password: ")
self.cj = cookielib.MozillaCookieJar(cookie_filename)
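self.opener = urllib2.build_opener(
    urllib2.HTTPHandler(debuglevel = 0),
    urllib2.HTTPSHandler(debuglevel = 0),
    # Without a cookie processor the opener never sends self.cj back
    urllib2.HTTPCookieProcessor(self.cj)
)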
login_page_url = 'https://www.google.com/accounts/ServiceLogin?passive=true&service=grandcentral'
authenticate_url = 'https://www.google.com/accounts/ServiceLoginAuth?service=grandcentral'
gv_home_page_url = 'https://www.google.com/voice/#inbox'
# Load sign in page
login_page_contents = self.opener.open(login_page_url).read()
# Find GALX value
galx_match_obj = re.search(r'name="GALX"\s*value="([^"]+)"', login_page_contents, re.IGNORECASE)
galx_value = galx_match_obj.group(1) if galx_match_obj is not None else ''
# Set up login credentials
login_params = urllib.urlencode( {
    'Email' : email,
    'Passwd' : password,
    'continue' : 'https://www.google.com/voice/account/signin',
    'GALX': galx_value
})
# Login
resp = self.opener.open(authenticate_url, login_params).readlines()
print resp
self.opener.open(authenticate_url, login_params).readlines()
# Open GV home page
gv_home_page_contents = self.opener.open(gv_home_page_url).read()
print gv_home_page_contents
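The GALX lookup above can be sketched on its own, without touching the network. This is a minimal sketch in Python 3 (the post itself is Python 2), and the helper name `extract_galx` is made up for illustration. It only shows the guard: test the match object itself before calling `group(1)`, since `re.search` returns None when the field is absent.

```python
import re

# Same pattern as in the question, compiled once
GALX_RE = re.compile(r'name="GALX"\s*value="([^"]+)"', re.IGNORECASE)


def extract_galx(html):
    """Return the GALX form value, or '' if the page has no such field."""
    match = GALX_RE.search(html)
    # Check the match object, not match.group(1): search() may return None
    return match.group(1) if match is not None else ''


print(extract_galx('<input type="hidden" name="GALX" value="abc123">'))
print(repr(extract_galx('<html></html>')))
```

If the second call returns an empty string for the real login page, the form layout has probably changed and the pattern needs another look.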
The website I need to log in to has an initial page where I enter only the username, and a second page where I enter only the password; the username field is still displayed there, with the username filled in.
I post the username and password as such:
session = requests.session()
prelogin_url = ''
result = session.get(prelogin_url)
payload = {'j_username': 'username'}
result = session.post(
    prelogin_url,
    data = payload
)
# page 2
login_url = ''
result = session.get(login_url)
password = {'j_password': 'password'}
result = session.post(
    login_url,
    data = password
)
When I check the content, it shows me that the login failed. There are no tokens.
What am I missing here?
I'm trying to scrape username and media information for a list of Instagram users, using the unofficial Instagram-API with Python.
The library is there
I understand how to scrape information for the user that is logged in, but I can't work out how to refer to another username.
This is the code for fetching my own Instagram information:
from InstagramAPI import InstagramAPI
import time
username = 'myUser'
pwd = 'mypass'
API = InstagramAPI(username,pwd)
pk = API.LastJson['user']['pk']
maxid = ''
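while True:
    API.getUserFeed(pk, maxid)
    feed = API.LastJson
    if 'fail' in feed['status']:
        break
    for i in range(0, len(feed['items']) - 1):
        mediadata = feed['items'][i]
        # 'like_count' and 'comment_count' are assumed field names
        print("Media number: " + str(i))
        print("Like count: " + str(mediadata['like_count']))
        print("Comment count: " + str(mediadata['comment_count']))
        if feed['items'][i]['caption'] is None:
            print("Caption: [No Caption available]")
        else:
            caption = mediadata['caption']['text']
            if len(caption) > 30:
                caption = caption[:30] + ' (...)'
            print("Caption: " + caption)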
I solved it by calling
API.searchUsername('IG_USERNAME') before this line: pk = API.LastJson['user']['pk'].
It works for me.
Enjoy :D
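Once the feed of the looked-up user is in hand, the per-item reporting can be sketched without the API at all. This is a minimal sketch in Python 3, assuming each feed item is a dict with `like_count`, `comment_count`, and an optional `caption` dict holding `text`. Those field names and the helper `summarize_items` are assumptions for illustration, not something the posts confirm.

```python
def summarize_items(items, max_caption=30):
    """Build one summary line per feed item (field names are assumptions)."""
    lines = []
    # enumerate() visits every item, including the last one
    for number, media in enumerate(items):
        caption = media.get('caption')
        text = caption['text'] if caption else '[No Caption available]'
        if len(text) > max_caption:
            text = text[:max_caption] + ' (...)'
        lines.append(
            "Media number: {} | Like count: {} | Comment count: {} | Caption: {}".format(
                number, media.get('like_count'), media.get('comment_count'), text
            )
        )
    return lines


# Hand-written sample data standing in for feed['items']
sample_items = [
    {'like_count': 3, 'comment_count': 1, 'caption': {'text': 'x' * 40}},
    {'like_count': 0, 'comment_count': 0, 'caption': None},
]
for line in summarize_items(sample_items):
    print(line)
```

Note that `range(0, len(items) - 1)` stops one item short, which is why the sketch iterates with `enumerate` instead.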
I'm trying to write a script that does the following:
obtains a list of album (photoset) IDs from my Flickr account
lists the image titles from each album (photoset) in a text file named after the album title
Here's what I have so far:
import flickrapi
from xml.etree import ElementTree
api_key = 'xxxx'
api_secret = 'xxxx'
flickr = flickrapi.FlickrAPI(api_key, api_secret)
(token, frob) = flickr.get_token_part_one(perms='write')
if not token: raw_input("Press ENTER after you authorized this program")
flickr.get_token_part_two((token, frob))
sets = flickr.photosets_getList(user_id='xxxx')
for elm in sets.getchildren()[0]:
    title = elm.getchildren()[0].text
    print ("id: %s setname: %s photos: %s") %(elm.get('id'), title, elm.get('photos'))
The above simply outputs the result to the screen like this:
id: 12453463463252553 setname: 2006-08 photos: 371
id: 23523523523532523 setname: 2006-07 photos: 507
id: 53253253253255532 setname: 2006-06 photos: 20
... etc ...
From there, I've got the following which I assumed would list all the image titles in the above album:
import flickrapi
from xml.etree import ElementTree
api_key = 'xxxx'
api_secret = 'xxxx'
flickr = flickrapi.FlickrAPI(api_key, api_secret)
(token, frob) = flickr.get_token_part_one(perms='write')
if not token: raw_input("Press ENTER after you authorized this program")
flickr.get_token_part_two((token, frob))
photos = flickr.photosets_getPhotos(photoset_id='12453463463252553')
for elm in photos.getchildren()[0]:
    title = elm.getchildren()[0].text
    print ("%s") %(elm.get('title'))
Unfortunately, it just fails with an "index out of range" error.
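A likely cause, assuming the response has the usual shape where each photo element carries its title as an attribute and has no child elements: indexing the children of a photo with [0] then has nothing to return. The sketch below is Python 3 and runs against a hand-written sample document rather than a real API response; the sample XML and the helper `photo_titles` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a photoset listing (assumed shape)
SAMPLE = """<rsp stat="ok">
  <photoset id="12453463463252553">
    <photo id="1" title="Sunset" />
    <photo id="2" title="Harbour" />
  </photoset>
</rsp>"""


def photo_titles(rsp):
    """Collect the title attribute of every photo in the photoset."""
    photoset = rsp.find('photoset')
    return [photo.get('title') for photo in photoset.findall('photo')]


root = ET.fromstring(SAMPLE)
titles = photo_titles(root)
first_photo = root.find('photoset')[0]

print(titles)
# A photo element has zero children, so first_photo[0] would raise IndexError
print(len(first_photo))
```

Under that assumption, dropping the line that indexes the photo's children and reading only `elm.get('title')` would avoid the error.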
I stuck with it and had a hand from a friend to come up with the following which works as planned:
import flickrapi
import os
from xml.etree import ElementTree
api_key = 'xxxx'
api_secret = 'xxxx'
flickr = flickrapi.FlickrAPI(api_key, api_secret)
(token, frob) = flickr.get_token_part_one(perms='write')
if not token: raw_input("Press ENTER after you authorized this program")
flickr.get_token_part_two((token, frob))
sets = flickr.photosets_getList(user_id='xxxx')
for set in sets.getchildren()[0]:
    title = set.getchildren()[0].text
    filename = "%s.txt" % (title)
    f = open(filename,'w')
    print ("Getting Photos from set: %s") % (title)
    for photo in flickr.walk_set(set.get('id')):
        # One title per line
        f.write("%s\n" % (photo.get('title')))
    f.close()
It's quite easy if you use python-flickr-api. The complicated part is getting authorization from Flickr to access private information.
Here is some (untested) code you can use:
import os
import flickr_api as flickr
# If all you want to do is get public information,
# then you need to set the api key and secret
flickr.set_keys(api_key='key', api_secret='sekret')
# If you want to fetch private/hidden information
# then in addition to the api key and secret,
# you also need to authorize your application.
# To do that, we request the authorization URL
# to get the value of `oauth_verifier`, which
# is what we need.
# This step is done only once, and we save
# the token. So naturally, we first check
# if the token exists or not:
if not os.path.isfile('token.key'):
    # This is the first time we are running,
    # so get the token and save it
    auth = flickr.auth.AuthHandler()
    url = auth.get_authorization_url('read') # Get read permissions
    session_key = raw_input('''
Please visit {} and then copy the value of oauth_verifier:'''.format(url))
    if len(session_key.strip()):
        auth.set_verifier(session_key.strip())
        flickr.set_auth_handler(auth)
        # Save this token for next time
        auth.save('token.key')
    else:
        raise Exception("No authorization token provided, quitting.")
else:
    # Reuse the token saved by an earlier run
    flickr.set_auth_handler('token.key')
# If we reached this point, we are good to go!
# First thing we want to do is enable the cache, so
# we don't hit the API when not needed
flickr.enable_cache()
# Fetching a user, by their username
user = flickr.Person.findByUserName('username')
# Or, we don't know the username:
user = flickr.Person.findByEmail('some@user.com')
# Or, if we want to use the authenticated user
user = flickr.test.login()
# Next, fetch the photosets and their corresponding photos:
photo_sets = user.getPhotosets()
for pset in photo_sets:
    print("Getting pictures for {}".format(pset.title))
    photos = pset.getPhotos()
    for photo in photos:
        print(photo.title)
# Or, just get me _all_ the photos:
photos = user.getPhotos()
# If you haven't logged in,
# photos = user.getPublicPhotos()
for photo in photos:
    print(photo.title)
Hi, this is my code to get the number of friends a user has, using Python. It returns nothing. Can anyone tell me which access privilege I should grant, or what I have done wrong so far?
friend_count = 0
q = urllib.urlencode({'SELECT friend_count FROM user WHERE uid': 784877761})
url = 'https://api.facebook.com/method/fql.query?query=' + q
request = urllib2.Request(url)
data = urllib2.urlopen(request)
doc = parse(data)
friend_count_node = doc.getElementsByTagName("friend_count")
test = friend_count_node[0].firstChild.nodeValue
Try replacing your code with this:
q = urllib.urlencode({'q': 'SELECT friend_count FROM user WHERE uid = 784877761'})
url = 'https://graph.facebook.com/fql?' + q
You should not need an access token to get the friend_count.
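The encoding step is where the question's code goes wrong: urlencode expects key/value pairs, so the whole query has to be the value of one parameter rather than being split at the equals sign. A minimal sketch in Python 3 (where the function lives in `urllib.parse`; in Python 2 it is `urllib.urlencode`), with the helper name `build_fql_url` made up for illustration. It only builds the URL and makes no request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_fql_url(query, base='https://graph.facebook.com/fql'):
    """Put the whole FQL statement into a single 'q' parameter."""
    return base + '?' + urlencode({'q': query})


fql = 'SELECT friend_count FROM user WHERE uid = 784877761'
url = build_fql_url(fql)
print(url)

# Decoding the query string gives the original statement back
print(parse_qs(urlsplit(url).query)['q'][0])
```

Spaces and the equals sign inside the statement are escaped, so the server receives the query as one value.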
I have the following Handlers
First the user calls this Handler and gets redirected to Facebook:
class LoginFacebookHandler(BasicHandler):
    def get(self):
        user = self.auth.get_user_by_session()
        if not user:
            h = hashlib.new('sha512')
            # Feed in random bytes (needs "import os"); a hash with no input
            # always yields the same digest
            h.update(os.urandom(64))
            nonce = h.hexdigest()
            logging.info("hash "+str(nonce))
            memcache.set(str(nonce), True, 8600)
            #facebook_uri = "https://www.facebook.com/dialog/oauth?client_id=%s&redirect_uri=%s&state=%s&scope=%s" % ("20773", "http://upstrackapp.appspot.com/f", str(nonce), "email")
            data = {"client_id": 20773, "redirect_uri": "http://***.appspot.com/f", "state": str(nonce), "scope": "email"}
            facebook_uri = "https://www.facebook.com/dialog/oauth?%s" % (urllib.urlencode(data))
            self.redirect(facebook_uri)
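One detail worth illustrating about the state value: a hash object that is never fed any data always produces the same digest, so the nonce needs a random source to be unpredictable. A minimal sketch in Python 3 using the standard `secrets` module, which is an assumption on my part since the handler is Python 2 (where `os.urandom` would be the closest equivalent). The helper name `make_state_nonce` is hypothetical.

```python
import hashlib
import secrets


def make_state_nonce():
    """Return a fresh, unpredictable hex string for the OAuth state parameter."""
    return secrets.token_hex(32)


# Hashing nothing is deterministic: both digests are identical
digest_one = hashlib.new('sha512').hexdigest()
digest_two = hashlib.new('sha512').hexdigest()
print(digest_one == digest_two)

# Random nonces differ from call to call
first, second = make_state_nonce(), make_state_nonce()
print(first != second)
```

The nonce would then be stored and checked on the callback in the same way as the handler's cached state.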
After the user has authorized my app, Facebook redirects to the redirect URI (handler):
class CreateUserFacebookHandler(BasicHandler):
    def get(self):
        state = self.request.get('state')
        code = self.request.get('code')
        logging.info("state "+state)
        logging.info("code "+code)
        if len(code) > 3 and len(state) > 3:
            cached_state = memcache.get(str(state))
            logging.info("cached_state "+str(cached_state))
            if cached_state:
                data = { "client_id": 20773, "redirect_uri": "http://***.appspot.com/f", "client_secret": "7f587", "code": str(code)}
                graph_url = "https://graph.facebook.com/oauth/access_token?%s" % (urllib.urlencode(data))
                logging.info("grph url "+graph_url)
                result = urlfetch.fetch(url=graph_url, method=urlfetch.GET)
                if result.status_code == 200:
                    fb_response = urlparse.parse_qs(result.content)
                    access_token = fb_response["access_token"][0]
                    token_expires = fb_response["expires"][0]
                    logging.info("access token "+str(access_token))
                    logging.info("token expires "+str(token_expires))
                    if access_token:
                        api_data = { "access_token": str(access_token)}
                        api_url = "https://graph.facebook.com/me?%s" % (urllib.urlencode(api_data))
                        logging.info("api url "+api_url)
                        api_result = urlfetch.fetch(url=api_url, method=urlfetch.GET)
                        if api_result.status_code == 200:
                            api_content = json.loads(api_result.content)
                            user_id = str(api_content["id"])
                            email = str(api_content["email"])
                            logging.info("user id "+str(user_id))
                            logging.info("email "+str(email))
                            h = hashlib.new('sha512')
                            # Random input (needs "import os"); an empty hash is a constant
                            h.update(os.urandom(64))
                            password = h.hexdigest()
                            expire_data = datetime.now() + timedelta(seconds=int(token_expires))
                            user = self.auth.store.user_model.create_user(email, password_raw=password, access_token=access_token, token_expires=expire_data, fb_id=user_id)
                        else:
                            self.response.out.write("error contacting the graph api")
                    else:
                        self.response.out.write("access token not long enough")
                else:
                    self.response.out.write("error while contacting facebook server")
            else:
                self.response.out.write("error no cached state")
        else:
            self.response.out.write("error too short")
Mostly this works until the code tries to retrieve an access_token and I end up getting "error while contacting....".
The funny thing is that I log all URLs, states, etc., so I go into my logs, copy the URL that urlfetch tried to open (fb api->access_token), paste it into my browser, and voilà, I get my access_token + expires.
The same thing happens sometimes when the code tries to fetch the user information from the graph (graph/me).
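Separately from the intermittent failures, the token parsing in the handler can be sketched in isolation. This is a minimal sketch in Python 3, assuming the response body is a form-encoded string with `access_token` and `expires` fields as the handler expects. The helper `parse_token_response` and the sample bodies are made up for illustration; it returns None instead of raising KeyError when a field is missing.

```python
from datetime import datetime, timedelta
from urllib.parse import parse_qs


def parse_token_response(body, now=None):
    """Return (access_token, expiry datetime), or None if a field is missing."""
    fields = parse_qs(body)
    token = fields.get('access_token', [None])[0]
    expires = fields.get('expires', [None])[0]
    if token is None or expires is None:
        return None
    now = now or datetime.now()
    return token, now + timedelta(seconds=int(expires))


# Sample bodies, not real responses
print(parse_token_response('access_token=abc&expires=60', now=datetime(2020, 1, 1)))
print(parse_token_response('{"error": "x"}'))
```

Logging the raw body whenever this returns None would show what the server actually sent on the failing requests.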
The key problem is not Facebook.
It is the App Engine deployment process.
I always tested changes in the code live, not locally, since OAuth wouldn't work properly otherwise.
So the deployment -> flush cache -> flush database process seemed to have a certain delay, causing artifacts to remain, which confused the code.
So if you have to test things like OAuth live, I'd recommend deploying the changes as a new version of the app and, after deployment, deleting all data that could act as artifacts in the new version.