I am trying to download some photos from Flickr. With my key and secret, I am able to search and download using these lines of code:
from flickrapi import FlickrAPI

image_tag = 'seaside'
extras = ','.join(SIZES[0])  # SIZES is a list of size/url extras defined elsewhere
flickr = FlickrAPI(KEY, SECRET)
photos = flickr.walk(text=image_tag,   # search by image title and image tags
                     extras=extras,    # get the URLs for each size we want
                     privacy_filter=1, # search only for public photos
                     per_page=50,
                     sort='relevance',
                     safe_search=1)
Using this I am able to acquire the URL and the photo ID, but I would also like to download the photo stats (likes, views). I can't find a command that, given the ID of the photo, allows me to download its stats.
You can find exactly what you are looking for in the API documentation on the Flickr website:
https://www.flickr.com/services/api/flickr.stats.getPhotoStats.html
Calling the method:
flickr.stats.getPhotoStats
with arguments:
api_key, date, photo_id
you will receive the stats in the following format:
<stats views="24" comments="4" favorites="1" />
Remember to generate your authentication token first; there is a link on that same page explaining how to generate one, if you haven't already done so.
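With the flickrapi package you are already using, API methods can be called by their dotted name. A minimal sketch of that call, assuming an authenticated token (the date and photo_id values are placeholders, and the stats endpoint only reports on photos owned by the authenticated account):
from flickrapi import FlickrAPI

flickr = FlickrAPI(KEY, SECRET, format='parsed-json')
flickr.authenticate_via_browser(perms='read')  # obtain/cache the auth token first

stats = flickr.stats.getPhotoStats(date='2023-01-01', photo_id='123456789')
print(stats['stats'])  # roughly: {'views': 24, 'comments': 4, 'favorites': 1}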
I am trying to get, for a list of tweets with a media attachment and a specific hashtag, their:
text, author ID, tweet ID, creation date, retweet_count, like_count and image URL.
I am, however, having some problems grabbing the image URL of the media attachment.
This is one of my very poor (quite the novice here) attempts to do it:
import tweepy

client = tweepy.Client('bearer_token')
response = client.search_recent_tweets(
    "#covid -is:retweet has:media",
    max_results=100,
    expansions="author_id,attachments.media_keys",
    tweet_fields="created_at,public_metrics,attachments",
    user_fields="username,name,profile_image_url",
    media_fields="public_metrics,url,height,width,alt_text")

media = {}
for tweet in response.data:
    metric = tweet.public_metrics
    print(f"{tweet.created_at}, {tweet.text}, {tweet.author_id} \n"
          f"{metric['retweet_count']}, {metric['like_count']}")
    for image in tweet.includes['media']:  # this fails: includes live on the response, not the tweet
        media[image.media_key] = f"{image.url}"
        print(media[image.media_key])
With this I can get the text of the tweet, but if I use the same technique with Paginator to obtain more tweets, I cannot even see the text.
Does anyone know how to retrieve both the text and the image URL (preferably using Paginator) of a tweet?
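In the v2 API, expanded media objects arrive in the includes of each response, not on the individual Tweet objects, so the usual pattern is to build a media_key-to-media lookup per page and join it against each tweet's attachments. A hedged sketch of that approach with Paginator (the limit value is arbitrary, and url is only populated for photos; videos expose preview_image_url instead):
import tweepy

client = tweepy.Client('bearer_token')
paginator = tweepy.Paginator(
    client.search_recent_tweets,
    "#covid -is:retweet has:media",
    max_results=100,
    expansions="author_id,attachments.media_keys",
    tweet_fields="created_at,public_metrics,attachments",
    media_fields="public_metrics,url",
    limit=5)

for response in paginator:  # iterate page by page; .flatten() would drop the includes
    media = {m.media_key: m for m in (response.includes.get('media') or [])}
    for tweet in response.data or []:
        keys = (tweet.attachments or {}).get('media_keys', [])
        urls = [media[k].url for k in keys if k in media and media[k].url]
        metric = tweet.public_metrics
        print(tweet.created_at, tweet.text, tweet.author_id,
              metric['retweet_count'], metric['like_count'], urls)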
I'm trying to use the Python SDK for IBM Watson Language Translator v3, testing the beta functionality of translating actual documents. Below is my code:
from ibm_watson import LanguageTranslatorV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
API = "1234567890abcdefg"
GATEWAY = 'https://gateway-lon.watsonplatform.net/language-translator/api'
document_list = []
"""The below authenticates to the IBM Watson service and initiates an instance"""
authenticator = IAMAuthenticator(API)
language_translator = LanguageTranslatorV3(
version='2018-05-01',
authenticator=authenticator
)
language_translator.set_service_url(GATEWAY)
submission = language_translator.translate_document(
    file="myfile.txt",  # passes the literal string, not the file's content (see the answer below)
    filename="myfile.txt",
    file_content_type='text/plain',
    model_id=None,
    source='en',
    target='es',
    document_id=None)
document_list.append(submission.result['document_id'])

while len(document_list) > 0:
    for document in list(document_list):  # iterate over a copy so removal is safe
        document_status = language_translator.get_document_status(document)
        if document_status.result['status'] == "available":
            translated_document = language_translator.get_translated_document(document)
            document_list.remove(document)
            language_translator.delete_document(document)
A few questions on this:
When I check the content of 'translated_document', it doesn't actually contain any content. It contains the headers and the HTTP status of the response, but no actual translated content.
I decided to use curl to download my uploaded document. Instead of the content of the .txt file being uploaded for translation, the downloaded file contains the literal file name (myfile.txt) that was submitted, rather than the content of the file.
Researching this and looking at the IBM Watson GitHub repository, it appears that I may have to read the content of 'myfile.txt' into a variable and then pass this variable as 'file={my_variable}' when submitting the translation. But doesn't this defeat the purpose of being able to submit actual documents for translation? How is this different from the conventional service offered?
Can anybody advise me as to what I'm doing wrong? I've tried multiple approaches (writing the value of 'translated_content' to a file, for example), but I just don't seem to be able to grab the translated content, nor can I seem to actually upload the content of the file to the service; instead I simply appear to submit the filename.
Thanks all
The file parameter of translate_document is supposed to be the actual content to be translated. I realize that's not clear from the documentation, but that's how the service works. So try passing the actual content you want translated in the file parameter.
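For example, a sketch of the corrected submission and download flow, reusing the language_translator instance from the question (the output filename is my own placeholder):
with open("myfile.txt", "rb") as f:
    submission = language_translator.translate_document(
        file=f,  # pass the open file (its content), not the path string
        filename="myfile.txt",
        file_content_type="text/plain",
        source="en",
        target="es")

document_id = submission.result['document_id']

# once get_document_status reports "available", the translated bytes can be saved
translated = language_translator.get_translated_document(document_id)
with open("myfile_es.txt", "wb") as out:
    out.write(translated.get_result().content)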
I am trying to make a program that takes a YouTube playlist and plays all of its content.
I've installed all the components needed for pafy to run with Python 3. Everything I've tried works as expected, except the part of the code below.
import pafy

plurl = "https://www.youtube.com/playlist?list=PL634F2B56B8C346A2"
playlist = pafy.get_playlist(plurl)
url = playlist['items'][21]['pafy'].getbest().url
video = pafy.new(url)
When pafy.new() is called, it gives an error because the URL is too long:
Need 11 character video id or the URL of the video. Got https://r2---sn-bavc5aoxu-nv4l.googlevideo.com/videoplayback?ms=au%2Crdu&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cexpire&mv=m&mt=1554899146&requiressl=yes&ip=37.157.173.53&pl=19&id=o-AGQZkyoEvykUGae7O4v_Ycmuj4jJBYdgafcfLBQ5S4Dd&mn=sn-bavc5aoxu-nv4l%2Csn-nv47lnsr&mm=31%2C29&source=youtube&lmt=1387649403290510&ei=POGtXJzdIo_ugAeEiL_wAQ&c=WEB&key=yt6&mime=video%2Fmp4&gir=yes&itag=18&clen=5461830&fvip=2&expire=1554920860&ratebypass=yes&dur=206.100&initcwndbps=1573750&ipbits=0&signature=AAA8B36CD3B402F587F874956595ACB928806C4F.D36C0A79E7F1727DB872425E696DBFC550AA7DF6
Is there a way I can get a normal URL or the video ID?
The videoid is also available on the pafy object for each playlist item. You can use
dir(<object>)
to see what properties are available.
video_id = playlist['items'][2]['pafy'].videoid  # avoid shadowing the built-in id()
video = pafy.new('https://www.youtube.com/watch?v=' + video_id)
Wrap the pafy.new call in a try/except, as some of the videos might not be available in your region.
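Putting both suggestions together, a hedged sketch of walking the whole playlist (the exception types are my assumption about what pafy raises for blocked or removed videos):
import pafy

playlist = pafy.get_playlist("https://www.youtube.com/playlist?list=PL634F2B56B8C346A2")
for item in playlist['items']:
    try:
        video = pafy.new('https://www.youtube.com/watch?v=' + item['pafy'].videoid)
        print(video.title, video.getbest().url)
    except (IOError, ValueError) as exc:  # skip region-blocked or removed videos
        print('skipping unavailable video:', exc)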
I am having trouble getting a video entry which includes a link rel="edit". I need such an entry in order to be able to call DeleteVideoEntry(...) on it.
I am retrieving the video using GetYouTubeVideoEntry(youtube_id=XXXXXXX). My yt_service is initialized with a username, a password, and a developer key. I use ProgrammaticLogin. This part seems to work fine. I used the same yt_service to upload said video earlier. Also, if I change the developer key to something bogus (during debugging) and try to authenticate, I get a 403 error. This leads me to believe that authentication works OK.
Needless to say, the video entry retrieved with GetYouTubeVideoEntry(youtube_id=XXXXXXX) does not contain the edit link, and I cannot use the entry in a DeleteVideoEntry(...) call.
Is there some special way to get a video entry which will contain a link element with a rel="edit"? Can anyone suggest some way to resolve my issue? Could this possibly be a bug?
Update:
For the records, when I tried getting the feed of all my uploads, and then looping through the video entries, the video entries do have an edit link. So using this works:
uri = 'http://gdata.youtube.com/feeds/api/users/%s/uploads' % username
feed = yt_service.GetYouTubeVideoFeed(uri)
for entry in feed.entry:
    yt_service.DeleteVideoEntry(entry)
But this does not:
entry = yt_service.GetYouTubeVideoEntry(video_id = video.youtube_id)
yt_service.DeleteVideoEntry(entry)
Using the same yt_service.
I've just deleted a YouTube video using gdata and ProgrammaticLogin().
Here are the steps to reproduce:
import gdata.youtube.service

yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = 'developer_key'
yt_service.email = 'email'
yt_service.password = 'password'
yt_service.ProgrammaticLogin()

# username and video_id are your own values; video_id should look like 'iu6Gq-tUsTc'
uri = 'https://gdata.youtube.com/feeds/api/users/%s/uploads/%s' % (username, video_id)
entry = yt_service.GetYouTubeUserEntry(uri=uri)
response = yt_service.DeleteVideoEntry(entry)
print response  # True
yt_service.GetYouTubeVideoFeed(uri) works because GetYouTubeVideoFeed doesn't check the uri and just calls self.Get(uri, ...), but originally, I think, it expected a 'https://gdata.youtube.com/feeds/api/videos' uri.
Conversely, yt_service.GetYouTubeVideoEntry() uses YOUTUBE_VIDEO_URI = 'https://gdata.youtube.com/feeds/api/videos', but entries retrieved that way don't contain rel="edit".
Hope that helps you out
You can view the HTTP headers of the generated requests by setting the debug flag to true. This is as simple as:
yt_service = gdata.youtube.service.YouTubeService()
yt_service.debug = True
You can read about this in the documentation here.
I'm trying to crawl YouTube to retrieve information about a group of users (approx. 200 people). I'm interested in looking for relationships between the users:
contacts
subscribers
subscriptions
what videos they commented on
etc
I've managed to get contact information with the following source:
import gdata.youtube
import gdata.youtube.service
from gdata.service import RequestError
from pub_author import KEY, NAME_REGEX

def get_details(name):
    yt_service = gdata.youtube.service.YouTubeService()
    yt_service.developer_key = KEY
    contact_feed = yt_service.GetYouTubeContactFeed(username=name)
    contacts = [e.title.text for e in contact_feed.entry]
    return contacts
I can't seem to get the other bits of information I need. The reference guide says that I can grab the XML feed from http://gdata.youtube.com/feeds/api/users/username/subscriptions?v=2 (for some arbitrary user). However, if I try to get other users' subscriptions, I get a 403 error with the following message:
User must be logged in to access these subscriptions.
If I use the gdata API:
sub_feed = yt_service.GetYouTubeSubscriptionFeed(username=name)
sub = [e.title.text for e in sub_feed.entry]
then I get the same error.
How can I get these subscriptions without logging in? It should be possible, as you can access this information without logging in to the YouTube website.
Also, there seems to be no feed for the subscribers of a particular user. Is this information available through the API?
EDIT
So, it appears this can't be done through the API. I had to do this the quick and dirty way:
for f in `cat users.txt`; do wget "www.youtube.com/profile?user=$f&view=subscriptions" --output-document subscriptions/$f.html; done
Then use this script to get out the usernames from the downloaded HTML files:
"""Extract usernames from a Youtube profile using regex"""
import re
def main():
import sys
lines = open(sys.argv[1]).read().split('\n')
#
# The html files has two <a href="..."> tags for each user: once for an
# image thumbnail, and once for a text link.
#
users = set()
for l in lines:
match = re.search('<a href="/user/(?P<name>[^"]+)" onmousedown', l)
if match:
users.add(match.group('name'))
users = list(users)
users.sort()
print users
if __name__ == '__main__':
main()
In order to access a user's subscriptions feed without that user being logged in, the user must check the "Subscribe to a channel" checkbox under their Account Sharing settings.
Currently, there is no direct way to get a channel's subscribers through the gdata API. In fact, there has been an outstanding feature request for it that has remained open for over 3 years! See Retrieving a list of a user's subscribers?.
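For the subscriptions side, a sketch of guarding the per-user fetch from the question, so shared feeds are collected and unshared ones are skipped (reusing KEY and the RequestError import from the question's code):
import gdata.youtube.service
from gdata.service import RequestError

yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = KEY

def get_subscriptions(name):
    try:
        sub_feed = yt_service.GetYouTubeSubscriptionFeed(username=name)
        return [e.title.text for e in sub_feed.entry]
    except RequestError:  # 403: the user hasn't shared their subscriptions
        return []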