I have the following piece of Python code, which calls youtube-dl and extracts the links that I need.
import youtube_dl

def extract_video(url):
    ydl = youtube_dl.YoutubeDL({'outtmpl': '%(id)s.%(ext)s'})
    with ydl:
        # download=False: we just want to extract the info
        result = ydl.extract_info(url, download=False)
    if 'entries' in result:
        # Can be a playlist or a list of videos
        video = result['entries'][0]
    else:
        # Just a video
        video = result
    if video:
        return video
    return None
But I want to use a custom User-Agent in this program. I know I can specify a custom User-Agent when using youtube-dl on the command line.
Is there any way to specify a custom User-Agent in a program that embeds youtube-dl?
Thanks
I used GitHub's code search to look for user-agent in the youtube-dl codebase and ended up finding the piece of code that sets the user agent from the command-line option.
So, all in all, just
import youtube_dl.utils
youtube_dl.utils.std_headers['User-Agent'] = 'my-user-agent'
to override it.
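If the override should only apply to some calls, one option is to set the header temporarily and restore it afterwards. The following is a minimal sketch, not part of youtube-dl: override_header is a hypothetical helper, and the plain dict named std_headers is a stand-in for youtube_dl.utils.std_headers so that the example runs without youtube-dl installed. It assumes the library reads the dict when it issues requests, so the extract_info call would have to happen inside the with block.

```python
from contextlib import contextmanager


@contextmanager
def override_header(headers, name, value):
    """Temporarily set headers[name] = value and restore the old state on exit."""
    missing = object()                      # sentinel: the key was not present before
    previous = headers.get(name, missing)
    headers[name] = value
    try:
        yield headers
    finally:
        if previous is missing:
            headers.pop(name, None)         # remove the key we added
        else:
            headers[name] = previous        # put the original value back


# Stand-in for youtube_dl.utils.std_headers (assumption, for illustration only).
std_headers = {'User-Agent': 'default-agent', 'Accept': '*/*'}

with override_header(std_headers, 'User-Agent', 'my-user-agent'):
    inside = std_headers['User-Agent']      # value seen while the override is active
after = std_headers['User-Agent']           # original value once the block exits
```

With youtube-dl installed, the same helper could presumably be called as override_header(youtube_dl.utils.std_headers, 'User-Agent', 'my-user-agent') around the extract_info call.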
Well, the question is mostly above. I want to access the video description without having to use writeinfojson.
And no, extract_info doesn't do this!
In the terminology of youtube_dl, the "video description" is everything in the .info.json files written by the writeinfojson option.
Yes, there is a function that extracts the information, including the video description:
from youtube_dl import YoutubeDL
ydl = YoutubeDL()
ydl.add_default_info_extractors()
info = ydl.extract_info('https://www.youtube.com/watch?v=71PD2f1ogyk', download=False)
print(info['description'])
I'm currently making a program to transfer songs saved on my YT account into a Spotify playlist, and I am using youtube_dl to extract the meta data from the YT videos using the code below:
# use youtube_dl to collect the song name & artist name
video = youtube_dl.YoutubeDL({}).extract_info(
    youtube_url, download=False)
song_name = video["track"]
artist = video["artist"]
When I first made this project in March, the JSON returned by extract_info included the proper artist name, but now the artist, along with many other values (not needed for this task), is listed as None. Has anyone run into this issue? I'm considering a workaround that gets the URI without the artist name, but that would make it impossible to distinguish two songs with the same name. If anyone else has noticed this and found a solution, I'd love to hear it!
Setting the user-agent to that of Facebook's web crawler seems to solve this:
youtube_dl.utils.std_headers['User-Agent'] = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
Here is the video I tried your code with: https://www.youtube.com/watch?v=QBxSQXbj6Go
And the code and output from your snippet below.
Script:
[http_offline#greenhat-32 LEARNING]$ cat tests2.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import youtube_dl
# use youtube_dl to collect the song name & artist name
youtube_url = 'QBxSQXbj6Go'
video = youtube_dl.YoutubeDL({}).extract_info(youtube_url, download=False)
song_name = video["track"]
artist = video["artist"]
print(artist, ' - ', song_name)
[http_offline#greenhat-32 LEARNING]$
Output:
[http_offline#greenhat-32 LEARNING]$ ./tests2.py
[youtube] QBxSQXbj6Go: Downloading webpage
[youtube] QBxSQXbj6Go: Downloading MPD manifest
REO Speedwagon - Keep Pushin'
[http_offline#greenhat-32 LEARNING]$
youtube-dl can't extract the artist, release date, and certain other fields from every video; the video's description has to follow a format like that of the video I linked above.
PS: It won't help you here, but a new youtube-dl version was released a few days ago; you might want to grab it.
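Because these fields can come back as None, the calling code may want a fallback instead of relying on them. Below is a minimal sketch: song_label is a hypothetical helper, and falling back to the uploader and title fields is an assumption about what is acceptable for this use case, not something youtube-dl does itself. The two dicts are hand-written stand-ins for the result of extract_info.

```python
def song_label(info):
    """Return (artist, track), falling back to uploader/title when music metadata is missing."""
    # `or` also covers keys that are present but set to None.
    artist = info.get('artist') or info.get('uploader') or 'Unknown artist'
    track = info.get('track') or info.get('title') or 'Unknown track'
    return artist, track


# Hand-written stand-ins for extract_info results (illustrative values only).
with_music = {'artist': 'REO Speedwagon', 'track': "Keep Pushin'",
              'title': 'Some upload title', 'uploader': 'Some Channel'}
without_music = {'artist': None, 'track': None,
                 'title': 'Some upload title', 'uploader': 'Some Channel'}

print(song_label(with_music))
print(song_label(without_music))
```

The fallback values are less reliable than real music metadata, so a lookup that uses them probably needs a fuzzier search on the Spotify side.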
I am trying to make a program that takes a YouTube playlist and plays all its content.
I've installed all the components needed for pafy to run with Python 3. Everything I've tried works as expected, except the part of the code below.
plurl = "https://www.youtube.com/playlist?list=PL634F2B56B8C346A2"
playlist = pafy.get_playlist(plurl)
url = playlist['items'][21]['pafy'].getbest().url
video = pafy.new(url)
When pafy.new() is called, it raises an error because the URL is not a video URL:
Need 11 character video id or the URL of the video. Got https://r2---sn-bavc5aoxu-nv4l.googlevideo.com/videoplayback?ms=au%2Crdu&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cexpire&mv=m&mt=1554899146&requiressl=yes&ip=37.157.173.53&pl=19&id=o-AGQZkyoEvykUGae7O4v_Ycmuj4jJBYdgafcfLBQ5S4Dd&mn=sn-bavc5aoxu-nv4l%2Csn-nv47lnsr&mm=31%2C29&source=youtube&lmt=1387649403290510&ei=POGtXJzdIo_ugAeEiL_wAQ&c=WEB&key=yt6&mime=video%2Fmp4&gir=yes&itag=18&clen=5461830&fvip=2&expire=1554920860&ratebypass=yes&dur=206.100&initcwndbps=1573750&ipbits=0&signature=AAA8B36CD3B402F587F874956595ACB928806C4F.D36C0A79E7F1727DB872425E696DBFC550AA7DF6
Is there a way I can get the normal URL or the video ID?
The video ID is also available on the pafy object. You can use
dir(<object>)
to see which properties are available.
video_id = playlist['items'][2]['pafy'].videoid
video = pafy.new('https://www.youtube.com/watch?v=' + video_id)
Wrap the call to pafy.new in a try/except block, as some of the videos might not be available in your region.
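The try/except advice could look roughly like the sketch below. load_available is a hypothetical helper that takes the loader as a parameter, so pafy.new can be passed in; fake_loader and the two IDs are made up so the example runs without pafy or network access. Catching Exception broadly is an assumption, since the exact exception types pafy raises for unavailable videos are not stated here.

```python
def load_available(video_ids, loader):
    """Call loader(watch_url) for each ID; return (loaded results, skipped IDs with reasons)."""
    loaded, skipped = [], []
    for video_id in video_ids:
        watch_url = 'https://www.youtube.com/watch?v=' + video_id
        try:
            loaded.append(loader(watch_url))
        except Exception as exc:            # assumption: exception types vary by failure
            skipped.append((video_id, str(exc)))
    return loaded, skipped


def fake_loader(url):
    """Stand-in for pafy.new: fails for one made-up ID, echoes the URL otherwise."""
    if url.endswith('blocked0000'):
        raise IOError('video unavailable')
    return url


loaded, skipped = load_available(['okokokokoko', 'blocked0000'], fake_loader)
print(loaded)
print(skipped)
```

With pafy installed, the call would presumably be load_available(ids, pafy.new), where ids is built from the videoid attribute of each playlist item.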
My goal is to connect to the YouTube API and download the URLs of videos by specific music producers. I found the following script at this link: https://www.youtube.com/watch?v=_M_wle0Iq9M. In the video the code works beautifully, but when I try it on Python 2.7 it gives me KeyError: 'items'.
I know KeyErrors can occur when there is an incorrect use of a dictionary or when a key doesn't exist.
I have checked the Google Developers site for YouTube to make sure that 'items' exists in the response, and it does.
I am also aware that using get() may help with my problem, but I am not sure. Any suggestions for fixing the KeyError in the following code, or for improving the code to reach my main goal of downloading the URLs (I have a YouTube API key)?
Here is the code:
#these modules help with HTTP request from Youtube
import urllib
import urllib2
import json
API_KEY = open("/Users/ereyes/Desktop/APIKey.rtf","r")
API_KEY = API_KEY.read()
searchTerm = raw_input('Search for a video:')
searchTerm = urllib.quote_plus(searchTerm)
url = 'https://www.googleapis.com/youtube/v3/search?part=snippet&q='+searchTerm+'&key='+API_KEY
response = urllib.urlopen(url)
videos = json.load(response)
videoMetadata = [] #declaring our list
for video in videos['items']: #"for loop" cycle through json response and searches in items
    if video['id']['kind'] == 'youtube#video': #makes sure that item we are looking at is only videos
        videoMetadata.append(video['snippet']['title']+ # getting title of video and putting into list
                             "\nhttp://youtube.com/watch?v="+video['id']['videoId'])
videoMetadata.sort() # sorts our list alphabetically
print ("\nSearch Results:\n") #print out search results
for metadata in videoMetadata:
    print (metadata)+"\n"
raw_input('Press Enter to Exit')
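The problem is most likely a combination of two things: the API key is stored in an RTF file instead of a plain text file, and you seem unsure whether to use urllib or urllib2, since you imported both. An RTF file contains formatting markup in addition to the key, so the string sent to the API is probably not a valid key.
Personally, I would recommend requests, but with urllib I think you need to read() the contents of the response to get a string (which you would then parse with json.loads rather than json.load):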
response = urllib.urlopen(url).read()
You can check that by printing the response variable.
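Printing the response will usually show why 'items' is missing: when a request is rejected, the YouTube Data API v3 replies with a JSON object that has an "error" key instead. The sketch below checks for that before touching 'items'. It is written in Python 3, video_links is a hypothetical helper, and both sample payloads are hand-written for illustration rather than real API output.

```python
import json


def video_links(payload):
    """Return sorted (title, watch URL) pairs from a parsed search response."""
    if 'error' in payload:
        # Surface the API's own message instead of failing later with KeyError.
        message = payload['error'].get('message', 'unknown error')
        raise RuntimeError('YouTube API error: ' + message)
    links = []
    for item in payload.get('items', []):
        if item['id']['kind'] == 'youtube#video':   # skip channels and playlists
            links.append((item['snippet']['title'],
                          'https://www.youtube.com/watch?v=' + item['id']['videoId']))
    return sorted(links)


# Hand-written sample payloads (illustrative, not real API output).
sample_ok = '''{"items": [
  {"id": {"kind": "youtube#video", "videoId": "abc123DEF45"},
   "snippet": {"title": "Beat tape"}},
  {"id": {"kind": "youtube#channel", "channelId": "UCxyz"},
   "snippet": {"title": "A channel"}}
]}'''
sample_error = '{"error": {"code": 400, "message": "API key not valid."}}'

print(video_links(json.loads(sample_ok)))
try:
    video_links(json.loads(sample_error))
except RuntimeError as exc:
    print(exc)
```

When reading the key from a file, calling .strip() on the contents is also worth considering, since a trailing newline would end up in the URL.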
I am having trouble getting a video entry which includes a link rel="edit". I need such an entry in order to be able to call DeleteVideoEntry(...) on it.
I am retrieving the video using GetYouTubeVideoEntry(youtube_id=XXXXXXX). My yt_service is initialized with a username, password, and a developer key. I use ProgrammaticLogin. This part seems to work fine. I use the same yt_service to upload said video earlier. Also, if I change the developer key to something bogus (during debugging) and try to authenticate, I get a 403 error. This leads me to believe that authentication works OK.
Needless to say, the video entry retrieved with GetYouTubeVideoEntry(youtube_id=XXXXXXX) does not contain the edit link, so I cannot use the entry in a DeleteVideoEntry(...) call.
Is there some special way to get a video entry which will contain a link element with a rel="edit"? Can anyone suggest some way to resolve my issue? Could this possibly be a bug?
Update:
For the record, when I tried getting the feed of all my uploads and then looping through the video entries, the entries did have an edit link. So this works:
uri = 'http://gdata.youtube.com/feeds/api/users/%s/uploads' % username
feed = yt_service.GetYouTubeVideoFeed(uri)
for entry in feed.entry:
    yt_service.DeleteVideoEntry(entry)
But this does not:
entry = yt_service.GetYouTubeVideoEntry(video_id = video.youtube_id)
yt_service.DeleteVideoEntry(entry)
Using the same yt_service.
I've just deleted a YouTube video using gdata and ProgrammaticLogin().
Here are the steps to reproduce:
import gdata.youtube.service
yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = 'developer_key'
yt_service.email = 'email'
yt_service.password = 'password'
yt_service.ProgrammaticLogin()
# video_id should look like 'iu6Gq-tUsTc'
uri = 'https://gdata.youtube.com/feeds/api/users/%s/uploads/%s' % (username, video_id)
entry = yt_service.GetYouTubeUserEntry(uri=uri)
response = yt_service.DeleteVideoEntry(entry)
print response # True
yt_service.GetYouTubeVideoFeed(uri) works because GetYouTubeVideoFeed doesn't check the URI and just calls self.Get(uri, ...), although I think it was originally expected to receive the 'https://gdata.youtube.com/feeds/api/videos' URI.
Conversely, yt_service.GetYouTubeVideoEntry() uses YOUTUBE_VIDEO_URI = 'https://gdata.youtube.com/feeds/api/videos', and the entry it returns doesn't contain rel="edit".
Hope that helps you out.
You can view the HTTP headers of the generated requests by setting the debug flag to true. This is as simple as:
yt_service = gdata.youtube.service.YouTubeService()
yt_service.debug = True
You can read about this in the documentation here.