Error when extracting info with youtube_dl - python

I'm currently making a program to transfer songs saved on my YT account into a Spotify playlist, and I am using youtube_dl to extract the metadata from the YT videos using the code below:
# use youtube_dl to collect the song name & artist name
video = youtube_dl.YoutubeDL({}).extract_info(
    youtube_url, download=False)
song_name = video["track"]
artist = video["artist"]
When I first made this project in March, the JSON returned by extract_info included the proper artist name, but now the artist, along with many other values (although those are not necessary for this task), is listed as None. Has anyone run into this issue? I'm considering a workaround of not using the artist name to get the URI, but that would make it impossible to distinguish two songs with the same name. If anyone else has noticed this and has found a solution, I'd love to hear it!

Setting the user-agent to Facebook's web crawler seems to solve this:
youtube_dl.utils.std_headers['User-Agent'] = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"

Here is the video I tried your code with: https://www.youtube.com/watch?v=QBxSQXbj6Go
And the code and output from your snippet below.
Script:
[http_offline#greenhat-32 LEARNING]$ cat tests2.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import youtube_dl
# use youtube_dl to collect the song name & artist name
youtube_url = 'QBxSQXbj6Go'
video = youtube_dl.YoutubeDL({}).extract_info(youtube_url, download=False)
song_name = video["track"]
artist = video["artist"]
print(artist, ' - ', song_name)
[http_offline#greenhat-32 LEARNING]$
Output:
[http_offline#greenhat-32 LEARNING]$ ./tests2.py
[youtube] QBxSQXbj6Go: Downloading webpage
[youtube] QBxSQXbj6Go: Downloading MPD manifest
REO Speedwagon - Keep Pushin'
[http_offline#greenhat-32 LEARNING]$
youtube-dl can't extract the artist, release date and certain other fields from just any video; the video has to have a description in the same format as the one I linked above.
PS. It won't help with this problem, but a new youtube-dl version was released a few days ago, so you might want to grab it.
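
Since the "track" and "artist" keys are only populated for some videos, one way to keep the script from breaking is to read them defensively and fall back to the title. This is a minimal sketch that works on a plain dict shaped like the result of extract_info(..., download=False). The helper name guess_artist_and_track and the "Artist - Title" fallback convention are my own assumptions, not part of youtube-dl:

```python
def guess_artist_and_track(info):
    """Return (artist, track) from a youtube-dl style info dict.

    Prefers the dedicated "artist"/"track" fields and falls back to
    splitting a title of the form "Artist - Title" (an assumed convention).
    """
    artist = info.get("artist")
    track = info.get("track")
    if artist and track:
        return artist, track

    title = info.get("title") or ""
    if " - " in title:
        left, right = title.split(" - ", 1)
        return artist or left.strip(), track or right.strip()

    # No separator to split on: return whatever is known.
    return artist, track or (title or None)
```

The fallback will guess wrong for titles that do not follow that pattern, so it is probably worth logging which path was taken before sending anything to Spotify.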

Related

grabbing video title from yt-dlp command line output

from yt_dlp import YoutubeDL
with YoutubeDL() as ydl:
    ydl.download('https://youtu.be/0KFSuoHEYm0')
This is the relevant bit of code producing the output.
What I would like to do is grab the second-to-last line of the output below, which contains the video title.
I have tried a few variations of
output = subprocess.getoutput(ydl)
as well as
output = subprocess.Popen( ydl, stdout=subprocess.PIPE ).communicate()[0]
The output I am attempting to capture is the second-to-last line here:
[youtube] 0KFSuoHEYm0: Downloading webpage
[youtube] 0KFSuoHEYm0: Downloading android player API JSON
[info] 0KFSuoHEYm0: Downloading 1 format(s): 22
[download] Destination: TJ Watt gets his 4th sack of the game vs. Browns [0KFSuoHEYm0].mp4
[download] 100% of 13.10MiB in 00:01
There is also documentation for yt-dlp on how to pull the title from the metadata, or how to pass something in the parentheses after YoutubeDL(), but I cannot quite figure it out.
This is part of the first project I am making in Python. I am missing an understanding of many concepts, so any help would be much appreciated.
Credits: answer to question: How to get information from youtube-dl in python ??
Modify your code as follows:
from yt_dlp import YoutubeDL
with YoutubeDL() as ydl:
    info_dict = ydl.extract_info('https://youtu.be/0KFSuoHEYm0', download=False)
    video_url = info_dict.get("url", None)
    video_id = info_dict.get("id", None)
    video_title = info_dict.get('title', None)
    print("Title: " + video_title) # <= Here, you got the video title
This is the output:
#[youtube] 0KFSuoHEYm0: Downloading webpage
#[youtube] 0KFSuoHEYm0: Downloading android player API JSON
#Title: TJ Watt gets his 4th sack of the game vs. Browns
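
Note that "Title: " + video_title raises a TypeError if the "title" key is missing, because get() then returns None. Below is a small sketch of a more forgiving formatter, written against a plain dict so that it runs without yt-dlp installed. The function name describe_video and the placeholder strings are hypothetical:

```python
def describe_video(info_dict):
    """Build a one-line summary from a yt-dlp style info dict."""
    video_id = info_dict.get("id") or "unknown id"
    title = info_dict.get("title") or "(no title)"
    return f"Title: {title} [{video_id}]"


sample = {"id": "0KFSuoHEYm0",
          "title": "TJ Watt gets his 4th sack of the game vs. Browns"}
print(describe_video(sample))
print(describe_video({}))
```

In the real script, the dict passed in would be the value returned by ydl.extract_info(url, download=False).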

Is there a possibility to access the video description in youtube_dl(Python package, not command line tool) without having to download .info.json?

Well, the question is mostly above. I want to access the video description without having to use writeinfojson.
And no, extract_info doesn't do this!
In the terminology of youtube_dl the "video description" is everything that's in the .info.json-files, written by the writeinfojson option.
Yes, there is a function to extract the information, which includes the video description:
from youtube_dl import YoutubeDL
ydl = YoutubeDL()
ydl.add_default_info_extractors()
info = ydl.extract_info('https://www.youtube.com/watch?v=71PD2f1ogyk', download=False)
print (info['description'])

Is there a way to get YT url or video ID from playlist with pafy?

I am trying to make a program that takes a YT playlist and plays all its content.
I've installed all the components needed for pafy to run with python3. Everything I've tried works as expected, except the part of the code below.
plurl = "https://www.youtube.com/playlist?list=PL634F2B56B8C346A2"
playlist = pafy.get_playlist(plurl)
url = playlist['items'][21]['pafy'].getbest().url
video = pafy.new(url)
When pafy.new() is called, it gives an error because the URL is too long:
Need 11 character video id or the URL of the video. Got https://r2---sn-bavc5aoxu-nv4l.googlevideo.com/videoplayback?ms=au%2Crdu&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cexpire&mv=m&mt=1554899146&requiressl=yes&ip=37.157.173.53&pl=19&id=o-AGQZkyoEvykUGae7O4v_Ycmuj4jJBYdgafcfLBQ5S4Dd&mn=sn-bavc5aoxu-nv4l%2Csn-nv47lnsr&mm=31%2C29&source=youtube&lmt=1387649403290510&ei=POGtXJzdIo_ugAeEiL_wAQ&c=WEB&key=yt6&mime=video%2Fmp4&gir=yes&itag=18&clen=5461830&fvip=2&expire=1554920860&ratebypass=yes&dur=206.100&initcwndbps=1573750&ipbits=0&signature=AAA8B36CD3B402F587F874956595ACB928806C4F.D36C0A79E7F1727DB872425E696DBFC550AA7DF6
Is there a way I can get the normal URL or the video ID?
The videoid is also available on the pafy object of each playlist item. You can use
dir(<object>)
to see what properties are available.
id = playlist['items'][2]['pafy'].videoid
video = pafy.new('https://www.youtube.com/watch?v='+id)
Wrap the call to pafy.new in try/except, as some of the videos might not be available in your region.
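
The error message asks for an 11-character video id or a watch URL, so it can help to normalise whatever you have before calling pafy.new(). Here is a sketch using only the standard library. The helper names are made up for illustration, and the 11-character check simply mirrors the error message above rather than any documented YouTube rule:

```python
import re
from urllib.parse import urlparse, parse_qs

# 11 characters from the URL-safe alphabet, as hinted by the error message.
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(value):
    """Return the video id from a bare id, a watch URL or a youtu.be URL.

    Returns None when no plausible id is found (for example for a
    googlevideo.com stream URL like the one in the question).
    """
    if _ID_RE.match(value):
        return value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        candidate = ""
    return candidate if _ID_RE.match(candidate) else None


def watch_url(video_id):
    """Build the normal watch URL for a video id."""
    return "https://www.youtube.com/watch?v=" + video_id
```

A playlist loop could then skip items for which extract_video_id returns None and put the pafy.new(watch_url(video_id)) call inside the try/except mentioned above.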

Custom User-Agent in youtube-dl python script

I have the following piece of python code which calls youtube-dl and extracts the links that I need.
ydl = youtube_dl.YoutubeDL({'outtmpl': '%(id)s%(ext)s'})
with ydl:
    result = ydl.extract_info(
        url,
        download=False
        # We just want to extract the info
    )
if 'entries' in result:
    # Can be a playlist or a list of videos
    video = result['entries'][0]
else:
    # Just a video
    video = result
if video:
    return video
return None
But I want to use a custom User-Agent in this program. I know I can specify a custom User-Agent when using youtube-dl on the command line.
Is there any way I can specify a custom User-Agent in a program that embeds youtube-dl?
Thanks
I used GitHub's code search to look for user-agent in the youtube-dl codebase and ended up finding the piece of code that sets the user agent based on the command line.
So, all in all, just
import youtube_dl.utils
youtube_dl.utils.std_headers['User-Agent'] = 'my-user-agent'
to override it.
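
The playlist-or-single-video branch from the question can also be pulled into a small function, so that the return statements have somewhere to live. This is only a sketch that operates on a dict shaped like the result of extract_info. The name first_video is hypothetical, and the guards for an empty result and an empty playlist are additions of mine:

```python
def first_video(result):
    """Return the first video from an extract_info result, or None.

    A result with an "entries" key is treated as a playlist (or list of
    videos); anything else is treated as a single video.
    """
    if not result:
        return None
    if "entries" in result:
        entries = list(result["entries"] or [])
        # An empty playlist has no first video.
        return entries[0] if entries else None
    return result
```

With that in place, the body of the original function reduces to return first_video(ydl.extract_info(url, download=False)).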

Connecting to YouTube API and download URLs - getting KeyError

My goal is to connect to the YouTube API and download the URLs of specific music producers. I found the following script at https://www.youtube.com/watch?v=_M_wle0Iq9M. In the video the code works beautifully, but when I try it on Python 2.7 it gives me KeyError: 'items'.
I know KeyErrors can occur when a dictionary is used incorrectly or when a key doesn't exist.
I have tried going to the Google Developers site for YouTube to make sure that 'items' exists, and it does.
I am also aware that using get() may be helpful for my problem, but I am not sure. Any suggestions for fixing the KeyError in the following code, or for improving the code to reach my main goal of downloading the URLs (I have a YouTube API key)?
Here is the code:
# these modules help with HTTP requests to YouTube
import urllib
import urllib2
import json
API_KEY = open("/Users/ereyes/Desktop/APIKey.rtf","r")
API_KEY = API_KEY.read()
searchTerm = raw_input('Search for a video:')
searchTerm = urllib.quote_plus(searchTerm)
url = 'https://www.googleapis.com/youtube/v3/search?part=snippet&q='+searchTerm+'&key='+API_KEY
response = urllib.urlopen(url)
videos = json.load(response)
videoMetadata = [] #declaring our list
for video in videos['items']: #"for loop" cycle through json response and searches in items
    if video['id']['kind'] == 'youtube#video': #makes sure that item we are looking at is only videos
        videoMetadata.append(video['snippet']['title']+ # getting title of video and putting into list
        "\nhttp://youtube.com/watch?v="+video['id']['videoId'])
videoMetadata.sort(); # sorts our list alphabetically
print ("\nSearch Results:\n") #print out search results
for metadata in videoMetadata:
    print (metadata)+"\n"
raw_input('Press Enter to Exit')
The problem is most likely a combination of two things: the API key is stored in an RTF file instead of a plain text file, and you seem unsure whether to use urllib or urllib2, since you imported both.
Personally, I would recommend requests, but I think you need to read() the contents of the response to get a string:
response = urllib.urlopen(url).read()
You can check that by printing the response variable.
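
One way to see what is actually going on is to inspect the decoded JSON before indexing into it. When a request is rejected, Google APIs normally return a JSON object with an "error" member instead of "items", which would explain the KeyError. The sketch below is in Python 3 and works on a hard-coded sample string rather than a live request. The function name video_lines and the sample payload are illustrative assumptions:

```python
import json


def video_lines(payload):
    """Turn a decoded search response into sorted title/URL strings.

    Raises RuntimeError with the API's own message when the response
    carries an "error" object instead of "items".
    """
    if "error" in payload:
        message = payload["error"].get("message", "unknown error")
        raise RuntimeError("YouTube API error: " + message)

    lines = []
    for item in payload.get("items", []):
        # Search results may also contain channels and playlists.
        if item["id"]["kind"] == "youtube#video":
            lines.append(item["snippet"]["title"]
                         + "\nhttp://youtube.com/watch?v="
                         + item["id"]["videoId"])
    return sorted(lines)


# Illustrative error payload, not copied from a real response.
sample_error = json.loads('{"error": {"code": 400, "message": "Bad Request"}}')
try:
    video_lines(sample_error)
except RuntimeError as exc:
    print(exc)
```

An API key read from an .rtf file will carry RTF markup along with it, so saving the key as plain text and calling .strip() on what you read is worth trying first.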
