I am trying to convert an HTML web page to PDF using the wkhtmltopdf API and the pdfkit library, but when I enter the URL of any web page I get this error:
Traceback (most recent call last):
File "c:\Users\Fai\Desktop\doc\convert.py", line 13, in <module>
pdfkit.from_url("www.imdb.com", "imdb.pdf", configuration=config)
File "C:\Python\Python310\lib\site-packages\pdfkit\api.py", line 27, in from_url
return r.to_pdf(output_path)
File "C:\Python\Python310\lib\site-packages\pdfkit\pdfkit.py", line 201, in to_pdf
self.handle_error(exit_code, stderr)
File "C:\Python\Python310\lib\site-packages\pdfkit\pdfkit.py", line 155, in handle_error
raise IOError('wkhtmltopdf reported an error:\n' + stderr)
OSError: wkhtmltopdf reported an error:
Error: Failed loading page https://www.google.com (sometimes it will work just to ignore this error with --load-error-handling ignore)
Exit with code 1 due to network error: SslHandshakeFailedError
My code:
import os
from docx2pdf import convert
import pdfkit
path_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
PATH = "C:/Users/Fai/Desktop/doc"
os.chdir(PATH)
list_dir = os.listdir()
pdfkit.from_url("www.imdb.com", "imdb.pdf", configuration=config)
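One thing that may be worth trying, as a minimal sketch rather than a confirmed fix: the URL in the code above has no scheme, and the error text itself suggests the --load-error-handling ignore flag. The helper names ensure_scheme and convert below are hypothetical, and ignoring load errors only stops wkhtmltopdf from aborting; it may not fix the SSL handshake itself.

```python
def ensure_scheme(url, default="https"):
    """Return url unchanged if it already names a scheme, else prefix one."""
    if "://" in url:
        return url
    return f"{default}://{url}"


def convert(url, output_path, wkhtmltopdf_path):
    """Render url to a PDF file (hypothetical wrapper around pdfkit.from_url)."""
    import pdfkit  # third-party; imported lazily so the helper above works alone

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    # Mirrors the flag named in the error message; this is an assumption,
    # not a guaranteed cure for SslHandshakeFailedError.
    options = {"load-error-handling": "ignore"}
    pdfkit.from_url(
        ensure_scheme(url), output_path, configuration=config, options=options
    )
```

With this in place the call would look like convert("www.imdb.com", "imdb.pdf", path_wkhtmltopdf).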
I am trying to read geotagging data from an online live stream. Here is my code:
import exiftool
def getVideo(url):
    with exiftool.ExifToolHelper() as et:
        metadata = et.getmetadata(url)
        print(metadata)
getVideo("url/to/stream")
However, I got this error:
Traceback (most recent call last):
File "C:\Users\alexa\Documents\vtest2.py", line 9, in <module>
getVideo("url/to/stream")
File "C:\Users\alexa\Documents\vtest2.py", line 4, in getVideo
with exiftool.ExifToolHelper() as et:
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\exiftool\helper.py", line 101, in __init__
super().__init__(**kwargs)
File "C:\Python311\Lib\site-packages\exiftool\exiftool.py", line 300, in __init__
self.executable = executable or constants.DEFAULT_EXECUTABLE
^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\exiftool\exiftool.py", line 374, in executable
raise FileNotFoundError(f'"{new_executable}" is not found, on path or as absolute path')
FileNotFoundError: "exiftool.exe" is not found, on path or as absolute path
Is there a better way to read metadata from a live stream?
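The FileNotFoundError above says the exiftool executable itself was not found, which is separate from the Python package. A minimal sketch of checking for it first, assuming the binary is installed somewhere and only missing from PATH; locate_exiftool and read_metadata are hypothetical names, and whether exiftool can read a stream URL at all is not addressed here.

```python
import shutil


def locate_exiftool(candidates=("exiftool", "exiftool.exe")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


def read_metadata(path):
    """Read metadata for path, passing the executable location explicitly."""
    import exiftool  # third-party PyExifTool, imported lazily

    executable = locate_exiftool()
    if executable is None:
        raise FileNotFoundError("exiftool executable not found on PATH")
    # The traceback shows __init__ accepting an `executable` argument.
    with exiftool.ExifToolHelper(executable=executable) as et:
        return et.get_metadata(path)
```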
I am trying to load sound files in a Python script.
I installed and loaded the right versions of the modules, successfully imported librosa and soundfile, and even installed ffmpeg (which I found suggested as a solution to this same error for mp3 files).
import os
import json
from scipy import signal
import librosa
[...]
y, sr = librosa.load(directory_path + filename + '.webm', mono = True)
My code works in notebooks (Kaggle), but for reasons I can't figure out it doesn't work when run on a computer cluster, and I really need the cluster for its computational power and memory.
The error:
Traceback (most recent call last):
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/librosa/core/audio.py", line 146, in load
with sf.SoundFile(path) as sf_desc:
File "/cluster/home/.local/lib/python3.8/site-packages/soundfile.py", line 740, in __init__
self._file = self._open(file, mode_int, closefd)
File "/cluster/home/.local/lib/python3.8/site-packages/soundfile.py", line 1264, in _open
_error_check(_snd.sf_error(file_ptr),
File "/cluster/home/.local/lib/python3.8/site-packages/soundfile.py", line 1455, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'input/00092.webm': File contains data in an unknown format.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "analysis_python.py", line 47, in <module>
y,sr = librosa.load(directory_path + filename + '.webm')
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/librosa/core/audio.py", line 163, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/librosa/core/audio.py", line 187, in __audioread_load
with audioread.audio_open(path) as input_file:
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/audioread/__init__.py", line 116, in audio_open
raise NoBackendError()
audioread.exceptions.NoBackendError
I perused the internet for a solution but no luck. Thank you for your help in advance.
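NoBackendError means audioread found no backend able to decode the file. One possibility, stated as an assumption, is that the ffmpeg binary is not on PATH inside the cluster job even though a Python package was installed. A minimal sketch that checks for the binary and converts the file to WAV before loading; ffmpeg_command and webm_to_wav are hypothetical helpers.

```python
import os
import shutil
import subprocess


def ffmpeg_command(src, dst):
    """Build the ffmpeg argument list that converts src to dst, overwriting dst."""
    return ["ffmpeg", "-y", "-i", src, dst]


def webm_to_wav(src):
    """Convert src to a .wav file next to it and return the new path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg binary not found on PATH")
    dst = os.path.splitext(src)[0] + ".wav"
    subprocess.run(ffmpeg_command(src, dst), check=True)
    return dst
```

The resulting WAV path could then be passed to librosa.load in place of the .webm path.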
Traceback (most recent call last):
File "C:\Users\Balmeet\PycharmProjects\text&checkBox\venv\lib\site-packages\pdf2image\pdf2image.py", line 441, in pdfinfo_from_path
proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE)
File "C:\Users\Balmeet\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 756, in __init__
restore_signals, start_new_session)
File "C:\Users\Balmeet\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1155, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Balmeet/PycharmProjects/tables/final.py", line 470, in <module>
pdf2img2.convert_to_image(pdf_file)
File "C:\Users\Balmeet\PycharmProjects\tables\pdf2img2.py", line 42, in convert_to_image
pages = convert_from_path(pdf_file, 600)
File "C:\Users\Balmeet\PycharmProjects\text&checkBox\venv\lib\site-packages\pdf2image\pdf2image.py", line 97, in convert_from_path
page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
File "C:\Users\Balmeet\PycharmProjects\text&checkBox\venv\lib\site-packages\pdf2image\pdf2image.py", line 468, in pdfinfo_from_path
"Unable to get page count. Is poppler installed and in PATH?"
pdf2image.exceptions.PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?
I have followed the steps mentioned in a similar post and set the path for poppler, but I am still getting this error.
Thanks in advance :)
Download Poppler and save it in the folder where you have your scripts, then try executing the code below.
from pdf2image import convert_from_path
pages = convert_from_path('dummy.pdf', poppler_path='poppler-20.12.1\\bin')
# Number the pages from 1 so each page is saved to its own file
for n, page in enumerate(pages, start=1):
    page.save(f'page{n}.png', 'PNG')
I have tested this and the PDF file was converted to images.
Adding this to my code worked for me. Please download poppler from here and extract it: https://blog.alivate.com.au/poppler-windows/
pages = convert_from_path('document1.pdf', poppler_path='C:/Users/poppler-0.68.0/bin')
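Before passing poppler_path, it can help to confirm that the folder really contains the tools pdf2image calls. A minimal sketch, assuming the pdfinfo and pdftoppm binaries are the ones needed; missing_poppler_tools is a hypothetical helper, not part of pdf2image.

```python
import os


def missing_poppler_tools(poppler_path, tools=("pdfinfo", "pdftoppm")):
    """Return the tool names that are absent from poppler_path."""
    missing = []
    for tool in tools:
        # Accept both the bare name and the Windows .exe name.
        names = (tool, tool + ".exe")
        if not any(os.path.isfile(os.path.join(poppler_path, n)) for n in names):
            missing.append(tool)
    return missing
```

An empty list means the folder looks usable; a non-empty list usually means poppler_path points at the extracted folder rather than its bin subfolder.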
I am trying to make an exe file with py2exe. Here is my code:
import requests
from bs4 import BeautifulSoup
import csv
def get_html(url):
    r = requests.get(url)
    return r.text
url = 'https://loadtv.info/'
html=get_html(url)
soup=BeautifulSoup(html, 'html.parser')
article = soup.findAll('p')
text=[]
article_text = ''
for element in article:
    text.append(element.getText().encode('utf8'))
pages=soup.findAll('a',class_='more-link')
links=[]
for page in pages :
    links.append(page.get('href'))
zo=[]
for i in links:
    z=get_html(i)
    soup=BeautifulSoup(z, 'html.parser')
    t=[]
    r=soup.findAll('p')
    w=[]
    tex=[]
    for i in r :
        w.append(i.getText().encode('utf8'))
    for i in range(3,6):
        tex.append(w[i])
    zo.append(tex)
video=[]
for i in links:
    z=get_html(i)
    soup=BeautifulSoup(z, 'html.parser')
    vid = soup.find("iframe").get('src')
    video.append(vid)
da={'links': links,'text' :zo,'video':video}
with open('C:\\2\\123.txt','wb') as f:
    writer=csv.writer(f)
    for i in da:
        for rows in da[i]:
            writer.writerow([rows])
I get the following error in log file after compilation:
Traceback (most recent call last):
File "films.py", line 12, in <module>
File "films.py", line 9, in resource_path
NameError: global name 'os' is not defined
Traceback (most recent call last):
File "films.py", line 15, in <module>
File "films.py", line 9, in get_html
File "requests\api.pyc", line 70, in get
File "requests\api.pyc", line 56, in request
File "requests\sessions.pyc", line 488, in request
File "requests\sessions.pyc", line 609, in send
File "requests\adapters.pyc", line 497, in send
requests.exceptions.SSLError: [Errno 2] No such file or directory
I made the exe by running python setup.py py2exe in cmd.
I have no idea how to fix it. The problem is related to the SSL certificate, but I have added it to my setup modules. Please help.
setup.py:
from distutils.core import setup
import py2exe
setup(
    windows=[{"script":"films.py"}],
    options={"py2exe": {"includes":["bs4","requests","csv"]}},
    zipfile=None
)
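The SSLError with "No such file or directory" suggests that requests cannot find its CA bundle inside the frozen exe, and the first log shows resource_path failing because os was never imported. A minimal sketch of one common approach, assuming a copy of cacert.pem is shipped next to the exe (that file placement is an assumption, and this get_html variant with a ca_bundle parameter is hypothetical):

```python
import os
import sys


def resource_path(name):
    """Resolve name next to the exe when frozen, else in the current directory."""
    if getattr(sys, "frozen", False):
        # py2exe sets sys.frozen; the exe location is the base directory.
        base = os.path.dirname(os.path.abspath(sys.executable))
    else:
        base = os.path.abspath(".")
    return os.path.join(base, name)


def get_html(url, ca_bundle="cacert.pem"):
    """Fetch url, pointing requests at an explicitly located CA bundle."""
    import requests  # third-party, imported lazily

    r = requests.get(url, verify=resource_path(ca_bundle))
    return r.text
```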
In Python I am trying to copy a directory (actually, it's the Jenkins jobs directory), but it contains symbolic links. When I copy it I get:
Traceback (most recent call last):
File "BackupJenkinsJobs.py", line 272, in <module>
main()
File "BackupJenkinsJobs.py", line 208, in main
distutils.dir_util.copy_tree(JenkinsJobSrc, cleanJobsDir, preserve_symlinks=False)
File "c:\Python27\lib\distutils\dir_util.py", line 163, in copy_tree
verbose=verbose, dry_run=dry_run))
File "c:\Python27\lib\distutils\dir_util.py", line 163, in copy_tree
verbose=verbose, dry_run=dry_run))
File "c:\Python27\lib\distutils\dir_util.py", line 167, in copy_tree
dry_run=dry_run)
File "c:\Python27\lib\distutils\file_util.py", line 148, in copy_file
_copy_file_contents(src, dst)
File "c:\Python27\lib\distutils\file_util.py", line 32, in _copy_file_contents
fsrc = open(src, 'rb')
IOError: [Errno 22] invalid mode ('rb') or filename: 'C:\\Program Files (x86)\\Jenkins\\jobs\\AutoRunTemplate\\builds\\lastFailedBuild'
I am using the following code:
try:
    distutils.dir_util.copy_tree(JenkinsJobSrc, cleanJobsDir, preserve_symlinks=False)
except distutils.errors.DistutilsFileError as e:
    print("Unable to copy Jenkins jobs. Error: {}".format(e))
    return
Any assistance on how to copy while ignoring the links would be appreciated, as preserve_symlinks doesn't appear to work.
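One way to copy while ignoring the links is to walk the tree by hand and skip anything os.path.islink reports as a link. This is a minimal sketch, assuming Python 3 (on Python 2.7 under Windows os.path.islink always returns False, so it would not help there); copy_tree_skip_links is a hypothetical name.

```python
import os
import shutil


def copy_tree_skip_links(src, dst):
    """Copy src into dst, skipping symbolic links; return the copied file paths."""
    copied = []
    for root, dirs, files in os.walk(src):
        # Drop linked directories so they are neither created nor descended into.
        dirs[:] = [d for d in dirs if not os.path.islink(os.path.join(root, d))]
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            if os.path.islink(source):
                continue  # e.g. a link such as lastFailedBuild
            target = os.path.join(target_dir, name)
            shutil.copy2(source, target)
            copied.append(target)
    return copied
```

A call such as copy_tree_skip_links(JenkinsJobSrc, cleanJobsDir) would stand in for the copy_tree call.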