Unable to generate PDF with wkhtmltopdf on a headless server - python

I have an HTML document that uses CSS files and around a dozen PNG files generated by matplotlib. To convert this HTML document into a PDF, I installed wkhtmltopdf on an AWS EC2 Linux instance, in the manner described here.
On the command line, this works:
wkhtmltopdf http://www.google.com output.pdf
However, the following Python snippet does not work.
import os
import pathlib

import pdfkit

reportCss = pathlib.Path(os.path.join(cssDir, 'report.css'))
w3Css = pathlib.Path(os.path.join(cssDir, 'w3.css'))
# wkhtmltopdf options passed through pdfkit
options = {
    'page-size': 'A4',
    'dpi': 720,
    'margin-bottom': 5
}
css = [str(reportCss), str(w3Css)]
pdfkit.from_file(htmlFilePath, pdfFilePath, options=options, css=css)
I get this error:
QPainter::begin(): Returned false
Traceback (most recent call last):
File "dailyemail.py", line 55, in <module>
main()
File "dailyemail.py", line 44, in main
pdfFileName = pdfgen.buildPdfDoc()
File "/home/ubuntu/demo/py/pdfgen.py", line 58, in buildPdfDoc
pdfkit.from_file(htmlFilePath, pdfFilePath, options=options, css=css)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pdfkit/api.py", line 49, in from_file
return r.to_pdf(output_path)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pdfkit/pdfkit.py", line 181, in to_pdf
'%s ' %(' '.join(args)),e)
TypeError: not enough arguments for format string
Can you please advise what I am missing?

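It turned out that the PDF output folder did not exist. The error message is misleading: the TypeError is raised while pdfkit formats its own error report, so it hides the underlying wkhtmltopdf failure (QPainter::begin(): Returned false).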
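Since the root cause was a missing output folder, one defensive option is to create the parent directory before handing the path to pdfkit. This is a minimal sketch assuming Python 3.5 or later (the asker's Python 2.7 has no `exist_ok` argument); `ensure_parent_dir` is a hypothetical helper name, not part of pdfkit.

```python
import pathlib


def ensure_parent_dir(file_path):
    """Create the directory that will contain file_path if it is missing."""
    parent = pathlib.Path(file_path).parent
    # parents=True builds intermediate folders; exist_ok=True makes this a no-op
    # when the folder is already there.
    parent.mkdir(parents=True, exist_ok=True)
    return parent


# Hypothetical usage before the conversion call from the question:
# ensure_parent_dir(pdfFilePath)
# pdfkit.from_file(htmlFilePath, pdfFilePath, options=options, css=css)
```

Calling the helper repeatedly is harmless, so it can run on every report build.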

Related

How can I work with templates in python-pptx

I know this module is not very popular but if you know the answer then please help me out with it.
My code is:
from pptx import Presentation
prs = Presentation('template.pptx')
title_slide_layout = prs.slide_layout[0]
# print(len(prs.slide_layout))
slide = prs.slides.add_slide(title_slide_layout)
title = slide.shapes.title
subtitle = slide.placeholders[1]
title.text = "Python 3.6 - Turtle Race"
subtitle.text = "Data Analytics&Visualization with random generated data"
prs.save("out.pptx")
The error I get:
Traceback (most recent call last):
File "D:/!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton/turtleRace/presentationMaker.py", line 8, in <module>
prs = Presentation('template.pptx')
File "D:\!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton\turtleRace\venv\lib\site-packages\pptx\api.py", line 28, in Presentation
presentation_part = Package.open(pptx).main_document_part
File "D:\!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton\turtleRace\venv\lib\site-packages\pptx\opc\package.py", line 103, in main_document_part
return self.part_related_by(RT.OFFICE_DOCUMENT)
File "D:\!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton\turtleRace\venv\lib\site-packages\pptx\opc\package.py", line 136, in part_related_by
return self.rels.part_with_reltype(reltype)
File "D:\!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton\turtleRace\venv\lib\site-packages\pptx\opc\package.py", line 439, in part_with_reltype
rel = self._get_rel_of_type(reltype)
File "D:\!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton\turtleRace\venv\lib\site-packages\pptx\opc\package.py", line 491, in _get_rel_of_type
raise KeyError(tmpl % reltype)
KeyError: "no relationship of type 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument' in collection"
A picture of my project interpreter:
PICTURE
So why do I get this error?
The problem is the file type: the template was saved as a Strict Open XML Presentation. Save it as a standard PowerPoint Presentation instead.
You can get more information about the relationships inside the file using opc-diag.
You can resolve the error here.
To try to fix an old file:
Extract
unzip <FILE> -d old-file
Repackage it into a new fresh file
opc repackage bad-file new-file.docx
diff of relationships
opc diff-item test.docx test-ok.docx .rels
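If opc-diag is not at hand, the root relationships can also be inspected with the standard library, because a .pptx file is a zip archive. The sketch below lists the relationship types in `_rels/.rels` and flags a package that only carries the Strict variant. The function names are hypothetical, and the Strict URI is an assumption based on the `purl.oclc.org` namespaces that Strict Open XML files generally use; compare it against the output for your own file.

```python
import xml.etree.ElementTree as ET
import zipfile

# The type python-pptx looks for, copied from the KeyError in the question.
TRANSITIONAL_DOC = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)
# Assumed Strict equivalent; verify against your own file.
STRICT_DOC = "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument"


def root_relationship_types(package):
    """Return the Type attribute of every relationship in _rels/.rels."""
    with zipfile.ZipFile(package) as archive:
        root = ET.fromstring(archive.read("_rels/.rels"))
    # Match on the local tag name so the check does not depend on the namespace.
    return [
        element.get("Type")
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "Relationship"
    ]


def looks_strict(package):
    """True when only the Strict main-document relationship is present."""
    types = root_relationship_types(package)
    return STRICT_DOC in types and TRANSITIONAL_DOC not in types
```

A call such as `looks_strict("template.pptx")` returning `True` would be consistent with the KeyError in the question.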
I have found the solution!
Previously I had saved the file (called template) as a Strict Open XML Presentation (.pptx)
and not as a PowerPoint Presentation (.pptx).
The file now opens, but I get another error:
Traceback (most recent call last):
File "D:/!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!_!Piton/turtleRace/presentationMaker.py", line 9, in <module>
title_slide_layout = prs.slide_layout[0]
AttributeError: 'Presentation' object has no attribute 'slide_layout'
Everything else is the same; only the save format chosen in PowerPoint has changed.
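This second error looks like a naming slip rather than a template problem: python-pptx exposes the layout collection as `slide_layouts` (plural), not `slide_layout`. A minimal sketch of the corrected calls follows; `add_title_slide` is a hypothetical helper, and it assumes the template's first layout has a title plus a second placeholder, as in the question.

```python
def add_title_slide(prs, title_text, subtitle_text):
    """Add a slide based on the first layout and fill its two placeholders."""
    layout = prs.slide_layouts[0]  # plural: slide_layouts
    slide = prs.slides.add_slide(layout)
    slide.shapes.title.text = title_text
    slide.placeholders[1].text = subtitle_text
    return slide


# Hypothetical usage with the template from the question:
# from pptx import Presentation
# prs = Presentation("template.pptx")
# add_title_slide(prs, "Python 3.6 - Turtle Race", "Data Analytics & Visualization")
# prs.save("out.pptx")
```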

How to fix PackageNotFoundError for exe files

The error only appears after I convert the script to an exe; it works fine as a .py file.
I tried to find the missing file and replace it, but it still didn't work.
The error I get is:
Traceback (most recent call last):
File "tkinter\__init__.py", line 1705, in __call__
File "CompilerGui.py", line 259, in <lambda>
done = ttk.Button(window, text="Compile", command=lambda:finish(texts, window, search_folder))
File "CompilerGui.py", line 210, in finish
cb.the_main(q_list, values)
File "CompilerBase.py", line 323, in the_main
file_written = write_docx(values_dict, file_to_write)
File "CompilerBase.py", line 100, in write_docx
my_docx = docx.Document()
File "site-packages\docx\api.py", line 25, in Document
File "site-packages\docx\opc\package.py", line 128, in open
File "site-packages\docx\opc\pkgreader.py", line 32, in from_file
File "site-packages\docx\opc\phys_pkg.py", line 31, in __new__
docx.opc.exceptions.PackageNotFoundError: Package not found at 'C:\Users\LENOVO\AppData\Local\Temp\_MEI92522\docx\templates\default.docx'
In your .spec file, I think you can add:
datas= [ ('C:\\Program Files\\Python36\\Lib\\site-packages\\docx\\templates\\*', 'docx\\templates' ) ],
in the Analysis section, to add the missing file to your exe. This, of course, assumes that the missing default.docx is in the specified folder.
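The hard-coded path above only works if Python is installed in that exact location. Because a .spec file is ordinary Python, the source folder can be derived from wherever the package is actually installed. The following is a sketch using only the standard library; `package_datas` is a hypothetical helper name, and it assumes the package is importable in the environment that runs PyInstaller.

```python
import importlib.util
import os


def package_datas(package_name, subdir):
    """Build a PyInstaller-style datas list for one subfolder of a package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ImportError("%r was not found or is not a package" % package_name)
    package_dir = list(spec.submodule_search_locations)[0]
    source = os.path.join(package_dir, subdir, "*")  # files to bundle
    destination = os.path.join(package_name, subdir)  # folder inside the exe
    return [(source, destination)]


# Hypothetical use inside the .spec file (requires python-docx to be installed):
# datas = package_datas("docx", "templates")
```

The returned tuple has the same (source, destination) shape as the hand-written `datas` entry in the answer.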
I figured out the solution to the problem: it was looking for a folder that didn't exist. Here's how I fixed it: https://youtu.be/bB9RXak4eVY
Another fairly simple fix would be to just copy the default.docx into your app directory, change my_docx = docx.Document() to my_docx = docx.Document(docx='default.docx'), and add datas=[('default.docx', '.')] to your .spec file.

Cannot run tensorflow examples

I am trying to run this tensorflow example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/skflow/text_classification_character_cnn.py
However, it keeps failing when it tries to open the tar file. This is the error message I am getting:
Successfully downloaded dbpedia_csv.tar.gz 1613 bytes.
Traceback (most recent call last):
File "text_classification_character_cnn.py", line 110, in <module>
tf.app.run()
File "/Users/alechewitt/Envs/solar_detection/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "text_classification_character_cnn.py", line 87, in main
'dbpedia', test_with_fake_data=FLAGS.test_with_fake_data, size='large')
File "/Users/alechewitt/Envs/solar_detection/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/__init__.py", line 64, in load_dataset
return DATASETS[name](size, test_with_fake_data)
File "/Users/alechewitt/Envs/solar_detection/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/text_datasets.py", line 48, in load_dbpedia
maybe_download_dbpedia(data_dir)
File "/Users/alechewitt/Envs/solar_detection/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/text_datasets.py", line 40, in maybe_download_dbpedia
tfile = tarfile.open(archive_path, 'r:*')
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/tarfile.py", line 1672, in open
raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
Any help would be much appreciated.
When you get that error, open the downloaded dbpedia_csv.tar.gz in a text editor; you might find that it is actually a 404 web page (the log above reports only 1613 bytes downloaded, far too small for the dataset archive). The file you want seems to be available here as well (I found this link here):
https://drive.google.com/drive/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M
Download that file (at your own risk) and replace it manually. Then you can run your script again.
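The manual check described above can also be scripted. This sketch uses the standard library's `tarfile.is_tarfile` and peeks at the first bytes to spot an HTML error page saved under an archive name; `describe_download` is a hypothetical helper name, and the HTML detection is a rough heuristic.

```python
import tarfile


def describe_download(path):
    """Classify a downloaded file as 'tar', 'html', or 'unknown'."""
    if tarfile.is_tarfile(path):
        return "tar"
    with open(path, "rb") as handle:
        head = handle.read(512).lstrip().lower()
    # A saved error page usually starts with a doctype or an <html> tag.
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

If the result for dbpedia_csv.tar.gz is anything other than `"tar"`, the file needs to be replaced before the example script is run again.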
Open the tar file using its full path. By the way, the link you gave returns 404 Not Found.

SimpleIDML How to convert IDML to PDF?

I am new to InDesign CC Server. I have set up InDesign Server running on Windows. I need to convert IDML to PDF, but I am having issues.
I have used the SimpleIDML Python library to manipulate Adobe(r) IDML(r) files.
My sample script is:
I2P.py
from simple_idml.indesign import indesign
idml_file = "/home/user/Project/EPS/media/test/2-idml/test001.idml"
indd_file = "/home/user/Project/EPS/media/test/InDesigndocument/test001.indd"
url_path = "http://192.168.1.1:12345/"
client_dir = "/home/user/Project/EPS/media/source"
server_dir = "/home/user/Project/EPS/media/server"
response = indesign.save_as(indd_file, [{
"fmt": "pdf",
"params": {"colorSpace": "CMYK"},
}],
url_path,
client_dir,
server_dir)[0]
with open("my_file.pdf", "w+") as f:
f.write(response)
In the documentation:
response = indesign.save_as("/path_to_file.indd", [{
"fmt": "pdf",
"params": {"colorSpace": "CMYK"},
}],
"http://url-to-indesign-server:port",
"/path/to/client/workdir",
"/path/to/indesign-server/workdir")[0]
When I run the I2P script, it throws this error:
Traceback (most recent call last):
File "ItoP.py", line 12, in <module>
server_path)[0]
File "/home/user/eps2_env/local/lib/python2.7/site-packages/simple_idml/indesign/indesign.py", line 71, in new_func
logger, logger_extra)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/simple_idml/indesign/indesign.py", line 180, in save_as
responses = map(lambda fmt: _save_as(fmt), dst_formats_params)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/simple_idml/indesign/indesign.py", line 180, in <lambda>
responses = map(lambda fmt: _save_as(fmt), dst_formats_params)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/simple_idml/indesign/indesign.py", line 149, in _save_as
response = cl.service.RunScript(params)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/suds/client.py", line 542, in __call__
return client.invoke(args, kwargs)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/suds/client.py", line 602, in invoke
result = self.send(soapenv)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/suds/client.py", line 649, in send
result = self.failed(binding, e)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/suds/client.py", line 702, in failed
r, p = binding.get_fault(reply)
File "/home/user/eps2_env/local/lib/python2.7/site-packages/suds/bindings/binding.py", line 265, in get_fault
raise WebFault(p, faultroot)
suds.WebFault: Server raised fault: 'The specified script file can not be found: /home/user/Project/EPS/media/server/tmp9LVUWj/save_as.jsx'
Manually I can see the dynamically created dir tmp9LVUWj inside the server dir, so the path the server expects exists at that moment.
I am not able to figure out how to set the indesign-server/workdir, how to access it in code, or how to solve this. I have spent a lot of time on this and have not been able to find help or example code.
Or is there another Python package that converts IDML to PDF?
Thanks in advance
You wrote,
Manually I can see dynamically created dir tmp9LVUWj inside server
dir.
That is true, but that is not the error. It is stating that it cannot find a JSX file named save_as.jsx within that directory. Is that in fact the name of the JSX file that you were intending to place there, or the file that is residing there now?

HTTP header error using the Python SDK for Azure

I am getting started with the Microsoft Azure SDK for Python (https://github.com/Azure/azure-sdk-for-python), but I have run into a problem.
I am using Scientific Linux and installed the SDK for Python 3.4 with the following steps:
(inside the SDK directory)
python setup.py install
After that, I created a simple script just to test the connection:
from azure.storage import BlobService
blob_service = BlobService(account_name='thename', account_key='Mxxxxxxx3w==')
blob_service.create_container('testcontainer')
for i in blob_service.list_containers():
    print(i.name)
I followed this documentation:
http://blogs.msdn.com/b/tconte/archive/2013/04/17/how-to-interact-with-windows-azure-blob-storage-from-linux-using-python.aspx
http://azure.microsoft.com/en-us/documentation/articles/storage-python-how-to-use-blob-storage/#large-blobs
But it is not working; I always receive the same error:
python3 test.py
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/storage/storageclient.py", line 143, in _perform_request
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/storage/storageclient.py", line 132, in _perform_request_worker
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/http/httpclient.py", line 247, in perform_request
azure.http.HTTPError: The value for one of the HTTP headers is not in the correct format.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 21, in <module>
blob_service.create_container('testcontainer')
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/storage/blobservice.py", line 192, in create_container
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/__init__.py", line 905, in _dont_fail_on_exist
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/storage/blobservice.py", line 189, in create_container
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/storage/storageclient.py", line 150, in _perform_request
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/storage/__init__.py", line 889, in _storage_error_handler
File "/usr/local/lib/python3.4/site-packages/azure-0.9.0-py3.4.egg/azure/__init__.py", line 929, in _general_error_handler
azure.WindowsAzureError: Unknown error (The value for one of the HTTP headers is not in the correct format.)
<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.
RequestId:b37c5584-0001-002b-24b8-c2c245000000
Time:2014-11-19T14:54:38.9378626Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2012-02-12</HeaderValue></Error>
Thanks in advance and best regards.
I have this exact same issue. I believe it's a library bug, but the authors haven't commented yet.
The error response is not reporting the service's version; it is naming the header whose value was rejected: x-ms-version, sent as 2012-02-12. Its value should be "2014-02-14"; you can apply the fix shown in https://github.com/Azure/azure-sdk-for-python/pull/289 .
Hopefully this will be fixed and nobody will ever read this answer. Cheers!
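To see which header the service rejected without reading the raw XML by eye, the error body can be parsed with the standard library. This is only an illustrative sketch: `ERROR_BODY` is the response copied from the question, and `rejected_header` is a hypothetical helper, not part of the Azure SDK.

```python
import xml.etree.ElementTree as ET

# Error body copied from the traceback in the question.
ERROR_BODY = b"""<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.
RequestId:b37c5584-0001-002b-24b8-c2c245000000
Time:2014-11-19T14:54:38.9378626Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2012-02-12</HeaderValue></Error>"""


def rejected_header(error_xml):
    """Return (error code, header name, header value) from a storage error body."""
    root = ET.fromstring(error_xml)
    return (
        root.findtext("Code"),
        root.findtext("HeaderName"),
        root.findtext("HeaderValue"),
    )


print(rejected_header(ERROR_BODY))
# prints ('InvalidHeaderValue', 'x-ms-version', '2012-02-12')
```

The header value in the output is what the client sent, which matches the explanation that the SDK's x-ms-version is the part that needs updating.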
