Merging PDFs in python using convertapi

I'm trying to use the convertapi module to merge PDFs in Python 3.8. I've tried several approaches, but I can't figure out where the returned error comes from. Here is my function:
def merger(output_path, input_paths):
    dictFiles = {}
    for i, path in enumerate(input_paths):
        dictFiles[f'File[{i}]'] = path
    convertapi.api_secret = 'my-api-secret'
    result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
    result.save_files(output_path)
And here is the error that is returned:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 46, in handle_response
r.raise_for_status()
File "C:\Python\Python38\lib\site-packages\requests\models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://v2.convertapi.com/convert/pdf/to/merge?Secret=my-api-secret
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
File "D:\Desktop\merger.py", line 46, in merger
result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
File "C:\Python\Python38\lib\site-packages\convertapi\api.py", line 7, in convert
return task.run()
File "C:\Python\Python38\lib\site-packages\convertapi\task.py", line 26, in run
response = convertapi.client.post(path, params, timeout = timeout)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 16, in post
return self.handle_response(r)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 49, in handle_response
raise ApiError(r.json())
convertapi.exceptions.ApiError: Parameter validation error. Code: 4000. {'Files': ['Files array item count must be greater than 0.']}
I suspect the error comes from the dict being built before the merge call, because when I pass the dictionary literal directly to convertapi.convert(), I get a different error:
def merger(output_path, input_paths):
    convertapi.api_secret = 'my-api-secret'
    convertapi.convert('merge', {
        'Files[0]': 'path/to/file1.pdf',
        'Files[1]': 'path/to/file2.pdf'
    }, from_format = 'pdf').save_files(output_path)
This time the error is:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 46, in handle_response
r.raise_for_status()
File "C:\Python\Python38\lib\site-packages\requests\models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://v2.convertapi.com/convert/pdf/to/merge?Secret=my-api-secret
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
File "D:\Desktop\merger.py", line 50, in merger
convertapi.convert('merge', {
File "C:\Python\Python38\lib\site-packages\convertapi\api.py", line 7, in convert
return task.run()
File "C:\Python\Python38\lib\site-packages\convertapi\task.py", line 26, in run
response = convertapi.client.post(path, params, timeout = timeout)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 16, in post
return self.handle_response(r)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 49, in handle_response
raise ApiError(r.json())
convertapi.exceptions.ApiError: Unable to download remote file. Code: 5007.
Note that I'm not using PyPDF2 to merge the files because I get errors when a file contains certain characters (mostly Chinese characters).

If you go to https://www.convertapi.com/pdf-to-merge and scroll down, you will find a snippet builder. Among the language snippets there is one for Python:
convertapi.api_secret = 'Your_secret'
convertapi.convert('merge', {
    'Files[0]': '/path/to/dpa.pdf',
    'Files[1]': '/path/to/sample.pdf'
}, from_format = 'pdf').save_files('/path/to/dir')
If you compare that snippet with your code, you will see that the array key is the plural Files, not the singular File used in your code. That is why the API reports an empty Files array:
def merger(output_path, input_paths):
    dictFiles = {}
    for i, path in enumerate(input_paths):
        dictFiles[f'File[{i}]'] = path  # singular 'File' here; the API expects 'Files'
    convertapi.api_secret = 'my-api-secret'
    result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
    result.save_files(output_path)
convertapi.exceptions.ApiError: Parameter validation error. Code: 4000. {'Files': ['Files array item count must be greater than 0.']}
As for the second error, you didn't provide the code so I can't help you.
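A corrected version of the function from the question could look like the sketch below. It only changes the key to the plural Files[i] form used by the snippet builder. The helper name build_merge_params is hypothetical (it is not part of convertapi), and 'my-api-secret' is the placeholder from the question.

```python
def build_merge_params(input_paths):
    """Map each input path to the plural 'Files[i]' key used by the merge snippet.

    Hypothetical helper for illustration; it is not part of the convertapi package.
    """
    return {f'Files[{i}]': path for i, path in enumerate(input_paths)}


def merger(output_path, input_paths):
    # Imported inside the function so build_merge_params can be checked
    # on its own, without the convertapi package installed.
    import convertapi

    convertapi.api_secret = 'my-api-secret'  # placeholder, as in the question
    result = convertapi.convert('merge', build_merge_params(input_paths), from_format='pdf')
    result.save_files(output_path)
```

Building the dict in a separate step is not a problem in itself; only the key name matters.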

Related

Executed a model using Python on Triton Inference Server, getting a JSON parser issue

python3 /home/ubuntu/Deepthi/triton-inference-server-master/src/clients/python/examples/image_client.py -m /home/ubuntu/Deepthi/triton-inference-server-master/docs/model_repository/resnet50_netdef/1 -s INCEPTION /home/ubuntu/Deepthi/triton-inference-server-master/qa/images/mug.jpg
Traceback (most recent call last):
File "/home/ubuntu/Deepthi/triton-inference-server-master/src/clients/python/examples/image_client.py", line 403, in <module>
model_name=FLAGS.model_name, model_version=FLAGS.model_version)
File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonhttpclient/__init__.py", line 471, in get_model_metadata
_raise_if_error(response)
File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonhttpclient/__init__.py", line 57, in _raise_if_error
error = _get_error(response)
File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonhttpclient/__init__.py", line 46, in _get_error
error_response = json.loads(response.read())
rapidjson.JSONDecodeError: Parse error at offset 0: The document is empty.
I don't understand why I am getting this error; everything is set up.

How to detect/trap error codes from convertapi so that my python app doesn't fail?

First, my apologies as a Python newbie that I'm asking this question. It probably has nothing at all to do with convertapi and more to do with my basic lack of understanding as to how to interact with APIs.
I'm reading a Google sheet to find embedded hyperlinks containing references to files (PDF, html, whatever) and then using convertapi to get a txt version so that I can do content analysis based on existence, count and proximity of various terms.
My question has to do with convertapi.convert failing because (in this case) convertapi considers the PDF invalid (I tested the file at convertapi.com and it returned a 5002 error). I don't dispute that the file may be bad; all I want is to detect that convertapi.convert can't convert the file so that I can ignore it and move on.
My Python code has a small function:
def convert_PDF_to_text(inputfilename):
    result = convertapi.convert('txt', { 'File': inputfilename }, from_format = 'pdf')
    result.save_files('converted_pdf_files')
...and while it works fine for some inputs there is a particular URL PDF that results in this output (including my own messages from program):
about to call convertapi.convert with filename (https://www.epa.gov/sites/production/files/2016-06/documents/2016_policy_order_revision_6-10-16.pdf)
yes this is the specific file causing the problem: https://www.epa.gov/sites/production/files/2016-06/documents/2016_policy_order_revision_6-10-16.pdf
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/client.py", line 46, in handle_response
r.raise_for_status()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://v2.convertapi.com/convert/pdf/to/txt?Secret=PIuLcqNVL8w4rc9Y
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./p1.py", line 244, in <module>
convert_PDF_to_text(source_URL)
File "./p1.py", line 63, in convert_PDF_to_text
result = convertapi.convert('txt', { 'File': inputfilename }, from_format = 'pdf')
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/api.py", line 7, in convert
return task.run()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/task.py", line 26, in run
response = convertapi.client.post(path, params, timeout = timeout)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/client.py", line 16, in post
return self.handle_response(r)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/client.py", line 49, in handle_response
raise ApiError(r.json())
convertapi.exceptions.ApiError: <exception str() failed>
I know it should be obvious just from the errors what I should check...but I'm too much of a newbie to Python and APIs to know how to decipher.
How do I test for errors so that my Python code doesn't abort?
Thanks in advance and again sorry for the basic question - yes I did search for answers and don't find anyone addressing my question, it's likely too simple...

All - disregard. I used try: & except: to manage this.
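For anyone landing here, the try/except approach could be sketched roughly as follows. The helper name try_convert and its parameters are assumptions for illustration, not part of convertapi. The conversion function is passed in, so the wrapper works with the convert_PDF_to_text function from the question.

```python
import logging

logger = logging.getLogger(__name__)


def try_convert(convert_one, source, handled_errors=(Exception,)):
    """Run convert_one(source) and report whether it succeeded.

    Returns True on success and False if one of handled_errors was raised.
    Any other exception still propagates to the caller.
    """
    try:
        convert_one(source)
    except handled_errors as err:
        # Log only the exception type: in the traceback above, str() on the
        # ApiError itself failed, so the message is not formatted here.
        logger.warning("Skipping %s (%s)", source, type(err).__name__)
        return False
    return True
```

With convertapi this would be called as try_convert(convert_PDF_to_text, source_URL, handled_errors=(convertapi.exceptions.ApiError,)), so that only the API error seen in the traceback is swallowed and the loop over the sheet's links can continue.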

Difficulty creating an XLSM file

I'm trying to create an xlsm file using xlwings, or openpyxl if it is not possible with xlwings.
I'm on Mac so I can't use PyWin32.
I'm running the following python code:
import xlwings as xw
b = xw.Book()
b.save('test_book2.xlsm')
When I run that code I receive the message:
When I click 'yes' to that message I receive the following error:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aeosa/appscript/reference.py", line 482, in __call__
return self.AS_appdata.target().event(self._code, params, atts, codecs=self.AS_appdata).send(timeout, sendflags)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aeosa/aem/aemsend.py", line 92, in send
raise EventError(errornum, errormsg, eventresult)
aem.aemsend.EventError: Command failed: Parameter error. (-50)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/matthewbell/Desktop/test.py", line 4, in <module>
b.save('test_book2.xlsm')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/xlwings/main.py", line 704, in save
return self.impl.save(path)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/xlwings/_xlmac.py", line 244, in save
self.xl.save_workbook_as(filename=hfs_path, overwrite=True)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aeosa/appscript/reference.py", line 518, in __call__
raise CommandError(self, (args, kargs), e, self.AS_appdata) from e
appscript.reference.CommandError: Command failed:
OSERROR: -50
MESSAGE: Parameter error.
COMMAND: app(pid=6198).workbooks['Book1'].save_workbook_as(filename='Macintosh HD:Users:matthewbell:Desktop:test_book2.xlsm', overwrite=True)
What can I do to create an xlsm file?

Access neo4j via python-joern

When using Joern, I accessed the Neo4j database via python-joern with the following code.
from joern.all import JoernSteps
j = JoernSteps()
j.setGraphDbURL('http://localhost:7474/db/data/')
j.connectToDatabase()
res = j.runGremlinQuery('getFunctionsByName("main")')
for r in res: print r
This produces the following error:
Traceback (most recent call last):
File "test.py", line 11, in <module>
res = j.runGremlinQuery('getFunctionsByName("main")')
File "/home/binbin/Downloads/python-joern-0.3.1/joern/all.py", line 44, in runGremlinQuery
return self.gremlin.execute(finalQuery)
File "/usr/local/lib/python2.7/dist-packages/py2neo-2.0-py2.7-linux-x86_64.egg/py2neo/ext/gremlin/__init__.py", line 36, in execute
response = self.resources["execute_script"].post({"script": script})
File "/usr/local/lib/python2.7/dist-packages/py2neo-2.0-py2.7-linux-x86_64.egg/py2neo/core.py", line 288, in post
raise_from(self.error_class(message, **content), error)
File "/usr/local/lib/python2.7/dist-packages/py2neo-2.0-py2.7-linux-x86_64.egg/py2neo/util.py", line 215, in raise_from
raise exception
py2neo.error.NoClassDefFoundError: javax/transaction/SystemException
How to fix it?

After a lot of searching, I finally found the solution here: https://github.com/fabsx00/python-joern/issues/14. Anyone who has the same problem can refer to it.

KeyError when assigning 'praw.Reddit' to variable

I could successfully connect to reddit's servers with oauth2 some time ago, but when running my script just now, I get a KeyError followed by a NoSectionError. The code is below, followed by the exceptions (the code has been reduced to its essentials).
import praw
# Configuration
APP_UA = 'useragent'
...
...
...
r = praw.Reddit(APP_UA)
Error message:
Traceback (most recent call last):
File "D:\Directory\Python\lib\configparser.py", line 843, in items
d.update(self._sections[section])
KeyError: 'useragent'
A NoSectionError occurred when handling the above exception.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Directory\Python\Projects\myprj for Reddit, globaloffensive\oddshotcrawler.py", line 19, in <module>
r = praw.Reddit(APP_UA)
File "D:\Directory\Python\lib\site-packages\praw\reddit.py", line 84, in __init__
**config_settings)
File "D:\Directory\Python\lib\site-packages\praw\config.py", line 47, in __init__
raw = dict(Config.CONFIG.items(site_name), **settings)
File "D:\Directory\Python\lib\configparser.py", line 846, in items
raise NoSectionError(section)
configparser.NoSectionError: No section: 'useragent'
[Finished in 0.2s]

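The first positional argument of praw.Reddit is treated as a site name, which is why the traceback ends in Config.CONFIG.items(site_name) and a missing section. Try giving it a user_agent kwarg instead:
r = praw.Reddit(user_agent=APP_UA)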
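The KeyError followed by NoSectionError in the traceback comes from configparser, and the same pair can be reproduced with the standard library alone. The sketch below is not PRAW's actual code. The section name bot1 and its values are made up for illustration, and section_settings is a hypothetical helper that mirrors the dict(Config.CONFIG.items(site_name)) line in the traceback.

```python
import configparser


def section_settings(parser, site_name):
    """Return the settings of one config section (plus defaults) as a dict."""
    return dict(parser.items(site_name))


parser = configparser.ConfigParser()
# Made-up configuration for illustration only.
parser.read_string(
    "[DEFAULT]\n"
    "check_for_updates=True\n"
    "[bot1]\n"
    "user_agent=example agent\n"
)

try:
    # 'useragent' is not a section in the config, so the lookup fails
    # the same way as in the traceback.
    section_settings(parser, 'useragent')
except configparser.NoSectionError as err:
    missing = err.section
```

Looking up an existing section, such as bot1 here, returns its options merged with the defaults.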
