I want to use PDFMiner to extract text from PDF files. I have downloaded pdfminer-20131113 and installed Python in C:\python34.
In cmd, I change to the directory that contains PDFMiner's setup.py file
and run the following command:
python setup.py install
But I am getting the below error.
D:\pdfminer-20101226>python setup.py install
Traceback (most recent call last):
File "setup.py", line 3, in <module>
from pdfminer import __version__
File "D:\pdfminer-20101226\pdfminer\__init__.py", line 4
if __name__ == '__main__': print __version__
^
SyntaxError: invalid syntax
It seems to be an error in PDFMiner's setup.py file, and I am not sure how to resolve it.
Also, I saw a pdf2txt.py file in PDFMiner's build folder. I tried running it as pdf2txt.py -o output.html pdffilename.pdf (with the full path), but instead of converting the PDF, it just opens the pdf2txt.py file.
The PDFMiner project homepage states:
Written entirely in Python. (for version 2.4 or newer)
and further down:
Install Python 2.4 or newer. (Python 3 is not supported.)
so you'll have to install Python 2 to run this project.
Alternatively, you could try the Python 3 port, pdfminer3k; it hasn't seen any updates in 20 months, while PDFMiner does have more recent releases, so your mileage may vary.
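The incompatibility can be reproduced without installing anything. The following is a minimal sketch, assuming you run it under Python 3: it compiles a small source string and reports the line of the first syntax error. The sample text is a hypothetical stand-in for pdfminer/__init__.py (only its last line is taken from the traceback in the question), and first_syntax_error_line is a name made up for this illustration.

```python
def first_syntax_error_line(source, filename="<sample>"):
    """Return the line number of the first SyntaxError, or None if it compiles."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


# Hypothetical stand-in for pdfminer/__init__.py; only the final line
# comes from the traceback shown in the question.
PY2_SAMPLE = (
    "#!/usr/bin/env python\n"
    "__version__ = '20131113'\n"
    "\n"
    "if __name__ == '__main__': print __version__\n"
)

# The same file with the print statement rewritten as a function call.
PY3_SAMPLE = PY2_SAMPLE.replace("print __version__", "print(__version__)")

if __name__ == "__main__":
    print(first_syntax_error_line(PY2_SAMPLE))  # a line number under Python 3
    print(first_syntax_error_line(PY3_SAMPLE))  # None: compiles cleanly
```

Because setup.py imports the package to read its version, the interpreter hits this syntax error before any installation step runs.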
This should solve your problem in Python 3:
pip install pdfminer.six
pdfminer.six is a fork with Python 2+3 support using six. Last commit was 15 days ago.
When I run
python3 -m pip install pyspatialite
I get the following error:
Collecting pyspatialite
Using cached https://files.pythonhosted.org/packages/cc/2a/ffb126f3e8890ab0da951a83906e54528a13ce4b913303dea8bed904e160/pyspatialite-3.0.1-alpha-0.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-14jnmfoo/pyspatialite/setup.py", line 66
print "Is sphinx installed? If not, try 'sudo easy_install sphinx'."
^
SyntaxError: Missing parentheses in call to 'print'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-14jnmfoo/pyspatialite/
I don't understand the issue. Is there a syntax error in the module PySpatiaLite? What do I do about it?
I am using Python 3.5 and Linux Bash Shell in Windows 10. If there is any additional info needed, let me know in the comments and I will edit the question.
This seems to be a known issue with Python 3:
https://github.com/lokkju/pyspatialite/issues/27
print "Is sphinx installed? If not, try 'sudo easy_install sphinx'."
It seems this library was written for Python 2.7, since it uses the Python 2 print statement. When pip3 runs the library's setup.py, the error you are receiving:
SyntaxError: Missing parentheses in call to 'print'
is entirely expected, as the correct Python 3 syntax would be:
print("Is sphinx installed? If not, try 'sudo easy_install sphinx'.")
You can either switch to Python 2.7 for writing code to interface with this, or reach out to the contributors for assistance. Looking at their documentation on PyPI (https://pypi.org/project/pyspatialite/), it looks like the project is still in alpha and has not had a new release since 2013. I wouldn't expect much in terms of Python 3 compatibility without forking the source and correcting it yourself.
EDIT
Looking at the GitHub commits (https://github.com/lokkju/pyspatialite/commits/master), a small number of commits have been merged since 2013, but I would still not expect Python 3 support.
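If you do fork the source, the simplest print statements can be corrected mechanically. Below is a rough sketch, not a complete converter: fix_print_statements is a hypothetical helper that only rewrites lines that begin with a bare print statement. It does not handle forms such as print >>stream, print statements that follow other code on the same line, or statements spanning several lines, so review the result by hand.

```python
import re

# Matches a line that starts (after indentation) with "print" followed by
# whitespace and something other than an opening parenthesis.
_PRINT_STMT = re.compile(
    r"^([ \t]*)print[ \t]+(?![ \t(])(.+?)[ \t]*$", re.MULTILINE
)


def fix_print_statements(source):
    """Wrap the argument of simple Python 2 print statements in parentheses."""
    return _PRINT_STMT.sub(r"\1print(\2)", source)


if __name__ == "__main__":
    old = "print \"Is sphinx installed? If not, try 'sudo easy_install sphinx'.\"\n"
    new = fix_print_statements(old)
    print(new, end="")
    compile(new, "<fixed>", "exec")  # raises SyntaxError if the fix failed
```

Lines that already call print(...) are left untouched, so running the helper twice does not change the output again.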
I am starting to use databases and am using MariaDB. I have that ready, but I want Python integration so I can get started on the program.
I have downloaded PyMySQL-0.7.10.tar.gz from the official Python website,
unzipped it, and navigated to the directory in a Command Prompt using cd (dir).
My command is:
"G:\Python\Portable\Portable Python 3.2.5.1\App\python.exe" setup.py install
(Yes, I am on Windows 10 and I am using Portable Python. This is because I learn it at school but also want to be able to work on it at home.)
The error it returns is the following:
Traceback (most recent call last):
File "setup.py", line 4, in <module>
version_tuple = __import__('pymysql').VERSION
File "C:\Users\Natan Samuel Geldorp.Remytop-PC\Downloads\PyMySQL-0.7.10\pymysql\__init__.py", line 28, in <module>
from .converters import escape_dict, escape_sequence, escape_string
File "C:\Users\Natan Samuel Geldorp.Remytop-PC\Downloads\PyMySQL-0.7.10\pymysql\converters.py", line 60
_escape_table[0] = u'\\0'
^
SyntaxError: invalid syntax
Does anyone know how to fix this?
-Natan
As requested, as an answer:
You need a newer Python version. In Python 3.0 to 3.2 the u prefix was forbidden; it was allowed again in version 3.3.
Since Portable Python is no longer being developed (according to their site) you need to pick an alternative. One suggested on the site is WinPython.
(Also the PyMySQL site states that the minimum required Python versions are either >= 2.6 or >= 3.3)
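The version rule above can be written as a small check. This is only a sketch: supports_u_prefix is a hypothetical helper, not part of PyMySQL, and it encodes nothing beyond the statement that the u prefix is rejected on Python 3.0 to 3.2.

```python
import sys


def supports_u_prefix(version_info=sys.version_info):
    """Return True if this Python version accepts literals such as u'\\0'."""
    major, minor = version_info[0], version_info[1]
    # Python 2 has the prefix; Python 3 accepts it again from 3.3 onward.
    return major == 2 or (major, minor) >= (3, 3)


if __name__ == "__main__":
    print(supports_u_prefix((3, 2, 5)))  # the Portable Python version in the question
    print(supports_u_prefix())           # the interpreter running this script
```

On any Python 3 interpreter that accepts the prefix, u'\\0' and '\\0' are the same str value, so the line from converters.py behaves as intended once the interpreter is new enough.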
I have been using WinPython 2.2.5 with Python 2.7 and it works nicely. The problem I have is installing additional libraries from the https://pypi.python.org repository.
For example, I tried to install pdfminer, which is at the following link: https://pypi.python.org/pypi/pdfminer/
I have read that I can use pip install, which is in the following path on my computer:
C:\WinPython-32bit-2.7.6.3\python-2.7.6\Scripts
I saved the pdfminer tar.gz file in that directory, and from the Windows command prompt in that same directory I typed:
pip install pdfminer(version number).tar.gz
It seems to work fine, because there are no error messages, but when I open WinPython and type the following in the command shell:
pdf2txt
to see if it works I got the following error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'pdf2txt' is not defined
What am I doing wrong?
According to the documentation, "PDFMiner comes with two handy tools: pdf2txt.py and dumppdf.py." So, instead of trying to run pdf2txt.py by importing it, you need to run it as it shows in the example in the documentation, like this:
$ pdf2txt.py -o output.html samples/naacl06-shinyama.pdf
where output.html is the file that is created from the mined text, and samples/naacl06-shinyama.pdf is the PDF you want to mine.
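If you would rather stay inside the Python shell, one option is to launch the script as a separate process instead of typing its name as if it were a variable. The sketch below makes several assumptions: build_pdf2txt_command and run_pdf2txt are hypothetical helpers, and the script path is a placeholder, because where pip puts pdf2txt.py depends on your installation. Only the -o option shown in the documentation example is used.

```python
import subprocess
import sys


def build_pdf2txt_command(script_path, pdf_path, output_path):
    """Build the argument list for running pdf2txt.py with this interpreter."""
    return [sys.executable, script_path, "-o", output_path, pdf_path]


def run_pdf2txt(script_path, pdf_path, output_path):
    """Run the script and return its exit code (0 normally means success)."""
    return subprocess.call(build_pdf2txt_command(script_path, pdf_path, output_path))


if __name__ == "__main__":
    # Placeholder path: adjust it to wherever pdf2txt.py was installed.
    command = build_pdf2txt_command(
        "pdf2txt.py", "samples/naacl06-shinyama.pdf", "output.html"
    )
    print(command)
```

Passing sys.executable explicitly means the script runs with the same interpreter that pdfminer was installed into, rather than relying on how the operating system handles .py files.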
I am trying to install Cloudmonkey on a VM. I downloaded cloudmonkey and tried to run the following command "pip install cloudmonkey" and get the following error:
Collecting cloudmonkey
Using cached cloudmonkey-5.3.2.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "C:\Users[user]\AppData\Local\Temp\2\pip-build-gx5p5q5b\cloudmonkey\setup.py", line 50
print "If you're upgrading, run the following to enable parameter completion:"
^
SyntaxError: Missing parentheses in call to 'print'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users[user]\AppData\Local\Temp\2\pip-build
oudmonkey
Would someone be able to tell me what I am doing wrong with this install?
In the source code, lines 50 to 53 of their setup.py contain four print statements without parentheses. That is not compatible with Python 3.x.
It looks like they added the print statements as upgrade notes between versions 5.2 and 5.3. I recommend checking whether there is an open compatibility issue; alternatively, you can download the source, remove those print statements, and then build and install it yourself.
Also, they have a Docker image on GitHub if you want to try that as well.
I just installed it using python 2.x and it was successful.
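To locate such lines before editing a downloaded copy, a short scan is enough. This is a rough sketch: find_print_statements is a hypothetical helper that only flags lines beginning with a bare print statement, and the sample text is made up for illustration apart from the one line quoted in the traceback.

```python
import re

# A line that starts (after indentation) with "print", whitespace, and then
# anything other than an opening parenthesis.
_PRINT_STMT = re.compile(r"^[ \t]*print[ \t]+(?![ \t(])\S")


def find_print_statements(source):
    """Return (line_number, line) pairs for Python 2 style print statements."""
    return [
        (number, line)
        for number, line in enumerate(source.splitlines(), start=1)
        if _PRINT_STMT.match(line)
    ]


# Hypothetical fragment; only the second line is quoted from the traceback.
SAMPLE = (
    "import sys\n"
    "print \"If you're upgrading, run the following to enable parameter completion:\"\n"
    "print(\"done\")\n"
)

if __name__ == "__main__":
    for number, line in find_print_statements(SAMPLE):
        print(number, line)
```

Reading the real file with open("setup.py").read() and passing the text to the helper would list the lines to remove or rewrite.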
I would like to make the jump and get acquainted with Python 3.
I followed the instructions found here with the installation working flawlessly.
I'm also able to use the provided virtualenv to create environments for Python 2 and Python 3 (following the instructions here). Unfortunately, pip3 fails when no virtualenv is activated, and I need it to install global modules for Python 3.
This is the error message:
± |master ✓| → pip3
Traceback (most recent call last):
File "/usr/local/bin/pip3", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/local/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 51
def _bypass_ensure_directory(name, mode=0777):
^
SyntaxError: invalid token
It looks like pip3 is trying to access distribute of python2. Is there any workaround for this?
I was having the same problem as you were and I had
export PYTHONPATH="/usr/local/lib/python2.7/site-packages:$PYTHONPATH"
in my ~/.bash_profile. Removing that line solved the problem for me. If you have that or something like it in your ~/.bashrc or ~/.bash_profile, try removing it.