I've tried numerous methods to install Tesseract, but I just can't get it working. I'm on a Mac, and this is the error I keep getting:
txt = pytesseract.image_to_string(image, lang='eng')
File "/Users/user/anaconda/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 161, in image_to_string
config=config)
File "/Users/user/anaconda/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 94, in run_tesseract
stderr=subprocess.PIPE)
File "/Users/user/anaconda/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/Users/user/anaconda/lib/python2.7/subprocess.py", line 1343, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Does anyone know how I can solve this issue with tesseract?
Anaconda does have a conda Tesseract package available. Use the command below to install it, then try again.
conda install -c brown-data-science tesseract=3.05.00
If you need the latest version, Tesseract 4.00.00alpha, refer to the installation instructions on GitHub. If you don't have Xcode or Homebrew installed, install them first.
First of all, I did everything mentioned in "pytesseract-no such file or directory error".
It still doesn't work. I'm now using the PyCharm IDE with the following code:
from PIL import Image
import pytesseract
import subprocess
im = Image.open('test.png')
im.show()
subprocess.call(['tesseract','test.png','out'])
print pytesseract.image_to_string(Image.open('test.png'))
im.show() opens the image successfully.
subprocess.call() with tesseract test.png out also extracts the text from the image,
but pytesseract.image_to_string() fails.
I don't get it. Why am I able to use tesseract from the shell but not from Python? And in Python I can open the same image, yet when it is used with tesseract the image apparently can't be found.
The error output is below.
File "/home/hamza-c/Schreibtisch/Android/JioShare/orc.py", line 7, in <module>
print pytesseract.image_to_string(Image.open('/home/hamza-c/Schreibtisch/Android/JioShare/test.png'))
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 162, in image_to_string
config=config)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 95, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1340, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I tested the code from your question and it works fine. I was facing the same error:
No such file or directory found
The problem was that the directory containing 'tesseract.exe' had not been added to the PATH environment variable. You should be able to run the command 'tesseract' from a command prompt.
If Tesseract is not installed, you can download it from https://github.com/tesseract-ocr/tesseract/wiki; for Windows, use a third-party installer.
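The PATH check described above can also be done from Python. This is a minimal sketch, assuming Python 3.3 or later (where shutil.which exists); the helper name find_on_path is made up for illustration.

```python
import shutil


def find_on_path(command="tesseract"):
    """Return the full path of `command` if it is found on PATH, else None."""
    # shutil.which searches the directories listed in the PATH variable.
    return shutil.which(command)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("tesseract is not on PATH; install it or add its directory to PATH")
    else:
        print("tesseract found at", location)
```

If this prints a path, pytesseract should be able to start the same executable; if it prints the "not on PATH" message, the OSError above is the expected result.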
Maybe you need to install Tesseract. If your OS is CentOS, run:
yum install tesseract
I've used the following command and it worked for me:
brew install tesseract
I solved my own question.
im = Image.open('test.png')
print pytesseract.image_to_string(im)
It's still unclear to me why it works when the opened image is assigned to a variable first, but not when Image.open() is called directly inside the argument list.
I am using Ubuntu 14.04. I have the following code:
import Image
import pytesseract
im = Image.open('test.png')
print pytesseract.image_to_string(im)
but I keep getting the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 540, in runfile
execfile(filename, namespace)
File "/home/chaitanya/pythonapp/localcopy.py", line 4, in <module>
print pytesseract.image_to_string(im)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 142, in image_to_string
config=config)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 75, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Both the Python program and the image are in the same location. What could be the problem?
You need to install tesseract-ocr:
sudo apt-get install tesseract-ocr
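If you're on Windows and have pip installed, go to your project directory and run the command below. Note that pip packages for Tesseract are generally Python wrappers, so the Tesseract engine itself most likely still has to be installed separately: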
pip install tesseract-ocr
Based on @padraic cunningham's answer, which I tailored to my setup.
If you are on Linux (Ubuntu 16 here, though the version should not matter) and have a conda installation:
First, search for the package you need to install:
$ anaconda search -t conda tesserocr
You will get a few options; look at the platforms and builds to identify which one fits your setup.
As I have Python 3.6 on linux-64, I chose mcs07/tesserocr.
To install:
$ conda install -c mcs07 tesserocr
That's it. I didn't need a restart of the terminal or anything. I just kept going.
I'm trying to make pytesser work on macOS, but without success.
I installed Tesseract, PIL and all the dependencies.
I unzipped pytesser into my Python lib folder and renamed the script file to __init__.py.
In the __init__.py file I changed the path to the tesseract.exe file, as suggested in other answers,
that is:
tesseract_exe_name = 'my lib path/pytesser/tesseract' # Name of executable to be called at command line
This is the error I get:
Traceback (most recent call last):
File "<pyshell#50>", line 1, in <module>
print image_to_string(picz)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pytesser/__init__.py", line 31, in image_to_string
call_tesseract(scratch_image_name, scratch_text_name_root)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pytesser/__init__.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1228, in _execute_child
raise child_exception
OSError: [Errno 8] Exec format error
It seems that the module cannot run the .exe file. I tried changing the path and adding the .exe extension, but I always get the same error.
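Errno 8 (Exec format error) generally means the operating system does not recognise the file as a program it can run, which is what happens when a Windows .exe is launched on macOS. One rough way to check what kind of binary a file is, is to look at its first few bytes. The sketch below is illustrative only: the function name and the signature table are my own and are not part of pytesser.

```python
# Leading bytes ("magic numbers") of some common executable formats.
_SIGNATURES = [
    (b"MZ", "Windows PE (.exe)"),
    (b"\x7fELF", "Linux ELF"),
    (b"\xfe\xed\xfa\xce", "macOS Mach-O"),
    (b"\xfe\xed\xfa\xcf", "macOS Mach-O"),
    (b"\xce\xfa\xed\xfe", "macOS Mach-O"),
    (b"\xcf\xfa\xed\xfe", "macOS Mach-O"),
    # Note: Java class files start with the same four bytes.
    (b"\xca\xfe\xba\xbe", "macOS universal binary"),
    (b"#!", "script with shebang"),
]


def guess_binary_format(path):
    """Return a rough label for the executable format of the file at `path`."""
    with open(path, "rb") as handle:
        header = handle.read(4)
    for magic, name in _SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"
```

Calling guess_binary_format on the tesseract file bundled with pytesser would be expected to report a Windows PE file, in which case no path or extension change will make it run on a Mac; a native Tesseract build is needed instead.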
Several solutions for a python tesseract wrapper:
Python-Tesseract:
First of all, get Homebrew and brew install python, then easy_install https://bitbucket.org/3togo/python-tesseract/downloads/python_tesseract-0.9.1-py2.7-macosx-10.10-x86_64.egg
source: https://code.google.com/p/python-tesseract/wiki/HowToCompileForHomebrewMac
pytesseract:
This is what I was using before switching to python-tesseract: pip install pytesseract. Then go to /usr/local/lib/python2.7/site-packages, open the pytesseract folder, and edit pytesseract.py. Change the file path in the script to where tesseract is located on your computer.
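As an alternative to editing pytesseract.py inside site-packages, pytesseract exposes the path as the module-level variable pytesseract.pytesseract.tesseract_cmd, so it can be set at runtime. The sketch below is a hedged example: the candidate paths are assumptions about typical install locations (adjust them to your machine), and both helper names are hypothetical.

```python
import os

# Assumed typical install locations; adjust these for your machine.
CANDIDATE_PATHS = [
    "/usr/local/bin/tesseract",     # Homebrew on Intel Macs
    "/opt/homebrew/bin/tesseract",  # Homebrew on Apple Silicon
    "/usr/bin/tesseract",           # most Linux packages
]


def pick_tesseract_cmd(candidates=CANDIDATE_PATHS):
    """Return the first candidate that is an executable file, or None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def configure_pytesseract(path):
    """Point pytesseract at `path`; return False if pytesseract is not installed."""
    try:
        import pytesseract
    except ImportError:
        return False
    # Override the default command name that pytesseract passes to subprocess.
    pytesseract.pytesseract.tesseract_cmd = path
    return True
```

A typical use would be to call pick_tesseract_cmd() once at start-up and, if it returns a path, pass that path to configure_pytesseract() before the first image_to_string() call. This keeps the installed package untouched, so the setting survives a pytesseract upgrade.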
I'm trying to follow this pytesser example on OS X Mavericks.
>>> from pytesser import *
>>> im = Image.open('phototest.tif')
>>> text = image_to_string(im)
But on the last line I get this error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pytesser.py", line 31, in image_to_string
call_tesseract(scratch_image_name, scratch_text_name_root)
File "pytesser.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1308, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I don't understand what I should do. The file phototest.tif is in the same folder I'm running the script from. How can I fix this?
UPDATE:
When I try
brew install tesseract
I get this error:
Warning: It appears you have MacPorts or Fink installed.
Software installed with other package managers causes known problems for
Homebrew. If a formula fails to build, uninstall MacPorts/Fink and try again.
Error: You must `brew link libtiff libpng jpeg' before tesseract can be installed
I actually had the same error as you, which is how I found this post. I also have the solution to my problem, because you gave it to me!
I was seeing:
ryan.davis$ python tesseract.py
Traceback (most recent call last):
File "tesseract.py", line 52, in <module>
print (image_to_string(big))
File "/usr/local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 161, in image_to_string
config=config)
File "/usr/local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 94, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Want to know what I had to do to fix this? Exactly what you tried: brew install tesseract. I had installed the Python tesseract library, but hadn't installed Tesseract at the system level. So that solved my problem. How about yours?
I think you might have been distracted by this:
Warning: It appears you have MacPorts or Fink installed. Software
installed with other package managers causes known problems for
Homebrew. If a formula fails to build, uninstall MacPorts/Fink and try
again.
And not noticed that the answer was already provided in the brew output:
You must brew link libtiff libpng jpeg before tesseract can be
installed.
So do:
brew link libtiff
brew link libpng
brew link jpeg
Then:
brew install tesseract
Finally:
:)
I've downloaded PyTesser and extracted it.
I was in the pytesser_v0.0.1 folder and tried to run the sample usage code in the python interpreter:
from pytesser import *
print image_file_to_string('fnord.tif')
and the output:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pytesser.py", line 44, in image_file_to_string
call_tesseract(filename, scratch_text_name_root)
File "pytesser.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1259, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
NOTE: I'm on Ubuntu 12.10 with Python 2.7.3.
Can anyone help me understand this error and what I can do to fix it?
This isn't as well documented as it could be, but if you are not on Windows you need to install the tesseract binary for your platform. On Ubuntu and other Debian-based Linux distributions, run apt-get install tesseract-ocr. Then you can run:
python pytesser.py
which uses the test files phototest.tif, fnord.tif and fonts_test.png to test the library.
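The OSError in the traceback comes from subprocess failing to start the tesseract program, not from the image file. A minimal sketch of how that failure can be reported more clearly, assuming Python 3 (the function name run_tool is hypothetical):

```python
import errno
import subprocess


def run_tool(args):
    """Run a command and return (returncode, text).

    returncode is None when the executable could not be started at all.
    """
    try:
        proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # Same condition as "OSError: [Errno 2] No such file or directory".
            return None, "executable not found: %s" % args[0]
        raise
    out, err = proc.communicate()
    # Some tools print their version banner to stderr, so fall back to it.
    return proc.returncode, (out or err).decode("utf-8", "replace")


if __name__ == "__main__":
    print(run_tool(["tesseract", "--version"]))
```

If the first element of the result is None, the tesseract binary is missing or not on PATH, and installing it as described above is the fix.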
For beginners on Windows who want to use pytesseract:
Open a command prompt.
Type: pip install pytesseract
(this installs the latest version of the pytesseract module into your Python)
Download and install the tesseract-ocr engine from this link:
https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe&can=2&q=
Now you are ready to use pytesseract.
For more information and a code example, check this link:
http://www.manejandodatos.es/2014/11/ocr-python-easy/