PyTesser simple usage error - python

I've downloaded PyTesser and extracted it.
I was in the pytesser_v0.0.1 folder and tried to run the sample usage code in the python interpreter:
from pytesser import *
print image_file_to_string('fnord.tif')
and the output:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pytesser.py", line 44, in image_file_to_string
call_tesseract(filename, scratch_text_name_root)
File "pytesser.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1259, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
NOTE: I'm in Ubuntu 12.10 with Python 2.7.3
can anyone help me understand this error, and what can I do to fix it ?

This isn't as well documented as it could be, but if you are not on Windows you need to install the tesseract binary for your platform. On Ubuntu and other Debian based Linux distributions, apt-get install tesseract-ocr. Then you can run:
python pytesser.py
which uses the test files phototest.tif, fnord.tif and fonts_test.png to test the library.

For beginners on windows to use pytesseract:
Open command prompt
Type: pip install pytesseract
(this will install pytesseract last version module on your python easily)
Go to this link and download and install tesseract-ocr engine:
https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe&can=2&q=
Now you are ready to use pytesseract
For more information and see code example check this link:
http://www.manejandodatos.es/2014/11/ocr-python-easy/

Related

Installing Graphviz for use with Python 3 on Windows 10

So I've had Python 3.6 on my Windows 10 computer for a while now, and today I just downloaded and installed the graphviz 0.8.2 (https://pypi.python.org/pypi/graphviz) package via the admin commandline with:
pip3 install graphviz
It was only after this point that I downloaded the Graphviz 2.38 MSI installer file and installed the program at:
C:\Program Files (x86)\Graphviz2.38
So then I tried to run this simple Python program:
from graphviz import Digraph
dot = Digraph(comment="The round table")
dot.node('A', 'King Arthur')
dot.node('B', 'Sir Bedevere the Wise')
dot.node('L', 'Sir Lancelot the Brave')
dot.render('round-table.gv', view=True)
But unfortunately, I received the following error when I try to run my Python program from commandline:
Traceback (most recent call last):
File "C:\Program Files\Python36\lib\site-packages\graphviz\backend.py", line 124, in render
subprocess.check_call(args, startupinfo=STARTUPINFO, stderr=stderr)
File "C:\Program Files\Python36\lib\subprocess.py", line 286, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\Program Files\Python36\lib\subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Program Files\Python36\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Program Files\Python36\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\foldername\testing.py", line 11, in <module>
dot.render('round-table.gv', view=True)
File "C:\Program Files\Python36\lib\site-packages\graphviz\files.py", line 176, in render
rendered = backend.render(self._engine, self._format, filepath)
File "C:\Program Files\Python36\lib\site-packages\graphviz\backend.py", line 127, in render
raise ExecutableNotFound(args)
graphviz.backend.ExecutableNotFound: failed to execute ['dot', '-Tpdf', '-O', 'round-table.gv'], make sure the Graphviz executables are on your systems' PATH
Notice how what I've asked seems VERY similar to this question asked here:
"RuntimeError: Make sure the Graphviz executables are on your system's path" after installing Graphviz 2.38
But for some reason, adding those paths (suggested in the solutions at the link above) to the system variables isn't working, and I don't know why! I tried restarting the computer after adding the paths as well, still to no success. See the image below:
Although the other suggested solution, which was to add these few lines in front of my Python code, did work:
import os
os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin/'
But here's the issue: I don't understand why adding to the environment variables didn't work, and this is my primary concern. So my question is this: why did adding those lines of code in front of the Python script work but changing the environment variables didn't? What do I need to do to get my script to run without adding those lines of code in front?
Can you please post the output you get when you type SET in a cmd window after setting the PATH environment variable?
Does it contain C:/Program Files (x86)/Graphviz2.38/bin/ ?
A cmd window must be restarted before updated environment variables become effective!

pytesseract-no such file or directory error

I am using Ubuntu 14.04. I have the following code:
import Image
import pytesseract
im = Image.open('test.png')
print pytesseract.image_to_string(im)
but I keep getting the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 540, in runfile
execfile(filename, namespace)
File "/home/chaitanya/pythonapp/localcopy.py", line 4, in <module>
print pytesseract.image_to_string(im)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 142, in image_to_string
config=config)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 75, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Both the python program and the image are in the same location.What could be the problem??
You need to install tesseract-ocr:
sudo apt-get install tesseract-ocr
If you're on windows and have PIP installed go to your project directory and run:
pip install tesseract-ocr
Based off of #padraic cunningham's answer which I tailored to my setting.
If you are on Linux (ubuntu 16, should not matter) and have a conda installation:
First search for what you need to be installing:
$ anaconda search -t conda tesserocr
You will get a few options, you need to look at the platforms and builds to identify what makes sense for you.
As I have python 3.6 and linux-64 I chose mcs07/tesserocr
To install:
$ conda install -c mcs07 tesserocr
That's it. I didn't need a restart of the terminal or anything. I just kept going.

how to make pytesser (Tesseract) work?

I'm trying to make pytesser (downloadable here) work on my mac OS, but I don't succeed.
I installed Tesseract, PIL and all the dependencies.
I unzipped pytesser in my python lib folder and modified the script file into __init__.py
in the init file I modified the path to the tesseract.exe file as suggested here and here
that is:
tesseract_exe_name = 'my lib path/pytesser/tesseract' # Name of executable to be called at command line
that's what I get as error:
Traceback (most recent call last):
File "<pyshell#50>", line 1, in <module>
print image_to_string(picz)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pytesser/__init__.py", line 31, in image_to_string
call_tesseract(scratch_image_name, scratch_text_name_root)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pytesser/__init__.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1228, in _execute_child
raise child_exception
OSError: [Errno 8] Exec format error
it seems that the module does not manage to run the .exe file. I tried to change the path, add the extension .exe but I always get the same error.
Several solutions for a python tesseract wrapper:
Python-Tesseract:
First of get homebrew and brew install python, then easy_install https://bitbucket.org/3togo/python-tesseract/downloads/python_tesseract-0.9.1-py2.7-macosx-10.10-x86_64.egg
source: https://code.google.com/p/python-tesseract/wiki/HowToCompileForHomebrewMac
pytesseract:
This what I was using previously before getting python-tesseract, pip install pytesseract. Then you have to go to /usr/local/lib/python2.7/site-packages and go to pytesseract then pytesseract.py. Change the file path in the python script to where tesseract is located on your computer.

image_to_string doesn't work in Mac

I'm trying to follow this example of pytesser (link) in a Mac Maverick.
>>> from pytesser import *
>>> im = Image.open('phototest.tif')
>>> text = image_to_string(im)
But, in the last line I get this error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pytesser.py", line 31, in image_to_string
call_tesseract(scratch_image_name, scratch_text_name_root)
File "pytesser.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1308, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
But, I don't understand what I should do. The file phototest is in the same folder I'm running the script. How to fix this?
UPDATE:
When I try
brew install tesseract
I get this error:
Warning: It appears you have MacPorts or Fink installed.
Software installed with other package managers causes known problems for
Homebrew. If a formula fails to build, uninstall MacPorts/Fink and try again.
Error: You must `brew link libtiff libpng jpeg' before tesseract can be installed
I actually had the same error as you, which is how I found this post. I also have the solution to my problem, because you gave it to me!
I was seeing:
ryan.davis$ python tesseract.py
Traceback (most recent call last):
File "tesseract.py", line 52, in <module>
print (image_to_string(big))
File "/usr/local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 161, in image_to_string
config=config)
File "/usr/local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 94, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Want to know what I had to do to fix this? Exactly what you tried: brew install tesseract I had installed the tesseract python library, but hadn't installed it at the system level. So that solves my problem. How about yours?
I think you might have been distracted by this:
Warning: It appears you have MacPorts or Fink installed. Software
installed with other package managers causes known problems for
Homebrew. If a formula fails to build, uninstall MacPorts/Fink and try
again.
And not noticed your answer was already provided in the brew response:
You must brew link libtiff libpng jpeg before tesseract can be
installed.
So do:
brew link libtiff
brew link libpng
brew link jpeg
Then:
brew install tesseract
Finally:
:)

Could not call install_name_tool

Trying to use virutalenv version 1.6.4 (the latest at writing this post) on 10.7, Lion with yes Xcode 4 installed from mac app store, yet i'm getting the below error message:
New python executable in SUPENV/bin/python
Error [Errno 2] No such file or directory while executing command install_name_tool -change /System/Library/Fram.../Versions/2.7/Python #executable_path/../.Python SUPENV/bin/python
Could not call install_name_tool -- you must have Apple's development tools installed
Traceback (most recent call last):
File "/usr/local/bin/virtualenv", line 8, in <module>
load_entry_point('virtualenv==1.6.4', 'console_scripts', 'virtualenv')()
File "/Library/Python/2.7/site-packages/virtualenv-1.6.4-py2.7.egg/virtualenv.py", line 810, in main
never_download=options.never_download)
File "/Library/Python/2.7/site-packages/virtualenv-1.6.4-py2.7.egg/virtualenv.py", line 901, in create_environment
site_packages=site_packages, clear=clear))
File "/Library/Python/2.7/site-packages/virtualenv-1.6.4-py2.7.egg/virtualenv.py", line 1166, in install_python
py_executable])
File "/Library/Python/2.7/site-packages/virtualenv-1.6.4-py2.7.egg/virtualenv.py", line 843, in call_subprocess
cwd=cwd, env=env)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 672, in __init__
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1202, in _execute_child
OSError: [Errno 2] No such file or directory
Any hints on how to solve this problem... I guess the first would be to check if install_name_tool is present on my system, and then force virtualenv to use it...
thanks in advance!
You need to both install XCode, run it, and select the optional "command line tools" package and then install those. In more detail:
Download XCode from the App Store
Run the downloaded XCode binary from Applications or Launchpad
Select XCode->Preferences, then choose the "Downloads" tab
Click on the "Command Line Tools" selection and install those
Did you actually install Xcode 4? Downloading it from the App Store only downloads the installer for it. Then you need to run the installer; you should find the installer downloaded to /Applications. After you run it, you should find install_name_tool here:
$ which install_name_tool
/usr/bin/install_name_tool
With newer versions of virtualenv (at least from 1.8.4 on) it is no longer necessary to install the "Command Line Tools" package from Xcode.

Categories

Resources