Python: Install Tesseract for Windows 7

My objective is to use OCR in Python 2.7 with Tesseract on a Windows 7 machine, but I am running into issues with the installation process. I tried following the instructions here, but the links to "tesseract-core-yyyymmdd.exe" and "tesseract-langs-yyyymmdd.exe" no longer exist and I can't find these .exe files elsewhere online. Here's what I have done so far:
Installed Tesseract from the executable on the official tesseract-ocr page.
Installed the packages "wand", "PIL", and "pyocr" via pip.
Now, if I do the following in Python:
from wand.image import Image
from PIL import Image as PI
import pyocr
import pyocr.builders
import io
No problem loading up these packages but pyocr.get_available_tools() gives me an empty list. I am sure this has to do with the missing installation .exe files above. Where can I find them? Is it something else that I am missing?

I just set up pytesseract and it works! I have Windows 10 and Python 2.7 installed.
All you need to do:
Download the Microsoft Visual C++ Compiler for Python 2.7 from http://aka.ms/vcpython27 and install it (common installation step).
Download pytesseract from PyPI via this link: https://pypi.python.org/pypi/pytesseract
Unzip the file.
Go to the directory which contains the unzipped files.
Run this command: python setup.py install
(Optional) To test that it's installed, go to your Python shell and run this command: import pytesseract
I hope it works! Note that pytesseract is only a Python wrapper around Google's Tesseract OCR engine, so the engine itself must also be installed.

Step [1] To install Tesseract, visit
https://github.com/UB-Mannheim/tesseract/wiki
The latest installers can be downloaded from there,
e.g., tesseract-ocr-setup-3.05.02-20180621.exe, tesseract-ocr-w32-setup-v4.0.0-beta.1.20180608.exe, tesseract-ocr-w64-setup-v4.0.0-beta.1.20180608.exe (64 bit)
Step [2] Download the Microsoft Visual C++ Compiler for Python 2.7 from the link given below
https://download.microsoft.com/download/7/9/6/796EF2E4-801B-4FC4-AB28-B59FBF6D907B/VCForPython27.msi
Step [3] Install pytesseract, the Python binding for Tesseract, using pip
pip install pytesseract
Step [4] Furthermore, you can install an image processing library for Python, e.g., Pillow:
pip install pillow
Congratulations, you are done! :)
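The result of the four steps above can be sanity-checked from Python. The following is only a minimal sketch, assuming Python 3.4 or newer (for `shutil.which` and `importlib.util.find_spec`) rather than the Python 2.7 used in this thread; `check_ocr_setup` is a hypothetical helper name and not part of pytesseract.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which pieces of the OCR stack are visible to this interpreter."""
    return {
        # Step 1: the Tesseract engine must be reachable through PATH
        "tesseract_on_path": shutil.which("tesseract") is not None,
        # Step 3: the pytesseract wrapper must be importable
        "pytesseract_installed": importlib.util.find_spec("pytesseract") is not None,
        # Step 4: Pillow is installed under the import name "PIL"
        "pillow_installed": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_ocr_setup().items():
        print("{:<24} {}".format(name, "OK" if ok else "MISSING"))
```

A "MISSING" for tesseract_on_path does not necessarily mean the engine is absent; the installer may simply not have added its folder to PATH, in which case the full path to tesseract.exe has to be given to pytesseract explicitly.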

pip is the package manager for Python packages.
Open cmd and run pip search "pytesseract" to see the latest version (PyPI has since disabled pip search, so on current setups check the pytesseract page on PyPI instead).
Run pip install pytesseract for the latest version, or pip install pytesseract==0.3.0 for a specific version.
In a Python shell on Windows, run import pytesseract to confirm that the installation was successful.
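Besides a bare import, the installed version can be read back from the package metadata. This is a small sketch that assumes Python 3.8 or newer (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pytesseract")
    if version is None:
        print("pytesseract is not installed in this environment")
    else:
        print("pytesseract", version)
```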

Install both and you are done
Binaries from:
https://github.com/UB-Mannheim/tesseract/wiki
Python Wrapper from here:
https://pypi.python.org/pypi/pytesseract

Related

TesseractNotFoundError Using Anaconda/Jupyter

I have installed Anaconda 2018.12 (Python 3.7 version). I am trying to test out the pytesseract module but I keep encountering:
TesseractNotFoundError: C:\Program Files (x86)\Tesseract-OCR\tesseract.exe is not installed or it's not in your path
I have done:
pip install Pillow (already installed it says)
pip install pytesseract (successful)
Tried to set the tesseract_cmd to the location of tesseract (but I can't find it)
I have searched for the tesseract.exe file but cannot find it anywhere on the system so I'm struggling to understand how do I reference/import the module into a jupyter notebook if it's already been consumed into anaconda?
The code I'm trying to run is:
from PIL import Image
import pytesseract
#pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
text = pytesseract.image_to_string(Image.open(r'C:\Temp\IMG_1519.jpg'))
print(text)
I'm hoping it's simple user error but any assistance would be gratefully received. Many thanks, Ben
Quoting from the PyPI page:
Python-tesseract is a wrapper for Google’s Tesseract-OCR Engine.
and (under prerequisites):
Install Google Tesseract OCR (additional info how to install the engine on Linux, Mac OSX and Windows)
This means that pytesseract is not a standalone module. It is a Python wrapper for Google’s Tesseract-OCR Engine, which you need to install separately.
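Once the engine has been installed separately, pytesseract still has to be told where tesseract.exe lives if it is not on PATH. The sketch below assumes Python 3; `find_tesseract` is a hypothetical helper, and the candidate folders are guesses (the second one is taken from the error message in the question), so adjust them to wherever the installer actually put Tesseract.

```python
import os
import shutil

# Assumed install locations, not guaranteed to match your system.
CANDIDATE_DIRS = [
    r"C:\Program Files\Tesseract-OCR",
    r"C:\Program Files (x86)\Tesseract-OCR",
]


def find_tesseract(candidate_dirs=CANDIDATE_DIRS):
    """Return the path to the tesseract executable, or None if not found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for directory in candidate_dirs:
        exe = os.path.join(directory, "tesseract.exe")
        if os.path.isfile(exe):
            return exe
    return None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract engine not found; install it before using pytesseract")
    else:
        try:
            import pytesseract
        except ImportError:
            print("Engine found at", path, "but pytesseract is not installed")
        else:
            # Point the wrapper at the engine explicitly
            pytesseract.pytesseract.tesseract_cmd = path
            print("pytesseract will use", path)
```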

OpenCV - Can't import cv2 [duplicate]

How do I install OpenCV with Python 3.6 and Anaconda 3.6?
I tried conda install -c https://conda.binstar.org/menpo opencv3
but I get the following error:
UnsatisfiableError: The following specifications were found to be in conflict:
- opencv3 -> python 2.7*
- python 3.6*
Use "conda info <package>" to see the dependencies for each package.
I am using Windows 10 64-bit, with python 3.6, and anaconda 3.6 installed.
Is it even available for Python 3.6 at the moment, or should I roll back my Python version to 3.5.*?
Search for Anaconda Prompt,
open it, and run the command:
> pip install opencv-python
This single command installs OpenCV easily.
you can take help from the video link below.
video link
The menpo files page shows that the OpenCV 3.2 binaries there are only for Python 2.7/3.4/3.5 on the linux-64 platform.
You may go to this site to get the exact version you need.
opencv_python-3.2.0-cp36-cp36m-win_amd64.whl is the basic one.
opencv_python-3.2.0+contrib-cp36-cp36m-win_amd64.whl is the one
with the opencv-contrib modules, such as the text module for binding to the Tesseract OCR engine, and many others.
Both binaries are for OpenCV 3.2 with Python 3.6 bindings for 64-bit Windows. To install one, 1) download the binary to a local drive, 2) open your Anaconda command prompt and 3) type the command below in the directory where the binary is located.
pip install opencv_python-3.2.0+contrib-cp36-cp36m-win_amd64.whl
Hope this helps.
Update on 2018-02-22:
OpenCV 3.4.0 wheel files are now available in the unofficial site and replaced OpenCV 3.3.0
Update on 2019-01-30:
OpenCV 4.0.1 wheel files are now available in the unofficial site with CPython 3.5/3.6/3.7 support.
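Picking the right wheel comes down to matching the tags at the end of the file name (cp36 for CPython 3.6, win_amd64 for 64-bit Windows) against your interpreter. The following is a simplified sketch of that check, not pip's real compatibility logic; `cpython_tag` and `wheel_matches` are hypothetical helper names, and ABI tags and generic tags such as py3 are deliberately ignored.

```python
import sys


def cpython_tag(version_info=None):
    """Return the CPython tag of an interpreter, e.g. 'cp36' for Python 3.6."""
    if version_info is None:
        version_info = sys.version_info
    return "cp{}{}".format(version_info[0], version_info[1])


def wheel_matches(filename, python_tag, platform_tag):
    """Check a wheel file name against a Python tag and a platform tag.

    Wheel names end in '-<python tag>-<abi tag>-<platform tag>.whl'.
    """
    if not filename.endswith(".whl"):
        return False
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        return False
    wheel_python, _abi, wheel_platform = parts[-3:]
    # A tag field may hold several dot-separated alternatives, e.g. 'py2.py3'
    return (python_tag in wheel_python.split(".")
            and platform_tag in wheel_platform.split("."))


if __name__ == "__main__":
    name = "opencv_python-3.2.0+contrib-cp36-cp36m-win_amd64.whl"
    print(cpython_tag(), wheel_matches(name, cpython_tag(), "win_amd64"))
```

If the check fails for your interpreter, pip will typically refuse the file with a "not a supported wheel on this platform" error, so it is worth downloading the file whose tags match instead.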
I managed to get it working by doing the following:
Download and install python3.6 from official python site
https://www.python.org/downloads/release/python-360/
Download and install Anaconda 4.4.0 from the official anaconda site
https://www.continuum.io/downloads
Open command line and run:
pip install opencv-python
Open command line and run:
pip install opencv-contrib-python
I am using Windows 10 and it worked for me.
It's pretty simple.
Install Anaconda 3.6. Check that Anaconda is added to the system Path variable.
Open CMD and type conda install -c conda-forge opencv.
This will install the latest OpenCV version available for Python 3.6.
Open your IDE and try import cv2.
It probably won't work... don't worry.
You have to make the cv2 module known to the editor.
For Eclipse (with PyDev):
First create a project and then do the following:
For PyCharm:
The cv2 module probably won't work. Go to the Anaconda folder/Lib/site-packages/cv2 and copy the file cv2.cp36-win_amd64.pyd to the site-packages folder. Rename it cv2.pyd.
Now try to write a command, e.g. cv2.imread(). If auto-completion doesn't work, try cv2.cv2.imread().
This should work.
I am using Python 3.6.2 and Anaconda 4.3.23 (It should also work with your case).
I did the following:
Download the Numpy version corresponding to your Python installation from here. In my case, I’ve used numpy-1.13.1+mkl-cp36-cp36m-win_amd64.whl
Download the OpenCV version corresponding to your Python installation from here. In my case, I’ve used opencv_python-3.3.0-cp36-cp36m-win_amd64.whl
Now go to the folder where you downloaded these files and run the following:
pip install numpy-1.13.1+mkl-cp36-cp36m-win_amd64.whl
pip install opencv_python-3.3.0-cp36-cp36m-win_amd64.whl
Note the Successfully installed … message after each command.
At this point, you should be able to play with OpenCV and Python. Let’s try a small test first. Start the Python interpreter or Jupyter Notebook and write:
import cv2
print(cv2.__version__)
If everything was correctly installed, you should see the version number of your OpenCV install, in my case this was 3.3.0.
I see you found a solution, but this may be helpful for others. The package is not available for Python 3.6. You can check this by going to that package's channel on anaconda.org and selecting the Files tab. You will see the package tarballs with the Python version listed as py27, py34, py35, etc. This is a good way to check which Python versions a specific package supports.
You can also run the following to see the package versions and Python versions available for your OS from the Anaconda channel:
conda search <package_name>
Or to search a particular channel and package you can do this:
conda search -c <channel_name> <package_name>
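The py27, py34, py35 markers mentioned above are part of each package's build string, so they can also be read programmatically. This is a small sketch assuming the usual "pyXY" convention in conda build strings; `python_version_from_build` is a hypothetical helper, not a conda API.

```python
import re

# Matches the 'pyXY' marker, e.g. 'py36' in 'py36h20b85fd_1'
_BUILD_PY = re.compile(r"py(\d)(\d+)")


def python_version_from_build(build_string):
    """Return 'major.minor' for a conda build string, or None if unmarked."""
    match = _BUILD_PY.search(build_string)
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


if __name__ == "__main__":
    for build in ("py27_0", "py35_0", "py36h20b85fd_1"):
        print(build, "->", python_version_from_build(build))
```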
As of March 2018, OpenCV 3.4 can be installed directly from conda-forge or anaconda in Windows/OSX/Linux for Python 3.6
conda install -c conda-forge opencv
or
conda install -c anaconda opencv
Using:
conda install -c conda-forge opencv
worked for me
If you have installed anaconda then you should uninstall it, then try
pip install opencv_python-3.2.0+contrib-cp36-cp36m-win_amd64.whl
It worked for me.
Thank You.
I am using python 3.6 and the following worked for me:
Download and install opencv (Win pack) on your computer from the official website:
https://opencv.org/releases.html (I took version 3.4.2)
Go to the website of Christoph Gohlke and download the wheel file corresponding to your system. (I took opencv_python-3.4.2-cp36-cp36m-win_amd64.whl)
As mentioned on Christoph Gohlke's website, make sure you have installed the 'numpy 1.14' and 'mkl' packages. Also make sure you use pip version 9 or newer.
Start the 'Anaconda Prompt'
Change the directory in the 'Anaconda Prompt' to the folder where you downloaded the wheel file from Gohlke's website (via the MS-DOS command 'cd').
In the 'Anaconda Prompt' type 'pip install opencv_python-3.4.2-cp36-cp36m-win_amd64.whl' (change the name of the wheel file accordingly).
When starting spyder, test your installation as follows:
import cv2
print(cv2.__version__)
If the version is printed in the console (in my case 3.4.2), your installation was successful.
IMPORTANT REMARK:
If you created a dedicated environment within Anaconda (in my case 'py36'), make sure you installed spyder for this dedicated environment ('conda install spyder'). If not, your installation of opencv will not be recognised within the environment you are working in. Maybe this is obvious and straightforward but in my case I struggled to find this solution.
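To see which environment an IDE such as Spyder is really running in, it can help to print the interpreter path and check whether cv2 is visible from there. This is a minimal sketch assuming Python 3; `describe_environment` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_environment(module_name="cv2"):
    """Describe the running interpreter and whether a module is visible to it."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # root folder of the active environment
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{:<16} {}".format(key, value))
```

If the printed prefix is not the folder of the environment you installed OpenCV into (for example the 'py36' environment), the IDE was started from a different environment.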
First download Anaconda for Python 3.6 from the official site. After installing Anaconda, open a command prompt, type the following command, and press Enter:
conda install -c conda-forge opencv
It may take some time. After completion, check your conda packages by typing conda list; opencv should be there.
However, before proceeding to install OpenCV, you can check whether OpenCV is available for Python 3.6 by typing conda info opencv in the command prompt and pressing Enter. You'll see the following:
opencv 3.3.1 py36h20b85fd_1
---------------------------
file name : opencv-3.3.1-py36h20b85fd_1.tar.bz2
name : opencv
version : 3.3.1
build string: py36h20b85fd_1
build number: 1
channel : https://repo.anaconda.com/pkgs/main/win-64
size : 96.7 MB
arch : None
constrains : ()
license : BSD 3-clause
license_family: BSD
md5 : e65c68524073445511ace8ade7ae3641
platform : None
subdir : win-64
timestamp : 1512689066576
url : https://repo.anaconda.com/pkgs/main/win-64/opencv-3.3.1-py36h20b85fd_1.tar.bz2
dependencies:
jpeg >=9b,<10a
libpng >=1.6.32,<1.7.0a0
libtiff >=4.0.9,<5.0a0
numpy >=1.11.3,<2.0a0
python >=3.6,<3.7.0a0
vc 14.*
zlib >=1.2.11,<1.3.0a0
This confirms that opencv 3.3.1 py36h20b85fd_1 is available, and that it is built for Python 3.6.
I think this way is straightforward. Just install Anaconda from the official page and follow the image.
Using Anaconda3's package manager directly will be more reliable and cross-platform:
conda install opencv

Error installing from GitHub using pip on Windows

I am on a Windows machine and I want to install a Python module from GitHub using pip directly from IPython.
The simplest command that seems it should work is:
!pip install https://github.com/japerk/nltk-trainer.git
I have also tried:
!pip install https://github.com/japerk/nltk-trainer.git#egg=nltk-trainer
I've used variants including -vvv, etc.
However, I'm getting the following error. Why?
Cannot determine archive format of C:\Users\timo\AppData\Local\Temp\pip-build-183bwemw\nltk-trainer
Go to https://github.com/japerk/nltk-trainer and download the project zip file. Extract the zip file and put it somewhere on your computer.
Open a command prompt in Windows and go inside the folder that you extracted earlier (you must be in the folder that contains the setup.py file).
Enter the following command: python setup.py install
Python then tries to install nltk-trainer. During installation some other dependencies might be installed too. You need numpy and scipy to be installed. If any problem occurs while installing numpy or scipy, try installing them manually first using pip install numpy and pip install scipy.
If you can't install numpy and scipy using pip, use the following link:
http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy
This site has compiled versions of these libraries (and of other libraries, if you need to install them too). You can download the .whl file that matches your Python version and OS architecture and install it using the pip install filename.whl command (you need to be in the folder that contains your .whl file). For example, for Python 3.4 and a 64-bit operating system you may download the scipy-0.16.0-cp34-none-win_amd64.whl file.
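As an alternative to downloading the zip by hand: pip treats a plain https:// URL as an archive to download, which is consistent with the "Cannot determine archive format" error in the question. To install straight from a Git repository, pip expects the URL to carry a git+ prefix. The sketch below only builds such a command; `git_requirement` and `pip_install_command` are hypothetical helpers, running the install needs git and network access, and whether nltk-trainer then installs cleanly has not been verified here.

```python
import subprocess
import sys


def git_requirement(repo_url, egg_name=None):
    """Build a pip VCS requirement such as 'git+https://host/user/repo.git'."""
    requirement = repo_url if repo_url.startswith("git+") else "git+" + repo_url
    if egg_name:
        requirement += "#egg=" + egg_name
    return requirement


def pip_install_command(requirement):
    """Return the command line that installs into the running interpreter."""
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    cmd = pip_install_command(
        git_requirement("https://github.com/japerk/nltk-trainer.git"))
    print(" ".join(cmd))
    # Uncomment to actually run the install (needs git and network access):
    # subprocess.check_call(cmd)
```

From IPython the equivalent one-liner would be !pip install git+https://github.com/japerk/nltk-trainer.git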

installing opencv for python in Ubuntu 14.04

I ran the following command on terminal
$ sudo apt-get install libopencv-dev python-opencv
This installed opencv version 2.4.10.
After that I open python in terminal and try to import opencv as follows
>>> import cv2
This gives me an error :
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named cv2
I also tried using import cv, import opencv, etc. but I am getting the same error.
Do I need to follow some more steps to configure opencv for python ??
This happens when python cannot refer to your default site-packages folder where you have kept the required python files or libraries
Add these lines in the code:
import sys
sys.path.append('/usr/local/lib/python2.7/site-packages')
or, before running the python command in bash, change to the /usr/local/lib/python2.7/site-packages directory. This is a workaround if you don't want to add anything to the code.
OR
try adding the following line to ~/.bashrc
export PYTHONPATH=/usr/local/lib/python2.7/site-packages:$PYTHONPATH
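The in-code variant can be made a little more defensive by checking that the folder exists before touching sys.path. This is only a sketch; `add_to_sys_path` is a hypothetical helper, and the folder is the one assumed in this answer, which may differ from where your cv2 module was actually installed (it also has to match the Python version you are running).

```python
import os
import sys


def add_to_sys_path(directory):
    """Append directory to sys.path if it exists; avoid duplicate entries."""
    if not os.path.isdir(directory):
        return False
    if directory not in sys.path:
        sys.path.append(directory)
    return True


if __name__ == "__main__":
    # Path taken from the answer above; adjust it to your installation
    if add_to_sys_path("/usr/local/lib/python2.7/site-packages"):
        try:
            import cv2
            print(cv2.__version__)
        except ImportError as exc:
            print("cv2 still not importable:", exc)
    else:
        print("directory does not exist; cv2 is probably installed elsewhere")
```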
There is an installer for Ubuntu 16.04 that may also work well on Ubuntu 14.04; you could give it a try. I used it to install OpenCV on Ubuntu 16.04 and it succeeded!
An interactive installation script for installing OpenCV on Ubuntu 16.04 LTS
The OpenCV version installed (2.4.10) is for Python 2.x.
I think you are trying to use cv2 in Python 3.x (which might be set as the default python).
Open Python 2 in the terminal (use the command python2 instead of python):
>>> import cv2
This should work.
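To check which interpreter each command name actually starts, the names can be resolved against PATH. A minimal sketch, assuming it is run with Python 3.3 or newer (for `shutil.which`); `resolve_interpreters` is a hypothetical helper name.

```python
import shutil
import sys


def resolve_interpreters(names=("python", "python2", "python3")):
    """Map each command name to the executable it resolves to on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("running under Python {}.{}".format(*sys.version_info[:2]))
    for name, path in resolve_interpreters().items():
        print("{:<8} -> {}".format(name, path or "not found"))
```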
I think it is better to just install the Anaconda Python distribution.
https://www.continuum.io/downloads
You can find a wealth of tutorials on the internet on how to install it on your system. And trust me, it is VERY EASY to install.
After you have installed the Anaconda Python distribution, you can install OpenCV 3.1 with the following commands. Note that you need an internet connection.
# if you are using Anaconda for Python 2.7
conda install -c menpo opencv
The above command should install OpenCV 3.1 in your Anaconda Python 2.7.
# if you are using Anaconda for Python 3.5
conda install -c menpo opencv3
The above command should install OpenCV 3.1 in your Anaconda Python 3.5.
Then, to verify that you have successfully installed OpenCV 3.1 on your system, you can issue the following commands in the Python interpreter:
# import the opencv library
import cv2
# prints the version of the OpenCV installed in your system
cv2.__version__
That's it. I hope that helped you =)
Try using this:
sudo apt-get install python-opencv libopencv-dev python-numpy python-dev

Obtaining PIL instead of Pillow for Python 2.7 64-bit on Windows

Pillow for Python seems to be completely broken. Every image produces an IOError: cannot identify image file. Using Python 2.6 (where I had PIL installed) works great. Does anyone know where to get hold of PIL-1.1.7.win-amd64-py2.7.exe now that http://www.lfd.uci.edu/~gohlke/pythonlibs/ has moved on to only offering Pillow?
EDIT: Please note that PIL 1.1.7 on Python 2.7 using Windows 64-bit is confirmed working when opening the same files, we just cannot find the installer.
Here you can find PIL-1.1.7.win-amd64-py2.7.exe
This blog post by Christian explains the process of compiling PIL for 64-bit Python on Windows 7 64-bit with Visual Studio 2010. At the end of the post, a zip file containing the compiled files for PIL and its dependencies is also provided.
Install Pillow or PIL from a repository (option 1 or 2). I would recommend using Pillow instead of PIL. If options 1 and 2 do not help, use option 3.
You don't need a separate installer for Windows.
Option 1: use easy_install:
easy_install Pillow
Option 2: use pip:
pip install Pillow
Option 3: get the Pillow source from the Pillow repository,
unpack it, and run
python setup.py install
Online help for Pillow is here
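When Pillow reports "cannot identify image file", one quick check that does not depend on Pillow is to look at the first bytes of the file. If they match a known image signature, the file itself is probably fine and the Pillow installation is the more likely culprit. The sketch below uses only the standard library; `sniff_image_format` and `sniff_file` are hypothetical helpers and cover just a few common formats.

```python
# Leading "magic" bytes of a few common image formats
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
)


def sniff_image_format(header):
    """Guess the image format from the first bytes of a file, or return None."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return None


def sniff_file(path):
    """Read the start of a file in binary mode and guess its image format."""
    with open(path, "rb") as handle:
        return sniff_image_format(handle.read(16))
```

For example, sniff_file applied to a healthy JPEG should return "jpeg"; a result of None for a file that other viewers open without trouble would suggest an unusual format rather than a broken Pillow install.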
