I recently received the following error:
File "/usr/local/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 287, in run_and_get_output
run_tesseract(**kwargs)
File "/usr/local/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 263, in run_tesseract
raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, "read_params_file: Can't open ][|~_}{=!#%&«§><:;—?¢°*#, Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.")
I have a python file where I specify the pytesseract location:
pytesseract.pytesseract.tesseract_cmd = r"to/my/path"
Then I also included tesseract and pytesseract in the requirements, and I install tesseract-ocr in the Dockerfile.
I do not understand why this happens. Can anyone assist?
I also copied my tesseract-ocr folder into the image in the Dockerfile:
COPY tesseract-ocr .
Edited:
Below is my requirements:
opencv-python==4.5.1.48
openpyxl==3.0.6
packaging==20.8
pandas==1.2.1
pathlib==1.0.1
patsy==0.5.1
pdfminer.six==20200517
pdfplumber==0.5.25
Pillow==8.1.0
prov==2.0.0
pycryptodome==3.9.9
pydot==1.4.1
PyMuPDF==1.16.14
pyparsing==2.4.7
PyPDF2==1.26.0
pytesseract==0.3.7
tesseract
Below is my dockerfile
FROM python:3.8.7-slim
WORKDIR /usr/src/app
ARG src_folder= "folder/"
ARG src_ocr= "Tesseract-OCR/"
COPY ${src_folder} .
COPY ${src_ocr} .
COPY requirements.txt .
# Install all the required dependencies
RUN apt-get update \
&& apt-get install -y \
build-essential \
cmake \
git \
wget \
unzip \
yasm \
pkg-config \
libswscale-dev \
libtbb2 \
libtbb-dev \
libjpeg-dev \
libpng-dev \
libtiff-dev \
libavformat-dev \
libpq-dev \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get --fix-missing update && apt-get --fix-broken install && apt-get install -y poppler-utils && apt-get install -y tesseract-ocr && \
apt-get install -y libtesseract-dev && apt-get install -y libleptonica-dev && ldconfig && apt install -y libsm6 libxext6 && apt install -y python-opencv
RUN pip install -r requirements.txt
# command to run on container start
CMD [ "python", "./folder/main.py" ]
You have two problems here...
The primary problem is a strange one. The apt package tesseract-ocr-eng is installed as a transitive dependency of one of the other packages you install with apt-get:
# apt-get install tesseract-ocr-eng
...
tesseract-ocr-eng is already the newest version (1:4.00~git30-7274cfa-1).
and that package installs an English trained data file in the right place:
# ls -l /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
-rw-r--r-- 1 root root 4113088 Sep 15 2017 /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
and yet you get this error message suggesting that Tesseract can't find this file.
After some Googling and trying a number of different things that got Tesseract working, I arrived at this most concise solution to your problem. Add this line near the end of your Dockerfile, for example just before the final CMD line that sets the Docker command to be executed:
RUN wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata -O /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata 2> /dev/null
This command replaces the previously installed eng.traineddata file with the one from the tesseract-ocr/tessdata repository on GitHub. It is much larger than the previously installed file:
# ls -l /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
-rw-r--r-- 1 root root 23466654 Feb 14 20:17 /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
By replacing the previously installed eng.traineddata file with this new version, your code starts to run fine. I didn't have your image data, obviously, so I had to change your code a bit to use my own image for testing. When I supplied an image with some text in it, I got back the text as the result of calling pytesseract.image_to_string. So this one fix should be all you need.
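To confirm which eng.traineddata file is present, and how large it is, before and after the swap, a small check can be run inside the container. This is a minimal sketch rather than part of the original fix: the helper name find_traineddata and the choice to pass the search directories explicitly are assumptions for illustration; only the tessdata path comes from the listing above.

```python
from pathlib import Path


def find_traineddata(lang, search_dirs):
    """Return (path, size_in_bytes) for the first <lang>.traineddata found,
    or None if no directory in search_dirs contains it."""
    for directory in search_dirs:
        candidate = Path(directory) / f"{lang}.traineddata"
        if candidate.is_file():
            return candidate, candidate.stat().st_size
    return None


# Path taken from the ls output above; adjust it for other Tesseract versions.
result = find_traineddata("eng", ["/usr/share/tesseract-ocr/4.00/tessdata"])
if result is None:
    print("eng.traineddata not found")
else:
    path, size = result
    print(f"{path}: {size} bytes")
```

Comparing the reported size before and after the wget step shows whether the replacement actually landed in the directory Tesseract reads.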
There is a second problem here. Your pytesseract.image_to_string call is being garbled somehow by the fact that you're breaking it across multiple lines. To fix just this one issue, you can edit the call so that the string constant is all on one line:
infor = pytesseract.image_to_string(im,
lang="eng",
config='--dpi 300 --psm 6 --oem 2 -c tessedit_char_blacklist=][|~_}{=!#%&«§><:;—?¢°*#,')
When I made just this change, the part of the error message about "Can't open ..." went away. If you fix just that, you're left with the error message:
pytesseract.pytesseract.TesseractError: (1, "Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.")
It's interesting that if you apply just the first fix, both problems go away, as you don't get an error message at all. I don't know what's up with that.
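One possible explanation for the "Can't open ..." part, offered as an assumption rather than something confirmed in this thread: if the config string is split into command-line arguments on whitespace (the way Python's shlex.split does), a line break inside the string constant separates the blacklist characters from tessedit_char_blacklist=, and the stray token could then be treated as a file name. The sketch below only demonstrates the splitting itself and does not call Tesseract.

```python
import shlex

BLACKLIST = "][|~_}{=!#%&«§><:;—?¢°*#,"

# Config written as one unbroken string constant.
good = "--dpi 300 --psm 6 --oem 2 -c tessedit_char_blacklist=" + BLACKLIST

# Same config, but with a line break and indentation inside the constant,
# as happens when the literal is continued across source lines.
broken = "--dpi 300 --psm 6 --oem 2 -c tessedit_char_blacklist=\n    " + BLACKLIST

good_args = shlex.split(good)
broken_args = shlex.split(broken)

print(good_args[-1])     # option name and value stay in one argument
print(broken_args[-2:])  # the value has become a separate argument
```

If that assumption holds, keeping the whole config value in a single string with no embedded whitespace is what makes the blacklist stay attached to its option.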
I believe that I've given you all that you need to know. If you have additional problems, let us know. If you want me to share my versions of your Dockerfile and main.py files, I can do that.
Happy Tesseracting!
PS: I would recommend that you move the installation calls in your Dockerfile, the calls to apt-get and pip, to the top of the file. This way, you can modify the parts of your Dockerfile specific to your application later on in the file, and your image build will happen quickly rather than all of the long installation steps having to be done again. This is an important practice to understand when building Docker images. It will save you a ton of time watching long Docker image builds over and over again. I did this right away when working on your problem, and I could rebuild and run the next version of the Docker image in just a few seconds rather than it taking more than a minute to rebuild and run each new image.
I am following this tutorial using Google Colab.
When I run the line game.init(), I get this error:
ViZDoomErrorException: Could not initialize SDL video:
No available video device
I installed vizdoom as follows:
%%bash
# Install deps from
# https://github.com/mwydmuch/ViZDoom/blob/master/doc/Building.md#-linux
apt-get install build-essential zlib1g-dev libsdl2-dev libjpeg-dev \
nasm tar libbz2-dev libgtk2.0-dev cmake git libfluidsynth-dev libgme-dev \
libopenal-dev timidity libwildmidi-dev unzip
# Boost libraries
apt-get install libboost-all-dev
# Lua binding dependencies
apt-get install liblua5.1-dev
Colab is run on a machine in the cloud. It cannot send the display back to your local machine. That's why it said "no video device".
Add the line game.set_window_visible(False) in the Step 8 cell of the Jupyter Notebook, before game.init() is called. Unless told otherwise, ViZDoom tries to open a game window, which is not supported in Colab.
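As a minimal sketch of where that call goes: the helper name make_headless is hypothetical, and the commented usage assumes the vizdoom package is installed and that the tutorial's game object is a DoomGame.

```python
def make_headless(game):
    """Disable the on-screen window; call this before game.init()."""
    game.set_window_visible(False)
    return game


# Intended use in the notebook (requires the vizdoom package):
#   from vizdoom import DoomGame
#   game = make_headless(DoomGame())
#   game.init()
```

The only point is ordering: the window has to be disabled before game.init() runs, because that is when the video device is requested.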
I'm developing a Python tool which uses a sqlite3 virtual table with FTS5 (Full Text Search). I would like to know how to properly install from a tarball (or any other means) the needed requirements for my tool to work so I can pack them for portability.
Currently, I managed to install the latest release tarball of sqlite. However, when I execute:
python3 -c "import sqlite3; print(sqlite3.sqlite_version)"
# or
python2 -c "import sqlite3; print(sqlite3.sqlite_version)"
I get 3.11.0, while sqlite3 --version returns: 3.22.0 2018-01-22 18:45:57 0c55d179733b46d8d0ba4d88e01a25e10677046ee3da1d5b1581e86726f2alt1
The system version sqlite3 3.22 does support FTS5, as I do pragma compile_options; and get:
COMPILER=gcc-5.4.0 20160609
ENABLE_DBSTAT_VTAB
ENABLE_FTS4
**ENABLE_FTS5**
ENABLE_JSON1
ENABLE_RTREE
ENABLE_STMTVTAB
ENABLE_UNKNOWN_SQL_FUNCTION
HAVE_ISNAN
THREADSAFE=1
But the Python version, using this script, returns this:
[(u'ENABLE_COLUMN_METADATA',), (u'ENABLE_DBSTAT_VTAB',), (u'ENABLE_FTS3',), (u'ENABLE_FTS3_PARENTHESIS',), (u'ENABLE_JSON1',), (u'ENABLE_LOAD_EXTENSION',), (u'ENABLE_RTREE',), (u'ENABLE_UNLOCK_NOTIFY',), (u'ENABLE_UPDATE_DELETE_LIMIT',), (u'HAVE_ISNAN',), (u'LIKE_DOESNT_MATCH_BLOBS',), (u'MAX_SCHEMA_RETRY=25',), (u'OMIT_LOOKASIDE',), (u'SECURE_DELETE',), (u'SOUNDEX',), (u'SYSTEM_MALLOC',), (u'TEMP_STORE=1',), (u'THREADSAFE=1',)]
Hence, my questions are:
Is there any way I could make a portable Linux package for my app with sqlite3 FTS5 support in both Python and the Linux system?
Is there any way to link the Python sqlite3 module to a specific sqlite3 path?
I tried all of this on Ubuntu 16.04 LTS, but I would like it to work on CentOS 7 as well.
Thank you very much in advance.
More details about the installation from the tarball that I did:
wget "https://www.sqlite.org/src/tarball/sqlite.tar.gz?r=release" -O sqlite.tar.gz
tar -xzvf sqlite.tar.gz
cd sqlite
./configure --enable-fts5
make
sudo make install
The easy way is to use apsw (Another Python SQLite Wrapper). Its API is a little different from sqlite3, and you can't just pip-install it (unless you're okay with an outdated version), but the rest is good and you get the most recent features of SQLite.
wget https://github.com/rogerbinns/apsw/releases/download/3.22.0-r1/apsw-3.22.0-r1.zip
unzip apsw-3.22.0-r1.zip
cd apsw-3.22.0-r1
python setup.py fetch --sqlite build --enable-all-extensions install
Then,
import apsw
apsw.Connection(':memory:').cursor().execute('pragma compile_options').fetchall()
Returns:
[('COMPILER=gcc-5.4.0 20160609',),
('ENABLE_API_ARMOR',),
('ENABLE_FTS3',),
('ENABLE_FTS3_PARENTHESIS',),
('ENABLE_FTS4',),
('ENABLE_FTS5',),
('ENABLE_ICU',),
('ENABLE_JSON1',),
('ENABLE_RBU',),
('ENABLE_RTREE',),
('ENABLE_STAT4',),
('THREADSAFE=1',)]
The hard way is to compile Python with custom SQLite. More detail in this article by Charles Leifer.
I think it is a linking problem! I followed the same install steps as you and got the same results:
$ python ./test.py
[(u'ENABLE_COLUMN_METADATA',), (u'ENABLE_FTS3',), (u'ENABLE_RTREE',), (u'ENABLE_UNLOCK_NOTIFY',), (u'ENABLE_UPDATE_DELETE_LIMIT',), (u'MAX_SCHEMA_RETRY=25',), (u'OMIT_LOOKASIDE',), (u'SECURE_DELETE',), (u'SOUNDEX',), (u'SYSTEM_MALLOC',), (u'TEMP_STORE=1',), (u'THREADSAFE=1',)]
NO
However, when you install something with configure/make/make install on Linux, it usually goes into /usr/local/lib. To make sure that Python links against the correct .so at runtime, I used LD_LIBRARY_PATH. In this case I got:
$ LD_LIBRARY_PATH=/usr/local/lib python ./test.py
[(u'COMPILER=gcc-4.8.5',), (u'ENABLE_FTS5',), (u'HAVE_ISNAN',), (u'TEMP_STORE=1',), (u'THREADSAFE=1',)]
YES
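The test.py used here is not shown in the thread. A minimal sketch of such a script, assuming it prints the compile options and then YES or NO depending on whether FTS5 is usable, could look like this (the function names are illustrative):

```python
import sqlite3


def compile_options():
    """Return the compile options of the SQLite library Python is linked to."""
    con = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in con.execute("PRAGMA compile_options")]
    finally:
        con.close()


def has_fts5():
    """Try to create an FTS5 virtual table; this fails if FTS5 is missing."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        con.close()


print(sqlite3.sqlite_version)
print(compile_options())
print("YES" if has_fts5() else "NO")
```

Creating a throwaway virtual table is a more direct test than looking for ENABLE_FTS5 in the option list, since it exercises the feature itself.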
Additionally, after installing libraries you might have to run ldconfig to refresh the dynamic linker cache. On my system (Ubuntu 14.04):
$ sudo ldconfig
$ python ./test.py
[(u'COMPILER=gcc-4.8.5',), (u'ENABLE_FTS5',), (u'HAVE_ISNAN',), (u'TEMP_STORE=1',), (u'THREADSAFE=1',)]
YES
Notice that using LD_LIBRARY_PATH is not needed any more and python links against the correct lib. For this to happen you will need to have /usr/local/lib folder in your ld.so.conf somewhere... for me this is in:
$ grep -ir local /etc/ld.so.conf.d/
/etc/ld.so.conf.d/libc.conf:/usr/local/lib
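To see which libsqlite3 the running interpreter actually loaded, the process's memory map can be inspected on Linux. This is a sketch under stated assumptions: it relies on /proc/self/maps being available (Linux only), the function names are made up for illustration, and it reports nothing if SQLite was linked statically into the Python build.

```python
import sqlite3  # importing this loads the _sqlite3 extension and its libsqlite3


def sqlite_libs_in_maps(maps_text):
    """Extract libsqlite3 file paths from the text of /proc/<pid>/maps."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) == 6 and "libsqlite3" in parts[5]:
            paths.add(parts[5].strip())
    return sorted(paths)


def loaded_sqlite_libs():
    """Return the libsqlite3 paths mapped into this process, if readable."""
    try:
        with open("/proc/self/maps") as fh:
            return sqlite_libs_in_maps(fh.read())
    except OSError:
        return []


print(sqlite3.sqlite_version, loaded_sqlite_libs())
```

Running it with and without LD_LIBRARY_PATH=/usr/local/lib should show the path switching between the distribution's library and the one installed by make install.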
Thank you for your answers, @urban and @saaj. I found both of them constructive.
The problem I see with @saaj's answer is that it requires an extra package, apsw, which is not compatible with PyPy, for example. I could not get it to work, but that may be my fault.
I really like @urban's answer. I followed the process and got it working, so I marked that answer as correct.
However, I would like to add my own answer. It is quite aggressive, but it worked for me. I created an Ubuntu Docker image with the following Dockerfile:
FROM ubuntu:16.04
RUN apt-get update -y
RUN DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true apt-get install -y apt-utils tzdata
RUN DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true dpkg-reconfigure tzdata
RUN echo "Europe/Berlin" > /etc/timezone
RUN dpkg-reconfigure -f noninteractive tzdata
RUN apt-get update -y
RUN apt-get install -y git build-essential sudo
Afterwards, inside the Ubuntu container, I ran the following. In the process I removed sqlite3 and installed its dependencies, which I found in the following article, and then reinstalled Python.
sudo apt-get update -y
echo "[ - Removing sqlite3... ]"
sudo apt-get remove -y sqlite3
sudo apt-get purge -y sqlite3
echo "[ - Installing sqlite3 dependencies... ]"
sudo apt-get install -y build-essential bzip2 git libbz2-dev libc6-dev libgdbm-dev libgeos-dev liblz-dev liblzma-dev libncurses5-dev libncursesw5-dev libreadline6 libreadline6-dev libsqlite3-dev libssl-dev lzma-dev python-dev python-pip python-software-properties python-virtualenv software-properties-common sqlite3 tcl tk-dev tk8.5-dev wget
echo "[ - Installing sqlite3... ]"
sudo wget "https://www.sqlite.org/src/tarball/sqlite.tar.gz?r=release" -O sqlite.tar.gz &> /dev/null
sudo tar -xzvf sqlite.tar.gz
cd sqlite
sudo ./configure --enable-fts5
sudo make
sudo make install
cd ..
echo "[ - Reinstalling python... ]"
sudo apt-get remove -y python python3 python-dev
sudo apt-get install -y --reinstall python2.7 python3 python-dev
sudo apt-get install -y build-essential bzip2 git libbz2-dev libc6-dev libgdbm-dev libgeos-dev liblz-dev liblzma-dev libncurses5-dev libncursesw5-dev libreadline6 libreadline6-dev libsqlite3-dev libssl-dev lzma-dev python-dev python-pip python-software-properties python-virtualenv software-properties-common sqlite3 tcl tk-dev tk8.5-dev wget
I'm trying to install the Python TA-Lib package on Ubuntu, but when I run:
pip install TA-Lib
I get this error:
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-YfCSFn/TA-Lib/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-swmI7D-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-YfCSFn/TA-Lib/
I already installed:
sudo apt-get install python3-dev
and installed Ta-lib
How can I fix this?
I am able to load in python3.
Steps:
download from http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
untar: tar -xvf ta-lib-0.4.0-src.tar.gz
cd /../ta-lib
./configure --prefix=/usr
make
sudo make install
sudo apt upgrade
pip install ta-lib or pip install TA-Lib
Check with: import talib
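After these steps, a quick check can tell whether the C library and the Python wrapper are both visible. This is a sketch, assuming the C library is installed under the name ta_lib (the name the wrapper links against) and that the wrapper's import name is talib; the function name talib_status is illustrative.

```python
import ctypes.util
import importlib.util


def talib_status():
    """Report whether the TA-Lib C library and Python wrapper can be found."""
    return {
        # Name of the shared library if the system linker can locate it, else None.
        "c_library": ctypes.util.find_library("ta_lib"),
        # True if a module named talib is importable from this interpreter.
        "python_wrapper": importlib.util.find_spec("talib") is not None,
    }


print(talib_status())
```

If c_library comes back as None, the pip build is likely to fail because the compiler cannot find the library the wrapper links to.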
It seems other people have had this problem.
To quote the accepted answer:
Seems that your PiP can't access Setuptools as per the "import
setuptools" in the error. Try the below first then try running your
pip install again.
> sudo pip install -U setuptools
Or, if that doesn't work, to quote his comment:
Try this 'sudo -H pip install TA-Lib'
As Filipe Ferminiano said in a comment, if this still doesn't fix it, you can try what is described at this link.
To quote the accepted answer once again:
Your sudo is not getting the right python. This is a known behaviour of sudo in Ubuntu. See this question for more info. You need to make sure that sudo calls the right python, either by using the full path:
sudo /usr/local/epd/bin/python setup.py install
or by doing the following (in bash):
alias sudo='sudo env PATH=$PATH'
sudo python setup.py install
Here is the question he's talking about
Please give credit to the authors of the accepted answers if this fixes your problem.
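To check whether sudo really picks up a different interpreter, the same small script can be run once as a normal user and once under sudo, and the output compared. This is only a diagnostic sketch; the function name interpreter_report is hypothetical.

```python
import shutil
import sys


def interpreter_report():
    """Show which interpreter is running and which ones PATH resolves to."""
    return {
        "running": sys.executable,
        "first_python_on_path": shutil.which("python"),
        "first_python3_on_path": shutil.which("python3"),
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```

If the paths differ between the two runs, the package is being installed for one interpreter and imported from another.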
This has always been a tricky one, but I made a script that has served me loyally on several Ubuntu physical, VM, and server instances (including GitHub Actions).
It's a little long, but it's comprehensive and has worked on every Ubuntu instance I've needed it for. It includes a few precautionary steps for things that have previously caused errors.
sudo apt update && sudo apt upgrade -y && sudo apt autoremove -y
sudo apt install wget -y
sudo add-apt-repository ppa:deadsnakes/ppa -y
sudo apt-get install build-essential -y
sudo apt install python3.10-dev -y
sudo apt-get install python3-dev -y
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib
wget 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD' -O './config.guess'
wget 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD' -O './config.sub'
./configure --prefix=/usr
make
sudo make install
cd ..
sudo rm -rf ta-lib
sudo rm -rf ta-lib-0.4.0-src.tar.gz
pip install ta-lib
It consists of many steps...
Update built-in packages using apt.
Add the deadsnakes repo.
Install build-essential and python-dev (python3-dev, python3.10-dev).
Download the TA-Lib tarball with wget.
Download updated config.guess and config.sub scripts, which configure uses to recognize the platform, to prevent common issues with make and make install.
make TA-Lib and make install it.
Clean up the mess.
Install with pip to make sure everything worked out. (pip will install the latest version of TA-Lib, 0.4.24, even though we can only download the source for 0.4.0. This works fine.)
Because I use this frequently, I turned it into a gist, for the purpose of accessing the script directly with curl.
Just grab a raw link from the Gist page and use it like below.
curl https://gist.githubusercontent.com/preritdas/bunchofrandomstuffhere/install-talib-ubuntu.sh | sudo bash
Make sure you have an active sudo session before running the command to prevent issues. It will run all the above commands as a script and install TA-Lib in about 4-5 minutes (average, in my experience).
Here's a shell recording of this working on a fresh server instance of Ubuntu 22.04.
All in all, I hope this helps; for me, it made a once frustrating and volatile process easy.
I have Python 3.4 (32-bit) installed, and I installed the python-libtorrent-0.16.16.win32.msi on top of that.
My test code says:
ImportError: DLL load failed: %1 is not a valid Win32 application.
My google results suggest this works fine with Python 2.7. Is that the solution? I have to down-grade my Python?
No. Libtorrent doesn't support Python 3.
It compiles but doesn't work, due to differences in how Python 3 handles UTF-8.
There was an unsuccessful effort to make it work a while back
https://code.google.com/p/libtorrent/issues/detail?id=449
The current trunk even contains code that is invalid in Python 3, for example:
http://sourceforge.net/p/libtorrent/code/HEAD/tree/trunk/bindings/python/setup.py
Line 70 > 'print cmdline'
For some reason there is an Ubuntu python3-libtorrent package, which confuses people, but it definitely doesn't work, and neither does manual compilation.
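The UTF-8 handling difference can be illustrated in plain Python 3, without libtorrent. This is a sketch under an assumption: that the binding hands back raw binary data which then gets decoded as UTF-8 text. The byte values are made up for illustration.

```python
# Hypothetical binary payload; 0xA2 is not a valid first byte in UTF-8.
raw = bytes([0xA2, 0x01, 0x02])

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Python 3 keeps binary data and text apart; leaving the payload as bytes
# avoids the decode step entirely.
as_bytes = raw
print(len(as_bytes))
```

In Python 2 the same data would have been an ordinary str, with no decoding involved, which is consistent with the bindings working there.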
steps:
apt-get build-dep libtorrent-rasterbar
export 'PYTHON_VERSION=3.4'; export 'PYTHON=/usr/bin/python3.4'
./configure LDFLAGS="-L/usr/lib/python3.4/config-3.4m-x86_64-linux-gnu/" --enable-python-binding --enable-geoip=no \
--with-boost-python=boost_python-py34
ldconfig
>> python
import libtorrent
ses = libtorrent.session()
ses.save_state()
"UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa2 in position 0: invalid start byte"
It does support Python 3.
This Dockerfile works for me (libtorrent is a local directory containing the library, checked out at the version needed):
FROM debian:buster-slim
WORKDIR /app/libtorrent
COPY debian/backports.list /etc/apt/sources.list.d/
RUN apt-get update
RUN apt-get install -y -t buster-backports checkinstall
RUN apt-get install -y build-essential libboost-system-dev libboost-python-dev libboost-chrono-dev libboost-random-dev libssl-dev
RUN apt-get install -y autoconf automake libtool
ADD libtorrent /app/libtorrent
RUN ./autotool.sh
# RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.7 1
ENV PYTHON=/usr/bin/python3.7
RUN ./configure --enable-python-binding --with-libiconv
RUN make
RUN checkinstall
RUN ldconfig