How to fix a Python tesseract error?

I need to use python tesseract to extract text from a photo:
import pytesseract
from PIL import Image
img = Image.open('stest.png')
pytesseract.pytesseract.tesseract_cmd = 'D:\\python\\venv\\Scripts\\pytesseract.exe'
file_name = img.filename
file_name = file_name.split(".")[0]
text = pytesseract.image_to_string(img,lang=None, config='')
print(text)
with open(f'{file_name}.txt', 'w') as text_file:
    text_file.write(text)
But the error appears as if it is not related to my code:
Traceback (most recent call last):
File "D:\python\pybotavito\main.py", line 13, in <module>
text = pytesseract.image_to_string(img,lang=None, config='')
File "D:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 413, in image_to_string
return {
File "D:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 416, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "D:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 284, in run_and_get_output
run_tesseract(**kwargs)
File "D:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 260, in run_tesseract
raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (2, 'Usage: pytesseract [-l lang] input_file')

From the pytesseract README:
# If you don't have tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'
# Example tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'
This line must point to the tesseract executable, not to the pytesseract Python executable. Set it to the correct executable path; the comment below that line shows an example of where the executable is typically found.
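As a minimal sketch of that advice, the helpers below look for the Tesseract binary on PATH and flag a path that names the pytesseract wrapper instead. The function names is_pytesseract_wrapper and find_tesseract are hypothetical, and the fallback location in the usage comment is simply the README's example, so treat it as an assumption about where Tesseract-OCR is installed.

```python
import ntpath
import shutil


def is_pytesseract_wrapper(path):
    """Return True if the path names the pytesseract script, not the Tesseract binary."""
    # ntpath splits on both "\\" and "/", so Windows paths work on any platform.
    return ntpath.basename(path).lower().startswith("pytesseract")


def find_tesseract(fallback=None):
    """Return tesseract's location from PATH, else the fallback (which may be None)."""
    return shutil.which("tesseract") or fallback


# The path from the question points at the Python wrapper, so this prints True.
print(is_pytesseract_wrapper(r"D:\python\venv\Scripts\pytesseract.exe"))

# Hypothetical usage, assuming pytesseract is imported and Tesseract-OCR is
# installed in the README's example location:
# pytesseract.pytesseract.tesseract_cmd = find_tesseract(
#     fallback=r"C:\Program Files (x86)\Tesseract-OCR\tesseract"
# )
```

If find_tesseract returns None, Tesseract-OCR itself is probably not installed; pip only installs the Python wrapper.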

Related

Permission error: The process cannot access the file because it is being used by another process

I'm using Windows 10 with the latest version of tesseract for text recognition. Below is the sample code I'm using now. For some images, it sometimes gives the following error.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
config = ("-l eng --oem 1 --psm 7")
text=pytesseract.image_to_string(cv2.imread(r'C:\Users\Kesavan\Desktop\project\text Recogniton\opencv-text-recognition\images\cat.jpg'),config=config)
print(text)
Traceback (most recent call last):
File "C:/Users/Kesavan/PycharmProjects/Project/text recogniton/tester.py", line 6, in <module>
text=pytesseract.image_to_string(cv2.imread(r'C:\Users\Kesavan\Desktop\project\text Recogniton\opencv-text-recognition\images\cat.jpg'),config=config)
File "C:\Users\Kesavan\PycharmProjects\learning\venv\Learnings\lib\site-packages\pytesseract\pytesseract.py", line 350, in image_to_string
}[output_type]()
File "C:\Users\Kesavan\PycharmProjects\learning\venv\Learnings\lib\site-packages\pytesseract\pytesseract.py", line 349, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "C:\Users\Kesavan\PycharmProjects\learning\venv\Learnings\lib\site-packages\pytesseract\pytesseract.py", line 265, in run_and_get_output
return output_file.read().decode('utf-8').strip()
File "C:\Users\Kesavan\AppData\Local\Programs\Python\Python37\lib\contextlib.py", line 119, in __exit__
next(self.gen)
File "C:\Users\Kesavan\PycharmProjects\learning\venv\Learnings\lib\site-packages\pytesseract\pytesseract.py", line 177, in save
cleanup(f.name)
File "C:\Users\Kesavan\PycharmProjects\learning\venv\Learnings\lib\site-packages\pytesseract\pytesseract.py", line 134, in cleanup
raise e
File "C:\Users\Kesavan\PycharmProjects\learning\venv\Learnings\lib\site-packages\pytesseract\pytesseract.py", line 131, in cleanup
remove(filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Kesavan\\AppData\\Local\\Temp\\tess_jjp7wfoq.txt'
This is the image for which I'm facing the issue
Thanks in advance

Try upgrading to the latest pytesseract version, or follow this workaround:
https://groups.google.com/d/msg/tesseract-ocr/IyPpisQ1E4U/tl3IP3gqAwAJ
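If upgrading is not possible right away, one possible stop-gap is to retry the OCR call when Windows reports that the temporary file is still in use. This is only a sketch and is not the workaround from the linked thread; the helper name retry_on_permission_error and its defaults are assumptions, and retrying may not help if the file stays locked.

```python
import time


def retry_on_permission_error(func, attempts=3, delay=0.5):
    """Call func(); if it raises PermissionError, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except PermissionError:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)


# Hypothetical usage with the call from the question:
# text = retry_on_permission_error(
#     lambda: pytesseract.image_to_string(image, config=config)
# )
```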

unable to extract text from tif image using pytesseract in python

I am unable to extract text from a .tif image file using pytesseract and PIL in Python.
It works well for .png and .jpg files; it only fails for .tif files.
I am using Python 3.7.1.
It gives the error below when I run the code on a .tif image file. Please let me know what I am doing wrong.
Fax3SetupState: Bits/sample must be 1 for Group 3/4 encoding/decoding.
Traceback (most recent call last):
File "C:/Users/u88ltuc/PycharmProjects/untitled1/Image Processing/Prog1.py", line 13, in <module>
image_to_text = pytesseract.image_to_string(image, lang='eng')
File "C:\Users\u88ltuc\PycharmProjects\untitled1\venv\lib\site-packages\pytesseract\pytesseract.py", line 347, in image_to_string
}[output_type]()
File "C:\Users\u88ltuc\PycharmProjects\untitled1\venv\lib\site-packages\pytesseract\pytesseract.py", line 346, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "C:\Users\u88ltuc\PycharmProjects\untitled1\venv\lib\site-packages\pytesseract\pytesseract.py", line 246, in run_and_get_output
with save(image) as (temp_name, input_filename):
File "C:\Program Files\Python37\lib\contextlib.py", line 112, in __enter__
return next(self.gen)
File "C:\Users\u88ltuc\PycharmProjects\untitled1\venv\lib\site-packages\pytesseract\pytesseract.py", line 171, in save
image.save(input_file_name, format=extension, **image.info)
File "C:\Users\u88ltuc\PycharmProjects\untitled1\venv\lib\site-packages\PIL\Image.py", line 2102, in save
save_handler(self, fp, filename)
File "C:\Users\u88ltuc\PycharmProjects\untitled1\venv\lib\site-packages\PIL\TiffImagePlugin.py", line 1626, in _save
raise OSError("encoder error %d when writing image file" % s)
OSError: encoder error -2 when writing image file
Below is the Python code for it.
#Import modules
from PIL import Image
import pytesseract
# Include tesseract executable in your path
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Create an image object of PIL library
image = Image.open(r'C:\Users\u88ltuc\Desktop\12110845-e001.tif')
# pass image into pytesseract module
image_to_text = pytesseract.image_to_string(image, lang='eng')
# Print the text
print(image_to_text)
Below is a link to the .tif image:
https://ecat.aptiv.com/docs/default-source/ecatalog-documents/12110845-e001-tif.tif?sfvrsn=3ee3b8a1_0

First, convert the image to a different format.
This may solve your problem:
from PIL import Image
from io import BytesIO
import pytesseract
img = Image.open(r"C:\Users\u88ltuc\Desktop\12110845-e001.tif")
TempIO = BytesIO()
img.save(TempIO,format="JPEG")
img = Image.open(BytesIO(TempIO.getvalue()))
print(pytesseract.image_to_string(img))
Or, if you don't mind having two copies of the picture on your desktop, you can skip BytesIO and save the converted file to disk:
from PIL import Image
import pytesseract
img = Image.open(r"C:\Users\u88ltuc\Desktop\12110845-e001.tif")
img.save(r"C:\Users\u88ltuc\Desktop\12110845-e001.jpg")
img = Image.open(r"C:\Users\u88ltuc\Desktop\12110845-e001.jpg")
print(pytesseract.image_to_string(img))

PyAutoGui - screencapture: cannot write file to intended destination

I am trying to find an image on my screen, but it cannot even seem to save the screenshot. Any ideas?
Code:
pyautogui.locateOnScreen('images/toolbox.jpg')
Error:
screencapture: cannot write file to intended destination, .screenshot2018-1106_00-06-22-111441.png
Traceback (most recent call last):
File "/Users/dirk/Desktop/firsttry/test.py", line 103, in <module>
a = pyautogui.locateOnScreen('images/toolbox.jpg')
File "/Users/dirk/Library/Python/2.7/lib/python/site-packages/pyscreeze/__init__.py", line 265, in locateOnScreen
screenshotIm = screenshot(region=None) # the locateAll() function must handle cropping to return accurate coordinates, so don't pass a region here.
File "/Users/dirk/Library/Python/2.7/lib/python/site-packages/pyscreeze/__init__.py", line 331, in _screenshot_osx
im = Image.open(tmpFilename)
File "/Library/Python/2.7/site-packages/PIL/Image.py", line 2609, in open
fp = builtins.open(filename, "rb")
IOError: [Errno 2] No such file or directory: '.screenshot2018-1106_00-06-22-111441.png'
[Finished in 0.8s with exit code 1]
[shell_cmd: python -u "/Users/dirk/Desktop/firstry/test.py"]
[dir: /Users/dirk/Desktop/firsttry]
[path: /opt/local/bin:/opt/local/sbin:/Library/Frameworks/Python.framework/Versions/3.7/bin:~/.composer/vendor/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin]

Open the pyscreeze/__init__.py file (located either in your virtualenv or inside your Python folder), e.g. "/Users/dirk/Library/Python/2.7/lib/python/site-packages/pyscreeze/__init__.py".
Navigate to line 327 or 331, inside the function _screenshot_osx.
Remove the leading . in tmpFilename = '.screenshot%s.png', so that it reads tmpFilename = 'screenshot%s.png'.

IO Error python PIL image preprocessing script

I am following this tutorial and specifically going through the "generate own data" section:
https://github.com/surfertas/deep_learning/tree/master/projects/imdbwiki-challenge
https://github.com/surfertas/deep_learning/blob/master/projects/imdbwiki-challenge/imdb_preprocess.py
and I am facing this issue when running the imdb_preprocess.py script:
Dictionary created...
Converting 1000 samples. (0=all samples)
Traceback (most recent call last):
File "imdb_preprocess.py", line 137, in <module>
main()
File "imdb_preprocess.py", line 131, in main
create_and_dump(imdb_dict, args.partial)
File "imdb_preprocess.py", line 106, in create_and_dump
for img_path in imgs
File "/usr/lib64/python2.7/site-packages/scipy/misc/pilutil.py", line 156, in imread
im = Image.open(name)
File "/usr/lib64/python2.7/site-packages/PIL/Image.py", line 2477, in open
fp = builtins.open(filename, "rb")
IOError: [Errno 2] No such file or directory: u'/path/48/10000548_1925-04-04_1964.jpg'
I manually checked folder 48 and confirmed that the image it is complaining about is indeed there.
Any hints on where the fault is?
(The path in the traceback was replaced.)

Creating a PDF with scripting in Blender

I am an all around newbie here (new to Blender, new to Python, and new to coding in general) so please bear with me.
I have a Blender script that generates a specific geometry and then renders an image. In the same script, I would then like to create a PDF file containing that image.
I have two different pdf generation scripts that work perfectly fine outside of Blender (I am using Spyder) but if I run the same code in Blender, I run into problems.
Here is the first one:
import datetime
from reportlab.lib.enums import TA_JUSTIFY
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Image
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import mm
import os.path
formatted_date = datetime.date.today()
date_str = str(formatted_date)
full_name = "Nachname, Vorname"
fpath = "I:/MedTech_Projekte/NAM/Studenten/WenokorRebecca_SA/Spyder Scripts/"
fname = full_name + "_" + date_str
fcount = 0
fcounts = fname + "_" + str(fcount) + ".pdf"
while os.path.isfile(fcounts)==True:
    fcount += 1
    fcounts = fname + "_" + str(fcount) + ".pdf"
print(fcounts)
fname = fcounts
doc = SimpleDocTemplate(fpath + fname,pagesize=letter,
                        rightMargin=72,leftMargin=72,
                        topMargin=72,bottomMargin=18)
Story=[]
KRIlogo = fpath + "Klinikum_rechts_der_Isar_logo.png"
lg_res_x = 1920
lg_res_y = 1080
lg_w = 50
lg_h = lg_w * lg_res_y/lg_res_x
lg = Image(KRIlogo, lg_w*mm, lg_h*mm)
lg.hAlign = 'RIGHT'
Story.append(lg)
wireIm = fpath + "20170102_red_20170207-092526.png"
bl_res_x = 1920
bl_res_y = 1080
im_w = 60
im_h = im_w * bl_res_y/bl_res_x
im = Image(wireIm, im_w*mm, im_h*mm)
im.hAlign = 'LEFT'
Story.append(im)
styles=getSampleStyleSheet()
styles.add(ParagraphStyle(name='Justify', alignment=TA_JUSTIFY))
ntext = '<font size=12>%s</font>' % full_name
dtext = '<font size=12>%s</font>' % date_str
Story.append(Paragraph(ntext, styles["Normal"]))
Story.append(Spacer(1, 12))
Story.append(Paragraph(dtext, styles["Normal"]))
Story.append(Spacer(1, 12))
doc.build(Story)
Here is the second one:
import datetime
from reportlab.pdfgen import canvas
from reportlab.lib.units import mm
from reportlab.lib.utils import ImageReader
import os.path
formatted_date = datetime.date.today()
date_str = str(formatted_date)
full_name = "Nachname, Vorname"
fpath = "I:/MedTech_Projekte/NAM/Studenten/WenokorRebecca_SA/Spyder Scripts/"
fname = full_name + "_" + date_str
fcount = 0
fcounts = fname + "_" + str(fcount) + ".pdf"
while os.path.isfile(fcounts)==True:
    fcount += 1
    fcounts = fname + "_" + str(fcount) + ".pdf"
print(fcounts)
fname = fcounts
wireIm = fpath + "20170102_red_20170207-092526.png"
bl_res_x = 1920
bl_res_y = 1080
im_w = 60
im_h = im_w * bl_res_y/bl_res_x
WireImage = ImageReader(wireIm)
c = canvas.Canvas(fname)
c.drawImage(WireImage, 10, 10, width=60*mm)
c.showPage()
c.save()
Both scripts give me pretty much the same error:
Traceback (most recent call last):
File "I:\MedTech_Projekte\NAM\Studenten\WenokorRebecca_SA\BLENDER CODE\20161219 - Present\20170109 Face Align.blend\Text.002", line 58, in <module>
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\doctemplate.py", line 1200, in build
BaseDocTemplate.build(self,flowables, canvasmaker=canvasmaker)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\doctemplate.py", line 956, in build
self.handle_flowable(flowables)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\doctemplate.py", line 821, in handle_flowable
if frame.add(f, canv, trySplit=self.allowSplitting):
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\frames.py", line 167, in _add
w, h = flowable.wrap(aW, h)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\flowables.py", line 484, in wrap
return self.drawWidth, self.drawHeight
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\flowables.py", line 478, in __getattr__
self._setup_inner()
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\flowables.py", line 442, in _setup_inner
img = self._img
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\platypus\flowables.py", line 472, in __getattr__
self._img = ImageReader(self._file)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 807, in __init__
annotateException('\nfileName=%r identity=%s'%(fileName,self.identity()))
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 1387, in annotateException
rl_reraise(t,v,b)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 144, in rl_reraise
raise v
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 801, in __init__
annotateException('\nImaging Library not available, unable to import bitmaps only jpegs\nfileName=%r identity=%s'%(fileName,self.identity()))
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 1387, in annotateException
rl_reraise(t,v,b)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 144, in rl_reraise
raise v
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\lib\utils.py", line 799, in __init__
self._width,self._height,c=readJPEGInfo(self.fp)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\pdfbase\pdfutils.py", line 243, in readJPEGInfo
x = struct.unpack('B', image.read(1))
struct.error: unpack requires a bytes object of length 1
Imaging Library not available, unable to import bitmaps only jpegs
fileName='I:/MedTech_Projekte/NAM/Studenten/WenokorRebecca_SA/Spyder Scripts/Klinikum_rechts_der_Isar_logo.png' identity=[ImageReader#0xac09ef0 filename='I:/MedTech_Projekte/NAM/Studenten/WenokorRebecca_SA/Spyder Scripts/Klinikum_rechts_der_Isar_logo.png']
fileName='I:/MedTech_Projekte/NAM/Studenten/WenokorRebecca_SA/Spyder Scripts/Klinikum_rechts_der_Isar_logo.png' identity=[ImageReader#0xac09ef0 filename='I:/MedTech_Projekte/NAM/Studenten/WenokorRebecca_SA/Spyder Scripts/Klinikum_rechts_der_Isar_logo.png']
Error: Python script fail, look in the console for now...
When I use jpeg instead of png, I get the following:
Bibliotheken/Dokumente/Spyder Scripts/20170102_red_20170207-092526.jpeg
Traceback (most recent call last):
File "I:\MedTech_Projekte\NAM\Studenten\WenokorRebecca_SA\BLENDER CODE\20161219 - Present\20170109 Face Align.blend\Text.001", line 37, in <module>
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\pdfgen\canvas.py", line 1237, in save
self._doc.SaveToFile(self._filename, self)
File "C:\Program Files\Blender Foundation\Blender\2.78\python\lib\site-packages\reportlab\pdfbase\pdfdoc.py", line 218, in SaveToFile
f = open(filename, "wb")
PermissionError: [Errno 13] Permission denied: 'Nachname, Vorname_2017-02-10_0.pdf'
Error: Python script fail, look in the console for now...
A lot of online forums mention the need for PIL and/or pillow when working with images. I don't fully understand how I would use those libraries in my code, but if the code works without them in Spyder, I don't see why it would all of a sudden need them in Blender.
Any help is very much appreciated!!! Feel free to ask for more information if my question is not clear :)
Thanks!

Python provides an environment for running Python code. The standard install can read files and print text to the console running the script, among a variety of other things.
To use functionality that isn't included with the standard Python install, we can install and use third-party modules. The reportlab module that you are using to create PDFs is one example. The reportlab module knows how to create a PDF file; if you want it to add an image to the PDF, it uses another module that knows how to read image files. If no module for reading images is available, it can't get the image information needed to add the image to the PDF, but it can still create PDFs without images.
When you install Python, the main program and the various modules are installed into specific places where they can be found when needed. An installation of Blender contains its own copy of Python and its standard library, which isn't set up to use any normal installation of Python that you may have. As you have found, you can manually add items to Blender's version of Python, but the failure within Blender of a script that works in Spyder (which is using the standard installation of Python) indicates that you have missed something.
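A quick way to see what is missing is to run the same check in both Spyder and Blender's Python console and compare the results. The sketch below assumes the image module reportlab is looking for is Pillow (imported as PIL), as the "Imaging Library not available" message suggests; the helper name module_available is hypothetical.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the interpreter running this code."""
    return importlib.util.find_spec(name) is not None


# Shows which Python installation is running, and whether it can see each module.
print("Python prefix:", sys.prefix)
print("reportlab available:", module_available("reportlab"))
print("Pillow (PIL) available:", module_available("PIL"))
```

If Pillow shows up in Spyder but not in Blender, it needs to be added to Blender's bundled Python in the same way reportlab was.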
The second error is due to permissions that prevent a normal user from writing to the application's folder. This happens because you are only specifying a filename, which leads to trying to create the file in the current directory. You should be able to fix this error by using a full path to the target file instead of just the filename.
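A minimal sketch of building the full path follows. Note that both scripts in the question call os.path.isfile(fcounts) on the bare filename, and the second one also passes the bare fname to canvas.Canvas, so both the existence check and the save happen relative to the current directory. The helper name unique_pdf_path is hypothetical.

```python
import os


def unique_pdf_path(directory, base_name):
    """Return the first '<directory>/<base_name>_<n>.pdf' that does not exist yet."""
    count = 0
    while True:
        candidate = os.path.join(directory, "{}_{}.pdf".format(base_name, count))
        if not os.path.isfile(candidate):
            return candidate
        count += 1


# Hypothetical usage with the variables from the question's second script:
# c = canvas.Canvas(unique_pdf_path(fpath, full_name + "_" + date_str))
```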
You may want to look at the subprocess module to run your PDF creation script externally from Blender, passing the location of the image created with Blender as an argument. This lets you use a Python script to automate the tasks in Blender and do the PDF generation in the same setup that you are using within Spyder.
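The subprocess approach could look roughly like this. The function name run_pdf_script is hypothetical, as are the interpreter and script paths in the usage comment; the external script is assumed to read the image location from sys.argv[1]. The keyword arguments are chosen to work on the older Python bundled with Blender 2.78 as well as on current versions.

```python
import subprocess


def run_pdf_script(python_exe, script_path, image_path):
    """Run script_path with an external interpreter, passing image_path as its argument."""
    result = subprocess.run(
        [python_exe, script_path, image_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # return text instead of bytes
        check=True,               # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Hypothetical usage from inside Blender, after the render has been saved:
# output = run_pdf_script(
#     r"C:\path\to\your\python.exe",   # the interpreter that Spyder uses
#     r"C:\path\to\make_pdf.py",       # your PDF script, reading sys.argv[1]
#     rendered_image_path,
# )
```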

In your terminal, type
sudo gnome-terminal
This will give you root access; then try running the code again.
