I am working on event-based video processing. I am new to this and am having trouble loading a raw video file into Python. Any suggestions would be greatly appreciated!
I first tried to read the file directly but got an OSError.
OSError: Failed to open camera C:\Users\masiy\Documents\MPhil\Master\raw_videos\recording_2022-02-20_20-50-22 !
Then I tried checking the path step by step with assertions.
path = ["Users", "masiy", "Documents", "MPhil", "Master", "raw_videos", "recording_2022-02-20_20-50-22"]
f = 'C:'
for p in path:
    f = os.path.join(f, p)
    print(f)
    assert os.path.exists(f)
assert os.path.isfile(f)
Traceback (most recent call last):
  File "C:/Users/masiy/PycharmProjects/mphil_testing/load_automotive_sequence.py", line 34, in <module>
    assert os.path.exists(f)
AssertionError
C:Users
Any help would be appreciated!
I tried to replace my video with a video downloaded online and the code worked perfectly fine. I am so confused now...
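The printed value C:Users hints at why the assertion fails: on Windows, joining a bare drive letter such as 'C:' with a folder name gives a path relative to the current directory on that drive, not a path from the drive root. The sketch below reproduces this with the standard-library ntpath module (which is what os.path is on Windows), so it behaves the same on any operating system. The helper name build_windows_path is hypothetical, and this only addresses the failed assertion; it does not explain the original OSError from the camera library.

```python
import ntpath  # the Windows flavour of os.path, importable on every platform


def build_windows_path(drive, parts):
    """Join folder names onto a drive letter, anchoring at the drive root.

    `drive` is a bare drive letter such as "C:" (hypothetical helper).
    """
    # Appending the separator turns "C:" into "C:\", the root of the drive.
    return ntpath.join(drive + ntpath.sep, *parts)


# Joining onto a bare drive letter inserts no separator after the colon.
drive_relative = ntpath.join("C:", "Users")
print(drive_relative)

# Joining onto the drive root gives the absolute path that was intended.
absolute = build_windows_path("C:", ["Users", "masiy", "Documents"])
print(absolute)
```

With an absolute path built this way, os.path.exists() checks the folder you meant instead of a folder relative to the current working directory.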
Related
I am processing a set of DICOM files, some of which have image information and some of which don't. If a file has image information, the following code works fine.
file_reader = sitk.ImageFileReader()
file_reader.SetFileName(fileName)
file_reader.ReadImageInformation()
However, if the file does not have image information, I get the following error.
Traceback (most recent call last):
File "<ipython-input-61-d187aed107ed>", line 5, in <module>
file_reader.ReadImageInformation()
File "/home/peter/anaconda3/lib/python3.7/site-packages/SimpleITK/SimpleITK.py", line 8673, in ReadImageInformation
return _SimpleITK.ImageFileReader_ReadImageInformation(self)
RuntimeError: Exception thrown in SimpleITK ImageFileReader_ReadImageInformation: /tmp/SimpleITK/Code/IO/src/sitkImageReaderBase.cxx:107:
sitk::ERROR: Unable to determine ImageIO reader for "/path/115.dcm"
If the DICOM file has no image information, I would like to just ignore the file rather than calling ReadImageInformation(). Is there a way to check whether ReadImageInformation() will work before it is called? I tried the following, and none of them behaves differently between files where ReadImageInformation() works and files where it does not.
file_reader.GetImageIO()
file_reader.GetMetaDataKeys() # Crashes
file_reader.GetDimension()
I would just put an exception handler around it to catch the error. So it'd look something like this:
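import SimpleITK as sitk

file_reader = sitk.ImageFileReader()
file_reader.SetFileName(fileName)
try:
    file_reader.ReadImageInformation()
except RuntimeError:
    # The traceback above shows SimpleITK raising RuntimeError for such files
    print(fileName, "has no image information")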
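If you need a yes/no check to filter a whole folder, the same try/except can be wrapped in a small predicate. This is only a sketch: the function name has_image_information and the make_reader parameter are hypothetical, and make_reader exists purely so the helper can be exercised without SimpleITK installed. By default it uses sitk.ImageFileReader exactly as above.

```python
def has_image_information(file_name, make_reader=None):
    """Return True if the reader can parse the image header of file_name.

    make_reader is a hypothetical hook: any callable returning an object
    with SetFileName() and ReadImageInformation() methods.
    """
    if make_reader is None:
        # Imported lazily so the helper can be tested without SimpleITK.
        import SimpleITK as sitk
        make_reader = sitk.ImageFileReader

    reader = make_reader()
    reader.SetFileName(file_name)
    try:
        reader.ReadImageInformation()
    except RuntimeError:
        # Same exception type as in the traceback in the question.
        return False
    return True


# Usage sketch (dicom_files is a hypothetical list of paths):
# readable = [name for name in dicom_files if has_image_information(name)]
```

Catching RuntimeError rather than using a bare except keeps unrelated problems, such as a KeyboardInterrupt or a typo in your own code, from being silently swallowed.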
I want to:
Download audio files from YouTube,
which I have done with pytube; however, the result is in mp4 format even though I set only_audio to True.
Then turn the audio files into numpy arrays.
There are libraries that work on mp3, for example pydub, but not on mp4. When I tried moviepy, it failed because there is no video and therefore no frame rate. I don't want to download the video because it would take much longer.
Note that I want the audio, not the video.
How can I download audio from YouTube and turn it into numpy arrays?
Thanks for any help :)
EDIT
Thanks to the comments, I've managed to turn the mp4 into mp3 using ffmpeg.
However, when I tried to turn it into numpy arrays using the code from this question, which looks like this:
import numpy as np
import pydub

def read(f, normalized=False):
    """MP3 to numpy array"""
    a = pydub.AudioSegment.from_mp3(f)
    y = np.array(a.get_array_of_samples())
    if a.channels == 2:
        y = y.reshape((-1, 2))
    if normalized:
        return a.frame_rate, np.float32(y) / 2**15
    else:
        return a.frame_rate, y
it raised this error:
Traceback (most recent call last):
File "C:\Users\myname\Google Drive\Python\Projects\Music\Downloads\Music Read.py", line 63, in <module>
print(read(x,True))
......
File "C:\Users\myname\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 1017, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
This is weird because, as demonstrated below, the path should work perfectly:
for f in os.listdir(path):
    if f.endswith(".mp3"):
        print(f)
        x = 'C:/Users/myname/Google Drive/Python/Projects/Music/Downloads/{}'.format(f)
        print(os.path.exists(x))
        print(open(x))
        print(read(x, True))
outputs:
test-Copy.mp3
True
c:/users/myname/google drive/python/projects/music/downloads/test-copy.mp3
<_io.TextIOWrapper name='c:/users/myname/google drive/python/projects/music/downloads/test-copy.mp3' mode='r' encoding='cp1252'>
Also, when I input a file path that actually doesn't exist, it outputs a different error:
......
File "C:\Users\myname\AppData\Local\Programs\Python\Python36\lib\site-packages\pydub\utils.py", line 57, in _fd_or_path_or_tempfile
fd = open(fd, mode=mode)
FileNotFoundError: [Errno 2] No such file or directory: 'c:/users/myname/google drive/python/projects/music/downloads/hi'
How can I use the code from this question to turn the mp3 into numpy arrays? If I can't, how else can I do it?
btw I'm running on Win10 with python 3.6
I really hope I have made myself clear enough, and again thanks in advance for any bits of advice :)
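A note on the first traceback: it ends inside subprocess's _execute_child, not inside open(), which suggests that the file Windows could not find is the program being launched and not the mp3. pydub hands mp3 decoding to an external converter (ffmpeg or avconv). A minimal sketch for checking whether one is reachable on PATH, using only the standard library; the helper name find_converter is hypothetical:

```python
import shutil


def find_converter(candidates=("ffmpeg", "avconv")):
    """Return the full path of the first converter found on PATH, or None."""
    for name in candidates:
        location = shutil.which(name)
        if location is not None:
            return location
    return None


converter = find_converter()
if converter is None:
    print("No ffmpeg/avconv found on PATH; pydub cannot decode mp3 without one.")
else:
    print("Converter found at", converter)
    # pydub lets you point at an explicit executable via this attribute:
    # pydub.AudioSegment.converter = converter
```

If nothing is found, either add the folder containing ffmpeg.exe to PATH or set the converter path explicitly before calling from_mp3().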
It feels weird answering my own question, but:
I got around the pydub issue by using this code:
import numpy as np
from subprocess import Popen, PIPE

def decode(fname):
    # If you are on Windows, use the full path to ffmpeg.exe
    cmd = ["C:/Users/allen/Google Drive/Python/Tools/ffmpeg-20190604-d3f236b-win64-static/bin/ffmpeg.exe", "-i", fname, "-f", "wav", "-"]
    # If you are on Windows, add the argument creationflags=0x8000000 to prevent another console window from popping up
    p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    data = p.communicate()[0]
    # Samples start after the b"data" chunk ID (4 bytes) and its size field (4 bytes)
    return np.frombuffer(data[data.find(b"data") + 8:], np.int16)
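To see what the header-skipping step does without running ffmpeg, here is a small self-contained sketch that builds a WAV file in memory with the standard-library wave module and recovers the samples by searching for the data chunk. It assumes 16-bit little-endian PCM, and the function name wav_bytes_to_samples is hypothetical. In a WAV file the sample bytes begin 8 bytes after the start of the "data" marker: 4 bytes for the chunk ID and 4 for the chunk size.

```python
import io
import wave

import numpy as np


def wav_bytes_to_samples(data):
    """Return the 16-bit samples that follow the WAV header in `data`."""
    marker = data.find(b"data")
    if marker < 0:
        raise ValueError("no data chunk found")
    # Skip the 4-byte chunk ID and the 4-byte chunk size.
    start = marker + 8
    return np.frombuffer(data[start:], dtype="<i2")


# Build a tiny mono WAV file in memory to exercise the parser.
expected = np.array([0, 1000, -1000, 32767, -32768], dtype="<i2")
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)      # 2 bytes per sample = 16-bit audio
    wav_file.setframerate(8000)
    wav_file.writeframes(expected.tobytes())

samples = wav_bytes_to_samples(buffer.getvalue())
print(samples)
```

Skipping only 4 bytes instead of 8 would prepend the chunk-size field to the audio as two spurious samples.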
I'm trying to use a psm of 0 with pytesseract, but I'm getting an error. My code is:
import pytesseract
from PIL import Image
img = Image.open('pathToImage')
pytesseract.image_to_string(img, config='-psm 0')
The error that comes up is
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 126, in image_to_string
f = open(output_file_name, 'rb')
IOError: [Errno 2] No such file or directory:
'/var/folders/m8/pkg0ppx11m19hwn71cft06jw0000gp/T/tess_uIaw2D.txt'
When I go into '/var/folders/m8/pkg0ppx11m19hwn71cft06jw0000gp/T', there's a file called tess_uIaw2D.osd that seems to contain the output information I was looking for. It seems like tesseract is saving a file as .osd, then looking for that file but with a .txt extension. When I run tesseract through the command line with --psm 0, it saves the output file as .osd instead of .txt.
Is it correct that pytesseract's image_to_string() works by saving an output file somewhere and then automatically reading that output file? And is there any way to either set tesseract to save the file as .txt, or to set it to look for a .osd file? I'm having no issues just running the image_to_string() function when I don't set the psm.
You have a couple of questions here:
PSM error
In your question you mention that you are running "--psm 0" on the command line. However, in your code snippet you have "-psm 0".
Using the double dash, config= "--psm 0", will fix that issue.
If you read the tesseract command line documentation, you can specify where to output the text read from the image. I suggest you start there.
Is it correct that pytesseract's image_to_string() works by saving an output file somewhere and then automatically reading that output file?
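Internally it does: the traceback in your question shows pytesseract opening a temporary output file written by tesseract. From the caller's side, though, you never need to handle that file.
pytesseract.image_to_string() by default returns the string found in the image. This is controlled by the parameter output_type=Output.STRING, as you can see when you look at the function image_to_string.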
The other return options include (1) Output.BYTES and (2) Output.DICT
I usually have something like text = pytesseract.image_to_string(img)
I then write that text to a log file
Here is an example:
import datetime
import io
import pytesseract
import cv2
img = cv2.imread("pathToImage")
text = pytesseract.image_to_string(img, config="--psm 0")
ocr_log = "C:/foo/bar/output.txt"
timestamp_fmt = "%Y-%m-%d_%H-%M-%S-%f"
# ...
# DO SOME OTHER STUFF BEFORE WRITING TO LOG FILE
# ...
with io.open(ocr_log, "a") as ocr_file:
    timestamp = datetime.datetime.now().strftime(timestamp_fmt)
    ocr_file.write(f"{timestamp}:\n====OCR-START===\n")
    ocr_file.write(text)
    ocr_file.write("\n====OCR-END====\n")
Basically, I want to convert speech to text, so I am trying to use the Google voice recognition API for Python.
This is the code I'm trying to run:
from pygsr import Pygsr
speech = Pygsr()
speech.record(3) # duration in seconds (3)
phrase, complete_response = speech.speech_to_text('es_ES')
print phrase # This is the required output
I've installed all the modules correctly, so probably nothing is wrong with the modules, but I am getting the following error:
Traceback (most recent call last):
File "C:/Python/google_voice.py", line 4, in <module>
phrase, complete_response = speech.speech_to_text('es_ES') # select the language
File "C:/Python\pygsr\__init__.py", line 49, in speech_to_text
audio = open(file_upload, "rb").read()
IOError: [Errno 2] No such file or directory: 'audio.flac'
Can somebody please tell me what I am missing?
Or please suggest a good speech-to-text conversion method for Python.
You are missing the sox tool, which converts the recorded wav to flac. You can see this in the following line of the pygsr sources: system("sox %s -t wav -r 48000 -t flac %s.flac" % (self.file, self.file)). Make sure that sox is installed, that it works for you, and that it can create flac files.
I have the following problem when opening a file:
Using a PyQt QFileDialog, I get the path of a file from the user, and I would like to read that file.
def read_file(self):
    self.t_file = (QFileDialog.getOpenFileNames(self, 'Select File', '','*.txt'))
Unfortunately I cannot open a file if the path has numbers in it:
Ex:
'E:\test\02_info\test.txt'
I tried
f1 = open(self.t_file,'r')
Could anyone help me to read files from such a path format?
Thank you in advance.
EDIT:
I get the following error:
Traceback (most recent call last):
File "<pyshell#27>", line 1, in <module>
f1 = open(self.t_file,'r')
IOError: [Errno 22] invalid mode ('r') or filename: 'E:\test\x02_info\test.txt'
The problem is caused by your use of getOpenFileNames (which returns a list of files) instead of getOpenFileName (which returns a single file). You also seem to have converted the return value wrongly, but since you haven't shown the relevant code, I will just show you how it should be done (assuming you are using python2):
def read_file(self):
    filename = QFileDialog.getOpenFileName(self, 'Select File', '','*.txt')
    # convert to a python string
    self.t_file = unicode(filename)
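Separately, the \x02 in the error message is worth understanding. It is what you get when a Windows path is typed as an ordinary string literal: Python reads \02 as an octal escape (the character with code 2) and \t as a tab. A minimal sketch of the difference, assuming the path was entered as a literal somewhere, for example while testing in the shell (the variable names are illustrative):

```python
# Ordinary literal: "\t" becomes a tab and "\02" becomes chr(2).
plain = 'E:\test\02_info\test.txt'

# Raw literal: backslashes are kept exactly as typed.
raw = r'E:\test\02_info\test.txt'

print(repr(plain))
print(repr(raw))

# Forward slashes are also accepted by Windows and need no escaping.
forward = 'E:/test/02_info/test.txt'
print(forward)
```

A path returned by QFileDialog at runtime is not affected by this, since escape processing only applies to literals written in source code.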