AIFF-C file cannot be read with aifc module in python - python

I am trying to read a compressed .aiff file stored on my local directory. I get this;
>>>import aifc
>>>s = aifc.open('/Users/machinename/Desktop/folder/AudioTrack.aiff','r')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/aifc.py", line 942, in open
return Aifc_read(f)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/aifc.py", line 347, in __init__
self.initfp(f)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/aifc.py", line 317, in initfp
self._read_comm_chunk(chunk)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/aifc.py", line 497, in _read_comm_chunk
raise Error, 'cannot read compressed AIFF-C files'
aifc.Error: cannot read compressed AIFF-C files
>>>
I believe there must be a workaround for this. Here you can see aifc is supports aiff-c files as well.
A simple question, yet I could not find a solution on the web.

Old post, but... There seem to be two possible issues with this.
1 - You might need to pip install cl. If AIFC fails to import the cl module, it'll report the error you mention.
2 - There seems to be a bug in the aifc.py source (at least the one I found) where it expects uncompressed files to specify compression as 'NONE'. However some files seem to report 'raw ' (notice the extra space at the end) and AIFC does not recognize this as a compression format.

you might find that scikits.audiolab (requires mega-nerd.com/libsndfile/ is installed) does what you need. For example, I recently needed to get the duration of an .aif file (in seconds):
import scikits.audiolab
aiff_file = scikits.audiolab.Sndfile('best_song_ever.aif')
print aiff_file.nframes / float(aiff_file.samplerate)
You can do a bunch of other cool stuff too (Full API docs).
I hope that helps!

Related

Errno 22 Invalid argument - Zipfile Is Skipped

I am working on a project in Python in which I am parsing data from a zipped folder containing log files. The code works fine for most zips, but occasionally this exception is thrown:
[Errno 22] Invalid argument
As a result, the entire file is skipped, thus excluding the data in the desired log files from the results. When I try to extract the zipped file using the default Windows utility, I am met with this error:
Zip error
However, when I try to extract the file with 7zip, it does so successfully, save 2 errors:
1 <path> Unexpected End of Data
2 Data error: x.csv
x.csv is totally unrelated to the log I am trying to parse, and as such, I need to write code that is resilient to the point where if an unrelated file is corrupted, it will still be able to parse the other logs that are not.
At the moment, I am using the zipfile module to extract the files into memory. Is there a robust way to do this without the entire file being skipped?
Update 1: I believe the error I am running into is that the zipfile is missing a footer. I realized this when looking at it in a hex editor. I do not really have any idea how to safely edit the actual file using Python.
Here is the code that I am using to extract zips into memory:
for zip in os.listdir(directory):
try:
if zip.lower().endswith('.zip'):
if os.path.isfile(directory + "\\" + zip):
logs = zipfile.ZipFile(directory + "\\" + zip)
for log in logs.namelist():
if log.endswith('log.txt'):
data = logs.read(log)
Edit 2: Traceback for the error:
Traceback (most recent call last):
File "c:/Users/xxx/Desktop/Python Porjects/PE/logParse.py", line 28, in parse
logs = zipfile.ZipFile(directory + "\\" + zip)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1222, in __init__
self._RealGetContents()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1289, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
The stacktrace seems to show that it's not your code which badly manage to read the file but the Python module managing zip that is raising an error.
It looks like that python zip manager is more strict than other program (see this bug where a user report a difference between python behaviour and other program as GNOME Archive Manager).
Maybe, there is a bug report to do.

Receiving "pytesseract not in your path" error on the exact same code that used to work fine

I wrote this code several months ago and wanted to pass through it to clean it up and add some new features. Its a simple tool I used to take a picture of my screen and get write-able words from it. I am on a new computer from the one I originally wrote the code on; however, I went through and installed every module via the pycharm module manager. However, I keep getting this error when I run the code even though I have located the package in my path. Any help would be greatly appreciated.
I've looked up several different variations of my problem but they all seem to have different causes and fixes, of course, none of which work for me.
if c ==2:
img = ImageGrab.grab(bbox=(x1-5, y1-5, x2+5,y2+5)) # bbox specifies region (bbox= x,y,width,height)
img_np = np.array(img)
frame = cv2.cvtColor(img_np, cv2.COLOR_BGR2GRAY)
c = 0
x = 0
string = str(pytesseract.image_to_string(frame)).lower()
print(string)
This is the only section of the code that references pytesseract other than of course "import pytesseract". Hopefully I can get this code up and running again and the pytesseract module in general as it is integral to many of my scripts. Thanks in advance for your help.
File "C:\Users\dante\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 184, in run_tesseract
proc = subprocess.Popen(cmd_args, **subprocess_args())
File "C:\Users\dante\Anaconda3\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Users\dante\Anaconda3\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/dante/Desktop/DPC/processes/screen_to_text.py", line 29, in <module>
string = str(pytesseract.image_to_string(frame)).lower()
File "C:\Users\dante\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 309, in image_to_string
}[output_type]()
File "C:\Users\dante\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 308, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "C:\Users\dante\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 218, in run_and_get_output
run_tesseract(**kwargs)
File "C:\Users\dante\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 186, in run_tesseract
raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path```
The problem was in my lack of understanding of the module. pytesseract is not an OCR, it is simply a translator that allows users to use googles OCR. This means, in order to use this package, a user must have google's OCR installed ( I downloaded mine from here https://sourceforge.net/projects/tesseract-ocr-alt/files/).
This does NOT; however, solve the full problem. The pytesseract package needs to know where the actual OCR program is located. On line 35 of the pytesseract.py script there is a line that tells pytesseract where to find the actual google OCR tesseract program
tesseract_cmd = 'tesseract'
If you are on windows and you haven't manually added tesseract to your path (if you don't know what that means just follow the next steps) then you need to replace that line with the actual location of the google OCR on your computer. Replacing that line with
tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
should allow you to run pytesseract assuming you have correctly installed everything. Took me quite a bit longer than i would care to admit to find the blatantly obvious solution to this issue, but hopefully people with this problem in the future resolve it faster than I did! Thanks and have a good day.

Understanding Parsing Error when reading model file into PySD

I am receiving the following error message when I try to read a Vensim model file (.mdl) using Python's PySD package.
My code is:
import pysd
import os
os.chdir('path/to/model_file')
model = pysd.read_vensim('my_model.mdl')
The Error I receive is:
Traceback (most recent call last):
Python Shell, prompt 13, line 1
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pysd/pysd.py", line 53, in read_vensim
py_model_file = translate_vensim(mdl_file)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pysd/vensim2py.py", line 673, in translate_vensim
entry.update(get_equation_components(entry['eqn']))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pysd/vensim2py.py", line 251, in get_equation_components
tree = parser.parse(equation_str)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/parsimonious/grammar.py", line 123, in parse
return self.default_rule.parse(text, pos=pos)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/parsimonious/expressions.py", line 110, in parse
node = self.match(text, pos=pos)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/parsimonious/expressions.py", line 127, in match
raise error
parsimonious.exceptions.ParseError: Rule 'subscriptlist' didn't match at '' (line 1, column 21).
I have searched for this particular error and I cannot find much information on the failed matching rule for 'subscriptlist'.
I appreciate any insight. Thank you.
Good news is that there is nothing wrong with your code. =) (Although you can also just include the path to the file in the .read_vensim call, if you don't want to make the dir change).
That being the case, there are a few possibilities that would cause this issue. One is if the model file is created with a sufficiently old version of Vensim, the syntax may be different from what the current parser is designed for. One way to get around this is to update Vensim and reload the model file there - Vensim will update to the current syntax.
If you are already using a recent version of Vensim (the parser was developed using syntax of Vensim 6.3E) then the parsing error may be due to a feature that isn't yet included. There are still some outstanding issues with subscripts, which you can read about here and here).
If you aren't using subscripts, you may have found a bug in the parser. If so, best course is to create a report in the github issue tracker for the project. The stack trace you posted says that the error is happening in the first line of the file, and that the error has to do with how the right hand side of the equation is being parsed. You might include the first few lines in your bug report to help me recreate the issue. I'll add a case to our growing test suite and then we can make sure it isn't a problem going forwards.

scipy.io.wavfile gives "WavFileWarning: chunk not understood" error

I'm trying to read a .wav file using scipy. I do this:
from scipy.io import wavfile
filename = "myWavFile.wav"
print "Processing " + filename
samples = wavfile.read(filename)
And I get this ugly error:
/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/io/wavfile.py:121: WavFileWarning: chunk not understood
warnings.warn("chunk not understood", WavFileWarning)
Traceback (most recent call last):
File "fingerFooler.py", line 15, in <module>
samples = wavfile.read(filename)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/io/wavfile.py", line 127, in read
size = struct.unpack(fmt, data)[0]
struct.error: unpack requires a string argument of length 4
I'm using Python 2.6.6, numpy 1.6.2, and scipy 0.11.0
Here's a wav file that causes the problem.
Any thoughts? What's wrong here?
The files is no longer available (not surprising after 9 months!), but for future reference the most likely cause is that it had extra metadata which scipy can't parse.
In my case, it was default metadata (copyright, track name etc) which was added by Audacity- you can open the file in Audacity and use File ... Open Metadata Editor to see it. Then use the 'Clear' button to strip it, and try again.
The current version of scipy supports the following RIFF chunks - 'fmt', 'fact', 'data' and 'LIST'. The Wikipedia page on RIFF has a bit more detail on how a WAV file is structured, for example yours might have included an unsupported-but-popular INFO chunk
I don't know anything about the WAV file format, but digging into the scipy code it looks like scipy isn't familiar with the chunk that's present towards the end of the file (chunk ID is bext, 2753632 bytes in, if that helps). That chunk is declared as 603 bytes long so it reads past it expecting another chunk ID 603 bytes later -- it doesn't find it (runs out of file) and falls over.
Have you tried it on other WAV files successfully? How was this one generated?
The easiest solution to this problem is to convert the wav file into other wav file using SoX.
$ sox wavfile.wav wavfile2.wav
Works for me!
I had the same error and could successfully convert to what it can read.
My original file was from Logic Pro. Then I used audacity to read the file.
I also got this error because of (presumably) metadata introduced by Audacity. I exported my wav file from another DAW (Ableton Live), and scipy.io.wavfile loaded it without error.
Solved this problem when exporting from Reaper:
simply deselect "Write BWF ('bext') chunk" in the Render to File window.

xlrd: struct.error: unpack requires a string argument of length 512

I was using xlrd 0.6.1 and 0.7.1 to open my xls files the both complained:
Traceback (most recent call last):
File "../../xls2csv.py", line 53, in <module>
book = xlrd.open_workbook(args[0])
File "build/bdist.linux-i686/egg/xlrd/__init__.py", line 366, in open_workbook
File "build/bdist.linux-i686/egg/xlrd/__init__.py", line 760, in __init__
File "build/bdist.linux-i686/egg/xlrd/compdoc.py", line 149, in __init__
struct.error: unpack requires a string argument of length 512
I googled around and found this advice helped:
open the xls file with open office and save to a new file. the problem will go away.
Just in case someone else got the same problem, I post it here.
If you have an xls file that opens OK in Excel, OpenOffice Calc, or Gnumeric, but isn't opened by xlrd, then you should e-mail the xlrd author (sjmachin at lexicon dot net) with the details and a copy of the file, so that xlrd can be improved; this will benefit you and all other xlrd users.
Update after examining the source:
The stack trace that you supplied was from the antique 0.6.1 version; why on earth are you using that?
According to my reading of the code, xlrd should have emitted a message like this: `WARNING * file size (SIZE) not 512 + multiple of sector size (512)' ... did it?
This is already out of spec. Often the cause is that the data payload (the Workbook stream) is not a multiple of 512 bytes, it is the last structure written, and the writer has not bothered to pad it out. In that case it is safe to continue, as the missing padding will not be accessed.
However, in your case where xlrd falls off the end of the file it is following a chain of index sectors (MS calls it the "double indirect FAT") that is used when the file size is bigger than about 7 MB. The last 4 bytes in each of those sectors contains the sector number of the next sector in the chain (or a special end-of-chain value). Consequently if one of those sectors is shorter than 512 bytes, the file is corrupt. Recovering from that without even a warning message is NOT something that I'd call good behaviour, and NOT something I'd be advocating SO users to rely on.
Please contact me via e-mail to discuss how I can get a copy of this file (under a non-disclosure agreement, if necessary).
I came across this issue too when running xlrd on a procedurally created XLS from a provider.
My solution was to run libreoffice to convert the file, afterwhich, I could use xlrd successfully on the file!
libreoffice --headless --convert-to xls --outdir converted original/not_working.xls
Which I did in Python3 by:
from subprocess import call
call(["libreoffice", "--headless",
"--convert-to", "xls",
"--outdir", "converted" , "original/not_working.xls"])
Sources:
https://unix.stackexchange.com/questions/354043/convert-xlsx-to-xls-in-linux-shell-script#354054
https://www.computerhope.com/forum/index.php?topic=160219.0

Categories

Resources