How to get label file from wav file with python? - python

Hi,
How can I get a .label file from a .wav file with Python? Could someone please share a link to an example?

Since WAV files are a form of RIFF file, they can contain an optional INFO chunk. The information in that chunk is tagged. However, there is no "label" tag, and because the chunk is optional, many WAV files don't contain it.
Note that the wave module from Python's standard library does not read or write the INFO chunk!
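If you want to check whether a given file carries an INFO chunk at all, the RIFF container is simple enough to walk by hand. The following is only a minimal sketch, assuming a well-formed file with little-endian chunk sizes; read_info_tags is a hypothetical helper name, not part of any library, and decoding the tag text as Latin-1 is an assumption.

```python
import struct


def read_info_tags(stream):
    """Return {tag: text} from the LIST/INFO chunk of a RIFF/WAVE stream.

    Returns an empty dict when the file has no INFO chunk.
    """
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    tags = {}
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # end of file
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        data = stream.read(size)
        if size % 2:
            stream.read(1)  # chunks are padded to an even length

        if chunk_id == b"LIST" and data[:4] == b"INFO":
            pos = 4
            while pos + 8 <= len(data):
                tag, tag_size = struct.unpack("<4sI", data[pos:pos + 8])
                raw = data[pos + 8:pos + 8 + tag_size]
                # Text encoding is an assumption; values are NUL-terminated.
                tags[tag.decode("ascii")] = raw.rstrip(b"\x00").decode("latin-1")
                pos += 8 + tag_size + (tag_size % 2)
    return tags
```

Open the file in binary mode, e.g. `with open("some.wav", "rb") as f: print(read_info_tags(f))`. An empty dict means the file has no INFO chunk, which is the common case.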

Related

How to read .dss (audio file) metadata with python

The .dss (or .ds2) file format is used for dictation devices e.g. by Philips or Olympus and stores metadata on the dictation file in addition to the audio information.
Is there a way to somehow readout this metadata by using a simple python routine?
One idea was to read out the file in binary format, yet I could not do it by myself.
Help anyone :-) ?
Sample file (short dictation with metadata) available here: https://www.dropbox.com/s/g5uk22prkqht372/TH10094.DSS?dl=0
The sample file contains the metadata values "Sofort", "Heartbeat" and "WGB", which are what I need to find. I can't locate them, though.
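One way to try the binary-read idea is to scan the raw bytes for runs of printable characters, much like the UNIX strings tool does. This is only a sketch under a strong assumption: that the metadata is stored as plain ASCII. The DSS layout is not described here, so if the values are stored in another encoding (UTF-16, for instance) this will not find them. printable_runs is a hypothetical helper name.

```python
import re


def printable_runs(data, min_length=3):
    """Return every run of at least min_length printable ASCII bytes in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Usage would look like `with open("TH10094.DSS", "rb") as f: print(printable_runs(f.read(2048)))`, reading only the start of the file on the guess that metadata sits in a header.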

Cannot open WAV file in python using scipy.io.wavfile

I'm trying to work with WAV files using scipy.io.wavfile, but the file I want to read has NIST headers inside it. I tried deleting the headers (that was dumb), I tried other libraries (wave) and custom functions found online, but it still doesn't work. I get "Not a WAV file."
The .wav files are from mocha-timit, a speech training corpus.
Can someone help me out?
Thanks.
Some things you could try:
(1) Use scikits.audiolab as in this question
(2) Convert the wav format from NIST format to the standard RIFF format with a tool like sndfile-convert from 'libsndfile' (You'll need to change the original file endings to .nist).
I got (2) to work on my own system and could then read the files with scipy.io.wavfile.read.
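Before converting anything, it can help to confirm which container a file really uses, since the .wav extension alone says nothing. A minimal sketch, assuming the usual magic bytes: RIFF/WAVE files start with "RIFF" followed by "WAVE" at offset 8, and NIST SPHERE files start with "NIST_1A". sniff_audio_container is a hypothetical helper name.

```python
def sniff_audio_container(header):
    """Classify the first bytes of an audio file by its magic number."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "riff-wave"
    if header[:7] == b"NIST_1A":
        return "nist-sphere"
    return "unknown"
```

For example, `with open(path, "rb") as f: print(sniff_audio_container(f.read(12)))`. A result of "nist-sphere" would explain the "Not a WAV file." error, because scipy.io.wavfile expects the RIFF layout.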

How to determine the default executable for a specific file format?

Is there a way to do this, just by relying on the file's extension?
For example: os.system(filepath) opens the given filepath using the default application, but what is the executable's filepath?
No. You can write assumptions into your program, which is what most developers do to handle these formats. The extension does not determine the format: a file can hold data in any format regardless of what it is named.
Take an XML file, for example. If you put that XML data into a .txt file, or simply rename the .xml file to .txt, reading the file and parsing its contents will still yield XML.
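The XML point can be shown in a few lines. This is just an illustrative sketch using the standard library; the file name note.txt and its contents are made up for the example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def root_tag_of(path):
    """Parse the file at path as XML, whatever its extension, and return the root tag."""
    return ET.parse(path).getroot().tag


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")  # a .txt name holding XML data
    with open(path, "w", encoding="utf-8") as f:
        f.write("<note><to>reader</to></note>")
    print(root_tag_of(path))  # prints: note
```

The parser never looks at the extension, only at the bytes in the file.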

Fingerprinting the file type of a string of data in python

I'd like to be able to read in the first couple of kilobytes of a file of unknown type and see if it matches any known file type (e.g. MP3, JPEG, etc.). I was thinking of trying to load metadata from the file with libraries like PIL, sndhdr, py264, etc. and seeing whether any of them picked up a valid format, but I thought this must be a problem someone has solved before.
Is there one library or a gist showing the usage of multiple libraries which would do this?
Use python-magic to do the fingerprinting.
The library can determine file type from bytes data only:
import magic
magic.from_buffer(start_data_from_something)
The library provides access to the libmagic file type identification library, which also drives the UNIX file command.
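If installing libmagic is not an option, a handful of formats can be recognized from their leading bytes with the standard library alone. The sketch below is a rough stand-in and not how python-magic works internally; the signature table is deliberately tiny, and sniff is a hypothetical helper name.

```python
# A few well-known leading byte signatures; far from exhaustive.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"ID3", "mp3 (ID3v2 tag)"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),
]


def sniff(data):
    """Return a format name if data starts with a known signature, else None."""
    for prefix, name in SIGNATURES:
        if data.startswith(prefix):
            return name
    return None
```

Note that an MP3 without an ID3v2 tag will not match here, which is one reason a maintained signature database such as libmagic's is the better choice for real use.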

python adding gibberish when reading from a .rtf file?

I have a .rtf file that contains nothing but an integer, say 15. I wish to read this integer in through Python and manipulate it in some way. However, it seems that Python is reading in much of the metadata associated with .rtf files. Why is that? How can I avoid it? For example, trying to read in this file, I get:
{\rtf1\ansi\ansicpg1252\cocoartf949\cocoasubrtf460
{\fonttbl\f0\fswiss\fcharset0
Helvetica;}
{\colortbl;\red255\green255\blue255;}
\margl720\margr720\margb720\margt720\vieww9000\viewh8400\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\ql\qnatural\pardirnatural
That's the nature of .RTF (i.e. Rich Text Format) files: they include extra data that defines how the text is laid out and formatted.
Storing data in such files is not recommended, precisely because of the difficulties you noted. If you went to the effort of parsing this file to "recover" your one numeric value, you would also expose your application to the risk that a later version of the RTF format breaks the parsing logic and yields wrong numeric data.
Why not store this information in a true text file? This could be a flat text file or, preferably, an XML, YAML or JSON file, which gives you some "forward" compatibility if your application later needs extra parameters in the file.
If this file is a given, however, there probably exist Python libraries to read and write to it. Check the Python Package Index (PyPI) for the RTF keyword.
That's exactly what the RTF file contains, so Python (in the absence of further instruction) is giving you what the file contains.
You may be looking for a library to read the contents of RTF files, such as pyrtf-ng.
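For a file this simple, a few regular expressions may be enough to get at the number, though a real RTF library is the safer route. The sketch below is deliberately naive: it drops nested groups (font and color tables), control words and control symbols, and keeps whatever is left. The last line of SAMPLE is an assumption about how the file ends, since the output quoted in the question stops before the text body; naive_rtf_to_text is a hypothetical helper name.

```python
import re

# The header is taken from the question; the final line holding "15" is assumed.
SAMPLE = r"""
{\rtf1\ansi\ansicpg1252\cocoartf949\cocoasubrtf460
{\fonttbl\f0\fswiss\fcharset0
Helvetica;}
{\colortbl;\red255\green255\blue255;}
\margl720\margr720\margb720\margt720\vieww9000\viewh8400\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\ql\qnatural\pardirnatural

\f0\fs24 \cf0 15}
"""


def naive_rtf_to_text(rtf):
    """Strip RTF markup from a very simple document and return the plain text."""
    body = rtf.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]  # drop the outermost group braces
    # Repeatedly remove innermost groups such as the font and color tables.
    group = re.compile(r"\{[^{}]*\}")
    while group.search(body):
        body = group.sub("", body)
    # Remove control words (with optional numeric parameter and one trailing space).
    body = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", body)
    # Remove control symbols such as \~ or \*.
    body = re.sub(r"\\[^a-zA-Z]", "", body)
    return body.strip()


print(int(naive_rtf_to_text(SAMPLE)))  # prints: 15
```

This will break on documents with escaped characters, embedded objects or text inside nested groups, so treat it as a stopgap for this one-number case only.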
