How to programmatically combine two aac files into one?

I'm looking for a cat for aac music files (the stuff iTunes uses).
Use case: My father-in-law will not touch computers except for audiobooks he downloads to his iPod. I have taught him some iTunes (Windows) basics, but his library is a mess. It turns out that iTunes is optimized for listening to podcasts and random songs from your library, not for audiobooks.
I would like to write a script (preferably python, but comfortable with other stuff too) to import his audiobook cds in a sane fashion, combining the tracks of each cd into a bookmarkable aac file (.m4b?) and then adding that to iTunes so it shows up in the audiobooks section.
I have figured out how to talk to iTunes (there is a COM interface in Windows, look for the iTunes SDK). Using that interface, I can use iTunes to rip the CD to aac format. It's the actual concatenation of the aac files I'm having trouble with. Can't find the right stuff on the net...

I created a freeware program called "Chapter and Verse" to concatenate m4a (AAC) files into a single m4b audiobook file with chapter marks and metadata.
If you have already ripped the CDs to AAC using iTunes (which you say you have), then the rest is easy with my software. I wrote it for this exact reason and scenario. You can download it from www.lodensoftware.com
After trying to work with SlideShow Assembler, the QT SDK and a bunch of other command line tools, I ended up building my own application based on the publicly available MP4v2 library. The concatenating of files and adding of chapters is done using the MP4v2 library.
There are quite a few nuances in building an audiobook properly formatted for the iPod. The information is hard to find. Working with Apple documentation and open libraries has its own challenges as well.
Best of Luck.

Not programming related (well, kinda.)
iTunes already has functionality to rip as a single track (e.g. an audiobook.) Check this out: http://www.ehow.com/how_2108906_merge-cd-single-track-itunes.html
That fixes your immediate problem, but I guess people can keep discussing how to do it programmatically.

The most powerful Python audio manipulation module out there seems to be Python Audio Tools. The download comes with CLI tools that would probably do everything you'd want to do, even ripping, so you can even get by with shell scripting the whole thing. The module itself is also pretty powerful and has a handy set of functions to manipulate audio files. If you want to stick with writing everything in python, you can possibly learn enough to do what you want to do after studying their CLI source code. Specifically they have a tool that just does audio file cat in any codec. (They do depend on FAAC/FAAD2 for AAC support, but that'd be true for every library you'll find)

I haven't seen an aac codec library for python, but you could use wav files as an intermediary format.
You can pull the tracks off the CD as wav files, and then use the wave module to concatenate them into one large file, which could then be converted by iTunes to aac. This may increase your processing time considerably because of the size of the data, but it would be fairly easy, and you don't need any external libraries.
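To make the wave-module approach concrete, here is a minimal sketch. It assumes the tracks have already been ripped to WAV files that share the same channel count, sample width and sample rate; the function name concatenate_wavs and the file names are made up for illustration.

```python
import wave


def concatenate_wavs(input_paths, output_path):
    """Append the audio frames of several WAV files into one WAV file."""
    if not input_paths:
        raise ValueError("no input files given")

    # Use the first file's format as the reference for all the others.
    with wave.open(input_paths[0], "rb") as first:
        fmt = (first.getnchannels(), first.getsampwidth(), first.getframerate())

    with wave.open(output_path, "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        for path in input_paths:
            with wave.open(path, "rb") as src:
                src_fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if src_fmt != fmt:
                    raise ValueError("%s has a different audio format" % path)
                # Copy in chunks so a whole CD never has to sit in memory.
                while True:
                    frames = src.readframes(65536)
                    if not frames:
                        break
                    out.writeframes(frames)
```

A call such as concatenate_wavs(["track01.wav", "track02.wav"], "disc1.wav") would then produce the single file to hand to iTunes for conversion.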

Related

Python Ripping CD Files to WAV

I'm writing a program where I need to be able to rip files from a CD into WAV format (or flac but wav works fine). It must run on Windows. I saw other answers where Express Rip and Audio Commander were recommended as command line tools. But Audio Commander's page doesn't seem to exist anymore. And I'm not so sure about express rip, it seems a bit sketchy.
Then they mentioned mutagen for retrieving the metadata.
Anyone have any experience with these utilities or this goal? I would like to be able to rip a CD in WAV, keep the metadata that is there, and if possible check the CD Archive for metadata as well.
Has anyone ever done anything like this, or have any suggestions on modules, utilities, methods, etc. to get me going? Even some small examples would help, that is, examples of ripping a CD with Python, or modules to accomplish the task.

You might want to have a look at PyMedia
PyMedia is a Python module for wav, mp3, ogg,
avi, divx, dvd, cdda etc files manipulations. It allows you to parse,
demultiplex, multiplex, decode and encode all supported formats. It can
be compiled for Windows, Linux and cygwin.
PyMedia was built to be really simple and flexible at the same time.
See tutorial for example. It allows you to create your own multimedia
applications in a matter of minutes and adjust it to your needs using
other components. Python language is chosen because of simple
semantics, complete and rich set of features.
You can also use this as a library:
From their Audio CD Grabber:
import pymedia.removable.cd as cd

def readTrack(track, offset, bytes):
    # Initialise the CD subsystem and make sure a drive is present.
    cd.init()
    if cd.getCount() == 0:
        print('There is no cdrom found. Bailing out...')
        return 0
    # Use the first drive and check that it holds an audio CD.
    c = cd.CD(0)
    props = c.getProperties()
    if props['type'] != 'AudioCD':
        print('Media in %s has type %s, not AudioCD. Cannot read audio data.' % (c.getName(), props['type']))
        return 0
    # Track numbers are 1-based; read `bytes` bytes starting at `offset`.
    tr0 = c.open(props['titles'][track - 1]['name'])
    tr0.seek(offset, cd.SEEK_SET)
    return tr0.read(bytes)
Update: For accessing metadata about the Audio CD you can use the PyCDDB library.

Python split mp3 channel

I'd like to separate the channels of an mp3 file in Python and save them in two other files.
Does anybody know a library for this?
Thanks in advance.

I assume you want to split the channels losslessly, without decoding MP3 and re-encoding it - otherwise you would not have mentioned MP3 at all and would have easily found many tools like Audacity to do that.
There are 4 channel modes of MP3 frames, which means 4 types of MP3 files: simple stereo, joint stereo, dual channel and mono. Joint-stereo files can't be split without loss. Mono files don't need splitting. The rest, stereo and dual-channel files, make up less than 0.1% of all MP3 files and can technically be split into 2 files, one per channel, without loss. However, there isn't any tool on the Internet to do that, neither a command-line tool nor a GUI tool, because few people need the function.
There isn't any Python library for this either. Most libraries abstract MP3 files into generic audio that you can manipulate after decoding. pymad is the only one specific to MP3 files; it can tell which of the 4 channel modes a file uses, but does not offer to extract a channel without decoding it. If you write a new tool, you will have to work on raw MP3 files or produce a library for it.
And it is not easy to write a tool or library for it. An MP3 file is one stream with 2 channels, not two streams interleaved at the frame level. You cannot simply work on MP3 frames, drop some, keep others, and extract a channel that way. It's a task for a professional, and is perhaps best done in a codec project (like lame or libmad) rather than in a file manipulation project (like mp3info or the Python eyeD3). In other words, this feature would likely be written in C, not Python.
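Checking which of the four channel modes a file uses is the easy part, and can be sketched without any library. The snippet below assumes the standard MPEG audio frame header layout (11 sync bits, with the channel mode in the top two bits of the fourth byte); the function names are invented for this example, and the header scan is deliberately naive, so it can be fooled by byte patterns inside tags.

```python
CHANNEL_MODES = ("stereo", "joint stereo", "dual channel", "mono")


def channel_mode(header):
    """Return the channel mode named by a 4-byte MPEG audio frame header."""
    b = bytearray(header[:4])
    if len(b) < 4:
        raise ValueError("need at least 4 bytes")
    # A frame header starts with 11 set bits (the frame sync).
    if b[0] != 0xFF or (b[1] & 0xE0) != 0xE0:
        raise ValueError("not an MPEG audio frame header")
    return CHANNEL_MODES[(b[3] >> 6) & 0x03]


def find_first_header(data):
    """Return the offset of the first plausible frame header, or -1."""
    b = bytearray(data)
    for i in range(len(b) - 3):
        if b[i] == 0xFF and (b[i + 1] & 0xE0) == 0xE0:
            layer = (b[i + 1] >> 1) & 0x03
            bitrate_index = (b[i + 2] >> 4) & 0x0F
            rate_index = (b[i + 2] >> 2) & 0x03
            # Skip candidates that use reserved or invalid field values.
            if layer != 0 and bitrate_index != 15 and rate_index != 3:
                return i
    return -1
```

Reading the first few kilobytes of a file, calling find_first_header, and passing the four bytes at that offset to channel_mode is enough to tell whether a lossless split is even possible in principle. It does nothing toward the split itself.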
Implementation Note:
The task of building such a tool is thus well suited to a computer science C-programming course project:
1. it takes a lot of time to do;
2. it requires every skill learned in a C programming course;
3. it can go wrong easily;
4. it is likely built on the work of other projects, a lesson in adapting existing work;
5. it is a damn hard endeavor that no one has done before, and thus very rewarding;
6. it can perhaps be done in 300 difficult lines of code rather than in bloated, simple Visual Basic code, and is thus a good lesson in modesty and quality;
7. and finally: nobody is waiting in a hurry for a working implementation.
All these conditions fit a C-programming course project perfectly.
Implementation Note 2:
Some bit rates are only possible in mono mode (e.g. 80 kbps), and some are only possible in stereo mode (e.g. 320 kbps). Luckily this does not present a problem for this task, because every two-channel MP3 bit rate can be mapped to a fitting mono MP3 bit rate, but not vice versa!

Python interface for outputting MIDI files or text that's readable by audio programs

I am looking for a python package or library that will allow me to programmatically output a file format (e.g. MIDI) that can be read by audio/sound processing programs, like LogicPro or iDrum. What are the best options for this?

A large number of possibilities are listed here, especially under the "Midi Mania" header. For your requirements, and the various packages' descriptions, it seems to me that pythonmidi might suit you best, but I have no first-hand experience with it.
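If no package turns out to fit, a simple MIDI file can also be written by hand with the standard library alone. The sketch below is not taken from any of the packages mentioned above. It assumes a format-0 Standard MIDI File with one track, notes played one after another on channel 0 at a fixed velocity of 100; the names vlq and build_midi are invented for this example.

```python
import struct


def vlq(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def build_midi(notes, ticks_per_beat=480):
    """Build a format-0 MIDI file from (pitch, duration_in_ticks) pairs."""
    track = b""
    for pitch, duration in notes:
        track += vlq(0) + bytes([0x90, pitch, 100])       # note on, channel 0
        track += vlq(duration) + bytes([0x80, pitch, 0])  # note off after duration
    track += vlq(0) + b"\xFF\x2F\x00"                     # end-of-track meta event
    # Header chunk: length 6, format 0, one track, ticks per quarter note.
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    return header + b"MTrk" + struct.pack(">I", len(track)) + track


if __name__ == "__main__":
    # C, E, G as quarter notes; the output file name is arbitrary.
    with open("example.mid", "wb") as f:
        f.write(build_midi([(60, 480), (64, 480), (67, 480)]))
```

The resulting file should open in any program that imports standard MIDI files. Tempo, multiple tracks and other channels are left out to keep the sketch short.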

scripting fruityloops or propellerheads reason from VB or Python?

I have both Fruityloops and Propellerheads Reason software synths on my Windows PC.
Any way I can get at and script these from either Visual Basic or Python? Or at least send Midi messages to the synths from code?
Update : attempts to use something like a "midi-mapper" (thanks for link MusiGenesis) don't seem to work. I don't think Reason or FL Studio act like standard GM Midi synths.
Update 2 : If you're interested in this question, check out this too.

Both applications support MIDI. It's just that they don't see each other.
In order to send messages via MIDI between applications, you need to install a virtual midi port.
There are several freely available, but this one works: http://www.midiox.com/zip/MidiYokeSetup.msi
You'll get a virtual MIDI output port that you can write to as if it's a normal MIDI device. In Fruity Loops or Rebirth you choose that port as the input. That's all you need to do to connect the programs.
It'll work like this:
Your Application --> Virtual MIDI Port --> FruityLoops

Note: This answer doesn't exactly answer the question you asked but it might achieve the result you want :)
You can author a VST plugin in Java using jVSTWrapper (http://jvstwrapper.sourceforge.net/). If you really wanted to use Python you could use Jython to interface to java and do it that way. Alternatively you could just write the plugin in Java or another scripting language for the JVM like Groovy.

I think both FL Studio and Reason can be configured as the default MIDI playback device. To send MIDI messages to either from VB.NET, you'll need to PInvoke the midiOutOpen, midiOutShortMsg and midiOutClose API calls. Here's a link to code samples:
http://www.answers.com/topic/midioutopen
They're for VB6, but they should be easy to translate to VB.NET.
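Since the question also mentions Python, the same three API calls can be reached from Python through ctypes. This is only a sketch under assumptions: it runs on Windows only, device_id 0 is assumed to be the port you want (for example a virtual MIDI port), and the helper names pack_short_message and play_note are made up for illustration.

```python
import sys
import time


def pack_short_message(status, data1, data2=0):
    """Pack a MIDI message into the 32-bit value midiOutShortMsg expects."""
    return (status & 0xFF) | ((data1 & 0x7F) << 8) | ((data2 & 0x7F) << 16)


def play_note(device_id=0, note=60, velocity=100, seconds=0.5):
    """Send a note-on and note-off pair to a Windows MIDI output device."""
    if sys.platform != "win32":
        raise OSError("the winmm MIDI API is only available on Windows")
    import ctypes

    winmm = ctypes.windll.winmm
    # Declare argument types so handles are passed correctly on 64-bit Windows.
    winmm.midiOutOpen.argtypes = [ctypes.POINTER(ctypes.c_void_p), ctypes.c_uint,
                                  ctypes.c_size_t, ctypes.c_size_t, ctypes.c_uint]
    winmm.midiOutShortMsg.argtypes = [ctypes.c_void_p, ctypes.c_uint]
    winmm.midiOutClose.argtypes = [ctypes.c_void_p]

    handle = ctypes.c_void_p()
    # A return value of 0 means success.
    if winmm.midiOutOpen(ctypes.byref(handle), device_id, 0, 0, 0) != 0:
        raise OSError("could not open MIDI output device %d" % device_id)
    try:
        winmm.midiOutShortMsg(handle, pack_short_message(0x90, note, velocity))
        time.sleep(seconds)
        winmm.midiOutShortMsg(handle, pack_short_message(0x80, note, 0))
    finally:
        winmm.midiOutClose(handle)
```

Whether FL Studio or Reason actually receives the messages depends on which output device the chosen device_id maps to on the machine in question.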
I know FL Studio can be "driven" from a plugin authored for FL (or a VSTx plugin), but I think these are always written in C or C++.
Edit: I just learned that Windows Vista dropped the MIDI Mapper (which would have made setting up FL or Reason as the default MIDI device simple). Amazing. Here is a link I found with an alternative solution:
http://akkordwechsel.de/15-windows-vista-und-der-midi-mapper/
I just tried it out (it's just a *.CPL file that you double-click to run) and it appears to work (although the GM Synth is the only option available on my laptop, so I'm not sure if it will pick up FL or Reason as choices).

What you need is a VST MIDI scripter / scripting plugin to create a logic of MIDI events that can be sent to any MIDI channel. You would need to set a MIDI channel in FL for the VST instrument/effect whose values you need to tweak. Google for it; there are some plugins around. Please share them back here if you find anything useful :)

You could write a Rewire host. Though, you will have to get a license (the license is free, but your application must be proprietary, so no open source).
Alternatively, you could interface through MIDI messages.
Finally, you could implement a dummy audio device which would route the audio to/from wherever you want or process it in some way.
I imagine all of these would be reasonably difficult. MIDI is probably the easiest of the three (I have no idea how easy or hard the Rewire protocol is to use).

When it comes to Reason, you can't do much with it because of its closed architecture. You can't use VST plugins (or any other type, like DirectX ones), so your only option is to use MIDI.
Regarding Fruity Loops, you could write a VST plugin that can take an input from a scripting language (VB, Python or whatever) but in order to write such thing you would have to use Delphi or C++.
Alternatively, you can check out MAX made by Cycling74 - it's something like an IDE for music ;-) - and I'm pretty sure you can use Python with it.

There's an opensource music workstation, called Frinika, and you can script that in Javascript. (Insert / delete notes , change midi effects like pitch wheel etc.) It can import / export regular midi files, so it will work with Fruity loops or whatever else you have.
// Insert New
song.newLane("MyMidiLane", type("Midi"));
lane = song.getLane("MyMidiLane");
part = lane.newPart( time("10.0:000"), time("4.0:000") );
part.insertNote(note("c#3"), time("11.2:000"), time("2:0"), 120 );
part.insertNote(note("f3"), time("11.3:000"), time("1:0"), 100 );
part.insertNote(note("g#3"), time("11.3:000"), time("1:0"), 100 );
part.insertNote(note("b3"), time("11.3:000"), time("0:64"), 100 );
part.removeNote(note("f3"), time("11.3:000"));
part = song.newLane("MyTextLane", type("Text")).newPart(time("24.0:000"), time("10.0:000"));
part.text = "This is the test text to be inserted.";
part.lane.parts[0].remove(); // remove initially inserted text-part
Another example for reading/changing notes:
lane = song.getLane("MyMidiLane");
// a lane has a fixed instrument assigned
lane.parts[0].notes[0].duration=64
lane.parts[0].notes[1].duration=32
lane.parts[0].notes[1].startTick=120
// Parts are blocks of notes that you can drag around together in the Frinika GUI.
// They're like patterns in trackers.
for (i in lane.parts[0].notes) {
    println("i: "+i+", n: "+noteName(lane.parts[0].notes[i].note));
    println("i: "+i+", dur: "+lane.parts[0].notes[i].duration);
    println("i: "+i+", startT: "+lane.parts[0].notes[i].startTick);
}
http://frinika.appspot.com/
It has a Java Webstart launcher as well, so you don't even have to
install.
It used to bundle the Javadoc documentation as well, but for some
reason their latest downloads don't include that. It's a pity, because
that's where the Javascript bindings are documented. So, now you have
to browse the source or build the Javadoc yourself. (It has some built-in examples that are accessible from the scripting window, you should check them out first. My first example is from there.)
Here is the sourcefile where you'll find the Javascript docs:
frinika Javascript doc/source
But there are other options as well. You can check out mingus too, which is a Python library for music theory and midi file handling. It requires Fluidsynth, and the demo apps require GamePython too, so it's a bit more complicated to set up than Frinika.
P.S.:
Frinika has a particular bug: when dragging around neighbouring notes, some might not sound the right length. You can work around that by transposing the consecutive notes back and forth (fairly fast in piano roll view), or by dragging the part that contains the notes back and forth. Restarting Frinika will also help, but that's the slower way. This bug affects neither saved files nor midi export.

Python library to modify MP3 audio without transcoding

I am looking for some general advice about the mp3 format before I start a small project to make sure I am not on a wild-goose chase.
My understanding of the internals of the mp3 format is minimal. Ideally, I am looking for a library that would abstract those details away. I would prefer to use Python (but could be convinced otherwise).
I would like to modify a set of mp3 files in a fairly simple way. I am not so much interested in the ID3 tags but in the audio itself. I want to be able to delete sections (e.g. drop 10 seconds from the 3rd minute), and insert sections (e.g. add credits to the end.)
My understanding is that the mp3 format is lossy, and so decoding it to (for example) PCM format, making the modifications, and then encoding it again to MP3 will lower the audio quality. (I would love to hear that I am wrong.)
I conjecture that if I stay in mp3 format, there will be some sort of minimum frame or packet-size to deal with, so the granularity of the operations may be coarser. I can live with that, as long as I get an accuracy of within a couple of seconds.
I have looked at PyMedia, but it requires me to migrate to PCM to process the data. Similarly, LAME wants to help me encode, but not access the data in place. I have seen several other libraries that only deal with the ID3 tags.
Can anyone recommend a Python MP3 library? Alternatively, can you disabuse me of my assumption that going to PCM and back is bad and avoidable?

If you want to do things low-level, use pymad. It turns MP3s into a buffer of sample data.
If you want something a little higher-level, use the Echo Nest Remix API (disclosure: I wrote part of it for my dayjob). It includes a few examples. If you look at the cowbell example (i.e., MoreCowbell.dj), you'll see a fork of pymad that gives you a NumPy array instead of a buffer. That datatype makes it easier to slice out sections and do math on them.

I got three quality answers, and I thank you all for them. I haven't chosen any as the accepted answer, because each addressed one aspect, so I wanted to write a summary.
Do you need to work in MP3?
Transcoding to PCM and back to MP3 is likely to result in some drop in quality.
Don't optimise audio-quality prematurely; test it with a simple prototype and listen to it.
Working in MP3
Wikipedia has a summary of the MP3 File Format.
MP3 frames are short (1152 samples, or about 26 ms at 44.1 kHz), allowing for moderate precision at that level.
However, Wikipedia warns that "Frames are not independent items ("byte reservoir") and therefore cannot be extracted on arbitrary frame boundaries."
Existing libraries are unlikely to be of assistance, if I really want to avoid decoding.
Working in PCM
There are several libraries at this level:
LAME (latest release: October 2017)
PyMedia (latest release: February 2006)
PyMad (Linux only? Decoder only? Latest release: January 2007)
Working at a higher level
Echo Nest Remix API (Mac or Linux only, at the moment) is an API to a web-service that supports quite sophisticated operations (e.g. finding the locations of music beats and tempo, etc.)
mp3DirectCut (Windows only) is a GUI that apparently performs the operations I want, but as an app. It is not open-source. (I tried to run it, got an Access Denied installer error, and didn't follow up. A GUI isn't suitable for me, as I want to repeatedly run these operations on a changing library of files.)
My plan is now to start out in PyMedia, using PCM.

MP3 is lossy, but it is lossy in a very specific way. The algorithms used are designed to discard certain parts of the audio which your ears are unable to hear (or find very difficult to hear). Re-doing the compression process at the same level of compression over and over is likely to yield nearly identical results for a given piece of audio. However, some additional losses may slowly accumulate. If you're going to be modifying files a lot, this might be a bad idea. It would also be a bad idea if you were concerned about quality, but then using MP3 if you are concerned about quality is a bad idea overall.
You could construct a test using an encoder and a decoder to re-encode a few different mp3 files a few times and watch how they change, this could help you determine the rate of deterioration and figure out if it is acceptable to you. Sounds like you have libraries you could use to run this simple test already.
MP3 files are composed of "frames" of audio and so it should be possible, with some effort, to remove entire frames with minimal processing (remove the frame, update some minor details in the file header). I believe frames are pretty short (a few milliseconds each) which would give the precision you're looking for. So doing some reading on the MP3 File Format should give you enough information to code your own python library to do this. This is a fair bit different than traditional "audio processing" (since you don't care about precision) and so you're unlikely to find an existing library that does this. Most, as you've found, will decompress the audio first so you can have complete fine-grained control.
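The frame arithmetic behind this idea can be sketched as follows. The numbers assume MPEG-1 Layer III (1152 samples per frame, frame length of 144 * bitrate / sample rate plus padding) and a constant bit rate; the function names are invented for this example. It only works out which frames cover a time range and does not touch a real file. Because frames can borrow bits from their neighbours (the "byte reservoir"), cutting at the computed boundaries may still produce a short glitch.

```python
import math

SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def frame_duration(sample_rate):
    """Seconds of audio carried by one frame."""
    return SAMPLES_PER_FRAME / float(sample_rate)


def frame_length(bitrate_bps, sample_rate, padding=0):
    """Size in bytes of one constant-bit-rate MPEG-1 Layer III frame."""
    return 144 * bitrate_bps // sample_rate + padding


def frames_in_range(start_s, end_s, sample_rate):
    """Return (first, last) so that frames first..last-1 lie fully inside the range."""
    first = int(math.ceil(start_s * sample_rate / float(SAMPLES_PER_FRAME)))
    last = int(math.floor(end_s * sample_rate / float(SAMPLES_PER_FRAME)))
    return first, max(first, last)


if __name__ == "__main__":
    # Dropping 10 seconds starting at the 2-minute mark of a 44.1 kHz file.
    first, last = frames_in_range(120, 130, 44100)
    print(first, last, last - first)
```

At 44.1 kHz a frame lasts roughly 26 ms, so a cut can land within a fraction of a second of the requested position, well inside the "couple of seconds" accuracy asked for in the question.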

Not a direct answer to your needs, but check the mp3DirectCut software that does what you want (as a GUI app). I think that the source code is available, so even if you don't find a library, you could build one of your own, or build a python extension using code from mp3DirectCut.

As for removing or extracting mp3 segments from an mp3 file while staying in the MP3 domain (that is, without conversion to PCM format and back), there is also the open source package PyMp3Cut.
As for splicing MP3 files together (e.g. adding 'Credits' to the end or beginning of an mp3 file), I've found you can simply concatenate the MP3 files, provided that the files have the same sampling rate (e.g. 44.1 kHz) and the same number of channels (e.g. both are stereo or both are mono).
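A minimal sketch of that concatenation is shown below. Stripping the ID3 tags first is an addition that the answer above does not mention, made so that tag data does not end up in the middle of the audio. The function names are made up for illustration. If the first file carries a VBR (Xing/Info) header, players may report the wrong duration for the result.

```python
def strip_id3(data):
    """Return MP3 bytes without a leading ID3v2 tag or a trailing ID3v1 tag."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The ID3v2 tag size is a 28-bit "syncsafe" integer: 7 bits per byte.
        size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9]
        footer = 10 if data[5] & 0x10 else 0
        data = data[10 + size + footer:]
    # An ID3v1 tag is always the last 128 bytes and starts with "TAG".
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]
    return data


def concatenate_mp3s(input_paths, output_path):
    """Write the audio of several MP3 files back to back into one file."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                out.write(strip_id3(src.read()))
```

The function does not check that the sampling rates and channel counts match, so that condition still has to be verified separately before calling concatenate_mp3s.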
