I have a Shoutcast radio station and now want to build a player for it. I know how to "get" the stream from the server, thanks a lot to bobince, but I am not sure how to convert that stream into playable samples. How is it done?
Shoutcast streams are typically (but not always) MP3. To get playable samples, you have to decode the stream's MP3 data.
Have you seen the resource at http://codeboje.de/playing-mp3-stream-python/? Looks like a simple solution, but requires an awful lot of libraries.
There are quite a few possibilities for MP3 decoding under Python. PyMedia is one I've had some success with in the past (but for which development seems to have stopped). It's not just an MP3 decoder though, but a playback interface with support for many audio and video formats via ffmpeg. There's also pyffmpeg which seems to have come back to life recently (haven't tried it yet).
Then there's PyGame, which can also play MP3, though this is a pretty small part of what it does. pymad is a more lightweight possibility, being a direct interface to the libmad decoder library. And then there's always the possibility of handing the task off to an external multimedia library such as DirectShow or GStreamer (via gst-python)...
Well, from what I can read on python, try this page. If that doesn't work, try the PythonInMusic article on the python wiki.
Related
I'm trying to build a virtual microphone that takes the physical microphone's input stream, modifies the audio using a neural network model written with Python, and replaces the sound stream with my program's output sound stream (in near realtime) so that other apps (e.g. Zoom, Skype, etc.) will receive my app's modified audio stream, for both Mac and Windows.
I've been reading for the last few hours and so far found several libraries that might (?) be able to work here: WebRTC, Soundflower, etc. Does anyone know if there's already a good library that can do or facilitate this? This might be wading into offtopic territory, but if this is doable entirely in Python, e.g. through eel or similar, does anyone know of any library that can do the same type of audio manipulation?
Thank you!
On macOS, you can create a virtual microphone device using the official Core Audio APIs.
I am quite new to Python.
I have written a program which downloads a video file through torrent, using libtorrent. I have set it to sequential download so all the parts are downloaded in the right order, for watching the video while it is being downloaded.
The problem is that the file is not available for playing immediately after the download has started. Sometimes 10 MB has to be downloaded, sometimes 30 MB, before the video can be started.
My guess is that this is because some metadata is missing.
My question is how to check whether the file can be played yet. Any suggestions on achieving this? I have searched a lot but have not found anything yet.
I am using Python 2.7 (2.7 because of its compatibility with libtorrent), libtorrent, the Kivy framework 1.8 and its built-in video player, which uses GStreamer as far as I know.
The source code can be checked on GitHub: https://github.com/dpitkevics/stream-ies?files=1
Important files are main.py, lib/downloader.py
Thank You in advance guys :)
I would suggest using a metadata extraction tool, such as Hachoir. If the tool is able to successfully read the metadata, chances are the file is good to go. BUT - you don't necessarily want to start playback at that point. You need to buffer as well. The metadata will provide you with the content length; with the file size and download speed from the torrent you can calculate how much buffer is needed to ensure seamless playback. If you buffer properly, and the streams in the container are interleaved, this should ensure you always have the necessary data to start playback.
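The buffer calculation described above can be sketched roughly as follows. This is only a back-of-the-envelope estimate: it assumes a constant bitrate and a steady download rate, and the function name and parameters are made up for illustration (they are not part of Hachoir or libtorrent).

```python
import math


def required_buffer_bytes(file_size_bytes, duration_s, download_rate_bps):
    """Estimate how many bytes to download before starting playback.

    Assumptions (hypothetical, for illustration): constant bitrate and a
    steady download rate given in bytes per second.
    """
    if duration_s <= 0 or download_rate_bps <= 0:
        raise ValueError("duration and download rate must be positive")
    # Average rate at which playback consumes the file, in bytes per second.
    playback_rate = file_size_bytes / duration_s
    if download_rate_bps >= playback_rate:
        # The download stays ahead of playback; in practice you still need
        # enough data for the player to read the container metadata.
        return 0
    # The download must finish no later than playback does:
    #   buffered + download_rate * duration >= file_size
    return int(math.ceil(file_size_bytes - download_rate_bps * duration_s))


# Example: a 700 MB, 90 minute video downloading at 100 kB/s.
print(required_buffer_bytes(700_000_000, 90 * 60, 100_000))
```

In a real player the download rate fluctuates, so you would re-evaluate this periodically and add a safety margin.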
I want to make software for displaying PIP (Picture in Picture) of 2 video streams. As far as I know, pyglet only allows playing video from a file source.
Maybe I chose the wrong library (pyglet); please advise which library is better for my goals.
I would also appreciate it if you could recommend some books or articles about generating video streams.
Thank you!
Pyglet is fully capable of playing video. However, as you said, I do not think it is capable of streaming video. A much more popular library for this kind of program would be GStreamer. It offers much more functionality, and it can be extended with plugins. Here is a Wikipedia page for it, and here is the official website. If you look at the features page, you will see this:
container formats: asf, avi, 3gp/mp4/mov, flv, mpeg-ps/ts, mkv/webm, mxf, ogg
streaming: http, mms, rtsp
codecs: FFmpeg, various codec libraries, 3rd party codec packs
metadata: native container formats with a common mapping between them
video: various colorspaces, support for progressive and interlaced video
audio: integer and float audio in various bit depths and multichannel configurations
So, it seems perfect for what you want to do (The bold bullet points are the ones I thought would interest you most).
Luckily GStreamer has bindings for Python.
Another library I have found, which you could use for the encoding and the other under-the-hood parts, is PyMedia. Maybe that would be of interest too. However, note that you do not need to use it, as GStreamer can do everything that PyMedia can. I just mention it in case you want to take a look at it, for future use perhaps.
Good luck.
I'm working on an audio mixing program (DAW) web app, and considering using Python and Python Gstreamer for the backend. I understand that I can contain the audio tracks of a single music project in a gst.Pipeline bin, but playback also appears to be controlled by this Pipeline.
Is it possible to create several "views" into the Pipeline representing the project? So that more than one client can grab an audio stream of this Pipeline at will, with the ability to do time seek?
If there is a better platform/library out there to use, I'd appreciate advice on that too. I'd prefer sticking to Python though, because my team members are already researching Python for other parts of this project.
Thanks very much!
You might want to look at Flumotion (www.flumotion.org). It is a Python-based streaming server using GStreamer, so you might be able to get implementation ideas from it for your application. It relies heavily on the Python library Twisted for its network handling.
I am looking for some general advice about the mp3 format before I start a small project to make sure I am not on a wild-goose chase.
My understanding of the internals of the mp3 format is minimal. Ideally, I am looking for a library that would abstract those details away. I would prefer to use Python (but could be convinced otherwise).
I would like to modify a set of mp3 files in a fairly simple way. I am not so much interested in the ID3 tags but in the audio itself. I want to be able to delete sections (e.g. drop 10 seconds from the 3rd minute), and insert sections (e.g. add credits to the end.)
My understanding is that the mp3 format is lossy, and so decoding it to (for example) PCM format, making the modifications, and then encoding it again to MP3 will lower the audio quality. (I would love to hear that I am wrong.)
I conjecture that if I stay in mp3 format, there will be some sort of minimum frame or packet-size to deal with, so the granularity of the operations may be coarser. I can live with that, as long as I get an accuracy of within a couple of seconds.
I have looked at PyMedia, but it requires me to migrate to PCM to process the data. Similarly, LAME wants to help me encode, but not access the data in place. I have seen several other libraries that only deal with the ID3 tags.
Can anyone recommend a Python MP3 library? Alternatively, can you disabuse me of my assumption that going to PCM and back is bad and avoidable?
If you want to do things low-level, use pymad. It turns MP3s into a buffer of sample data.
If you want something a little higher-level, use the Echo Nest Remix API (disclosure: I wrote part of it for my dayjob). It includes a few examples. If you look at the cowbell example (i.e., MoreCowbell.dj), you'll see a fork of pymad that gives you a NumPy array instead of a buffer. That datatype makes it easier to slice out sections and do math on them.
I got three quality answers, and I thank you all for them. I haven't chosen any as the accepted answer, because each addressed one aspect, so I wanted to write a summary.
Do you need to work in MP3?
Transcoding to PCM and back to MP3 is unlikely to result in a noticeable drop in quality.
Don't optimise audio-quality prematurely; test it with a simple prototype and listen to it.
Working in MP3
Wikipedia has a summary of the MP3 File Format.
MP3 frames are short (1152 samples, or about 26 milliseconds at 44.1 kHz) allowing for moderate precision at that level.
However, Wikipedia warns that "Frames are not independent items ("byte reservoir") and therefore cannot be extracted on arbitrary frame boundaries."
Existing libraries are unlikely to be of assistance, if I really want to avoid decoding.
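To put numbers on the frame granularity mentioned above, here is a small arithmetic sketch. It assumes MPEG-1 Layer III (1152 samples per frame); the function names are just for illustration.

```python
import math

SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def frame_duration_ms(sample_rate_hz):
    """Duration of one frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def frames_spanning(seconds, sample_rate_hz):
    """Number of whole frames needed to cover `seconds` of audio."""
    return math.ceil(seconds * sample_rate_hz / SAMPLES_PER_FRAME)


print(round(frame_duration_ms(44100), 3))  # about 26.122 ms per frame
print(frames_spanning(10, 44100))          # frames covering a 10 second cut
```

So cutting at frame boundaries gives far better than the "couple of seconds" accuracy asked for, as long as the byte reservoir issue is dealt with.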
Working in PCM
There are several libraries at this level:
LAME (latest release: October 2017)
PyMedia (latest release: February 2006)
PyMad (Linux only? Decoder only? Latest release: January 2007)
Working at a higher level
Echo Nest Remix API (Mac or Linux only, at the moment) is an API to a web-service that supports quite sophisticated operations (e.g. finding the locations of music beats and tempo, etc.)
mp3DirectCut (Windows only) is a GUI that apparently performs the operations I want, but as an app. It is not open-source. (I tried to run it, got an Access Denied installer error, and didn't follow up. A GUI isn't suitable for me, as I want to repeatedly run these operations on a changing library of files.)
My plan is now to start out in PyMedia, using PCM.
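Once the audio is decoded to PCM, the two operations from the question (dropping a section, appending credits) are just byte slicing. A minimal sketch, assuming raw interleaved PCM in a bytes object; the function names and the defaults (44.1 kHz, stereo, 16-bit) are assumptions for illustration, not anything PyMedia provides.

```python
def seconds_to_offset(seconds, sample_rate, channels, sample_width):
    """Byte offset of a time position, aligned to a whole PCM frame."""
    bytes_per_frame = channels * sample_width
    return int(seconds * sample_rate) * bytes_per_frame


def delete_section(pcm, start_s, end_s,
                   sample_rate=44100, channels=2, sample_width=2):
    """Return the PCM data with the audio between start_s and end_s removed."""
    start = seconds_to_offset(start_s, sample_rate, channels, sample_width)
    end = seconds_to_offset(end_s, sample_rate, channels, sample_width)
    return pcm[:start] + pcm[end:]


def append_section(pcm, extra):
    """Append more PCM data (must use the same rate, channels and width)."""
    return pcm + extra


# Example: drop 10 seconds starting at 2:00 from 3 minutes of silence,
# then add 5 seconds of (silent) "credits".
track = bytes(180 * 44100 * 2 * 2)
credits = bytes(5 * 44100 * 2 * 2)
edited = append_section(delete_section(track, 120, 130), credits)
print(len(edited) / (44100 * 2 * 2))  # duration in seconds
```

The result then has to be encoded back to MP3 with whatever encoder you settle on.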
MP3 is lossy, but it is lossy in a very specific way. The algorithms used are designed to discard certain parts of the audio which your ears are unable to hear (or are very difficult to hear). Re-doing the compression process at the same level of compression over and over is likely to yield nearly identical results for a given piece of audio. However, some additional losses may slowly accumulate. If you're going to be modifying files a lot, this might be a bad idea. It would also be a bad idea if you were concerned about quality, but then using MP3 if you are concerned about quality is a bad idea overall.
You could construct a test using an encoder and a decoder to re-encode a few different MP3 files a few times and watch how they change. This could help you determine the rate of deterioration and figure out if it is acceptable to you. It sounds like you already have libraries you could use to run this simple test.
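A rough sketch of such a test harness follows. The `roundtrip` argument is a placeholder for whatever encode-then-decode function you build from your chosen libraries; `toy_roundtrip` is a made-up stand-in (simple quantization) so the sketch runs on its own, and it is not a model of a real MP3 codec. Real codecs also add a delay and padding, so you would need to align the samples before comparing.

```python
import math


def rms_error(reference, candidate):
    """Root-mean-square difference between two sample sequences."""
    n = min(len(reference), len(candidate))
    if n == 0:
        raise ValueError("need at least one sample to compare")
    total = sum((a - b) ** 2 for a, b in zip(reference, candidate))
    return math.sqrt(total / n)


def measure_generation_loss(samples, roundtrip, generations=5):
    """Re-encode repeatedly; return the RMS error vs. the original each time."""
    errors = []
    current = list(samples)
    for _ in range(generations):
        current = roundtrip(current)  # one encode + decode pass
        errors.append(rms_error(samples, current))
    return errors


def toy_roundtrip(samples):
    """Hypothetical stand-in for a lossy codec: quantize to multiples of 4."""
    return [round(x / 4) * 4 for x in samples]


print(measure_generation_loss([1, 2, 3, 5, 6, 7], toy_roundtrip, 3))
```

If the error keeps growing from one generation to the next with a real codec, that is the accumulation described above; listening to the result is still the better judge.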
MP3 files are composed of "frames" of audio and so it should be possible, with some effort, to remove entire frames with minimal processing (remove the frame, update some minor details in the file header). I believe frames are pretty short (a few tens of milliseconds each) which would give the precision you're looking for. So doing some reading on the MP3 File Format should give you enough information to code your own Python library to do this. This is a fair bit different than traditional "audio processing" (since you don't care about precision) and so you're unlikely to find an existing library that does this. Most, as you've found, will decompress the audio first so you can have complete fine-grained control.
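As a starting point for that kind of do-it-yourself library, here is a minimal sketch of splitting a file into frames. It only handles MPEG-1 Layer III headers, the function names are my own, and it ignores the bit reservoir, so frames dropped this way may still produce a glitch at the cut. A byte pattern inside the audio data can also look like a header, so a robust version would need to verify several consecutive frames.

```python
# Lookup tables for MPEG-1 Layer III only (other versions and layers differ).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def frame_length(header):
    """Byte length of the frame that starts with this 4-byte header, or None."""
    if len(header) < 4:
        return None
    b0, b1, b2 = header[0], header[1], header[2]
    # 11 sync bits, then version '11' (MPEG-1) and layer '01' (Layer III).
    if b0 != 0xFF or (b1 & 0xFE) != 0xFA:
        return None
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        return None  # free-format or reserved values are not handled here
    padding = (b2 >> 1) & 0x01
    return 144 * bitrate * 1000 // sample_rate + padding


def split_frames(data):
    """Split MP3 bytes into frames, skipping bytes that don't start a frame."""
    frames, pos = [], 0
    while pos + 4 <= len(data):
        length = frame_length(data[pos:pos + 4])
        if length is None or pos + length > len(data):
            pos += 1  # not a (complete) frame here; keep scanning
            continue
        frames.append(data[pos:pos + length])
        pos += length
    return frames


# A 128 kbps, 44.1 kHz frame header gives the familiar 417-byte frame.
print(frame_length(bytes([0xFF, 0xFB, 0x90, 0x00])))
```

Deleting a section is then a matter of dropping the corresponding entries from the list and joining the rest with `b"".join(...)`.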
Not a direct answer to your needs, but check the mp3DirectCut software that does what you want (as a GUI app). I think that the source code is available, so even if you don't find a library, you could build one of your own, or build a python extension using code from mp3DirectCut.
As for removing or extracting mp3 segments from an mp3 file while staying in the MP3 domain (that is, without conversion to PCM format and back), there is also the open source package PyMp3Cut.
As for splicing MP3 files together (adding e.g. 'Credits' to the end or beginning of an MP3 file), I've found you can simply concatenate the MP3 files, provided that the files have the same sampling rate (e.g. 44.1 kHz) and the same number of channels (e.g. both are stereo or both are mono).
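A minimal sketch of that concatenation follows. Stripping the ID3 tags first is my own addition, so that a tag from one file does not end up in the middle of the joined audio; the function names are made up for illustration. It does not touch any Xing/VBR info frame, so some players may report the wrong duration for the result.

```python
def strip_id3(data):
    """Remove a leading ID3v2 tag and a trailing ID3v1 tag, if present."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The tag size is a 28-bit "syncsafe" integer: 7 useful bits per byte.
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        footer = 10 if data[5] & 0x10 else 0  # optional 10-byte footer
        data = data[10 + size + footer:]
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]  # ID3v1 is a fixed 128-byte block at the end
    return data


def concat_mp3(parts):
    """Join the audio of several MP3 byte strings (same rate and channels)."""
    return b"".join(strip_id3(part) for part in parts)


# Usage with files on disk (the paths are placeholders):
# with open("episode.mp3", "rb") as a, open("credits.mp3", "rb") as b:
#     joined = concat_mp3([a.read(), b.read()])
# with open("episode_with_credits.mp3", "wb") as out:
#     out.write(joined)
```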