I need to convert mp3 audio files to 64kbps on the server side.
Right now, I am using subprocess to call lame, but I wonder if there are any good alternatives?
There seems to be a slightly old thread on that topic here: http://www.dreamincode.net/forums/topic/72083-lame-mp3-encoder-for-python/
The final conclusion was to create a custom binding to lame_enc.dll via Python->C bindings.
The reason for that conclusion was that the existing binding libraries (pymedia/py-lame) have not been maintained.
Unfortunately the guy didn't get it to work :)
Maybe you should continue to use subprocess. You could take advantage of that choice, abstract your encoding at a slightly higher level, and reuse the code/strategy to optionally execute other command line encoding tools (such as ogg or shn tools).
I've seen several audio ripping tools adopt that strategy.
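For example, a minimal sketch of that abstraction (hedged: the -b flags shown for lame and oggenc are their usual bitrate options, but check your installed versions):

import subprocess

#each encoder is just a recipe for building a command line, so adding
#another tool later means adding one entry rather than new plumbing
ENCODERS = {
    'mp3': lambda src, dst: ['lame', '-b', '64', src, dst],
    'ogg': lambda src, dst: ['oggenc', '-b', '64', '-o', dst, src],
}

def encode(fmt, src, dst):
    subprocess.check_call(ENCODERS[fmt](src, dst))

#encode('mp3', 'in.wav', 'out.mp3')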
I've been working with Python Audio Tools, which is capable of making conversions between different audio formats.
I've already used it to convert .wav files into .mp3, .flac and .m4a.
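For instance, a minimal sketch (hedged: this assumes audiotools.open() and the AudioFile.convert() method from the Python Audio Tools docs; the filenames are placeholders):

import audiotools

#convert a wav file to mp3; the optional compression parameter takes
#format-specific quality strings, so check the docs for the mode that
#corresponds to the bitrate you want
audiotools.open("input.wav").convert("output.mp3", audiotools.MP3Audio)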
If you want to use LAME to encode your MP3s (and not PyMedia), you can always use ctypes to wrap the lame encoder DLL (or .so if you are on Linux). The exact wrapper code you'll use is going to be tied to the LAME DLL version (and there are many of these flying around, unfortunately), so I can't really give you any example, but the ctypes docs should be clear enough about wrapping DLLs.
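That said, the general shape of a ctypes wrapper tends to look something like this (a rough sketch against the standard LAME C API function names; buffer management and error handling are omitted, and the library name will vary on your system):

import ctypes

lame = ctypes.CDLL("libmp3lame.so")  #or load the lame_enc.dll variant you have
lame.lame_init.restype = ctypes.c_void_p  #the encoder state is a pointer

flags = ctypes.c_void_p(lame.lame_init())
lame.lame_set_in_samplerate(flags, 44100)
lame.lame_set_brate(flags, 64)  #target 64kbps
lame.lame_init_params(flags)
#...feed PCM through lame_encode_buffer()/lame_encode_flush() here...
lame.lame_close(flags)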
Caveat: relatively new programmer here and I haven't had a need to convert audio files before.
However, if I understand what you mean by server-side correctly, you might be looking for a good approach to manage mass conversions, and your interest in a Python solution might be partly to better manage resource use or to integrate into your processing chain. I had a similar problem/goal, which I resolved using a mix of Merlyn's recommendation and Celery. I don't use django-celery, but if this is for a Django-based project, that might appeal to you as well. You can find out more about Celery here:
http://celeryproject.org/community.html
http://ask.github.com/celery/getting-started/introduction.html
Depending on what you have set up already, there may be a little upfront time needed to get set up. To take advantage of everything you'll need rabbitmq/erlang installed, but if you follow the guide on the sites above, it's pretty quick now.
Here's an example of how I use Celery with subprocess to address a similar issue. Similar to the poster's suggestion above, I use subprocess to call ffmpeg, which is as good as it gets for video tools, and probably as good as it gets for audio tools too. I'm including a bit more than necessary here to give you a feel for how you might configure your own.
#example of configuring an option, here I'm selecting how much I want to adjust bitrate
#based on my input's format
def generate_command_line_method(self):
    compression_factor = 1  #default if the extension isn't recognized
    if self.bitrate:
        compression_dict = {'.mp4': 1.5, '.rm': 1.5, '.avi': 1.2,
                            '.mkv': 1.2, '.mpg': 1, '.mpeg': 1}
        if self.ext.lower() in compression_dict:
            compression_factor = compression_dict[self.ext.lower()]
    #Making a list to send to the command line through subprocess
    ffscript = ['ffmpeg',
                '-i', self.fullpath,
                '-b', str(self.bitrate * compression_factor),
                '-qscale', '3',  #quality factor, based on trial and error
                '-g', '90',      #iframe roughly per 3 seconds
                '-intra',
                self.outpath     #was a bare `outpath`; assuming an attribute like the others
                ]
    return ffscript
#The celery side of things: I'd have a celeryconfig.py file in the same
#directory as the script that points to the following function, so my task
#queue knows the specifics of the function I'll call through it. You can see
#example configs on the sites above, but it's basically just a tuple that
#says "here are the modules I want you to look in", e.g.
#CELERY_IMPORTS = ("exciting_asynchronous_module",). That module then contains:
from celery.decorators import task
from mymodule import myobject
from subprocess import Popen

@task(time_limit=600)  #say, for example, 10 mins
def run_ffscript(ffscript):
    some_result = Popen(ffscript).wait()
    #Note: we wait because we don't want to compound the asynchronous aspect
    #(we don't want celery to launch the subprocess and think it has finished).
#Then I start up celery/rabbitmq and go into my interactive shell (ipython shown).
#I'll have some generator feeding these ffscript command lines, then process them
#with something like:
In [1]: for generated_ffscript in generator:
   ...:     run_ffscript.delay(generated_ffscript)
Let me know if this was useful to you. I'm relatively new to answering questions here and not sure if my attempts are helpful or not. Good luck!
Well, GStreamer has the "ugly plugin" lamemp3enc, and there are Python bindings for GStreamer (gst-python 1.2, which supports Python 3.3). I haven't tried going this route myself, so I'm not really in a position to recommend anything... Frankly, a subprocess solution seems a lot simpler, if not "cleaner", to me.
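For what it's worth, a rough sketch of what that route might look like (untested, per the caveat above; the element and property names come from the GStreamer 1.0 docs, and the file names are placeholders):

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

#decode the input, convert to raw audio, and re-encode with lamemp3enc
#pinned to a constant 64kbps target
Gst.init(None)
pipeline = Gst.parse_launch(
    "filesrc location=in.wav ! decodebin ! audioconvert ! "
    "lamemp3enc target=bitrate bitrate=64 ! filesink location=out.mp3"
)
pipeline.set_state(Gst.State.PLAYING)
#block until the pipeline finishes (or errors out), then clean up
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)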
begin TLDR;
I want to write a python3 script to scan through the memory of a running windows process and find strings.
end TLDR;
This is for a CTF binary. It's a typical Windows x86 PE file. The goal is simply to get a flag from the process's memory as it runs. This is easy with ProcessHacker: you can search through the strings in the memory of the running application and find the flag with a regex. Now, because I'm a masochistic geek, I strive to script out solutions for CTFs (for everything, really). Specifically I want to use python3; C# is also an option, but I would really like to keep all of the solution scripts in python.
Thought this would be a very simple task. You know... pip install some library written by someone that's already solved the problem and use it. Couldn't find anything that would let me do what I need for this task. Here are the libraries I tried out already.
ctypes - This was the first one I used, specifically ReadProcessMemory. I kept getting 299 errors (ERROR_PARTIAL_COPY), which was because the buffer I was passing in was larger than that section of memory, so I made a recursive function that would catch that exception and divide the buffer length by 2 until it got something, then read one byte at a time until it hit a 299 error. I may have been on the right track there, but I wasn't able to get the flag. I WAS able to find the flag, but only if I already knew its exact address (which I'd get from ProcessHacker). I may make a separate question on SO to address that; this one is really just me asking the community if something already exists before diving in. (See the region-walking sketch just after this list.)
pymem - A nice wrapper for ctypes but had the same issues as above.
winappdbg - python2.x only. I don't want to use python 2.x.
haystack - Looks like this depends on winappdbg which depends on python 2.x.
angr - This is a possibility; I've only scratched the surface with it so far. It looks complicated and it's on the to-learn list, but I don't want to dive into something right now that's not going to solve the issue.
volatility - Looks like this is meant for working with full RAM dumps, not for hooking into currently running processes and reading their memory.
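For reference, here is the shape of the region-walking variant mentioned in the ctypes bullet above (a hedged, untested sketch: it assumes the usual MEMORY_BASIC_INFORMATION layout, and the PID and search string are placeholders):

import ctypes
from ctypes import wintypes

PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_VM_READ = 0x0010
MEM_COMMIT = 0x1000

class MEMORY_BASIC_INFORMATION(ctypes.Structure):
    _fields_ = [("BaseAddress", ctypes.c_void_p),
                ("AllocationBase", ctypes.c_void_p),
                ("AllocationProtect", wintypes.DWORD),
                ("RegionSize", ctypes.c_size_t),
                ("State", wintypes.DWORD),
                ("Protect", wintypes.DWORD),
                ("Type", wintypes.DWORD)]

kernel32 = ctypes.windll.kernel32

def scan(pid, needle):
    #walk the address space with VirtualQueryEx so each ReadProcessMemory
    #asks for exactly one committed region - this avoids the error-299
    #(ERROR_PARTIAL_COPY) buffer-size guessing game
    proc = kernel32.OpenProcess(
        PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, False, pid)
    mbi = MEMORY_BASIC_INFORMATION()
    addr = 0
    while kernel32.VirtualQueryEx(proc, ctypes.c_void_p(addr),
                                  ctypes.byref(mbi), ctypes.sizeof(mbi)):
        if mbi.RegionSize == 0:
            break
        if mbi.State == MEM_COMMIT:
            buf = ctypes.create_string_buffer(mbi.RegionSize)
            got = ctypes.c_size_t()
            if kernel32.ReadProcessMemory(proc, ctypes.c_void_p(addr), buf,
                                          ctypes.c_size_t(mbi.RegionSize),
                                          ctypes.byref(got)):
                if needle in buf.raw[:got.value]:
                    print(hex(addr))
        addr += mbi.RegionSize
    kernel32.CloseHandle(proc)

#scan(1234, b"flag{")  #hypothetical PID and needle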
My plan at the moment is to dive a bit more into angr to see if that will work, then go back to pymem/ctypes and try more things. If all else fails, ProcessHacker IS open source. I'm not fluent in C so it'll take time to figure out how they're doing it. Really hoping there's some python3 library I'm missing, or maybe I'm going about this the wrong way.
Ended up writing the script using the frida library. Also have to give a shout-out to rootbsd, because his or her code in the fridump3 project helped greatly.
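For anyone who wants the shape of that, a rough sketch of the frida route (hedged: the process name and the pattern, the hex bytes of "flag", are placeholders):

import frida

JS = """
var pattern = '66 6c 61 67';  // the bytes of the string "flag"
Process.enumerateRangesSync('r--').forEach(function (range) {
    try {
        Memory.scanSync(range.base, range.size, pattern).forEach(function (m) {
            send('hit at ' + m.address);
        });
    } catch (e) { /* ranges can vanish mid-scan; skip them */ }
});
"""

session = frida.attach("target.exe")
script = session.create_script(JS)
script.on('message', lambda message, data: print(message))
script.load()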
Has someone implemented CSV handling for Flyway? It was requested some time ago (Flyway specific migration with csv files). The Flyway docs now mention it as a possibility for the MigrationResolver and MigrationExecutor, but it does not seem to be implemented.
I've tried to do it myself with Flyway 4.2, but I'm not very good with Java. I got as far as creating my own jar using the sample and making it accessible to Flyway. But how does Flyway distinguish when to use the SqlMigrator and when to use my CsvMigrator? I thought I had to register my own prefix/suffix (as the question above writes), but FlywayConfiguration seems to be read-only; at least I did not see any API calls for doing this :(.
How do I connect the different resolvers to the different migration file types? (.sql to the SQL-based migration, and .csv/.py to loading CSV files and executing Python scripts)
After some shedding of tears and blood, it looks like I came up with something. I can't make the whole code available because it uses a proprietary file format, but here are the main ideas:
implement ConfigurationAware as well, and use the setFlywayConfiguration implementation to catalog the extra files you want to handle (i.e. .csv). This is executed only once during the run.
during this cataloging I could not use the scanner or LoadableResources; there's some Java magic I do not understand. All the classes and methods seem to be available and accessible, even when using .getMethods() at runtime... but when trying to actually call them during a run it throws java.lang.NoSuchMethodError and java.lang.NoClassDefFoundError. I wasted a whole day on this - don't do that, just copy-paste the code from org.flywaydb.core.internal.util.scanner.filesystem.FileSystemScanner.
use Set<String> instead of LoadableResources[]; it's way easier to work with, especially since there's no access to LoadableResources anyway, and working with [] was a nightmare.
the python/shell call will go into execute(). Some tips:
any exception or faulty exit code needs to be translated to an SQLException.
the build is enforcing Java 1.6, so new ProcessBuilder(cmd).inheritIO() cannot be used. Look at these solutions: ProcessBuilder: Forwarding stdout and stderr of started processes without blocking the main thread if you want to print the STDOUT/STDERR.
to compile flyway including your custom module, clone the whole flyway repo from git, edit the main pom.xml to include your module as well and use this command to compile: "mvn install -P-CommercialDBTest -P-CommandlinePlatformAssemblies -DskipTests=true" (I found this in another stackoverflow question.)
what I haven't done yet is the checksum part; I don't know yet what that wants.
Per the Python documentation, subprocess.call should be blocking and wait for the subprocess to complete. In this code I am trying to convert a few xls files to a new format by calling LibreOffice on the command line. I assumed that the call to subprocess.call is blocking, but it seems I need to add an artificial delay after each call, otherwise I miss a few files in the out directory.
What am I doing wrong? And why do I need the delay?
from subprocess import call

for i in range(0, len(sorted_files)):
    args = ['libreoffice', '-headless', '-convert-to', 'xls',
            "%s/%s.xls" % (sorted_files[i]['filename'], sorted_files[i]['filename']),
            '-outdir', 'out']
    call(args)
    var = raw_input("Enter something: ")  #if I comment this line out I don't get all the files in the out directory
EDIT: It might be hard to find the answer through the comments below. I used unoconv for document conversion, which is blocking and easy to work with from a script.
It's likely that libreoffice is implemented as some sort of daemon/intermediary process. The "daemon" will (effectively [1]) parse the command line and then farm the work off to some other process, possibly detaching it so that it can exit immediately. (Based on the -invisible option in the documentation, I suspect strongly that this is indeed the case here.)
If this is the case, then your subprocess.call does do what it is advertised to do -- It waits for the daemon to complete before moving on. However, it doesn't do what you want which is to wait for all of the work to be completed. The only option you have in that scenario is to look to see if the daemon has a -wait option or similar.
[1] It is likely that we don't have an actual daemon here, only something which behaves similarly. See the comments by abernert.
The problem is that the soffice command-line tool (which libreoffice is either just a link to, or a further wrapper around) is just a "controller" for the real program, soffice.bin. It finds a running copy of soffice.bin and/or creates one, tells it to do some work, and then quits.
So, call is doing exactly the right thing: it waits for libreoffice to quit.
But you don't want to wait for libreoffice to quit, you want to wait for soffice.bin to finish doing the work that libreoffice asked it to do.
It looks like what you're trying to do isn't possible to do directly. But it's possible to do indirectly.
The docs say that headless mode:
… allows using the application without user interface.
This special mode can be used when the application is controlled by external clients via the API.
In other words, the app doesn't quit after running some UNO strings/doing some conversions/whatever else you specify on the command line; it sits around waiting for more UNO commands from outside, while the launcher just quits as soon as it has sent the appropriate commands to the app.
You probably have to use that above-mentioned external control API (UNO) directly.
See Scripting LibreOffice for the basics (although there's more info there about internal scripting than external), and the API documentation for details and examples.
But there may be an even simpler answer: unoconv is a simple command-line tool written using the UNO API that does exactly what you want. It starts up LibreOffice if necessary, sends it some commands, waits for the results, and then quits. So if you just use unoconv instead of libreoffice, call is all you need.
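For example, a minimal sketch of that substitution (assuming unoconv's -f/-o options; the file layout follows the question's code):

import subprocess

for f in sorted_files:
    #unoconv only returns once the conversion is done, so call() now
    #blocks for the real work, not just for a launcher
    subprocess.call(['unoconv', '-f', 'xls', '-o', 'out',
                     "%s/%s.xls" % (f['filename'], f['filename'])])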
Also notice that unoconv is written in Python, and is designed to be used as a module. If you just import it, you can write your own (simpler, and use-case-specific) code to replace the "Main entrance" code, and not use subprocess at all. (Or, of course, you can tear apart the module and use the relevant code yourself, or just use it as a very nice piece of sample code for using UNO from Python.)
Also, the unoconv page linked above lists a variety of other similar tools, some that work via UNO and some that don't, so if it doesn't work for you, try the others.
If nothing else works, you could consider, e.g., creating a sentinel file and using a filesystem watch, so at least you'll be able to detect exactly when it's finished its work, instead of having to guess at a timeout. But that's a real last-ditch workaround that you shouldn't even consider until eliminating all of the other options.
If libreoffice is using an intermediary (daemon) as mentioned by #mgilson, then one solution is to find out what program it's invoking, and then invoke that directly yourself.
I have a working GPIB interface and Linux-GPIB package installed and working.
I only know two commands at the moment, x.write and x.find. I don't know much about Python, but I recognize the dot operator and realize that after importing gpib, I should get some functions at my disposal.
I have not been able to locate the list of GPIB functions.
They are in the gpib library. You reference them like so: gpib.foo().
Add this line into your code:
help(gpib)
And browse through the functions/classes.
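For example, a short sketch using the module-level functions the linux-gpib binding exposes (hedged: untested, and "voltmeter" must match a device entry in your gpib.conf):

import gpib

dev = gpib.find("voltmeter")  #look the device up by its configured name
gpib.write(dev, "*IDN?")      #ask the instrument to identify itself
print(gpib.read(dev, 100))    #read back up to 100 bytes
gpib.close(dev)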
If you are working in Python, I think PyVISA is what you are looking for. It provides lots of useful high-level functions which help you send a series of SCPI commands to your equipment via GPIB, such as write, read, ask and so on.
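For example, a minimal sketch (the GPIB address and the SCPI strings here are placeholders; check your instrument's documentation):

import pyvisa

rm = pyvisa.ResourceManager()
inst = rm.open_resource("GPIB0::12::INSTR")  #board 0, primary address 12
print(inst.query("*IDN?"))  #write a command and read the reply in one call
inst.write("VOLT 1.0")      #hypothetical SCPI command; see your datasheet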
As for the SCPI commands themselves, they usually differ between vendors, so in terms of what kind of SCPI you should send to the equipment, you should read the corresponding datasheet. Alternatively, you may have installed drivers provided by the vendor, in which case you can send even higher-level commands. For instance, if you would like to control a voltage source, there is probably already a function like setvoltage(double voltage). Things will be much easier for you.
Actually there are many commands available. Besides the two you mentioned, there are x.read, x.ask, x.ask_for_value and so on.
But I recommend you read those help files; I think that will give you a better understanding.
I'm working in Python to create images from text. I've already been back and forth with PIL and frankly, its font and alignment options need a lot of work.
I can subprocess Imagemagick and it works great, except that it seems to always need to write a file to disk. I would like to subprocess the image creation and just get the data returned to Python, keeping everything in memory.
I've looked into a number of supposed Python wrappers for ImageMagick, but they're all hopelessly years out of date or not documented whatsoever. Even searching extensively on SO doesn't seem to clearly point to a de facto way to use ImageMagick with Python. So I think going with subprocessing is the best way forward.
convert and the other ImageMagick commands can output image data to stdout if you specify format:- as the output file. You can capture that output in Python using the subprocess module.
For instance:
cmd = ["convert", "test.bmp", "jpg:-"]
output_stream = subprocess.Popen(cmd, stdout=subprocess.PIPE).stdout
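You can then read() that stream to get the encoded bytes. And since the goal here is text-to-image, the same stdout trick works with ImageMagick's label: input (the font name is just an example; the PNG bytes come back without touching disk):

import subprocess

cmd = ["convert", "-font", "Helvetica", "-pointsize", "24",
       "label:Hello, world", "png:-"]
png_data = subprocess.check_output(cmd)  #raises if convert fails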
It would be a lot more work than piping data to ImageMagick, but there are several Pango-based solutions. I used Pango and PyGTK a while back, and I am pretty sure you could develop a headless GTK or GDK application to render text to a pixbuf.
A simpler solution might be to use the Python cairo bindings.
Pango works at a pretty low level, so simple stuff can be a lot more complicated, but the rendering quality is hard to beat, and it gives you a lot of fine-grained control over the layout.
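For instance, a rough sketch using cairo's simple "toy" text API (the surface size, font, and text are placeholders; Pango would give the finer layout control described above):

import io
import cairo

surface = cairo.ImageSurface(cairo.FORMAT_ARGB32, 300, 60)
ctx = cairo.Context(surface)
ctx.select_font_face("sans-serif", cairo.FONT_SLANT_NORMAL,
                     cairo.FONT_WEIGHT_NORMAL)
ctx.set_font_size(24)
ctx.move_to(10, 40)           #baseline position
ctx.show_text("Hello, world")

buf = io.BytesIO()
surface.write_to_png(buf)     #write_to_png accepts file-like objects
png_data = buf.getvalue()     #PNG bytes, all in memory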