Changing mp4ize.py to work on Windows

Changing mp4ize.py to work on Windows - python

Mp4ize (python) is a utility for converting video files to mp4 for use on iPhone and iPod. I'm trying to get it to run on Windows.
The python script relies on the library fcntl, and according to another question ( fcntl substitute on Windows ), the Windows equivalent is win32api. The other question also says:
If you provide more details about the fcntl calls people can find windows equivalents.
and since I've had no luck trying to rewrite the code myself, I thought I'd ask here.
How can I rewrite the following code for use on Windows?
fcntl.fcntl(
p.stderr.fileno(),
fcntl.F_SETFL,
fcntl.fcntl(p.stderr.fileno(), fcntl.F_GETFL) | os.O_NONBLOCK,
)
See here for the full source code.

This command sets the NONBLOCK option of the standard error file descriptor. This lets it pass data through before the entirety of the data has been written to it.
The patch at http://pastebin.com/Zr5LN8Ui will work, with progress indicators, on Windows. However, it will sometimes report a bad encode even when the encode was good.
It uses the solution from Non-blocking read on a subprocess.PIPE in python to allow non blocking IO, and fixes the pad option (your version didn't work for my test file) and progress bar for a modern FFMpeg.
Note that it is hardcoded to use the linked method when FFMpeg gets passed 3 or more command line options, as it messes up the first call to FFMpeg which gets the resolution of the input file.

Related

sys.stdout.flush() not working properly with python and electronjs

I'm building an electronjs python application and I'm using the pythonshell module. The electron application is supposed to log any messages my python script prints to the console, but rather than printing each message when it's supposed to be printed it waits until the script has finished executing and then prints everything. I've tried using sys.stdout.write("message") and then sys.stdout.flush(), but it still doesn't work.
The question I'm linking has a similar problem that I do, but the answer that worked for them didn't work for me on the electron application. It's flushing properly on the python backend, the frontend is what's causing the problem.
Similar question: Python sys.stdout.flush() doesn't work

file.flush() does not necessarily write the file’s data to disk!. you need to Use flush() followed by os.fsync(fd) to ensure this behavior.
see below:
sys.stdout.flush()
os.fsync(sys.stdout.fileno())
os.fsync(fd) documentation from python docs
Force write of file with file descriptor fd to disk. On Unix, this
calls the native fsync() function; on Windows, the MS _commit()
function.
If you’re starting with a Python file object f, first do f.flush(),
and then do os.fsync(f.fileno()), to ensure that all internal buffers
associated with f are written to disk.

Force a 3rd-party program to flush its output when called through subprocess

I am using a 3rd-party python module which is normally called through terminal commands. When called through terminal commands it has a verbose option which prints to terminal in real time.
I then have another python program which calls the 3rd-party program through subprocess. Unfortunately, when called through subprocess the terminal output no longer flushes, and is only returned on completion (the process takes many hours so I would like real-time progress).
I can see the source code of the 3rd-party module and it does not set printing to be flushed such as print('example', flush=True). Is there a way to force the flushing through my module without editing the 3rd-party source code? Furthermore, can I send this output to a log file (again in real time)?
Thanks for any help.

The issue is most likely that many programs work differently if run interactively in a terminal or as part of a pipe line (i.e. called using subprocess). It has very little to do with Python itself, but more with the Unix/Linux architecture.
As you have noted, it is possible to force a program to flush stdout even when run in a pipe line, but it requires changes to the source code, by manually applying stdout.flush calls.
Another way to print to screen, is to "trick" the program to think it is working with an interactive terminal, using a so called pseudo-terminal. There is a supporting module for this in the Python standard library, namely pty. Using, that, you will not explicitly call subprocess.run (or Popen or ...). Instead you have to use the pty.spawn call:
def prout(fd):
data = os.read(fd, 1024)
while(data):
print(data.decode(), end="")
data = os.read(fd, 1024)
pty.spawn("./callee.py", prout)
As can be seen, this requires a special function for handling stdout. Here above, I just print it to the terminal, but of course it is possible to do other thing with the text as well (such as log or parse...)
Another way to trick the program, is to use an external program, called unbuffer. Unbuffer will take your script as input, and make the program think (as for the pty call) that is called from a terminal. This is arguably simpler if unbuffer is installed or you are allowed to install it on your system (it is part of the expect package). All you have to do then, is to change your subprocess call as
p=subprocess.Popen(["unbuffer", "./callee.py"], stdout=subprocess.PIPE)
and then of course handle the output as usual, e.g. with some code like
for line in p.stdout:
print(line.decode(), end="")
print(p.communicate()[0].decode(), end="")
or similar. But this last part I think you have already covered, as you seem to be doing something with the output.

Python write to ram file when using command line, ghostscript

I want to run this command from python:
gs.exe -sDEVICE=jpeg -dTextAlphaBits=4 -r300 -o a.jpg a.pdf
Using ghostscript, to convert pdf to series of images. How do I use the RAM for the input and output files? Is there something like StringIO that gives you a file path?
I noticed there's a python ghostscript library, but it does not seem to give much more over the command line

You can't use RAM for the input and output file using the Ghostscript demo code, it doesn't support it. You can pipe input from stdin and out to stdout but that's it for the standard code.
You can use the Ghostscript API to feed data from any source, and you can write your own device (or co-opt the display device) to have the page buffer (which is what the input is rendered to) made available elsewhere. Provided you have enough memory to hold the entire page of course.
Doing that will require you to write code to interface with the Ghostscript shared object or DLL of course. Possibly the Python library does this, I wouldn't know not being a Python developer.
I suspect that the pointer from John Coleman is sufficient for your needs though.

subprocess.call does not wait for the process to complete

Per Python documentation, subprocess.call should be blocking and wait for the subprocess to complete. In this code I am trying to convert few xls files to a new format by calling Libreoffice on command line. I assumed that the call to subprocess call is blocking but seems like I need to add an artificial delay after each call otherwise I miss few files in the out directory.
what am I doing wrong? and why do I need the delay?
from subprocess import call
for i in range(0,len(sorted_files)):
args = ['libreoffice', '-headless', '-convert-to',
'xls', "%s/%s.xls" %(sorted_files[i]['filename'],sorted_files[i]['filename']), '-outdir', 'out']
call(args)
var = raw_input("Enter something: ") # if comment this line I dont get all the files in out directory
EDIT It might be hard to find the answer through the comments below. I used unoconv for document conversion which is blocking and easy to work with from an script.

It's possible likely that libreoffice is implemented as some sort of daemon/intermediary process. The "daemon" will (effectively1) parse the commandline and then farm the work off to some other process, possibly detaching them so that it can exit immediately. (based on the -invisible option in the documentation I suspect strongly that this is indeed the case you have).
If this is the case, then your subprocess.call does do what it is advertised to do -- It waits for the daemon to complete before moving on. However, it doesn't do what you want which is to wait for all of the work to be completed. The only option you have in that scenario is to look to see if the daemon has a -wait option or similar.
1It is likely that we don't have an actual daemon here, only something which behaves similarly. See comments by abernert

The problem is that the soffice command-line tool (which libreoffice is either just a link to, or a further wrapper around) is just a "controller" for the real program soffice.bin. It finds a running copy of soffice.bin and/or creates on, tells it to do some work, and then quits.
So, call is doing exactly the right thing: it waits for libreoffice to quit.
But you don't want to wait for libreoffice to quit, you want to wait for soffice.bin to finish doing the work that libreoffice asked it to do.
It looks like what you're trying to do isn't possible to do directly. But it's possible to do indirectly.
The docs say that headless mode:
… allows using the application without user interface.
This special mode can be used when the application is controlled by external clients via the API.
In other words, the app doesn't quit after running some UNO strings/doing some conversions/whatever else you specify on the command line, it sits around waiting for more UNO commands from outside, while the launcher just runs as soon as it sends the appropriate commands to the app.
You probably have to use that above-mentioned external control API (UNO) directly.
See Scripting LibreOffice for the basics (although there's more info there about internal scripting than external), and the API documentation for details and examples.
But there may be an even simpler answer: unoconv is a simple command-line tool written using the UNO API that does exactly what you want. It starts up LibreOffice if necessary, sends it some commands, waits for the results, and then quits. So if you just use unoconv instead of libreoffice, call is all you need.
Also notice that unoconv is written in Python, and is designed to be used as a module. If you just import it, you can write your own (simpler, and use-case-specific) code to replace the "Main entrance" code, and not use subprocess at all. (Or, of course, you can tear apart the module and use the relevant code yourself, or just use it as a very nice piece of sample code for using UNO from Python.)
Also, the unoconv page linked above lists a variety of other similar tools, some that work via UNO and some that don't, so if it doesn't work for you, try the others.
If nothing else works, you could consider, e.g., creating a sentinel file and using a filesystem watch, so at least you'll be able to detect exactly when it's finished its work, instead of having to guess at a timeout. But that's a real last-ditch workaround that you shouldn't even consider until eliminating all of the other options.

If libreoffice is being using an intermediary (daemon) as mentioned by #mgilson, then one solution is to find out what program it's invoking, and then directly invoke it yourself.

python to capture output of another windows - GUI program

The situation is like this:
I want to capture the pop-ups of IPmsg.exe in my python program.
There is an easy way of doing it, which is reading from the log file. But I would like to know if this can be done without bringing log files into discussion.
For more on IPmsg.exe: http://ipmsg.org/index.html.en
That was being specific.
Now, what would be a generic approach to capturing the output of a windows based GUI program?

There are generally two ways to talk to GUI programs on Windows, if you hate the log files:
Use their command line interface! I doubt this has one that outputs to stdout as messages come in
Use the win32 api or a wrapper for it to search for specific windows (polling as necessary or installing hooks to find out when they appear) and then grabbing text from them using more api calls. See this question: Get text from popup window
+1 for using the log files by the way, far easier.

You can capture only the output from applications through Python that you start directly from Python e.g. using the subprocess module:
http://docs.python.org/library/subprocess.html
Otherwise you have basically no chance for reading the direct output of other applications.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.