I'm rewriting a simple MIDI music sequencer from JavaScript to Python as a way of teaching myself Python.
I'm ready to begin working with time (for firing MIDI events), but I can't find any good resources on executing code at specific times, scheduling timing events, etc.
A few things I've read suggest I should use a module like tkinter, but I would rather keep all the timing mechanisms independent of any GUI module.
Does anyone have any suggestions/resources for working with time?
For executing code at a certain interval (within another script, of course), you might want to take a look at the time module (documentation here).
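For illustration, here is a minimal sketch of firing events at scheduled times using only the time module; the event list and the send_note function are made-up placeholders for whatever actually emits the MIDI messages.

```python
import time

# Hypothetical event list: (seconds from start, MIDI note number).
events = [(0.0, 60), (0.5, 62), (1.0, 64), (1.5, 65)]

def send_note(note):
    # Placeholder for whatever actually fires the MIDI event.
    print(f"{time.monotonic():.3f}: note {note}")

start = time.monotonic()
for offset, note in events:
    # Sleep until each event is due, measured from a single start time
    # so small delays don't accumulate over the run.
    time.sleep(max(0.0, start + offset - time.monotonic()))
    send_note(note)
```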
But if you are planning to use timing with a GUI, you might want concurrent threading or processing so that there is no delay in the user interface. In that case you can use the threading (documentation) or multiprocessing (documentation) modules.
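As a rough sketch of the threading idea (the playback loop and its 0.5-second interval are placeholders), you could run the timing loop in a background thread so the main thread, for example a GUI loop, stays responsive:

```python
import threading
import time

def playback_loop(stop_event):
    # Fire events at a fixed interval until asked to stop.
    while not stop_event.is_set():
        print("tick", time.monotonic())   # stand-in for a MIDI event
        stop_event.wait(0.5)              # sleeps, but wakes early if stopped

stop = threading.Event()
worker = threading.Thread(target=playback_loop, args=(stop,), daemon=True)
worker.start()

time.sleep(3)        # the main thread (e.g. a GUI loop) keeps running here
stop.set()
worker.join()
```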
As a final note, some GUI frameworks come with built-in threading support, so you might want to take a look at that. For example, PyQt4 has something called QThread which handles all the thread/event manipulation.
I have a little Python application that I developed using wxPython 4.0.3. It performs a fairly simple ETL-type task:

1. Takes user input to select a parent directory containing several sub-directories full of CSV files (time series data).
2. Transforms the data in each of those files to comply with formatting required by a third-party recipient and writes them out to new files.
3. Zips each of the new files into a file named after the original file's parent directory.
4. At the user's initiative, the application then FTPs the zip files to the third party to be loaded into a data store.
The application works well enough, but the time required to process several thousand CSV files is pretty extreme, and from what I can tell it is mostly I/O-bound.
Is asyncio a reasonable option to pursue, or are there other recommendations anyone can make? I originally wrote this as a CLI and saw significant performance gains by using PyPy, but I was reluctant to combine PyPy with wxPython when I developed the UI for others.
Thanks for your guidance.
If you saw a significant speedup by using PyPy instead of CPython, that implies that your code probably isn't I/O-bound. Which means that making the I/O asynchronous isn't going to help very much. Plus, it'll be extra work, because you'll have to restructure all of your CPU-heavy tasks into small pieces that can await repeatedly so they don't block the other tasks.
So, you probably want to use multiple processes here.
The simplest solution is to use a concurrent.futures.ProcessPoolExecutor: just toss tasks at the executor, and it'll run them on the child processes and return you a Future.
Unlike using asyncio, you won't have to change those tasks at all. They can read a file just by looping over the csv module, process it all in one big chunk, and even use the synchronous ftplib module, without needing to worry about anyone blocking anyone else. Only your top-level code needs to change.
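A minimal sketch of that top-level change, assuming a hypothetical process_one_file() that holds your existing per-file logic and a made-up "data" parent directory:

```python
import concurrent.futures
import pathlib

def process_one_file(csv_path):
    # Placeholder: read the CSV, transform it, write/zip the result.
    # Everything in here stays ordinary synchronous code.
    return csv_path.name

if __name__ == "__main__":
    parent = pathlib.Path("data")                  # hypothetical parent directory
    csv_files = list(parent.rglob("*.csv"))

    with concurrent.futures.ProcessPoolExecutor() as pool:
        futures = {pool.submit(process_one_file, p): p for p in csv_files}
        for fut in concurrent.futures.as_completed(futures):
            print("done:", fut.result())
```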
However, you may want to consider splitting the code into a wx GUI that you run in CPython, and a multiprocessing engine that you run via subprocess in PyPy, which then spins off the ProcessPoolExecutor in PyPy as well. This would take a bit more work, but it means you'll get the CPU benefits of using PyPy, the well-tested-with-wx benefits of using CPython, and the parallelism of multiprocessing.
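Very roughly, and assuming a pypy3 executable on the PATH plus a hypothetical engine.py that holds the pool code above, the wx side could just launch the engine and read its progress lines:

```python
import subprocess

# Hypothetical: engine.py contains the ProcessPoolExecutor code above and
# prints one line per finished file; "pypy3" is assumed to be on the PATH.
proc = subprocess.Popen(
    ["pypy3", "engine.py", "/path/to/parent_dir"],
    stdout=subprocess.PIPE,
    text=True,
)
for line in proc.stdout:
    print("engine says:", line.rstrip())   # e.g. update a wx progress gauge here
proc.wait()
```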
Another option to consider is pulling in a library like NumPy or Pandas that can do the slow parts (whether that's reading and processing the CSV, or doing some kind of elementwise computation on thousands of rows, or whatever) more quickly (and possibly even releasing the GIL, meaning you don't need multiprocessing).
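For example, a hypothetical transformation step with pandas might look like this (the column names here are invented):

```python
import pandas as pd

def transform(csv_path, out_path):
    # Hypothetical transformation: parse timestamps, rename a column,
    # and write the result back out. Column names are made up.
    df = pd.read_csv(csv_path, parse_dates=["timestamp"])
    df = df.rename(columns={"value": "reading"})
    df.to_csv(out_path, index=False)
```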
If your code really is I/O-bound code, and primarily bound on the FTP requests, asyncio would help. But it would require rewriting a lot of code. You'd need to find or write an asyncio-driven FTP client library. And, if the file reading takes any significant part of your time, converting that to async is even more work.
There's also the problem of integrating the wx event loop with the asyncio event loop. You might be able to get away with running the asyncio loop in a second thread, but then you need to come up with some way of communicating between the wx event loop in the main thread and the asyncio loop in the background thread. Alternatively, you might be able to drive one loop from the other (or there might even be third-party libraries that do that for you). But this might be a lot easier to do with (or have better third-party libraries to help with) something like twisted instead of asyncio.
But, unless you need massive concurrency (which you probably don't, unless you've got hundreds of different FTP servers to talk to), threads should work just as well, with a lot fewer changes to your code. Just use a concurrent.futures.ThreadPoolExecutor, which is nearly identical to using a ProcessPoolExecutor as explained above.
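Here is a sketch of the thread-pool variant, using plain blocking ftplib uploads; the host, credentials, and "outgoing" directory are placeholders:

```python
import concurrent.futures
import ftplib
import pathlib

def upload_one(zip_path):
    # Placeholder FTP upload; host and credentials are invented.
    with ftplib.FTP("ftp.example.com", "user", "password") as ftp:
        with open(zip_path, "rb") as fh:
            ftp.storbinary(f"STOR {zip_path.name}", fh)
    return zip_path.name

zips = list(pathlib.Path("outgoing").glob("*.zip"))
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    for name in pool.map(upload_one, zips):
        print("uploaded:", name)
```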
Yes, you will probably benefit from using an asynchronous library. Since most of your time is spent waiting for I/O, a well-written asynchronous program will use that time to do something else, without the overhead of extra threads or processes. It will scale really well.
Hey, I am learning Python at the moment. I have written a few programs. Now I have a question:
Is it possible to run multiple "operations" at once?
As far as I know, a script runs from top to bottom (apart from things like function calls, if statements, and so on).
For example: I want to do something, wait 5 seconds, and then continue, but while my program "waits" it should be doing something else. (This one is very simple.)
Or: while checking for input, produce some other output.
The examples are very poor, but I can't find anything better at the moment. (If something comes to my mind, I will add it later.)
I hope you understand what my question is.
Cheers
TL;DR: Use an async approach. Raymond Hettinger is a god, and this talk explains this concept more accurately and thoroughly than I can. ;)
The behavior you are describing is called "concurrency" or "asynchronicity", where you have more than one "piece" of code executing "at the same time". This is one of the hardest problems in practical computer science, because adding the dimension of time causes scheduling problems in addition to logic problems. However, it is very much in demand these days because of multi-core processors and the inherently parallel environment of the internet.
"At the same time" is in quotes, because there are two basic ways to make this happen:
1. actually run the code at the same time, or
2. make it look like it is running at the same time.
The first option is called concurrent programming, and the second is called asynchronous programming (commonly "async").
Generally, "modern" programming seems to favor async, because it's easier to reason about and comes with fewer, less severe pitfalls. If you do it right, async programs can look a lot like the synchronous, procedural code you're already familiar with. Golang is basically built on the concept. Javascript has embraced "futures" in the form of Promises and async/await. I know it's not Python, but this talk by the creator of Go gives a good overview of the philosophy.
Python gives you three main ways to approach this, separated into three major modules: threading, multiprocessing, and asyncio.
multiprocessing and threading are concurrent solutions. They do very similar things, but accomplish them in slightly different ways by delegating to the OS in different ways. This answer has a concise explanation of the difference. Concurrency is notoriously difficult to debug, because it is not deterministic: small differences in timing can result in completely different sequences of execution. You also have to deal with "race conditions" in threads, where two bits of code want to read/change the same piece of shared state at the same time.
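As a tiny illustration of the shared-state problem, here is a made-up counter that two threads increment; without the lock, increments from the two threads can be lost:

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:        # without this, increments from both threads can be lost
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)            # 200000 with the lock; unpredictable without it
```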
asyncio, or "asynchronous input-output" is a more recent, async solution. You'll need at least Python 3.4. It uses event loops to allow long-running tasks to execute without "blocking" the rest of the program. Processes and threads do a similar thing, running two or more operations on even the same processor core by interrupting the running process periodically, forcing them to take turns. But with async, you decide where the turn-taking happens. It's like designing mature adults that interact cooperatively rather than designing kindergarteners that have to be baby-sat by the OS and forced to share the processor.
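A minimal asyncio sketch of the "wait 5 seconds but keep doing other work" example from the question (asyncio.run requires Python 3.7+; on 3.4–3.6 you would drive the event loop directly):

```python
import asyncio

async def wait_then_report():
    await asyncio.sleep(5)          # the "wait 5 seconds" from the question
    print("5 seconds are up")

async def do_other_stuff():
    for i in range(5):
        print("still doing other work:", i)
        await asyncio.sleep(1)      # yields the turn back to the event loop

async def main():
    # Both coroutines take turns on a single thread.
    await asyncio.gather(wait_then_report(), do_other_stuff())

asyncio.run(main())                 # asyncio.run needs Python 3.7+
```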
There are also third-party packages like gevent and eventlet that predate asyncio and work in earlier versions of Python. If you can afford to target Python >=3.4, I would recommend just using asyncio, because it's part of the Python core.
I have written a monitoring program for the control system at our plant. It is basically a GUI which lets the operator see the current status of the closed-loop system's lock and alerts the operator in case the lock/loop breaks.
Now, the operation is heavily dependent on the responsiveness of the GUI. My seniors told me that they would prefer plain console prints to a tkinter-based GUI, since tkinter lags when working in real time.
Can anyone please comment on this aspect?
Can this lag be checked and corrected?
Thanks in advance.
I would say that if your program is simply accessing data and not interacting with it, then a GUI seems to be a bit of overkill. GUIs are graphical user interfaces, as you know, and are made for guiding a user through an interface. If the interface is just a status display, as you indicated, then I see nothing wrong with a console program.
If, however, your program also interacts with data in a way that would be difficult without a GUI, then the GUI is likely the right choice.
Have you considered a GUI in another programming language? Python is known to be a bit slow, even in console. In my experience, C++ is faster in terms of viewing data. Best of luck!
Python / tkinter in general
In a tkinter program, your code falls into one of four categories:
1. Initialization code that runs before the mainloop is started.
2. Callbacks that are run from within the mainloop.
3. Code running in other threads.
4. Code running in different processes.
In the first case, the time the code takes only influences the startup time, which for a long-running program is probably not all that relevant.
Concerning the second case, well-written callbacks should not take long to run: on the order of tens of milliseconds, maybe up to 100 ms. If they take longer, they will render the GUI unresponsive. So unless you notice a sluggish GUI (without threads; see below) this should not be a problem.
One pitfall here is after callbacks, that is, functions scheduled with after() to run after a certain delay. If you launch them too often, this will also starve the GUI of time.
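For illustration, a small sketch of a well-behaved periodic after callback; read_status is a stand-in for however the real program reads the loop/lock status:

```python
import random
import tkinter as tk

def read_status():
    # Placeholder for however the real program reads the loop/lock status.
    return "LOCKED" if random.random() > 0.1 else "UNLOCKED"

def refresh():
    label.config(text=read_status())
    # Re-schedule ourselves; a 500 ms period keeps the callback cheap and rare.
    root.after(500, refresh)

root = tk.Tk()
label = tk.Label(root, text="starting...")
label.pack(padx=20, pady=20)
root.after(500, refresh)
root.mainloop()
```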
Another possible problem might be the manipulation of a Canvas with lots and lots of items in it.
As of Python 3.x, tkinter is thread-safe to the best of my understanding. However, in the reference implementation of Python, only one thread at a time can be executing Python bytecode. So doing heavy calculations in a second thread would slow down the GUI.
If your GUI uses multiprocessing to run calculations in another process, that should not influence the speed of your GUI much, unless you do things wrong when communicating with that other process.
Your monitoring program
What is too slow depends on the situation. In general Python is not considered a language suitable for hard real-time programs. To do hard real-time one also needs a suitable operating system.
So the question then becomes: what is the acceptable lag in your system specification? Without knowing that, it is impossible to answer your question precisely.
It seems that your GUI is just displaying some system status. That should not cause too much of a load, provided that you don't read/check the data too often. As described in the callbacks paragraph above it is possible to starve your GUI of CPU cycles with callbacks that run too often. From what you've written, I gather that the GUI's task is just to inform the human operator.
That leads me to believe that the task is not hugely time critical; a system that requires millisecond intervention time should not rely on a human operator.
So based on your information I would say that a competently written GUI should probably not be too slow.
I want to use SQLite for my Python GUI application, but I have to update the database every 500 ms without affecting the performance of my program.
I'm using PyQt4, so I thought about using QThread, but it seems difficult to deal with, so I wondered whether it is the best way before really trying to understand it.
My Question is: is QThread the best way or there are other ways?
Because the standard Python implementation (CPython) relies on the GIL, even using threads or a timer you won't be able to do something (potentially costly) in your program without affecting its overall performance.
I suggest you have a look at the multiprocessing module to get around this limitation. Using this module, you no longer use threads (which are affected by the GIL) but processes (which are not).
Maybe you could create a child process that arms a timer to perform the update every 500 ms while the main process carries on with its own work.
Then you let the operating system do the job of balancing the processes, which may be better in terms of responsiveness (especially in a multi-core environment).
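As a rough sketch of that idea (the database path, table, and dummy value are invented), a child process could do the 500 ms updates while the main process keeps running its Qt loop:

```python
import multiprocessing
import sqlite3
import time

def updater(db_path, stop_event):
    # Each process should open its own SQLite connection.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS readings (ts REAL, value REAL)")
    while not stop_event.is_set():
        conn.execute("INSERT INTO readings VALUES (?, ?)", (time.time(), 42.0))
        conn.commit()
        stop_event.wait(0.5)        # roughly every 500 ms
    conn.close()

if __name__ == "__main__":
    stop = multiprocessing.Event()
    proc = multiprocessing.Process(target=updater, args=("app.db", stop))
    proc.start()

    time.sleep(5)                   # stand-in for the PyQt main loop doing its job
    stop.set()
    proc.join()
```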
I am trying to write a game that does stuff in the background until you hit a key, so it's waiting for input while doing other stuff at the same time. I have never done something like this before but I hear the solution has to do with event handling. I am trying to use either the "asyncore" library or the "signal" library (Python), but I don't understand the documentation and I think I'm missing basic concepts. Can you explain to me how I might go about using signal handling? (Or maybe there's something else I can do?)
Thanks!
Python's asyncore library is for network communication, and the signal library is used for timers and operating system signals (not related to keyboard input).
To get started, you should find a Python game programming library that suits your purposes.
If you want to do something as seemingly simple as keyboard input without help from a game programming library, you'll quickly be forced to use native APIs like Win32 and X11. By using a game programming library, you'll get a chance to learn about events and background tasks first.
If you want to write a game in Python with SDL support, you should consider using pygame.
SDL: Simple DirectMedia Layer is a cross-platform multimedia library designed to provide low level access to audio, keyboard, mouse, joystick, 3D hardware via OpenGL, and 2D video framebuffer. [ http://www.libsdl.org/ ]
Pygame provides Python bindings for SDL: http://www.pygame.org
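For example, a minimal pygame loop can keep doing background work every frame while reacting to key presses as they arrive (the counter here is just a stand-in for the background work):

```python
import pygame

pygame.init()
screen = pygame.display.set_mode((320, 240))
clock = pygame.time.Clock()

counter = 0          # stand-in for the "stuff in the background"
running = True
while running:
    for event in pygame.event.get():         # non-blocking: keys arrive as events
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN:
            print("key pressed after", counter, "frames of background work")

    counter += 1                             # background work happens every frame
    screen.fill((0, 0, 0))
    pygame.display.flip()
    clock.tick(60)                           # cap at 60 frames per second

pygame.quit()
```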
But if you really want to do it the hard way, I think that you should consider using the multiprocessing package.
The reason is that your game should have a main loop which is used for analysing the inputs (mouse, keyboard) and updating the screen of your game. This process should not have too much overhead, or the game will show signs of poor performance...
The second process should be the worker process that runs the other stuff you want to do in the background...
The multiprocessing package gives you plenty of choices for inter-process communication (Pipe, Queue, Event)... http://docs.python.org/library/multiprocessing.html
To conclude, whether or not you use a framework for your game, your background stuff should run in a different process than your game's main loop. (Threading in Python is only good for heavy I/O, so it's not the package you want right now.)
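A rough sketch of that layout, with the background work and the timing values made up: a worker process pushes results onto a Queue, and the main loop polls it without ever blocking.

```python
import multiprocessing
import queue
import time

def background_worker(results):
    # Placeholder background task: produce a result every second.
    for i in range(5):
        time.sleep(1)
        results.put(f"result {i}")
    results.put(None)                    # sentinel: tell the main loop we're done

if __name__ == "__main__":
    results = multiprocessing.Queue()
    worker = multiprocessing.Process(target=background_worker, args=(results,))
    worker.start()

    done = False
    while not done:                      # stand-in for the game's main loop
        try:
            item = results.get_nowait()  # never blocks the loop
        except queue.Empty:
            pass
        else:
            if item is None:
                done = True
            else:
                print("main loop received:", item)
        time.sleep(0.016)                # ~60 iterations per second

    worker.join()
```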