Non-interfering printing from threads - Python

I am doing some threading with a Python script and it spits out stuff in all kinds of orders. However, I want to print out a single "Remaining: x" at the end of each thread and then erase that line before the next print statement. So basically, I'm trying to implement a progress/status update that erases itself before the next print statement.
I have something like this:
import sys
from time import sleep

for i in range(1, 10):
    print "Something here"
    print "Remaining: x"
    sleep(5)
    sys.stdout.write("\033[F")  # move the cursor up one line
    sys.stdout.write("\033[K")  # erase that line
This works fine when you're printing this out just the way it is; however, as soon as you implement threading, the "Remaining" text doesn't always get wiped out, and sometimes you get another "Something here" right before it wipes out the previous line.
Just trying to figure out the best way to get my progress/status text organized with multithreading.

Since your code doesn't include any threads, it's difficult to give you specific advice about what you are doing wrong.
If it's output sequencing that bothers you, however, you should learn about locking and have the threads share a lock. The idea is to grab the lock (so nobody else can), write your output, and flush the stream before releasing the lock.
However, the way the code is structured makes this difficult, since there is no way to guarantee that the cursor will always end up at a specific position when you have multiple threads writing output.
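A minimal sketch of the locking idea (the lock and helper function names here are mine, not from the question):

import sys
import threading

print_lock = threading.Lock()

def locked_print(msg):
    # Grab the lock so no other thread can interleave its output,
    # then write and flush before the with-block releases the lock.
    with print_lock:
        sys.stdout.write(msg)
        sys.stdout.flush()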

As the previous commenter mentioned, you can use a mutex (RLock) object to serialize access.
import sys
import threading
from time import sleep

# global lock
stdout_lock = threading.RLock()

def log_stdout(*args):
    msg = "".join(args)
    # Hold the lock while writing so output from other threads cannot interleave.
    with stdout_lock:
        sys.stdout.write(msg)

for i in range(1, 10):
    log_stdout("Something here\n")
    log_stdout("Remaining: x\n")
    sleep(5)
    log_stdout("\033[F")  # cursor up one line
    log_stdout("\033[K")  # erase to end of line

Related

print information in a corner when using python/ipython REPL -- and quitting it while thread is running

I am using a Python library I wrote to interact with a custom USB device. The library needs to send and receive data. I also need to interactively call methods. At the moment I use two shells, one receiving only and the other sending only. The latter is in the (i)python REPL. It works but it is clumsy, so I want to consolidate the two things in a single shell, which will have the advantage of giving access to data structures from both sides in one context. That works fine. The problem is in the UI.
In fact, the receiving part needs to asynchronously print some information. So I wrote something like the following:
#!/usr/bin/python
import threading
import time
import blessings

TOTAL = 5

def print_time(thread_id, delay):
    count = 0
    t = blessings.Terminal()
    while count < TOTAL:
        time.sleep(delay)
        count += 1
        stuff = "Thread " + str(thread_id) + " " + str(time.ctime(time.time())) + " -- " + str(TOTAL - count) + " to go"
        with t.location(t.width - len(stuff) - 1, thread_id):
            print(stuff, end=None)
            print("", end="")  # just return the cursor

try:
    t1 = threading.Thread(target=print_time, args=(1, 2))
    t1.start()
    print("Thread started")
except:
    print("Error: unable to start thread")
That is my __init__.py file for the module. It somewhat works, but it has two problems:
While the thread is running, you cannot exit the REPL with either CTRL-D or sys.exit() (that is the reason I am using TOTAL=5 above, so your life is easier if you try this code). This is a problem since my actual thread needs to be an infinite loop. I guess one solution could be to exit via a custom call which will cause a break into that infinite loop, but is there anything better?
The cursor does not return correctly to its earlier position
if I remove the end="" in the line with the comment # just return the cursor, it sort of works, but obviously prints an unwanted newline in the place the cursor was (which messes up other input and/or output which might be happening there, in addition to adding that unwanted newline)
if I leave the end="" it does not return the cursor, not even if I add something to print, e.g. print(".", end="") -- the dots . are printed at the right place, but the blinking cursor and the input are printed at the top
I know these are two unrelated problems and I could have asked two separate questions, but I need an answer to both, or otherwise it's a moot point. Alternatively, I am open to other solutions. I thought of a separate GTK window, and that might work, but it's a subpar solution, since I really would like this to work in CLI only (to keep it possible in a ssh-without-X-tunneling setup).
Using blessed instead of blessings does not have the problem with the cursor not returning to the previous position, even without anything outside of the with context.
Making the thread a daemon solves the other problem.
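For example, a sketch of both fixes applied to the question's code (blessed is a maintained fork exposing the same Terminal API as blessings):

import threading
import blessed  # drop-in: blessed.Terminal() replaces blessings.Terminal()

t1 = threading.Thread(target=print_time, args=(1, 2))
t1.daemon = True  # a daemon thread no longer keeps the REPL alive, so CTRL-D works
t1.start()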

How to make a python script stopable from another script?

TL;DR: If you have a program that should run for an undetermined amount of time, how do you code something to stop it when the user decides it is time? (Without KeyboardInterrupt or killing the task)
--
I've recently posted this question: How to make my code stopable? (Not killing/interrupting)
The answers did address my question, but from a termination/interruption point of view, and that's not really what I wanted. (Although my question didn't make that clear.)
So, I'm rephrasing it.
I created a generic script for example purposes. So I have this class that gathers data from a generic API and writes the data into a CSV. The code is started by typing python main.py in a terminal window.
import time, csv
import GenericAPI

class GenericDataCollector:
    def __init__(self):
        self.generic_api = GenericAPI()
        self.loop_control = True

    def collect_data(self):
        while self.loop_control: # Can this var be changed from outside of the class? (Maybe one solution)
            data = self.generic_api.fetch_data() # Returns a JSON with some data
            self.write_on_csv(data)
            time.sleep(1)

    def write_on_csv(self, data):
        with open('file.csv', 'wt') as f:
            writer = csv.writer(f)
            writer.writerow(data)

def run():
    obj = GenericDataCollector()
    obj.collect_data()

if __name__ == "__main__":
    run()
The script is supposed to run forever OR until I command it to stop. I know I can just use KeyboardInterrupt (Ctrl+C) or abruptly kill the task. That isn't what I'm looking for. I want a "soft" way to tell the script it's time to stop, not only because interruption can be unpredictable, but also because it's a harsh way to stop.
If that script was running on a docker container (for example) you wouldn't be able to Ctrl+C unless you happen to be in the terminal/bash inside the docker.
Or another situation: if that script was made for a customer, I don't think it's OK to tell the customer to just use Ctrl+C/kill the task to stop it. Definitely counterintuitive, especially if it's a non-technical person.
I'm looking for a way to code another script (assuming that's a possible solution) that would change the attribute obj.loop_control to False, finishing the loop once the current iteration completes. Something that could be run by typing python stop_script.py in a (different) terminal.
It doesn't necessarily need to be this way. Other solutions are also acceptable, as long as they don't involve KeyboardInterrupt or killing tasks. If I could use a method inside the class, that would be great, as long as I can call it from another terminal/script.
Is there a way to do this?
If you have a program that should run for an undetermined amount of time, how do you code something to stop it when the user decide it is time?
In general, there are two main ways of doing this (as far as I can see). The first one would be to make your script check some condition that can be modified from outside (like the existence or the content of some file/socket). Or, as #Green Cloak Guy stated, using pipes, which is one form of interprocess communication.
The second one would be to use the built-in mechanism for interprocess communication called signals that exists in every OS where Python runs. When the user presses Ctrl+C, the terminal sends a specific signal to the process in the foreground. But you can send the same (or another) signal programmatically (i.e. from another script).
Reading the answers to your other question, I would say that what is missing to address this one is a way to send the appropriate signal to your already running process. Essentially this can be done by using the os.kill() function. Note that although the function is called 'kill', it can send any signal (not only SIGKILL).
In order for this to work you need the process id of the running process. A commonly used way is to make your script save its process id, when it launches, into a file stored in a common location. To get the current process id you can use the os.getpid() function.
So, summarizing, I'd say that the steps to achieve what you want would be:
Modify your current script to store its process id (obtained with os.getpid()) in a file in a common location, for example /tmp/myscript.pid. Note that if you want your script to be portable, you will need to address this in a way that works on non-Unix-like OSs such as Windows.
Choose one signal (typically SIGINT or SIGTERM; note that SIGSTOP and SIGKILL cannot be caught) and modify your script to register a custom handler using signal.signal() that takes care of the graceful termination of your script.
Create another script (note that it could be the same script with some command line parameter) that reads the process id from the known file (i.e. /tmp/myscript.pid) and sends the chosen signal to that process using os.kill().
Note that an advantage of using signals to achieve this instead of an external way (files, pipes, etc.) is that the user can still press Ctrl+C (if you chose SIGINT) and that will produce the same behavior as the 'stop script' would.
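Putting those steps together, a minimal sketch might look like this (the pid-file path and handler name are illustrative, and it assumes a Unix-like OS):

# main.py (sketch)
import os
import signal
import time

PID_FILE = "/tmp/myscript.pid"  # well-known location agreed on by both scripts
running = True

def handle_stop(signum, frame):
    # Just flip the flag; the loop finishes its current iteration and exits cleanly.
    global running
    running = False

if __name__ == "__main__":
    with open(PID_FILE, "w") as f:
        f.write(str(os.getpid()))               # step 1: store our process id
    signal.signal(signal.SIGTERM, handle_stop)  # step 2: register the handler
    signal.signal(signal.SIGINT, handle_stop)   # Ctrl+C now behaves like the stop script
    while running:
        # ... fetch data and write the CSV row here ...
        time.sleep(1)
    os.remove(PID_FILE)                         # clean up after a graceful stop

And the stop script (step 3) just reads the pid and sends the signal:

# stop_script.py (sketch)
import os
import signal

with open("/tmp/myscript.pid") as f:
    pid = int(f.read())
os.kill(pid, signal.SIGTERM)  # despite the name, this just delivers SIGTERM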
What you're really looking for is a way to send a signal from one program to another, independent program. One way to do this would be to use an inter-process pipe. Python has a module for this (which does, admittedly, seem to require a POSIX-compliant shell, but most major operating systems should provide that).
What you'll have to do is agree on a filepath beforehand between your running-program (let's say main.py) and your stopping-program (let's say stop.sh). Then you might make the main program run until someone inputs something to that pipe:
import pipes
...
t = pipes.Template()
# create a pipe in the first place
t.open("/tmp/pipefile", "w")
# create a lasting pipe to read from that
pipefile = t.open("/tmp/pipefile", "r")
...
And now, inside your program, change your loop condition to "as long as there's no input from this file". Unless someone writes something to it, .read() will return an empty string:
while not pipefile.read():
    # do stuff
To stop it, you need another script (or anything else) that writes to that file. This is easiest to do with a shell script:
#!/usr/bin/env sh
echo STOP >> /tmp/pipefile
which, if you're containerizing this, you could put in /usr/bin and name it stop, give it at least 0111 permissions, and tell your user "to stop the program, just do docker exec containername stop".
(using >> instead of > is important because we just want to append to the pipe, not to overwrite it).
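Applied to the collect_data loop from the question, the check might look like this (a sketch along the lines above, untested):

import pipes
import time

class GenericDataCollector:
    # __init__ and write_on_csv as in the question ...

    def collect_data(self):
        t = pipes.Template()
        t.open("/tmp/pipefile", "w")             # create the pipe file up front
        pipefile = t.open("/tmp/pipefile", "r")  # lasting handle to read from
        while not pipefile.read():               # '' until the stop script writes
            data = self.generic_api.fetch_data()
            self.write_on_csv(data)
            time.sleep(1)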
Proof of concept on my python console:
>>> import pipes
>>> t = pipes.Template()
>>> t.open("/tmp/file1", "w")
<_io.TextIOWrapper name='/tmp/file1' mode='w' encoding='UTF-8'>
>>> pipefile = t.open("/tmp/file1", "r")
>>> i = 0
>>> while not pipefile.read():
...     i += 1
...
At this point I go to a different terminal tab and do
$ echo "Stop" >> /tmp/file1
then I go back to my python tab, and the while loop is no longer executing, so I can check what happened to i while I was gone.
>>> print(i)
1704312

Add time delay between every statement of python code

Is there an easy way to execute time delay (like time.sleep(3)) between every statement of Python code without having to explicitly write between every statement?
Take the Python script below, which performs certain actions in an SAP GUI window. Sometimes, the script continues to the next statement before the previous statement is complete. So, I had to add a time delay between every statement so that it executes correctly. It is working with the delays, but I end up adding time.sleep(3) between every line. Just wondering if there is a better way?
import win32com.client
import time

sapgui = win32com.client.GetObject("SAPGUI").GetScriptingEngine
session = sapgui.FindById("ses[0]")

def add_record(employee_num, start_date, comp_code):
    try:
        time.sleep(3)
        session.findById("wnd[0]/tbar[0]/okcd").text = "/npa40"
        time.sleep(3)
        session.findById("wnd[0]").sendVKey(0)
        time.sleep(3)
        session.findById("wnd[0]/usr/ctxtRP50G-PERNR").text = employee_num
        time.sleep(3)
        session.findById("wnd[0]").sendVKey(0)
        time.sleep(3)
        session.findById("wnd[0]/usr/ctxtRP50G-EINDA").text = start_date
        time.sleep(3)
        session.findById("wnd[0]/usr/tblSAPMP50ATC_MENU_EVENT/ctxtRP50G-WERKS[1,0]").text = comp_code
        time.sleep(3)
        session.findById("wnd[0]/usr/tblSAPMP50ATC_MENU_EVENT/ctxtRP50G-PERSG[2,0]").text = "1"
        time.sleep(3)
        session.findById("wnd[0]/usr/tblSAPMP50ATC_MENU_EVENT/ctxtRP50G-PERSK[3,0]").text = "U1"
        time.sleep(3)
        session.findById("wnd[0]/usr/tblSAPMP50ATC_MENU_EVENT").getAbsoluteRow(0).selected = True
        time.sleep(3)
        return "Pass"
    except:
        return "failed"
The right way to do what you asked for is almost certainly to use the debugger, pdb.
The right way to do what you want is probably something completely different: find some signal that tells you that the step is done, and wait for that signal. With problems like this, almost any time you pick will be way, way too long 99% of the time, but still too short 1% of the time. That signal may be joining a thread, or waiting on a (threading or multiprocessing) Condition, or getting from a queue, or awaiting a coroutine or future, or setting the sync flag on an AppleEvent, or… It really depends on what you're doing.
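For illustration, if the slow step ran on a worker thread, the waiting could look like this (a generic sketch using threading.Event, not SAP-specific):

import threading

step_done = threading.Event()

def worker():
    # ... perform the slow step ...
    step_done.set()  # announce completion instead of guessing a sleep length

threading.Thread(target=worker).start()
step_done.wait(timeout=30)  # blocks until set() is called, or gives up after 30s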
But if you really want to do this, you can use sys.settrace:
import sys
import time

def sleeper(frame, event, arg):
    # Called by the interpreter for every trace event; sleep on each new line.
    if event == 'line':
        time.sleep(2)
    return sleeper

sys.settrace(sleeper)
One small problem is that the notion of line used by the interpreter may well not be what you want. Briefly, a 'line' trace event is triggered whenever the ceval loop jumps to a different lnotab entry (see lnotab_notes.txt in the CPython source to understand what that means—and you'll probably need at least a passing understanding of how bytecode is interpreted, e.g. from reading over the dis docs, to follow that). So, for example, a multiline expression is a single line; the line of a with statement may appear twice, etc.1
And there's probably an even bigger problem.
Sometimes, the script continues to the next step before the previous step is fully complete.
I don't know what those steps are, but if you put the whole thread to sleep for 2 seconds, there's a good chance the step you're waiting for won't make any progress, because the thread is asleep. (For example, you're not looping through any async or GUI event loops, because you're doing nothing at all.) If so, then after 2 seconds, it'll still be just as incomplete as it was before, and you'll have wasted 2 seconds for nothing.
1. If your notion of "line" is closer to what's described in the reference docs on lexing and parsing Python, you could create an import hook that walks the AST and adds an expression statement with a Call to time.sleep(2) after each list element in each body of a module, definition, or compound statement (and then compiles and execs the result as usual).
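A rough sketch of just that AST rewrite, leaving out the import-hook plumbing (assumes Python 3.8+ for ast.Constant):

import ast
import time

def instrument(source):
    # Insert a time.sleep(2) call after every statement in every body.
    tree = ast.parse(source)
    def make_sleep():
        return ast.Expr(value=ast.Call(
            func=ast.Attribute(value=ast.Name(id="time", ctx=ast.Load()),
                               attr="sleep", ctx=ast.Load()),
            args=[ast.Constant(2)], keywords=[]))
    for node in ast.walk(tree):
        # body/orelse/finalbody are the statement lists of modules,
        # definitions, and compound statements.
        for field in ("body", "orelse", "finalbody"):
            stmts = getattr(node, field, None)
            if isinstance(stmts, list) and stmts and isinstance(stmts[0], ast.stmt):
                new_stmts = []
                for stmt in stmts:
                    new_stmts.extend([stmt, make_sleep()])
                setattr(node, field, new_stmts)
    ast.fix_missing_locations(tree)
    return compile(tree, "<instrumented>", "exec")

exec(instrument("print('a')\nprint('b')"), {"time": time})  # ~2s pause after each print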
Anything you want to happen in a program has to be explicitly stated - this is the nature of programming. This is like asking if you can print hello world without calling print("hello world").
I think the best advice to give you here is: don't think in terms of "lines", but think in terms of functions.
Use the debugger and watch each line as it executes, stepping line by line.

How to keep a While True loop running with raw_input() if inputs are seldom?

I'm currently working on a project where I need to send data via serial persistently but need to occasionally change that data based on new inputs. My issue is that my current loop only runs when a new input is offered by raw_input(). Nothing runs again until another raw_input() is received.
My current (very slimmed down) loop looks like this:
while True:
    foo = raw_input()
    print(foo)
I would like for the latest values to be printed (or passed to another function) constantly regardless of how often changes occur.
Any help is appreciated.
The select (or in Python 3.4+, selectors) module can allow you to solve this without threading, while still performing periodic updates.
Basically, you just write the normal loop but use select to determine if new input is available, and if so, grab it:
import select
import sys

foo = ""  # last value received; keeps printing until new input arrives
while True:
    # Polls for availability of data on stdin without blocking
    if select.select((sys.stdin,), (), (), 0)[0]:
        foo = raw_input()
    print(foo)
As written, this would print far more than you probably want; you could either time.sleep after each print, or change the timeout argument of select.select to something other than 0. If you make it 1, for instance, then you'll update immediately when new data is available, and otherwise wait a second before giving up and printing the old data again.
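For instance, the one-second-timeout variant just described would look like this (a sketch):

import select
import sys

foo = ""
while True:
    # Block for up to 1 second waiting for input; print the latest value either way.
    if select.select((sys.stdin,), (), (), 1)[0]:
        foo = raw_input()
    print(foo)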
How will you type in your data at the same time while data is being printed?
However, you can use multithreading if you make sure your source of data doesn't interfere with your output of data.
import thread

def give_output():
    while True:
        pass # output stuff here

def get_input():
    while True:
        pass # get input here

thread.start_new_thread(give_output, ())
thread.start_new_thread(get_input, ())
# Note: keep the main thread alive (e.g. with a loop or join),
# or the program exits as soon as it reaches the end of the script.
Your source of data could be another program. You could connect them using a file or a socket.
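Alternatively, one way to fill in the threaded skeleton above, assuming the goal is simply to keep printing the latest line typed (a sketch using a lock-protected shared variable; the names are mine):

import thread
import threading
import time

latest = ""
lock = threading.Lock()

def give_output():
    while True:
        with lock:
            print(latest)  # or pass the latest value to another function here
        time.sleep(1)

def get_input():
    global latest
    while True:
        line = raw_input()  # blocks, but only this thread waits on it
        with lock:
            latest = line

thread.start_new_thread(give_output, ())
get_input()  # run input in the main thread so the program stays alive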

Thread-Switching in PyDbg

I've tried posting this on the Reverse Engineering Stack Exchange, but I thought I'd cross-post it here for more visibility.
I'm having trouble switching from debugging one thread to another in pydbg. I don't have much experience with multithreading, so I'm hoping that I'm just missing something obvious.
Basically, I want to suspend all threads, then start single stepping in one thread. In my case, there are two threads.
First, I suspend all threads. Then, I set a breakpoint on the location where EIP will be when thread 2 is resumed. (This location is confirmed by using IDA). Then, I enable single-stepping as I would in any other context, and resume Thread 2.
However, pydbg doesn't seem to catch the breakpoint exception! Thread 2 seems to resume, and even though it MUST hit that address, there is no indication that pydbg is catching the breakpoint exception. I added a print "HIT BREAKPOINT" statement inside pydbg's internal breakpoint handler, and it never seems to be called after resuming Thread 2.
I'm not too sure about where to go next, so any suggestions are appreciated!
dbg.suspend_all_threads()
print dbg.enumerate_threads()[0]
oldcontext = dbg.get_thread_context(thread_id=dbg.enumerate_threads()[0])
if (dbg.disasm(oldcontext.Eip) == "ret"):
    print disasm_at(dbg, oldcontext.Eip)
    print "Thread EIP at a ret"
    addrstr = int("0x" + (dbg.read(oldcontext.Esp + 4, 4))[::-1].encode("hex"), 16)
    print hex(addrstr)
    dbg.bp_set(0x7C90D21A, handler=Thread_Start_bp_Handler)
    print dbg.read(0x7C90D21A, 1).encode("hex")
    dbg.bp_set(oldcontext.Eip + dbg.instruction.length, handler=Thread_Start_bp_Handler)
    dbg.set_thread_context(oldcontext, thread_id=dbg.enumerate_threads()[0])
    dbg.context = oldcontext
    dbg.resume_thread(dbg.enumerate_threads()[0])
    dbg.single_step(enable=True)
return DBG_CONTINUE
Sorry about the "magic numbers", but they are correct as far as I can tell.
One of your problems is that you try to single-step through Thread 2 while you only refer to Thread 1 in your code:
dbg.enumerate_threads()[0] # <--- Returns a handle to the first thread.
In addition, the code that you posted does not reflect the complete structure of your script, which makes it hard to judge whether you have other errors or not. You also try to set a breakpoint within the sub-branch that disassembles your instructions, which does not make a lot of sense to me logically. Let me try to explain what I know and lay it out in an organized manner. That way you might look back at your code, rethink it, and correct it.
Let's start with basic framework of debugging an application with pydbg:
Create debugger instance
Attach to the process
Set breakpoints
Run it
Breakpoint gets hit - handle it.
This is what it could look like:
from pydbg import *
from pydbg.defines import *

# This is the maximum number of instructions we will log
MAX_INSTRUCTIONS = 20

# Address of the breakpoint (an integer, not a string)
func_address = 0x7C90D21A

# Create debugger instance
dbg = pydbg()

# PID to attach to
pid = int(raw_input("Enter PID: "))

# Attach to the process with the debugger instance created earlier.
# Attaching the debugger will pause the process.
dbg.attach(pid)

# Let's set the breakpoint and handler as thread_step_setter,
# which we will define a little later...
dbg.bp_set(func_address, handler=thread_step_setter)

# Let's set our "personalized" handler for the Single Step Exception.
# It will get triggered if execution of a thread goes into single step mode.
dbg.set_callback(EXCEPTION_SINGLE_STEP, single_step_handler)

# Setup is done. Let's run it...
dbg.run()
Now, having the basic structure, let's define our personalized handlers for the breakpoint and for single stepping. The code snippet below defines our "custom" handlers. What will happen is: when the breakpoint hits, we will iterate through the threads and set them to single-step mode. That will in turn trigger the single step exception, which we will handle and disassemble MAX_INSTRUCTIONS worth of instructions:
# Note: the counter must be initialized once, before dbg.run() is called:
total_instructions = 0

def thread_step_setter(dbg):
    dbg.suspend_all_threads()
    for thread_id in dbg.enumerate_threads():
        print "Single step for thread: 0x%08x" % thread_id
        h_thread = dbg.open_thread(thread_id)
        dbg.single_step(True, h_thread)
        dbg.close_handle(h_thread)
    # Resume execution, which will pass control to the step handler
    dbg.resume_all_threads()
    return DBG_CONTINUE

def single_step_handler(dbg):
    global total_instructions
    if total_instructions == MAX_INSTRUCTIONS:
        dbg.single_step(False)
        return DBG_CONTINUE
    else:
        # Disassemble the instruction
        current_instruction = dbg.disasm(dbg.context.Eip)
        print "#%d\t0x%08x : %s" % (total_instructions, dbg.context.Eip, current_instruction)
        total_instructions += 1
        dbg.single_step(True)
        return DBG_CONTINUE
Disclaimer: I do not guarantee that the code above will work if copied and pasted. I typed it out and haven't tested it. However, with a basic understanding in hand, any small syntactical errors can easily be fixed. I apologize in advance if there are any; I don't currently have the means or time to test it.
I really hope it helps you out.
