AppJar threading to allow for GUI updates - python

I have been studying the AppJar documentation for the last few hours, but I really can't seem to figure out how to get the GUI to update during the data processing. I split the 4 main functions into different threads, and within the threads I added the update function as a .queuefunction, but the GUI still hangs until everything has completed.
This is the update function I wrote:
label_status = ["Ready"]

def update_label():
    app.setLabel("status_label", label_status[-1])
I then broke down the process into 4 threads, but it didn't change anything compared to before. So I'm guessing I missed something pretty obvious here, but I can't find it.
def press(button):
    """ Process a button press

    Args:
        button: The name of the button. Either Process or Quit
    """
    if button == "Process":
        global label_status
        global output_directory
        global filename_out
        src_file = app.getEntry("input_file")
        output_directory = app.getEntry("output_directory")
        filename_out = app.getEntry("output_name")
        errors, error_msg = validate_inputs(src_file, output_directory, filename_out)
        if errors:
            label_status.append("Error")
            update_label()
            app.errorBox("Error", "\n".join(error_msg), parent=None)
            return label_status
        else:
            # Create single xlsx doc from data
            trimmed_input = src_file[:-4]
            app.thread(create_xlsx_file(trimmed_input))
            # add graphs to excel file
            app.thread(add_graphs())
            # clean temporary files
            app.thread(clean_files())
            # move output.xlsx to location chosen with filename chosen
            app.thread(move_output())
I have attempted to update the GUI within the threads in the following way:
def clean_files():
    label_status.append("Cleaning temporary files")
    app.queueFunction(update_label())
    file_path = os.path.join("csv_output/" + "temp*")
    del_files = glob.glob(file_path)
    for files in del_files:
        os.remove(files)
Since I'm appending to a list, I can see all statuses are being added, but only the first and last are displayed to the user. What am I missing here?

It looks like all your calls to app.thread() and app.queueFunction() are being made incorrectly.
You should be passing just the name of a function. But, because you've put brackets after the function names, you're actually passing the result of calling the functions: the calls run immediately, on the GUI thread, which is why everything still hangs.
Try it without the brackets after the function names, e.g. app.queueFunction(update_label). Any arguments go in as extra parameters, e.g. app.thread(create_xlsx_file, trimmed_input).
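To see why the brackets matter, here is a runnable sketch (plain Python 3, no appJar: the queue stands in for appJar's internal GUI event queue, and update_label/results stand in for the real label update):

```python
import queue
import threading

# A minimal stand-in for appJar's GUI event queue: the worker thread queues
# *function objects*; the "GUI" thread calls them later.
gui_queue = queue.Queue()
results = []

def update_label(text):
    results.append(text)  # stands in for app.setLabel(...)

def worker():
    # Correct: queue the function and its arguments separately.
    gui_queue.put((update_label, ("Cleaning temporary files",)))
    # Wrong would be: gui_queue.put(update_label("...")) -- that *calls* the
    # function immediately on the worker thread and queues its return value (None).

t = threading.Thread(target=worker)
t.start()
t.join()

# The GUI loop drains the queue and invokes the queued callables.
while not gui_queue.empty():
    func, args = gui_queue.get()
    func(*args)

print(results)  # ['Cleaning temporary files']
```

This is only illustrative; appJar does the queue draining for you once it is handed a function object rather than a call result.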

Related

Chaining Python Scripts

I have two user-defined python scripts. The first takes a file and processes it, while the second takes the output of the first and runs an executable, supplying the output of the first script to the program with additional formatting.
I need to run these scripts via another python script, which is my main executable script.
I searched a bit about this topic and:
I can use importlib to gather the content of the scripts so that I can call them at appropriate times. This requires the scripts to be under my directory (or a modification to the path environment variable), so it is ugly-looking at best and doesn't seem Pythonic.
The built-in eval function. This requires the user to write a server-client-like structure, because the second script might have to run the said program more than once while the first script is still producing output.
I think I'm designing something wrong, but I cannot come up with a better approach.
A more detailed explanation (maybe gibberish)
I need to benchmark some programs. While doing so I have a standard form of data, and this data needs to be supplied to the benchmark programs. The scripts are (due to the nature of the benchmark) specific to each program and need to be bundled with the benchmark definition, yet I need to create this program as a standalone configurable tester. I think I have designed something wrong, and would love to hear the design approaches.
PS: I do not want to limit the user, and this is the reason why I choose to run python scripts.
I created a few test scripts to make sure this works.
The first one (count_01.py) sleeps for 100 seconds, then counts from 0 to 99 and sends it to count_01.output.
The second one (count_02.py) reads the output of first one (count_01.output) and adds 1 to each number and writes that to count_02.output.
The third script (chaining_programs.py) runs the first one and waits for it to finish before calling the second one.
# count_01.py --------------------
from time import sleep

sleep(100)
filename = "count_01.output"
file_write = open(filename, "w")
for i in range(100):
    #print " i = " + str(i)
    output_string = str(i)
    file_write.write(output_string)
    file_write.write("\n")
file_write.close()
# ---------------------------------
# count_02.py --------------------
file_in = "count_01.output"
file_out = "count_02.output"
file_read = open(file_in, "r")
file_write = open(file_out, "w")
for i in range(100):
    line_in = file_read.next()
    line_out = str(int(line_in) + 1)
    file_write.write(line_out)
    file_write.write("\n")
file_read.close()
file_write.close()
# ---------------------------------
# chaining_programs.py -------------------------------------------------------
import subprocess
import sys
#-----------------------------------------------------------------------------
path_python = 'C:\Python27\python.exe' # 'C:\\Python27\\python.exe'
#
# single slashes did not work
#program_to_run = 'C:\Users\aaaaa\workspace\Rich_Project_044_New_Snippets\source\count.py'
program_to_run_01 = 'C:\\Users\\aaaaa\\workspace\\Rich_Project_044_New_Snippets\\source\\count_01.py'
program_to_run_02 = 'C:\\Users\\aaaaa\\workspace\\Rich_Project_044_New_Snippets\\source\\count_02.py'
#-----------------------------------------------------------------------------
# waits
sys.pid = subprocess.call([path_python, program_to_run_01])
# does not wait
sys.pid = subprocess.Popen([path_python, program_to_run_02])
#-----------------------------------------------------------------------------
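The same wait-then-run chain can be sketched without hard-coded interpreter paths: sys.executable points at the current interpreter, and the -c one-liners below are stand-ins for count_01.py and count_02.py.

```python
import subprocess
import sys

# Stand-ins for the two scripts: step 1 writes some numbers, step 2 adds 1 to each.
step_1 = [sys.executable, "-c",
          "open('count_01.output', 'w').write('0\\n1\\n2\\n')"]
step_2 = [sys.executable, "-c",
          "lines = open('count_01.output').read().split();"
          "open('count_02.output', 'w').write('\\n'.join(str(int(n) + 1) for n in lines))"]

subprocess.check_call(step_1)    # blocks until step 1 finishes, raises on failure
proc = subprocess.Popen(step_2)  # Popen does NOT wait by itself...
proc.wait()                      # ...this is what makes it block, mirroring call()

print(open("count_02.output").read().split())  # ['1', '2', '3']
```

The key point is the difference between subprocess.call/check_call (which wait) and subprocess.Popen (which returns immediately unless you call .wait() on it).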

Python 2.7 Multiprocessing Pool for a list of Strings?

I'm new to Python (disclaimer: I'm new to programming and I've been reading python online for two weeks) and I've written a simple multi-processing script that should allow me to use four subprocesses at once. I was using a global variable (YES, I KNOW BETTER NOW) to keep track of how many processes were running at once. Start a new process, increment by one; end a process, decrement by one. This was messy but I was only focused on getting the multi-processes working, which it does.
So far I've been doing the equivalent of:
processes = 0

def function(value):
    global processes
    # do stuff to value
    processes -= 1

while read line:
    if processes < 4:
        processes += 1
        # create a new subprocess - function(line)
1: I need to keep track of processes in a better way than a global. I saw some use of a 'pool' in python to have 4 workers, but I failed hard at it. I like the idea of a pool but I don't know how to pass each line of a list to the next worker. Thoughts?
2: On general principles, why is my global var decrement not working? I know it's ugly, but I at least expected it to be ugly and successful.
3: I know I'm not locking the var before editing, I was going to add that once the decrementation was working properly.
Sorry if that's horrible pseudo-code, but I think you can see the gist. Here is the real code if you want to dive in:
MAX_THREADS = 4
CURRENT_THREADS = 0
MAX_LOAD = 8

# Iterate through all users in the userlist and call the funWork function on each user
def funReader(filename):
    # I defined the logger in detail above, I skipped about 200 lines of code to get it slimmed down
    logger.info("Starting 'move' function for file \"{0}\"...".format(filename))
    # Read in the entire user list file
    file = open(filename, 'r')
    lines = file.read()
    file.close()
    for line in lines:
        user = line.rstrip()
        funControl(user)

# Accept a username and query system load and current funWork thread count; decide when to start next thread
def funControl(user):
    # Global variables that control whether a new thread starts
    global MAX_THREADS
    global CURRENT_THREADS
    global MAX_LOAD
    # Decide whether to start a new subprocess of funWork for the current user
    print
    logger.info("Trying to start a new thread for user {0}".format(user))
    sysLoad = os.getloadavg()[1]
    logger.info("The current threads before starting a new loop are: {0}.".format(CURRENT_THREADS))
    if CURRENT_THREADS < MAX_THREADS:
        if sysLoad < MAX_LOAD:
            CURRENT_THREADS += 1
            logger.info("Starting a new thread for user {0}.".format(user))
            p = Process(target=funWork, args=(user,))
            p.start()
        else:
            print "Max Load is {0}".format(MAX_LOAD)
            logger.info("System load is too high ({0}), process delayed for four minutes.".format(sysLoad))
            time.sleep(240)
            funControl(user)
    else:
        logger.info("There are already {0} threads running, user {1} delayed for ten minutes.".format(CURRENT_THREADS, user))
        time.sleep(600)
        funControl(user)

# Actually do the work for one user
def funWork(user):
    global CURRENT_THREADS
    for x in range(0, 10):
        logger.info("Processing user {0}.".format(user))
        time.sleep(1)
    CURRENT_THREADS -= 1
Lastly: any errors you see are likely to be transcription mistakes because the code executes without bugs on a server at work. However, any horrible coding practices you see are completely mine.
Thanks in advance!
how about this: (not tested)
MAX_PROCS = 4

# Actually do the work for one user
def funWork(user):
    for x in range(0, 10):
        logger.info("Processing user {0}.".format(user))
        time.sleep(1)
    return

# Iterate through all users in the userlist and call the funWork function on each user
def funReader(filename):
    # I defined the logger in detail above, I skipped about 200 lines of code to get it slimmed down
    logger.info("Starting 'move' function for file \"{0}\"...".format(filename))
    # Read in the entire user list file (readlines, so we iterate lines rather than characters)
    file = open(filename, 'r')
    lines = file.readlines()
    file.close()
    work = []
    for line in lines:
        user = line.rstrip()
        work.append(user)
    pool = multiprocessing.Pool(processes=MAX_PROCS)  # threads are different from processes...
    return pool.map(funWork, work)
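Here is a self-contained, runnable version of the same Pool pattern (process_user is a hypothetical stand-in for the real per-user work; the worker must be a module-level function so it can be pickled and sent to the worker processes):

```python
import multiprocessing

# Stand-in for the real per-user work; must live at module level for pickling.
def process_user(user):
    return user.upper()

def run(users):
    pool = multiprocessing.Pool(processes=4)
    try:
        # map blocks until every item has been processed, and hands each list
        # element to the next free worker for you.
        return pool.map(process_user, users)
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    print(run(["alice", "bob", "carol"]))  # ['ALICE', 'BOB', 'CAROL']
```

This also answers point 2 of the question: a Pool makes the global counter unnecessary, which is just as well, because each multiprocessing.Process gets its own copy of the globals, so decrementing one in a child never changes the parent's value.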

My python program is running really slow

I'm making a program that (at least right now) retrieves stream information from TwitchTV (a streaming platform). This program is to self-educate myself, but when I run it, it takes 2 minutes to print just the name of the streamer.
I'm using Python 2.7.3 64-bit on Windows 7, if that is important in any way.
classes.py:
#imports:
import urllib2
import re

#classes:
class Streamer:
    #constructor:
    def __init__(self, name, mode, link):
        self.name = name
        self.mode = mode
        self.link = link

class Information:
    #constructor:
    def __init__(self, TWITCH_STREAMS, GAME, STREAMER_INFO):
        self.TWITCH_STREAMS = TWITCH_STREAMS
        self.GAME = GAME
        self.STREAMER_INFO = STREAMER_INFO

    def get_game_streamer_names(self):
        "Connects to Twitch.TV API, extracts and returns all streams for a specific game."
        #start connection
        self.con = urllib2.urlopen(self.TWITCH_STREAMS + self.GAME)
        self.info = self.con.read()
        self.con.close()
        #regular expressions to get all the stream names
        self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
        self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
        #run in a for to reduce all "live_user_NAME" values
        for name in self.streamers_names:
            if name.startswith("live_user_"):
                self.streamers_names.remove(name)
        #end method
        return self.streamers_names

    def get_streamer_mode(self, name):
        "Returns a streamer's mode (on/off)"
        #start connection
        self.con = urllib2.urlopen(self.STREAMER_INFO + name)
        self.info = self.con.read()
        self.con.close()
        #check if stream is online or offline ("stream":null indicates offline stream)
        if self.info.count('"stream":null') > 0:
            return "offline"
        else:
            return "online"
main.py:
#imports:
from classes import *

#consts:
TWITCH_STREAMS = "https://api.twitch.tv/kraken/streams/?game=" #add the game name at the end of the link (space = "+", eg: Game+Name)
STREAMER_INFO = "https://api.twitch.tv/kraken/streams/" #add streamer name at the end of the link
GAME = "League+of+Legends"

def main():
    #create an information object
    info = Information(TWITCH_STREAMS, GAME, STREAMER_INFO)
    streamer_list = [] #create a streamer list
    for name in info.get_game_streamer_names():
        #run for every streamer name, create a streamer object and place it in the list
        mode = info.get_streamer_mode(name)
        streamer_name = Streamer(name, mode, 'http://twitch.tv/' + name)
        streamer_list.append(streamer_name)
    #this line is just to try and print something
    print streamer_list[0].name, streamer_list[0].mode

if __name__ == '__main__':
    main()
The program itself works perfectly, it's just really slow.
Any ideas?
Program efficiency typically falls under the 80/20 rule (or what some people call the 90/10 rule, or even the 95/5 rule). That is, 80% of the time the program is actually running in 20% of the code. In other words, there is a good shot that your code has a "bottleneck": a small area of the code that is running slow, while the rest runs very fast. Your goal is to identify that bottleneck (or bottlenecks), then fix it (them) to run faster.
The best way to do this is to profile your code. This means logging the time when specific actions occur with the logging module, using timeit like a commenter suggested, using one of the built-in profilers, or simply printing out the current time at various points of the program. Eventually, you will find one part of the code that seems to be taking the most time.
Experience will tell you that I/O (stuff like reading from a disk, or accessing resources over the internet) will take longer than in-memory calculations. My guess as to the problem is that you're using 1 HTTP connection to get a list of streamers, and then one HTTP connection to get the status of that streamer. Let's say that there are 10000 streamers: your program will need to make 10001 HTTP connections before it finishes.
There would be a few ways to fix this if this is indeed the case:
See if Twitch.TV has some alternatives in their API that allows you to retrieve a list of users WITH their streaming mode so that you don't need to call an API for each streamer.
Cache results. This won't actually help your program run faster the first time it runs, but you might be able to make it so that if it runs a second time within a minute, it can reuse results.
Limit your application to only dealing with a few streamers at a time. If there are 10000 streamers, what exactly does your application do that it really needs to look at the mode of all 10000 of them? Perhaps it's better to just grab the top 20, at which point the user can press a key to get the next 20, or close the application. Often times, programming is not just about writing code, but managing expectations of what your users want. This seems to be a pet project, so there might not be "users", meaning you have free reign to change what the app does.
Use multiple connections. Right now, your app makes one connection to the server, waits for the results to come back, parses the results, saves it, then starts on the next connection. This process might take an entire half a second. If there were 250 streamers, running this process for each of them would take a little over two minutes total. However, if you could run four of them at a time, you could potentially reduce your time to just under 30 seconds total. Check out the multiprocessing module. Keep in mind that some APIs might have limits to how many connections you can make at a certain time, so hitting them with 50 connections at a time might irk them and cause them to forbid you from accessing their API. Use caution here.
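To make the profiling suggestion above concrete, here is a short sketch using the built-in cProfile/pstats modules (slow_network_call is a hypothetical stand-in that fakes the HTTP round trip with a sleep):

```python
import cProfile
import pstats
import io
import time

# Fake the I/O: each call stands in for one HTTP request to the API.
def slow_network_call():
    time.sleep(0.05)

def check_streamers(names):
    for _ in names:
        slow_network_call()

# Profile just the suspect section, then print the top entries sorted by
# cumulative time; the fake network call should dominate that column.
profiler = cProfile.Profile()
profiler.enable()
check_streamers(["a", "b", "c"])
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Reading the "cumulative" column immediately shows where the two minutes are going, which is the first step before choosing any of the fixes listed above.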
You are using the wrong tool here to parse the JSON data returned by your URL. You should use the json library, provided by default, rather than parsing the data with regex.
This will give you a boost in your program's performance.
Change the regex parser
#regular expressions to get all the stream names
self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
to the json parser
self.info = json.loads(self.info) #this parses the json data into a Python object
#pull out each stream's name and return a generator
return (stream['name'] for stream in self.info[u'streams'])
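For illustration, here is that parsing step run against an inline payload (the payload only imitates the shape the question relies on, a top-level "streams" list whose entries carry a "name"; the real Twitch response has many more fields):

```python
import json

# A miniature fake of the /streams response body.
payload = '{"streams": [{"name": "streamer_one"}, {"name": "streamer_two"}]}'

info = json.loads(payload)  # str -> dict, one call, no regex needed
names = [stream["name"] for stream in info["streams"]]
print(names)  # ['streamer_one', 'streamer_two']
```

Besides being faster, this is robust: the regex approach breaks as soon as any other field in the response happens to contain "name", which is exactly why the original code needed the re.sub pass to strip team names first.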

Building commands from variables in Python and Tkinter

Partly to learn, partly to help myself I'm trying to write an app with a GUI for encoding/decoding. At the moment, I'm just working on encoding.
I've a Tkinter menu which feeds a variable to the GUI def with item specified as base64, urllib or encoding hex.
A button exists on the GUI which runs getText. I'm having difficulty getting encodedvar to contain the process + variable, and getting the results to be displayed in the bottom frame.
On running this, at the moment, the following appears (as an example) in the bottom frame - blackcat obviously being what was entered into the middle frame.
base64.encodestring('blackcat
')
I have 2 issues:
Getting the code to be formatted correctly, i.e. not over 2 lines as shown above
Getting the code to run, rather than having the command itself printed in the bottom frame
The code I'm using is displayed below:
def gui(item):
    if item == 'encode_b64':
        process = 'base64.encodestring'
    elif item == 'encode_url':
        process = 'urllib.quote_plus'
    else:
        process = '.encode("hex")'

def getText():
    bottomtext.delete(1.0, END)
    var = middletext.get(1.0, END)
    encodedvar = process + "('%s')" % var
    bottomtext.insert(INSERT, encodedvar)
The text widget guarantees a trailing newline, so you should use "end-1c" when getting the contents of the text widget. Doing that guarantees you get only the text that was entered by the user without that extra trailing newline.
Second, to run the function instead of printing it out, you can store the actual function in a variable, then use the variable to invoke the function:
def default_encode(s):
    return s.encode("hex")

if item == 'encode_b64':
    process = base64.encodestring
elif item == 'encode_url':
    process = urllib.quote_plus
else:
    process = default_encode
...
bottomtext.insert(INSERT, process(var))
The above can be written a bit more succinctly like this:
mapping = {"encode_b64": base64.encodestring,
           "encode_url": urllib.quote_plus}
process = mapping.get(item, default_encode)
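Here is a runnable version of that dispatch-table idea, with the Python 3 spellings substituted for the question's Python 2 names (base64.b64encode for encodestring, urllib.parse.quote_plus for urllib.quote_plus, and a bytes .hex() call for .encode("hex")):

```python
import base64
import urllib.parse

# Fallback used when the item isn't in the mapping (hex-encode the text).
def default_encode(s):
    return s.encode().hex()

# Map item names to *function objects*, not to strings of code.
mapping = {
    "encode_b64": lambda s: base64.b64encode(s.encode()).decode(),
    "encode_url": urllib.parse.quote_plus,
}

def encode(item, text):
    # Look up the function, falling back to default_encode, then call it.
    process = mapping.get(item, default_encode)
    return process(text)

print(encode("encode_url", "black cat"))  # black+cat
print(encode("other", "cat"))             # 636174
```

Because the dictionary stores callables rather than strings, there is nothing to eval and nothing to format across two lines: the result of process(var) is plain text ready for the bottom widget.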

Pygtk StatusIcon not loading?

I'm currently working on a small script that needs to use gtk.StatusIcon(). For some reason, I'm getting some weird behavior with it. If I go into the python interactive shell and type:
>>> import gtk
>>> statusIcon = gtk.status_icon_new_from_file("img/lin_idle.png")
Pygtk does exactly what it should do, and shows an icon (lin_idle.png) in the system tray:
However, if I try to do the same task in my script:
def gtkInit(self):
self.statusIcon = gtk.status_icon_new_from_file("img/lin_idle.png")
When gtkInit() gets called, I see this instead:
I made sure I ran the script in the same working directory as the interactive python shell, so I'm pretty sure it's finding the image, which leaves me stumped... Any ideas anyone? Thanks in advance.
Update: For some reason or another, after calling gtk.status_icon_new_from_file() a few times in the script, it does eventually create the icon, but this issue still remains unfortunately. Does anyone at all have any ideas as to what could be going wrong?
As requested: Here's the full script. This is actually an application that I'm in the very early stages of making, but it does work at the moment if you get it setup correctly, so feel free to play around with it if you want (and also help me!), you just need to get an imgur developer key and put it in linup_control.py
Linup.py
#
# Linup - A dropbox alternative for Linux!
# Written by Nakedsteve
# Released under the MIT License
#
import os
import time
import ConfigParser
from linup_control import Linup

cfg = ConfigParser.RawConfigParser()

# See if we have a .linuprc file
home = os.path.expanduser("~")
if not os.path.exists(home + "/.linuprc"):
    # Nope, so let's make one
    cfg.add_section("paths")
    cfg.set("paths", "watch_path", home + "/Desktop/screenshot1.png")
    # Now write it to the file
    with open(home + "/.linuprc", "wb") as configfile:
        cfg.write(configfile)
else:
    cfg.read(home + "/.linuprc")

linup = Linup()

# Create the GUI (status icon, menus, etc.)
linup.gtkInit()

# Enter the main loop, where we check to see if there's a shot to upload
# every 1 second
path = cfg.get("paths", "watch_path")
while 1:
    if os.path.exists(path):
        linup.uploadImage(path)
        url = linup.getURL()
        linup.toClipboard(url)
        linup.json = ""
        print "Screenshot uploaded!"
        os.remove(path)
    else:
        # If you're wondering why I'm using time.sleep(),
        # it's because I found that without it, my CPU remained
        # at 50% at all times while running linup. If you have a better
        # method for doing this, please contact me about it (I'm relatively new at python)
        time.sleep(1)
linup_control.py
import gtk
import json
import time
import pycurl
import os

class Linup:
    def __init__(self):
        self.json = ""

    def uploadImage(self, path):
        # Set the status icon to busy
        self.statusIcon.set_from_file("img/lin_busy.png")
        # Create new pycurl instance
        cu = pycurl.Curl()
        # Set the POST variables to the image and dev key
        vals = [
            ("key", "*************"),
            ("image", (cu.FORM_FILE, path))
        ]
        # Set the URL to send to
        cu.setopt(cu.URL, "http://imgur.com/api/upload.json")
        # This lets us get the json returned by imgur
        cu.setopt(cu.WRITEFUNCTION, self.resp_callback)
        cu.setopt(cu.HTTPPOST, vals)
        # Do eet!
        cu.perform()
        cu.close()
        # Set the status icon to done...
        self.statusIcon.set_from_file("img/lin_done.png")
        # Wait 3 seconds
        time.sleep(3)
        # Set the icon to idle
        self.statusIcon.set_from_file("img/lin_idle.png")

    # Used for getting the response json from imgur
    def resp_callback(self, buff):
        self.json += buff

    # Extracts the image URL from the json data
    def getURL(self):
        js = json.loads(self.json)
        return js['rsp']['image']['original_image']

    # Inserts the text variable into the clipboard
    def toClipboard(self, text):
        cb = gtk.Clipboard()
        cb.set_text(text)
        cb.store()

    # Initiates the GUI elements of Linup
    def gtkInit(self):
        self.statusIcon = gtk.StatusIcon()
        self.statusIcon.set_from_file("img/lin_idle.png")
You need to call the gtk.main function like qba said, however the correct way to call a function every N milliseconds is to use the gobject.timeout_add function. In most cases you would want to have anything that could tie up the gui in a separate thread, however in your case where you just have an icon you don't need to. Unless you are planning on making the StatusIcon have a menu. Here is the part of Linup.py that I changed:
# Enter the main loop, where we check to see if there's a shot to upload
# every 1 second
path = cfg.get("paths", "watch_path")

def check_for_new():
    if os.path.exists(path):
        linup.uploadImage(path)
        url = linup.getURL()
        linup.toClipboard(url)
        linup.json = ""
        print "Screenshot uploaded!"
        os.remove(path)
    # Return True to keep calling this function, False to stop.
    return True

if __name__ == "__main__":
    gobject.timeout_add(1000, check_for_new)
    gtk.main()
You will have to import gobject somewhere too.
I don't know for sure if this works because I can't get pycurl installed.
EDIT: In linup_control.py, I would try changing
# Wait 3 seconds
time.sleep(3)
# Set the icon to idle
self.statusIcon.set_from_file("img/lin_idle.png")
to
gobject.timeout_add(3000, self.statusIcon.set_from_file, "img/lin_idle.png")
You made two mistakes. One is important, one is not.
First, if you want to use a stock icon, use the .set_from_stock(stock_id) method. If you want to use your own icon, then .set_from_file("/path/to/img.png") is fine.
The other thing, which is probably the main problem, is that when you write a gtk application you have to call the gtk.main() function. This is the main gtk loop, where all signal handling, window drawing and all other gtk work is done. If you don't call it, your icon simply isn't drawn.
One solution in your case is to make two threads: one for the gui, and a second for your app. In the first you simply call gtk.main(). In the second you put your main program loop. Of course, when you start a python program you already have one thread running :P
If you aren't familiar with threads, there is another solution. Gtk has a function which calls a function specified by you after some delay:
def call_me():
    print "Hello World!"
    gtk.timeout_add(1000, call_me)

gtk.timeout_add(1000, call_me)
gtk.main()
But it seems to be deprecated now. Probably they have made a better solution.
