How to emit GObject signal once? - python

I have less than a year of experience with Python. I often run into trouble with my ISP because I only discover late that I have used up my data allocation. To avoid this, I thought of creating a script that runs as soon as I tether my phone to my PC. The script scrapes my ISP's website and returns my data bundle balance.
I found that pyudev is useful for constantly monitoring activity on my USB ports. Part of the script is shown below. The problem is that identify_phone is called 3 times (in my experience), which calls the scraping method 3 times, whereas I would like it to be called once. I tried using a global value, but it does not work as expected. I have searched online, but resources on pyudev are scarce.
Any help would be appreciated (including any revision of my code :P)
import glib
import re
import subprocess
import requests
import bs4
import datetime
import sys
from selenium import webdriver
from pyudev import Context, Monitor

def identify_phone(observer, device):
    global last_updated
    current_time = datetime.datetime.now()
    time_diff = current_time - last_updated
    if time_diff.seconds < 10:#300:
        pass
    else:
        #print('\Checking USB ports...')
        try:
            tout = subprocess.check_output("lsusb | grep 1234:1234", shell=True) #check if specific usb device (phone) is connected
        except subprocess.CalledProcessError:
            tout = None
        if tout is not None:
            device_found = True
        else:
            device_found = False
        last_updated = datetime.datetime.now()
        get_network_data()# scrapes ISP website

try:
    device_found = False
    last_updated = datetime.datetime.now()
    try:
        from pyudev.glib import MonitorObserver
    except ImportError:
        from pyudev.glib import GUDevMonitorObserver as MonitorObserver
    context = Context()
    monitor = Monitor.from_netlink(context)
    monitor.filter_by(subsystem='usb')
    observer = MonitorObserver(monitor)
    observer.connect('device-added', identify_phone)
    monitor.start()
    glib.MainLoop().run()
except KeyboardInterrupt:
    print('\nShutdown requested.\nExiting gracefully...')
    sys.exit(0)
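One possible way to make the handler act only once per plug-in is to debounce it. A single physical connection likely produces several udev events on the usb subsystem (the device itself plus its interfaces), which may be why the callback fires three times. The sketch below is a minimal, pyudev-free illustration of the idea; the Debouncer class, the fake device names, and the calls list are hypothetical and not part of the original script.

```python
import time


class Debouncer:
    """Let an action through at most once per `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_run = None  # None means the action has never run

    def ready(self):
        """Return True (and record the time) if enough time has passed."""
        now = self.clock()
        if self.last_run is not None and now - self.last_run < self.interval:
            return False
        self.last_run = now
        return True


calls = []
debounce = Debouncer(interval=10)


def identify_phone(observer, device):
    # Hypothetical stand-in for the handler in the question: the real one
    # would run lsusb and call get_network_data() here.
    if debounce.ready():
        calls.append(device)


# Simulate the burst of three 'device-added' events from one plug-in.
for fake_device in ("usb1", "usb1:1.0", "usb1:1.1"):
    identify_phone(None, fake_device)

print(len(calls))  # 1
```

Unlike the global in the question, last_run starts as None, so the first event is never swallowed by a timestamp taken at startup. pyudev's Monitor.filter_by also accepts a device_type argument; filtering on 'usb_device' may reduce the number of events in the first place, though that is untested here.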

Related

Running a Python script for k seconds

I have written a Python script that reads the serial monitor in order to get sensor readings from an Arduino. I've been trying to solve the following problem: I want my script to run exactly for one minute in order to get the data and process it offline. For instance, if I execute the following script, it should be running for a minute and then stop. I have tried using the time module or the sleep function but my script keeps getting data and does not stop. I'm not sure how to break the while loop. Until now I managed to stop the execution by pressing CTRL+C, but it's necessary for the script to stop on its own. Here's my code(I'm also posting the get_readings function):
# -*- coding: utf-8 -*-
import chair_functions as cf
import os

if __name__ == '__main__':
    file_extension = '.txt'
    rec_file = 'chair_'+cf.get_date()+file_extension
    raw_data = cf.create_directories()
    rec_file = os.path.join(raw_data,rec_file)
    cf.get_readings(rec_file)

# -*- coding: utf-8 -*-
from serial import Serial
import pandas as pd
import collections
import logging
import serial
import time
import sys
import csv
import os

def get_readings(output_file):
    """Read the data stream coming from the serial monitor
    in order to get the sensor readings

    Parameters
    ----------
    output_file : str
        The file name, where the data stream will be stored
    """
    serial_port = "/dev/ttyACM0"
    baud_rate = 9600
    ser = serial.Serial(serial_port,baud_rate)
    logging.basicConfig(filename=output_file,level=logging.DEBUG,format="%(asctime)s %(message)s")
    flag = False
    while True:
        try:
            serial_data = str(ser.readline().decode().strip('\r\n'))
            time.sleep(0.2)
            tmp = serial_data.split(' ')[0] #Getting Sensor Id
            if(tmp == 'A0'):
                flag = True
            if (flag and tmp != 'A4'):
                #print(serial_data)
                logging.info(serial_data)
            if(flag and tmp == 'A4'):
                flag = False
                #print(serial_data)
                logging.info(serial_data)
        except (UnicodeDecodeError, KeyboardInterrupt) as err:
            print(err)
            print(err.args)
            sys.exit(0)
Calling functions from the time module does not by itself stop the loop; the while condition has to check the elapsed time. Instead of using True as the while condition, set start = time.time() and loop while time.time() - start < 60, as below:
start = time.time()
while time.time() - start < 60:
    ...
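As a rough, self-contained sketch of the same idea, the loop can be wrapped in a helper that takes the duration as a parameter. The run_for function and the fake sensor string are illustrative assumptions, not part of the original script, and time.monotonic is used here instead of time.time so that system clock adjustments do not affect the deadline.

```python
import time


def run_for(duration, step, poll_interval=0.01):
    """Call step() repeatedly until `duration` seconds have elapsed."""
    deadline = time.monotonic() + duration
    iterations = 0
    while time.monotonic() < deadline:
        step()
        iterations += 1
        time.sleep(poll_interval)
    return iterations


# Stand-in for reading one line from the serial port and logging it.
readings = []
count = run_for(0.1, lambda: readings.append("A0 123"))
print(count == len(readings))  # True
```

In the real script, step would read and log one line, and the duration would be 60. Note that a blocking ser.readline() can hold the loop past the deadline if no data arrives; pyserial's Serial accepts a timeout argument that should limit how long each read waits.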

Run python script like a service with Twisted

I would like to run this script as an automatic service that runs every minute, every day, with Twisted. (I first tried to write a daemon, but it seemed too difficult and I didn't find good tutorials for it. I have already tried crontab, but that's not what I'm looking for.)
Has anyone done this with Twisted? I can't find a tutorial that fits my kind of script (getting data from a database table and putting it in another table of the same database). I also have to keep the logs in a file, but that will not be the most difficult part.
from twisted.enterprise import adbapi
from twisted.internet import task
import logging
from datetime import datetime
from twisted.internet import reactor
from twisted.internet.defer import inlineCallbacks

"""
Test DB : This File do database connection and basic operation.
"""
log = logging.getLogger("Test DB")
dbpool = adbapi.ConnectionPool("MySQLdb",db="xxxx",user="guza",passwd="vQsx7gbblal8aiICbTKP",host="192.168.15.01")

class MetersCount():
    def getTime(self):
        log.info("Get Current Time from System.")
        time = str(datetime.now()).split('.')[0]
        return time

    def getTotalMeters(self):
        log.info("Select operation in Database.")
        getMetersQuery = """ SELECT count(met_id) as totalMeters FROM meters WHERE DATE(met_last_heard) = DATE(NOW()) """
        return dbpool.runQuery(getMetersQuery).addCallback(self.getResult)

    def getResult(self, result):
        print ("Receive Result : ")
        print (result)
        # general purpose method to receive result from defer.
        return result

    def insertMetersCount(self, meters_count):
        log.info("Insert operation in Database.")
        insertMetersQuery = """ INSERT INTO meter_count (mec_datetime, mec_count) VALUES (NOW(), %s)"""
        return dbpool.runQuery(insertMetersQuery, [meters_count])

    def checkDB(self):
        d = self.getTotalMeters()
        d.addCallback(self.insertMetersCount)
        return d

a= MetersCount()
a.checkDB()
reactor.run()
If you want to run a function once a minute, have a look at LoopingCall. It takes a function, and runs it at intervals unless told to stop.
You would use it something like this (which I haven't tested):
from twisted.internet.task import LoopingCall
looper = LoopingCall(a.checkDB)
looper.start(60)
The documentation is at the link.

Python schedule with commandline

My problem is that I want to automate a script.
In past projects I've used the Python scheduler for this, but for this project I'm unsure how to handle it.
The problem is that the code works with login details that live outside the code and are entered on the command line when launching the script.
e.g. python scriptname.py email@youremail.com password
How can I automate this with the Python scheduler?
The code in 'scriptname.py' is:
# LinkedBot.py
import argparse, os, time
import urlparse, random
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup

def getPeopleLinks(page):
    links = []
    for link in page.find_all('a'):
        url = link.get('href')
        if url:
            if 'profile/view?id=' in url:
                links.append(url)
    return links

def getJobLinks(page):
    links = []
    for link in page.find_all('a'):
        url = link.get('href')
        if url:
            if '/jobs' in url:
                links.append(url)
    return links

def getID(url):
    pUrl = urlparse.urlparse(url)
    return urlparse.parse_qs(pUrl.query)['id'][0]

def ViewBot(browser):
    visited = {}
    pList = []
    count = 0
    while True:
        #sleep to make sure everything loads, add random to make us look human.
        time.sleep(random.uniform(3.5,6.9))
        page = BeautifulSoup(browser.page_source)
        people = getPeopleLinks(page)
        if people:
            for person in people:
                ID = getID(person)
                if ID not in visited:
                    pList.append(person)
                    visited[ID] = 1
        if pList: #if there is people to look at look at them
            person = pList.pop()
            browser.get(person)
            count += 1
        else: #otherwise find people via the job pages
            jobs = getJobLinks(page)
            if jobs:
                job = random.choice(jobs)
                root = 'http://www.linkedin.com'
                roots = 'https://www.linkedin.com'
                #only prefix relative links; 'or' here would also prefix absolute URLs
                if root not in job and roots not in job:
                    job = 'https://www.linkedin.com'+job
                browser.get(job)
            else:
                print "I'm Lost Exiting"
                break
        #Output (Make option for this)
        print "[+] "+browser.title+" Visited! \n("\
            +str(count)+"/"+str(len(pList))+") Visited/Queue)"

def Main():
    parser = argparse.ArgumentParser()
    parser.add_argument("email", help="linkedin email")
    parser.add_argument("password", help="linkedin password")
    args = parser.parse_args()
    browser = webdriver.Firefox()
    browser.get("https://linkedin.com/uas/login")
    emailElement = browser.find_element_by_id("session_key-login")
    emailElement.send_keys(args.email)
    passElement = browser.find_element_by_id("session_password-login")
    passElement.send_keys(args.password)
    passElement.submit()
Running this on OSX.
I can see at least two different ways of automating the trigger of your script. You mention that your script is started this way:
python scriptname.py email@youremail.com password
This means that you start it from a shell. Since you want it scheduled, crontab sounds like a perfect answer (see https://kvz.io/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/ for example).
If you really want to use the Python scheduler, you can use the subprocess module.
In your file that uses the Python scheduler:
import subprocess
subprocess.call("python scriptname.py email@youremail.com password", shell=True)
What is the best way to call a Python script from another Python script?
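As a variation on the subprocess call, the arguments can be passed as a list so that no shell quoting is involved. This is only a sketch, assuming Python 3.7 or later (the question's code is Python 2); the one-line child program is a hypothetical stand-in for scriptname.py that simply echoes its arguments.

```python
import subprocess
import sys

# Hypothetical stand-in for scriptname.py: echoes its command-line arguments.
child_code = "import sys; print(' '.join(sys.argv[1:]))"

result = subprocess.run(
    [sys.executable, "-c", child_code, "email@youremail.com", "password"],
    capture_output=True,  # collect stdout/stderr instead of printing them
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
output = result.stdout.strip()
print(output)  # email@youremail.com password
```

Using sys.executable runs the child with the same interpreter as the scheduler, which avoids depending on whatever "python" resolves to on the PATH.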
About the code itself
LinkedIn REST Api
Have you tried using LinkedIn's REST Api instead of retrieving heavy pages, filling in some form and sending it back?
Your code is prone to be broken whenever LinkedIn changes some elements in their page. Whereas the Api is a contract between LinkedIn and the users.
Check here https://developer.linkedin.com/docs/rest-api and there https://developer.linkedin.com/docs/guide/v2/concepts/methods
Credentials
So that you don't have to pass your credentials on the command line (especially your password, which will be readable in clear text in your shell history), you should either
use a config file (with your API key) and read it with ConfigParser (or anything else, depending on the format of your config file: JSON, Python, etc.),
or set them in your environment variables.
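The two options above can be combined in a small helper. This is a minimal sketch, assuming Python 3 (where the module is named configparser); the variable names LINKEDIN_EMAIL and LINKEDIN_PASSWORD, the [linkedin] section, and the file name credentials.ini are all made up for illustration.

```python
import configparser
import os


def load_credentials(config_path, env=os.environ):
    """Return (email, password), preferring environment variables."""
    email = env.get("LINKEDIN_EMAIL")
    password = env.get("LINKEDIN_PASSWORD")
    if email and password:
        return email, password

    # Fall back to an INI file with a [linkedin] section.
    parser = configparser.ConfigParser()
    parser.read(config_path)  # a missing file is silently skipped
    section = parser["linkedin"]  # raises KeyError if the section is absent
    return section["email"], section["password"]


# Usage with an in-memory environment instead of real variables.
fake_env = {"LINKEDIN_EMAIL": "user@example.com", "LINKEDIN_PASSWORD": "secret"}
print(load_credentials("credentials.ini", env=fake_env))
```

With this in place the scheduled job can call the script without any arguments, so the password never appears in the process list or the shell history.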
For the scheduling
Using Cron
Moreover, for the scheduling part, you can use cron.
Using Celery
If you're looking for a 100% Python solution, you can use the excellent Celery project. Check its periodic tasks.
You can pass the args to the python scheduler.
scheduler.enter(delay, priority, action, argument=(), kwargs={})
Schedule an event for delay more time units. Other than the relative time, the other arguments, the effect and the return value are the same as those for enterabs().
Changed in version 3.3: argument parameter is optional.
New in version 3.3: kwargs parameter was added.
>>> import sched, time
>>> s = sched.scheduler(time.time, time.sleep)
>>> def print_time(a='default'):
...     print("From print_time", time.time(), a)
...
>>> def print_some_times():
...     print(time.time())
...     s.enter(10, 1, print_time)
...     s.enter(5, 2, print_time, argument=('positional',))
...     s.enter(5, 1, print_time, kwargs={'a': 'keyword'})
...     s.run()
...     print(time.time())
...
>>> print_some_times()
930343690.257
From print_time 930343695.274 positional
From print_time 930343695.275 keyword
From print_time 930343700.273 default
930343700.276

Get python application memory usage

My main goal is to know how much memory my python application takes during execution.
I'm using python 2.7.5 on Windows-32 and Windows-64.
I found a way to get some info about my process here: http://code.activestate.com/recipes/578513-get-memory-usage-of-windows-processes-using-getpro/
Putting the code here for convenience:
"""Functions for getting memory usage of Windows processes."""
__all__ = ['get_current_process', 'get_memory_info', 'get_memory_usage']

import ctypes
from ctypes import wintypes

GetCurrentProcess = ctypes.windll.kernel32.GetCurrentProcess
GetCurrentProcess.argtypes = []
GetCurrentProcess.restype = wintypes.HANDLE

SIZE_T = ctypes.c_size_t

class PROCESS_MEMORY_COUNTERS_EX(ctypes.Structure):
    _fields_ = [
        ('cb', wintypes.DWORD),
        ('PageFaultCount', wintypes.DWORD),
        ('PeakWorkingSetSize', SIZE_T),
        ('WorkingSetSize', SIZE_T),
        ('QuotaPeakPagedPoolUsage', SIZE_T),
        ('QuotaPagedPoolUsage', SIZE_T),
        ('QuotaPeakNonPagedPoolUsage', SIZE_T),
        ('QuotaNonPagedPoolUsage', SIZE_T),
        ('PagefileUsage', SIZE_T),
        ('PeakPagefileUsage', SIZE_T),
        ('PrivateUsage', SIZE_T),
    ]

GetProcessMemoryInfo = ctypes.windll.psapi.GetProcessMemoryInfo
GetProcessMemoryInfo.argtypes = [
    wintypes.HANDLE,
    ctypes.POINTER(PROCESS_MEMORY_COUNTERS_EX),
    wintypes.DWORD,
]
GetProcessMemoryInfo.restype = wintypes.BOOL

def get_current_process():
    """Return handle to current process."""
    return GetCurrentProcess()

def get_memory_info(process=None):
    """Return Win32 process memory counters structure as a dict."""
    if process is None:
        process = get_current_process()
    counters = PROCESS_MEMORY_COUNTERS_EX()
    ret = GetProcessMemoryInfo(process, ctypes.byref(counters),
                               ctypes.sizeof(counters))
    if not ret:
        raise ctypes.WinError()
    info = dict((name, getattr(counters, name))
                for name, _ in counters._fields_)
    return info

def get_memory_usage(process=None):
    """Return this process's memory usage in bytes."""
    info = get_memory_info(process=process)
    return info['PrivateUsage']

if __name__ == '__main__':
    import pprint
    pprint.pprint(get_memory_info())
And this is the result:
{'PageFaultCount': 1942L,
'PagefileUsage': 4624384L,
'PeakPagefileUsage': 4624384L,
'PeakWorkingSetSize': 7544832L,
'PrivateUsage': 4624384L,
'QuotaNonPagedPoolUsage': 8520L,
'QuotaPagedPoolUsage': 117848L,
'QuotaPeakNonPagedPoolUsage': 8776L,
'QuotaPeakPagedPoolUsage': 117984L,
'WorkingSetSize': 7544832L,
'cb': 44L}
But this does not satisfy me. These results give me the whole python process information while what I need is only my specific application that runs on top of the Python framework.
I saw several memory profilers on the internet and also here in Stack Overflow but they are too big for me. The only information that I need is how much memory my app consumes by itself - without taking into account all the Python framework.
How can I achieve this?
Here is a simple, Pythonic way based on the os and psutil modules, with thanks to the answers by Dataman and RichieHindle.
import os
import psutil
## - Get Process Id of This Running Script -
proc_id = os.getpid()
print '\nProcess ID: ', proc_id
#--------------------------------------------------
## - Get More Info Using the Process Id
ObjInf = psutil.Process(proc_id)
print '\nProcess %s Info:' % proc_id, ObjInf
#--------------------------------------------------
## - Process Name of this program
name = ObjInf.name()
print '\nThis Program Process name:', name
#--------------------------------------------------
## - Print CPU Percentage
CpuPerc = ObjInf.cpu_percent()
print '\nCpu Percentage:', CpuPerc
#---------------------------------------------------
## - Print Memory Usage
memory_inf = ObjInf.memory_full_info()
print '\nMemory Info:', memory_inf, '\n'
## Print available commands you can do with the psutil obj
for c in dir(ObjInf):
    print c
If your script is written in Python, it cannot run without the interpreter, so you have to account for Python's own memory usage as well. If you want to see how much memory Python consumes by itself, run an empty Python script and use that as a baseline: whatever your script uses beyond it is attributable to your application.
Now if you want to check memory usage of a thread then this question might help -> Why does python thread consume so much memory?
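If the goal is only to see how much memory the application's own Python objects take, a different technique from the process counters above is the standard library's tracemalloc module, which tracks allocations made through Python's memory allocator. Note the assumptions: tracemalloc requires Python 3.4 or later, so it is not available in the Python 2.7.5 standard library mentioned in the question, and the list built below is only a hypothetical stand-in for real application code.

```python
import tracemalloc

tracemalloc.start()

# Hypothetical workload standing in for the application's own code.
data = [str(i) * 10 for i in range(100_000)]

# Sizes, in bytes, of memory blocks allocated since start().
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("current: %.1f KiB, peak: %.1f KiB" % (current / 1024, peak / 1024))
```

Because tracing starts after the interpreter has loaded, the figures exclude most of the interpreter's startup footprint, which is close to the "empty script" subtraction described above. Memory allocated by C extensions outside Python's allocator is not counted.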
Here is the script I use to find the maximum amount of memory used during the execution of another script. I am using psutil to achieve this. You can tweak the script to suit your purposes.
import psutil, sys
import numpy as np
from time import sleep
pid = int(sys.argv[1])
delay = int(sys.argv[2])
p = psutil.Process(pid)
max_resources_used = -1
while p.is_running():
    ## p.memory_info()[0] returns 'rss' in Unix
    r = int(p.memory_info()[0] / 1048576.0) ## resources used in Mega Bytes
    max_resources_used = r if r > max_resources_used else max_resources_used
    sleep(delay)
print("Maximum resources used: %s MB." %np.max(max_resources_used))
Usage:
python script.py pid delay_in_seconds
For example:
python script.py 55356 2
Explanation:
You need to find the process ID and pass it as an argument to the script, along with the interval in seconds at which the script checks resource usage. The script keeps track of memory usage for as long as the process is running. Finally, it prints the maximum amount of memory used, in MB.

Python Script works when run from command line but not when run from windows service

I have created a Windows service using the following code:
import win32service
import win32serviceutil
import win32api
import win32con
import win32event
import win32evtlogutil
import os, sys, string, time

class aservice(win32serviceutil.ServiceFramework):
    _svc_name_ = "PAStoDistillerIFC"
    _svc_display_name_ = "PAS DW to Distiller Interface"
    _svc_description_ = "Service that checks the Clinical Research folder for any new files from PAS to process in Distiller"

    def __init__(self, args):
        win32serviceutil.ServiceFramework.__init__(self, args)
        self.hWaitStop = win32event.CreateEvent(None, 0, 0, None)

    def SvcStop(self):
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        win32event.SetEvent(self.hWaitStop)

    def SvcDoRun(self):
        import servicemanager
        servicemanager.LogMsg(servicemanager.EVENTLOG_INFORMATION_TYPE,servicemanager.PYS_SERVICE_STARTED,(self._svc_name_, ''))
        #self.timeout = 640000 #640 seconds / 10 minutes (value is in milliseconds)
        self.timeout = 120000 #120 seconds / 2 minutes
        # This is how long the service will wait to run / refresh itself (see script below)
        while 1:
            # Wait for service stop signal, if timeout, loop again
            rc = win32event.WaitForSingleObject(self.hWaitStop, self.timeout)
            # Check to see if self.hWaitStop happened
            if rc == win32event.WAIT_OBJECT_0:
                # Stop signal encountered
                servicemanager.LogInfoMsg("PAStoDistillerIFC - STOPPED!") #For Event Log
                break
            else:
                #[actual service code between rests]
                try:
                    file_path = "D:\\SCRIPTS\\script.py"
                    execfile(file_path) #Execute the script
                except:
                    servicemanager.LogInfoMsg("File CRASHED")
                    pass
                #[actual service code between rests]

def ctrlHandler(ctrlType):
    return True

if __name__ == '__main__':
    win32api.SetConsoleCtrlHandler(ctrlHandler, True)
    win32serviceutil.HandleCommandLine(aservice)
To run this script:
import os, re, urllib, urllib2, time, datetime
def postXML( path, fname):
    fileresultop = open("D:\\CLinicalResearch\\SCRIPTS\\LOG.txt", 'a') # open result file
    fileresultop.write('CheckXXX ')
    fileresultop.close()
    now = datetime.datetime.now() #####ALWAYS CRASHES HERE######
    fileresult = open("D:\\SCRIPTS\\IFCPYTHONLOG.txt", 'a') # open result file
fileresultop = open("D:\\SCRIPTS\\LOG.txt", 'a')
fileresultop.write('Check2 ')
fileresultop.close()
path="D:\\Test2" # Put location of XML files here.
procpath="D:\\Test2Processed" # Location of processed files
now = datetime.datetime.now()
dirList=os.listdir(path)
for fname in dirList: # For each file in directory
    if re.search("PatientIndexInsert", fname): # Brand new patient records
        fileresultop = open("D:\\SCRIPTS\\LOG.txt", 'a') # open result file
        fileresultop.write('Check1 ')
        fileresultop.close()
        postXML(path, fname)
I have pared the script down to the bare code where I believe it is crashing.
It works perfectly from the command line. I run the Windows service under my own login.
Once I take the datetime call out of the function, it seems to work.
Edit 1: I saw that the service runs in a blank environment. I don't have any environmental variables set myself.
Edit 2: Added traceback:
  File "D:\ClinicalResearch\SCRIPTS\PAS2DIST.py", line 23, in <module>
    postXML(path, fname)
  File "D:\ClinicalResearch\SCRIPTS\PAS2DIST.py", line 6, in postXML
    now = datetime.datetime.now()
NameError: global name 'datetime' is not defined
I didn't find the cause but I did find a workaround.
I needed to import all the same libraries into the function too. Once I did that, worked like a charm.
Hope this can help someone else.
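A likely explanation, offered here as a hedged sketch rather than a confirmed diagnosis: the service calls execfile from inside a method, so the script runs with the service module's globals and the method's locals as two separate namespaces. The import of datetime then lands in the locals, while postXML looks names up in the globals, where datetime does not exist. The Python 3 snippet below reproduces the effect with exec; the script string and post_xml are illustrative stand-ins for the real file.

```python
script = """
import datetime

def post_xml():
    return datetime.datetime.now()

result = post_xml()
"""

# Separate globals and locals, as happens when a script is executed from
# inside a function: the import lands in `local_ns`, but post_xml() looks
# names up in `global_ns`, so the lookup fails.
global_ns, local_ns = {}, {}
try:
    exec(script, global_ns, local_ns)
    failed = False
except NameError as err:
    failed = True
    print("NameError:", err)

# One shared namespace: the import and the function see the same names.
shared_ns = {}
exec(script, shared_ns)
print(failed, "result" in shared_ns)  # True True
```

This would also explain why importing inside the function works: the name is then local to postXML itself. In the Python 2 service, passing an explicit dictionary as the globals, for example execfile(file_path, {}), should have the same effect as the shared namespace here, though that has not been tested against the original service.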
