How to loop function using a list of variables? - python

I have a function that prints OHLCV data for stock prices from a websocket. It works but I have to copy it for each variable (Var1 to Var14) to get each individual stock data. How would I automate this process given that I have list:
varlist = [var1, var2, var3...var14]
and my code is:
def process_messages_for_var1(msg):
    if msg['e'] == 'error':
        print(msg['m'])
    # If message is a trade, print the OHLC data
    else:
        # Convert time into understandable structure
        transactiontime = msg['k']['T'] / 1000
        transactiontime = datetime.fromtimestamp(transactiontime).strftime('%d %b %Y %H:%M:%S')
        # Process this message once websocket starts
        print("{} - {} - Interval {} - Open: {} - Close: {} - High: {} - Low: {} - Volume: {}".
              format(transactiontime,msg['s'],msg['k']['i'],msg['k']['o'],msg['k']['c'],msg['k']['h'],msg['k']['l'],msg['k']['v']))
        # Also, put information into an array
        kline_array_msg = "{},{},{},{},{},{}".format(
            msg['k']['T'],msg['k']['o'],msg['k']['c'],msg['k']['h'],msg['k']['l'],msg['k']['v'])
        # Insert at first position
        kline_array_dct[var1].insert(0, kline_array_msg)
        if (len(kline_array_dct[var1]) > window):
            # Remove last message when res_array size is > of FIXED_SIZE
            del kline_array_dct[var1][-1]
I'm hoping to get the following result (notice how function name also changes):
def process_messages_for_var2(msg):
    if msg['e'] == 'error':
        print(msg['m'])
    # If message is a trade, print the OHLC data
    else:
        # Convert time into understandable structure
        transactiontime = msg['k']['T'] / 1000
        transactiontime = datetime.fromtimestamp(transactiontime).strftime('%d %b %Y %H:%M:%S')
        # Process this message once websocket starts
        print("{} - {} - Interval {} - Open: {} - Close: {} - High: {} - Low: {} - Volume: {}".
              format(transactiontime,msg['s'],msg['k']['i'],msg['k']['o'],msg['k']['c'],msg['k']['h'],msg['k']['l'],msg['k']['v']))
        # Also, put information into an array
        kline_array_msg = "{},{},{},{},{},{}".format(
            msg['k']['T'],msg['k']['o'],msg['k']['c'],msg['k']['h'],msg['k']['l'],msg['k']['v'])
        # Insert at first position
        kline_array_dct[var2].insert(0, kline_array_msg)
        if (len(kline_array_dct[var2]) > window):
            # Remove last message when res_array size is > of FIXED_SIZE
            del kline_array_dct[var2][-1]

You can adjust the function so that it takes one of the vars as an argument, i.e.:
def process_messages(msg, var):
    ...
    kline_array_dct[var].insert(0, kline_array_msg)
    if (len(kline_array_dct[var]) > window):
        # Remove last message when res_array size is > of FIXED_SIZE
        del kline_array_dct[var][-1]

If the processes are generally the same, define just one of them and give it one more argument. Replace the numbered vars in the body with that argument, so the same code runs for whichever var is passed in:
def process_messages(msg, var):
    if msg['e'] == 'error':
        print(msg['m'])
    # If message is a trade, print the OHLC data
    else:
        # Convert time into understandable structure
        transactiontime = msg['k']['T'] / 1000
        transactiontime = datetime.fromtimestamp(transactiontime).strftime('%d %b %Y %H:%M:%S')
        # Process this message once websocket starts
        print("{} - {} - Interval {} - Open: {} - Close: {} - High: {} - Low: {} - Volume: {}".
              format(transactiontime,msg['s'],msg['k']['i'],msg['k']['o'],msg['k']['c'],msg['k']['h'],msg['k']['l'],msg['k']['v']))
        # Also, put information into an array
        kline_array_msg = "{},{},{},{},{},{}".format(
            msg['k']['T'],msg['k']['o'],msg['k']['c'],msg['k']['h'],msg['k']['l'],msg['k']['v'])
        # Insert at first position
        kline_array_dct[var].insert(0, kline_array_msg)
        if (len(kline_array_dct[var]) > window):
            # Remove last message when res_array size is > of FIXED_SIZE
            del kline_array_dct[var][-1]
Then, create a simple for loop that calls the function for each var in the list. Pass the received message dict itself (not the string "msg"), and use the list name from the question:
for var in varlist:
    process_messages(msg, var)
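If the websocket library invokes each callback with only the message, a closure can bind the variable once per stream instead of looping inside the callback. The following is a minimal sketch under that assumption. `make_handler`, `kline_store`, `WINDOW` and the placeholder entries in `varlist` are hypothetical names used only for illustration, and a `deque` with `maxlen` stands in for the manual insert-and-delete on a list.

```python
from collections import deque

WINDOW = 3  # hypothetical window size (the question's `window`)


def make_handler(var, store):
    """Return a one-argument callback that files messages under `var`."""
    def process_message(msg):
        if msg['e'] == 'error':
            print(msg['m'])
            return
        k = msg['k']
        row = "{},{},{},{},{},{}".format(
            k['T'], k['o'], k['c'], k['h'], k['l'], k['v'])
        # Newest first; deque(maxlen=...) drops the oldest entry on overflow.
        store[var].appendleft(row)
    return process_message


varlist = ["var1", "var2"]  # placeholders for the 14 variables
kline_store = {var: deque(maxlen=WINDOW) for var in varlist}
handlers = {var: make_handler(var, kline_store) for var in varlist}

# Simulated message in the shape the question's code expects.
fake = {'e': 'kline', 's': 'VAR1',
        'k': {'T': 1000, 'i': '1m', 'o': '1', 'c': '2',
              'h': '3', 'l': '0', 'v': '10'}}
handlers["var1"](fake)
print(list(kline_store["var1"]))
```

Each entry of `handlers` could then be registered wherever the single-argument `process_messages_for_varN` functions were used.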

Related

How do I add a time range for opening trades when backtesting with backtrader?

I am trying to backtest a strategy where trades are only opened between 8:30 and 16:00 using backtrader.
With the attempt below my code runs but returns no trades, so my closing balance is the same as the opening balance. If I remove this filter, the code runs correctly and trades open and close, so the filter is definitely the issue. Can anyone please help?
I have tried adding the datetime column of the data to a data feed using the code below:
def __init__(self):
    # Keep a reference to the "close" line in the data[0] dataseries
    self.dataclose = self.datas[0].close
    self.datatime = mdates.num2date(self.datas[0].datetime)
    self.datatsi = self.datas[0].tsi
    self.datapsar = self.datas[0].psar
    self.databbmiddle = self.datas[0].bbmiddle
    self.datastlower = self.datas[0].stlower
    self.datastupper = self.datas[0].stupper
    # To keep track of pending orders
    self.order = None
I then used the following code to try filter by this date range:
# Check if we are in the market
if not self.position:
current_time = self.datatime[0].time()
if datetime.time(8, 30) < current_time < datetime.time(16, 0):
if self.datatsi < 0 and self.datastupper[0] > self.dataclose[0] and self.datastlower[1] < self.dataclose[1] and self.dataclose[0] < self.databbmiddle[0] and self.datapsar[0] > self.dataclose[0]:
self.log('SELL CREATE, %.2f' % self.dataclose[0])
# Keep track of the created order to avoid a 2nd order
os = self.sell_bracket(size=100,price=sp1, stopprice=sp2, limitprice=sp3)
self.orefs = [o.ref for o in os]
else:
o1 = self.buy(exectype=bt.Order.Limit,price=bp1,transmit=False)
print('{}: Oref {} / Buy at {}'.format(self.datetime.date(), o1.ref, bp1))
o2 = self.sell(exectype=bt.Order.Stop,price=bp2,parent=o1,transmit=False)
print('{}: Oref {} / Sell Stop at {}'.format(self.datetime.date(), o2.ref, bp2))
o3 = self.sell(exectype=bt.Order.Limit,price=bp3,parent=o1,transmit=True)
print('{}: Oref {} / Sell Limit at {}'.format(self.datetime.date(), o3.ref, bp3))
self.orefs = [o1.ref, o2.ref, o3.ref] # self.sell(size=100, exectype=bt.Order.Limit, price=self.data.close[0]+16, parent=self.order, parent_bracket=bt.Order.Market)

Creating a function from current source code "list comprehension"

Outcome 1 required:
The first batch of code below is in its working form.
Please assist in creating a function "def Calculations():" that includes all the list calculations and returns the same results for the static list. With the calculations in proper functions I will be able to narrow down the problem and might be able to move forward.
Outcome 2 required, for those who want to go in depth:
When I run the code on a live list that is appended to every x intervals, it stalls the information feed. I believe the cause could be the creation of the appended lists in batches of increasing size, but I don't have a solution for it. The working code is below.
I am getting my live data from Binance as an appending list of closes only, for those who would like to test it live.
The data could come from any source; it does not need to be Binance, as long as it is an appending list of closes in float format.
See the code below.
import itertools
l = [16.329,16.331, 16.3705, 16.3965, 16.44, 16.4227, 16.4028, 16.37, 16.3829, 16.3482, 16.3614, 16.4191, 16.4008, 16.4048, 16.4076, 16.3724, 16.3599, 16.3872, 16.3794, 16.3528, 16.3886, 16.3904, 16.3815, 16.3864, 16.4254, 16.4411, 16.4151, 16.4338, 16.4212, 16.3819, 16.2857, 16.2703, 16.2408, 16.1938, 16.2038, 16.2035, 16.217, 16.2374, 16.2414, 16.2238, 16.1787, 16.2725, 16.2964, 16.3155, 16.238, 16.2149, 16.2992, 16.3568, 16.2793, 16.2467, 16.312, 16.3117, 16.3017, 16.3465, 16.3882, 16.3698, 16.307, 16.3328, 16.3311, 16.3466, 16.3382, 16.3703, 16.3502, 16.3661, 16.38, 16.3972, 16.4141, 16.393, 16.3769, 16.3683, 16.4136, 16.3774, 16.3709, 16.3179, 16.3019, 16.3149, 16.2838, 16.2689, 16.2602, 16.2679, 16.2921, 16.312, 16.3158, 16.3198, 16.2955, 16.303, 16.327, 16.356, 16.313, 16.3, 16.2806, 16.2634, 16.2856, 16.2702, 16.2136, 16.2782, 16.276, 16.2231, 16.2255, 16.1314, 16.0796, 16.1192, 16.0977, 16.1358, 16.1408, 16.1703]
#### VARIABLES & SETTINGS ####
dataingestperiod = 17
original_list_count = len(l)
timeframed_list = l[-dataingestperiod:]
timeframed_list_count = len(timeframed_list)
def groupSequence(x):
    it = iter(x)
    prev, res = next(it), []
    while prev is not None:
        start = next(it, None)
        if start and start > prev:
            res.append(prev)
        elif res:
            yield list(res + [prev])
            res = []
        prev = start
def divbyZero(increasingcount,decreasingcount):
    try: return increasingcount/decreasingcount
    except ZeroDivisionError: return 0
def divbyZeroMain(increasingcountMain,decreasingcountMain):
    try: return increasingcountMain/decreasingcountMain
    except ZeroDivisionError: return 0
#### TIMEFRAMED LIST CALCULATIONS#####
listA_1 = (list(groupSequence(timeframed_list))) # obtain numbers in increasing sequence
# print(len(listA_1)) # number of increases in mixed format
listA = list(itertools.chain.from_iterable(listA_1)) # remove double brackets to enable list count
increasingcount = len(listA)
decreasingcount = timeframed_list_count - increasingcount
trend = divbyZero(increasingcount,decreasingcount)
#### MAIN APPENDING LIST CALCULATIONS #####
listMain_1 = (list(groupSequence(l)))
listMain = list(itertools.chain.from_iterable(listMain_1))
increasingcountMain = len(listMain)
decreasingcountMain = original_list_count - increasingcountMain
trendMain = divbyZeroMain(increasingcountMain,decreasingcountMain)
###Timeframed list increases-only appending to max last"dataingestperiod" perhaps problem on live feed data....###
increase_list_timeframed = []
for x in listA:
    increase_list_timeframed.append(x)
### Main list increases only appending...####
increase_list_Main = []
for x in listMain:
    increase_list_Main.append(x)
###Prints ON TIMEFRAMED LIST ####
print ("---------------")
print ("---------------")
print ("Timeframed list count set at max {}".format(timeframed_list_count))
print ("Count of decreases in timeframed list is {}".format(decreasingcount))
print ("Count of increases in timeframed list is {}".format(increasingcount))
print ("Current timeframed trend is {}".format(trend))
print ("---------------")
print ("---------------")
###Prints ON MAIN LIST ####
print ("Main appending list count so far is {}".format(original_list_count))
print ("Count of decreases in Main appending list is {}".format(decreasingcountMain))
print ("Count of increases in Main appending list is {}".format(increasingcountMain))
print ("Current Main trend is {}".format(trendMain))
The actual code, as run live against Binance, is listed below with the above code included. You also need to run "pip install python-binance" and "pip install websocket_client". I got the Binance access code from ParttimeLarry.
Outcome 2 required: when run live, all calculations should run without interruption.
import itertools
import copy
import websocket, json, pprint, talib, numpy
from binance.client import Client
from binance.enums import *
#DATA FROM WEBSOCKETS########
SOCKET = "wss://stream.binance.com:9443/ws/linkusdt@kline_1m"
API_KEY = 'yourAPI_KEY'
API_SECRET ='yourAPI_Secret'
closes = [] # created for RSI indicator only using closes
in_position = False
client = Client(API_KEY, API_SECRET) # tld='us'
def order(side, quantity, symbol,order_type=ORDER_TYPE_MARKET):
try:
print("sending order")
order = client.create_order(symbol=symbol, side=side, type=order_type, quantity=quantity)
print(order)
except Exception as e:
print("an exception occured - {}".format(e))
return False
return True
def on_open(ws):
print('opened connection')
# start_time = datetime.datetime.now().time().strftime('%H:%M:%S')
# try:
# file = open("C:/GITPROJECTS/binance-bot/csvstats.txt","+a")
# file.write("New Session Open Connection Start at time {}\n".format(datetime.datetime.now())))
# finally:
# file.close()
def on_close(ws):
print('closed connection')
def on_message(ws, message):
global closes, in_position
print('received message')
json_message = json.loads(message)
pprint.pprint(json_message)
candle = json_message['k']
is_candle_closed = candle['x']
close = candle['c']
if is_candle_closed:
print("candle closed at {}".format(close))
closes.append(float(close))
print("closes")
print(closes)
####################################################################################
########CALCULATIONS ON INDICATORS #################################################
# dataingestperiod = 5
l = copy.deepcopy(closes)
maincountofcloses = len(l)
print ("Total count of closes so far {}".format(maincountofcloses))
#### VARIABLES & SETTINGS ####
l = copy.deepcopy(closes)
dataingestperiod = 3
original_list_count = len(l)
#print ("Main list count so far is {}".format(original_list_count))
timeframed_list = l[-dataingestperiod:]
timeframed_list_count = len(timeframed_list)
#print ("Timeframed list count set at max {}".format(timeframed_list_count))
def groupSequence(x):
it = iter(x)
prev, res = next(it), []
while prev is not None:
start = next(it, None)
if start and start > prev:
res.append(prev)
elif res:
yield list(res + [prev])
res = []
prev = start
def divbyZero(increasingcount,decreasingcount):
try: return increasingcount/decreasingcount
except ZeroDivisionError: return 0
def divbyZeroMain(increasingcountMain,decreasingcountMain):
try: return increasingcountMain/decreasingcountMain
except ZeroDivisionError: return 0
#### TIMEFRAMED LIST CALCULATIONS#####
listA_1 = (list(groupSequence(timeframed_list))) # obtain numbers in increasing sequence
# print(len(listA_1)) # number of increases in mixed format
listA = list(itertools.chain.from_iterable(listA_1)) # remove double brackets to enable list count
increasingcount = len(listA)
decreasingcount = timeframed_list_count - increasingcount
trend = divbyZero(increasingcount,decreasingcount)
#### MAIN APPENDING LIST CALCULATIONS #####
listMain_1 = (list(groupSequence(l)))
listMain = list(itertools.chain.from_iterable(listMain_1))
increasingcountMain = len(listMain)
decreasingcountMain = original_list_count - increasingcountMain
trendMain = divbyZeroMain(increasingcountMain,decreasingcountMain)
increase_list_timeframed = []
for x in listA:
increase_list_timeframed.append(x)
increase_list_Main = []
for x in listMain:
increase_list_Main.append(x)
###Prints ON TIMEFRAMED LIST ####
print ("---------------")
print ("---------------")
print ("Timeframed list count set at max {}".format(timeframed_list_count))
print ("Count of decreases in timeframed list is {}".format(decreasingcount))
print ("Count of increases in timeframed list is {}".format(increasingcount))
print ("Current timeframed trend is {}".format(trend))
print ("---------------")
print ("---------------")
###Prints ON MAIN LIST ####
print ("Main appending list count so far is {}".format(original_list_count))
print ("Count of decreases in Main appending list is {}".format(decreasingcountMain))
print ("Count of increases in Main appending list is {}".format(increasingcountMain))
print ("Current Main trend is {}".format(trendMain))
# threading.Timer(10.0, divbyZeroMain).start()
# threading.Timer(10.0, divbyZero).start()
# ws = websocket.WebSocketApp(SOCKET, on_open=on_open, on_close=on_close, on_message=on_message)
# ws.run_forever()
ws = websocket.WebSocketApp(SOCKET, on_open=on_open, on_close=on_close, on_message=on_message)
ws.run_forever()
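
One possible shape for the requested function is sketched below, on the assumption that the two near-identical calculation blocks (timeframed list and main list) can share a helper. The names `calculations`, `trend_stats` and `safe_ratio` are hypothetical; `group_sequence` keeps the logic of the question's `groupSequence`, including its requirement that the list is not empty.

```python
import itertools


def group_sequence(values):
    """Yield each run of strictly increasing values (same logic as groupSequence)."""
    it = iter(values)          # assumes `values` has at least one element
    prev, run = next(it), []
    while prev is not None:
        nxt = next(it, None)
        if nxt and nxt > prev:
            run.append(prev)
        elif run:
            yield run + [prev]
            run = []
        prev = nxt


def safe_ratio(numerator, denominator):
    """Division that returns 0 instead of raising on a zero denominator."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return 0


def trend_stats(values):
    """Return (increasing_count, decreasing_count, trend) for one list."""
    increasing = len(list(itertools.chain.from_iterable(group_sequence(values))))
    decreasing = len(values) - increasing
    return increasing, decreasing, safe_ratio(increasing, decreasing)


def calculations(closes, period):
    """Compute the stats for the last `period` closes and for the whole list."""
    return {"timeframed": trend_stats(closes[-period:]),
            "main": trend_stats(closes)}


result = calculations([1.0, 2.0, 3.0, 2.0, 1.0, 5.0], period=3)
print(result)
```

Inside `on_message` this could be called once per closed candle as `calculations(closes, dataingestperiod)`. Slicing already produces a new list, so the `copy.deepcopy` calls would not be needed for a list of floats. Whether this resolves the stalling on the live feed is not something the sketch establishes.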

Python3, nested dict comparison (recursive?)

I'm writing a program to take a .csv file and create 'metrics' for ticket closure data. Each ticket has one or more time entries; the goal is to grab the 'delta' (ie - time difference) for open -> close and time_start -> time_end on a PER TICKET basis; these are not real variables, they're just for the purpose of this question.
So, say we have ticket 12345 that has 3 time entries like so:
ticket: 12345
open: 2016-09-26 00:00:00.000 close: 2016-09-27 00:01:00.000
time_start: 2016-09-26 00:01:00.000 time_end: 2016-09-26 00:02:00.000
ticket: 12345
open: 2016-09-26 00:00:00.000 close: 2016-09-27 00:01:00.000
time_start: 2016-09-26 00:01:00.000 time_end: 2016-09-26 00:02:00.000
ticket: 12345
open: 2016-09-26 00:00:00.000 close: 2016-09-27 00:01:00.000
time_start: 2016-09-26 00:01:00.000 time_end: 2016-09-27 00:02:00.000
I'd like to have the program display ONE entry for this, adding up the 'deltas', like so:
ticket: 12345
Delta open/close ($total time from open to close):
Delta start/end: ($total time of ALL ticket time entries added up)
Here's what I have so far;
.csv example:
Ticket #,Ticket Type,Opened,Closed,Time Entry Day,Start,End
737385,Software,2016-09-06 12:48:31.680,2016-09-06 15:41:52.933,2016-09-06 00:00:00.000,1900-01-01 15:02:00.417,1900-01-01 15:41:00.417
737318,Hardware,2016-09-06 12:20:28.403,2016-09-06 14:35:58.223,2016-09-06 00:00:00.000,1900-01-01 14:04:00.883,1900-01-01 14:35:00.883
737296,Printing/Scan/Fax,2016-09-06 11:37:10.387,2016-09-06 13:33:07.577,2016-09-06 00:00:00.000,1900-01-01 13:29:00.240,1900-01-01 13:33:00.240
737273,Software,2016-09-06 10:54:40.177,2016-09-06 13:28:24.140,2016-09-06 00:00:00.000,1900-01-01 13:17:00.860,1900-01-01 13:28:00.860
737261,Software,2016-09-06 10:33:09.070,2016-09-06 13:19:41.573,2016-09-06 00:00:00.000,1900-01-01 13:05:00.113,1900-01-01 13:15:00.113
737238,Software,2016-09-06 09:52:57.090,2016-09-06 14:42:16.287,2016-09-06 00:00:00.000,1900-01-01 12:01:00.350,1900-01-01 12:04:00.350
737238,Software,2016-09-06 09:52:57.090,2016-09-06 14:42:16.287,2016-09-06 00:00:00.000,1900-01-01 14:36:00.913,1900-01-01 14:42:00.913
737220,Password,2016-09-06 09:28:16.060,2016-09-06 11:41:16.750,2016-09-06 00:00:00.000,1900-01-01 11:30:00.303,1900-01-01 11:36:00.303
737197,Hardware,2016-09-06 08:50:23.197,2016-09-06 14:02:18.817,2016-09-06 00:00:00.000,1900-01-01 13:48:00.530,1900-01-01 14:02:00.530
736964,Internal,2016-09-06 01:02:27.453,2016-09-06 05:46:00.160,2016-09-06 00:00:00.000,1900-01-01 06:38:00.917,1900-01-01 06:45:00.917
class Time_Entry.py:
#! /usr/bin/python
from datetime import *
class Time_Entry:
    def __init__(self, ticket_no, time_entry_day, opened, closed, start, end):
        self.ticket_no = ticket_no
        self.time_entry_day = time_entry_day
        self.opened = opened
        self.closed = closed
        self.start = datetime.strptime(start, '%Y-%m-%d %H:%M:%S.%f')
        self.end = datetime.strptime(end, '%Y-%m-%d %H:%M:%S.%f')
        self.total_open_close_delta = 0
        self.total_start_end_delta = 0
    def open_close_delta(self, topen, tclose):
        open_time = datetime.strptime(topen, '%Y-%m-%d %H:%M:%S.%f')
        if tclose != '\\N':
            close_time = datetime.strptime(tclose, '%Y-%m-%d %H:%M:%S.%f')
            self.total_open_close_delta = close_time - open_time
    def start_end_delta(self, tstart, tend):
        start_time = datetime.strptime(tstart, '%Y-%m-%d %H:%M:%S.%f')
        end_time = datetime.strptime(tend, '%Y-%m-%d %H:%M:%S.%f')
        start_end_delta = (end_time - start_time).seconds
        self.total_start_end_delta += start_end_delta
        return (self.total_start_end_delta)
    def add_start_end_delta(self, delta):
        self.total_start_end_delta += delta
    def display(self):
        print('Ticket #: %7.7s Start: %-15s End: %-15s Delta: %-10s' % (self.ticket_no, self.start.time(), self.end.time(), self.total_start_end_delta))
Which is called by metrics.py:
#! /usr/bin/python
import csv
import pprint
from Time_Entry import *
file = '/home/jmd9qs/userdrive/metrics.csv'
# setup CSV, load up a list of dicts
reader = csv.DictReader(open(file))
dict_list = []
for line in reader:
    dict_list.append(line)
def load_tickets(ticket_list):
    for i, key in enumerate(ticket_list):
        ticket_no = key['Ticket #']
        time_entry_day = key['Time Entry Day']
        opened = key['Opened']
        closed = key['Closed']
        start = key['Start']
        end = key['End']
        time_entry = Time_Entry(ticket_no, time_entry_day, opened, closed, start, end)
        time_entry.open_close_delta(opened, closed)
        time_entry.start_end_delta(start, end)
        for h, key2 in enumerate(ticket_list):
            ticket_no2 = key2['Ticket #']
            time_entry_day2 = key2['Time Entry Day']
            opened2 = key2['Opened']
            closed2 = key2['Closed']
            start2 = key2['Start']
            end2 = key2['End']
            time_entry2 = Time_Entry(ticket_no2, time_entry_day2, opened2, closed2, start2, end2)
            if time_entry.ticket_no == time_entry2.ticket_no and i != h:
                # add delta and remove second time_entry from dict (no counting twice)
                time_entry2_delta = time_entry2.start_end_delta(start2, end2)
                time_entry.add_start_end_delta(time_entry2_delta)
                del dict_list[h]
        time_entry.display()
load_tickets(dict_list)
This seems to work OK so far; however, I get multiple lines of output per ticket instead of one with the 'deltas' added. FYI the way the program displays output is different from my example, which is intentional. See example below:
Ticket #: 738388 Start: 15:24:00.313000 End: 15:35:00.313000 Delta: 2400
Ticket #: 738388 Start: 16:30:00.593000 End: 16:40:00.593000 Delta: 1260
Ticket #: 738381 Start: 15:40:00.763000 End: 16:04:00.767000 Delta: 1440
Ticket #: 738357 Start: 13:50:00.717000 End: 14:10:00.717000 Delta: 1200
Ticket #: 738231 Start: 11:16:00.677000 End: 11:21:00.677000 Delta: 720
Ticket #: 738203 Start: 16:15:00.710000 End: 16:31:00.710000 Delta: 2160
Ticket #: 738203 Start: 09:57:00.060000 End: 10:02:00.060000 Delta: 1560
Ticket #: 738203 Start: 12:26:00.597000 End: 12:31:00.597000 Delta: 900
Ticket #: 738135 Start: 13:25:00.880000 End: 13:50:00.880000 Delta: 2040
Ticket #: 738124 Start: 07:56:00.117000 End: 08:31:00.117000 Delta: 2100
Ticket #: 738121 Start: 07:47:00.903000 End: 07:52:00.903000 Delta: 300
Ticket #: 738115 Start: 07:15:00.443000 End: 07:20:00.443000 Delta: 300
Ticket #: 737926 Start: 06:40:00.813000 End: 06:47:00.813000 Delta: 420
Ticket #: 737684 Start: 18:50:00.060000 End: 20:10:00.060000 Delta: 13380
Ticket #: 737684 Start: 13:00:00.560000 End: 13:08:00.560000 Delta: 8880
Ticket #: 737684 Start: 08:45:00 End: 10:00:00 Delta: 9480
Note that there are a few tickets with more than one entry, which is what I don't want.
Any notes on style, convention, etc. are also welcome as I'm trying to be more 'Pythonic'
The problem here is that with a nested loop like the one you implemented you double-examine the same ticket. Let me explain it better:
ticket_list = [111111, 111111, 666666, 777777] # lets simplify considering the ids only
# I'm trying to keep the same variable names
for i, key1 in enumerate(ticket_list): # outer loop
    cnt = 1
    for h, key2 in enumerate(ticket_list): # inner loop
        if key1 == key2 and i != h:
            print('>> match on i:', i, '- h:', h)
            cnt += 1
    print('Found', key1, cnt, 'times')
See how it double counts the 111111:
>> match on i: 0 - h: 1
Found 111111 2 times
>> match on i: 1 - h: 0
Found 111111 2 times
Found 666666 1 times
Found 777777 1 times
That's because you will match the 111111 both when the inner loop examines the first position and the outer the second (i: 0, h: 1), and again when the outer is on the second position and the inner is on the first (i: 1, h: 0).
A proposed solution
A better solution for your problem is to group the entries of the same ticket together and then sum your deltas. groupby is ideal for your task. I took the liberty of rewriting some of the code.
I modified the constructor to accept the dictionary itself, which makes passing the parameters less messy. I also removed the methods that add the deltas; we'll see why later.
import csv
import itertools
from datetime import datetime
class Time_Entry(object):
    def __init__(self, entry):
        self.ticket_no = entry['Ticket #']
        self.time_entry_day = entry['Time Entry Day']
        self.opened = datetime.strptime(entry['Opened'], '%Y-%m-%d %H:%M:%S.%f')
        self.closed = datetime.strptime(entry['Closed'], '%Y-%m-%d %H:%M:%S.%f')
        self.start = datetime.strptime(entry['Start'], '%Y-%m-%d %H:%M:%S.%f')
        self.end = datetime.strptime(entry['End'], '%Y-%m-%d %H:%M:%S.%f')
        # total_seconds() also counts whole days, which .seconds silently drops
        self.total_open_close_delta = int((self.closed - self.opened).total_seconds())
        self.total_start_end_delta = int((self.end - self.start).total_seconds())
    def display(self):
        print('Ticket #: %7.7s Start: %-15s End: %-15s Delta: %-10s' % (self.ticket_no, self.start.time(), self.end.time(), self.total_start_end_delta))
Here we load the data using a list comprehension; the final output is a list of Time_Entry objects:
with open('metrics.csv') as ticket_list:
    time_entry_list = [Time_Entry(line) for line in csv.DictReader(ticket_list)]
print(time_entry_list)
# [<Time_Entry object at 0x101142f60>, <Time_Entry object at 0x10114d048>, <Time_Entry object at 0x1011fddd8>, ... ]
In the nested-loop version instead you kept rebuilding the Time_Entry inside the inner loop, which means for 100 entries you end up initializing 10000 temporary variables! Creating a list "outside" instead allows us to initialize each Time_Entry only once.
Here comes the magic: we can use groupby to collect all the objects with the same ticket_no in the same list. groupby only groups adjacent items, so the list has to be sorted by the same key first. Note that sorted() returns a new list, so its result must be kept:
time_entry_list = sorted(time_entry_list, key=lambda x: x.ticket_no)
ticket_grps = itertools.groupby(time_entry_list, key=lambda x: x.ticket_no)
tickets = [(ticket_no, list(entries)) for ticket_no, entries in ticket_grps]
The final result in tickets is a list of tuples with the ticket id in the first position and the list of associated Time_Entry objects in the last:
print(tickets)
# [('737385', [<Time_Entry object at 0x101142f60>]),
# ('737318', [<Time_Entry object at 0x10114d048>]),
# ('737238', [<Time_Entry object at 0x1011fdd68>, <Time_Entry object at 0x1011fde80>]),
# ...]
So finally we can iterate over all the tickets, and using again a list comprehension we can build a list containing only the deltas so we can sum them together. You can see why we removed the old method to update the deltas, since now we simply store their value for the single entry and then sum them externally.
Here is your result:
for ticket in tickets:
    print('ticket:', ticket[0])
    # extract list of deltas and then sum
    print('Delta open / close:', sum([entry.total_open_close_delta for entry in ticket[1]]))
    print('Delta start / end:', sum([entry.total_start_end_delta for entry in ticket[1]]))
    print('(found {} occurrences)'.format(len(ticket[1])))
    print()
Output:
ticket: 736964
Delta open / close: 17012
Delta start / end: 420
(found 1 occurrences)
ticket: 737197
Delta open / close: 18715
Delta start / end: 840
(found 1 occurrences)
ticket: 737220
Delta open / close: 7980
Delta start / end: 360
(found 1 occurrences)
ticket: 737238
Delta open / close: 34718
Delta start / end: 540
(found 2 occurrences)
ticket: 737261
Delta open / close: 9992
Delta start / end: 600
(found 1 occurrences)
ticket: 737273
Delta open / close: 9223
Delta start / end: 660
(found 1 occurrences)
ticket: 737296
Delta open / close: 6957
Delta start / end: 240
(found 1 occurrences)
ticket: 737318
Delta open / close: 8129
Delta start / end: 1860
(found 1 occurrences)
ticket: 737385
Delta open / close: 10401
Delta start / end: 2340
(found 1 occurrences)
At the end of the story: list comprehensions can be super-useful, as they allow you to do a lot of stuff with a super-compact syntax. The Python standard library also contains a lot of ready-to-use tools that can really come to your aid, so get familiar with it!
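One detail in the output above is worth a second look: for ticket 737238 the open/close delta is printed as 34718, which is twice the actual 17359 seconds, because the same open/close interval is summed once per time entry. The following sketch is an alternative under the assumption that the open/close delta should be counted once per ticket and only the start/end deltas summed. It groups with plain dictionaries instead of groupby, so no sorting is needed; `summarize`, `FMT` and `SAMPLE` are hypothetical names, and the rows are copied from the question's CSV.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

FMT = '%Y-%m-%d %H:%M:%S.%f'


def summarize(rows):
    """Return {ticket_no: (open_close_delta, total_start_end_delta)}."""
    open_close = {}
    start_end = defaultdict(timedelta)  # missing tickets start at zero
    for row in rows:
        ticket = row['Ticket #']
        opened = datetime.strptime(row['Opened'], FMT)
        closed = datetime.strptime(row['Closed'], FMT)
        # Same value on every row of a ticket, so overwriting is harmless.
        open_close[ticket] = closed - opened
        start_end[ticket] += (datetime.strptime(row['End'], FMT)
                              - datetime.strptime(row['Start'], FMT))
    return {t: (open_close[t], start_end[t]) for t in open_close}


SAMPLE = """Ticket #,Ticket Type,Opened,Closed,Time Entry Day,Start,End
737238,Software,2016-09-06 09:52:57.090,2016-09-06 14:42:16.287,2016-09-06 00:00:00.000,1900-01-01 12:01:00.350,1900-01-01 12:04:00.350
737238,Software,2016-09-06 09:52:57.090,2016-09-06 14:42:16.287,2016-09-06 00:00:00.000,1900-01-01 14:36:00.913,1900-01-01 14:42:00.913
737220,Password,2016-09-06 09:28:16.060,2016-09-06 11:41:16.750,2016-09-06 00:00:00.000,1900-01-01 11:30:00.303,1900-01-01 11:36:00.303
"""

summary = summarize(csv.DictReader(io.StringIO(SAMPLE)))
for ticket, (open_close_delta, start_end_delta) in summary.items():
    print(ticket, open_close_delta, start_end_delta)
```

Keeping the values as `timedelta` objects avoids the days-versus-seconds pitfall; `total_seconds()` can be called at display time if plain numbers are preferred. Rows whose Closed field is the `\N` marker handled in the question's code would need a guard before parsing.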

Python - calculations with more than one splited serial data

I use Arduino to receive data from sensors (4 types of data : humidity, temperature, photocell and milliseconds)
Datas comes like this : xx xx xx xxxx in the serial buffer. (data space data space etc...)
I split this line in order to isolate each data because I want to make individual calculations for each sensor.
Calculation for each sensor consist on : ((latest_data) - (data_of_previous_line), latest_data) in order to get a tuple for each sensor. I want all the sensors tuples appearing in the same line.
Doing this with 1 sensor and 1 method (calculate()) is working fine but it doesn't work if I add a second sensor in sensors() object !
QUESTION : how to make all this working with at least 2 sensors data ?
(the code below is working perfectly with 1 "splited" sensor data).
Thanks in advance.
import operator
import time
from functools import reduce
import serial
class _share:
    def __init__(self):
        self.last_val = [0 for i in range(2)]
    def calculate(self, val):
        self.last_data = val
        self.last_val = [self.last_data] + self.last_val[:-1]
        diff = reduce(operator.__sub__, self.last_val)
        print (diff, val)
        return (diff, val)
share = _share()
ser = serial.Serial('/dev/ttyS1', 9600, timeout=0.1)
def sensors():
    while True:
        try:
            time.sleep(0.01)
            ser.flushInput()
            reception = ser.readline()
            receptionsplit = reception.split()
            sensor_milli = receptionsplit[3]
            sensor_pho_1 = receptionsplit[2]
            sensor_tem_1 = receptionsplit[1]
            sensor_hum_1 = receptionsplit[0]
            int_sensor_milli = int(sensor_milli)
            int_sensor_pho_1 = int(sensor_pho_1)
            int_sensor_tem_1 = int(sensor_tem_1)
            int_sensor_hum_1 = int(sensor_hum_1)
            a = int_sensor_milli
            b = int_sensor_pho_1
            c = int_sensor_tem_1
            d = int_sensor_hum_1
            return str(share.calculate(b))
        except:
            pass
        time.sleep(0.1)
f = open('da.txt', 'ab')
while 1:
    arduino_sensor = sensors()
    f.write(arduino_sensor)
    f.close()
    f = open('da.txt', 'ab')
You need to use a different share instance for each sensor, otherwise the calculations will be wrong, because every sensor would be compared against the previous value of a different sensor. For example, use share_a, share_b, share_c and share_d (each created with _share()) for a, b, c and d respectively. Now, if I understand this correctly, you can return all the sensors at once by changing your return to:
return [ str(share_a.calculate(a)), str(share_b.calculate(b)), str(share_c.calculate(c)), str(share_d.calculate(d)) ]
The above returns a list containing all 4 sensors, and then in your main loop you can change to:
arduino_sensor = sensors()
sensor_string = "a:%s b:%s c:%s d:%s" % ( arduino_sensor[0], arduino_sensor[1], arduino_sensor[2], arduino_sensor[3] )
print(sensor_string) # single line screen print of all sensor data
f.write( sensor_string )
I hope that is helpful.
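The one-instance-per-sensor idea can be sketched without the serial port. This is a minimal Python 3 illustration, not the asker's code: `DeltaTracker`, `SENSOR_NAMES`, `trackers` and `process_line` are hypothetical names, and the field order (humidity, temperature, photocell, milliseconds) is taken from the indexes used in the question.

```python
class DeltaTracker:
    """Remembers the previous reading of one sensor."""

    def __init__(self):
        self.previous = 0  # same starting value as the question's [0, 0] buffer

    def calculate(self, value):
        """Return (difference from the previous reading, current reading)."""
        diff = value - self.previous
        self.previous = value
        return (diff, value)


# Field order as split in the question: index 0..3.
SENSOR_NAMES = ("hum", "tem", "pho", "milli")
trackers = {name: DeltaTracker() for name in SENSOR_NAMES}


def process_line(line):
    """Parse one 'xx xx xx xxxx' line and return one tuple per sensor."""
    values = [int(field) for field in line.split()]
    return [trackers[name].calculate(value)
            for name, value in zip(SENSOR_NAMES, values)]


for raw in ("40 21 300 1000", "42 20 310 1100"):
    print(process_line(raw))
```

Because each tracker keeps its own previous value, adding a sensor only means adding a name to `SENSOR_NAMES`.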

Python: File formatting

I have a for loop which references a dictionary and prints out the value associated with the key. Code is below:
for i in data:
    if i in dict:
        print dict[i],
How would I format the output so that a new line is created every 60 characters, with the character count along the side? For example:
0001 MRQLLLISDLDNTWVGDQQALEHLQEYLGDRRGNFYLAYATGRSYHSARELQKQVGLMEP
0061 DYWLTAVGSEIYHPEGLDQHWADYLSEHWQRDILQAIADGFEALKPQSPLEQNPWKISYH
0121 LDPQACPTVIDQLTEMLKETGIPVQVIFSSGKDVDLLPQRSNKGNATQYLQQHLAMEPSQ
It's a finicky formatting problem, but I think the following code:
import sys
class EveryN(object):
    def __init__(self, n, outs):
        self.n = n # chars/line
        self.outs = outs # output stream
        self.numo = 1 # next tag to write
        self.tll = 0 # tot chars on this line
    def write(self, s):
        while True:
            if self.tll == 0: # start of line: emit tag
                self.outs.write('%4.4d ' % self.numo)
                self.numo += self.n
            # write up to N chars/line, no more
            numw = min(len(s), self.n - self.tll)
            self.outs.write(s[:numw])
            self.tll += numw
            if self.tll >= self.n:
                self.tll = 0
                self.outs.write('\n')
            s = s[numw:]
            if not s: break
if __name__ == '__main__':
    sys.stdout = EveryN(60, sys.stdout)
    for i, a in enumerate('abcdefgh'):
        print a*(5+ i*5),
shows how to do it -- the output when running for demonstration purposes as the main script (five a's, ten b's, etc, with spaces in-between) is:
0001 aaaaa bbbbbbbbbb ccccccccccccccc dddddddddddddddddddd eeeeee
0061 eeeeeeeeeeeeeeeeeee ffffffffffffffffffffffffffffff ggggggggg
0121 gggggggggggggggggggggggggg hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
0181 hhhhhhh

# test data
data = range(10)
the_dict = dict((i, str(i)*200) for i in range( 10 ))
# your loops as a generator
lines = ( the_dict[i] for i in data if i in the_dict )
def format( line ):
    def splitter():
        k = 0
        while True:
            r = line[k:k+60] # take a 60 char block
            if r: # if there are any chars left
                yield "%04d %s" % (k+1, r) # format them
            else:
                break
            k += 60
    return '\n'.join(splitter()) # join all the numbered blocks
for line in lines:
    print format(line)

I haven't tested it on actual data, but I believe the code below would do the job. It first builds up the whole string, then outputs it a 60-character line at a time. It uses the three-argument version of range() to count by 60.
s = ''.join(dict[i] for i in data if i in dict)
for i in range(0, len(s), 60):
    print '%04d %s' % (i+1, s[i:i+60])

It seems like you're looking for textwrap:
The textwrap module provides two convenience functions, wrap() and fill(), as well as TextWrapper, the class that does all the work, and a utility function dedent(). If you’re just wrapping or filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of TextWrapper for efficiency.
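textwrap on its own does not add the position tags, so the numbering still has to be done by hand. A minimal Python 3 sketch of combining the two follows; `numbered_lines` and `demo` are hypothetical names, and the offsets are only correct under the stated assumption that the text contains no whitespace (as with the sequence in the question), so that `wrap()` cuts exactly every `width` characters.

```python
import textwrap


def numbered_lines(sequence, width=60):
    """Split `sequence` into `width`-char lines prefixed with a 1-based offset."""
    chunks = textwrap.wrap(sequence, width)
    # Offsets assume every chunk except the last is exactly `width` long,
    # which holds when the text contains no whitespace.
    return ["%04d %s" % (i * width + 1, chunk) for i, chunk in enumerate(chunks)]


demo = "MRQLLLISDL" * 13  # 130 characters, no spaces
for line in numbered_lines(demo):
    print(line)
```

For text that does contain spaces, the range-based slicing shown in the earlier answer keeps the offsets exact, since it never moves a break to a word boundary.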
