I have been using Zipline for some time now and could highly benefit from being able to pickle a backtest in order to be able to resume it later. The idea is to save the state of the trading algorithm and update it when new data become available. I started pickling some attributes I could think of but forgot some others and was therefore wondering if anyone had an easy solution to do that.
Best,
Vincent
PS:
I tried updating the portfolio with those few lines. It goes ok but more attributes need to be overwritten.
if self.load_former_ptf:
for k, v in context.former_portfolio.__dict__.items():
self.TradingAlgorithm.portfolio.__setattr__(k, v)
updPositionDict = {}
for p in context.former_portfolio.positions.values():
formerDelta = p.amount*p.last_sale_price
newSid = context.symbol(p.sid.symbol)
newPrice = data[newSid].price
newQuantity = int(formerDelta/newPrice)
# portfolio should be made of positions instead of plain dict
updPositionDict.update({newSid:{'amount':newQuantity, 'cost_basis':p.cost_basis,
'last_sale_date':p.last_sale_price, 'last_sale_price':newPrice,
'sid':newSid}})
self.TradingAlgorithm.portfolio.positions = updPositionDict
self.load_former_ptf = False
...wondering if anyone had an easy solution to do that.
I have not used zipline and don't have an implemented solution for what you are asking. All I can help you with is with pickling your test state.
Though from the information you've provided, I can only read that perf_tracker is what you need help with. Looking at its source there shouldn't be any problem with pickling it (correct me if I am wrong, because I haven't done it myself).
Also you can read about pickling custom objects here. Another interesting method is __repr__ which helps recreating objects from strings.
On every Tkinter notebook tab, there is a list of checkbuttons and the variables get saved to their corresponding v[ ] (i.e. cb.append(Checkbuttons(.., variables = v[x],..)).
For now, I am encountering this error:
File "/home/pass/OptionsInterface.py", line 27, in __init__
self.ntbk_render(f = self.f1, ntbkLabel="Options",cb = optsCb, msg = optMsg)
File "/home/pass/OptionsInterface.py", line 59, in ntbk_render
text = msg[x][1], command = self.cb_check(v, opt)))
File "/home/pass/OptionsInterface.py", line 46, in cb_check
opt[ix]=(v[ix].get())
IndexError: list assignment index out of range
And I think the error is coming here. I don't know how to access the values of the checkbutton variables.
def cb_check(self, v = [], cb = [], opt = []):
for ix in range(len(cb)):
opt[ix]=(v[ix].get())
print opt
Here are some snippets:
def cb_check(self, v = [], cb = [], opt = []):
for ix in range(len(cb)):
opt[ix]=(v[ix].get())
print opt
def ntbk_render(self, f=None, ntbkLabel="", cb = [], msg = []):
v = []
opt = []
msg = get_thug_args(word = ntbkLabel, argList = msg) #Allows to get the equivalent list (2d array)
#to serve as texts for their corresponding checkboxes
for x in range(len(msg)):
v.append(IntVar())
off_value = 0
on_value = 1
cb.append(Checkbutton(f, variable = v[x], onvalue = on_value, offvalue = off_value,
text = msg[x][1], command = self.cb_check(v, opt)))
cb[x].grid(row=self.rowTracker + x, column=0, sticky='w')
opt.append(off_value)
cb[-1].deselect()
After solving the error, I want to get all the values of the checkbutton variables of each tab after pressing the button Ok at the bottom. Any tips on how to do it will help!
Alright, so there’s a bit more (… alright, maybe a little more than a bit…) here than I intended, but I’ll leave it on the assumption that you’ll simply take away from it what you need or find of value.
The short answer is that when your Checkbutton calls cb_check, it’s passing the arguments like this:
cb_check(self = self, v = v, cb = opt, opt = [])
I think it’s pretty obvious why you’re getting an IndexError when we write it out like this: you’re using the length of your opt list for indexes to use on the empty list that the function uses when opt is not supplied; in other words, if you have 5 options, the it will try accessing indices [0…4] on empty list [] (obviously, it stops as soon as it fails to access Index 0). Your function doesn’t know that the thing you’re passing it are called v and opt: it simply takes some random references you give it and places them in the order of the positional arguments, filling in keyword arguments in order after that, and then fills out the rest of the keyword arguments with whatever defaults you told it to use.
Semi-Quick Aside:
When trying to fix an error, if I have no idea what went wrong, I would start by inserting a print statement right before it breaks with all the references that are involved in the broken line, which will often tell you what references do not contain the values you thought they had. If this looks fine, then I would step in further and further, checking any lookups/function returns for errors. For example:
def cb_check(self, v = [], cb = [], opt = []):
for ix in range(len(cb)):
print(ix, opt, v) ## First check, for sanity’s sake
print(v[ix]) ## Second Check if I still can’t figure it out, but
## this is a lookup, not an assignment, so it
## shouldn’t be the problem
print(v[ix].get()) ## Third Check, again, not an assignment
print(opt[ix]) ## “opt[ix]={something}” is an assignment, so this is
## (logically) where it’s breaking. Here we’re only
## doing a lookup, so we’ll get a normal IndexError
## instead (it won’t say “assignment”)
opt[ix]=(v[ix].get()) ##point in code where IndexError was raised
The simple fix would be to change the Checkbutton command to “lambda: self.cb_check(v,cb,opt)” or more explicitly (so we can do a sanity check) “lambda: self.cb_check(v = v, cb = cb, opt = opt).” (I’ll further mention that you can change “lambda:” to “lambda v = v, cb = cb, opt = opt:” to further ensure that you’ll forever be referring to the same lists, but this should be irrelevant, especially because of changes I’ll suggest below)
[The rest of this is: First Section- an explicit explanation of what your code is doing and a critique of it; second section- an alternative approach to how you have this laid out. As mentioned, the above fixes your problem, so the rest of this is simply an exercise in improvement]
In regards to your reference names-
There’s an old adage “Code is read much more often than it is written,” and part of the Zen of Python says: “Explicit is better than Implicit.[…] Readability counts.” So don’t be afraid to type a little bit more to make it easier to see what’s going on (same logic applies to explicitly passing variables to cb_check in the solution above). v can be varis; cb can be cbuttons; ix would be better (in my opinion) as ind or just plain index; f (in ntkb_render) should probably be parent or master.
Imports-
It looks like you’re either doing star (*) imports for tkinter, or explicitly importing parts of it. I’m going to discourage you from doing either of these things for two reasons. The first is the same reason as above: if only a few extra keystrokes makes it easier to see where everything came from, then it’s worth it in the long run. If you need to go through your code later to find every single tkinter Widget/Var/etc, then simply searching “tk” is a lot easier than searching “Frame” then “Checkbutton” then IntVar and so on. Secondly, imports occasionally clash: so if- for example- you do
import time ## or from time import time, even
from datetime import time
life may get kinda hairy for you. So it would be better to “import tkinter as tk” (for example) than the way you are currently doing it.
cb_check-
There are a few things I’ll point out about this function:
1) v, cb, and opt are all required for the function to work correctly; if the empty list reference is used instead, then it’s going to fail unless you created 0 Checkbuttons (because there wouldn’t be anything to iterate over in the “for loop”; regardless, this doesn’t seem like it should ever happen). What this means is that they’re better off simply being positional arguments (no default value). Had you written them this way, the function would have given you an error stating that you weren’t giving it enough information to work with rather than a semi-arbitrary “IndexError.”
2) Because you supply the function with all the information it needs, there is no practical reason (based on the code supplied, at any rate) as to why the function needs to be a method of some object.
3) This function is being called each time you select a Checkbutton, but reupdates the recorded values (in opt) of all the Checkbuttons (instead of just the one that was selected).
4) The opt list is technically redundant: you already have a reference to a list of all the IntVars (v), which are updated/maintained in real time without you having to do anything; it is basically just as easy to perform v[ix].get() as it is to do opt[ix]: in exchange for the “.get()” call when you eventually need the value you have to include a whole extra function and run it repeatedly to make sure your opt list is up to date. To complicate matters further, there’s an argument for v also being redundant, but we’ll get to that later.
And as an extra note: I’m not sure why you wrapped the integer value of your IntVar (v[ix].get()) with parentheses; they seem extraneous, but I don’t know if you’re trying to cast the value in the same manner as C/Java/etc.
ntbk_render-
Again, notice that this function is given nearly everything it needs to be executed, and therefore feels less like a method than a stand-alone function (at this moment; again, we’ll get to this at the end). The way it’s setup also means that it requires all of that information, so they would better off as positional argument as above.
The cb reference-
Unlike v and opt, the cb reference can be supplied to the function. If we follow cb along its path through the code, we’ll find out that its length must always be equal to v and opt. Assumedly, the reason we may want to pass cb to this method but not v or opt is because we only care about the reference to cb in the rest of our code. However, notice that cb is always an empty iterable with an append method (seems safe to assume it will always be an empty list). So either we should be testing to make sure that it’s empty before we start doing anything with it (because it will break our code if it isn’t), or we should just create it at the same time that we’re creating v and opt. Not knowing how your code is set up, I personally would think it would be easiest to initialize it alongside the other two and then simply return it at the end of the method (putting “return cb” at the end of this function and “cb=[whatever].ntbk_render(f = someframe, ntbklabel = “somethug”, msg = argList)”). Getting back to the redundancy of opt and v (point 4 in cb_check), since we’re keeping all the Checkbuttons around in cb, we can use them to access their IntVars when we need to.
msg-
You pass msg to the function, and then use it to the value of “argList” in get_thug_args and replace it with the result. I think it would make more sense to call the keyword that you pass the ntbk_render “argList” because that’s what it is going to be used for, and then simply let msg be the returned value of get_thug_args. (The same line of thought applies to the keyword “ntbkLabel”, for the record)
Iterating-
I’m not sure if using an Index Reference (x) is just a habit picked up from more rigid programing languages like C and Java, but iterating is probably one of my favorite advantages (subjective, I know) that Python has over those types of languages. Instead of using x, to get your option out of msg, you can simply step through each individual option inside of msg. The only place that we run into insurmountable problems is when we use self.rowTracker (which, on the subject, is not updated in your code; we’ll fix that for now, but as before, we’ll be dealing with that later). What we can do to amend this is utilize the enumerate function built into Python; this will create a tuple containing the current index followed by the value at the iterated index.
Furthermore, because you’re keeping everything in lists, you have to keep going back to the index of the list to get the reference. Instead, simply create references to the things (datatypes/objects) you are creating and then add the references to the lists afterwards.
Below is an adjustment to the code thus far based on most of the things I noted above:
import tkinter as tk ## tk now refers to the instance of the tkinter module that we imported
def ntbk_render(self, parent, word, argList):
cbuttons=list() ## The use of “list()” here is purely personal preference; feel free to
## continue with brackets
msg = get_thug_args(word = word, argList=argList) ## returns a 2d array [ [{some value},
## checkbutton text,…], …]
for x,option in enumerate(msg):
## Each iteration automatically does x=current index, option=msg[current_index]
variable = tk.IntVar()
## off and on values for Checkbuttons are 0 and 1 respectively by default, so it’s
## redundant at the moment to assign them
chbutton=tk.Checkbutton(parent, variable=variable, text=option[1])
chbutton.variable = variable ## rather than carrying the variable references around,
## I’m just going to tack them onto the checkbutton they
## belong to
chbutton.grid(row = self.rowTracker + x, column=0, sticky=’w’)
chbutton.deselect()
cbuttons.append(chbutton)
self.rowTracker += len(msg) ## Updating the rowTracker
return cbuttons
def get_options(self, cbuttons):
## I’m going to keep this new function fairly simple for clarity’s sake
values=[]
for chbutton in cbuttons:
value=chbutton.variable.get() ## It is for this purpose that we made
## chbutton.variable=variable above
values.append(value)
return values
Yes, parts of this are a bit more verbose, but any mistakes in the code are going to be much easier to spot because everything is explicit.
Further Refinement
The last thing I’ll touch on- without going into too much detail because I can’t be sure how much of this was new information for you- is my earlier complaints about how you were passing references around. Now, we already got rid of a lot of complexity by reducing the important parts down to just the list of Checkbuttons (cbuttons), but there are still a few references being passed that we may not need. Rather than dive into a lot more explanation, consider that each of these Notebook Tabs are their own objects and therefore could do their own work: so instead of having your program add options to each tab and carry around all the values to the options, you could relegate that work to the tab itself and then tell it how or what options to add and ask it for its options and values when you need them (instead of doing all that work in the main program).
Please read this whole question before answering, as it's not what you think... I'm looking at creating python object wrappers that represent hardware devices on a system (trimmed example below).
class TPM(object):
#property
def attr1(self):
"""
Protects value from being accidentally modified after
constructor is called.
"""
return self._attr1
def __init__(self, attr1, ...):
self._attr1 = attr1
...
#classmethod
def scan(cls):
"""Calls Popen, parses to dict, and passes **dict to constructor"""
Most of the constructor inputs involve running command line outputs in subprocess.Popen and then parsing the output to fill in object attributes. I've come up with a few ways to handle these, but I'm unsatisfied with what I've put together just far and am trying to find a better solution. Here are the common catches that I've found. (Quick note: tool versions are tightly controlled, so parsed outputs don't change unexpectedly.)
Many tools produce variant outputs, sometimes including fields and sometimes not. This means that if you assemble a dict to be wrapped in a container object, the constructor is more or less forced to take **kwargs and not really have defined fields. I don't like this because it makes static analysis via pylint, etc less than useful. I'd prefer a defined interface so that sphinx documentation is clearer and errors can be more reliably detected.
In lieu of **kwargs, I've also tried setting default args to None for many of the fields, with what ends up as pretty ugly results. One thing I dislike strongly about this option is that optional fields don't always come at the end of the command line tool output. This makes it a little mind-bending to look at the constructor and match it up to tool output.
I'd greatly prefer to avoid constructing a dictionary in the first place, but using setattr to create attributes will make pylint unable to detect the _attr1, etc... and create warnings. Any ideas here are welcome...
Basically, I am looking for the proper Pythonic way to do this. My requirements, for a re-summary are the following:
Command line tool output parsed into a container object.
Container object protects attributes via properties post-construction.
Varying number of inputs to constructor, with working static analysis and error detection for missing required fields during runtime.
Is there a good way of doing this (hopefully without a ton of boilerplate code) in Python? If so, what is it?
EDIT:
Per some of the clarification requests, we can take a look at the tpm_version command. Here's the output for my laptop, but for this TPM it doesn't include every possible attribute. Sometimes, the command will return extra attributes that I also want to capture. This makes parsing to known attribute names on a container object fairly difficult.
TPM 1.2 Version Info:
Chip Version: 1.2.4.40
Spec Level: 2
Errata Revision: 3
TPM Vendor ID: IFX
Vendor Specific data: 04280077 0074706d 3631ffff ff
TPM Version: 01010000
Manufacturer Info: 49465800
Example code (ignore lack of sanity checks, please. trimmed for brevity):
def __init__(self, chip_version, spec_level, errata_revision,
tpm_vendor_id, vendor_specific_data, tpm_version,
manufacturer_info):
self._chip_version = chip_version
...
#classmethod
def scan(cls):
tpm_proc = Popen("/usr/sbin/tpm_version")
stdout, stderr = Popen.communicate()
tpm_dict = dict()
for line in tpm_proc.stdout.splitlines():
if "Version Info:" in line:
pass
else:
split_line = line.split(":")
attribute_name = (
split_line[0].strip().replace(' ', '_').lower())
tpm_dict[attribute_name] = split_line[1].strip()
return cls(**tpm_dict)
The problem here is that this (or a different one that I may not be able to review the source of to get every possible field) could add extra things that cause my parser to work, but my object to not capture the fields. That's what I'm really trying to solve in an elegant way.
I've been working on a more solid answer to this the last few months, as I basically work on hardware support libraries and have finally come up with a satisfactory (though pretty verbose) answer.
Parse the tool outputs, whatever they look like, into objects structures that match up to how the tool views the device. These can have very generic dict structures, but should be broken out as much as possible.
Create another container class on top of that that which uses attributes to access items in the tool-container-objects. This enforces an API and can return sane errors across multiple versions of the tool, and across differing tool outputs!
I'm processing data from a serial port in python. The first byte indicates the start of a message and then the second byte indicates what type of message it is. Depending on that second byte we read in the message differently (to account for different types of messages, some are only data others are string and so on).
I now had the following structure. I have a general Message class that contains basic functions for every type of message and then derived classes that represent the different types of Messages (for example DataMessage or StringMessage). These have there own specific read and print function.
In my read_value_from_serial I read in all the byte. Right now I use the following code (which is bad) to determine if a message will be a DataMessage or a StringMessage (there are around 6 different type of messages but I simplify a bit).
msg_type = serial_port.read(size=1).encode("hex").upper()
msg_string = StringMessage()
msg_data = StringData()
processread = {"01" : msg_string.read, "02" : msg_data.read}
result = processread[msg_type]()
Now I want to simplify/improve this type of code. I've read about killing the switch but I don't like it that I have to create objects that I won't use in the end. Any suggestions for improving this specific problem?
Thanks
This is very close to what you have and I see nothing wrong with it.
class Message(object):
def print(self):
pass
class StringMessage(Message):
def __init__(self, port):
self.message = 'get a string from port'
def MessageFactory(port):
readers = {'01': StringMessage, … }
msg_type = serial_port.read(size=1).encode("hex").upper()
return readers[msg_type](port)
You say "I don't like it that I have to create objects that I won't use in the end". How is it that you aren't using the objects? If I have a StringMessage msg, then
msg.print()
is using an object exactly how it is supposed to be used. Did it bother you that your one instance of msg_string only existed to call msg_string.read()? My example code makes a new Message instance for every message read; that's what objects are for. That's actually how Object Oriented Programming works.
I have a simple spyne service:
class JiraAdapter(ServiceBase):
#srpc(Unicode, String, Unicode, _returns=Status)
def CreateJiraIssueWithBase64Attachment(summary, base64attachment, attachment_filename):
status = Status
try:
newkey = jira_client.createWithBase64Attachment(summary, base64attachment, attachment_filename)
status.Code = StatusCodes.IssueCreated
status.Message = unicode(newkey)
except Exception as e:
status.Code = StatusCodes.InternalError
status.Message = u'Internal Exception: %s' % e.message
return status
The problem is that some programs will insert '\n' into generated base64string, after every 60th character or so and it will come into the services' method escaped ('\\n') causing things to behave oddly. Is there a setting or something to avoid this?
First, some comments about the code you posted:
You must instantiate your types (i.e. status = Status() instead of status = Status). As it is, you're setting class attributes on the Status class. Not only this is just wrong, you're also creating race conditions by altering global state without proper locking.
Does Jira have a way of creating issues with binary data? You can use ByteArray that handles base64 encoding/decoding for you. Note that ByteArray gets deserialized as a sequence of strings.
You can define a custom base64 type:
Base64String = String(pattern='[0-9a-zA-Z/+=]+')
... and use it instead of plain String together with validation to effortlessly reject invalid input.
Instead of returning a "Status" object, I'd return nothing but raise an exception when needed (or you can just let the original exception bubble up). Exceptions also get serialized just like normal objects. But that's your decision to make as it depends on how you want your API to be consumed.
Now for your original question:
You'll agree that the right thing to do here is to fix whatever's escaping '\n' (i.e. 0x0a) as r"\n" (i.e. 0x5c 0x6e).
If you want to deal with it though, I guess the solution in your comment (i.e. base64attachment = base64attachment.decode('string-escape') would be the best solution.
I hope that helps.
Best regards,