How can I optionally populate instance of an object from pickle file? - python

Full code here: link
Relevant code is:
class Session():
    ...
    def load(self, create_date=None):
        if create_date:
            self = pickle.load(open(file_path(create_date), 'rb'))
        else:
            self = pickle.load(open(file_path(), 'rb'))
I have defined a simple class "Session"
(It's a container for everything else my app will do).
I have a method for creating a fresh session, and saving it to a pickle.
I intend to have one Session = one Day.
So if the user re-opens the app on a particular day, it checks for an existing session and reloads it from the pickle.
My current code throws an error:
AttributeError: 'Session' object has no attribute 'create_date'
I believe the line that isn't working properly is 44:
self = pickle.load(open(file_path(), 'rb'))
But I have a working variation on Line 12. (Not ideal, outside the class)
How can I load this existing pickle data and populate it into the "active_session" instance?
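Assigning to self inside a method only rebinds the local name; the caller's instance is untouched. A minimal sketch of two common workarounds, assuming file_path() comes from the question's full code:

import pickle

class Session():
    # ...rest of the class omitted...

    def load(self, create_date=None):
        # Copy the unpickled object's state into this instance
        # instead of rebinding the local name `self`.
        path = file_path(create_date) if create_date else file_path()
        with open(path, 'rb') as f:
            loaded = pickle.load(f)
        self.__dict__.update(loaded.__dict__)

    @classmethod
    def from_pickle(cls, create_date=None):
        # Alternative: return the unpickled object and let the caller
        # rebind its own variable.
        path = file_path(create_date) if create_date else file_path()
        with open(path, 'rb') as f:
            return pickle.load(f)

# usage:
# active_session = Session()
# active_session.load()                    # populates the existing instance
# active_session = Session.from_pickle()   # or rebind the caller's name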

Related

How can I reload a python class if file is changed on disk?

To support runtime changes to parameters that are stored in a source class file and used as an object with fields, how can I check whether the source file for the object has been modified since runtime started (or since the last reload), and if so reload the class and create a new instance of the object?
This method seems to work:
# imports needed by the snippet
import importlib
import inspect
import logging
from pathlib import Path
from time import time

logger = logging.getLogger(__name__)

def reload_class_if_modified(obj: object, every: int = 1) -> object:
    """
    Reloads an object if the source file was modified since runtime started or since last reloaded
    :param obj: the original object
    :param every: only check every this many times we are invoked
    :returns: the original object if classpath file has not been modified
        since startup or last reload time, otherwise the reloaded object
    """
    reload_class_if_modified.counter += 1
    if reload_class_if_modified.counter > 1 and reload_class_if_modified.counter % every != 0:
        return obj
    try:
        module = inspect.getmodule(obj)
        cp = Path(module.__file__)
        mtime = cp.stat().st_mtime
        classname = type(obj).__name__
        if (mtime > reload_class_if_modified.start_time and classname not in reload_class_if_modified.dict) \
                or (classname in reload_class_if_modified.dict and mtime > reload_class_if_modified.dict[classname]):
            importlib.reload(module)
            class_ = getattr(module, classname)
            o = class_()  # fresh instance from the reloaded class
            reload_class_if_modified.dict[classname] = mtime
            return o
        else:
            return obj
    except Exception as e:
        logger.error(f'could not reload {obj}: got exception {e}')
        return obj

# function attributes used as static state
reload_class_if_modified.dict = dict()
reload_class_if_modified.start_time = time()
reload_class_if_modified.counter = 0
Use it like this:
from neural_mpc_settings import neural_mpc_settings
from time import sleep

g = neural_mpc_settings()
while True:
    g = reload_class_if_modified(g, every=10)
    print(g.MIN_SPEED_MPS, end='\r')
    sleep(.1)
where neural_mpc_settings is
class neural_mpc_settings():
    MIN_SPEED_MPS = 5.0
When I change neural_mpc_settings.py on disk, the class is reloaded and the new object returned reflects the new class fields.
You might want to consider using a library like watchdog, which lets you trigger a handler whenever the file changes. Instead of collocating your parameters with the code, you could store them in a data file, with a data-loader method that is called on startup and whenever the underlying data file changes.
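A minimal sketch of that watchdog approach (the settings file name and the reload callback here are illustrative assumptions, not part of the original answer):

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class SettingsChangedHandler(FileSystemEventHandler):
    """Re-runs the loader whenever the watched settings file is modified."""

    def __init__(self, filename, reload_settings):
        self.filename = filename
        self.reload_settings = reload_settings

    def on_modified(self, event):
        if event.src_path.endswith(self.filename):
            self.reload_settings()

def watch_settings(filename='neural_mpc_settings.json', reload_settings=lambda: None):
    observer = Observer()
    observer.schedule(SettingsChangedHandler(filename, reload_settings), path='.', recursive=False)
    observer.start()  # watches in a background thread
    return observer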

Create activities for an user with Stream-Framework

I'm trying to set up stream-framework (the one here, not the newer getstream). I've set up the Redis server and the environment properly; the issue I'm facing is in creating activities for a user.
I've been trying to create activities following the documentation on adding an activity, but it gives me the following error:
...
File "/Users/.../stream_framework/activity.py", line 110, in serialization_id
if self.object_id >= 10 ** 10 or self.verb.id >= 10 ** 3:
AttributeError: 'int' object has no attribute 'id'
Here is the code
from stream_framework.activity import Activity
from stream_framework.feeds.redis import RedisFeed

class PinFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'

class UserPinFeed(PinFeed):
    key_format = 'feed:user:%(user_id)s'

feed = UserPinFeed(13)
print(feed)

activity = Activity(
    actor=13,  # Thierry's user id
    verb=1,    # The id associated with the Pin verb
    object=1,  # The id of the newly created Pin object
)

feed.add(activity)  # Error at this line
I think there is something missing in the documentation or maybe I'm doing something wrong. I'll be very grateful if anyone helps me get the stream framework working properly.
The documentation is inconsistent. The verb you pass to the activity should be (an instance of?*) a subclass of stream_framework.verbs.base.Verb. Check out this documentation page on custom verbs and the tests for this class.
The following should fix the error you posted:
from stream_framework.activity import Activity
from stream_framework.feeds.redis import RedisFeed
from stream_framework.verbs import register
from stream_framework.verbs.base import Verb

class PinFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'

class UserPinFeed(PinFeed):
    key_format = 'feed:user:%(user_id)s'

class Pin(Verb):
    id = 5
    infinitive = 'pin'
    past_tense = 'pinned'

register(Pin)

feed = UserPinFeed(13)

activity = Activity(
    actor=13,
    verb=Pin,
    object=1,
)

feed.add(activity)
I quickly looked over the code for Activity and it looks like passing ints for actor and object should work. However, it is possible that these parameters are also outdated in the documentation.
* The tests pass in classes as verb. However, the Verb base class has the methods serialize and __str__ that can only be meaningfully invoked if you have an object of this class. So I'm still unsure which is required here. It seems like in the current state, the framework never calls these methods, so classes still work, but I feel like the author originally intended to pass instances.
With the help of the great answer by @He3lixxx, I was able to solve it partially. Since the package is no longer maintained, it installs the latest Redis client for Python, which was creating too many issues; installing redis-2.10.5 when using stream-framework-1.3.7 should fix the issue.
I would also like to add a complete guide for properly adding an activity to a user feed.
Key points:
If you are not using a feed manager, make sure to first insert the activity with the feed.insert_activity(activity) method before you add it to the user's feed.
If getting feeds with feed[:] throws an error like the one below:
File "/Users/.../stream_framework/activity.py", line 44, in get_hydrated
activity = activities[int(self.serialization_id)]
KeyError: 16223026351730000000001005L
then you need to clear the data for that user, using the feed's key format. In my case the key is feed:user:13 for user 13, so delete it with DEL feed:user:13. If that doesn't fix the issue, you can run FLUSHALL, which deletes everything from Redis.
Sample code:
from stream_framework.activity import Activity
from stream_framework.feeds.redis import RedisFeed
from stream_framework.verbs import register
from stream_framework.verbs.base import Verb

class PinFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'

class UserPinFeed(PinFeed):
    key_format = 'feed:user:%(user_id)s'

class Pin(Verb):
    id = 5
    infinitive = 'pin'
    past_tense = 'pinned'

register(Pin)

feed = UserPinFeed(13)
print(feed[:])

activity = Activity(
    actor=13,
    verb=Pin,
    object=1)

feed.insert_activity(activity)
activity_id = feed.add(activity)
print(activity_id)
print(feed[:])

Unable to use same SQLite connection across multiple objects in Python

I'm working on a Python desktop app using wxPython and SQLite. The SQLite db is basically being used as a save file for my program, so I can save, back up, and reload the data being entered. I've created separate classes for parts of my UI to make it easier to manage from the "main" window. The problem I'm having is that each control needs to access the database, but the filename, and therefore the connection name, needs to be dynamic. I originally created a DBManager class that hardcoded a class variable with the connection string, which worked but didn't let me change the filename. For example:
class DBManager:
    conn = sqlite3.Connection('my_file.db')

# This could then be passed to other objects as needed
class Control1:
    file = DBManager()

class Control2:
    file = DBManager()

etc.
However, I'm running into a lot of problems trying to create this object with a dynamic filename while also using the same connection across all controls. Some examples of this I've tried...
class DBManager:
    conn = None

    def __init__(self):
        pass

    def __init__(self, filename):
        self.conn = sqlite3.Connection(filename)

class Control1:
    file = DBManager()

class Control2:
    file = DBManager()
The above doesn't work because Python doesn't allow overloading constructors, so I always have to pass a filename. I tried adding some code to the constructor to act differently based upon whether the filename passed was blank or not.
class DBManager:
    conn = None

    def __init__(self, filename):
        if filename != '':
            self.conn = sqlite3.Connection(filename)

class Control1:
    file = DBManager('')

class Control2:
    file = DBManager('')
This let me compile, but the controls only had an empty connection. The conn object was None. It seems like I can't change a class variable after it's been created? Or am I just doing something wrong?
I've thought about creating one instance of DBManager that I then pass into each control, but that would be a huge mess if I need to load a new DB after starting the program. Also, it's just not as elegant.
So, I'm looking for ideas on achieving the one-connection path with a dynamic filename. For what it's worth, this is entirely for personal use, so it doesn't really have to follow "good" coding convention.
Explanation of your last example
You get None in the last example because you are instantiating DBManager in Control1 and Control2 with empty strings as input, and the DBManager constructor has an if-statement saying that a connection should not be created if filename is an empty string. This means the self.conn instance variable is never set, so any referral to conn resolves to the conn class variable, which is indeed set to None.
self.conn would create an instance variable accessible only through that specific object.
DBManager.conn would refer to the class variable and this is what you want to update.
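A tiny illustration of that difference (the names and values here are only for demonstration):

class DBManager:
    conn = None  # class variable, shared by every instance

db = DBManager()
db.conn = 'instance-level connection'  # creates an instance attribute that shadows the class variable

print(DBManager.conn)  # None -- the class variable is unchanged
print(db.conn)         # 'instance-level connection'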
Example solution
If you only want to keep one connection, you would need to do it with e.g. a class variable, and update the class variable every time you interact with a new db.
import sqlite3
from sqlite3 import Connection

class DBManager:
    conn = None

    def __init__(self, filename):
        if filename != '':
            self.filename = filename

    def load(self) -> Connection:
        DBManager.conn = sqlite3.Connection(self.filename)  # updating class variable with new connection
        print(DBManager.conn, f" used for {self.filename}")
        return DBManager.conn

class Control1:
    db_manager = DBManager('control1.db')
    conn = db_manager.load()

class Control2:
    db_manager = DBManager('control2.db')
    conn = db_manager.load()

if __name__ == "__main__":
    control1 = Control1()
    control2 = Control2()
would output the below. Note that the class variable conn refers to different memory addresses upon instantiating each control, showing that it's updated.
<sqlite3.Connection object at 0x10dc1e1f0> used for control1.db
<sqlite3.Connection object at 0x10dc1e2d0> used for control2.db

How to perform a task after flask app starts?

I'm trying to execute a task after the app does the binding with the port, so it doesn't get killed by Heroku for taking too long on startup. I am aware of the existence of before_first_request; however, I would like this action to be performed as soon as possible after app startup, without requiring a request.
I am loading an object as an attribute of the app object (because I need to access it across requests) and this object has to initialize in a weird way (it checks if a file exists and downloads it if it doesn't and afterwards it performs a bunch of computations).
Currently I'm doing this in the following way:
def create_app() -> Flask:
    ...
    with app.app_context():
        app.model = RecommenderModel()  # This downloads a pretty heavy file if it isn't there
        app.model.load_products()       # This performs a bunch of calculations
    ...
    return app
This initializes the app properly (as tested locally); however, Heroku kills it (Error R10) because it takes too long.
Is there a way to do this asynchronously? When I tried to do so the app context got lost.
Edit: Additional information regarding what I'm doing:
The RecommenderModel object models the logic of a recommendation system. As of now, the recommendations are based on vector cosine similarity. Those vectors are extracted using pre-trained word2vec embeddings (which is the large file that needs to be downloaded). The conversion from products to vectors is handled by a Preprocessor class.
The Recommender Model initialization looks like this:
class RecommenderModel(object):
    def __init__(self) -> None:
        self.preproc = Preprocessor()
        self.product_vector: dict = {}

    def load_products(self) -> None:
        for product in Product.get_all():
            self.product_vector[product.id] = self.preproc.compute_vector(product)
The Preprocessor initialization looks like this:
class Preprocessor(object):
    def __init__(self, embeddings: str = embeddings) -> None:
        S3.ensure_file(embeddings)
        self.vectors = KeyedVectors.load_word2vec_format(embeddings)
The S3.ensure_file method basically checks if the file exists and downloads it if it doesn't:
class S3(object):
    client = boto3.client('s3')

    @classmethod
    def ensure_file(cls, filepath: str) -> None:
        if os.path.exists(filepath):
            return
        dirname, filename = os.path.split(filepath)
        bucket_name = os.environ.get('BUCKET_NAME')
        cls.client.download_file(bucket_name, filename, filepath)
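Coming back to the startup question, one possible direction is to bind first and run the heavy initialization in a background thread that pushes the app context itself. A minimal sketch, assuming the rest of the app can tolerate app.model being unset until the thread finishes (RecommenderModel is the class from the question):

import threading
from flask import Flask

def create_app() -> Flask:
    app = Flask(__name__)

    def init_model():
        # The app context is pushed inside the worker thread, so it is not
        # lost the way it is when create_app's own context is reused.
        with app.app_context():
            app.model = RecommenderModel()  # heavy download happens here
            app.model.load_products()       # heavy computation happens here

    # The port is bound as soon as create_app returns, so Heroku's R10
    # boot timeout is not hit while the model loads in the background.
    threading.Thread(target=init_model, daemon=True).start()
    return app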

Python error: Nonetype object not subscriptable, when loading JSON into variables

I have a program where I am reading in a JSON file and executing some SQL based on parameters specified in the file. The load_json_file() method loads the JSON file into a Python object first (not seen here, but it works correctly).
The issue is with the piece of the code here:
class TestAutomation:
    def __init__(self):
        self.load_json_file()

    # connect to Teradata and load session to be used for execution
    def connection(self):
        con = self.load_json_file()
        cfg_dsn = con['config']['dsn']
        cfg_usr = con['config']['username']
        cfg_pwd = con['config']['password']
        udaExec = teradata.UdaExec(appName="DataAnalysis", version="1.0", logConsole=False)
        session = udaExec.connect(method="odbc", dsn=cfg_dsn, username=cfg_usr, password=cfg_pwd)
        return session
The __init__ method first loads the JSON file, and then I store the result in 'con'. However, I am getting an error that reads:
cfg_dsn = con['config']['dsn']
E TypeError: 'NoneType' object is not subscriptable
The JSON file looks like this:
{
    "config": {
        "src": "C:/Dev\\path",
        "dsn": "XYZ",
        "sheet_name": "test",
        "out_file_prefix": "C:/Dev\\test\\OutputFile_",
        "password": "pw123",
        "username": "user123",
        "start_table": "11",
        "end_table": "26",
        "skip_table": "1,13,17",
        "spot_check_table": "77"
    }
}
the load_json_file() is defined like this:
def load_json_file(self):
    if os.path.isfile(os.path.dirname(os.path.realpath(sys.argv[0])) + '\dwconfig.json'):
        with open('dwconfig.json') as json_data_file:
            cfg_data = json.load(json_data_file)
            return cfg_data
Any ideas why I am seeing the error?
The problem is that you're checking whether the configuration file exists, then reading it.
If it doesn't exist, your function returns None. This is wrong in many ways, because os.path.realpath(sys.argv[0]) can return an incorrect value, for instance if the command is run with just the base name and found through the system path ($0 returns the full path in bash, but not in Python or C).
That's not how you get the directory of the current command.
(Plus, afterwards you do open('dwconfig.json'), which is just the file name without the full path, so it is wrong again.)
I would skip this test, but compute the config file path properly. And if it doesn't exist, let the program crash instead of returning None that will crash later.
def load_json_file(self):
    with open(os.path.join(os.path.dirname(__file__), 'dwconfig.json')) as json_data_file:
        cfg_data = json.load(json_data_file)
    return cfg_data
So... cfg_dsn = con['config']['dsn']
something in there is set to None
you could be safe and write it like
(con or {}).get('config',{}).get('dsn')
or make your data correct.
