Store reference to non-NDB object in an NDB model

As a caveat: I am an utter novice here. I wouldn't be surprised to learn a) this is already answered, but I can't find it because I lack the vocabulary to describe my problem or b) my question is basically silly to begin with, because what I want to do is silly.
Is there some way to store a reference to a class instance that is defined and held in active memory rather than stored in NDB? I'm trying to write an app that would help manage a number of characters/guilds in an MMO. I have a class, CharacterClass, that includes properties such as armor, name, etc. I define it in main.py as a plain Python object, and then define the properties for each of the classes in the game. Each Character, which would be stored in Datastore, would have a property charClass, which would be a reference to one of those instances of CharacterClass. In theory I would be able to do things like
if character.charClass.armor == "Cloth":
while storing the potentially hundreds of unique characters and their specific data in Datastore, but without creating a copy of "Cloth" for every cloth-armor character, or querying Datastore for what kind of armor a mage wears thousands of times a day.
I don't know what kind of NDB property to use in Character to store the reference to the applicable CharacterClass. Or if that's the right way to do it, even. Thanks for taking the time to puzzle through my confused question.

A string is all you need. Store the name of the character class as a string, then fetch the in-memory instance based on that string value. You could also create a custom property that automatically resolves the instance on reference.
However, I have a feeling that hard-coding the values in code might be a bit unwieldy. Maybe your character class instances should be Datastore entities as well. That would mean you can adjust these parameters without deploying new code.
If you want these objects in memory, you can pre-cache them on warmup.
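A minimal sketch of the string-lookup idea, written as plain Python so it runs without App Engine. The registry name CHARACTER_CLASSES, the field char_class_name, and the sample classes are assumptions for illustration; in a real NDB model the stored field would presumably be an ndb.StringProperty.

```python
class CharacterClass(object):
    """Plain in-memory definition of a game class (never stored in Datastore)."""

    def __init__(self, name, armor):
        self.name = name
        self.armor = armor


# Hypothetical registry: one shared instance per class, keyed by the stored string.
CHARACTER_CLASSES = {
    "mage": CharacterClass("Mage", "Cloth"),
    "warrior": CharacterClass("Warrior", "Plate"),
}


class Character(object):
    """Stand-in for the NDB model; only char_class_name would be persisted."""

    def __init__(self, name, char_class_name):
        self.name = name
        self.char_class_name = char_class_name

    @property
    def charClass(self):
        # Resolve the stored string to the shared in-memory instance.
        return CHARACTER_CLASSES[self.char_class_name]


gandalf = Character("Gandalf", "mage")
is_cloth = gandalf.charClass.armor == "Cloth"
```

Every mage resolves to the same CharacterClass instance, so "Cloth" exists once in memory and no Datastore query is needed to read it.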

Related

How to structure a project into Object Oriented framework?

I have a project and would like ideas/tips on how I could tackle it. This is the project
Each component in a car has a number. Example:
"Hitch" has num: "43". I want to search and return
every car model in the database (CSV file) that has the number "43".
In the future, I would also like to be able to see information about
the car model. Example: "manufacturer", "HP", then info about the
"manufacturer" etc etc.
I have never done anything like this before, but I have researched and found that maybe OOP is the way to go? In that case, how could one structure it?

Since your main question is how do you break your requirement into an Object Oriented framework, I would structure it as follows:
You have objects called Parts and an object called Car.
Parts will have attributes: PartName(string), PartId(integer), PartManufacturer(string).
Car will have attributes: CarName(string), CarId(int), CarManufacturer(string).
A third object PartsInCar will track the relation between Car and Parts.
This object will have attributes CarPartId (int), CarId(Car), PartId(part).
Alternatively PartsInCar can also be an attribute of Car as a "vector of class Parts".
Depending on how granular you want to maintain the data, there is a possibility to create another class Manufacturer having ManufacturerId(int), ManufacturerName(string).
Now the question is: how do you load your CSV database into this structure?
This depends on what your input data looks like.
If you are doing the whole thing in memory, then you could use a vector (list) or dictionary to store the whole "list" of parts, cars and parts_in_car.
For each class defined above you will of course need member functions that allow easy operations.
E.g.:
Cars object can have a member function that will return all Parts objects associated with it.
Parts object can have a member function that searches all Cars and shortlists those cars that are using this part.
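The structure above could be sketched roughly as follows. The CSV column names (car, manufacturer, part_name, part_id) and the sample rows are assumptions, since the question does not show the real file layout; here the parts list lives on Car rather than in a separate PartsInCar object.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Part:
    part_id: int
    part_name: str


@dataclass
class Car:
    car_name: str
    manufacturer: str
    parts: List[Part] = field(default_factory=list)

    def has_part(self, part_id: int) -> bool:
        return any(p.part_id == part_id for p in self.parts)


def load_cars(csv_file) -> List[Car]:
    """Build Car objects from rows of (car, manufacturer, part_name, part_id)."""
    cars: Dict[str, Car] = {}
    for row in csv.DictReader(csv_file):
        car = cars.setdefault(row["car"], Car(row["car"], row["manufacturer"]))
        car.parts.append(Part(int(row["part_id"]), row["part_name"]))
    return list(cars.values())


def cars_using_part(cars: List[Car], part_id: int) -> List[str]:
    """Return the names of all cars that contain the given part number."""
    return [c.car_name for c in cars if c.has_part(part_id)]


# Hypothetical CSV content; the real file's columns are not given in the question.
SAMPLE = """car,manufacturer,part_name,part_id
Model A,Acme,Hitch,43
Model A,Acme,Bumper,12
Model B,Bolt,Hitch,43
Model C,Acme,Bumper,12
"""

cars = load_cars(io.StringIO(SAMPLE))
matches = cars_using_part(cars, 43)
```

Adding manufacturer details later would mean replacing the manufacturer string with a Manufacturer object, without touching the search code.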

You could use a dictionary with the key being the car name and the value being a list of information about each car. You can assign certain aspects of the car to each index of the list to keep track of the data. You could use OOP for this, but it's a bit unnecessary if I understood your question correctly.

What to do with runtime generated data spanning several classes?

I'm a self-taught programmer and a lot of the problems I encounter come from a lack of formal education (and often also experience).
My question is the following: how do you decide where to store the data a class or function creates? I'll give a simple example:
Case: I have a webshop (SHOP) with a REST api and a product provider (PROVIDER) also with a REST API. I determine the product, I send that data to PROVIDER who sends me back formatted data that can be read by SHOP to make a working product on the webshop. PROVIDER also has a secondary REST api that provides generated images.
What I would come up with:
I'd make three classes: ProductBase, Shop and Provider
ProductBase would be the class from where I instantiate and store the individual product information.
Shop would be where I design the api interactions with the webshop.
Provider same as shop, but for interactions with provider api.
My problem: At some point you're creating data that's not clearly separated by concern. For example: would I store the generated product data (from PROVIDER) in the ProductBase instance I created? It feels like I'm coupling the two classes this way. But if not there, then where?
What if I create product images with PROVIDER and I upload them to SHOP? Do I store the uploaded image-url in PRODUCT? How do you keep track of all this info?
The question I want answered:
I've read a lot on OOP and design patterns, and I have adopted a TDD approach, which has greatly helped to improve my code, but I haven't found anything on how to handle the flow of data generated at runtime.
What would be a good way to solve above problem(s) and could you explain your rationale for it?

If I understand correctly, your current concern is that you have "raw" product data, which you want to store in objects, and you have "processed" (formatted) product data, which you also want to store in objects. Your question is whether you should mix them.
Let me first point out the other obvious option, namely having two product classes: RawProduct and ProcessedProduct. Which to do?
(Edit: also, to be sure, product data should not be stored in the provider. The provider performs the action of formatting, but the data is product data, not provider data.)
It depends. There are a couple of considerations:
1) In general, in OOP, the idea is to couple actions on data with the data. So if possible, you have some method in ProductBase like "format()", where format will send the object off to the API to get formatted, and store the result in an instance variable. You can then also have a method like "find_image", that goes and fetches the image url from the API and then stores that in a field. An object's data is meant to be dynamic. It is meant to be altered by object methods.
2) If you need version control (if you want the full history of the object's state to be available), then you can't overwrite fields with new data. So either you need to store a history of every object field in the object, or you need to create new objects.
3) Is RAM a concern? I sometimes create dataclasses that store only the final part of an object's life so that I can fit more of the objects into memory.
Personally I often find myself creating "RawObject" and "ProcessedObject" classes; it's just easier a lot of the time. But that's probably because I mostly work with document processing, where the split is very clear. Usually you'll just update the object's data.
A benefit of having one object with the full history is that it is much easier to debug. Because the raw data and the API result are in the same object. So you can very easily probe what went wrong. If you start splitting things up it's harder to track. In general, the more information an object has about where it's been, the easier it is to figure out what went wrong with it.
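As a rough sketch of the single-object approach, the product below keeps its raw data, the formatted result, and a small history in one place. The class name Product, the fake_provider_format function, and the example URL are hypothetical stand-ins; a real version would call the PROVIDER REST API instead.

```python
from typing import Callable, Dict, List, Optional


class Product:
    """Holds raw product data plus whatever later steps derive from it."""

    def __init__(self, raw: Dict[str, str]):
        self.raw = raw
        self.formatted: Optional[Dict[str, str]] = None
        self.image_url: Optional[str] = None
        self.history: List[str] = []  # simple audit trail for debugging

    def format(self, formatter: Callable[[Dict[str, str]], Dict[str, str]]) -> None:
        # The provider does the work, but the result is stored as product data.
        self.formatted = formatter(self.raw)
        self.history.append("formatted")

    def attach_image(self, url: str) -> None:
        self.image_url = url
        self.history.append("image attached")


def fake_provider_format(raw: Dict[str, str]) -> Dict[str, str]:
    # Hypothetical stand-in for the PROVIDER REST call.
    return {"title": raw["name"].title(), "sku": raw["sku"].upper()}


product = Product({"name": "blue mug", "sku": "mug-01"})
product.format(fake_provider_format)
product.attach_image("https://shop.example/images/mug-01.png")
```

Passing the formatter in as an argument keeps Product from importing the provider code directly, which limits the coupling the question worries about.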
Remember also, since this is a Python question, that Python is multi-paradigm. If you're writing pipeline-style architectures (synchronous, linear processes), then a functional approach can also work well.
Once your data is stored in a product object, anything can hold a reference to that. So a shop can reference an object and a product can reference the object. Be clear on the difference between "has-a" relationships and "is-a" relationships.

SQLAlchemy: Knowing the field names and values of a model object?

I'm trying to keep to SOLID object oriented programming principles, stay DRY, etc, but my newness to Python/SQLAlchemy/Pyramid is making it very hard.
I'm trying to take what I now know to be a SQLAlchemy model used to create a simple Pyramid Framework object and use what I know to be "reflection" in C#, it may be called something different in Python (Introspection? Not sure as this is only my second week with python but I have lots of experience in other languages (C/C++/C#,Java, etc) so the trouble seems to be mapping my knowledge to the vocabulary of python, sorry), to find out the field names of the database table, and most importantly, the current field values, when I do not know the column names or ANY of the shape of the object in advance.
That's right; I don't know that the 'derp' instance has a field named id or name, just that it has columns and a value in each of them. And that's all I care about.
The goal is to be able to take any SQLAlchemy-defined data model and convert it to a dictionary of column_name -> column_value fields of simple data types of the kind found in JSON. I ultimately want to serialize any object I create in SQLAlchemy to a JSON object, but I will settle for a dictionary, as from there it's trivial as long as the dictionary holds the correct types of data. Doing this for every object by hand is a violation of too many good clean code rules and will create too much work over time; I could spend another week on this and still save time and effort by doing it the right way.
So if I have a class defined in SQLAlchemy as:
class SimpleFooModel(Base):
    id = Column(Integer, primary_key=True, autoincrement=True, nullable=False)
    name = Column(VARCHAR(length=12), nullable=False, index=True)
.. and I have an instance of this equal to (in python):
derp = SimpleFooModel(id=7, name="Foobar")
I want to be able to take ONLY the 'derp' instance variable described above, with NO OTHER KNOWLEDGE of how the model is shaped, and flatten it out to a Python key->value dictionary for that simple object, where every value in that dictionary can be serialized to JSON using import json from the Python standard library.
The problem is, I have been up for 2 days looking at this and I can't find an answer that gives me the results I want in my unit tests ANYWHERE; Google keeps taking me to really old posts here on SO about really old versions of the library that either use interfaces that no longer apply, or have accepted answers that do not actually work at all; and since none of them are recent, that doesn't surprise me (but why Stack Overflow keeps them when they are wrong and allows Google to mislead people does surprise me).
I know I could wire every object manually to JSON, etc., but that's not only NOT ELEGANT, it's inefficient, because it just creates more work for me as I create more objects and could lead to big bugs down the road. I want to know how to do this the correct way, with introspection/reflection, but nobody seems to know, and the people who claim to know have all given examples here on Stack Overflow that actually do not work at all (at least with the current versions of things).
This seems like a really common use case to me; and getting the column field list and then iterating through it with getattr - like many of the answers say to do - doesn't work as expected either; it just creates what look like namespaces that never return the actual value of the column, and don't actually exist in any code, as none of the fields created by SQLAlchemy are singleton/static.
So:
from sqlalchemy.inspection import inspect
obj = inspect(derp, raiseerr=True)
for key in obj.attrs.keys():
    fields[key] = getattr(derp, key)
    print fields[key]
Just gives me:
[Class Name].[Column Name]
.. or in this case just gives me:
SimpleFooModel.id
SimpleFooModel.name
NOT the values of 7 and "Foobar" for id and name respectively, that I actually expected in my tests.
In fact it seems like I can't even find WHERE the values are being stored in the object model; otherwise I could brute-force the issue and get them from there as an ugly, evil hack I would be ashamed to look at. All I get through the "official public api" is a lot of objects that seem to have no clue where the real data is being stored, but will happily tell me the name of the path used by the column name and type, restrictions, etc... just not the actual data that I want.
Yet since my requirement is that I do not know the field names in advance, using a call to derp.id or derp.name to collect the value is not an option, since that would violate SOLID and force me to duplicate work for every single class.
Maybe it's the fact I have not slept in 2 days, but it's really hard for me not to see this as a serious design flaw in these libs; I just want to serialize a SQLAlchemy-defined model object representing a single row in a table into a Python dictionary without having to know the names of the fields in advance, and while many other languages make this easy or even trivial, this seems to be far harder than it should be.
Can somebody please explain either a working solution or why I am wrong to want to apply SOLID to my code?
EDIT: Updated spelling.

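Extend your model with the following class:
import sqlalchemy as sa
import sqlalchemy.orm  # makes sure sa.orm is loaded
class BaseModel(object):
    @classmethod
    def _get_keys(cls):
        # Column keys come from the class mapper, so no field names are hard-coded.
        return sa.orm.class_mapper(cls).c.keys()
    def get_dict(self):
        # Read each mapped column's current value off the instance.
        d = {}
        for k in self._get_keys():
            d[k] = getattr(self, k)
        return d
This will do exactly what you want: return a dict of {'column_name': value} pairs.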
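One caveat for the JSON step the question asks about: columns such as dates or decimals are not JSON-serializable by default. A small sketch of handling that with the standard library follows; the row dict and its created and price columns are made up for illustration and only mimic the shape of what get_dict() returns.

```python
import datetime
import decimal
import json


def json_default(value):
    """Convert common non-JSON column types; anything else is an error."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    if isinstance(value, decimal.Decimal):
        return str(value)
    raise TypeError("Cannot serialize %r" % (value,))


# Hypothetical row dict, shaped like the output of get_dict().
row = {
    "id": 7,
    "name": "Foobar",
    "created": datetime.date(2015, 3, 1),
    "price": decimal.Decimal("9.99"),
}

payload = json.dumps(row, default=json_default, sort_keys=True)
```

json.dumps only calls the default hook for values it cannot encode itself, so plain ints and strings pass through untouched.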

db.ReferenceProperty() vs ndb.KeyProperty in App Engine

ReferenceProperty was very helpful in handling references between two models. For example:
class UserProf(db.Model):
    name = db.StringProperty(required=True)
class Team(db.Model):
    manager_name = db.ReferenceProperty(UserProf, collection_name='teams')
    name = db.StringProperty(required=True)
To get 'manager_name' from a team instance, we use team_ins.manager_name.
To get the 'teams' managed by a particular user instance, we use user_instance.teams and iterate over it.
Doesn't it look easy and understandable?
To do the same thing using NDB, we have to change
db.ReferenceProperty(UserProf, collection_name='teams') --> ndb.KeyProperty(kind=UserProf)
team_ins.manager_name.get() would give you the manager.
To get all teams managed by a particular user, we have to do
for team in Team.query(Team.manager_name == user_ins.key):
    print "team name:", team.name
As you can see, handling these kinds of scenarios looks easier and more readable in db than in ndb.
What is the reason for removing ReferenceProperty in ndb?
Even db's user_instance.teams would be doing the same thing as ndb's for loop does; in ndb we just spell out the loop explicitly.
What is happening behind the scenes when we do user_instance.teams?
Thanks in advance..

Tim explained it well. We found that a common anti-pattern was using reference properties and loading them one at a time, because the notation "entity.property1.property2" doesn't make it clear that the first dot causes a database "get" operation. So we made it more obvious by forcing you to write "entity.property1.get().property2", and we made it easier to do batch prefetching (without the complex solution from Nick's blog) by simply saying "entity.property1.get_async()" for a bunch of entities -- this queues a single batch get operation without blocking for the result, and when you next reference any of these properties using "entity.property1.get().property2" this won't start another get operation but just waits for that batch get to complete (and the second time you do this, the batch get is already complete). Also this way in-process and memcache integration comes for free.
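To illustrate the anti-pattern described above without needing App Engine, here is a toy sketch. FakeDatastore is an invented in-memory stand-in that only counts round trips; it is not the NDB API, and the data is made up. In real NDB code the batched call would presumably be ndb.get_multi(keys) or the get_async() approach described in the answer.

```python
class FakeDatastore:
    """Toy in-memory store that counts round trips. Not the NDB API."""

    def __init__(self, rows):
        self.rows = rows
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.rows[key]

    def get_multi(self, keys):
        self.round_trips += 1  # one batched call, however many keys
        return [self.rows[k] for k in keys]


managers = {"u1": "Alice", "u2": "Bob"}
teams = [("Red", "u1"), ("Blue", "u2"), ("Green", "u1")]

# Anti-pattern: one get per team, which db hid behind attribute access.
one_by_one = FakeDatastore(managers)
names_slow = [one_by_one.get(manager_key) for _, manager_key in teams]

# Batched: collect the distinct keys first, then fetch them in one call.
batched = FakeDatastore(managers)
keys = sorted({manager_key for _, manager_key in teams})
by_key = dict(zip(keys, batched.get_multi(keys)))
names_fast = [by_key[manager_key] for _, manager_key in teams]
```

Both approaches produce the same names, but the first costs one round trip per team while the second costs one in total.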

I don't know the answer as to why Guido didn't implement reference property.
However, I found I spent a lot of time using pre_fetch_refprops (http://blog.notdot.net/2010/01/ReferenceProperty-prefetching-in-App-Engine), which prefetches all of the reference properties by grabbing all the keys with get_value_for_datastore and then doing a get_multi on the keys.
This was vastly more efficient.
Also, if the referenced object didn't exist, you would get an error when trying to fetch it.
If you pickled an object that had references, you ended up pickling a lot more than you probably planned to.
So I found that, except for the one case where you have a single entity and want to grab the referenced object with a .name-type accessor, you had to jump through all sorts of hoops to prevent the referenced entity from being fetched.

Python sqlite3, saving an instance of a class with another instance as its attribute?

I'm creating a game mod for Counter-Strike in Python, and it's basically all done. The only thing left is to code a REAL database, and I don't have any experience with sqlite, so I need quite a lot of help.
I have a Player class with the attribute self.steamid, which is unique for every Counter-Strike player (received from the game engine), and self.entity, which holds an "Entity" for the player. The Entity class has lots and lots more attributes, such as level and name, and loads of methods. (Entity is a self-made Python class.)
What would be the best way to implement a database? First of all, how can I save instances of Player with another instance of Entity as its attribute into a database, powerfully?
Also, I will need to get that user's data every time he connects to the game server (I have a player_connect event), so how would I get the data back?
All the tutorials I found only taught about saving strings or integers, but nothing about whole instances. Will I have to save every attribute on all instances (the Entity instance has a few more instances as its attributes, and all of them have huge numbers of attributes...), or is there a faster, easier way?
Also, it's going to be a locally saved database, so I can't really use any other languages than sql.

You need an ORM. Either you roll your own (which I never suggest), or you use one that exists already. Probably the two most popular in Python are sqlalchemy, and the ORM bundled with Django.

SQL databases typically can hold only fundamental datatypes. You can use SQLAlchemy if you want to map your models so that their attributes are automatically mapped to SQL types - but it would require a lot of study and trial and error using SQLite on your part.
I think you are not entirely correct when you say "it has to be SQL" - if you are running Python code, you can save in whatever format you like.
However, Python allows you to serialize your instance data to a string - which is persistable in a database.
So, you can create a varchar(65535) field in the SQL, along with an ID field (which could be the player ID number you mentioned, for example), and persist to it the value returned by:
import pickle
value = pickle.dumps(my_instance)
When retrieving the value you do the reverse:
my_instance = pickle.loads(value)
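Putting the two halves together, a minimal sketch with the standard sqlite3 module might look like this. The table name players, the column names, and the simplified Player and Entity classes are assumptions for illustration. Note that on Python 3 pickle.dumps returns bytes, so this sketch uses a BLOB column rather than varchar.

```python
import pickle
import sqlite3


class Entity:
    def __init__(self, name, level):
        self.name = name
        self.level = level


class Player:
    def __init__(self, steamid, entity):
        self.steamid = steamid
        self.entity = entity


conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute("CREATE TABLE players (steamid TEXT PRIMARY KEY, data BLOB)")


def save_player(player):
    # The whole object graph (Player plus its Entity) is pickled in one go.
    conn.execute(
        "INSERT OR REPLACE INTO players (steamid, data) VALUES (?, ?)",
        (player.steamid, pickle.dumps(player)),
    )
    conn.commit()


def load_player(steamid):
    row = conn.execute(
        "SELECT data FROM players WHERE steamid = ?", (steamid,)
    ).fetchone()
    # Only unpickle data you wrote yourself; pickle can execute code on load.
    return pickle.loads(row[0]) if row else None


save_player(Player("STEAM_0:1:12345", Entity("Rambo", 12)))
restored = load_player("STEAM_0:1:12345")
```

A player_connect handler could call load_player with the connecting steamid and create a fresh Player when it returns None. The trade-off is that pickled columns cannot be queried by attribute in SQL.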
