I'm trying to keep to SOLID object oriented programming principles, stay DRY, etc, but my newness to Python/SQLAlchemy/Pyramid is making it very hard.
I'm trying to take a SQLAlchemy model (used in a simple Pyramid application) and use what I know as "reflection" in C# to find the column names of the database table and, most importantly, the current column values, when I do not know the column names or ANY of the shape of the object in advance. It may be called something different in Python (introspection?). This is only my second week with Python, though I have lots of experience in other languages (C/C++/C#, Java, etc.), so the trouble seems to be mapping my knowledge to the vocabulary of Python, sorry.
That's right: I don't know that the 'derp' instance has a field named id or name, just that it has columns and a value in each of them. And that's all I care about.
The goal is to take any SQLAlchemy-defined data model and convert it to a dictionary of column_name -> column_value pairs, using only the simple data types found in JSON. Ultimately I want to serialize any object I create in SQLAlchemy to a JSON object, but I will settle for a dictionary, because from there it's trivial as long as the dictionary holds the correct types of data. Doing this for every object by hand violates too many clean code rules and will create too much work over time; I could spend another week on this and still save time and effort by doing it the right way.
So if I have a class defined in SQLAlchemy as:
class SimpleFooModel(Base):
    __tablename__ = "simple_foo"  # assumed name; a declarative model needs one
    id = Column(Integer, primary_key=True, autoincrement=True, nullable=False)
    name = Column(VARCHAR(length=12), nullable=False, index=True)
.. and I have an instance of this equal to (in python):
derp = SimpleFooModel(id=7, name="Foobar")
I want to be able to take ONLY the 'derp' instance described above, with NO OTHER KNOWLEDGE of how the model is shaped, and flatten it out to a Python key->value dictionary for that simple object, where every value in that dictionary can be serialized to JSON using the json module from the Python standard library.
The problem is, I have been up for 2 days looking at this and I can't find an answer ANYWHERE that gives me the results I want in my unit tests. Google keeps taking me to really old posts here on SO about really old versions of the library that either use interfaces that no longer apply, or have accepted answers that do not actually work at all. Since none of them are recent, that does not surprise me (but why Stack Overflow keeps them when they are wrong and allows Google to mislead people does surprise me).
I know I could wire every object manually to JSON, etc., but that's not only NOT ELEGANT, it's inefficient, because it creates more work for me as I create more objects and could lead to big bugs down the road. I want to know how to do this the correct way, with introspection/reflection, but nobody seems to know, and the people who claim to know have all given examples here on Stack Overflow that do not work at all (at least with the current versions of things).
This seems like a really common use case to me, and getting the column list and then iterating through it with getattr, like many of the answers say to do, doesn't work as expected either. It just returns what look like namespaces that never give the actual value of the column, and that don't exist anywhere in my code, since none of the fields created by SQLAlchemy are singleton/static.
So:
from sqlalchemy.inspection import inspect
obj = inspect(derp, raiseerr=True)
fields = {}
for key in obj.attrs.keys():
    fields[key] = getattr(derp, key)
    print(fields[key])
Just gives me:
[Class Name].[Column Name]
.. or in this case just gives me:
SimpleFooModel.id
SimpleFooModel.name
NOT the values of 7 and "Foobar" for id and name respectively, that I actually expected in my tests.
In fact it seems like I can't even find WHERE the values are being stored in the object model; otherwise I could brute force the issue and get them from there, as an ugly, evil hack I would be ashamed to look at. All I get through the "official public API" is a lot of objects that seem to have no clue where the real data is being stored, but will happily tell me the column name and type, restrictions, etc... just not the actual data that I want.
Since my requirement is that I do not know the field names in advance, calling derp.id or derp.name to collect the value is not an option: that would violate SOLID and force me to duplicate work for every single class.
Maybe it's the fact that I have not slept in 2 days, but it's really hard for me not to see this as a serious design flaw in these libs. I just want to serialize a SQLAlchemy-defined model object representing a single row in a table into a Python dictionary without having to know the names of the fields in advance, and while many other languages make this easy or even trivial, this seems to be far harder than it should be.
Can somebody please explain either a working solution or why I am wrong to want to apply SOLID to my code?
EDIT: Updated spelling.
Extend your model with the following class:
from sqlalchemy.orm import class_mapper

class BaseModel(object):
    @classmethod
    def _get_keys(cls):
        # names of the mapped column attributes of the class
        return class_mapper(cls).c.keys()

    def get_dict(self):
        d = {}
        for k in self._get_keys():
            d[k] = getattr(self, k)
        return d
This will do exactly what you want: it returns a dict in the form of {'column_name': value} pairs.
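For anyone who wants to try this end to end, here is a minimal sketch of the same idea written against the runtime inspection API. It assumes SQLAlchemy 1.4 or newer (for `sqlalchemy.orm.declarative_base`); the table name `simple_foo` and the helper name `to_dict` are made up for illustration and are not part of the original answer.

```python
import json

from sqlalchemy import Column, Integer, String, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class SimpleFooModel(Base):
    __tablename__ = "simple_foo"  # hypothetical table name

    id = Column(Integer, primary_key=True)
    name = Column(String(12), nullable=False)


def to_dict(instance):
    """Map every column attribute of a mapped instance to its current value."""
    mapper = inspect(instance).mapper
    return {attr.key: getattr(instance, attr.key) for attr in mapper.column_attrs}


derp = SimpleFooModel(id=7, name="Foobar")
row = to_dict(derp)
print(json.dumps(row, sort_keys=True))
```

Note that `getattr` is called on the instance. Output such as `SimpleFooModel.id` is what the class-level attribute prints, so if you see that, it may be worth checking that the instance and not the class is being passed in. Column types that JSON does not know (dates, decimals) would still need a `default=` function in `json.dumps`.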
I've noticed that the methods I've looked at for serializing a variable into JSON in Python don't handle this well. For my purpose I just want to quickly dump an object's contents into a string format so I can pick out what I actually want to write custom code to handle. At the very least I want to be able to dump the main fields of any class I pass to the Python serializer, and if it's worth the name, this should work.
So take the following code:
import json
c = SomeClass()
#causes an error if any field in someclass has another class instance.
json.dumps(c)
leads to..
TypeError: Object of type {Type} is not JSON serializable
Are there any modules other people have used that would solve my problem? I really don't see how there would not be. Or maybe someone could explain how to get around this error?
The goal is simply to get some output to look at. If I wrote a recursive loop in C# using reflection, it wouldn't be difficult (circular references excepted), so I cannot imagine Python users have never tackled this exact issue. I'm not satisfied with the answers I have seen in older posts, which seem to suggest a lot of custom tinkering for something that seems designed, in spirit, to just dump any old object's contents out.
I don't even need complex traversal is the funny part, though it would be nice. I just need a dump of the property values which are primitive types in many cases. I know this is possible because the debugger does it.
Additionally, I looked at one of the suggested methods, which passes a default lambda to tell the JSON serializer how to descend into the object:
json.dumps(o, default=lambda k: k.__dict__)
but the object does not have the standard __dict__ attribute.
In the end I just wrote a class to do this.
edit:
Here, use this: you can now one-way serialize a class structure with the nifty little bit of code that I added to address my problem with discord.py!
end edit
There is no fire-and-forget option that would disentangle a mass of information.
A solution would have to manage separate lists of the nested class instances it has visited, to make sure it does not recurse until a stack overflow is reached.
__slots__ can be used with getattr(o, name) when hasattr(o, '__dict__') is False.
But the answer is that you'd have to create a solution that basically does the job the JSON serializer should be doing: cut out circular references by determining the unique complex types, writing them as separate tabular entries in the JSON file, and replacing them in the referencing classes with ids.
That way you could cross reference these objects while glancing at them.
However, the short answer is no. Python does not offer an out-of-the-box way of doing this, and all the answers I have encountered so far solve only a single use case or scenario. None of them is an integrated solution to the problem, which the algorithm above WOULD be, by NORMALIZING the class data into unique elements.
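To make the `__dict__` / `__slots__` point concrete, here is a rough sketch of a one-way dumper. It is a simpler variant than the normalizing algorithm described above: instead of moving shared objects into id-keyed tables, it only replaces a circular reference with a marker string. The function name `to_jsonable` and the sample classes are invented for the example.

```python
import json


def to_jsonable(obj, _seen=None):
    """Convert obj to JSON-friendly data; objects already on the current path become markers."""
    _seen = set() if _seen is None else _seen
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if id(obj) in _seen:
        return "<cycle: %s>" % type(obj).__name__
    _seen.add(id(obj))
    try:
        if isinstance(obj, dict):
            return {str(k): to_jsonable(v, _seen) for k, v in obj.items()}
        if isinstance(obj, (list, tuple, set, frozenset)):
            return [to_jsonable(v, _seen) for v in obj]
        # Collect attribute names from __dict__ and from every __slots__ in the MRO.
        names = list(getattr(obj, "__dict__", {}))
        for klass in type(obj).__mro__:
            slots = klass.__dict__.get("__slots__", ())
            if isinstance(slots, str):
                slots = (slots,)
            names.extend(s for s in slots if s not in ("__dict__", "__weakref__"))
        if not names:
            return repr(obj)  # nothing to descend into, fall back to repr
        return {n: to_jsonable(getattr(obj, n), _seen) for n in names if hasattr(obj, n)}
    finally:
        _seen.discard(id(obj))


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Node:
    def __init__(self, name):
        self.name = name
        self.pos = Point(1, 2)
        self.parent = None


root = Node("root")
child = Node("child")
child.parent = root
root.child = child  # root -> child -> root is a cycle

print(json.dumps(to_jsonable(root), indent=2))
```

This is only meant for eyeballing an object. It does not handle name-mangled slot names, and an object shared by two parents is simply written out twice.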
As a caveat: I am an utter novice here. I wouldn't be surprised to learn a) this is already answered, but I can't find it because I lack the vocabulary to describe my problem or b) my question is basically silly to begin with, because what I want to do is silly.
Is there some way to store a reference to a class instance that is defined and held in active memory and not stored in NDB? I'm trying to write an app that would help manage a number of characters/guilds in an MMO. I have a class, CharacterClass, that includes properties such as armor, name, etc. I define it in main.py as a base Python object, and then define the properties for each of the classes in the game. Each Character, which would be stored in Datastore, would have a property charClass, which would be a reference to one of those instances of CharacterClass. In theory I would be able to do things like
if character.charClass.armor == "Cloth":
while storing the potentially hundreds of unique characters and their specific data in Datastore, but without creating a copy of "Cloth" for every cloth-armor character, or querying Datastore for what kind of armor a mage wears thousands of times a day.
I don't know what kind of NDB property to use in Character to store the reference to the applicable CharacterClass. Or if that's the right way to do it, even. Thanks for taking the time to puzzle through my confused question.
A string is all you need. You just need to fetch the class based on the string value. You could create a custom property that automatically instantiates the class on reference.
However, I have a feeling that hard-coding the values in code might be a bit unwieldy. Maybe your character class instances should be datastore entities as well. That means you can adjust these parameters without deploying new code.
If you want these objects in memory then you can pre-cache them on warmup.
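A minimal sketch of the string-based lookup, in plain Python with no datastore involved. `Character` here is only a stand-in for the stored entity: in NDB the persisted field would presumably be an ordinary string property such as `ndb.StringProperty`. The registry name `CHARACTER_CLASSES` and the sample data are assumptions for illustration.

```python
class CharacterClass:
    """Static game data, defined once in code and kept in memory."""

    def __init__(self, name, armor):
        self.name = name
        self.armor = armor


# In-memory registry keyed by the string that gets stored with each character.
CHARACTER_CLASSES = {
    cls.name: cls
    for cls in (CharacterClass("Mage", "Cloth"), CharacterClass("Warrior", "Plate"))
}


class Character:
    """Stand-in for the stored entity: only `char_class_name` would be persisted."""

    def __init__(self, name, char_class_name):
        if char_class_name not in CHARACTER_CLASSES:
            raise ValueError("unknown character class: %r" % char_class_name)
        self.name = name
        self.char_class_name = char_class_name

    @property
    def charClass(self):
        # Resolved on access; no copy of the class data is stored per character.
        return CHARACTER_CLASSES[self.char_class_name]


character = Character("Gandalf", "Mage")
if character.charClass.armor == "Cloth":
    print(character.name, "wears cloth")
```

Every mage shares the same `CharacterClass` object, so nothing about "Cloth" is duplicated or queried per character.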
I'm working on creating a python program to interact with many different types of conceptual objects. For example, it might represent a person, in which case it'd have something like this:
type = "person"
name = "Bono"
profession = "performer"
nationality = "Irish"
However, it might also represent a magazine, in which case it'd look something like this
type = "publication"
name = "Rolling Stone"
editor = ("Jann Wenner" , "Will Dana")
founding_year = "1967"
Aside from type and name, all of the other fields are optional. Here's the tricky bit -- it's part of code written for a scraper, so all of the other fields are determined/created on the fly. In other words, we won't know that we need an "editor" field until the scraper spits back "editor" to the code.
Ideally, this would be implemented fairly straightforwardly as a python dictionary of lists. However, we'll be working with a large number of records -- too many to keep in memory at the same time. As a result, I'd like to have database compatibility -- something like Django's MVC, so we can easily query the record set.
One option I had considered was Django fieldsets, but it looks like they're still in beta and I worry that I'll lose some generality in what I can store -- ideally, I'd be able to store any type of data with a key, (value_list) pair. I'd love any input on the feasibility of fieldsets or example code.
Another option I had considered was a combination of the Django MVC and JSON. In this case, I'd have three columns for each object -- type, name, and attributes. Attributes would be a JSON serialization (or other appropriate pickling method) of all of the other attributes, so that once you had the object, you could reconstitute its attributes and query the set. I'd store something like this or this (links). With this method, I'd lose the ability to easily search on any of the attributes in the dict.
I'd very much appreciate any input or guidance. If anyone knows of similar projects, I'd love to know.
This seems like an excellent opportunity to use NoSQL databases. Something like MongoDB doesn't rely on a fixed schema, so it might be suitable for your scenario.
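For comparison, the three-column layout from the question (type, name, JSON attributes) can be sketched with nothing but the standard library. This uses `sqlite3` rather than Django or MongoDB purely to keep the example self-contained; the table name `objects` and the helper names are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (type TEXT NOT NULL, name TEXT NOT NULL, attributes TEXT NOT NULL)"
)


def save(conn, type_, name, **attributes):
    """Store the fixed fields as columns and everything else as one JSON blob."""
    conn.execute(
        "INSERT INTO objects VALUES (?, ?, ?)", (type_, name, json.dumps(attributes))
    )


def load(conn, name):
    """Rebuild the full record, or return None if the name is unknown."""
    row = conn.execute(
        "SELECT type, name, attributes FROM objects WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    record = json.loads(row[2])
    record.update(type=row[0], name=row[1])
    return record


def find_by_attribute(conn, key, value):
    """Search inside the blob: every row has to be decoded, which is the cost noted above."""
    names = []
    for name, raw in conn.execute("SELECT name, attributes FROM objects"):
        if value in json.loads(raw).get(key, []):
            names.append(name)
    return names


save(conn, "person", "Bono", profession=["performer"], nationality=["Irish"])
save(conn, "publication", "Rolling Stone",
     editor=["Jann Wenner", "Will Dana"], founding_year=["1967"])

print(load(conn, "Rolling Stone"))
print(find_by_attribute(conn, "editor", "Will Dana"))
```

Lookups by type or name stay cheap, while `find_by_attribute` shows why attribute searches get slow with this layout. A schemaless store avoids that scan by indexing the attributes itself.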
I have a SQLAlchemy class R which implements an m:n relation between two other classes A and B. So R has two integer columns source_id and target_id which hold the ids of the referenced instances. And R has two properties source_obj and target_obj which are defined via relationship. It's more or less the same as described here in the documentation.
What I want to do is to retrieve the referenced classes from R. I'm using SQLAlchemy 0.8 and tried to use the inspect() method on R.source_obj, but I only get back an InstrumentedAttribute, which does not seem to be of much help. At least I was not able to extract any useful information from it or to find any documentation about it.
Any help would be much appreciated! How do I get A and B from R?
Try something like this. I'm also dealing with this and could find no documentation, but I think this can help you get started.
from sqlalchemy import inspect
i = inspect(model)  # model is your mapped class, e.g. R
for relation in i.relationships:
    print(relation.direction.name)
    print(relation.remote_side)
    print(relation._reverse_property)
    print(dir(relation))  # list everything else the relationship exposes
I spent the majority of the day working on this same problem, and I was able to write a list comprehension that takes in a table and then spits out a list of the table names which are connected via a relationship or a foreign key. You need to convert that string into a reference to the actual class, but otherwise it works just fine.
relationship_list = [
    str(list(column.remote_side)[0]).split('.')[0]
    for column in inspect(table).relationships
]
By removing the .split('.')[0], you can get a list of the actual columns which are referred to by the connections. The comprehension is pretty ugly, but it works. Hope this helps anyone else who is looking for the same thing I was!
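If the goal is the referenced classes themselves rather than table names, each relationship also carries the mapper of its target, so the string parsing can be skipped. The sketch below assumes SQLAlchemy 1.4 or newer for the `declarative_base` import; the models are a guess at the shape described in the question, with made-up table names.

```python
from sqlalchemy import Column, ForeignKey, Integer, inspect
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class A(Base):
    __tablename__ = "a"  # hypothetical table name
    id = Column(Integer, primary_key=True)


class B(Base):
    __tablename__ = "b"  # hypothetical table name
    id = Column(Integer, primary_key=True)


class R(Base):
    __tablename__ = "r"  # hypothetical table name
    source_id = Column(Integer, ForeignKey("a.id"), primary_key=True)
    target_id = Column(Integer, ForeignKey("b.id"), primary_key=True)
    source_obj = relationship(A)
    target_obj = relationship(B)


# relationship name -> class on the other side of the relationship
targets = {rel.key: rel.mapper.class_ for rel in inspect(R).relationships}
print(targets)
```

Here `rel.key` is the attribute name on R and `rel.mapper.class_` is the mapped class it points to, so `targets["source_obj"]` is `A`.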
I am looking for an appropriate data structure in Python for processing variably structured forms. By variably structured forms I mean that the number of form fields and the types of the form's contents are not known in advance. They are defined by the user who populates the forms with his input.
What are the pros and cons of putting data in A) object attributes (e.g. of an otherwise empty "form"-class) or B) simply lists/dicts? Consider that I have to preserve the sequence of form fields, the form field names and the types.
(Strangely, it has been difficult to find conclusive information on this topic. As I am still new to Python, it's possible that I have searched for the wrong terms. If my question is not clear enough, please ask in the comments and I will try to clarify.)
In Python, as in all object-oriented languages, the purpose of classes is to associate data and closely-related methods that act on that data. If there's no real encapsulation going on (i.e. the methods help define the ways you can interact with the data), the best choice is a conglomeration of builtin types like lists and dictionaries as you mention and perhaps some utility functions that act on those sorts of data structures.
Python classes are literally just two dicts (one for functions, one for data), a name, and the rules for how Python looks up keys. When you access existing keys, there is absolutely no difference from a dict (unless you have overridden the access rules, of course).
That means that there is no drawback (besides more code) to using classes at all and you should never be afraid to write a class.
In your particular case I think you should go with classes, for one simple reason: You might want to extend them later. Maybe you want to add constraints on the name (length, allowed letters, uniqueness, ...) or the value (not empty, length, type, ...) of a field one day. Maybe you want to validate all fields in a form. If you use a class you can do this without changing any code outside the class! And as I said before, even if you don't, there are no drawbacks!
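As a rough illustration of that extensibility argument, here is a small field class where the constraints live inside the class, so callers never change when a rule is added. The type labels in `ALLOWED_TYPES` and the 32-character limit are arbitrary assumptions, not something from the question.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical mapping from a user-chosen type label to the Python type it must hold.
ALLOWED_TYPES = {"text": str, "integer": int, "flag": bool}


@dataclass
class FormField:
    name: str
    type: str
    value: Any

    def __post_init__(self):
        # Constraints on the name: not empty, bounded length.
        if not self.name or len(self.name) > 32:
            raise ValueError("field name must be 1 to 32 characters")
        # Constraints on the value: it must match the declared type label.
        expected = ALLOWED_TYPES.get(self.type)
        if expected is None:
            raise ValueError("unknown field type: %r" % self.type)
        if not isinstance(self.value, expected):
            raise TypeError("%s expects %s" % (self.name, expected.__name__))


age = FormField("age", "integer", 42)
print(age)
```

Code that builds or reads `FormField` objects stays the same if another check is added to `__post_init__` later.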
I guess my rule of thumb for classes is: Don't use a class if you're absolutely sure that there is nothing to add to it. If not just write those few extra lines.
It's not very Pythonic to randomly add members to an object. It would be more Pythonic if you used member methods to do it, but still not the way things are usually done.
Every library I've seen for this kind of thing uses dictionaries or lists, so that is the idiomatic Python way to handle the problem. Sometimes they use an object that overrides __getitem__ so it can behave like a dictionary or list, but it's still dictionary syntax that's used to access the fields.
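A minimal sketch of such a wrapper, assuming Python 3.7 or newer so that the plain dict inside preserves insertion order. The class name `Form` is made up; it is not taken from any particular library.

```python
class Form:
    """Holds user-defined fields in insertion order and exposes dictionary syntax."""

    def __init__(self):
        self._fields = {}  # plain dicts keep insertion order on Python 3.7+

    def __setitem__(self, name, value):
        self._fields[name] = value

    def __getitem__(self, name):
        return self._fields[name]

    def __contains__(self, name):
        return name in self._fields

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


form = Form()
form["name"] = "Ada"
form["age"] = 36

for field_name in form:
    print(field_name, form[field_name], type(form[field_name]).__name__)
```

Callers only ever use `form[...]`, so the storage behind it can change later without touching them.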
I think all the pros and cons have to do with people understanding your code, and since I've never seen code that handles this by having an object with members that can appear I don't think many people will find code that does do that to be very understandable.
A list of dictionaries (e.g. [{"type": "text", "name": "field_name", "value": "test value"}, ...]) would be a usable structure, if I understand your requirement correctly.
Whether objects are better in this case depends on what you're doing later. If you use the objects just as data storage, you don't gain anything. Maybe a list of field objects, which implement some appropriate methods to deal with your data, would also be a good choice.
Maybe you could set up an object for each field and store those in a list, but that practically ends up as a glorified dictionary.
Then you could access it like
fields[2].name
fields[2].value
etc.
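One lightweight way to get that attribute access without writing a full class is a namedtuple. This is just a sketch with invented sample data; converting back with `_asdict()` shows how close it is to the list-of-dicts structure.

```python
from collections import namedtuple

# One record per form field; the list keeps the fields in their original sequence.
Field = namedtuple("Field", ["type", "name", "value"])

fields = [
    Field("text", "title", "Hello"),
    Field("integer", "count", 3),
    Field("text", "field_name", "test value"),
]

print(fields[2].name)
print(fields[2].value)

# The same data as plain dictionaries.
as_dicts = [dict(f._asdict()) for f in fields]
print(as_dicts[2])
```

A namedtuple is immutable, so this fits forms that are read after being filled in. If fields need to change in place, a small class or a dict is the better fit.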