I'm currently storing IRC users in a dictionary, using their nickname as the 'key' for easy retrieval. The users are retrieved from SQLAlchemy, so users['deepy'] is an SQLAlchemy object which I regularly sync with my database.
Now the problem I have is that on IRC, people can be in many channels and I'm just keeping track of one. I need a suggestion on how to improve this.
I've been thinking about doing pretty much the same, but also storing the channel's users (as a list) in a dictionary with the channel names as key, so like:
{ '#two': ['reference to user9', 'reference to user62'], '#one': ['reference to user1', 'reference to user2'] }
The references being to the users dictionary which contains the SQLAlchemy object.
Is that a sensible approach?
I am using Python 2.7, PostgreSQL, SQLAlchemy and Twisted's irc.ircclient.
I've found that it's best to store as few SQLAlchemy objects as possible. Holding on to a large number of SQLAlchemy objects means worrying about synchronization issues, and it uses up memory.
I would keep track of usernames or user ids instead of actual user objects:
{ '#two': ['bob1', 'tom2'], '#one': ['bob1', 'mary1'] }
Then whenever I needed information for a user I would fetch them from the database. If I had only a few users and needed to access them frequently then I would create a dictionary that mapped usernames to SQLAlchemy user objects.
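A minimal sketch of that idea, assuming the bot gets join, part and quit notifications from somewhere (the function names below are hypothetical and are not Twisted callbacks). It keeps only nicknames per channel, and uses a set rather than a list so that a nickname cannot appear twice:

```python
from collections import defaultdict

# Hypothetical in-memory index: channel name -> set of nicknames.
# Only nicknames are stored; user rows are fetched from the database on demand.
channel_users = defaultdict(set)

def user_joined(channel, nick):
    channel_users[channel].add(nick)

def user_left(channel, nick):
    channel_users[channel].discard(nick)
    # Drop channels that no longer have any tracked users.
    if not channel_users[channel]:
        del channel_users[channel]

def user_quit(nick):
    # A quit removes the nickname from every channel at once.
    for channel in list(channel_users):
        user_left(channel, nick)

def channels_of(nick):
    return sorted(c for c, nicks in channel_users.items() if nick in nicks)

user_joined('#one', 'bob1')
user_joined('#one', 'mary1')
user_joined('#two', 'bob1')
user_joined('#two', 'tom2')
```

With this layout, answering "which channels is bob1 in?" is a scan over the channels, which should be fine for a handful of channels; a second nickname-to-channels dictionary would be the obvious addition if that lookup became frequent.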
I am designing a web application that has users becoming friends with other users. I am storing the users info in a database using sqlite3.
I am brainstorming on how I can keep track on who is friends with whom.
What I am thinking so far is to make a column in my database called Friendships where I store the user_ids (integers) of the user's friends.
I would have to store multiple integers in one column...how would I do that?
Is it possible to store a python list in a column?
I am also open to other ideas on how to store the friendship network information in my database....
The application runs on Flask.
What you are trying to do here is called a "many-to-many" relationship. Rather than making a "Friendships" column, you can make a "Friendship" table with two columns: user1 and user2. Entries in this table indicate that user1 has friended user2.
It is possible to store a list as a string in an SQL column.
However, you should instead look at creating a Friendships table whose primary key is made up of the user and the friend, so that you can query that table to pull up a user's list of friends.
Otherwise, I would suggest looking into a graph database, which handles this kind of thing well too.
If you want to store your data properly, you should learn more about relational databases. I recommend you read this first of all. With some normalization it will perform better, and some operations on the database will be much simpler.
As mentioned before, you should make another table for friendships so that the schema satisfies first normal form. That makes it much easier to modify relationships.
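The separate friendship table suggested in these answers could look roughly like the following with sqlite3. This is only a sketch: the table names, column names and sample rows are made up for illustration, and an in-memory database is used so that it runs on its own.

```python
import sqlite3

# In-memory database for illustration; a real app would open its own file.
conn = sqlite3.connect(':memory:')
conn.execute('PRAGMA foreign_keys = ON')
conn.executescript('''
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per "user_id has friended friend_id".
    CREATE TABLE friendships (
        user_id INTEGER NOT NULL REFERENCES users(id),
        friend_id INTEGER NOT NULL REFERENCES users(id),
        PRIMARY KEY (user_id, friend_id)
    );
''')

conn.executemany('INSERT INTO users (id, name) VALUES (?, ?)',
                 [(1, 'alice'), (2, 'bob'), (3, 'carol')])
conn.executemany('INSERT INTO friendships (user_id, friend_id) VALUES (?, ?)',
                 [(1, 2), (1, 3), (2, 1)])

def friends_of(user_id):
    # Join back to users to turn friend ids into names.
    rows = conn.execute(
        'SELECT u.name FROM friendships f '
        'JOIN users u ON u.id = f.friend_id '
        'WHERE f.user_id = ? ORDER BY u.name', (user_id,))
    return [name for (name,) in rows]
```

Here a friendship is directional (user 1 friending user 2 is a different row from user 2 friending user 1). If friendships should always be mutual, the app would insert both rows or query both columns.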
I have a web application that accesses large amounts of JSON data.
I want to use a key value database for storing JSON data owned/shared by different users of the web application (not users of the database). Each user should only be able to access the records they own or share.
In a relational database, I would add a column Owner to the record table, or manage shared ownerships in a separate table, and check access on the application side (Python). For key value stores, two approaches come to mind.
User ID as part of the key
What if I use keys like USERID_RECORDID and then write code to check the USERID before accessing the record? Is that a good idea? It wouldn't work with records that are shared between users.
User ID as part of the value
I could store one or more USERIDs in the value data and check if the data contains the ID of the user trying to access the record. Performance is probably slower than having the user ID as part of the key, but shared ownerships are possible.
What are typical patterns to do what I am trying to do?
Both of the solutions you described have some limitations.
You point out yourself that including the owner ID in the key does not solve the problem of shared data. However, this solution may be acceptable if you add another key/value pair containing the IDs of the contents shared with this user (key: userId:shared, value: [id1, id2, id3...]).
Your second proposal, in which the value includes the list of users who were granted access to a given content, is fine if and only if your application needs to retrieve the list of users who have access to a particular content. If you need to list all the contents a given user can access, this design will perform poorly, because the K/V store will have to scan all records, and this type of database engine usually doesn't let you create an index to optimise this kind of request.
From a more general point of view, with NoSQL databases and especially key/value stores, the model has to be defined according to the requests the application will make. That may lead you to duplicate some information, and the application is then responsible for keeping the data consistent.
For example, if you need to get all contents for a given user, whether the user owns them or they were shared with him, I suggest you create a key for the user containing the list of content IDs for that user, as I already said. But if your app also needs to get the list of users allowed to access a given content, you should add their IDs in a field of this content. This would result in something like:
key: contentID, value: { ..., [userId1, userID2...]}
When you remove a user's access to a given content, your app (and not the datastore) has to remove the userId from the content value, and the contentId from the list of contents for this user.
This design may require your app to make multiple requests: for example, one to get the list of userIDs allowed to access a given content, and one or more to get these user profiles. However, this should not really be a problem, as K/V stores usually perform very well.
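To make the double bookkeeping concrete, here is a small sketch that uses a plain Python dict as a stand-in for the key/value store. The key layout ('content:<id>' and 'user:<id>:contents') and the function names are assumptions made for this example, not something a particular store prescribes.

```python
import json

# Stand-in for a key/value store: string keys, JSON-encoded string values.
store = {}

def _get(key, default):
    raw = store.get(key)
    return default if raw is None else json.loads(raw)

def _put(key, value):
    store[key] = json.dumps(value)

def grant(content_id, user_id):
    # 1) Record the user on the content itself.
    content = _get('content:' + content_id, {'data': None, 'users': []})
    if user_id not in content['users']:
        content['users'].append(user_id)
    _put('content:' + content_id, content)
    # 2) Record the content on the user's own list (duplicated on purpose).
    ids = _get('user:' + user_id + ':contents', [])
    if content_id not in ids:
        ids.append(content_id)
    _put('user:' + user_id + ':contents', ids)

def revoke(content_id, user_id):
    # The application, not the store, keeps both sides consistent.
    content = _get('content:' + content_id, {'data': None, 'users': []})
    content['users'] = [u for u in content['users'] if u != user_id]
    _put('content:' + content_id, content)
    ids = _get('user:' + user_id + ':contents', [])
    _put('user:' + user_id + ':contents', [c for c in ids if c != content_id])

def can_access(content_id, user_id):
    return user_id in _get('content:' + content_id, {'users': []})['users']

def contents_for(user_id):
    return _get('user:' + user_id + ':contents', [])

grant('c1', 'alice')
grant('c1', 'bob')
grant('c2', 'alice')
revoke('c1', 'bob')
```

Both lookups ("who can access c1?" and "what can alice access?") are then single key reads, at the cost of two writes per grant or revoke. In a real store those two writes are not atomic unless the store offers transactions, so the app has to tolerate or repair a half-applied change.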
I have the following entities, each of which is a table in my database:
User
Application
Role
I have another table called "user_app_role" which looks like this:
create table user_app_role (
    user_id int not null,
    application_id int not null,
    role_id int not null,
    primary key (user_id, application_id, role_id)
);
where user_id, application_id, and role_id are all foreign keys on the user, application, and role tables.
An entry in that table indicates that the user has a particular role within a particular application, so a row of 1, 1, 1 indicates that user 1 has role 1 within application 1. Similarly, 1, 1, 2 would mean that user 1 also has role 2 within application 1.
I have sqlalchemy mappings defined for User, Application, and Role. What I would like is for the User object to somehow have a list of Application objects and for each Application object, that object would contain a list of Role objects.
From reading the documentation for SQLAlchemy, it appears this type of relationship is impossible to map, and I have found only a few other questions on Stack Overflow where this has been asked, none of which have an answer. This seems like a relatively normal 3NF database relationship (I have 4 of them in my whole data model). Is it possible to somehow set this up in SQLAlchemy? I could do this whole thing in pure SQL in about 10 minutes, but I don't want to throw away all the other useful features of SQLAlchemy. If I can't make this work somehow, my application will not be able to ship.
Also, PLEASE DO NOT suggest that I just alter my data model or denormalize the database or otherwise mess with that in any way. Answers of that nature will not help me. I'm happy to change my object model or add additional objects if I need to somehow magically map this one table to 2 objects or something weird like that, but I am not able to change the data model.
Ok as far as I can determine, creating an actual mapping for this scenario is impossible in SqlAlchemy so here is my workaround:
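from sqlalchemy import Column, ForeignKey, Integer

class UserAppRole(Base):
    __tablename__ = 'userapprole'
    # The three foreign keys together form the composite primary key.
    user_id = Column(Integer, ForeignKey('users.id'), primary_key=True)
    role_id = Column(Integer, ForeignKey('roles.id'), primary_key=True)
    app_id = Column(Integer, ForeignKey('apps.id'), primary_key=True)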
I've decided that in my application's domain, a User has roles with applications, so the relationship for this new object is going to be on the User:
from sqlalchemy.orm import backref, relationship

class User(Base):
    __tablename__ = 'users'
    approles = relationship(UserAppRole, backref=backref('app_user_role', uselist=True))
    # other columns, relationships, etc.
This works well, because when I load a user I want to know what applications they have access to in any way and I want to know what roles they have for those applications. In my domain, a Role is a thing and an Application is also a thing, and there are relatively few of those things and they tend not to change (although that isn't a requirement for this solution). What's important is that I can load Application and Role objects by their ID, which conveniently enough I now have in my approles list in my User object.
To bind this all together, the last thing I do is use a repository to handle persistence for my User objects. When I load a User, SqlAlchemy will populate the approles list with ID's. I can then manually load the applications and roles and build a dictionary for applications with a list of roles. Now when I add a role for a user to an application, I need to pass in the Application object (which knows what roles are valid) and a Role object (which may or may not be valid for that application). Both of those will have ID's, so it's pretty trivial for me to update the dictionary and/or approles list to contain what I want. In the unlikely event that I attempt to write to the DB with a role id or app id that doesn't exist, then the constraints in my database will reject the insert/update and I'll get an exception.
It's not my ideal solution, but it works quite well for the two situations in which I'm encountering this. If anyone has a better solution though, please post it. I feel like composite adjacency lists ought to be useful for this scenario, but the current SQLAlchemy documentation seems to indicate that these have to be for self-referencing tables, and I haven't found a way to make this work for this particular scenario.
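The "build a dictionary for applications with a list of roles" step described above might look something like this. It is a sketch in plain Python: the AppRole tuple stands in for the rows SQLAlchemy puts in approles, and load_app and load_role are hypothetical repository lookups that turn an ID into an object.

```python
from collections import defaultdict, namedtuple

# Stand-in for the rows that end up in User.approles.
AppRole = namedtuple('AppRole', ['user_id', 'app_id', 'role_id'])

def roles_by_application(approles, load_app, load_role):
    """Group (app_id, role_id) pairs into {application: [role, ...]}."""
    grouped = defaultdict(list)
    for row in approles:
        grouped[row.app_id].append(load_role(row.role_id))
    # Swap the application ids for the loaded application objects.
    return {load_app(app_id): roles for app_id, roles in grouped.items()}

# Illustrative lookups; strings stand in for Application and Role objects.
apps = {1: 'billing', 2: 'wiki'}
roles = {1: 'admin', 2: 'viewer'}
rows = [AppRole(1, 1, 1), AppRole(1, 1, 2), AppRole(1, 2, 2)]
result = roles_by_application(rows, apps.get, roles.get)
```

Using the application object itself as the dictionary key assumes it is hashable; keying by application ID instead would avoid that assumption.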
I'm evaluating Redis for storing some session values. When constructing the Redis client (we will be using this Python one) I get to pass in the db to use. Is it appropriate to use the db as a sort of prefix for my keys? E.g. store all session keys in db 0, some messages in db 1, and so on? Or should I keep all my application's keys in the same db?
Quoting my answer from this question:
It depends on your use case, but my rule of thumb is: If you have a very large quantity of related data keys that are unrelated to all the rest of your data in Redis, put them in a new database. Reasons being:
You may need to (non-ideally) use the keys command to get all of that data at some point, and having the data segregated makes that much cheaper.
You may want to switch to a second redis server later, and having related data pre-segregated makes this much easier.
You can keep your databases named somewhere, so it's easier for you, or a new employee to figure out where to look for particular data.
Conversely, if your data is related to other data, they should always live in the same database, so you can easily write pipelines and lua scripts that can access both.
I have a database table that is populated by a long running process. This process reads external data and updates the records in the database. Instead of really updating the records, it is easier to cascade-delete them and recreate. This way all the dependencies will be cleaned up too.
Each record has a unique name. I need a way to generate identifiers for these records such that the same name always gets the same identifier, so that the identifier stays the same when the record is deleted and recreated. I tried using slugs, but they can become very long and Django's SlugField does not always work.
Is it reasonable to use a secure hash as the key? I could create a hash from the slug and use that. Or is it too expensive?
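For reference, deriving such an identifier could be as small as the sketch below. The function name and the 16-character truncation are arbitrary choices for this example; shortening the digest makes keys more compact but raises the chance of collisions, so the length would need to be chosen with the number of records in mind.

```python
import hashlib

def stable_id(name, length=16):
    """Return the same hex identifier every time for the same name."""
    digest = hashlib.sha256(name.encode('utf-8')).hexdigest()
    # Truncating keeps the key short; the full digest is 64 hex characters.
    return digest[:length]

first = stable_id('Some Record Name')
again = stable_id('Some Record Name')
other = stable_id('Another Record Name')
```

Because the identifier depends only on the name, a record that is deleted and recreated under the same name gets the same key, which is the property asked for above.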