Dealing with Arrays in Flask-SqlAlchemy and MySQL - python

I have a datamodel where I store a list of values separated by comma (1,2,3,4,5...).
In my code, in order to work with arrays instead of string, I have defined the model like this one:
class MyModel(db.Model):
pk = db.Column(db.Integer, primary_key=True)
__fake_array = db.Column(db.String(500), name="fake_array")
#property
def fake_array(self):
if not self.__fake_array:
return
return self.__fake_array.split(',')
#fake_array.setter
def fake_array(self, value):
if value:
self.__fake_array = ",".join(value)
else:
self.__fake_array = None
This works perfect and from the point of view of my source code "fake_array" is an array, It's only transformed into string when it's stored in database.
The problem appears when I try to filter by that field. Expressions like this doesn't work:
MyModel.query.filter_by(fake_array="1").all()
It seems that I cant filter using the SqlAlchemy query model.
What can I do here? Is there any way to filter this kind of fields? Is there is a better pattern for the "fake_array" problem?
Thanks!

What you're trying to do should really be replaced with a pair of tables and a relationship between them.
The first table (which I'll call A) contains everything BUT the array column, and it should have a primary key of some sort. You should have another table (which I'll call B) that contains a primary key, a foreign key column to A (which I'll call a_id, and an integer field.
Using this layout, each row in the A table has its associated array in table B where B's a_id == A.id via a join. You can add or remove values from the array by manipulating the rows in table B. You can filter by using a join.
If the order of the values is needed, then create an order column in table B.

Related

GeoDjango: How to perform a query of spatially close records

I have two Django models (A and B) which are not related by any foreign key, but both have a geometry field.
class A(Model):
position = PointField(geography=True)
class B(Model):
position = PointField(geography=True)
I would like to relate them spatially, i.e. given a queryset of A, being able to obtain a queryset of B containing those records that are at less than a given distance to A.
I haven't found a way using pure Django's ORM to do such a thing.
Of course, I could write a property in A such as this one:
#property
def nearby(self):
return B.objects.filter(position__dwithin=(self.position, 0.1))
But this only allows me to fetch the nearby records on each instance and not in a single query, which is far from efficient.
I have also tried to do this:
nearby = B.objects.filter(position__dwithin=(OuterRef('position'), 0.1))
query = A.objects.annotate(nearby=Subquery(nearby.values('pk')))
list(query) # error here
However, I get this error for the last line:
ValueError: This queryset contains a reference to an outer query and may only be used in a subquery
Does anybody know a better way (more efficient) of performing such a query or maybe the reason why my code is failing?
I very much appreciate.
I finally managed to solve it, but I had to perform a raw SQL query in the end.
This will return all A records with an annotation including a list of all nearby B records:
from collections import namedtuple
from django.db import connection
with connection.cursor() as cursor:
cursor.execute('''SELECT id, array_agg(b.id) as nearby FROM myapp_a a
LEFT JOIN myapp_b b ON ST_DWithin(a.position, p.position, 0.1)
GROUP BY a.id''')
nt_result = namedtuple('Result', [col[0] for col in cursor.description])
results = [nt_result(*row) for row in cursor.fetchall()]
References:
Raw queries: https://docs.djangoproject.com/en/2.2/topics/db/sql/#executing-custom-sql-directly
Array aggregation: https://www.postgresql.org/docs/8.4/functions-aggregate.html
ST_DWithin: https://postgis.net/docs/ST_DWithin.html

Insert dictionary into SQLlite3

I want to insert data from a dictionary into a sqlite table, I am using slqalchemy to do that, the keys in the dictionary and the column names are the same, and I want to insert the values into the same column name in the table. So this is my code:
#This is the class where I create a table from with sqlalchemy, and I want to
#insert my data into.
#I didn't write the __init__ for simplicity
class Sizecurve(Base):
__tablename__ = 'sizecurve'
XS = Column(String(5))
S = Column(String(5))
M = Column(String(5))
L = Column(String(5))
XL = Column(String(5))
XXL = Column(String(5))
o = Mapping() #This creates an object which is actually a dictionary
for eachitem in myitems:
# Here I populate the dictionary with keys from another list
# This gives me a dictionary looking like this: o={'S':None, 'M':None, 'L':None}
o[eachitem] = None
for eachsize in mysizes:
# Here I assign values to each key of the dictionary, if a value exists if not just None
# product_row is a class and size and stock are its attributes
if(product_row.size in o):
o[product_row.size] = product_row.stock
# I put the final object into a list
simplelist.append(o)
Now I want to put each the values from the dictionaries saved in simplelist into the right column in the sizecurve table. But I am stuck I don't know how to do that? So for example I have an object like this:
o= {'S':4, 'M':2, 'L':1}
And I want to see for the row for column S value 4, column M value 2 etc.
Yes, it's possible (though aren't you missing primary keys/foreign keys on this table?).
session.add(Sizecurve(**o))
session.commit()
That should insert the row.
http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
EDIT: On second read it seems like you are trying to insert all those values into one column? If so, I would make use of pickle.
https://docs.python.org/3.5/library/pickle.html
If performance is an issue (pickle is pretty fast, but if your doing 10000 reads per second it'll be the bottleneck), you should either redesign the table or use a database like PostgreSQL that supports JSON objects.
I have found this answer to a similar question, though this is about reading the data from a json file, so now I am working on understanding the code and also changing my data type to json so that I can insert them in the right place.
Convert JSON to SQLite in Python - How to map json keys to database columns properly?

GQL query checking numerical id within a list of ids

I'm trying to compare the unique numerical id of an element in my database with a list of longs.
My GQL query should return those elements which have this id I'm passing as part of their array of longs.
I've tried using a statement of the form:
"SELECT * FROM Table WHERE id IN :1", list_of_stored_ids
I've also tried using this question: GQL query with numeric id in datastore viewer, but I still can't find any way to compare to a list.
Is there such a way? If not, what must I do?
You will need to build up a list of ndb keys, not numeric ids, in order to get this to work.
eg:
ids = [5918782761467904, 5624113645223936, 5463928544952320]
keys = [ndb.Key('<Entity>', id) for id in ids]
entities = ndb.gql("SELECT * FROM <Entity> WHERE __key__ IN :1", keys).fetch()
or (non-GQL version)
entities = ndb.get_multi(keys)

What is the best way to store and query a very large number of variable-length lists in a MySQL database?

Maybe this question will be made more clear through an example. Let's say the dataset I'm working with is a whole bunch (several gigabytes) of variable-length lists of tuples, each associated with a unique ID and a bit of metadata, and I want to be able quickly retrieve any of these lists by its ID.
I currently have two tables set up more or less like this:
TABLE list(
id VARCHAR PRIMARY KEY,
flavor VARCHAR,
type VARCHAR,
list_element_start INT,
list_element_end INT)
TABLE list_element(
id INT PRIMARY KEY,
value1 FLOAT,
value2 FLOAT)
To pull a specific list out of the database I currently do something like this:
SELECT list_element_start, list_element_end FROM list WHERE id = 'my_list_id'
Then I use the retrieved list_element_start and list_element_end values to get the list elements:
SELECT *
FROM list_element
WHERE id BETWEEN(my_list_element_start, my_list_element_end)
Of course, this works very fast, but I feel as though there's a better way to do this. I'm aware that I could have another column in list_element_end called list_id, and then do something like SELECT * FROM list_element WHERE list_id = 'my_list_id' ORDER BY id. However, it seems to me that having that extra column, as well as a foreign key index on that column would take up a lot of unnecessary space.
Is there simpler way to do this?
Apologies if this question has been asked before, but I was unable to locate the answer. I'd also like to use SQLAlchemy in Python to do all of this, if possible.
Thanks in advance!
between is not a function so I don't know what you think is going on there. Anyway... Why not:
SELECT e.*
FROM list_element e
Join list l
On l.id between e.my_list_element_start and my_list_element_end
Or am I missing something
You can normalize each element of your array into a row. The following is the declarative style in SQLAlchemy that will give you a "MyList" object with flavor etc, and then elements will be an actual Python list of each "MyElement" object. You could get more complicated to weed out the extra id and idx within the returned element list, but this should be plenty fast enough.
Also, above, you had mixed varchar and int for your primary key, not sure if it was just oversight, but you ought not do that. Additionally, when handling large data sets remember options like chunking. You can use offset and limit to work with smaller sizes and process iteratively.
class MyList(Base):
__tablename__ = 'my_list'
id = Column(Integer, primary_key=True)
flavor = Column(String)
list_type = Column(String)
elements = Relationship('my_element', order_by='my_element.idx')
class MyElement(Base):
__tablename__ = 'my_element'
id = Column(Integer, ForeignKey('my_list.id'))
idx = Column(Integer)
val = Column(Integer)
__table_args__ = (PrimaryKeyConstraint('id','idx'), )

Querying a view in SQLAlchemy

I want to know if SQLAlchemy has problems querying a view. If I query the view with normal SQL on the server like:
SELECT * FROM ViewMyTable WHERE index1 = '608_56_56';
I get a whole bunch of records. But with SQLAlchemy I get only the first one. But in the count is the correct number. I have no idea why.
This is my SQLAlchemy code.
myQuery = Session.query(ViewMyTable)
erg = myQuery.filter(ViewMyTable.index1 == index1.strip())
# Contains the correct number of all entries I found with that query.
totalCount = erg.count()
# Contains only the first entry I found with my query.
ergListe = erg.all()
if you've mapped ViewMyTable, the query will only return rows that have a fully non-NULL primary key. This behavior is specific to versions 0.5 and lower - on 0.6, if any of the columns have a non-NULL in the primary key, the row is turned into an instance. Specify the flag allow_null_pks=True to your mappers to ensure that partial primary keys still count :
mapper(ViewMyTable, myview, allow_null_pks=True)
If OTOH the rows returned have all nulls for the primary key, then SQLAlchemy cannot create an entity since it can't place it into the identity map. You can instead get at the individual columns by querying for them specifically:
for id, index in session.query(ViewMyTable.id, ViewMyTable.index):
print id, index
I was facing similar problem - how to filter view with SQLAlchemy. For table:
t_v_full_proposals = Table(
'v_full_proposals', metadata,
Column('proposal_id', Integer),
Column('version', String),
Column('content', String),
Column('creator_id', String)
)
I'm filtering:
proposals = session.query(t_v_full_proposals).filter(t_v_full_proposals.c.creator_id != 'greatest_admin')
Hopefully it will help:)

Categories

Resources