SQLAlchemy: Logical Operators with case statement - python

Here is my MS SQL code snippet:
SELECT
    SUM(CASE WHEN me.status NOT IN ('CLOSED', 'VOID') AND me.pc_cd IN ('IK', 'JM')
        THEN 1 ELSE 0 END) AS current_cd
FROM ccd_pvc me WITH (NOLOCK)
How would I use the AND operator with a CASE statement if I write the above statement in SQLAlchemy?
I have tried this, but it did not work:
case([and_((ccd_pvc.status.in_(['CLOSED', 'VOID']),ccd_pvc.pc_cd.in_(['IK','JM'])),
literal_column("'greaterthan100'"))])
I have searched through the SQLAlchemy documentation but did not find info on using logical operators with a case statement.

This should get you started:
ccd_pvc = aliased(CcdPvc, name="me")
expr = func.sum(
    case(
        [(and_(ccd_pvc.status.notin_(['CLOSED', 'VOID']),  # NOT IN, per the original SQL
               ccd_pvc.pc_cd.in_(['IK', 'JM'])), 1)],
        else_=0)
).label("current_cd")
q = session.query(expr)
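Note that the list-of-whens calling form above is the legacy 1.x style; in SQLAlchemy 1.4+ the whens are passed as positional 2-tuples instead. A minimal self-contained sketch (the model is reconstructed from the question; column types are assumptions):

```python
from sqlalchemy import Column, Integer, String, and_, case, func, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class CcdPvc(Base):
    __tablename__ = "ccd_pvc"
    id = Column(Integer, primary_key=True)
    status = Column(String(16))
    pc_cd = Column(String(8))

# 1.4+ style: whens are positional (condition, value) tuples, no wrapping list
expr = func.sum(
    case(
        (and_(CcdPvc.status.notin_(["CLOSED", "VOID"]),
              CcdPvc.pc_cd.in_(["IK", "JM"])), 1),
        else_=0,
    )
).label("current_cd")
print(select(expr))
```

Printing the select shows the rendered CASE expression wrapped in SUM, matching the original T-SQL shape.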

Related

"COUNT field incorrect or syntax error" when passing a long list of values to `.in_()`

I am trying to add a very basic SQL query in my flask API project. I am using SQLAlchemy as the database manipulation tool.
The query I want to run is the following:
SELECT * from trip_metadata where trip_id in ('trip_id_1', 'trip_id_2', ..., 'trip_id_n')
So, in my code, I wrote:
trips_ids = ['trip_id_1', 'trip_id_2', ..., 'trip_id_n']
result = session.query(dal.trip_table).filter(dal.trip_table.columns.trip_id.in_(trips_ids)).all()
When n is low, let's say n = 10, it works very well and I get the expected result. However, when n is high, let's say n > 1000, it crashes. I am very surprised, as it seems usual to put many values in the filter.
The error log is:
sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('07002', '[07002] [Microsoft][ODBC Driver 17 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')
[SQL: SELECT * FROM trip_metadata
WHERE trip_metadata.trip_id IN (?, ?, ..., ?)]
[parameters: ('ABC12345-XXXX-XXXX-XXXX-000000000000', 'DEF12345-XXXX-XXXX-XXXX-000000000000', ..., 'GHI12345-XXXX-XXXX-XXXX-000000000000')]
(Background on this error at: https://sqlalche.me/e/14/dbapi)
127.0.0.1 - - [05/Jan/2023 10:35:48] "POST /api/v1/tripsAggregates HTTP/1.1" 500 -
However when I write the raw request, it works well, even when n is very high:
from sqlalchemy import text
trip_ids_tuple = ('trip_id_1', 'trip_id_2', ..., 'trip_id_n')
result = session.execute(text(f"SELECT * FROM trip_metadata where trip_id in {trip_ids_tuple}"))
But I don't think this is a good way of doing it, because I have much more complex requests to write and using SQLAlchemy filters is better adapted.
Do you have any idea how to fix my issue while still using the SQLAlchemy library? Thank you very much.
Microsoft's ODBC drivers for SQL Server execute statements using a system stored procedure on the server (sp_prepexec or sp_prepare). Stored procedures on SQL Server are limited to 2100 parameter values, so with a model like
class Trip(Base):
    __tablename__ = "trip"
    id = Column(String(32), primary_key=True)
this code will work
with Session(engine) as session:
    trips_ids = ["trip_id_1", "trip_id_2"]
    q = session.query(Trip).where(Trip.id.in_(trips_ids))
    results = q.all()
"""SQL emitted:
SELECT trip.id AS trip_id
FROM trip
WHERE trip.id IN (?, ?)
[generated in 0.00092s] ('trip_id_1', 'trip_id_2')
"""
because it only has two parameter values. If the length of the trips_ids list is increased to thousands of values the code will eventually fail.
One way to avoid the issue is to have SQLAlchemy construct an IN clause with literal values instead of parameter placeholders:
q = session.query(Trip).where(
    Trip.id.in_(bindparam("p1", expanding=True, literal_execute=True))
)
results = q.params(p1=trips_ids).all()
"""SQL emitted:
SELECT trip.id AS trip_id
FROM trip
WHERE trip.id IN ('trip_id_1', 'trip_id_2')
[generated in 0.00135s] ()
"""
From the error, it could indicate a formatting issue (are the string values escaped properly?). This can happen even when n is small, but data that breaks the formatting has a low chance of occurring; when n gets large, there is a higher likelihood of "bad data" that SQLAlchemy tries to put into the query. I can't say for certain; it might also be a memory or operating-system issue.
First thing to ask here: do you need to have the tuples provided externally to the query? Is there a way to query for the trip_ids via a join? It is usually best to push operations to the SQL engine, but this isn't always possible if you are getting the tuples/lists of ids elsewhere.
Rule out a data issue that errors out during execute(): look into escaping the string values. You can also try chunking the list into smaller pieces to narrow down the potentially problematic values (see the appendix below).
Try a different way to string-format the query.
sql = f"SELECT * FROM trip_metadata where trip_id in ({','.join(repr(t) for t in trip_ids_tuple)})"
sql output str value (note the quoting around each string value):
"SELECT * FROM trip_metadata where trip_id in ('trip_id_1','trip_id_2','trip_id_n')"
Appendix:
You can potentially build in a chunker mechanism to break the list into smaller pieces; the crashing might also be an operating-system or memory issue. For example, using list slicing:
def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

# Example usage
my_list = [1, 2, 3, 4, 5, 6, 7, 8]
chunk_size = 3
for chunk in chunker(my_list, chunk_size):
    print(chunk)
output
[1, 2, 3]
[4, 5, 6]
[7, 8]
Use a chunk size where each batch stays manageable. This will also help narrow down the potentially problematic str values that error out.
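Putting the chunker together with the original filter could look like this. A sketch only: `session` and `trip_table` are assumed to be set up as in the question, and `fetch_trips` is a hypothetical helper name; the chunk size of 1000 keeps each statement well under the driver's 2100-parameter ceiling.

```python
def chunker(seq, size):
    # yield successive slices of at most `size` elements
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

def fetch_trips(session, trip_table, trip_ids, size=1000):
    # run one IN (...) query per chunk instead of one giant IN clause
    rows = []
    for chunk in chunker(trip_ids, size):
        rows.extend(
            session.query(trip_table)
                   .filter(trip_table.columns.trip_id.in_(chunk))
                   .all()
        )
    return rows
```

This trades one round trip for several, but each statement stays within the parameter limit and a failing chunk immediately narrows down where the bad value is.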

Set query parameters of RFC_READ_TABLE using win32com module?

I'm trying to port to Python a SAP table download script that already works in Excel VBA, but I want a command-line version, and I would prefer to avoid VBScript for a number of reasons that go beyond the goal of this post.
I'm stuck at the point where I need to fill in the values of a table:
from win32com.client import Dispatch

Functions = Dispatch("SAP.Functions")
Functions.Connection.Client = "400"
Functions.Connection.ApplicationServer = "myserver"
Functions.Connection.Language = "EN"
Functions.Connection.User = "myuser"
Functions.Connection.Password = "mypwd"
Functions.Connection.SystemNumber = "00"
Functions.Connection.UseSAPLogonIni = False

if Functions.Connection.Logon(0, True) == True:
    print("Logon OK")

RFC = Functions.Add("RFC_READ_TABLE")
RFC.exports("QUERY_TABLE").Value = "USR02"
RFC.exports("DELIMITER").Value = "~"
#RFC.exports("ROWSKIPS").Value = 2000
#RFC.exports("ROWCOUNT").Value = 10
tblOptions = RFC.Tables("OPTIONS")

# RETURNED DATA
tblData = RFC.Tables("DATA")
tblFields = RFC.Tables("FIELDS")

tblFields.AppendRow()
print(tblFields.RowCount)
print(tblFields(1, "FIELDNAME"))
# the 2 lines above print 1 and an empty string, so the row in the table exists
Up to here it is basically copied from VBA, adapting the syntax.
In VBA at this point I'm able to do
tblFields(1,"FIELDNAME") = "BNAME"
If I do the same in Python I get an error, because the left part is a function call and, written that way, it returns a string. In VBA it is probably a bi-dimensional array.
I unsuccessfully tried various approaches like
tblFields.setValue([{"FIELDNAME":"BNAME"}])
tblFields(1,"FIELDNAME").Value = "BNAME"
tblFields(1,"FIELDNAME").setValue("BNAME")
tblFields.FieldName = "BNAME" ##kinda desperate
The script works without setting the FIELDS table for outputs that produce rows shorter than 500 characters; this is a SAP limit in the function.
I know that this is not the best way, but I can't use the SAPNWRFC library and I can't use librfc32.dll.
I must be able to solve it this way, or revert to the VB version.
Thanks to anyone who can provide a hint.
After a lot of trial and error, I found a solution.
Instead of adding rows one by one to the "OPTIONS" or "FIELDS" tables, you can just submit a prefilled table.
This should work:
tblFields.Data = (('VBELN', '000000', '000000', '', ''),
                  ('POSNR', '000000', '000000', '', ''))
same here:
tblOptions.Data = (("VBELN EQ '2557788'",),)
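If you need many fields, the same 5-column row shape used in the tuples above can be generated from a plain list of field names. This is a hypothetical helper built on the answer's observation, not part of the SAP API, and the column meanings in the comment are my reading of RFC_READ_TABLE's FIELDS structure:

```python
def build_fields(field_names):
    # each FIELDS row appears to be a 5-tuple:
    # (FIELDNAME, OFFSET, LENGTH, TYPE, FIELDTEXT);
    # RFC_READ_TABLE fills in offset/length/type itself, so placeholders suffice
    return tuple((name, "000000", "000000", "", "") for name in field_names)

# usage (assuming tblFields from the script above):
# tblFields.Data = build_fields(["VBELN", "POSNR"])
```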

SQLAlchemy can't use func.bigger as func in query

To describe my problem, consider this raw SQL:
select datediff(now(), create_time) > 7 as is_new from test order by is_new desc limit 19;
I tried to implement it with SQLAlchemy step by step:
diff_days = func.datediff(today, test.create_time).label("diff_days")
session.query(diff_days).filter(test.id.in_((1,2,3,33344))).order_by(diff_days.asc()).all()
This works fine. But when I tried to express > for MySQL, it failed:
is_new = func.greater(func.datediff(today, test.create_time), 7).label("is_new")
session.query(is_new).filter(test.id.in_((1,2,3,33344))).order_by(is_new.asc()).all()
I know SQLAlchemy renders my expression as greater(...), which MySQL doesn't support. So how can I get a > b with something like greater(a, b)?
Maybe the simple SQL select a > b from test describes the problem too, while the above is my original need. So the problem can be restated as:
How to use the SQLAlchemy ORM to implement select a > b from test.
SQLAlchemy offers you rich operator overloading, so just do
is_new = (func.datediff(today, test.create_time) > 7).label("is_new")
session.query(is_new).\
    filter(test.id.in_([1, 2, 3, 33344])).\
    order_by(is_new.asc()).\
    all()
This works since the created Function is also a ColumnElement and as such has ColumnOperators.
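A minimal, self-contained sketch of this (the model is reconstructed from the question; column types are assumptions) that prints the SQL the comparison renders to:

```python
from sqlalchemy import Column, DateTime, Integer, func, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = "test"
    id = Column(Integer, primary_key=True)
    create_time = Column(DateTime)

# comparing a Function result with > yields a boolean ColumnElement,
# which can be labeled and ordered on like any other column expression
is_new = (func.datediff(func.now(), Test.create_time) > 7).label("is_new")
print(select(is_new).order_by(is_new.desc()))
```

The rendered statement contains a plain `... > ...` comparison rather than a `greater(...)` function call, which is what MySQL expects.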

How to include `search_type=count` in a query?

I have a Python script that runs many ElasticSearch aggregations, e.g.:
client = Elasticsearch(...)
q = {"aggs": {"my_name": {"terms": {"field": "fieldname"}}}}
res = client.search(index = "indexname*", doc_type = "doc_type", body = q)
But this returns both the search hits (a match-all query, I think) in res["hits"] and the aggregation results in res["aggregations"].
What I want to run is the Python equivalent of the following
GET /index*/doc_type/_search?search_type=count
{"aggs": {"my_name": {"terms": {"field": "fieldname"}}}}
How do I make sure to include the ?search_type=count when using Python Elasticsearch?
I'd like to know this in general, but the current reason I'm looking into this is I occasionally get errors caused by timeouts or data size when running the queries. My suspicion is that if I can only ask for the counting then I'll avoid these.
The general consensus is not to use search_type=count anymore, as it was deprecated in ES 2.0. Instead you should simply use size: 0.
res = client.search(index = "indexname*", doc_type = "doc_type", body = q, size=0)
Note the size=0 parameter added at the end.
Here is the documentation for search
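Equivalently, size can be put in the request body itself; a sketch (the index and field names are placeholders taken from the question):

```python
q = {
    "size": 0,  # suppress hits, return only aggregations
    "aggs": {"my_name": {"terms": {"field": "fieldname"}}},
}
# res = client.search(index="indexname*", body=q)
# res["aggregations"]["my_name"]["buckets"] would then hold the per-term counts
```

Skipping hit collection this way also reduces the response size, which may help with the timeout errors mentioned in the question.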
Try this
res = client.search(index = "indexname*", doc_type = "doc_type", body = q, search_type='count')
Look at the answer of @Val if you are using ES 2.x.

SQL Query translation to SQLAlchemy

Hello, I am trying to translate the following relatively simple query to SQLAlchemy, but I get:
('Unexpected error:', <class 'sqlalchemy.exc.InvalidRequestError'>)
SELECT model, COUNT(model) AS count FROM log.logs
WHERE SOURCE = "WEB" AND year(timestamp) = 2015 AND month(timestamp) = 1
and account = "Test" and brand = "Nokia" GROUP BY model ORDER BY count DESC limit 10
This is what I wrote, but it is not working. What is wrong?
devices = db.session.query(Logs.model).filter_by(source=source).filter_by(account=acc).filter_by(brand=brand).\
filter_by(year=year).filter_by(month=month).group_by(Logs.model).order_by(Logs.model.count().desc()).all()
It's a bit hard to tell from your code sample, but the following is hopefully the correct SQLAlchemy code. Try:
from sqlalchemy.sql import func
devices = (db.session
           .query(Logs.model, func.count(Logs.model).label('count'))
           .filter_by(source=source)
           .filter_by(account=acc)
           .filter_by(brand=brand)
           # year()/month() are SQL functions applied to the timestamp column,
           # not columns themselves, so they need filter() with func
           .filter(func.year(Logs.timestamp) == year)
           .filter(func.month(Logs.timestamp) == month)
           .group_by(Logs.model)
           .order_by(func.count(Logs.model).desc())
           .limit(10)
           .all())
Note that I've enclosed the query in a (...) to avoid having to use \ at the end of each line.
