inconsistent results from LIKE query: pyodbc vs. Access - python

I have a bunch of queries that need to be executed against an Access database as part of my Python script. Unfortunately, queries that return records when run directly in MS Access return nothing when run from the Python script (no error either). The connection to the database and the general syntax should be fine, since simple queries (select one column from a table where something) work just fine. Here is the code with one of these queries:
import pyodbc
baza = r"C:\base.mdb"
driver = "{Microsoft Access Driver (*.mdb, *.accdb)}"
access_con_string = r"Driver={};Dbq={};".format(driver, baza)
cnn = pyodbc.connect(access_con_string)
db_cursor = cnn.cursor()
expression = """SELECT F_PARCEL.PARCEL_NR, F_PARCEL_LAND_USE.AREA_USE_CD, F_PARCEL_LAND_USE.SOIL_QUALITY_CD, F_ARODES.TEMP_ADRESS_FOREST, F_SUBAREA.AREA_TYPE_CD, F_AROD_LAND_USE.AROD_LAND_USE_AREA, F_PARCEL.COUNTY_CD, F_PARCEL.DISTRICT_CD, F_PARCEL.MUNICIPALITY_CD, F_PARCEL.COMMUNITY_CD, F_SUBAREA.SUB_AREA
FROM F_PARCEL INNER JOIN (F_PARCEL_LAND_USE INNER JOIN ((F_ARODES INNER JOIN F_AROD_LAND_USE ON F_ARODES.ARODES_INT_NUM = F_AROD_LAND_USE.ARODES_INT_NUM) INNER JOIN F_SUBAREA ON F_ARODES.ARODES_INT_NUM = F_SUBAREA.ARODES_INT_NUM) ON (F_PARCEL_LAND_USE.SHAPE_NR = F_AROD_LAND_USE.SHAPE_NR) AND (F_PARCEL_LAND_USE.PARCEL_INT_NUM = F_AROD_LAND_USE.PARCEL_INT_NUM)) ON F_PARCEL.PARCEL_INT_NUM = F_PARCEL_LAND_USE.PARCEL_INT_NUM
WHERE (((F_ARODES.TEMP_ADRESS_FOREST) Like ?) AND ((F_AROD_LAND_USE.AROD_LAND_USE_AREA)<?) AND ((F_ARODES.TEMP_ACT_ADRESS)= ?))
ORDER BY F_PARCEL.PARCEL_NR, F_PARCEL_LAND_USE.SHAPE_NR;"""
rows = db_cursor.execute(expression, ("14-17-2-03*", 0.0049, True)).fetchall()
for row in rows:
    print(row)
cnn.close()
I know that these queries were generated with the query builder in MS Access, so I wondered whether that might account for the differences, but on the other hand it is still an Access database.
Anyway, it seems the problem is in the SQL, so I would like to know: what elements could possibly produce different output between queries executed directly in MS Access and through a pyodbc connection?

You are getting tripped up by the difference in LIKE wildcard characters between queries run in Access itself and queries run from an external application.
When running a query from within Access itself you need to use the asterisk as the wildcard character: "14-17-2-03*".
When running a query from an external application (like your Python app) you need to use the percent sign as the wildcard character: "14-17-2-03%".
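Only the parameter value needs to change; a minimal sketch of the corrected call from the code above:

rows = db_cursor.execute(expression, ("14-17-2-03%", 0.0049, True)).fetchall()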

Related

sqlalchemy error when calling mysql stored procedure

I'm using sqlalchemy to run queries on a MySQL server from Python.
I initialize sqlalchemy with:
engine = create_engine("mysql+mysqlconnector://{user}:{password}@{host}:{port}/{database}".format(**connection_params))
conn = engine.connect()
Where connection_params is a dict containing the server access details.
I'm running this query:
SELECT
new_db.asset_specification.identifier_code,
new_db.asset_specification.asset_name,
new_db.asset_specification.asset_type,
new_db.asset_specification.currency_code,
new_db.sector_map.sector_description,
new_db.super_sector_map.super_sector_description,
new_db.country_map.country_description,
new_db.country_map.country_macro_area
FROM new_db.asset_specification
INNER JOIN new_db.identifier_code_legal_entity_map on new_db.asset_specification.identifier_code = new_db.identifier_code_legal_entity_map.identifier_code
INNER JOIN new_db.legal_entity_map on new_db.identifier_code_legal_entity_map.legal_entity_code = new_db.legal_entity_map.legal_entity_code
INNER JOIN new_db.sector_map on new_db.legal_entity_map.legal_entity_sector = new_db.sector_map.sector_code
INNER JOIN new_db.super_sector_map on new_db.legal_entity_map.legal_entity_super_sector = new_db.super_sector_map.super_sector_code
INNER JOIN new_db.country_map on new_db.legal_entity_map.legal_entity_country = new_db.country_map.country_code
WHERE new_db.asset_specification.identifier_code = str_identifier_code;
Using conn.execute(query) (where I set query equal to the string above).
This runs just fine.
I tried to put my query in a stored procedure like:
CREATE DEFINER=`root`@`localhost` PROCEDURE `test_anag`(IN str_identifier_code varchar(100))
BEGIN
SELECT
new_db.asset_specification.identifier_code,
new_db.asset_specification.asset_name,
new_db.asset_specification.asset_type,
new_db.asset_specification.currency_code,
new_db.sector_map.sector_description,
new_db.super_sector_map.super_sector_description,
new_db.country_map.country_description,
new_db.country_map.country_macro_area
FROM new_db.asset_specification
INNER JOIN new_db.identifier_code_legal_entity_map on new_db.asset_specification.identifier_code = new_db.identifier_code_legal_entity_map.identifier_code
INNER JOIN new_db.legal_entity_map on new_db.identifier_code_legal_entity_map.legal_entity_code = new_db.legal_entity_map.legal_entity_code
INNER JOIN new_db.sector_map on new_db.legal_entity_map.legal_entity_sector = new_db.sector_map.sector_code
INNER JOIN new_db.super_sector_map on new_db.legal_entity_map.legal_entity_super_sector = new_db.super_sector_map.super_sector_code
INNER JOIN new_db.country_map on new_db.legal_entity_map.legal_entity_country = new_db.country_map.country_code
WHERE new_db.asset_specification.identifier_code = str_identifier_code;
END
I can run the stored procedure from the query editor in MySQL Workbench with CALL new_db.test_anag('000000') and I get the desired result (which is a single line).
Now I try to run:
res = conn.execute("CALL new_db.test_anag('000000')")
But it fails with the following exception
sqlalchemy.exc.InterfaceError: (mysql.connector.errors.InterfaceError) Use multi=True when executing multiple statements [SQL: "CALL projecthf_db.test_anag('0237400')"]
I looked around but I can't find anything useful on this error, and for the love of me I can't get my head around it. I'm not an expert on either MySQL or sqlalchemy (or anything RDBMS), but this one looks like it should be easy to fix. Let me know if more info is required.
Thanks in advance for the help.
From reading a related question it can be seen that mysql.connector automatically fetches and stores multiple result sets when executing stored procedures that produce them, even if only one result set is produced. SQLAlchemy, on the other hand, does not directly support multiple result sets. To execute stored procedures, use callproc(). To access a DB-API cursor in SQLAlchemy you have to use a raw connection. In the case of mysql.connector the produced result sets can be accessed using stored_results():
from contextlib import closing

# Create a raw MySQLConnection
conn = engine.raw_connection()
try:
    # Get a MySQLCursor
    with closing(conn.cursor()) as cursor:
        # Call the stored procedure
        result_args = cursor.callproc('new_db.test_anag', ['000000'])
        # Iterate through the result sets produced by the procedure
        for result in cursor.stored_results():
            result.fetchall()
finally:
    conn.close()

Too few parameters error, while no parameter placeholders are used

I am trying to execute a SQL query against an Access database using pyodbc, and I get the following error:
pyodbc.Error: ('07002', '[07002] [Microsoft][ODBC Microsoft Access Driver]
Too few parameters. Expected 1. (-3010) (SQLExecDirectW)')
The problem is that I am not using any additional parameters. Here is the code:
access_con_string = r"Driver={};Dbq={};".format(driver, base)
cnn = pyodbc.connect(access_con_string)
db_cursor = cnn.cursor()
expression = """SELECT F_ARODES.ARODES_INT_NUM, F_ARODES.TEMP_ADRESS_FOREST,F_AROD_LAND_USE.ARODES_INT_NUM, F_ARODES.ARODES_TYP_CD
FROM F_ARODES LEFT JOIN F_AROD_LAND_USE ON F_ARODES.ARODES_INT_NUM = F_AROD_LAND_USE.ARODES_INT_NUM
WHERE (((F_AROD_LAND_USE.ARODES_INT_NUM) Is Null) AND ((F_ARODES.ARODES_TYP_CD)="wydziel") AND ((F_ARODES.TEMP_ACT_ADRESS)=True));"""
db_cursor.execute(expression)
The query itself, if used inside MS Access, works fine. The connection is also OK, as other queries are executed properly.
What am I doing wrong?
Constants in such queries are problematic - you never know the exact underlying syntax for booleans, strings, etc. Even if it works in MS Access, it can be different inside the intermediary library you're using.
The safest way is to extract them as parameters anyway:
expression = """SELECT F_ARODES.ARODES_INT_NUM, F_ARODES.TEMP_ADRESS_FOREST,F_AROD_LAND_USE.ARODES_INT_NUM, F_ARODES.ARODES_TYP_CD FROM F_ARODES LEFT JOIN F_AROD_LAND_USE ON F_ARODES.ARODES_INT_NUM = F_AROD_LAND_USE.ARODES_INT_NUM WHERE (((F_AROD_LAND_USE.ARODES_INT_NUM) Is Null)
AND ((F_ARODES.ARODES_TYP_CD)=?) AND ((F_ARODES.TEMP_ACT_ADRESS)=?));"""
db_cursor.execute(expression, "wydziel", True)
I had a similar problem with an update I was trying to perform with pyodbc. When executed in Access, the query worked fine, and the same was true when using the application (it allows some queries from within the app). But when run in Python with pyodbc, the same text would throw errors. I determined the problem was the double quotes (the OP's query has a set of them as well). The query began to work when I replaced them with single quotes.
This does not work:
Update ApplicationStandards Set ShortCutKey = "I" Where ShortName = "ISO"
This does:
Update ApplicationStandards Set ShortCutKey = 'I' Where ShortName = 'ISO'
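The two answers combine naturally: a minimal sketch of the same update, reusing the db_cursor and cnn from the question above, with the values passed as pyodbc parameters so the driver handles the quoting:

db_cursor.execute(
    "UPDATE ApplicationStandards SET ShortCutKey = ? WHERE ShortName = ?",
    "I", "ISO")  # parameter markers sidestep the quoting problem entirely
cnn.commit()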

Iterating over query results from sqlalchemy

I have a sqlalchemy query function like this
def foo():
    local_session = Session()
    results = local_session.query(
        T.x, T.y, T.z, T.a, T.b, T.c,
        T.d, T.e, T.f, T.g, T.h, T.i, T.j, T.k, T.l,
        T.m, T.n, T.o, T.p, T.q, T.r, T.s, T.t, T.u,
        T.v,
        User.gender).join(User)\
        .filter(T.language == 'en', T.where_i_am_from == 'US',
                User.some_num >= 0.9).limit(1000000)
    local_session.close()
    return results, results.count()
The query works fine.
Then I call this function here:
def fubar():
    raw_data, raw_data_length = myModule.foo()
    df = pd.DataFrame()
    for each in raw_data:
        df = df.append(pd.DataFrame({...}))  # add each.x etc. to df
    return df
The issue is that it won't iterate over the "for each in raw_data" loop when my foo query has a .limit above 5000, uses .all(), or has no limit at all. The program just hangs and does nothing (0 CPU usage). I've tested this both on my local SQL server and on my Amazon one. When I run the SQL directly on the database it returns around 800,000 rows. Why is this happening?
I'm using the latest MySQL and the latest sqlalchemy.
This looks like a MySQL driver problem. I would do the following, in order:

1. Run python with the -v flag, like python -v yourprogram.py. This has the potential of showing you where the program got stuck.

2. Get those 800,000 results and stick them in SQLite, with tables in an equivalent schema. That's relatively cheap to do; all you have to do afterwards is change the SQA database string. Obviously, this would show you whether the problem lies with the driver or in your code.

3. You're doing a join between two classes (T, User) - do an eager load instead of the default lazy load. If you have 800,000 rows and are doing a lazy join, that may be the problem. Add a joinedload (eagerload in earlier versions of SQLAlchemy) to the query options, as in the sketch below.
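A minimal sketch of the eager-load option, assuming T has a relationship attribute T.user mapped to User (the attribute name is a guess):

from sqlalchemy.orm import joinedload

results = (local_session.query(T)
           .options(joinedload(T.user))  # load the related User rows in the same SELECT
           .filter(T.language == 'en', T.where_i_am_from == 'US')
           .limit(1000000))

Note that joinedload works on full-entity queries, so the column-tuple query in foo() would need to be rewritten to query T entities.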

Condition in a query

I wonder if there is a way to filter the results returned by a query with a function written in Python. Something like this:
SELECT id, name, path IF verifAccess(path)
In my example, verifAccess would be a function I wrote. It returns True if the path is accessible, or False if not.
Thanks.
This is the query I need to filter:
def displayWaitingList(self):
    self.button_waiting.setEnabled(False)
    self.showing_waiting = True
    try:
        self.waiting_list
    except AttributeError:
        return
    self.query = QtSql.QSqlQuery()
    requete = "SELECT * FROM videos WHERE id IN ("
    for each_id in self.waiting_list:
        if self.waiting_list.index(each_id) != len(self.waiting_list) - 1:
            requete = requete + str(each_id) + ", "
        else:
            requete = requete + str(each_id) + ")"
    self.query.prepare(requete)
    self.query.exec_()
    self.modele.setQuery(self.query)
    self.proxy.setSourceModel(self.modele)
    self.tableau.setModel(self.proxy)
The SQL query is executed on the database server, so if you have a backend that supports Python, you can write stored procedures and functions in Python.
PostgreSQL has PL/Py:
Pure Python: All code, at first, is written in pure Python so that py-postgresql will work anywhere that you can install Python 3. Optimizations in C are made where needed, but are always optional.
Prepared Statements: Using the PG-API interface, protocol-level prepared statements may be created and used multiple times. db.prepare(sql)(*args)
COPY Support: Use the convenient COPY interface to directly copy data from one connection to another. No intermediate files or tricks are necessary.
Arrays and Composite Types: Arrays and composites are fully supported. Queries requesting them will return objects that provide access to the elements within.
Quick Console: Get a Python console with a connection to PostgreSQL for quick tests and simple scripts.
source: http://python.projects.pgfoundry.org/
You may find the Pony ORM very interesting. It allows querying a database using plain Python instead of SQL:
select(c for c in Customer if sum(c.orders.price) > 1000)
The above statement generates the following query:
SELECT "c"."id"
FROM "Customer" "c"
LEFT JOIN "Order" "order-1"
ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
[update]
Okay, I'm going to have a look at it. Thanks. But nothing native? – user1585507
By native, you mean "using the core library only"? No, there isn't. If you can't use PL/Py, your best shot is an ORM like SQLAlchemy (which is very expressive on the SQL side) or one like Pony (which is more expressive on the Python side). Both will let you reuse and compose queries easily.
If you are letting the user construct complex query conditions, and trying to avoid the misery of composing SQL queries using string interpolation and concatenation, I recommend SQLAlchemy core.
If your queries are simple and you just want to avoid the impedance mismatch between Python and SQL as much as you can, then use Pony.
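For illustration, a minimal SQLAlchemy Core sketch (older 1.x-style API) that builds the question's videos query without string concatenation; the column types are assumptions:

from sqlalchemy import MetaData, Table, Column, Integer, Text, select, and_

metadata = MetaData()
videos = Table('videos', metadata,
               Column('id', Integer, primary_key=True),
               Column('name', Text),
               Column('path', Text))

# Conditions are composed as Python objects rather than by string building.
conditions = [videos.c.id.in_([1, 2, 3])]
query = select([videos]).where(and_(*conditions))
print(query)  # SELECT ... FROM videos WHERE videos.id IN (:id_1, :id_2, :id_3)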
You can filter it on the client side:
Create tables and populate with data:
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute('create table test(id int, name text)')
conn.execute("insert into test(id, name) select 1, 'test'")
conn.execute("insert into test(id, name) select 2, 'test2'")
def verifAccess(id):
    return id == 1
Querying:
>>> [x for x in conn.execute('select id, name from test')]
[(1, u'test'), (2, u'test2')]
>>> [x for x in conn.execute('select id, name from test') if verifAccess(x[0])]
[(1, u'test')]
You could write Python functions in PostgreSQL, but the function has to be created on your server, and it's not a very efficient way to filter data from a table - indexes will not be used.

Working with Cursors in Python

Searched the web and this forum without satisfaction. I'm using Python 2.7 and pyODBC on Windows XP. I can get the code below to run and generate two cursors from two different databases without problems. Ideally, I'd then like to join these result cursors thusly:
SELECT a.state, sum(b.Sales)
FROM cust_curs a
INNER JOIN fin_curs b
ON a.Cust_id = b.Cust_id
GROUP BY a.state
Is there a way to join cursors using SQL statements in Python or pyODBC? Would I need to store these cursors in a common DB (SQLite3?) to accomplish this? Is there a pure Python data-handling approach that would generate this summary from these two cursors?
Thanks for your consideration.
Working code:
import pyodbc
#
# DB2 Financial Data Cursor
#
cnxn = pyodbc.connect('DSN=DB2_Fin;UID=;PWD=')
fin_curs = cnxn.cursor()
fin_curs.execute("""SELECT Cust_id, sum(Sales) as Sales
FROM Finance.Sales_Tbl
GROUP BY Cust_id""")
#
# Oracle Customer Data Cursor
#
cnxn = pyodbc.connect('DSN=Ora_Cust;UID=;PWD=')
cust_curs = cnxn.cursor()
cust_curs.execute("""SELECT Distinct Cust_id, gender, address, state
FROM Customers.Cust_Data""")
Cursors are simply objects used for executing SQL commands and retrieving the results. The data aren't migrated into a new database, and thus joins aren't possible. If you would like to join the data, you'll need to have the two tables in the same database. Whether that means bringing both tables and their data into a SQLite database or doing it some other way depends on the specifics of your use case, but that approach would theoretically work.
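A minimal sketch of that SQLite route, reusing fin_curs and cust_curs from the working code above (the column types are assumptions):

import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE fin (Cust_id TEXT, Sales REAL)')
db.execute('CREATE TABLE cust (Cust_id TEXT, gender TEXT, address TEXT, state TEXT)')

# Copy both ODBC result sets into the shared in-memory database.
db.executemany('INSERT INTO fin VALUES (?, ?)',
               (tuple(row) for row in fin_curs))
db.executemany('INSERT INTO cust VALUES (?, ?, ?, ?)',
               (tuple(row) for row in cust_curs))

# Now the join from the question runs as plain SQL.
for state, sales in db.execute("""SELECT a.state, sum(b.Sales)
                                  FROM cust a
                                  INNER JOIN fin b ON a.Cust_id = b.Cust_id
                                  GROUP BY a.state"""):
    print(state, sales)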
