How do I fix a connection to Db2 using SQLAlchemy in Python?

I'm having trouble connecting to my database BLUDB in IBM Db2 on Cloud using SQLAlchemy. Here is the code I've always used and it's always worked fine:
%sql ibm_db_sa://user:pswd@some-host.services.dal.bluemix.net:50000/BLUDB
But now I get this error:
(ibm_db_dbi.ProgrammingError) ibm_db_dbi::ProgrammingError:
Exception('[IBM][CLI Driver] SQL1042C An unexpected system error
occurred. SQLSTATE=58004\r SQLCODE=-1042') (Background on this error
at: http://sqlalche.me/e/13/f405) Connection info needed in SQLAlchemy
format, example: postgresql://username:password@hostname/dbname or an
existing connection: dict_keys([])
These packages are loaded as always:
import ibm_db
import ibm_db_sa
import sqlalchemy
from sqlalchemy.engine import create_engine
I looked at IBM's Python Db2 documentation and the SQLAlchemy error page but couldn't get anywhere.
I am working in JupyterLab locally. I've recently reinstalled Python and JupyterLab; that's the only thing that's changed locally.
I am able to successfully run the notebooks in the cloud at Kaggle and Cognitive Class. I am also able to connect to and query sqlite3 via Python without an issue using my local notebook.
All the IBM modules and version numbers are the same before and after reinstallation; I used requirements.txt to reinstall.
In db2diag.log here are the last two entries:
2020-11-05-14.06.47.081000-300 I13371F372 LEVEL: Warning
PID : 17500 TID : 7808 PROC : python.exe
INSTANCE: NODE : 000
HOSTNAME: DESKTOP-6FFFO2E
EDUID : 7808
FUNCTION: DB2 UDB, bsu security, sqlexLogPluginMessage, probe:20
DATA #1 : String with size, 43 bytes
loadAuthidMapper: GetModuleHandle rc = 126
2020-11-05-14.13.49.282000-300 I13745F373 LEVEL: Warning
PID : 3060 TID : 12756 PROC : python.exe
INSTANCE: NODE : 000
HOSTNAME: DESKTOP-6FFFO2E
EDUID : 12756
FUNCTION: DB2 UDB, bsu security, sqlexLogPluginMessage, probe:20
DATA #1 : String with size, 43 bytes
loadAuthidMapper: GetModuleHandle rc = 126

I think the root of this will be down to the new version of Python and pip caching.
What version did you move from, and what version are you on now? Is this a Python 2 to Python 3 change? When changing versions you would normally need to do a clean pip install of all components, but pip uses a cache, even for components that may need to be compiled, and there is a good chance that the Db2 components are being compiled.
So what you will need to do is re-install the dependencies with
pip install --no-cache-dir
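For example, a clean reinstall from your requirements.txt might look like this (package names taken from your imports; adjust to match your file):
pip uninstall -y ibm_db ibm_db_sa
pip install --no-cache-dir --force-reinstall -r requirements.txt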

Related

Connect to MySQL RDS using a Python Lambda

I am very new to Python and am doing some small tests to verify functionality.
Currently I am trying to establish a connection between an RDS MySQL instance and a Python Lambda function.
However, it seems to fail in the code itself and I am not sure why this happens.
There are a couple of guides out there but they all seem to be outdated and fail to work for me.
These are the steps I took to get it working (using macOS 12.3.1 and VS Code 1.62.3):
created a MySQL RDS instance
connected to the MySQL RDS instance and created a database called "igor" with a table named "lumigor", with 2 columns: id and name (populated with random data).
created a folder on the local machine to contain the code and the package.
installed Python 3.8.9
created lambda function file app.py with the following code:
import pymysql.cursors

# Connect to the database
connection = pymysql.connect(host='rds end point',
                             user='user',
                             password='pswrd',
                             database='igor',
                             cursorclass=pymysql.cursors.DictCursor)

with connection:
    with connection.cursor() as cursor:
        # Read a single record
        sql = "SELECT * FROM `Lumigor`"
        cursor.execute(sql)
        result = cursor.fetchone()
        print(result)
I added a requirements.txt file with the following command:
python3 -m pip install PyMySQL && pip3 freeze > requirements.txt --target...
But now I get an error from VS Code:
"Import "pymysql.cursors" could not be resolved from sourcePylance"
When I zip the folder, upload it to Lambda, and run a test, it returns an error:
{
"errorMessage": "Unable to import module 'app': No module named 'pymysql.cursors'",
"errorType": "Runtime.ImportModuleError",
"stackTrace": []
}
It seems like the dependencies are missing even though I installed them and they exist in the directory.
The proper way to add pymysql to your Lambda is by creating a dedicated layer, as described in the AWS blog:
How do I create a Lambda layer using a simulated Lambda environment with Docker?
Create an empty folder, e.g. mylayer.
Go to the folder and create a requirements.txt file containing:
PyMySQL
Run the following docker command (a newer community image, since lambci/docker-lambda does not provide a Python 3.9 build):
docker run --rm --volume "$PWD:/var/task" --workdir /var/task senorcoder/aws-lambda-env:python3.9_build pip install -Ur requirements.txt --target python
Create the layer as a zip:
zip -r mypymysqllayer.zip python > /dev/null
Create a Lambda layer from mypymysqllayer.zip in the AWS Console. Don't forget to set Compatible runtimes to python3.9.
Add the layer to your function.
Alternatively, create your function as a Lambda container image.
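Either way, once pymysql is importable, a minimal handler sketch might look like this (host, credentials, and table name are the question's placeholders, not values I can verify):
import pymysql.cursors

def lambda_handler(event, context):
    # pymysql resolves from the attached layer at runtime
    connection = pymysql.connect(host='your-rds-endpoint',  # placeholder
                                 user='user',
                                 password='pswrd',
                                 database='igor',
                                 cursorclass=pymysql.cursors.DictCursor)
    with connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT * FROM `lumigor`")
            return cursor.fetchone()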

High availability HDFS client python

In the HdfsCLI docs it says that it can be configured to connect to multiple hosts by adding URLs separated with a semicolon ; (https://hdfscli.readthedocs.io/en/latest/quickstart.html#configuration).
I use the Kerberos client, and this is my code:
from hdfs.ext.kerberos import KerberosClient
hdfs_client = KerberosClient('http://host01:50070;http://host02:50070')
And when I try to create a directory, for example, I get the following error:
requests.exceptions.InvalidURL: Failed to parse: http://host01:50070;http://host02:50070/webhdfs/v1/path/to/create
Apparently the version of hdfs I had installed was old; the code didn't work with version 2.0.8, but it did work with version 2.5.7.
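For reference, a minimal sketch that should work after upgrading (pip install -U hdfs; the hostnames are the question's placeholders):
from hdfs.ext.kerberos import KerberosClient

# With hdfs >= 2.5.7, semicolon-separated URLs enable failover between namenodes
hdfs_client = KerberosClient('http://host01:50070;http://host02:50070')
hdfs_client.makedirs('/path/to/create')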

Anaconda3's Python cannot find the Kerberos credential cache

I run into an error when creating an ODBC connection to an MSSQL server using Anaconda's version of Python 3:
pyodbc.Error: ('HY000', '[HY000] [Microsoft][ODBC Driver 17 for SQL Server]SSPI Provider: No Kerberos credentials available (default cache: KEYRING:persistent:1918003883) (851968) (SQLDriverConnect)')
The server has joined a Windows Active Directory domain and Kerberos realm via SSSD. I can SSH into the server and retrieve a TGT using kinit. I can even see the credential cache with klist. But the Python process cannot seem to find either the Kerberos TGT or the Kerberos credential cache.
The setup:
python
$ /mnt/ds/anaconda3/bin/python --version
Python 3.6.5 :: Anaconda, Inc.
test.py
from pyodbc import connect
connection = connect('DSN=MyDSN')
/etc/odbc.ini
[MyDSN]
#Driver=ODBC Driver 13 for SQL Server
Driver=ODBC Driver 17 for SQL Server
Description=MyMSSQL ODBC Driver
Trace=No
Server=MyMSSQL
Trusted_Connection=Yes
/etc/odbcinst.ini
[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.1.so.0.1
UsageCount=1
Red Hat Enterprise Linux
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.5 (Maipo)
$ uname -r
3.10.0-862.2.3.el7.x86_64
msodbcsql17
$ sudo yum info msodbcsql17
Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
Installed Packages
Name : msodbcsql17
Arch : x86_64
Version : 17.1.0.1
Release : 1
Size : 17 M
Repo : installed
From repo : packages-microsoft-com-prod
Summary : ODBC Driver for Microsoft(R) SQL Server(R)
License : https://aka.ms/odbc170eula
Description : This package provides an ODBC driver that can connect to Microsoft(R) SQL Server(R).
unixODBC
$ sudo yum info unixODBC
Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
Installed Packages
Name : unixODBC
Arch : x86_64
Version : 2.3.1
Release : 11.el7
Size : 1.2 M
Repo : installed
From repo : rhui-REGION-rhel-server-releases
Summary : A complete ODBC driver manager for Linux
URL : http://www.unixODBC.org/
License : GPLv2+ and LGPLv2+
Description : Install unixODBC if you want to access databases through ODBC.
: You will also need the mysql-connector-odbc package if you want to access
: a MySQL database, and/or the postgresql-odbc package for PostgreSQL.
$ /mnt/ds/anaconda3/bin/conda list unixodbc
# packages in environment at /mnt/ds/anaconda3:
#
# Name Version Build Channel
unixodbc 2.3.6 h1bed415_0
pyodbc
$ /mnt/ds/anaconda3/bin/conda list pyodbc
# packages in environment at /mnt/ds/anaconda3:
#
# Name Version Build Channel
pyodbc 4.0.23 py36hf484d3e_0
Here are some things I've tried:
Using Python-2.7.15, as packaged by Anaconda2. That worked!
Using isql. I ran isql MyDSN and that connected.
There are two unixODBC libraries (one installed via yum; the other with conda). By default, it will use conda's, but I forced it to use the system unixODBC package with LD_PRELOAD. Same error.
I tried downgrading the database driver to msodbcsql-13.1.9.2-1 and then to msodbcsql-13.0.1.0-1. Same error.
I tried swapping out PyODBC for TurbODBC, another Python ODBC library. Same error.
I created a separate environment in conda with python-3.5. And that worked! Still not sure why.
I wrote a simple C program that interfaced with unixODBC. That program was able to connect to the MSSQL server via Kerberos just fine.
I ran the Python 2 positive test case and the Python 3 negative test case through strace to review the system calls. I thought that might reveal something. It seems that they both start by looking for the client.keytab file on the file system. Then, in the positive test case, it falls back to searching the kernel's keyring, where it successfully finds the credential cache and proceeds. However, in the negative test case, it simply retries finding client.keytab, and never attempts to search the keyring.
I enabled the unixODBC trace option, once with the Python 3 test case and once with the Python 2 test case. Unfortunately, the traces (shown below) don't reveal anything to me.
py3-unixodbc.trace
[ODBC][8741][1527046794.480751][__handles.c][460]
Exit:[SQL_SUCCESS]
Environment = 0x55eea73ed130
[ODBC][8741][1527046794.480806][SQLSetEnvAttr.c][189]
Entry:
Environment = 0x55eea73ed130
Attribute = SQL_ATTR_ODBC_VERSION
Value = 0x3
StrLen = 4
[ODBC][8741][1527046794.480824][SQLSetEnvAttr.c][363]
Exit:[SQL_SUCCESS]
[ODBC][8741][1527046794.480843][SQLAllocHandle.c][375]
Entry:
Handle Type = 2
Input Handle = 0x55eea73ed130
[ODBC][8741][1527046794.480861][SQLAllocHandle.c][493]
Exit:[SQL_SUCCESS]
Output Handle = 0x55eea7400500
[ODBC][8741][1527046794.481176][SQLDriverConnectW.c][290]
Entry:
Connection = 0x55eea7400500
Window Hdl = (nil)
Str In = [DSN=MyDSN][length = 15]
Str Out = (nil)
Str Out Max = 0
Str Out Ptr = (nil)
Completion = 0
UNICODE Using encoding ASCII 'ISO8859-1' and UNICODE 'UCS-2LE'
[ODBC][8741][1527046794.575566][__handles.c][460]
Exit:[SQL_SUCCESS]
Environment = 0x55eea746e360
[ODBC][8741][1527046794.575614][SQLGetEnvAttr.c][157]
Entry:
Environment = 0x55eea746e360
Attribute = 65002
Value = 0x7ffd399177f0
Buffer Len = 128
StrLen = 0x7ffd3991778c
[ODBC][8741][1527046794.575632][SQLGetEnvAttr.c][264]
Exit:[SQL_SUCCESS]
[ODBC][8741][1527046794.575651][SQLFreeHandle.c][219]
Entry:
Handle Type = 1
Input Handle = 0x55eea746e360
py2-unixodbc.trace
[ODBC][8746][1527046842.073439][__handles.c][460]
Exit:[SQL_SUCCESS]
Environment = 0x185e2e0
[ODBC][8746][1527046842.073530][SQLSetEnvAttr.c][189]
Entry:
Environment = 0x185e2e0
Attribute = SQL_ATTR_ODBC_VERSION
Value = 0x3
StrLen = 4
[ODBC][8746][1527046842.073552][SQLSetEnvAttr.c][363]
Exit:[SQL_SUCCESS]
[ODBC][8746][1527046842.073572][SQLAllocHandle.c][375]
Entry:
Handle Type = 2
Input Handle = 0x185e2e0
[ODBC][8746][1527046842.073590][SQLAllocHandle.c][493]
Exit:[SQL_SUCCESS]
Output Handle = 0x1857d40
[ODBC][8746][1527046842.073613][SQLDriverConnectW.c][290]
Entry:
Connection = 0x1857d40
Window Hdl = (nil)
Str In = [DSN=MyDSN][length = 15]
Str Out = (nil)
Str Out Max = 0
Str Out Ptr = (nil)
Completion = 0
UNICODE Using encoding ASCII 'ISO8859-1' and UNICODE 'UCS-2LE'
[ODBC][8746][1527046842.208760][__handles.c][460]
Exit:[SQL_SUCCESS]
Environment = 0x1967210
[ODBC][8746][1527046842.208830][SQLGetEnvAttr.c][157]
Entry:
Environment = 0x1967210
Attribute = 65002
Value = 0x7ffe1153fcf0
Buffer Len = 128
StrLen = 0x7ffe1153fc8c
[ODBC][8746][1527046842.208849][SQLGetEnvAttr.c][264]
Exit:[SQL_SUCCESS]
[ODBC][8746][1527046842.208869][SQLFreeHandle.c][219]
Entry:
Handle Type = 1
Input Handle = 0x1967210
Suffice it to say, I'm at my wit's end. Any ideas would be greatly appreciated!
Late reply, but I hope someone will find it useful.
This very same issue for me was caused by the krb5 package coexisting with the system-wide Kerberos installation: removing it from the environment solved the problem (I tried to make them work together but had no success).
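For example (assuming the package was installed into the Anaconda environment via conda):
conda list krb5     # check whether the environment ships its own krb5
conda remove krb5   # remove it so the system-wide Kerberos libraries are used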

How to connect to Azure MySQL from Azure Functions with Python

I am trying to:
Run Python code triggered by Cosmos DB when Cosmos DB receives data.
The Python code in Azure Functions has code to ingest data from Azure MySQL.
What I have done:
Wrote Python in Azure Functions and ran it triggered by Cosmos DB. This was successful.
Installed mysql.connector following https://prmadi.com/running-python-code-on-azure-functions-app/ and ran the code to connect to Azure MySQL, but it does not work.
Do you know how to install the MySQL module for Python in Azure Functions and connect to the database?
Thanks!
According to your description, I think your issue is about how to install a third-party Python module in the Azure Function app.
Please refer to the steps below:
Step 1:
Log in to Kudu: https://Your_APP_NAME.scm.azurewebsites.net/DebugConsole.
Run the command below in the d:/home/site/wwwroot/<your function name> folder (it will take some time):
python -m virtualenv env
Step 2:
Activate the env via the command below in the env/Scripts folder:
activate.bat
Step 3:
Your shell should now be prefixed by (env).
Update pip:
python -m pip install -U pip
Install what you need (the mysqlclient package provides the MySQLdb module used below):
python -m pip install mysqlclient
Step 4:
In your code, update the sys.path to add this venv:
import sys, os.path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname( __file__ ), 'env/Lib/site-packages')))
Then connect to the MySQL DB via the snippet of code below:
#!/usr/bin/python
import MySQLdb

# Connect
db = MySQLdb.connect(host="localhost",
                     user="appuser",
                     passwd="",
                     db="onco")
cursor = db.cursor()

# Execute SQL select statement
cursor.execute("SELECT * FROM location")

# Commit your changes if writing
# In this case, we are only reading data
# db.commit()

# Get the number of rows in the resultset
numrows = cursor.rowcount

# Get and display one row at a time
for x in range(0, numrows):
    row = cursor.fetchone()
    print(row[0], "-->", row[1])

# Close the connection
db.close()
Hope it helps you.

How to access Hive on a remote server using a Python client

Case: I have Hive on a Cloudera platform. There is a database on Hive that I want to access using a Python client from my computer. I read a similar SO question, but it uses pyhs2, which I am unable to install on the remote server. And this SO question too uses Thrift, but I can't seem to install it either.
Code: After following the documentation, when I execute the following program it gives me an error:
import pyodbc, sys, os
pyodbc.autocommit=True
con = pyodbc.connect("DSN=default",driver='SQLDriverConnect',autocommit=True)
cursor = con.cursor()
cursor.execute("select * from fb_mpsp")
Error: ssh://ashish#ServerIPAddress/home/ashish/anaconda/bin/python2.7 -u /home/ashish/PyCharm_proj/hdfsConnect/home/ashish/PyCharm_proj/hdfsConnect/Hive_connect/hive_connect.py
Traceback (most recent call last):
File "/home/ashish/PyCharm_proj/hdfsConnect/home/ashish/PyCharm_proj/hdfsConnect/Hive_connect/hive_connect.py", line 5, in
con = pyodbc.connect("DSN=default", driver='SQLDriverConnect',autocommit=True)
pyodbc.Error: ('IM002', '[IM002] [unixODBC][Driver Manager]Data source name not found, and no default driver specified (0) (SQLDriverConnect)')
Process finished with exit code 1
Please suggest how I can solve this problem. Also, I am not sure why I have to specify the driver as SQLDriverConnect when the code will be executed using Hadoop Hive.
Thanks
This worked for me:
oODBC = pyodbc.connect("DSN=Cloudera Hive DSN 64;", autocommit = True, ansi = True )
And now everything works fine.
Be sure everything is fine with your DSN using:
isql -v "Cloudera Hive DSN 64"
and replace "Cloudera Hive DSN 64" with the name you used in your odbc.ini
Also, currently I'm not able to use the kerberos authentication unless I make a ticket by hand. Impala works smoothly using kerberos keytab files
Any help about how to have hive odbc working with keytab files is appreciated.
If you do decide to revisit pyhs2, note that it doesn't need to be installed on the remote server; it's installed on your local client.
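For what it's worth, a minimal pyhs2 sketch based on its README (the exact signature may vary by version; host, user, and table come from the question):
import pyhs2

# Connect to HiveServer2 from the local client; no server-side install needed
with pyhs2.connect(host='ServerIPAddress',
                   port=10000,
                   authMechanism="PLAIN",
                   user='ashish',
                   database='default') as conn:
    with conn.cursor() as cur:
        cur.execute("select * from fb_mpsp")
        print(cur.fetch())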
If you continue with pyodbc, you need to install the ODBC driver for Hive, which you can get from Cloudera's site.
You don't need to specify the driver in your connection, it should be part of your DSN. The specifics of creating the DSN depend on your OS, but essentially you will create it using Administrative Tools -> Data Sources (Windows), install ODBC and edit /Library/ODBC/odbc.ini (Mac), or edit /etc/odbc.ini (Linux).
Conceptually, think of the DSN as a specification that represents all the information about the connection - it will contain the host, port, and driver information. That way in your code you don't have to specify these things and you can switch details about the database without changing your code.
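For instance, on Linux a /etc/odbc.ini entry might look like this (the driver path and host are assumptions; check where your Cloudera driver actually installed):
[Hive1]
Driver=/opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
Host=your-hive-host
Port=10000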
# Note only the DSN name specifies the connection
import pyodbc
conn = pyodbc.connect("DSN=Hive1")
cursor = conn.cursor()
cursor.execute("select * from YYY")
Alternatively, I've updated the other question you referenced with information about how to install the thrift libraries. I think that's the way to go, if you have that option.
Try this method also to connect and get data remotely from the Hive server:
Connect to the remote server with SSH and run the CLI command to fetch data from the remote server:
ssh -o UserKnownHostsFile=/dev/null -o ConnectTimeout=90 -o StrictHostKeyChecking=no shashanks@remote_host 'hive -e "select * from DB.testtable limit 5;" >/home/shashanks/testfile'
