Docker File Install Microsoft ODBC - python

I have an application deployed via Docker. The application accesses Azure SQL, so my connection string looks like this:
mssql+pyodbc://<user>#<domain>:<password>#<conn_string>.database.windows.net:1433/<db>?driver={'ODBC Driver 17 for SQL Server'}
My Dockerfile:
# Configure the MS SQL query drivers
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt update
RUN ACCEPT_EULA=Y apt-get install -y msodbcsql17
I am getting the error below:
Can't open lib '{'ODBC Driver 17 for SQL Server'' : file not found (0)
Can anyone help me configure the driver properly?
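The error message shows the literal quotes and braces from the driver parameter, which suggests they are being passed through verbatim. In a SQLAlchemy mssql+pyodbc URL the driver name should be given plainly and URL-encoded, with no quotes or braces. A minimal sketch with hypothetical credentials (and assuming the # placeholders in the question stand for a user@domain login, which must itself be percent-encoded):

```python
from urllib.parse import quote_plus

# Hypothetical values for illustration only.
user = "myuser@mydomain"
password = "mypassword"
host = "myserver.database.windows.net"
db = "mydb"

# The driver name is URL-encoded as-is -- no quotes, no curly braces.
driver = quote_plus("ODBC Driver 17 for SQL Server")
url = (
    f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}"
    f"@{host}:1433/{db}?driver={driver}"
)
print(url)
```

Spaces become `+` and the `@` in the username becomes `%40`, so unixODBC receives the driver name exactly as it appears in odbcinst.ini.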

Related

Issue with connecting to pyodbc in Docker Container

Good Morning,
I am having the following issue with my Docker container and pyodbc / unixodbc-dev.
When running my Python API against my Docker container, I get the following error message:
(pyodbc.Error) ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")
Connecting to the same API from my local debug instance, everything works fine: I can submit a string for searching in the backend database and the results are returned to the Postman UI.
I see that unixodbc-dev 2.3.6-0.1 amd64 is installed in the Docker image, and I noticed that unixODBC itself is at 2.3.11. I don't know if that mismatch might be an issue, but our Moonshot instances can't connect to http://deb.debian.org, and getting our security group to open that up is next to impossible.
All this being said, I'm wondering if something in my Docker container is configured wrong and causing my issues. I'm new to the Docker container world, so this is definitely learn as I go.
TIA,
Bill Youngman
I was able to get this figured out - thanks to m.b. for the solution I was looking for.
I took the Debian suggestion from Install ODBC driver in Alpine Linux Docker Container and modified it for my needs.
This is the code I used to install unixODBC as well as download and install the MS SQL ODBC driver.
FROM python:3.8.3
ENV DEBIAN_FRONTEND=noninteractive
# install Microsoft SQL Server requirements.
ENV ACCEPT_EULA=Y
RUN apt-get update \
&& apt-get install -y --no-install-recommends curl gcc g++ gnupg unixodbc-dev
# Add SQL Server ODBC Driver 17 for Ubuntu 18.04
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - \
&& curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list \
&& apt-get update \
&& apt-get install -y --no-install-recommends --allow-unauthenticated msodbcsql17 mssql-tools \
&& echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile \
&& echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
# clean the install.
RUN apt-get -y clean
Once I built and deployed my container, everything worked perfectly.
-Bill
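For reference, installing msodbcsql17 registers the driver in /etc/odbcinst.ini, and unixODBC resolves the name 'ODBC Driver 17 for SQL Server' by looking it up there; the "file not found" error means that lookup fails. A minimal sketch of what a registered entry looks like and how to read it with Python's configparser, assuming an illustrative .so path (the exact filename varies by driver build):

```python
import configparser

# Illustrative odbcinst.ini content; the .so filename here is an
# assumption -- the real one depends on the installed driver version.
ODBCINST_INI = """
[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.2.1
UsageCount=1
"""

config = configparser.ConfigParser()
config.read_string(ODBCINST_INI)

# unixODBC maps the driver name in the connection string to the
# Driver= shared-library path in this section.
section = config["ODBC Driver 17 for SQL Server"]
print(section["Driver"])
```

Running `odbcinst -j` inside the container shows where unixODBC expects this file, so you can check that the installed entry and the name in your connection string match.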

Setting up PostgreSQL driver on Azure Databricks

How can I modify the code below to install a PostgreSQL JDBC driver instead of MS SQL? My goal is to use pyodbc to connect to a Redshift database from Azure Databricks. I thought that the PostgreSQL JDBC driver was already installed in my Databricks runtime by default, but when I run pyodbc.drivers() I get just "['ODBC Driver 17 for SQL Server']" so I guess not. And what would the pyodbc.connect() string look like once the PostgreSQL driver is installed? If it's easier, we can instead use Amazon's recommended JDBC driver for Redshift.
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get install msodbcsql17
apt-get -y install unixodbc-dev
sudo apt-get install python3-pip -y
pip3 install --upgrade pyodbc
You need to change your approach and follow the correct instructions to install and configure the Redshift driver:
install unixodbc-dev via apt-get
download the 64-bit Debian package (the actual link is in the documentation)
modify odbcinst.ini with following information (you can store that file on DBFS and copy from it, or copy /opt/amazon/redshiftodbc/lib/64/amazon.redshiftodbc.ini into odbcinst.ini):
[ODBC Drivers]
Amazon Redshift (x64)=Installed
[Amazon Redshift (x64)]
Description=Amazon Redshift ODBC Driver (64-bit)
Driver=/opt/amazon/redshiftodbc/lib/64/libamazonredshiftodbc64.so
install pyodbc via pip
All these steps are better implemented as an init script that can be attached to a cluster.
P.S. Do you really need to work through ODBC? Why not use the more scalable spark-redshift connector provided by Databricks?
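The question also asks what the pyodbc.connect() string would look like once the driver is configured. A minimal sketch, assuming the "Amazon Redshift (x64)" name registered in odbcinst.ini above; the endpoint, port, database, and credentials are hypothetical placeholders, not values from the source:

```python
# The Driver= value must match the odbcinst.ini section name exactly.
# Host/credentials below are placeholders for illustration.
conn_str = (
    "Driver={Amazon Redshift (x64)};"
    "Server=example-cluster.abc123.us-east-1.redshift.amazonaws.com;"
    "Port=5439;"          # Redshift's default port
    "Database=dev;"
    "UID=awsuser;"
    "PWD=secret"
)
print(conn_str)
# With the driver installed: cnxn = pyodbc.connect(conn_str)
```

The same keyword style works for any unixODBC driver; only the Driver= name and the server details change.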

Automatically install pyodbc on a Databricks cluster upon each restart

I have been using pyodbc on one of my Databricks clusters and have been installing it using this shell command running in the first cell of my notebook:
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get install msodbcsql17
apt-get -y install unixodbc-dev
sudo apt-get install python3-pip -y
pip3 install --upgrade pyodbc
This works fine, but I have to execute it each time I start the cluster and intend to use pyodbc, so I have been including this code as the first cell of each notebook that uses pyodbc. To fix this, I saved the code as a .sh file, uploaded it to DBFS, and then added it as one of my cluster's init scripts. Upon running the code given below:
cnxn1 = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+jdbcHostname+';DATABASE='+jdbcDatabase+';UID='+username1+';PWD='+ password1)
I get the following error:
('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")
What am I doing wrong with my shell commands/init script that's causing this issue? Any help would be greatly appreciated. Thanks!
This is the recommended way of doing it.
Create the file like this:
dbutils.fs.put("dbfs:/databricks/scripts/pyodbc-install.sh","""
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get install msodbcsql17
apt-get -y install unixodbc-dev
sudo apt-get install python3-pip -y
pip3 install --upgrade pyodbc""", True)
Then go to your cluster configuration page and click Edit.
Scroll down and expand Advanced Options > Init Scripts.
There you can add the path of the script, then click Confirm.
Now this script will be executed at the start of your cluster and will make pyodbc available in all notebooks attached to it.
Is that how you did it?

Can't open lib 'ODBC Driver 17 for SQL Server' : file not found

I am fairly new to Python and Azure web apps. Any help is appreciated.
My setup:
Program: Visual Studio code
Language: Python-Flask
Cloud provider: Microsoft Azure
Database: Azure SQL DB
Deployment option: Docker image > Azure container registry > Deploy the image to the Web app
Web App OS: Linux (I think Alpine?)
In my code, I am using pyodbc to connect to the Azure SQL DB. The code runs successfully locally in the terminal. However, when it runs on the web, it encounters the following error:
Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")
I followed several troubleshooting posts, however, I have not been successful.
I tried using sudo ln to create a symbolic link, which resulted in permission denied. I think this is a known limitation of the Azure web app.
I tried to check /etc/odbcinst.ini to see if the driver name exists; however, I am pretty new to Azure / VS Code, so I don't know how to open a file in the /etc folder. I can see it in the BASH prompt when I navigate to /etc, but I'm not sure how to open it.
I ran the following command in BASH to install pyodbc, but that didn't resolve the issue:
python -m pip install pyodbc
The result from odbcinst -j
unixODBC 2.3.4
DRIVERS............: /etc/odbcinst.ini
SYSTEM DATA SOURCES: /etc/odbc.ini
FILE DATA SOURCES..: /etc/ODBCDataSources
USER DATA SOURCES..: /home/a49d42b0d7b8ce200a4f7e74/.odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8
My Dockerfile:
# Pull a pre-built alpine docker image with nginx and python3 installed
FROM tiangolo/uwsgi-nginx-flask:python3.6-alpine3.7
ENV LISTEN_PORT=8000
EXPOSE 8000
COPY /app /app
# Uncomment to install additional requirements from a requirements.txt file
COPY requirements.txt /
RUN pip install --no-cache-dir -U pip
RUN pip install --no-cache-dir -r /requirements.txt
RUN apk add g++
RUN apk add unixodbc-dev
RUN pip install pyodbc
My requirements.txt (I commented out pyodbc; I think that's okay since I am installing it in the Dockerfile):
click==6.7
Flask==0.12.2
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
Werkzeug==0.14.1
#pyodbc==4.0.28
Additional questions:
Should I be using pyodbc, or is there something better / more compatible I should be using?
Should I use MYSQL instead of Azure SQL DB?
Is there a way for me to open the odbcinst.ini file that is on the web app?
First, if you want to know which OS release is inside your Docker image, you can run cat /etc/os-release to find out.
Here, I'm sure your web app OS is Alpine Linux, due to the first line of your Dockerfile, FROM tiangolo/uwsgi-nginx-flask:python3.6-alpine3.7: your base image is based on Alpine 3.7.
Second, according to your error info Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)") and the contents of your Dockerfile and requirements.txt, I think the issue is caused by the MS SQL Server ODBC driver for Linux not being installed in your Docker image, while pyodbc requires it to connect to Azure SQL Database.
However, for ODBC Driver 17 for SQL Server, the official document Installing the Microsoft ODBC Driver for SQL Server on Linux and macOS shows there is no released v17 package for Alpine. So the workaround is to change your Docker Hub base image from tiangolo/uwsgi-nginx-flask:python3.6-alpine3.7 to tiangolo/uwsgi-nginx-flask:python3.6, which uses Debian as the OS; then you can easily install MS ODBC Driver 17 for SQL Server in it.
For your additional questions:
Besides pyodbc, pymssql is the other Python SQL driver; see the official document Python SQL Driver, but note The Pymssql Project is Being Discontinued. SQLAlchemy can also be used as an ORM framework to connect to Azure SQL Database, but it in turn requires pyodbc or pymssql.
Whether to use MySQL or Azure SQL Database is up to you. I think the only difference here is that MySQL may be easier to set up on Alpine than the Azure SQL driver stack.
The way to open the odbcinst.ini file on the web app is to use vim over SSH into your Docker OS. For the custom Docker image you used, see the section Enable SSH of the official document Configure a custom Linux container for Azure App Service, and replace the apk commands with apt for Debian Linux.
The following instructions on the official website helped me solve my problem: Install the Microsoft ODBC driver for SQL Server (Linux)
sudo su
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
#Download appropriate package for the OS version
#Choose only ONE of the following, corresponding to your OS version
#Debian 8 (only supported up to driver version 17.6)
curl https://packages.microsoft.com/config/debian/8/prod.list > /etc/apt/sources.list.d/mssql-release.list
#Debian 9
curl https://packages.microsoft.com/config/debian/9/prod.list > /etc/apt/sources.list.d/mssql-release.list
#Debian 10
curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
exit
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev
# optional: kerberos library for debian-slim distributions
sudo apt-get install -y libgssapi-krb5-2
The official website instructions did not work for me, as the packages were reported as missing:
$ sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package msodbcsql18
The following worked for me:
sudo apt-get install unixodbc
Navigate with a browser to:
https://packages.microsoft.com/ubuntu/21.04/prod/pool/main/m/
Download the deb files for:
msodbcsql18
mssql-tools18
Or
msodbcsql
msodbcsql17
Pick the version you want to install. I picked 18.
Install the two deb files:
sudo dpkg -i mssql-tools18_18.0.1.1-1_amd64.deb
sudo dpkg -i msodbcsql18_18.0.1.1-1_amd64.deb
If you have not run the script from the official website, you also need to run the following (for version 18 rather than 17):
echo 'export PATH="$PATH:/opt/mssql-tools18/bin"' >> ~/.bashrc

Python script job not running on Azure Batch - ODBC error

I am trying to run a Python job using a VM on Azure Batch. It's a simple script to add a line to my Azure SQL Database. I downloaded the ODBC connection string straight from my Azure portal yet I get this error. The strange thing is I can run the script perfectly fine on my own machine. I've configured the VM to install the version of Python that I need and then execute my script - I'm at a complete loss. Any ideas?
cnxn = pyodbc.connect('Driver={ODBC Driver 13 for SQL Server};Server=tcp:svr-something.database.windows.net,fakeport232;Database=db-something-prod;Uid=something#svr-something;Pwd={fake_passwd};Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;')
Traceback (most recent call last):
  File "D:\batch\tasks\apppackages\batch_python_test1.02018-11-12-14-30\batch_python_test\python_test.py", line 12, in <module>
    r'Driver={ODBC Driver 13 for SQL Server};Server=tcp:svr-mydatabase.database.windows.net,'
pyodbc.InterfaceError: ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
Being new to Azure Batch, I didn't realise the virtual machines don't come with ODBC drivers installed. I wrote a .bat file to install the drivers on the node when the pool is allocated. Problem solved.
You have to install the ODBC driver on each compute node of the pool.
Put the commands below in a shell script file, startup_tasks.sh:
sudo apt-get -y update;
export DEBIAN_FRONTEND=noninteractive;
sudo apt-get install -y python3-pip;
apt-get install -y --no-install-recommends apt-utils apt-transport-https;
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - ;
curl https://packages.microsoft.com/config/debian/9/prod.list > /etc/apt/sources.list.d/mssql-release.list ;
sudo apt-get -y update ;
ACCEPT_EULA=Y apt-get -y install msodbcsql17 ;
ACCEPT_EULA=Y apt-get -y install mssql-tools ;
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile ;
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc ;
source ~/.bashrc && sudo apt-get install -y unixodbc unixodbc-dev ;
Give /bin/bash -c "startup_tasks.sh" as a start task in the Azure Batch pool. This will install ODBC Driver 17 on each node.
Then, in your connection string, change the ODBC driver version to 17.
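With Driver 17 installed by the start task, the connection string from the question changes only in the driver name. A sketch reusing the question's placeholder values (server, port, and credentials are the question's fakes, kept as-is):

```python
# Same placeholders as the question; only the driver name changes (13 -> 17).
conn_str = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:svr-something.database.windows.net,fakeport232;"
    "Database=db-something-prod;"
    "Uid=something#svr-something;"
    "Pwd={fake_passwd};"
    "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
)
print(conn_str)
# On a node with the driver installed: cnxn = pyodbc.connect(conn_str)
```

The Driver= name must match the section installed into odbcinst.ini by the msodbcsql17 package, or the driver manager will again report the driver as not found.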
