I am trying to connect to HANA in order to pull some metadata into a pandas DataFrame. There are lots of mixed approaches out there, and I couldn't find anything concrete.
All I have is:
username
password
serverip
servername
and table names.
The admin has provided all the read access to the required account for the specific tables.
What is the quickest way to get this done? I do not have the option of installing anything on SAP's side.
I have tried the snippets below, but I get the error 'target machine actively refused it', and debugging at SAP's end is a lost cause. Thank you in advance.
import pyhdb

connection = pyhdb.connect(
    host="123.com",
    port=123,
    user="user",
    password="pswrd"
)
cursor = connection.cursor()
cursor.execute("SELECT * FROM Tablename")
cursor.fetchone()
connection.close()
and
from hdbcli import dbapi

conn = dbapi.connect(
    address="123.com",
    port=123,
    user="user",
    password="pswrd"
)
cursor = conn.cursor()
Given your server address and port examples, I'm not sure you got the right idea for how to connect to a HANA database.
Since you want to use pandas, it is probably a good idea to have a look at the SAP HANA Machine Learning library for Python (hana_ml).
Check the tutorial blog post for this:
https://blogs.sap.com/2019/11/05/hands-on-tutorial-machine-learning-push-down-to-sap-hana-with-python/
To do any of this, there is no need to install or debug anything on the HANA system.
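For reference, a minimal sketch of what this looks like with the hana_ml package; host, port, credentials, schema, and table name below are placeholders:

from hana_ml import dataframe

# All connection values and the schema/table names are placeholders.
conn = dataframe.ConnectionContext(
    address="serverhost",
    port=30015,
    user="UserName",
    password="Password"
)

# table() returns a HANA-side dataframe; collect() pulls the rows into pandas.
hana_df = conn.table("TABLENAME", schema="SCHEMANAME")
pandas_df = hana_df.collect()
print(pandas_df.head())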
To connect to the HANA DB:
from hdbcli import dbapi

conn = dbapi.connect(
    address="serverhost",
    port=39015,
    user="UserName",
    password="Password",
    databasename="DBNAME"
)
Make sure you enter the correct port number; for HANA the SQL port is typically 3<instance-number>15, e.g. 30015 for instance 00.
To get the SQL results into a pandas DataFrame:
import pandas as pd

query = 'select * from table'
df = pd.read_sql_query(query, conn)
df.head()
import pyhdb

connection = pyhdb.connect(
    host="example.com",
    port=30015,
    user="user",
    password="secret"
)
That snippet is from the official documentation.
pyhdb only takes four parameters. Without a database-name parameter, I can't understand how the system knows which database I want to connect to in this case.
When I connect this way, the program errors with "pyhdb.exceptions.DatabaseError: authentication failed", which makes it look like my password is wrong. So I had a friend connect with Java (JDBC) using the same four parameters; it failed too, but when he added the database name, it worked! So my parameters are right, and the question is: how do I specify the database name in pyhdb?
Or are there other ways to connect to HANA? Thank you!
Looking at the __init__.py file of the pyhdb package shows that DATABASENAME is not supported when creating a connection:
[...]
def connect(host, port, user, password, autocommit=False):
    conn = Connection(host, port, user, password, autocommit)
    conn.connect()
    return conn
[...]
The good news here is that pyhdb is not what you should be using to connect to HANA anyhow as it is the old and unsupported client library.
Use hdbcli instead as described in the documentation.
With hdbcli it's no problem at all to use the DATABASENAME:
from hdbcli import dbapi
connection = dbapi.connect(
    address="hxehost",
    port=39013,
    databasename="HXE",
    user="xxxxx",
    password="xxxxx"
)
cursor = connection.cursor()
cursor.execute("SELECT 'Hello, Python world' FROM DUMMY")
print(cursor.fetchone())
connection.close()
This is a fairly common question, but even after using the answers on SO (like here) I still can't connect.
When I set up my connection with pyodbc I can connect with the following:
cnxn = pyodbc.connect('DRIVER={SQL Server Native Client 11.0};SERVER=ip,port;DATABASE=db;UID=user;PWD=pass')
cursor = cnxn.cursor()
cursor.execute("some select query")
for row in cursor.fetchall():
    print(row)
and it works.
However, to do a .read_sql() in pandas I need to connect with SQLAlchemy.
I have tried with both hosted connections and pass-through pyodbc connections like the below:
import urllib.parse
import sqlalchemy

quoted = urllib.parse.quote_plus('DRIVER={SQL Server Native Client 11.0};Server=ip;Database=db;UID=user;PWD=pass;Port=port;')
engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))
engine.connect()
I have tried with both SERVER=ip,port format and the separate Port=port parameter like above but still no luck.
The error I'm getting is Login failed for user 'user'. (18456)
Any help is much appreciated.
I assume that you want to create a DataFrame, so once you have your cnxn you can pass it to pandas' read_sql_query function.
Example:
import pandas
import pyodbc

cnxn = pyodbc.connect('your connection string')
query = 'some query'
df = pandas.read_sql_query(query, cnxn)
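If you specifically need a SQLAlchemy engine for pandas, a minimal sketch under the same placeholders as the question is to URL-encode the exact pyodbc string that already works, keeping the port in the SERVER=ip,port form:

import urllib.parse
import pandas as pd
import sqlalchemy

# Reuse the pyodbc connection string that already works, including
# the SERVER=ip,port form; all values here are placeholders.
quoted = urllib.parse.quote_plus(
    'DRIVER={SQL Server Native Client 11.0};'
    'SERVER=ip,port;DATABASE=db;UID=user;PWD=pass'
)
engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect=' + quoted)
df = pd.read_sql_query('some select query', engine)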
sqlplus sys/Oracle_1@pdborcl as sysdba;
I'm using this command to connect to Oracle 12c from the Command Prompt.
How can I connect to the DB using cx_Oracle? I'm new to Oracle DB.
I think this is the equivalent of the sqlplus command line that you posted:
import cx_Oracle
connect_string = "sys/Oracle_1@pdborcl"
con = cx_Oracle.connect(connect_string, mode=cx_Oracle.SYSDBA)
I tried it with a non-container database and not with a pdb so I can't verify that it would work with a pdb. You may not want to connect as sys as sysdba unless you know that you need that level of security.
Bobby
You can find the documentation here: cx_Oracle docs.
To query the database, use the approach below.
import cx_Oracle

# host, port and sid are placeholders for your instance details
dsn = cx_Oracle.makedsn(host, port, sid)
connection = cx_Oracle.connect(dsn, mode=cx_Oracle.SYSDBA)
query = "SELECT * FROM MYTABLE"
cursor = connection.cursor()
cursor.execute(query)
result_set = cursor.fetchall()
connection.close()
The above code fetches data from MYTABLE over the DSN built above.
It's best to go through the cx_Oracle docs.
In Amazon Redshift's Getting Started Guide, it's mentioned that you can utilize SQL client tools that are compatible with PostgreSQL to connect to your Amazon Redshift Cluster.
In the tutorial they use the SQL Workbench/J client, but I'd like to use Python (in particular SQLAlchemy). I've found a related question, but the issue is that it does not go into the detail of the Python script that connects to the Redshift cluster.
I've been able to connect to the cluster via SQL Workbench/J, since I have the JDBC URL, as well as my username and password, but I'm not sure how to connect with SQLAlchemy.
Based on this documentation, I've tried the following:
from sqlalchemy import create_engine
engine = create_engine('jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')
ERROR:
Could not parse rfc1738 URL from string 'jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy'
I don't think SQLAlchemy "natively" knows about Redshift. You need to change the JDBC-style URL to a plain postgresql:// URL (SQLAlchemy URLs have no jdbc: prefix):
postgresql://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy
Alternatively, you may want to try using sqlalchemy-redshift using the instructions they provide.
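For reference, a minimal sketch of the sqlalchemy-redshift route, reusing the host from the question; LOGIN and PASSWORD are placeholders:

from sqlalchemy import create_engine, text

# sqlalchemy-redshift registers the redshift+psycopg2 dialect
# (pip install sqlalchemy-redshift psycopg2); credentials are placeholders.
engine = create_engine(
    'redshift+psycopg2://LOGIN:PASSWORD@shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy'
)
with engine.connect() as conn:
    print(conn.execute(text('select 1')).fetchall())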
I was running into the exact same issue, and then I remembered to include my Redshift credentials:
eng = create_engine('postgresql://[LOGIN]:[PASSWORD]@shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')
sqlalchemy-redshift works for me, but only after a few days of research.
Packages (Python 3.4):
SQLAlchemy==1.0.14 sqlalchemy-redshift==0.5.0 psycopg2==2.6.2
First of all, I checked that my query works in SQL Workbench (http://www.sql-workbench.net), then I got it working in sqlalchemy (this answer, https://stackoverflow.com/a/33438115/2837890, helps to know that auto_commit or session.commit() is required):
from sqlalchemy import create_engine, text

db_credentials = (
    'redshift+psycopg2://{p[redshift_user]}:{p[redshift_password]}@{p[redshift_host]}:{p[redshift_port]}/{p[redshift_database]}'
    .format(p=config['Amazon_Redshift_parameters']))
engine = create_engine(db_credentials, connect_args={'sslmode': 'prefer'})
connection = engine.connect()
result = connection.execute(text(
    "COPY assets FROM 's3://xx/xx/hello.csv' WITH CREDENTIALS "
    "'aws_access_key_id=xxx_id;aws_secret_access_key=xxx'"
    " FORMAT csv DELIMITER ',' IGNOREHEADER 1 ENCODING UTF8;").execution_options(autocommit=True))
result = connection.execute("select * from assets;")
print(result, type(result))
print(result.rowcount)
connection.close()
After that, I got sqlalchemy_redshift's CopyCommand to work, perhaps in a bad way; it looks a little tricky:
import sqlalchemy as sa
import sqlalchemy_redshift.dialect as dialect_rs
from sqlalchemy_redshift.dialect import RedshiftDialect

# TableAssets holds the table name; engine is the engine created above.
tbl2 = sa.Table(TableAssets, sa.MetaData())
copy = dialect_rs.CopyCommand(
    tbl2,
    data_location='s3://xx/xx/hello.csv',
    access_key_id=access_key_id,
    secret_access_key=secret_access_key,
    truncate_columns=True,
    delimiter=',',
    format='CSV',
    ignore_header=1,
    # empty_as_null=True,
    # blanks_as_null=True,
)
print(str(copy.compile(dialect=RedshiftDialect(), compile_kwargs={'literal_binds': True})))
print(dir(copy))
connection = engine.connect()
connection.execute(copy.execution_options(autocommit=True))
connection.close()
This does the same thing I did with plain sqlalchemy, executing the query, except the query is composed by CopyCommand. I haven't seen much benefit from it :(.
The following works for me with Databricks on all kinds of SQL statements:
import sqlalchemy as SA
import psycopg2

host = 'your_host_url'
username = 'your_user'
password = 'your_passw'
db = 'your_db_name'
port = 5439
url = "{d}+{driver}://{u}:{p}@{h}:{port}/{db}".format(
    d="redshift",
    driver='psycopg2',
    u=username,
    p=password,
    h=host,
    port=port,
    db=db)
engine = SA.create_engine(url)
cnn = engine.connect()

strSQL = "your_SQL ..."
try:
    cnn.execute(strSQL)
except:
    raise
import sqlalchemy as db
engine = db.create_engine('postgresql://username:password@url:5439/db_name')
This worked for me
I have this case:
import pymysql
conn = pymysql.connect(host='127.0.0.1', unix_socket='/opt/lampp/var/mysql/mysql.sock', user='root', passwd=None, db='test')
cur = conn.cursor()
cur.execute("test < /mypath/test.sql")
cur.close()
conn.close()
I always get the error:
1064 , "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'test < /mypath/test.sql' at line 1"
I tried to use source and it still failed. Do you know why?
Thank you.
Your error message says that the MySQL server can't understand
test < /mypath/test.sql' at line 1
If you're a long-time *nix user, it seems intuitive that you should be able to use commands like this to pass various sorts of data streams to various programs. But that's not the way the Python SQL API (or most language-specific SQL APIs) works.
You need to pass a valid SQL statement to the execute() method in the API, so the API can pass it to the database server. A valid statement will be something like INSERT or CREATE TABLE.
Look, the server might be on a different host machine, so telling the server to read from /mypath/test.sql is very likely a meaningless instruction to that server. Even if it did understand it, it might say File test.sql not found.
The mysql(1) command line client software package can read commands from files. Is that what you want?
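If you want to stay in Python, a minimal sketch (assuming the same connection details as in the question) is to read the file yourself and execute its statements one by one; note the naive split on ';' below breaks if a semicolon appears inside a string literal:

import pymysql

conn = pymysql.connect(host='127.0.0.1',
                       unix_socket='/opt/lampp/var/mysql/mysql.sock',
                       user='root', passwd=None, db='test')
cur = conn.cursor()

# Run each statement in the script separately; this naive split on ';'
# fails if a semicolon occurs inside a string literal.
with open('/mypath/test.sql') as f:
    for statement in f.read().split(';'):
        if statement.strip():
            cur.execute(statement)

conn.commit()
cur.close()
conn.close()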
>>> import MySQLdb
>>> db = MySQLdb.connect(host = 'demodb', user = 'root', passwd = 'root', db = 'mydb')
>>> cur = db.cursor()
>>> cur.execute('select * from mytable')
>>> rows = cur.fetchall()
Install the MySQL-python package to use MySQLdb.