I am attempting to save a pandas DataFrame to a local Microsoft SQL Server, but I keep getting an invalid precision value error when using the .to_sql function.
When I try the following code:
from sqlalchemy import create_engine
import urllib
import pyodbc
import pandas as pd
quoted = urllib.parse.quote_plus("DRIVER={SQL Server};SERVER=localhost;DATABASE=Database")
engine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted),echo=False)
df = pd.read_csv('file_name')
df.to_sql(name='WW_Table', con=engine, if_exists='replace', method='multi', chunksize=500, index=False)
I receive this error:
Error: ('HY104', '[HY104] [Microsoft][ODBC SQL Server Driver]Invalid precision value (0) (SQLBindParameter)')
I have tried multiple things, including creating the table from scratch and preloading the CSV file and replacing it, but with no luck.
Any help would be greatly appreciated.
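One workaround that is often suggested for this error (an assumption here, not a verified fix) is to switch from the legacy {SQL Server} driver to a newer one and drop method='multi', letting pyodbc's fast_executemany batch the inserts instead:
from sqlalchemy import create_engine
import urllib
import pandas as pd
# Sketch of a possible workaround: "ODBC Driver 17 for SQL Server" must be installed locally
quoted = urllib.parse.quote_plus("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=Database")
# fast_executemany=True batches the inserts on the pyodbc side
engine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted), fast_executemany=True)
df = pd.read_csv('file_name')
df.to_sql(name='WW_Table', con=engine, if_exists='replace', chunksize=500, index=False)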
How to even start a basic query in databricks using python?
The data I need is in Databricks, and so far I have been using JupyterHub to pull the data and modify a few things. But now I want to eliminate the step of pulling the data into JupyterHub, move my Python code directly into Databricks, and then schedule the job.
I started like below
%python
import pandas as pd
df = pd.read_sql('select * from databasename.tablename')
and got below error
TypeError: read_sql() missing 1 required positional argument: 'con'
So I tried an update:
%python
import pandas as pd
import pyodbc
odbc_driver = pyodbc.drivers()[0]
conn = pyodbc.connect(odbc_driver)
df = pd.read_sql('select * from databasename.tablename', con=conn)
and I got below error
ModuleNotFoundError: No module named 'pyodbc'
Can anyone please help? I can use SQL to pull the data, but I already have a lot of Python code that I don't know how to convert to SQL. So for now I just want my Python code to work in Databricks.
You should use Spark's SQL facilities directly:
my_df = spark.sql('SELECT * FROM databasename.tablename')
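If a pandas DataFrame is still needed for the existing pandas code, the Spark result can be converted back, assuming it fits in driver memory:
# spark is the SparkSession that Databricks provides in every notebook
spark_df = spark.sql('SELECT * FROM databasename.tablename')
pandas_df = spark_df.toPandas()  # collects all rows to the driver; suitable only for small results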
I received a .db file from my colleague (it contains text and numeric data) which I need to load into a pandas DataFrame for further processing. I have never worked with SQLite and know nothing about it, but after a few Google searches I wrote the following lines of code:
import pandas as pd
import numpy as np
import sqlite3
conn = sqlite3.connect('data.db')  # connects to data.db (creates the file if it does not exist)
sql="""
SELECT * FROM data;
"""
df = pd.read_sql_query(sql, conn)
df.head()
This gives me the following error:
DatabaseError: Execution failed on sql 'SELECT * FROM data;': no such table: data
What table is this code referring to? I only had data.db.
I do not quite understand where I am going wrong with this. Any advice on how to get my data into the DataFrame df?
I'm also new to SQL, but based on what you've provided, "data" refers to a table inside your database data.db.
The query you typed instructs the program to select all rows from the table called "data". This website helped me with creating tables: https://www.tutorialspoint.com/sqlite/sqlite_create_table.htm
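To see which tables the file actually contains, you can query SQLite's built-in sqlite_master catalog; a quick sketch:
import sqlite3
import pandas as pd
conn = sqlite3.connect('data.db')
# sqlite_master is SQLite's catalog of schema objects; this lists every table name
tables = pd.read_sql_query("SELECT name FROM sqlite_master WHERE type='table';", conn)
print(tables)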
I used to work with pandas and cx_Oracle until now, but I have to switch to dask due to RAM limitations.
import pandas as pd
from dask import dataframe as dd
import os
import cx_Oracle as cx
con = cx.connect('USER', 'userpw', 'oracle_db', encoding='utf-8')
cursor = con.cursor()
query_V_Branchen = ('''SELECT * FROM DBOWNER.V_BRANCHEN vb''')
daskdf = dd.read_sql_table(query_V_Branchen, con, index_col='RECID')
I tried to do it similarly to how I used cx_Oracle with pandas, but I receive an AttributeError:
'cx_Oracle.Connection' object has no attribute '_instantiate_plugins'
Any ideas whether it's just a problem with the package?
Please read the dask documentation on SQL:
you should provide a connection string, not a connection object;
you should give a table name, not a query, or else phrase your query using SQLAlchemy's expression syntax.
e.g.,
df = dd.read_sql_table('DBOWNER.V_BRANCHEN',
                       'oracle+cx_oracle://USER:userpw@oracle_db', index_col='RECID')
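For a larger table it can also help to restrict the columns and set an explicit partition count. A minimal sketch, assuming the view has a column named BRANCHE (a hypothetical name) and that RECID is numeric so dask can compute partition boundaries:
from dask import dataframe as dd
daskdf = dd.read_sql_table('DBOWNER.V_BRANCHEN',
                           'oracle+cx_oracle://USER:userpw@oracle_db',
                           index_col='RECID',
                           columns=['RECID', 'BRANCHE'],  # hypothetical column list
                           npartitions=10)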
I'm trying to import an SQL database into a pandas DataFrame, and I am getting a syntax error. I am a newbie here, so the issue is probably very simple.
After downloading the SQLite sample chinook.db from http://www.sqlitetutorial.net/sqlite-sample-database/
and reading the pandas documentation, I tried to load it into a pandas DataFrame with:
import pandas as pd
import sqlite3
conn = sqlite3.connect('chinook.db')
df = pd.read_sql('albums', conn)
where 'albums' is a table of 'chinook.db', listed with sqlite3 from the command line.
The result is:
...
DatabaseError: Execution failed on sql 'albums': near "albums": syntax error
I tried variations of the above code to import the tables of the database into an IPython session for exploratory data analysis, with no success.
What am I doing wrong? Is there documentation or a tutorial for newbies with some examples?
Thanks in advance for your help!
Found it!
An example of db connection with SQLAlchemy can be found here:
https://www.codementor.io/sagaragarwal94/building-a-basic-restful-api-in-python-58k02xsiq
import pandas as pd
from sqlalchemy import create_engine
db_connect = create_engine('sqlite:///chinook.db')
df = pd.read_sql('albums', con=db_connect)
print(df)
As suggested by @Anky_91, pd.read_sql_table also works, since read_sql wraps it.
The issue was the connection, which has to be made with SQLAlchemy and not with sqlite3.
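For completeness: a plain sqlite3 connection also works if you pass a full query instead of a bare table name, since read_sql then falls back to read_sql_query:
import sqlite3
import pandas as pd
conn = sqlite3.connect('chinook.db')
# with a DBAPI connection, read_sql can only execute SQL statements,
# not look up table names, so spell out the SELECT
df = pd.read_sql('SELECT * FROM albums', conn)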
Thanks
I'm trying to copy a table from a Redshift database to a DataFrame in Python and then save it back to Redshift.
The first step works, but I have some problems with the second: I get errors when trying to save a DataFrame that has 100 rows.
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine("mssql+pyodbc://database")
df = pd.read_sql_query('select * from testing.table1 limit 100', engine)
df.to_sql(name='table2', schema='testing', con=engine, index=False, if_exists='append')
And I'm getting this error:
DBAPIError: (pyodbc.Error) ('HY000', '[HY000] [Amazon][ODBC] (10920) No data can be obtained from input parameter whose value has already been pushed down.
It's strange, because when I try to save a DataFrame that has only 10 rows, there is no error at all.
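One workaround that is sometimes suggested for this Redshift ODBC error (an assumption, not a verified fix) is to let pandas emit multi-row INSERT statements in small chunks, so that fewer parameters are bound per statement:
# method='multi' inlines rows into one INSERT per chunk;
# chunksize=50 is a guess to keep the parameter count per statement low
df.to_sql(name='table2', schema='testing', con=engine,
          index=False, if_exists='append', method='multi', chunksize=50)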