Getting an error going from DataFrame to SQL Server in Python

I'm looking at the documentation here:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
I keep getting this error:
'DataFrame' object has no attribute 'to_sql'
Below is all my code. I don't see what's wrong here. What is going on?
import pandas as pd
from sqlalchemy import create_engine
import urllib.parse
import pyodbc

# URL-encode the ODBC connection string
params = urllib.parse.quote_plus("DRIVER={SQL Server Native Client 11.0};SERVER=server_name.database.windows.net;DATABASE=my_db;UID=my_id;PWD=my_pw")
# create_engine was imported directly, so call it without the module prefix
myeng = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)

df.to_sql(name="dbo.my_table", con=myeng, if_exists='append', index=False)

As it turns out, the object wasn't an actual pandas DataFrame; it was a PySpark DataFrame. Converting it fixed the problem:
# convert the pyspark.sql DataFrame to a pandas DataFrame
df = df.toPandas()
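One more thing worth checking once the conversion is done: to_sql treats the name argument as the literal table name, so "dbo.my_table" creates a single table literally named dbo.my_table rather than my_table in the dbo schema. A minimal sketch using the separate schema parameter (same names as above):
# pass the schema separately instead of embedding it in the table name
df.to_sql(name="my_table", schema="dbo", con=myeng, if_exists='append', index=False)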

Related

pd.read_sql, how to convert data types?

I am running a SQL query from my Python code and attempting to create a dataframe from the result. When I execute the code, pandas produces this error:
pandas.io.sql.DatabaseError: Execution failed on sql '*my connection info*' : expecting string or bytes object
The relevant code is:
import cx_Oracle as cx
import pandas as pd

dsn_tns = cx.makedsn('x.x.x.x', 'y', service_name='xxx')
conn = cx.connect(user='x', password='y', dsn=dsn_tns)
sql_query1 = conn.cursor()
sql_query1.execute("""select * from *table_name* partition(p20210712) t""")
df = pd.read_sql(sql_query1, conn)
I was thinking of converting all values in the query result to strings with the df.astype(str) function, but I cannot find a proper way to accomplish this within the pd.read_sql call. Would data type conversion correct this issue?
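The error is more likely caused by what is passed to pd.read_sql than by the data types: pandas expects the query text (a string), not an already-executed cursor. A minimal sketch under that assumption (the table name stays a placeholder, as in the question):
# pass the SQL string itself; pandas executes it over the connection
sql = "select * from *table_name* partition(p20210712) t"
df = pd.read_sql(sql, conn)
# any type conversion can then be applied to the resulting DataFrame
df = df.astype(str)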

Extracting data from Cassandra DB into Excel using Python

I am trying to extract Cassandra DB results into Excel using Python. When I run the code below, I get the following error, which appears for every column, and I haven't been able to resolve it. Can someone please help me?
Error: AttributeError: 'dict' object has no attribute "Column1"
Code:
import pandas as pd
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.query import dict_factory

auth_provider = PlainTextAuthProvider(username=CASSANDRA_USER, password=CASSANDRA_PASS)
cluster = Cluster(contact_points=["host"], port=xxx, auth_provider=auth_provider)
session = cluster.connect("keyspace")
session.row_factory = dict_factory

sql_query = "SELECT * FROM db.tablename"
dictionary = {"column1": [], "column2": [], "column3": [], "column4": []}
for row in session.execute(sql_query):
    dictionary["column1"].append(row.column1)
    dictionary["column2"].append(row.column2)
    dictionary["column3"].append(row.column3)
    dictionary["column4"].append(row.column4)

df = pd.DataFrame(dictionary)
df.to_excel(r'C:\Users\data.xlsx')
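Since row_factory = dict_factory makes every row a plain dict, attribute access like row.column1 raises exactly this AttributeError; key access works instead. A minimal sketch under that assumption, which also lets pandas build the frame directly from the result set:
# each row is a dict, so pandas can consume the rows directly
rows = session.execute(sql_query)
df = pd.DataFrame(list(rows))
df.to_excel(r'C:\Users\data.xlsx', index=False)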

Importing a .sql file in python

I have just started learning SQL and I'm having some difficulty importing my SQL file in Python.
The .sql file is on my desktop, as is my .py file.
Here's what I tried so far:
from codecs import open
import pandas as pd

sqlfile = "countries.sql"
sql = open(sqlfile, mode='r', encoding='utf-8-sig').read()
pd.read_sql_query("SELECT name FROM countries")
But I got the following error message:
TypeError: read_sql_query() missing 1 required positional argument: 'con'
I think I have to create some kind of connection, but I can't find a way to do that. Converting my data to an ordinary pandas DataFrame would help me a lot.
Thank you
This code snippet, taken from https://www.dataquest.io/blog/python-pandas-databases/, should help.
import pandas as pd
import sqlite3
conn = sqlite3.connect("flights.db")
df = pd.read_sql_query("select * from airlines limit 5;", conn)
Do not read a database as an ordinary file: it has a specific binary format, and a dedicated client should be used. With that client you can create a connection that can handle SQL queries and be passed to read_sql_query.
Refer to the documentation often: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html
You need a database connection. I don't know what SQL flavor you are using, but suppose you want to run your query against SQL Server:
import pyodbc
con = pyodbc.connect(driver='{SQL Server}', server='yourserverurl', database='yourdb', trusted_connection='yes')
then pass the connection instance to pandas
pd.read_sql_query("SELECT name FROM countries", con)
more about pyodbc here
And if you want to query an SQLite database
import sqlite3
con = sqlite3.connect('pathto/example.db')
More about sqlite here
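If countries.sql is a script of CREATE/INSERT statements rather than an existing database file, one option is to run it into an SQLite database first and then query over that connection. A minimal sketch, assuming the script is valid SQLite syntax:
import sqlite3
import pandas as pd

# execute the whole .sql script against an in-memory SQLite database
con = sqlite3.connect(":memory:")
with open("countries.sql", encoding="utf-8-sig") as f:
    con.executescript(f.read())

# the tables now exist, so the connection can be handed to pandas
df = pd.read_sql_query("SELECT name FROM countries", con)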

Data insertion into SQL Server with pandas

I'm trying to upload a dataframe to SQL Server using the pandas to_sql function, and I get the error below:
[SQL Server Native Client 11.0]Invalid character value for cast specification (0) (SQLExecDirectW)')
I checked the variables' names and types and they are exactly the same in the SQL database and the pandas dataframe.
How can I fix this?
Thanks
df.to_sql(raw_table, connDB, if_exists='append', index=False)
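One common cause of this particular cast error is NaN/NaT values in the frame, which the ODBC driver cannot convert; that diagnosis is an assumption here, but it is cheap to rule out. A minimal sketch that sends NULLs instead:
import pandas as pd

# replace NaN/NaT with None so the driver inserts NULLs
# instead of values it cannot cast (assumed cause)
df = df.where(pd.notnull(df), None)
df.to_sql(raw_table, connDB, if_exists='append', index=False)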
Please try this; this code was used in a Jupyter notebook with MySQL Workbench:
import mysql.connector
from sqlalchemy import create_engine
import pandas as pd

mydata = pd.read_csv("E:\\Hourly_Format\\upload.csv")
# note the @ between the password and the host
engine = create_engine("mysql+mysqlconnector://root:admin@localhost/pythondb", pool_size=10, max_overflow=20)
mydata.to_sql(name='emp', con=engine, if_exists='append', index=False)

pandas: "Lost connection to MySQL server" "system error: 32 Broken pipe"

I am getting the above error when trying to write a pandas dataframe to MySQL:
import pandas as pd
from sqlalchemy import create_engine

# note the @ between the password and the host
engine = create_engine('mysql://username:password@localhost/dbname')

c = getsomedata()  # placeholder for the data-loading step
fields = ['user_id', 'timestamp', 'text']
c1 = c[fields].reset_index()
c1.to_sql(name='comments', con=engine, if_exists='replace', index=False)
There are lots of questions about this MySQL issue, but how do I address it when writing from pandas?
The solution for me was very simple: Use the chunksize option:
c1.to_sql(name='comments', con=engine, chunksize=1000, if_exists='replace', index=False)
Probably related to this issue with overly large packets (MySQL's max_allowed_packet limit).
