Can't get full data from SQL Server with Python

I'm trying to get a query result as XML data from MS SQL Server with pyodbc. After the query, I write the data to a new XML file with a unique name. Everything works fine up to this point with small data. The problem is that when the result is longer than 2037 characters, I can't get all of it; I only get the first 2037 characters.
SQL Server version 15.0.2000.5 (SQL Server 2019 Express)
Driver is ODBC Driver 17 for SQL Server
Python version 3.11.1
pyodbc version 4.0.35
Code is running on Windows Server 2016 Standard
SQL Query For XML Data
SELECT
C.BLKODU AS "BLKODU",
C.CARIKODU AS "CARIKODU",
C.TICARI_UNVANI AS "TICARI_UNVANI",
C.ADI_SOYADI AS "ADI_SOYADI",
C.VERGI_NO AS "VERGI_NO",
C.TC_KIMLIK_NO AS "TC_KIMLIK_NO",
C.VERGI_DAIRESI AS "VERGI_DAIRESI",
C.CEP_TEL AS "CEP_TEL",
C.ILI AS "ILI",
C.ILCESI AS "ILCESI",
C.ADRESI_1 AS "ADRESI",
(SELECT
CHR.BLKODU AS "BLKODU",
CHR.EVRAK_NO AS "EVRAK_NO",
CHR.MAKBUZNO AS "MAKBUZ_NO",
CAST(CHR.TARIHI AS DATE) AS "TARIHI",
CAST(CHR.VADESI AS DATE) AS "VADESI",
CHR.MUH_DURUM AS "MUH_DURUM",
CAST(CHR.KPB_ATUT AS DECIMAL(10, 2)) AS "KPB_ATUT",
CAST(CHR.KPB_BTUT AS DECIMAL(10, 2)) AS "KPB_BTUT"
FROM CARIHR AS CHR
WHERE CHR.BLCRKODU = C.BLKODU
ORDER BY CHR.TARIHI
FOR XML PATH('CARIHR'), TYPE)
FROM CARI AS C
WHERE C.CARIKODU = 'CR00001'
FOR XML PATH ('CARI')
Python Code
import pyodbc
import uuid
import codecs
import query
import core
conn = pyodbc.connect(core.connection_string, commit=True)
cursor = conn.cursor()
cursor.execute(query.ctr)
row = cursor.fetchval()
id = uuid.uuid4()
xml_file = "./temp/"+str(id)+".xml"
xml = codecs.open(xml_file, "w", "utf-8")
xml.write(row)
xml.close()
I've tried pymssql and it didn't change anything.
cursor.fetchval() and cursor.fetchone() give me the same result.
cursor.fetchall() gives me the full data, but it comes back as a list. To convert it to a string I first need to select an element of the list, so I came up with the idea below. But the result didn't change at all: it still gives only the first 2037 characters.
conn = pyodbc.connect(connect_string, commit=True)
cursor = conn.cursor()
cursor.execute(query.ctr)
row = cursor.fetchall()
data = ','.join(row[0])
id = uuid.uuid4()
xml_file = "./temp/"+str(id)+".xml"
xml = codecs.open(xml_file, "w", "utf-8")
xml.write(data)
xml.close()

FOR XML results are automatically split into multiple rows by SQL Server when they are long enough. Some clients, like Management Studio, merge these back into a single value, but the result set is not actually one row.
So you need to concatenate the string yourself:
# concatenate the XML chunks that come back as separate rows
xmlString = ""
rows = cursor.fetchall()
for row in rows:
    xmlString = xmlString + row[0]
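Applied to the code from the question, a minimal sketch might look like this (it reuses the question's query and core modules and the ./temp directory; nothing else is assumed):
import codecs
import uuid

import pyodbc

import core   # assumed to hold connection_string, as in the question
import query  # assumed to hold the FOR XML query as query.ctr

conn = pyodbc.connect(core.connection_string)
cursor = conn.cursor()
cursor.execute(query.ctr)

# the XML arrives as several rows, each holding a chunk of the document,
# so join every chunk before writing the file
xml_string = "".join(row[0] for row in cursor.fetchall() if row[0] is not None)

xml_file = "./temp/" + str(uuid.uuid4()) + ".xml"
with codecs.open(xml_file, "w", "utf-8") as f:
    f.write(xml_string)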

Related

How do I write a header row using csv.writer when using a sqlalchemy cursor?

The question stems from my previous question here. By switching from using pandas to run the query and output to CSV to using sqlalchemy and csv.writer, I've lost the header row.
I apologize for the example code, as it's only going to be runnable if you have a SQL Server.
Running this code, you get the output "hello world|goodbye world" and the headers (a and b, respectively) are missing.
from sqlalchemy.engine import URL
from sqlalchemy import create_engine
import csv
serverName = 'FOO'
databaseName = 'BAR'
connection_string = ("Driver={SQL Server};"
"Server=" + serverName + ";"
"Database=" + databaseName + ";"
"Trusted_Connection=yes;")
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect":connection_string })
engine = create_engine(connection_url)
connection = engine.raw_connection()
cursor = connection.cursor()
sql = "select 'hello world' as 'a', 'goodbye world' as 'b'"
with open('foo.csv', 'w') as file:
    w = csv.writer(file, delimiter='|')
    for row in cursor.execute(sql):
        w.writerow(row)
cursor.close()
Not shown, as I was minimizing the code, is that my actual code grabs all .sql files in a given directory and runs each in turn, outputting the results to csv. As such, the header needs to be dynamically added, not hard-coded as the first row.
A user mentioned this post as potentially helpful, but I can't seem to find a way to use .keys() when using a cursor. Given the size of some of the queries being run, running a query once just to return the header and then again for the rows isn't a possible solution.
I also found this which seems to say that I can return CursorResult.keys() to write the column headers, but I'm unsure how to use that in the script above, as I can't find any object that has the attribute 'CursorResult'.
How do I get the header row from the sql query written to the csv as the expected first row when using a cursor?
When using a raw pyodbc cursor you can get the column names from Cursor.description after you .execute():
cnxn = engine.raw_connection()
crsr = cnxn.cursor()
crsr.execute("SELECT 1 AS foo, 2 AS bar")
col_names = [x[0] for x in crsr.description]
print(col_names) # ['foo', 'bar']
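Putting that together with the csv.writer loop from the question, a sketch might look like this (engine and connection_url are the ones built in the question above):
import csv

from sqlalchemy import create_engine

engine = create_engine(connection_url)   # connection_url as built in the question
connection = engine.raw_connection()
cursor = connection.cursor()

sql = "select 'hello world' as 'a', 'goodbye world' as 'b'"
with open('foo.csv', 'w', newline='') as file:
    w = csv.writer(file, delimiter='|')
    cursor.execute(sql)
    # description is available after execute(); item 0 of each entry is the column name
    w.writerow([col[0] for col in cursor.description])
    for row in cursor:
        w.writerow(row)
cursor.close()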

pyodbc with special characters

I have the following sql
select * from table
where "column" = 'ĚČÍ'
which works fine when committed directly to the database (in my case Microsoft SQL Server 2016).
If I use (in python)
import pyodbc
ex_str= """select * from table
where "column" = (?)"""
str2insert= ('ĚČÍ')
conn = pyodbc.connect(cstring)
cur = conn.cursor()
cur.execute(ex_str,str2insert)
content = cur.fetchall()
I get no result, although ĚČÍ is definitely an entry in the database column.
I added
conn.setencoding("utf-8")
encoded and decoded str2insert with latin-1, utf-8, etc., and added
cstring = cstring + ";convert_unicode=True"
but nothing worked.
There seems to be an issue with pyodbc and special characters. Does anyone have an idea?
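For what it's worth, pyodbc's encoding hooks take keyword arguments and are set per SQL type on the connection; a sketch of that configuration, reusing cstring, ex_str and str2insert from above (not a confirmed fix for this case, since the right values depend on the ODBC driver and on whether the column is VARCHAR or NVARCHAR):
import pyodbc

conn = pyodbc.connect(cstring)
# how Python str parameters are encoded before being sent
conn.setencoding(encoding='utf-8')
# how CHAR/VARCHAR and NCHAR/NVARCHAR results are decoded
conn.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-16le')
cur = conn.cursor()
cur.execute(ex_str, str2insert)
content = cur.fetchall()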

What is the best way to dump MySQL table data to csv and convert character encoding?

I have a table with about 200 columns. I need to take a dump of the daily transaction data for ETL purposes. It's a MySQL DB. I tried that with Python, both using a pandas dataframe and the basic write-to-CSV-file method. I even tried to look for the same functionality using a shell script; I saw one such script for Oracle Database using sqlplus. Following are my Python codes with the two approaches:
Using Pandas:
import MySQLdb as mdb
import pandas as pd
host = ""
user = ''
pass_ = ''
db = ''
query = 'SELECT * FROM TABLE1'
conn = mdb.connect(host=host,
                   user=user, passwd=pass_,
                   db=db)
df = pd.read_sql(query, con=conn)
df.to_csv('resume_bank.csv', sep=',')
Using basic python file write:
import MySQLdb
import csv
import datetime
currentDate = datetime.datetime.now().date()
host = ""
user = ''
pass_ = ''
db = ''
table = ''
con = MySQLdb.connect(user=user, passwd=pass_, host=host, db=db, charset='utf8')
cursor = con.cursor()
query = "SELECT * FROM %s;" % table
cursor.execute(query)
with open('Data_on_%s.csv' % currentDate, 'w') as f:
    writer = csv.writer(f)
    for row in cursor.fetchall():
        writer.writerow(row)
print('Done')
The table has about 300,000 records, and both Python scripts take too much time.
Also, there's an issue with encoding. The DB result set has some latin-1 characters, for which I'm getting errors like: UnicodeEncodeError: 'ascii' codec can't encode character '\x96' in position 1078: ordinal not in range(128).
I need to save the CSV in Unicode format. Can you please help me with the best approach to perform this task?
A Unix-based or Python-based solution will work for me. This script needs to run daily to dump the daily data.
You can achieve that just by leveraging MySQL. For example:
SELECT * FROM your_table WHERE ...
INTO OUTFILE 'your_file.csv'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\n';
If you need to schedule the query, put it into a file (e.g., csv_dump.sql) and create a cron task like this one:
00 00 * * * mysql -h your_host -u user -ppassword < /foo/bar/csv_dump.sql
For strings, this will use the default character encoding, which happens to be ASCII, and that fails when you have non-ASCII characters. You want unicode instead of str (on Python 2):
rows = cursor.fetchall()
f = open('Data_on_%s.csv' % currentDate, 'w')
myFile = csv.writer(f)
for row in rows:
    # encode each field of each row as UTF-8
    myFile.writerow([unicode(s).encode("utf-8") for s in row])
f.close()
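On Python 3 the same idea is simpler, because the file can be opened with an explicit encoding; a sketch reusing the cursor and currentDate from the question:
import csv

with open('Data_on_%s.csv' % currentDate, 'w', encoding='utf-8', newline='') as f:
    writer = csv.writer(f)
    for row in cursor.fetchall():
        writer.writerow(row)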
You can use mysqldump for this task. (Source for command)
mysqldump -u username -p --tab -T/path/to/directory dbname table_name --fields-terminated-by=','
The arguments are as follows:
-u username for the username
-p to indicate that a password should be used
-ppassword to give the password via command line
--tab Produce tab-separated data files
For more command-line switches, see https://dev.mysql.com/doc/refman/5.5/en/mysqldump.html
To run it on a regular basis, create a cron task as described in the other answers.
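For example, a nightly dump could be scheduled with an entry like this (paths and credentials are placeholders, following the cron example above):
00 00 * * * mysqldump -u username -ppassword --tab -T/path/to/directory dbname table_name --fields-terminated-by=','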

Python PYODBC - Previous SQL was not a query

I have the following Python code. It reads through a text file line by line and takes characters x to y of each line as the variable "contract".
import os
import pyodbc
cnxn = pyodbc.connect(r'DRIVER={SQL Server};CENSORED;Trusted_Connection=yes;')
cursor = cnxn.cursor()
claimsfile = open('claims.txt','r')
for line in claimsfile:
    # ldata = claimsfile.readline()
    contract = line[18:26]
    print(contract)
    cursor.execute("USE calms SELECT XREF_PLAN_CODE FROM calms_schema.APP_QUOTE WHERE APPLICATION_ID = " + str(contract))
    print(cursor.fetchall())
When including the line cursor.fetchall(), the following error is returned:
Programming Error: Previous SQL was not a query.
The query runs in SSMS, and replacing str(contract) with the actual value of the variable returns results as expected.
Based on the data, the query will return one value as a result formatted as NVARCHAR(4).
Most other examples have variables declared prior to the loop, and the proposed solution is to SET NOCOUNT ON; this does not apply to my problem, so I am slightly lost.
P.S. I have also put the query in its own standalone file, without the loop that iterates through the text file, in case that was causing the problem, without success.
In your SQL query you are actually issuing two commands, USE and SELECT, and the cursor is not set up for multiple statements. Also, with database connections you should select the database in the connection string (i.e., the DATABASE argument), so T-SQL's USE is not needed.
Consider the following adjustment with parameterization, where APPLICATION_ID is assumed to be an integer type. Add credentials as needed:
constr = 'DRIVER={SQL Server};SERVER=CENSORED;Trusted_Connection=yes;' \
         'DATABASE=calms;UID=username;PWD=password'

cnxn = pyodbc.connect(constr)
cur = cnxn.cursor()

with open('claims.txt', 'r') as f:
    for line in f:
        contract = line[18:26]
        print(contract)

        # EXECUTE QUERY
        cur.execute("SELECT XREF_PLAN_CODE FROM APP_QUOTE WHERE APPLICATION_ID = ?",
                    [int(contract)])

        # FETCH ROWS ITERATIVELY
        for row in cur.fetchall():
            print(row)

cur.close()
cnxn.close()

python - does cx_Oracle allow you to force all columns to be cx_Oracle.STRING?

This is a small snippet of Python code (not the entire thing) to write results to a file. Because the table I'm querying has some TIMESTAMP(6) WITH LOCAL TIME ZONE columns, the file stores the values in a different format, i.e. '2000-5-15 0.59.8.843679000' instead of '15-MAY-00 10.59.08.843679000 AM'. Is there a way to force it to write to the file as if the datatype were a VARCHAR (i.e. cx_Oracle.STRING or otherwise), so that the file has the same content as querying through a client tool?
import csv
import cx_Oracle

db = cx_Oracle.connect(..<MY CONNECT STRING>.)
cursor = db.cursor()
file = open('C:/blah.csv', "w")
writer = csv.writer(file)  # assumed; the writer object was elided from the original snippet
r = cursor.execute(<MY SQL>)
for row in cursor:
    writer.writerow(row)
Could you use to_char inside your query? That way it will be forced to STRING type.
r = cursor.execute("select to_char( thetime, 'DD-MON-RR HH24.MI.SSXFF' ) from my_table")
