Airflow: "decoding with 'utf-16le'" error when using several connections - python

I have a problem with my ETL process.
The process is written in Python and works well, but its operations run one after another, so the whole thing takes a long time.
I'm fairly new to Apache Airflow. I've built a DAG, but there is a problem with it.
I get this error:
File "/usr/lib/python3.8/encodings/utf_16_le.py", line 15, in decode
    def decode(input, errors='strict'):
File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 1543, in signal_handler
    raise AirflowException("Task received SIGTERM signal")
airflow.exceptions.AirflowException: Task received SIGTERM signal

The above exception was the direct cause of the following exception:

airflow.exceptions.AirflowException: decoding with 'utf-16le' codec failed (AirflowException: Task received SIGTERM signal)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 716, in do_execute
    cursor.execute(statement, parameters)
SystemError: <class 'pyodbc.Error'> returned a result with an error set

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 896, in _rollback_impl
    self.engine.dialect.do_rollback(self.connection)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 666, in do_rollback
    dbapi_connection.rollback()
pyodbc.OperationalError: ('08S01', '[08S01] [Microsoft][ODBC Driver 18 for SQL Server]Communication link failure (0) (SQLEndTran)')
Here is the code of my task. There can be up to 10 connections at once:
def update_from_gladiator_ost(market_id):
    query = "DELETE from [stage].[dbo].[rests_by_docs_temp] where market_id = %d" % market_id
    execute_query_dwh(query)
    engine = dwh_conn()
    connection = engine.raw_connection()
    abc = connection.cursor()
    # abc.execute("DELETE from [stage].[dbo].[sell_movement_temp]; DELETE from [stage].[dbo].[rests_by_docs_temp]")
    df_op = pd.read_sql(
        "SET NOCOUNT ON exec [dbo].[mp_report_finance_agent_enhanced_basis_transport_royalty_NC_ost_by_docs4] @pmarket_id = %d, @pstart_date = '%s', @pend_date = '%s', @pselect = '1'" % (
            market_id, z, w), gladiator_conn())
    df_op = df_op.fillna(value=0)
    for row_count in range(0, df_op.shape[0]):
        chunk = df_op.iloc[row_count:row_count + 1, :].values.tolist()
        tuple_of_tuples = tuple(tuple(x) for x in chunk)
        abc.executemany(
            "insert into stage.dbo.rests_by_docs_temp ([date_start],[market_id],[good_id],[agent_id],[doc_id],[tstart_qty],[tstart_amt],[IMP],[doc_name]) values (?,?,?,?,?,?,?,?,?)",
            tuple_of_tuples)
    abc.commit()
    connection.close()
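Incidentally, the loop above slices the DataFrame one row at a time and calls executemany() once per row, which defeats fast_executemany. A sketch of batching instead (iter_chunks and bulk_insert are illustrative helpers, not part of the original task; cursor and insert_sql stand in for abc and the INSERT statement above):

```python
def iter_chunks(rows, size):
    """Yield successive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

def bulk_insert(cursor, insert_sql, rows, size=1000):
    """Call executemany() once per chunk instead of once per row."""
    for batch in iter_chunks(rows, size):
        cursor.executemany(insert_sql, [tuple(r) for r in batch])

# e.g. bulk_insert(abc, insert_sql, df_op.values.tolist())
```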
As you can see, I read data from one database and INSERT it into my DWH.
And here are my connection functions:
def dwh_conn():
    mySQL = '192.168.240.1'
    myDB = 'DWH'
    login = 'sa'
    PWD = '....'
    Encrypt = 'No'
    Certificate = 'Yes'
    params = urllib.parse.quote_plus("DRIVER={ODBC Driver 18 for SQL Server};"
                                     "SERVER=" + mySQL + ";"
                                     "Port=1433" + ";"
                                     "DATABASE=" + myDB + ";"
                                     "UID=" + login + ";"
                                     "PWD=" + PWD + ";"
                                     "Encrypt=" + Encrypt + ";"
                                     "TrustServerCertificate=" + Certificate + ";")
    engine = sa.create_engine('mssql+pyodbc:///?odbc_connect={}?charset=utf8mb4'.format(params), fast_executemany=True)
    return engine
def gladiator_conn():
    mySQL = '...'
    myDB = '...'
    login = '...'
    PWD = '...'
    Encrypt = 'No'
    Certificate = 'Yes'
    params = urllib.parse.quote_plus("DRIVER={ODBC Driver 18 for SQL Server};"
                                     "SERVER=" + mySQL + ";"
                                     "Port=1433" + ";"
                                     "DATABASE=" + myDB + ";"
                                     "UID=" + login + ";"
                                     "PWD=" + PWD + ";"
                                     "Encrypt=" + Encrypt + ";"
                                     "TrustServerCertificate=" + Certificate + ";")
    engine = sa.create_engine('mssql+pyodbc:///?odbc_connect={}?charset=utf8mb4'.format(params), fast_executemany=True)
    return engine
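One thing worth checking: in both functions above, '?charset=utf8mb4' is appended after the already URL-encoded odbc_connect value, so it ends up inside the ODBC string rather than acting as a URL parameter (and charset=utf8mb4 is a MySQL option, not a SQL Server one). A sketch of composing the URL without it; build_mssql_url is a hypothetical helper using the same driver settings as above:

```python
import urllib.parse

def build_mssql_url(server, db, uid, pwd):
    """Hypothetical helper: compose a mssql+pyodbc URL. Everything after
    odbc_connect= must be one URL-encoded value, with nothing appended
    after it."""
    odbc = (
        "DRIVER={ODBC Driver 18 for SQL Server};"
        f"SERVER={server};Port=1433;DATABASE={db};"
        f"UID={uid};PWD={pwd};Encrypt=No;TrustServerCertificate=Yes;"
    )
    return "mssql+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(odbc)
```

The result can then be passed straight to sa.create_engine(url, fast_executemany=True).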
I think the problem is in unixODBC, because when I run the same code in PyCharm on Windows everything is fine, but in the Ubuntu/Airflow Docker container it sometimes fails.
I can restart the failed task and it may succeed, but then it can fail again.
Updated:
I think I found a possible solution, but I can't see how to apply it to my case:
def decode_sketchy_utf16(raw_bytes):
    s = raw_bytes.decode("utf-16le", "ignore")
    try:
        n = s.index('\u0000')
        s = s[:n]  # respect null terminator
    except ValueError:
        pass
    return s

# ...
prev_converter = cnxn.get_output_converter(pyodbc.SQL_WVARCHAR)
cnxn.add_output_converter(pyodbc.SQL_WVARCHAR, decode_sketchy_utf16)
col_info = crsr.columns("Clients").fetchall()
cnxn.add_output_converter(pyodbc.SQL_WVARCHAR, prev_converter)  # restore previous behaviour
How can I make this work in my code? Where should I implement it?
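For the placement question: assuming the converter approach applies here, it would need to be registered on the underlying pyodbc connection before the cursor reads anything. The registration lines below are a sketch in comments, since they need a live pyodbc connection, and the attribute path from SQLAlchemy's raw_connection() proxy to the DBAPI connection is an assumption; the decode function itself runs standalone:

```python
def decode_sketchy_utf16(raw_bytes):
    """Tolerantly decode UTF-16LE bytes, stopping at a null terminator."""
    s = raw_bytes.decode("utf-16le", "ignore")
    n = s.find("\u0000")
    return s if n == -1 else s[:n]

# Sketch of where it could go inside update_from_gladiator_ost (needs pyodbc):
# connection = engine.raw_connection()
# cnxn = connection.connection   # assumed path to the DBAPI connection
# cnxn.add_output_converter(pyodbc.SQL_WVARCHAR, decode_sketchy_utf16)
# abc = connection.cursor()
```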

Found an answer. This problem arises when I run out of memory (RAM). Especially when several containers are running at once, it can lead to this error.

Related

Python Flask Send Multiple Files as a response

I'm making a backend script to manage some things for a front end. I try to send multiple pictures back as the response to an HTTP request, but I get an error.
Python code on the server:
@app.route('/get_place_content/<locid>', methods=['GET'])  # type: ignore
def placecontent(locid):
    mydb = fun.connectDB()
    cursor = mydb.cursor()  # type: ignore
    token = request.headers.get('token')
    if fun.authenticate(token) or True:
        query = "select photos,videos,mapscreen from locdic where locid = " + str(locid)
        cursor.execute(query)
        for entry in cursor:
            pic = entry[0]
            print(pic)
            vid = entry[1]
            print(vid)
            pics = pic.split(',')
            vids = vid.split(',')
            map = entry[2]
            print(str(pics) + str(vids) + str(map))
        return (
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], pics[0]), mimetype='image/gif'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], pics[1]), mimetype='image/gif'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], pics[2]), mimetype='image/gif'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], vids[0]), mimetype='video/mp4'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], map), mimetype='image/gif'),
        ), 202
and I get this back:
connect to database ---- OK
connect to database ---- OK
disconnect from database ---- OK
9d3e358f8770edde764603a2ffa1726bdfc3a8e0_2490.jpg,202426be23c5e3d5photo_2023-01-17_16-50-13.jpg,4f70fc06barcodes.jpg
4a0ba73fredditsave.com_this_remote_surgery_was_made_by_a_london_surgeon-iymq8cxnanba1.mp4
['9d3e358f8770edde764603a2ffa1726bdfc3a8e0_2490.jpg', '202426be23c5e3d5photo_2023-01-17_16-50-13.jpg', '4f70fc06barcodes.jpg']['4a0ba73fredditsave.com_this_remote_surgery_was_made_by_a_london_surgeon-iymq8cxnanba1.mp4']5622e8eaScreenshot 2023-02-02 103254.png
[2023-02-02 10:49:29,060] ERROR in app: Exception on /get_place_content/10 [GET]
Traceback (most recent call last):
  File "/home/blank/.local/lib/python3.10/site-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/blank/.local/lib/python3.10/site-packages/flask/app.py", line 1823, in full_dispatch_request
    return self.finalize_request(rv)
  File "/home/blank/.local/lib/python3.10/site-packages/flask/app.py", line 1842, in finalize_request
    response = self.make_response(rv)
  File "/home/blank/.local/lib/python3.10/site-packages/flask/app.py", line 2170, in make_response
    raise TypeError(
TypeError: The view function did not return a valid response. The return type must be a string, dict, list, tuple with headers or status, Response instance, or WSGI callable, but it was a tuple.
80.106.135.81 - - [02/Feb/2023 10:49:29] "GET /get_place_content/10 HTTP/1.1" 500 -
The "fun" thing in the code is an external script I made so I don't have to type the same stuff over and over again.
Thank you in advance.
EDIT:
I tried changing the formatting to this, but still no luck:
@app.route('/get_place_content/<locid>', methods=['GET'])  # type: ignore
def placecontent(locid):
    mydb = fun.connectDB()
    cursor = mydb.cursor()  # type: ignore
    token = request.headers.get('token')
    if fun.authenticate(token) or True:
        query = "select photos,videos,mapscreen from locdic where locid = " + str(locid)
        cursor.execute(query)
        for entry in cursor:
            pic = entry[0]
            vid = entry[1]
            pics = pic.split(',')
            vids = vid.split(',')
            map = entry[2]
            print(str(pics) + str(vids) + str(map))
            pic1 = pics[0]
            pic2 = pics[1]
            pic3 = pics[2]
            vid1 = vids[0]
        fun.disconnectDB(False, mydb)
        return (
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], pic1), mimetype='image/gif'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], pic2), mimetype='image/gif'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], pic3), mimetype='image/gif'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], vid1), mimetype='video/mp4'),
            send_file(os.path.join(app.config['UPLOAD_FOLDER2'], map), mimetype='image/gif'),
        ), 202
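A single HTTP response can carry only one body, which is why returning a tuple of send_file() calls fails in both versions. One common workaround (a sketch, not the poster's code) is to bundle the files into one in-memory ZIP and send that; the zip-building helper below is pure stdlib, and bundle_files is a hypothetical name:

```python
import io
import zipfile

def bundle_files(named_blobs):
    """Pack a {filename: bytes} mapping into an in-memory ZIP buffer."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, blob in named_blobs.items():
            zf.writestr(name, blob)
    buf.seek(0)  # rewind so the caller can read from the start
    return buf

# In the Flask view this could then be returned as one response:
# return send_file(bundle_files(blobs), mimetype="application/zip",
#                  as_attachment=True, download_name="place_content.zip")
```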

Is it possible to view a postgresql table using Pandas?

def connectxmlDb(dbparams):
    conn_string = "host='{}' dbname='{}' user='{}' password='{}' port='{}'"\
        .format(dbparams['HOST'], dbparams['DB'], dbparams['USERNAME'], dbparams['PASSWORD'], dbparams['PORT'])
    try:
        conn = psycopg2.connect(conn_string)
    except Exception as err:
        print('Connection to Database Failed : ERR : {}'.format(err))
        return False
    print('Connection to Database Success')
    return conn

dbconn = connectxmlDb(params['DATABASE'])
dbconn = connectxmlDb(params['DATABASE'])
The code above is what I use to connect to the postgresql database.
sql_statement = """ select a.slno, a.clientid, a.filename, a.user1_id, b.username, a.user2_id, c.username as username2, a.uploaded_ts, a.status_id
from masterdb.xmlform_joblist a
left outer join masterdb.auth_user b
on a.user1_id = b.id
left outer join masterdb.auth_user c
on a.user2_id = c.id
"""
cursor.execute(sql_statement)
result = cursor.fetchall()
This is my code to extract data from a postgres table using Python.
I want to know if it is possible to view this data using Pandas.
This is the code I used:
df = pd.read_sql_query(sql_statement, dbconn)
print(df.head(10))
but it shows this error:
C:\Users\Lukmana\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\sql.py:762: UserWarning: pandas only support SQLAlchemy connectable(engine/connection) or database string URI or sqlite3 DBAPI2 connection; other DBAPI2 objects are not tested, please consider using SQLAlchemy
  warnings.warn(
Traceback (most recent call last):
  File "C:\Users\Lukmana\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\sql.py", line 2023, in execute
    cur.execute(*args, **kwargs)
psycopg2.errors.SyntaxError: syntax error at or near ";"
LINE 1: ...ELECT name FROM sqlite_master WHERE type='table' AND name=?;
                                                                      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\Lukmana\Desktop\user_count\usercount.py", line 55, in <module>
    df = pd.read_sql_table(result, dbconn)
  File "C:\Users\Lukmana\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\sql.py", line 286, in read_sql_table
    if not pandas_sql.has_table(table_name):
  File "C:\Users\Lukmana\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\sql.py", line 2200, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0
  File "C:\Users\Lukmana\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\sql.py", line 2035, in execute
    raise ex from exc
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': syntax error at or near ";"
LINE 1: ...ELECT name FROM sqlite_master WHERE type='table' AND name=?;
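Note the traceback shows pd.read_sql_table(result, dbconn) being called with the fetchall() result where a table name is expected, and with a raw psycopg2 connection, which pandas probes as if it were sqlite. The usual fix is to pass the SQL text to read_sql_query through an SQLAlchemy engine. A sketch, where load_joblist is a hypothetical helper and the DSN is a placeholder; the demo at the bottom runs against in-memory SQLite so the snippet is self-contained:

```python
import pandas as pd
from sqlalchemy import create_engine

def load_joblist(dsn, sql):
    """Run `sql` via an SQLAlchemy engine and return a DataFrame.
    For postgres, dsn would look like
    "postgresql+psycopg2://user:password@host:5432/dbname"."""
    engine = create_engine(dsn)
    return pd.read_sql_query(sql, engine)

# Self-contained demo against an in-memory SQLite database:
demo = load_joblist("sqlite://", "SELECT 1 AS slno, 'a.xml' AS filename")
print(demo.head(10))
```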

I am trying to unload data from a Snowflake internal stage to a Unix file path using COPY INTO and GET commands, but I am getting an error

I am running all the SQL scripts under the scripts path in a for loop, copying the data into the @priya_stage area in Snowflake, and then using the GET command to unload the data from the stage area to my Unix path in CSV format. But I am getting an error.
Note: this same code works on my Mac but not on the Unix server.
import logging
import os
import snowflake.connector
from snowflake.connector import DictCursor as dict
from os import walk

try:
    conn = snowflake.connector.connect(
        account = 'xxx',
        user = 'xxx',
        password = 'xxx',
        database = 'xxx',
        schema = 'xxx',
        warehouse = 'xxx',
        role = 'xxx',
    )
    conn.cursor().execute('USE WAREHOUSE xxx')
    conn.cursor().execute('USE DATABASE xxx')
    conn.cursor().execute('USE SCHEMA xxx')
    take = []
    scripts = '/xxx/apps/xxx/xxx/scripts/snow/scripts/'
    os.chdir('/xxx/apps/xxx/xxx/scripts/snow/scripts/')
    for root, dirs, files in walk(scripts):
        for file in files:
            inbound = file[0:-4]
            sql = open(file, 'r').read()
            # file_number = 0
            # file_number += 1
            file_prefix = 'bridg_' + inbound
            file_name = file_prefix
            result_query = conn.cursor(dict).execute(sql)
            query_id = result_query.sfqid
            sql_copy_into = f'''
                copy into @priya_stage/{file_name}
                from (SELECT * FROM TABLE(RESULT_SCAN('{query_id}')))
                DETAILED_OUTPUT = TRUE
                HEADER = TRUE
                SINGLE = FALSE
                OVERWRITE = TRUE
                max_file_size=4900000000'''
            rs_copy_into = conn.cursor(dict).execute(sql_copy_into)
            for row_copy in rs_copy_into:
                file_name_in_stage = row_copy["FILE_NAME"]
                sql_get_to_local = f"""
                    GET @priya_stage/{file_name_in_stage} file:///xxx/apps/xxx/xxx/inbound/zip_files/{inbound}/"""
                rs_get_to_local = conn.cursor(dict).execute(sql_get_to_local)
except snowflake.connector.errors.ProgrammingError as e:
    print('Error {0} ({1}): {2} ({3})'.format(e.errno, e.sqlstate, e.msg, e.sfqid))
finally:
    conn.cursor().close()
    conn.close()
Error
Traceback (most recent call last):
  File "Generic_local.py", line 52, in <module>
    rs_get_to_local = conn.cursor(dict).execute(sql_get_to_local)
  File "/usr/local/lib64/python3.6/site-packages/snowflake/connector/cursor.py", line 746, in execute
    sf_file_transfer_agent.execute()
  File "/usr/local/lib64/python3.6/site-packages/snowflake/connector/file_transfer_agent.py", line 379, in execute
    self._transfer_accelerate_config()
  File "/usr/local/lib64/python3.6/site-packages/snowflake/connector/file_transfer_agent.py", line 671, in _transfer_accelerate_config
    self._use_accelerate_endpoint = client.transfer_accelerate_config()
  File "/usr/local/lib64/python3.6/site-packages/snowflake/connector/s3_storage_client.py", line 572, in transfer_accelerate_config
    url=url, verb="GET", retry_id=retry_id, query_parts=dict(query_parts)
  File "/usr/local/lib64/python3.6/site-packages/snowflake/connector/s3_storage_client.py", line 353, in _send_request_with_authentication_and_retry
    verb, generate_authenticated_url_and_args_v4, retry_id
  File "/usr/local/lib64/python3.6/site-packages/snowflake/connector/storage_client.py", line 313, in _send_request_with_retry
    f"{verb} with url {url} failed for exceeding maximum retries."
snowflake.connector.errors.RequestExceedMaxRetryError: GET with url b'https://xxx-xxxxx-xxx-x-customer-stage.xx.amazonaws.com/https://xxx-xxxxx-xxx-x-customer-stage.xx.amazonaws.com/?accelerate' failed for exceeding maximum retries.
This link redirects me to an error message:
https://xxx-xxxxx-xxx-x-customer-stage.xx.amazonaws.com/https://xxx-xxxxx-xxx-x-customer-stage.xx.amazonaws.com/?accelerate
Access Denied error:
<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>1X1Z8G0BTX8BAHXK</RequestId>
<HostId>QqdCqaSK7ogAEq3sNWaQVZVXUGaqZnPv78FiflvVzkF6nSYXTSKu3iSiYlUOU0ka+0IMzErwGC4=</HostId>
</Error>

Error trying to drop a database using MySQLdb

I'm trying to drop a database (if it exists) and create a new one, but I get some errors using MySQLdb.
I'm using a class called 'Repository' to execute my code, and all my database information is in the __init__(self) function.
__init__(self):
def __init__(self):
    file_name = 'Repositories.json'
    json_file = open(file_name)
    self.json_data = json.load(json_file)
    self.configuration = Configuration.Config()
    self.database_host, self.database_user, self.database_password = self.configuration.get_database()
    database_connect = MySQLdb.connect(host=self.database_host, user=self.database_user, passwd=self.database_password)
    self.database = database_connect.cursor()
Function:
def create_database(self, repository, repository_host):
    database_name = repository + '_' + repository_host
    database_drop_query = 'DROP DATABASE IF EXISTS ' + str(database_name)
    database_create_query = 'CREATE DATABASE ' + str(database_name)
    self.database.execute(database_drop_query)
    self.database.execute(database_create_query)
    self.database.close()
    return database_name
Error:
File "Repository.py", line 46, in create_database
    self.database.execute(database_drop_query)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 154, in execute
    db = self._get_db()
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 136, in _get_db
    self.errorhandler(self, ProgrammingError, "cursor closed")
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: cursor closed
I am unable to interpret this error or find a possible solution. Could anyone help me?
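The traceback says the cursor is already closed when execute() runs: create_database() calls self.database.close() at the end, so any later call on the same Repository instance reuses a dead cursor. For illustration only (stdlib sqlite3 here, not MySQLdb), the same pattern reproduces the failure:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")  # first call works
cur.close()                                # mirrors self.database.close()

try:
    cur.execute("SELECT 1")                # second use of the same cursor
    reused_ok = True
except sqlite3.ProgrammingError:           # "Cannot operate on a closed cursor"
    reused_ok = False
```

Keeping the cursor open for the object's lifetime (or creating a fresh cursor per call) avoids this.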

AttributeError: __exit__ on Python 3.4

Original Code:
import sys
import os
import latexmake
import mysql.connector

conn = mysql.connector.connect(user='root', password='oilwell', host='localhost', database='sqlpush1')
with conn:
    mycursor = conn.cursor()
    mycursor.execute("SELECT DATE,oil,gas,oilprice,gasprice,totrev FROM results WHERE DATE BETWEEN '2011-01-01' AND '2027-12-01'")
    rows = mycursor.fetchall()
    a = open("testtest.tex", "w")
    a.write("\\documentclass{standalone}\\usepackage{booktabs}\n\n\\usepackage{siunitx}\r \n\
\r\n\\begin{document}\r\n\\begin{tabular}{ccS[table-format = 5.2]} \\\\ \\toprule\r")
    a.write("Date & Oil & Gas & Oil price & Gas price & Total Revenue \\\\ \\midrule \r")
    for row in rows:
        a.write("" + str(row[0]) + " & " + str(row[1]) + " & " + str(row[2]) + " & " + str(row[3]) + " & " + str(row[4]) + " & " + str(row[5]) + " \\\\ \r")
    a.write("\\bottomrule \\end{tabular}\r\\end{document}")
    a.close()
    print(os.path.getsize("testtest.tex"))
    os.system('latexmk.py -q testtest.tex')
    mycursor.close()
    conn.close()
After running it in IDLE, a red error pops up:
Traceback (most recent call last):
  File "C:\Users\Cheng XXXX\Desktop\tabletest.py", line 8, in <module>
    with conn:
AttributeError: __exit__
I checked the file and cannot find the mistake; I need help.
You are trying to use the connection as a context manager:
with conn:
This object doesn't implement the necessary methods to be used like that; it is not a context manager, as it is missing (at least) the __exit__ method.
If you are reading a tutorial or documentation that uses a different MySQL library, be aware that this feature may be supported by some libraries, just not this one. The MySQLdb project does support it, for example.
For your specific case, you don't even need to use the with conn: line at all; you are not making any changes to the database, no commit is required anywhere. You can safely remove the with conn: line (unindent everything under it one step). Otherwise you can replace the context manager with a manual conn.commit() elsewhere.
Alternatively, you can create your own context manager for this use-case, using the @contextlib.contextmanager() decorator:
from contextlib import contextmanager

@contextmanager
def manage_transaction(conn, *args, **kw):
    exc = False
    try:
        try:
            conn.start_transaction(*args, **kw)
            yield conn.cursor()
        except:
            exc = True
            conn.rollback()
    finally:
        if not exc:
            conn.commit()
and use this as:
with manage_transaction(conn) as cursor:
    # do things, including creating extra cursors
where you can pass in extra arguments for the connection.start_transaction() call.
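The same commit-on-success, rollback-on-error pattern can be exercised end-to-end with stdlib sqlite3 (start_transaction() is specific to mysql.connector, so this sqlite3 variant simply relies on sqlite's implicit transactions; it is an adaptation of the answer's sketch, not the original code):

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def manage_transaction(conn):
    """Yield a cursor; commit on success, roll back on any exception."""
    try:
        yield conn.cursor()
    except Exception:
        conn.rollback()
        raise
    else:
        conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (oil REAL)")
with manage_transaction(conn) as cur:
    cur.execute("INSERT INTO results VALUES (1.5)")  # committed on exit
```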
