Data loaded to MySQL via python disappears - python

I've looked around to see whether anyone has had this problem, but it looks like not. Basically, my problem is as follows:
I try loading data into a MySQL DB using the MySQLdb library for Python.
I seem to succeed, since I'm able to retrieve the items I loaded within the same Python instance.
Once the Python code has run and closed, when I try to retrieve the data, either by running a query in MySQL Workbench or by running Python code in the command prompt, I cannot retrieve the data.
So in summary, I do load the data in, but the moment I close the Python instance, the data seems to disappear.
To try to isolate the problem, I placed a time.sleep(60) line into my code so that once the Python code loads the data, I can go and try retrieving the data from MySQL Workbench using queries, but I still can't.
I thought perhaps I was saving the data into a different instance, but I checked things like the port etc. and they are identical.
I've spent 4-5 hours trying to figure this out, but I'm starting to lose hope. Help much appreciated. Please find my code below:
import time
import MySQLdb

db = MySQLdb.connect("localhost", "root", "password", "mydb")
cursor = db.cursor()
cursor.execute("SELECT VERSION()")
data = cursor.fetchone()
print data
# Note: the file path has to be quoted inside the SQL statement
cursor.execute("LOAD DATA LOCAL INFILE 'filepath/file.txt' INTO TABLE addata FIELDS TERMINATED BY ';' LINES TERMINATED BY '\r\n'")
data = cursor.fetchall()
print data  ### At this point data displays warnings etc
cursor.execute("select * from addata")
data = cursor.fetchmany(10)
print data  ### Here I can see that the data is loaded
time.sleep(60)  ## While the code is sleeping I go to MySQL Workbench and try the query "select * from addata". It returns nothing :(

You almost certainly need to commit the data after you have loaded it.
If your program exits without committing the data, the DB will roll back your transaction, on the assumption that something has gone wrong.
You may be able to set autocommit as part of your connection request; otherwise you should call commit() on the connection object.
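For example, a minimal sketch with MySQLdb (the connection parameters and file path are placeholders):
import MySQLdb

db = MySQLdb.connect("localhost", "root", "password", "mydb")
cursor = db.cursor()
cursor.execute("LOAD DATA LOCAL INFILE 'filepath/file.txt' INTO TABLE addata FIELDS TERMINATED BY ';' LINES TERMINATED BY '\r\n'")
db.commit()  # without this, the loaded rows are rolled back when the connection closes

# Alternatively, enable autocommit on the connection before loading:
db.autocommit(True)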

Related

Trying to understand why Sqlite3 queries through Anaconda/python (JupyterLab) take ~1,000 times longer than if run through DB Browser

I have a Python script to import data from raw csv/xlsx files. For these I use Pandas to load the files, do some light transformation, and save to an sqlite3 database. This is fast (as fast as any method). After this, I run some queries against these to make some intermediate datasets. These I run through a function (see below).
More information: I am using Anaconda/Python3 (3.9) on Windows 10 Enterprise.
UPDATE:
Just as information for anybody reading this, I ended up going back to just using standalone Python (still using JupyterLab though)... I no longer have this issue. So I'm not sure if it is a problem with something Anaconda does or just the versions of various libraries being used for that particular Anaconda distribution (latest available). My script runs more or less in the time that I would expect using Python 3.11 and the versions pulled in by pip for pandas and sqlite (1.5.3 and 3.38.4).
Python function for running sqlite3 queries:
def runSqliteScript(destConnString, queryString):
    '''Runs an sqlite script given a connection string and a query string
    '''
    try:
        print('Trying to execute sql script: ')
        print(queryString)
        cursorTmp = destConnString.cursor()
        cursorTmp.executescript(queryString)
    except Exception as e:
        print('Error caught: {}'.format(e))
Because somebody asked, here is the function that creates the "destConnString" (it's called something else in the actual function call, but it is the same type).
def createSqliteDb(db_file):
    ''' Creates an sqlite database at the directory/file name specified
    '''
    conSqlite = None
    try:
        conSqlite = sqlite3.connect(db_file)
        return conSqlite
    except sqlite3.Error as e:
        print('Error {} when trying to create {}'.format(e, db_file))
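Roughly how these two helpers are wired together (simplified; the file name and query text here are placeholders, not the real script):
dbConn = createSqliteDb('intermediate_datasets.sqlite3')  # placeholder file name
exampleQuery = 'BEGIN; drop table if exists tbl_example; COMMIT;'  # placeholder script
runSqliteScript(dbConn, exampleQuery)
dbConn.close()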
Example of one of the queries (I commented out journal mode/synchronous pragmas after it didn't seem to help at all):
-- PRAGMA journal_mode = WAL;
-- PRAGMA synchronous = NORMAL;
BEGIN;
drop table if exists tbl_1110_cop_omd_fmd;
COMMIT;
BEGIN;
create table tbl_1110_cop_omd_fmd as
select
siteId,
orderNumber,
familyGroup01, familyGroup02,
count(*) as countOfLines
from tbl_0000_ob_trx_for_frazelle
where 1 = 1
-- and dateCreated between datetime('now', '-365 days') and datetime('now', 'localtime') -- temporarily commented due to no date in file
group by siteId,
orderNumber,
familyGroup01, familyGroup02
order by dateCreated asc
;
COMMIT
;
Here is a complete list of the things I have tried. Unfortunately, no matter what combination of things I have tried, there has always been one bottleneck or another. It seems there is some kind of write bottleneck from Python to sqlite3, yet the pandas to_sql method doesn't seem to be affected by it.
I tried wrapping all my queries in begin/commit statements. I put these in-line with the query, though I'd be interested in knowing if this is the correct way to do this (a Python-side alternative is sketched after this list). This seemed to have no effect.
I tried setting the journal mode to WAL and synchronous to normal, again to no effect.
I tried running the queries in an in-memory database.
Firstly, I tried creating everything from scratch in the in-memory database. The tables weren't created any faster. Saving this in-memory database to disk seems to be a bottleneck (backup method).
Next, I tried creating views instead of tables (again, creating everything from scratch in the in-memory database). The views were created really quickly and, weirdly, querying them was very fast. Saving this in-memory database to disk seems to be a bottleneck (backup method).
I tried just writing views to the database file (not in-memory). Unfortunately, the views take as long to create as the tables when run from Python/sqlite.
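For reference, this is roughly what I mean by controlling the transaction from the Python side instead of relying on the BEGIN/COMMIT lines embedded in the .sql file (a sketch only; the table name and statements are placeholders):
import sqlite3

conn = sqlite3.connect('intermediate_datasets.sqlite3')  # placeholder file name
conn.isolation_level = None  # autocommit mode: the script controls transactions explicitly
try:
    conn.execute('BEGIN')
    conn.execute('drop table if exists tbl_example')
    conn.execute('create table tbl_example as select 1 as dummyColumn')
    conn.execute('COMMIT')
except Exception:
    conn.execute('ROLLBACK')
    raise
finally:
    conn.close()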
I don't really want to do anything strictly in-memory for the database creation, as this Python script is used for different sets of data, some of which could have too many rows for an in-memory setup. The only thing I have left to try is to take the in-memory from-scratch setup, make views instead of tables, read ALL the in-memory db tables with pandas (read_sql), then write ALL the tables to a file db with pandas (to_sql); a rough sketch of that idea is at the end of this question. Hoping there is something easy to try to resolve this problem.
connOBData = sqlite3.connect('file:cachedb?mode=memory&cache=shared', uri=True)
The queries run through this Python function take approximately 1,000 times longer (or more) than if I run them directly in DB Browser (an sqlite frontend). The queries aren't that complex and run fine (in ~2-4 seconds) in DB Browser. All told, if I run all the queries in a row in DB Browser, they finish in 1-2 minutes. If I let them run through the Python script, it literally takes close to 20 hours. I'd expect the queries to finish in approximately the same time as they do in DB Browser.
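And this is roughly the pandas-based transfer idea mentioned above, as a sketch only (the destination file name is a placeholder):
import sqlite3
import pandas as pd

memConn = sqlite3.connect(':memory:')         # source: in-memory db built from scratch
fileConn = sqlite3.connect('output.sqlite3')  # destination: placeholder file name

# ... build the tables/views in memConn here ...

names = pd.read_sql("select name from sqlite_master where type in ('table', 'view')", memConn)
for name in names['name']:
    df = pd.read_sql('select * from {}'.format(name), memConn)
    df.to_sql(name, fileConn, if_exists='replace', index=False)

fileConn.close()
memConn.close()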

Django project doesn't insert into Oracle table

I'm building an integration app that consumes data from an API and saves the sensitive information into a table inside an Oracle database. My models successfully migrated and created the tables, and I was also able to successfully consume and filter the data I need from the API, so I proceeded to use objects.update_or_create to populate my table with the data. Initially it worked fine and inserted the information normally, until it got stuck and the queries stopped. After that I dropped the tables and started the migration process anew, and also changed my method to objects.create with .save(force_insert=True) to brute-force the insert into the table, but the problem persisted. I'm kind of lost, not knowing what is wrong, mainly because it doesn't raise any error or exception and just remains stuck in the block.
for item in value_list['itens']:
    print(item)
    i = Item.objects.using('adm_int').create(
        nature=item['nature'],
        nr_doc=item['nr_doc'],
        name=item['name'],
        value=item['value'],
        type_op=item['type'],
        description=item['history']['description'],
    )
    i.save(force_insert=True)
Inside the response from the API there will be N items (the 'itens' key in the payload), so I need to insert the data from each item into the table. When it begins the loop, it doesn't insert the data and just stops there.
I was able to solve this. I added a sleep at the end of my loop so that Django would wait for the database to finish the insert before running the loop one more time. What I think was happening is that the DB was not able to keep up with the app and was blocking the insert while holding the session open.
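Roughly like this (a sketch of the workaround; the delay value is arbitrary and would need tuning):
import time

for item in value_list['itens']:
    i = Item.objects.using('adm_int').create(
        nature=item['nature'],
        nr_doc=item['nr_doc'],
        name=item['name'],
        value=item['value'],
        type_op=item['type'],
        description=item['history']['description'],
    )
    time.sleep(0.5)  # arbitrary pause so the database can finish the insert before the next iteration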

Execute SQL insert statements via SQLAlchemy within a for loop

I need to extract a bunch of records from a SingleStore database and insert them into another table. For performance, the ideal way to do this is to create a query string with an INSERT INTO statement and iterate through it on a daily basis.
I can't seem to get Python to actually execute the query in the database, even though it appears to run successfully.
fh = 'file_containing_insert_select_query.sql'
qry = open(fh).read()

for i in range(2):
    qry_new = some_custom_function_to_replace_dates(qry, i)
    engine = tls.custom_engine_function()
    engine.execute(qry_new)
I've verified that the SQL statements created by my custom function can be copy/pasted into a SQL editor and executed successfully, but they won't run from Python... any thoughts?
After executing the query above, you need to send a commit to the database using connection.commit() (where connection is your database connection object), so that the rows inserted by the Python program are actually saved.
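A minimal sketch of that idea, assuming SQLAlchemy 1.4+ (the transaction opened by engine.begin() is committed automatically when the block exits):
from sqlalchemy import text

fh = 'file_containing_insert_select_query.sql'
qry = open(fh).read()

for i in range(2):
    qry_new = some_custom_function_to_replace_dates(qry, i)
    engine = tls.custom_engine_function()
    with engine.begin() as conn:  # opens a transaction and commits it on success
        conn.execute(text(qry_new))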
If you want it to run really fast, it's usually better to use set-oriented SQL, like INSERT INTO targetTbl … SELECT FROM …;
That way, it doesn’t have to round-trip through the client app.

Why is cursor.fetchone() or cursor.fetchall() not returning/reading data?

I am learning Python and I am following this tutorial. I have completed part one and now I am doing the second one. I have a git repo for the same here. Currently I am stuck at a weird problem: I am trying to get data from MySQL using a stored procedure. I am getting data in the cursor, but when I call fetchall or fetchone I get no data.
con = mysql.connect()
cursor = con.cursor()
cursor.callproc('sp_validateLogin', (_username,))
data = cursor.fetchone()  # tried fetchall here too, but no data in the "data" variable
Here is the screenshot of the data fetched in the cursor.
So every time the code jumps to the else branch, when it should go to the if branch. How do I overcome this?
Dev environment: Win10, VS2013 with PTVS, Python v 2.7
I have gone through this, this, and so many similar answers on the internet.
After lots of searching and experimenting, this code works:
data = cursor._rows
Here I can access the data fetched from MySQL into the cursor, but I am still unable to get .fetchall() or .fetchone() to work.
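Note that _rows is a private attribute, so this could break between driver versions. If the underlying driver is mysql-connector-python (an assumption; the question doesn't say which driver backs mysql.connect()), the documented way to read result sets produced by callproc is stored_results():
cursor.callproc('sp_validateLogin', (_username,))
# each entry from stored_results() is a cursor-like object for one result set of the procedure
for result in cursor.stored_results():
    data = result.fetchall()
    print(data)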

Issues with MySQL Python code when restarting the script after an abnormal abort

I am writing Python code to get movie data from DBpedia and put it in a MySQL database.
Presently, my application is having issues: it hangs partway through, and if it aborts, then the next time I restart it there are issues when dropping and re-creating the table in the database.
Is it because the connection with the database didn't get closed last time?
If so, what is a possible solution to avoid this issue when restarting the application?
What Python library are you using to access MySQL? If you are using MySQLdb, then to be sure everything is written correctly, you need to use the close method of your cursor and the commit method of the connection. For example:
import MySQLdb

conn = MySQLdb.connect(user="username", passwd="password", db="dbname")  # note: the keyword is passwd, not pass
cur = conn.cursor()
# Work with your cur object to do what you want
cur.close()
conn.commit()
conn.close()
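If the script can abort partway through, it also helps to put the cleanup in a finally block so the connection is committed and closed even when an exception is raised (a sketch along the same lines):
import MySQLdb

conn = MySQLdb.connect(user="username", passwd="password", db="dbname")
cur = conn.cursor()
try:
    # ... work with cur here ...
    conn.commit()
finally:
    cur.close()
    conn.close()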
