Insert or update rows in MS Access database in Python - python

I've got an MS Access table (SearchAdsAccountLevel) which needs to be updated frequently from a python script. I've set up the pyodbc connection and now I would like to UPDATE/INSERT rows from my pandas df to the MS Access table based on whether the Date_ AND CampaignId fields match with the df data.
Looking at previous examples, I've built an UPDATE statement that uses iterrows to iterate through all rows of the df and execute the SQL code below:
connection_string = (
r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
r"c:\AccessDatabases\Database2.accdb;"
)
cnxn = pyodbc.connect(connection_string, autocommit=True)
crsr = cnxn.cursor()
for index, row in df.iterrows():
    crsr.execute("UPDATE SearchAdsAccountLevel SET [OrgId]=?, [CampaignName]=?, [CampaignStatus]=?, [Storefront]=?, [AppName]=?, [AppId]=?, [TotalBudgetAmount]=?, [TotalBudgetCurrency]=?, [DailyBudgetAmount]=?, [DailyBudgetCurrency]=?, [Impressions]=?, [Taps]=?, [Conversions]=?, [ConversionsNewDownloads]=?, [ConversionsRedownloads]=?, [Ttr]=?, [LocalSpendAmount]=?, [LocalSpendCurrency]=?, [ConversionRate]=?, [Week_]=?, [Month_]=?, [Year_]=?, [Quarter]=?, [FinancialYear]=?, [RowUpdatedTime]=? WHERE [Date_]=? AND [CampaignId]=?",
                 row['OrgId'],
                 row['CampaignName'],
                 row['CampaignStatus'],
                 row['Storefront'],
                 row['AppName'],
                 row['AppId'],
                 row['TotalBudgetAmount'],
                 row['TotalBudgetCurrency'],
                 row['DailyBudgetAmount'],
                 row['DailyBudgetCurrency'],
                 row['Impressions'],
                 row['Taps'],
                 row['Conversions'],
                 row['ConversionsNewDownloads'],
                 row['ConversionsRedownloads'],
                 row['Ttr'],
                 row['LocalSpendAmount'],
                 row['LocalSpendCurrency'],
                 row['ConversionRate'],
                 row['Week_'],
                 row['Month_'],
                 row['Year_'],
                 row['Quarter'],
                 row['FinancialYear'],
                 row['RowUpdatedTime'],
                 row['Date_'],
                 row['CampaignId'])
crsr.commit()
I would like to iterate through each row within my df (around 3000) and if the ['Date_'] AND ['CampaignId'] match I UPDATE all other fields. Otherwise I want to INSERT the whole df row in my Access Table (create new row). What's the most efficient and effective way to achieve this?

Consider DataFrame.values and pass the resulting list into an executemany call, making sure to order the columns so they line up with the placeholders in the UPDATE query:
cols = ['OrgId', 'CampaignName', 'CampaignStatus', 'Storefront',
'AppName', 'AppId', 'TotalBudgetAmount', 'TotalBudgetCurrency',
'DailyBudgetAmount', 'DailyBudgetCurrency', 'Impressions',
'Taps', 'Conversions', 'ConversionsNewDownloads', 'ConversionsRedownloads',
'Ttr', 'LocalSpendAmount', 'LocalSpendCurrency', 'ConversionRate',
'Week_', 'Month_', 'Year_', 'Quarter', 'FinancialYear',
'RowUpdatedTime', 'Date_', 'CampaignId']
sql = '''UPDATE SearchAdsAccountLevel
SET [OrgId]=?, [CampaignName]=?, [CampaignStatus]=?, [Storefront]=?,
[AppName]=?, [AppId]=?, [TotalBudgetAmount]=?,
[TotalBudgetCurrency]=?, [DailyBudgetAmount]=?,
[DailyBudgetCurrency]=?, [Impressions]=?, [Taps]=?, [Conversions]=?,
[ConversionsNewDownloads]=?, [ConversionsRedownloads]=?, [Ttr]=?,
[LocalSpendAmount]=?, [LocalSpendCurrency]=?, [ConversionRate]=?,
[Week_]=?, [Month_]=?, [Year_]=?, [Quarter]=?, [FinancialYear]=?,
[RowUpdatedTime]=?
WHERE [Date_]=? AND [CampaignId]=?'''
crsr.executemany(sql, df[cols].values.tolist())
cnxn.commit()
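As a quick sanity check on the ordering: df[cols].values.tolist() yields one parameter list per row, in exactly the column order given, so the two key columns must come last to line up with the WHERE placeholders. A minimal sketch (with a toy frame and only three of the columns) of what executemany receives:

```python
import pandas as pd

# Toy frame standing in for the real df; only three columns for brevity.
df = pd.DataFrame({"CampaignName": ["A"], "OrgId": [1], "CampaignId": [99]})

# Reorder so the WHERE-clause key comes last, mirroring the UPDATE statement.
cols = ["OrgId", "CampaignName", "CampaignId"]
params = df[cols].values.tolist()
print(params)  # one flat parameter list per DataFrame row: [[1, 'A', 99]]
```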
For the insert, use a temporary staging table with the exact structure of the final table, which you can create with a make-table query: SELECT TOP 1 * INTO temp FROM final. This temp table is regularly cleaned out and re-filled with all data frame rows. The final query then migrates only the new rows from temp into final using NOT EXISTS, NOT IN, or a LEFT JOIN / IS NULL check. You can run this query any time and never worry about duplicates on the Date_ and CampaignId columns.
# CLEAN OUT TEMP
sql = '''DELETE FROM SearchAdsAccountLevel_Temp'''
crsr.execute(sql)
cnxn.commit()
# APPEND TO TEMP
sql = '''INSERT INTO SearchAdsAccountLevel_Temp (OrgId, CampaignName, CampaignStatus, Storefront,
AppName, AppId, TotalBudgetAmount, TotalBudgetCurrency,
DailyBudgetAmount, DailyBudgetCurrency, Impressions,
Taps, Conversions, ConversionsNewDownloads, ConversionsRedownloads,
Ttr, LocalSpendAmount, LocalSpendCurrency, ConversionRate,
Week_, Month_, Year_, Quarter, FinancialYear,
RowUpdatedTime, Date_, CampaignId)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?);'''
crsr.executemany(sql, df[cols].values.tolist())
cnxn.commit()
# MIGRATE TO FINAL
sql = '''INSERT INTO SearchAdsAccountLevel
SELECT t.*
FROM SearchAdsAccountLevel_Temp t
LEFT JOIN SearchAdsAccountLevel f
ON t.Date_ = f.Date_ AND t.CampaignId = f.CampaignId
WHERE f.OrgId IS NULL'''
crsr.execute(sql)
cnxn.commit()
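For readers without an Access driver handy, the same clean-out / append / migrate pattern can be sketched end-to-end with the sqlite3 module that ships with Python (table and column names trimmed down for the example):

```python
import sqlite3

cnxn = sqlite3.connect(":memory:")
crsr = cnxn.cursor()
crsr.execute("CREATE TABLE final (Date_ TEXT, CampaignId INTEGER, OrgId INTEGER)")
crsr.execute("CREATE TABLE temp_ (Date_ TEXT, CampaignId INTEGER, OrgId INTEGER)")
crsr.execute("INSERT INTO final VALUES ('2020-01-01', 1, 10)")  # pre-existing row

rows = [("2020-01-01", 1, 10), ("2020-01-02", 2, 20)]  # one duplicate, one new

# 1) clean out temp, 2) append all data frame rows, 3) migrate only new keys
crsr.execute("DELETE FROM temp_")
crsr.executemany("INSERT INTO temp_ VALUES (?, ?, ?)", rows)
crsr.execute("""INSERT INTO final
                SELECT t.* FROM temp_ t
                LEFT JOIN final f
                  ON t.Date_ = f.Date_ AND t.CampaignId = f.CampaignId
                WHERE f.CampaignId IS NULL""")
cnxn.commit()
print(crsr.execute("SELECT COUNT(*) FROM final").fetchone()[0])  # 2
```

The duplicate (Date_, CampaignId) pair is filtered out by the join; only the genuinely new row reaches the final table.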

Related

pyodbc insert into Access database with for loop

Trying to use Python code to write data to an MS Access database, but it's showing a syntax error. Can someone help me? Thanks.
df = pd.read_excel(path, sheet_name='Sheet1')
acc_cnxn = pyodbc.connect(conn_str)
acc_crsr = acc_cnxn.cursor()
for index, row in df.iterrows():
    acc_crsr.executemany("""INSERT INTO testing
        ([Col. Version], Version, Entity, Segment, LOB, [Week Starting],
         Week, Day, Date, Interval, [col beta], [col alpha],)
        values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);""",
        row.Col. Version, row.Version, row.Entity, row.Segment, row.LOB,
        row.Week Starting, row.Week, row.Day, row.Date, row.Interval,
        row.col beta, row.col alpha)
    acc_crsr.commit()
showing this error:
SyntaxError: invalid syntax
You cannot use the dot operator to access columns with a space or special character in the name; use bracket indexing, row["Col. Version"], in this case.
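A minimal sketch of the bracket-indexing fix, demonstrated against an in-memory SQLite table with two of the awkward column names (pandas and sqlite3 only here; the real code would keep its pyodbc connection):

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({"Col. Version": ["v1"], "Week Starting": ["2024-01-06"]})

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE testing ([Col. Version] TEXT, [Week Starting] TEXT)')
for index, row in df.iterrows():
    # row["Col. Version"] works where row.Col. Version is a SyntaxError
    con.execute("INSERT INTO testing ([Col. Version], [Week Starting]) VALUES (?, ?)",
                (row["Col. Version"], row["Week Starting"]))
con.commit()
print(con.execute("SELECT [Col. Version] FROM testing").fetchone()[0])  # v1
```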

python list to database column (sqlite)

I have a list called
fullpricelist=[373.97, 381.0, 398.98, 402.98, 404.98, 457.97, 535.99, 550.97, 566.98]
I would like to write this list into an SQLite database column. I found the following code in another question and adapted it to my situation:
cursor.executemany("""INSERT INTO cardata (fullprice) VALUES (?)""",
zip(fullpricelist))
My current script is this
for name, name2, image in zip(car_names, car_names2, images):
    cursor.execute(
        "insert into cardata (carname, carmodel, imageurl, location, Fro, T, companyid) values (?, ?, ?, ?, ?, ?, ?)",
        (name.text, name2.text, image.get_attribute('src'), location, pickup_d, return_d, Rental_ID)
    )
But now I am confused how to add these codes together
In your second piece of code, execute() is called and one specific object is stored in the database each loop iteration. This is slow and inefficient.
for price in fullpricelist:
    cursor.execute("""INSERT INTO cardata (fullprice) VALUES (?)""", (price,))
Note the 1-tuple (price,): sqlite3 expects the parameters as a sequence, not a bare value.
executemany() reads from an iterable and adds each element of the iterable to the database as a distinct row, so each element must itself be a sequence of parameters. If you add many rows to a database and care about efficiency, you want to use executemany():
cursor.executemany("""INSERT INTO cardata (fullprice) VALUES (?)""",
                   [(p,) for p in fullpricelist])
If you want to include the other columns from your question, zip the parallel lists together so each element passed to executemany() is one complete row:
cursor.executemany("""INSERT INTO cardata (carname, carmodel, imageurl, location, Fro, T, companyid) values (?, ?, ?, ?, ?, ?, ?)""",
    zip(
        [name.text for name in car_names],
        [name2.text for name2 in car_names2],
        [image.get_attribute('src') for image in images],
        [location] * len(car_names),
        [pickup_d] * len(car_names),
        [return_d] * len(car_names),
        [Rental_ID] * len(car_names),
    )
)
This assumes all values for location, pickup_d, return_d and Rental_ID are the same, as you did not provide a list of the values.
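The key point is that executemany() wants an iterable of rows, where each row is itself a sequence with one value per placeholder. A small runnable sketch of both shapes, using an in-memory table with invented columns:

```python
import sqlite3

fullpricelist = [373.97, 381.0, 398.98]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cardata (fullprice REAL, companyid INTEGER)")

# Single column: wrap each value in a 1-tuple.
con.executemany("INSERT INTO cardata (fullprice) VALUES (?)",
                [(p,) for p in fullpricelist])

# Several parallel lists: zip() turns column lists into row tuples.
ids = [7, 7, 7]
con.executemany("INSERT INTO cardata (fullprice, companyid) VALUES (?, ?)",
                zip(fullpricelist, ids))
con.commit()
print(con.execute("SELECT COUNT(*) FROM cardata").fetchone()[0])  # 6
```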

Shorten SQLite3 insert statement for efficiency and readability

From this answer:
cursor.execute("INSERT INTO booking_meeting (room_name,from_date,to_date,no_seat,projector,video,created_date,location_name) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", (rname, from_date, to_date, seat, projector, video, now, location_name ))
I'd like to shorten it to something like:
simple_insert(booking_meeting, rname, from_date, to_date, seat, projector, video, now, location_name)
The first parameter is the table name which can be read to get list of column names to format the first section of the SQLite3 statement:
cursor.execute("INSERT INTO booking_meeting (room_name,from_date,to_date,no_seat,projector,video,created_date,location_name)
Then the values clause (second part of the insert statement):
VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
can be formatted by counting the number of column names in the table.
I hope I explained the question properly and you can appreciate the time savings of such a function. How to write this function in python? ...is my question.
There may already be a simple_insert() function in sqlite3, but I just haven't stumbled across it yet.
If you're inserting into all the columns, then you don't need to specify the column names in the INSERT query. For that scenario, you could write a function like this:
def simple_insert(cursor, table, *args):
    query = f'INSERT INTO {table} VALUES (' + '?, ' * (len(args) - 1) + '?)'
    cursor.execute(query, args)
For your example, you would call it as:
simple_insert(cursor, 'booking_meeting', rname, from_date, to_date, seat, projector, video, now, location_name)
Note I've chosen to pass cursor to the function, you could choose to just rely on it as a global variable instead.
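The function can be exercised against an in-memory SQLite table (table name kept from the question, columns trimmed down for the demo):

```python
import sqlite3

def simple_insert(cursor, table, *args):
    # Build "INSERT INTO t VALUES (?, ?, ...)" with one placeholder per argument
    query = f'INSERT INTO {table} VALUES (' + '?, ' * (len(args) - 1) + '?)'
    cursor.execute(query, args)

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE booking_meeting (room_name TEXT, no_seat INTEGER)")

simple_insert(cur, 'booking_meeting', 'Room A', 12)
con.commit()
print(cur.execute("SELECT * FROM booking_meeting").fetchone())  # ('Room A', 12)
```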

PYODBC Insert Statement in MS Access DB Extremely slow

I am looking to speed up my INSERT statements into an Access DB. The data is only 86,500 records, yet it is taking more than 24 hours to process. The part of the code I am looking to speed up compares two tables for duplicates; if no duplicate is found, that row is inserted. I am running 64-bit Windows 10, 32-bit Python 2.7, the 32-bit MS Access ODBC driver, and the 32-bit pyodbc module. Any help would be greatly appreciated; the code sample is below.
def importDIDsACC():
    """Compare the Ledger to ImportDids to find any missing records"""
    imdidsLst = []
    ldgrLst = readMSAccess("ActivityNumber", "Ledger")
    for x in readMSAccess("DISP_NUM", "ImportDids"):
        if x not in ldgrLst and x not in imdidsLst:
            imdidsLst.append(x)
    # Select the records to import
    if len(imdidsLst) > 0:
        sql = ""
        for row in imdidsLst:
            sql += "DISP_NUM = '" + row[0] + "' OR "
        cursor.execute("SELECT * FROM ImportDids WHERE " + sql[:-4])  # drop trailing " OR "
        rows = cursor.fetchall()
        # Import to Ledger
        dupChk = []
        for row in rows:
            if row[4] not in dupChk:
                cursor.execute('INSERT into Ledger ([ActivityNumber], [WorkArea], [ClientName], [SurfacePurpose], [OpsApsDist], [AppDate], [LOADate], [EffDate], [AmnDate], [CanDate], [RenDate], [ExpDate], [ReiDate], [AmlDate], [DispType], [TRM], [Section], [Quarter], [Inspected_Date], [Inspection_Reason], [Inspected_By], [InspectionStatus], [REGION], [DOC], [STATCD]) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)',
                               str(row[1]), str(row[18]), str(row[17]), row[14], str(row[26]), row[4], row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], str(row[1][0:3]), trmCal(str(row[21]), str(row[20]), str(row[19])), str(row[22]), str(row[23]), inspSts(str(row[1]), 0), inspSts(str(row[1]), 1), inspSts(str(row[1]), 2), inspSts(str(row[1]), 3), str(row[27]), str(row[3]), str(row[13]))
                dupChk.append(row[4])
        cnxn.commit()

def readMSAccess(columns, table):
    """Select all records from the chosen field"""
    sql = "SELECT " + columns + " FROM " + table
    cursor.execute(sql)
    rows = cursor.fetchall()
    return rows

def dbConn():
    """Connects to the Access database"""
    connStr = """
    DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};
    DBQ=""" + getDatabasepath() + ";"
    cnxn = pyodbc.connect(connStr)
    cursor = cnxn.cursor()
    return cursor, cnxn

def getDatabasepath():
    """Get the path to the Access database"""
    mycwd = os.getcwd()
    os.chdir("..")
    dataBasePath = os.getcwd() + os.sep + "LandsAccessTool.accdb"
    os.chdir(mycwd)
    return dataBasePath

# Connect to the Access database
cursor, cnxn = dbConn()

# Update the Ledger with any new records from ImportDids
importDIDsACC()
Don't use external code to check for duplicates. The power of a database (even Access) is maximizing its data-set operations. Don't try to rewrite that kind of code, especially since as you've discovered it is not efficient. Instead, import everything into a temporary database table, then use Access (or the appropriate Access Data Engine) to execute SQL statements to compare tables, either finding or excluding duplicate rows. Results of those queries can then be used to create and/or update other tables--all within context of the database engine. Of course set up the temporary table(s) with appropriate indexes and keys to maximize the efficiency.
That said, when comparing data sets locally (i.e. tables) it is usually faster (am I allowed to say always?) to load all the key values into a searchable collection with a single database request (i.e. one SQL SELECT statement), then use that in-memory collection to search for matches. This may seem ironic after my last statement about maximizing the database's capabilities, but the big idea is understanding how the data set as a whole is being processed. Transporting data back and forth between the Python process and the database engine, even on the same machine, will be much slower than processing everything within either the Python process or the database engine process. The only time that might not be useful is when the remote data set is much too large to download, but 87,000 key values is definitely small enough to load into a Python collection.
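On the in-memory side, the biggest single win in the posted code is replacing the list membership tests (x not in ldgrLst, which scan the whole list on every lookup) with a set, turning each check into a hash lookup. A sketch of the idea with made-up key data standing in for the two SELECT results:

```python
# Simulated fetchall() results: lists of 1-tuples, as pyodbc-style rows.
ledger_keys = [("A%05d" % i,) for i in range(50_000)]
import_keys = [("A%05d" % i,) for i in range(49_990, 50_010)]

# List version would be O(n) per lookup; building a set makes each check O(1).
seen = set(ledger_keys)
missing = [k for k in import_keys if k not in seen]
print(len(missing))  # 10
```

With ~87,000 keys this changes the duplicate scan from billions of comparisons to one pass.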

Sqlite not using default values

When using:
import datetime
import sqlite3
db = sqlite3.connect('mydb.sqlite', detect_types=sqlite3.PARSE_DECLTYPES)
c = db.cursor()
db.text_factory = str
c.execute('create table if not exists mytable (date timestamp, title str, \
custom str, x float, y float, z char default null, \
postdate timestamp default null, id integer primary key autoincrement, \
url text default null)')
c.execute('insert into mytable values(?, ?, ?, ?, ?)', \
(datetime.datetime(2018,4,23,23,00), 'Test', 'Test2', 2.1, 11.1))
I have:
sqlite3.OperationalError: table mytable has 9 columns but 5 values were supplied
Why doesn't SQlite take default values (specified during table creation) in consideration to populate a new row?
(Also, as I'm reopening a project I wrote a few years ago, I don't find the datatypes str, char anymore in the sqlite3 doc, is it still relevant?)
Because by omitting the column list you are saying that you will supply values for all columns.
Change 'insert into mytable values(?, ?, ?, ?, ?)'
to 'insert into mytable (date, title, custom, x, y) values(?, ?, ?, ?, ?)'
Virtually any value for column type can be specified, the value will follow a set of rules and be converted to TEXT, INTEGER, REAL, NUMERIC or BLOB. However, you can store any type of value in any column.
STR will resolve to NUMERIC,
TIMESTAMP will resolve to NUMERIC,
FLOAT will resolve to REAL,
CHAR to TEXT.
Have a read of Datatypes In SQLite or perhaps have a look at How flexible/restricive are SQLite column types?
If you're going to only supply values for some columns, you need to specify which columns. Otherwise the engine won't know where to put them. This line needs to be changed:
c.execute('insert into mytable values(?, ?, ?, ?, ?)', \
(datetime.datetime(2018,4,23,23,00), 'Test', 'Test2', 2.1, 11.1))
To this:
c.execute('insert into mytable (date, title, custom, x, y) values(?, ?, ?, ?, ?)', \
(datetime.datetime(2018,4,23,23,00), 'Test', 'Test2', 2.1, 11.1))
Example Solution
cursor.execute('CREATE TABLE vehicles_record (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)')
SQLite3 Query Above
cursor.execute("INSERT INTO vehicles_record(name) VALUES(?)", (name))
Result
id would be 1, name would be the value of the name variable, and the current timestamp would fill the last column.
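A runnable sketch of the same point: naming only some columns lets the declared defaults (and the autoincrement key) fill in the rest. Note the trailing comma in ("tesla",): without it, (name) is just a parenthesized string, not a 1-tuple.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute('CREATE TABLE vehicles_record ('
            'id INTEGER PRIMARY KEY AUTOINCREMENT, '
            'name TEXT, '
            'timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)')

# Only name is supplied; id and timestamp come from the table definition.
cur.execute("INSERT INTO vehicles_record (name) VALUES (?)", ("tesla",))
con.commit()

row = cur.execute("SELECT id, name, timestamp FROM vehicles_record").fetchone()
print(row[0], row[1])  # 1 tesla
```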
