I'm trying to upsert into SQL Server from Python.
I have scraped a website, converted the result to a DataFrame, and I'm already inserting it into my database.
What I need: when a scraped value (the item price, for example) differs from what is stored, update the row; if the id does not exist, insert it.
Here is my code:
for index, row in df.iterrows():
    cursor.execute("""INSERT INTO db_demo1.[dbo].[scrape]
        (market, product_id, section_item, title_item, title_item_new, price_item,
        qty, unit, sku, product_image, url, delivery_available,
        delivery_long_distance, barcode, scrape_date) values(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
        row.market, row.product_id, row.section_item, row.title_item, row.title_item_new,
        row.price_item, row.qty, row.unit, row.sku, row.product_image, row.url,
        row.delivery_available, row.delivery_long_distance, row.barcode, row.scrape_date)
cnxn.commit()
cursor.close()
The data that I have is something like this:
In this case, for example, it should look up the product_id and check whether price_item or any other column has changed. If it has, it should replace the stored value with the new one and also set scrape_date to the new date.
@Charlieface
I tried several solutions and none of them worked.
First I tried the articles that you posted. I had to change some parameters because they were raising errors, and this is what I ended up with:
'''
BEGIN TRANSACTION;
DECLARE @val as float
DECLARE @pid as int
DECLARE @pk as int
SET IDENTITY_INSERT scrape ON
select * from scrape;
UPDATE dbo.scrape WITH (UPDLOCK, SERIALIZABLE) SET [price_item] = @val, [product_id] = @pid WHERE id = @pk;
GO
IF @@ROWCOUNT = 0
BEGIN
    INSERT INTO dbo.scrape([id], [product_id], [price_item]) VALUES(1, 1000, 17.5);
    SELECT * FROM scrape
END
SET IDENTITY_INSERT scrape OFF;
COMMIT TRANSACTION;
'''
Errors:
Msg 1088, Level 16, State 11, Line 11
Cannot find the object "scrape" because it does not exist or you do not have permissions.
Msg 208, Level 16, State 1, Line 19
Invalid object name 'dbo.scrape'.
I'm the admin, by the way, so I don't understand the permissions part.
Second attempt:
'''
INSERT INTO db_demo1.dbo.scrape(id, product_id, price_item) VALUES(1, 'X', 'X'); -- to be updated
SELECT * FROM db_demo1.dbo.scrape;
MERGE scrape trg
USING (VALUES ('1','2','3'),
              ('C','D','E'),
              ('F','G','H'),
              ('I','J','K')) src(id, product_id, price_item)
ON trg.id = src.id
WHEN MATCHED THEN
    UPDATE SET product_id = src.product_id, price_item = src.price_item
WHEN NOT MATCHED THEN
    INSERT(id, product_id, price_item)
    VALUES(src.id, src.product_id, src.price_item);
SELECT * FROM scrape;
'''
Same error: Invalid object name 'db_demo1.dbo.scrape'.
Can someone please help?
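For reference, the behaviour described in the question (update when product_id already exists, insert otherwise) could be sketched from the Python side roughly as follows. This is only a sketch under assumptions: build_merge_sql is a hypothetical helper, the column list is shortened to three of the question's columns, product_id is assumed to be the matching key, and the pyodbc loop is shown as a comment because it needs a live connection. It does not address the "Invalid object name" error itself.

```python
def build_merge_sql(table, key, columns):
    """Return a T-SQL MERGE statement with one `?` placeholder per column.

    `table`, `key` and `columns` must be trusted identifiers, never user input.
    """
    placeholders = ", ".join("?" for _ in columns)
    col_list = ", ".join(columns)
    # Every column except the key is overwritten when the key already exists.
    updates = ", ".join(f"{c} = src.{c}" for c in columns if c != key)
    src_cols = ", ".join(f"src.{c}" for c in columns)
    return (
        f"MERGE {table} WITH (HOLDLOCK) AS trg\n"
        f"USING (VALUES ({placeholders})) AS src ({col_list})\n"
        f"ON trg.{key} = src.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({src_cols});"
    )


# Shortened column list, assumed for the example only.
COLUMNS = ["product_id", "price_item", "scrape_date"]
MERGE_SQL = build_merge_sql("db_demo1.dbo.scrape", "product_id", COLUMNS)
print(MERGE_SQL)

# Hypothetical usage with the question's cursor and DataFrame (not executed here):
# for row in df.itertuples(index=False):
#     cursor.execute(MERGE_SQL, row.product_id, row.price_item, row.scrape_date)
# cnxn.commit()
```

As written, every matched row is updated, so scrape_date is refreshed even when nothing else changed. If only changed rows should be touched, a condition can be added to the matched branch, for example WHEN MATCHED AND trg.price_item <> src.price_item THEN UPDATE.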
I'm trying to add data collected from an API to a sqlite3 database, then fetch the next page and add its data too, repeating until there is no next page (next_is_after).
Loading the data works well (thanks to another user). When I load it as a list the formatting is wrong; when I load it as a dictionary it works, but I'm struggling to get it into the database.
I have no clue how to loop over the pages, because the next page is not identified by a "normal" page number but by a code (e.g. "WzExNDEuMTYsNTYwNTkzNTFd").
My current code is below. I would appreciate any help.
import requests
import json
import sqlite3

con = sqlite3.connect('takealot.db')
cur = con.cursor()
cur.execute('''CREATE TABLE IF NOT EXISTS products
    (sku text PRIMARY KEY, title text, slug text, reviews integer, star_rating real, listing_price real, pretty_price real)''')

baseurl = 'https://api.takealot.com/rest/v-1-10-0/searches/products,filters,facets,sort_options,breadcrumbs,slots_audience,context,seo'
endpoint = '?after=WzExNDEuMTYsNTYwNTkzNTFd'

def main_request(baseurl, endpoint):
    res = requests.get(baseurl + endpoint)
    return res.json()

daten = main_request(baseurl, endpoint)
next_is_after = daten['sections']['products']['paging']['next_is_after']

prodlist = []
for data in daten['sections']['products']['results']:
    prod = {
        'sku': data['product_views']['core']['id'],
        'title': data['product_views']['core']['title'],
        'slug': data['product_views']['core']['slug'],
        'reviews': data['product_views']['core']['reviews'],
        'star_rating': data['product_views']['core']['star_rating'],
        'listing_price': data['product_views']['buybox_summary']['listing_price'],
        'pretty_price': data['product_views']['buybox_summary']['pretty_price']
    }
    prodlist.append(prod)

cur.executemany("INSERT OR IGNORE INTO products VALUES (?, ?, ?, ?, ?, ?, ?)", [prod['sku'], prod['title'], prod['slug'], prod['reviews'], prod['star_rating'], prod['listing_price'], prod['pretty_price']])
con.commit()

for row in cur.execute('''SELECT * FROM products'''):
    print(row)
You just have to wrap it in a while loop, like this:
import requests
import json
import sqlite3

con = sqlite3.connect('takealot.db')
cur = con.cursor()
cur.execute('''CREATE TABLE IF NOT EXISTS products
    (sku text PRIMARY KEY, title text, slug text, reviews integer, star_rating real, listing_price real, pretty_price real)''')

baseurl = 'https://api.takealot.com/rest/v-1-10-0/searches/products,filters,facets,sort_options,breadcrumbs,slots_audience,context,seo'
endpoint = '?after='

def main_request(baseurl, endpoint):
    res = requests.get(baseurl + endpoint)
    return res.json()

# This seems to work
next_is_after = "\"\""
while next_is_after != "":
    daten = main_request(baseurl, endpoint + next_is_after)
    next_is_after = daten['sections']['products']['paging']['next_is_after']
    prodlist = []
    for data in daten['sections']['products']['results']:
        prod = (
            data['product_views']['core']['id'],
            data['product_views']['core']['title'],
            data['product_views']['core']['slug'],
            data['product_views']['core']['reviews'],
            data['product_views']['core']['star_rating'],
            data['product_views']['buybox_summary']['listing_price'],
            data['product_views']['buybox_summary']['pretty_price'])
        prodlist.append(prod)
    cur.executemany("INSERT OR IGNORE INTO products VALUES (?, ?, ?, ?, ?, ?, ?)", prodlist)
    con.commit()

for row in cur.execute('''SELECT * FROM products'''):
    print(row)
Each response gives you the starting point for the next request, and so on.
This assumes that (as with previous_is_before) the value is just an empty string when you reach the end.
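To see the cursor-following loop on its own, without depending on the live API, here is a small sketch with a stubbed fetch function. FAKE_PAGES, fetch_page and load_all are made-up names for illustration, the table is cut down to two columns, and the response shape is flattened compared with the real one. Like the code above, it assumes the last page reports an empty next_is_after.

```python
import sqlite3

# Stand-in for the API: maps an `after` cursor to (rows, next cursor).
FAKE_PAGES = {
    "": ([("sku1", "First"), ("sku2", "Second")], "abc"),
    "abc": ([("sku3", "Third")], ""),  # an empty cursor marks the last page
}


def fetch_page(after):
    """Hypothetical replacement for main_request(); no network involved."""
    rows, next_is_after = FAKE_PAGES[after]
    return {"results": rows, "next_is_after": next_is_after}


def load_all(con, fetch):
    """Follow the cursor until it is empty, storing each page as it arrives."""
    after = ""
    pages = 0
    while True:
        page = fetch(after)
        con.executemany("INSERT OR IGNORE INTO products VALUES (?, ?)", page["results"])
        con.commit()
        pages += 1
        after = page["next_is_after"]
        if not after:  # also stops on None, in case the value is missing
            break
    return pages


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, title TEXT)")
pages_loaded = load_all(con, fetch_page)
stored = con.execute("SELECT sku FROM products ORDER BY sku").fetchall()
print(pages_loaded, stored)
```

Committing once per page means an interrupted run keeps what it has already fetched, and INSERT OR IGNORE makes re-running it harmless for rows that are already stored.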
I'm working on a project and I'm facing an issue when trying to update my database.
I'm running the following code:
import csv
import sqlite3

con = sqlite3.connect("DATASETS/SQLite.db")
cur = con.cursor()
with open("DATASETS/test.csv","r") as fin:
    # Explicit field names: every line of the file, including the first, is read as data
    dr = csv.DictReader(fin, ["element1", "element2", "element3", "element4", "element5", "element6", "element7", "element8", "id"])
    to_db = [(i["element1"], i["element2"], i["element3"], i["element4"], i["element5"], i["element6"], i["element7"], i["element8"], i["id"]) for i in dr]
cur.executemany("UPDATE Table SET element1 = ?, element2 = ?, element3 = ?, element4 = ?, element5 = ?, element6 = ?, element7 = ?, element8 = ? WHERE ID = ?;", to_db)
con.commit()
con.close()
The data is being filtered into a CSV dataset with 6000+ rows and "," as the delimiter.
The code runs without errors, but when I opened the DB to check, not a single row had been updated.
I don't know if this is a VSCode bug, a version problem (Python 3.8.3), or something else, and I wanted to ask whether anyone has seen this before.
Thanks!!
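One way to narrow this down (a diagnostic sketch, not a confirmed cause): in SQLite an UPDATE whose WHERE clause matches no rows is not an error, so the statement can run cleanly and still change nothing. Checking cursor.rowcount after executemany shows how many rows were actually modified. The items table and the sample values below are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
# Hypothetical stand-in for the real table, reduced to two columns.
cur.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, element1 TEXT)")
cur.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b")])

to_db = [("x", 1), ("y", 2), ("z", 99)]  # id 99 does not exist in the table
cur.executemany("UPDATE items SET element1 = ? WHERE id = ?", to_db)
con.commit()

# For executemany(), rowcount is the total number of rows modified.
updated = cur.rowcount
print(f"{updated} of {len(to_db)} rows matched")
```

If the count comes back as 0 with the real data, the values passed for the WHERE clause are not matching anything in the table, which points at the CSV contents or the file being opened rather than at the editor or the Python version.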
Dear all,
How can I check whether pos_cli in the database is equal to the variable pos_id? With the code below I get the following error.
cur.execute("CREATE TABLE IF NOT EXISTS Magnit_Coor (pos_cli INTEGER PRIMARY KEY, lat INTEGER, long INTEGER);")
cur.execute('SELECT * FROM Magnit_pos')
data = cur.fetchall()
while True:
    for coo in data:
        full_add = coo[6:11]
        pos_id = coo[0]
        print (pos_id)
        yand_add = ", ".join(full_add)
        g = cur.execute('SELECT EXISTS (SELECT * FROM Magnit_Coor WHERE pos_cli = (?))',pos_id)
        g = cur.fetchone()[0]
The error is below:
10001
Traceback (most recent call last):
  File "geoco.py", line 17, in <module>
    g = cur.execute('SELECT EXISTS (SELECT * FROM Magnit_pos WHERE pos_cli = (?))',pos_id)
ValueError: parameters are of unsupported type
The code that creates the Magnit_pos table, and pos_cli in particular, is below:
cur.execute("DROP TABLE IF EXISTS Magnit_Pos;")
cur.execute(
    "CREATE TABLE Magnit_Pos (pos_cli INTEGER PRIMARY KEY, magnit_name TEXT, codesfa TEXT, codewsot TEXT, pos_sap TEXT, source_dc TEXT, zip TEXT, region TEXT, area TEXT, city TEXT, street TEXT, house TEXT, build TEXT);")
with open('magnit.csv') as csvfile:
    magnit = csv.reader(csvfile, delimiter=';')
    print(magnit)
    for row in magnit:
        print(row[0])
        # to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8")]
        cur.execute("INSERT INTO Magnit_Pos (pos_cli, magnit_name, codesfa, codewsot, pos_sap, source_dc, zip, region, area, city, street, house, build) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);", row)
From Python's sqlite3 documentation (emphasis mine):
Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method.
So you should be using:
g = cur.execute('SELECT EXISTS (SELECT * FROM Magnit_Coor WHERE pos_cli = (?))',(pos_id,))
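A self-contained version of the same check, using an in-memory database so it can be run as is. The coor_exists helper and the sample row are made up for illustration; the only point is that the second argument to execute() is a one-element tuple.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE Magnit_Coor (pos_cli INTEGER PRIMARY KEY, lat REAL, long REAL)")
# Sample row, invented for the example.
cur.execute("INSERT INTO Magnit_Coor VALUES (?, ?, ?)", (10001, 55.75, 37.62))


def coor_exists(cur, pos_id):
    """Return True when Magnit_Coor already holds a row for pos_id."""
    # (pos_id,) is a one-element tuple; the trailing comma is what makes it one.
    cur.execute("SELECT EXISTS (SELECT 1 FROM Magnit_Coor WHERE pos_cli = ?)", (pos_id,))
    return bool(cur.fetchone()[0])


found = coor_exists(cur, 10001)
missing = coor_exists(cur, 10002)
print(found, missing)
```

Writing (pos_id) without the comma is just pos_id in parentheses, which is why the bare integer was rejected.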
I have a script like the one below:
import psutil
import sqlite3

DISK = {'1': ['C:\\', 'C:\\', 'NTFS', 'rw,fixed', '75.0Gb', '54.0Gb', '20.0Gb', '72.2%'], '2': ['D:\\', 'D:\\', 'NTFS', 'rw,fixed', '399.0Gb', '208.0Gb', '191.0Gb', '52.2%']}
conn = sqlite3.connect("Test.db")
c = conn.cursor()
result = c.execute("SELECT * FROM clientinfo WHERE IP = ?", ("192.168.10.111",))
if (len(result.fetchall()) > 0):
    for x in DISK :
        c.execute("UPDATE disk SET Device = ?, 'Mount Point' = ?, 'fstyle' = ?, 'opts' = ?, 'total' = ?, 'used' = ?, 'free' = ?, 'percent' = ? WHERE Client_IP = ?", (DISK[x][0], DISK[x][1], DISK[x][2], DISK[x][3], DISK[x][4], DISK[x][5], DISK[x][6], DISK[x][7], "192.168.10.111"))
else :
    for x in DISK :
        c.execute("INSERT INTO disk('Client_IP', 'Device', 'Mount Point', 'fstyle', 'opts', 'total', 'used', 'free', 'percent') VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", ("192.168.10.111", DISK[x][0], DISK[x][1], DISK[x][2], DISK[x][3], DISK[x][4], DISK[x][5], DISK[x][6], DISK[x][7]))
conn.commit()
conn.close()
The script checks the db for any record with IP "192.168.10.111". If the db already has a record for that IP, the script updates table disk with the data from dict DISK.
If the db doesn't have a record for IP "192.168.10.111", the script inserts the contents of DISK into the database.
The INSERT command works well, but the UPDATE command doesn't work the way I want. After the INSERT runs, table disk has two records, one for disk C and one for disk D, with the same value in column Client_IP (192.168.10.111).
After the UPDATE, the two records for IP "192.168.10.111" have the same value in every column, which is wrong. One record must contain the information about disk C and the other the information about disk D.
How can I make the UPDATE work correctly? The length of dict DISK depends on how many mounted devices the computer has, so I need to UPDATE in a for loop rather than with a static UPDATE.
Please tell me how to fix this.
Many thanks,
Francis
The column in your SELECT query looks inconsistent with your later UPDATE and INSERT queries: instead of IP, shouldn't it be CLIENT_IP? Also, you need to change how the column names are quoted in your query strings. Remove the single quotes from your column names; a name that contains a space, such as Mount Point, needs double quotes instead. Lastly, you can shorten your code by using a simple list comprehension and cursor.executemany:
import sqlite3

DISK = {'1': ['C:\\', 'C:\\', 'NTFS', 'rw,fixed', '75.0Gb', '54.0Gb', '20.0Gb', '72.2%'], '2': ['D:\\', 'D:\\', 'NTFS', 'rw,fixed', '399.0Gb', '208.0Gb', '191.0Gb', '52.2%']}
conn = sqlite3.connect("Test.db")
c = conn.cursor()
if not list(c.execute('SELECT * FROM clientinfo WHERE CLIENT_IP = ?', ("192.168.10.111",))):
    c.executemany('INSERT INTO disk (Client_IP, Device, "Mount Point", fstyle, opts, total, used, free, percent) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)', [["192.168.10.111", *i] for i in DISK.values()])
else:
    c.executemany('UPDATE disk SET Device = ?, "Mount Point" = ?, fstyle = ?, opts = ?, total = ?, used = ?, free = ?, percent = ? WHERE Client_IP = ?', [[*i, "192.168.10.111"] for i in DISK.values()])
# Without a commit the changes are discarded when the connection closes
conn.commit()
conn.close()
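Note that an UPDATE filtered only on Client_IP touches every row for that IP, so both disk rows end up holding whichever values were written last. A possible way around that, sketched here with a reduced and assumed schema (only Client_IP, Device, "Mount Point", fstyle and total), is to match on Device as well, so each set of values lands on its own row. This assumes Device is unique per client.

```python
import sqlite3

# Shortened version of DISK: device, mount point, filesystem, total size.
DISK = {
    "1": ["C:\\", "C:\\", "NTFS", "75.0Gb"],
    "2": ["D:\\", "D:\\", "NTFS", "399.0Gb"],
}
IP = "192.168.10.111"

conn = sqlite3.connect(":memory:")
c = conn.cursor()
# Reduced, assumed schema: only some of the question's columns are kept.
c.execute('CREATE TABLE disk (Client_IP TEXT, Device TEXT, "Mount Point" TEXT, fstyle TEXT, total TEXT)')
c.executemany("INSERT INTO disk VALUES (?, ?, ?, ?, ?)", [[IP, *v] for v in DISK.values()])

# New readings for the same two devices.
DISK["1"][3] = "80.0Gb"
DISK["2"][3] = "400.0Gb"

# Matching on Device as well as Client_IP keeps each row tied to one disk.
c.executemany(
    'UPDATE disk SET "Mount Point" = ?, fstyle = ?, total = ? WHERE Client_IP = ? AND Device = ?',
    [[v[1], v[2], v[3], IP, v[0]] for v in DISK.values()],
)
conn.commit()

rows = c.execute("SELECT Device, total FROM disk ORDER BY Device").fetchall()
print(rows)
```

This only updates devices that already have a row. A disk mounted after the first INSERT would still need to be inserted, and a disk that has disappeared would keep its old row.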