What would be the best way to get the PK of the following:
self.cursor.execute('INSERT IGNORE INTO table (url, country) VALUES (%s, %s)', (line['url'], line['country']))
In other words, if it's already there, I would need to get that PK, but if it's not there, it would be INSERTing and then getting the LAST_INSERT_ID. Is there a way to do this without doing three queries? What would be the best way to do this pattern?
To get the LAST_INSERT_ID while inserting data, don't use INSERT IGNORE. Instead, use the ON DUPLICATE KEY UPDATE clause to get the id:
INSERT INTO table (url, country)
VALUES (%s, %s)
ON DUPLICATE KEY
UPDATE
id = LAST_INSERT_ID(id);
where id is the AUTO_INCREMENT column of your table.
You'd still need another query to fetch the updated LAST_INSERT_ID now.
I think the most straightforward way to do this without altering previous data would be to do an INSERT IGNORE followed by a SELECT to retrieve the id.
cursor.execute('INSERT IGNORE INTO...')
cursor.execute('SELECT id FROM table...')
id = cursor.fetchone()[0]
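The two-step pattern above could be wrapped in a small helper. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the `pages` table and the `get_or_create_id` function are hypothetical names. SQLite spells the statement `INSERT OR IGNORE` and uses `?` placeholders; with a MySQL driver you would presumably write `INSERT IGNORE` and `%s` instead.

```python
import sqlite3


def get_or_create_id(conn, url, country):
    """Return the primary key for url, inserting the row first if it is missing."""
    cur = conn.cursor()
    # Does nothing if a row with this url already exists (url is UNIQUE).
    cur.execute(
        "INSERT OR IGNORE INTO pages (url, country) VALUES (?, ?)",
        (url, country),
    )
    # The SELECT runs whether or not the INSERT added a row.
    cur.execute("SELECT id FROM pages WHERE url = ?", (url,))
    return cur.fetchone()[0]


# Hypothetical table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT UNIQUE, country TEXT)"
)

first = get_or_create_id(conn, "http://example.com", "US")
second = get_or_create_id(conn, "http://example.com", "US")
row_count = conn.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
```

Both calls return the same id and the table ends up with a single row, at the cost of two statements per call.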
I am trying to load the data inside the table trial and it says Invalid Column name - Name.
I am passing values inside Name and Area dynamically.
cursor.execute("insert into trial (NameofTheProperty, AreaofTheProperty)
values (Name, Area)")
cnxn.commit()
You need to put single quotes around the values so that they are treated as string literals rather than as column names:
insert into
trial (NameofTheProperty, AreaofTheProperty)
values
('Name', 'Area')
Now, since you mentioned that you dynamically insert these values into the query, you can just let your database driver handle the quotes and other things like type conversions:
property_name = "Name"
property_area = "Area"
cursor.execute("""
insert into
trial (NameofTheProperty, AreaofTheProperty)
values
(?, ?)""", (property_name, property_area))
cnxn.commit()
This is called query parameterization and is considered the safest and the most robust way to insert values into the SQL queries. These ? values are called "placeholders".
Note that the database driver will take care of quoting the string values automatically - no need to do it manually.
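To see why placeholders are more robust than quoting by hand, here is a small self-contained sketch. It assumes Python's built-in sqlite3 module (which also uses `?` placeholders) in place of the driver from the question, and the sample values are made up. The name deliberately contains single quotes, which would break a query assembled by string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trial (NameofTheProperty TEXT, AreaofTheProperty TEXT)")

# Made-up values; the embedded quotes need no manual escaping.
property_name = "O'Brien's Cottage"
property_area = "120"

conn.execute(
    "INSERT INTO trial (NameofTheProperty, AreaofTheProperty) VALUES (?, ?)",
    (property_name, property_area),
)
conn.commit()

# Read the row back to confirm the value was stored unchanged.
stored = conn.execute(
    "SELECT NameofTheProperty, AreaofTheProperty FROM trial"
).fetchone()
```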
I am trying to use the INSERT INTO ... ON DUPLICATE KEY UPDATE clause in Python to update MySQL records where name is the primary key. If the name exists, update the record's age column; otherwise insert it:
sql = """INSERT INTO mytable(name, age) \
VALUES ('Tim',30),('Sam',21),('John','35') \
ON DUPLICATE KEY UPDATE age=VALUES(age)"""
with db.connection() as conn:
    with conn.cursor() as cursor:
        cursor.execute(sql)
        if cursor.rowcount == 0:
            result = 'UPDATE'
        else:
            result = 'INSERT'
I want to find out whether this execution has added one or more new rows or not. But cursor.rowcount is not correct for each insert and update. Any comments about that?
I ran into this problem before, where I wanted to know if my insert was successful or not. My short-term solution was to call a count(*) on the table before and after the insert and compare the numbers.
I never found a way to determine which action you have used for both INSERT IGNORE and INSERT ... ON DUPLICATE KEY.
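The count-before-and-after workaround might look roughly like this. It is a sketch only: sqlite3 stands in for MySQL, `INSERT OR IGNORE` is SQLite's spelling of `INSERT IGNORE`, and `count_new_rows` is a hypothetical helper. The two counts are only trustworthy if no other session writes to the table in between.

```python
import sqlite3


def count_new_rows(conn, sql, rows):
    """Run sql once per row and return how many rows the table gained."""
    before = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
    conn.executemany(sql, rows)
    after = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO mytable (name, age) VALUES ('Tim', 29)")  # pre-existing row

sql = "INSERT OR IGNORE INTO mytable (name, age) VALUES (?, ?)"
added = count_new_rows(conn, sql, [("Tim", 30), ("Sam", 21), ("John", 35)])
has_new_rows = added > 0
```

Here 'Tim' already exists, so only 'Sam' and 'John' count as new rows.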
Just to add more clarification to the previous answer.
With cursor.rowcount, it is particularly hard to achieve your goal when inserting multiple rows.
The reason is that rowcount returns the number of affected rows.
Here is how it is defined:
The affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. (https://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html)
So, to solve your problem you will need to do count(*) before insert and after the insert.
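A little arithmetic shows why a single affected-rows number cannot be decoded for a multi-row statement. The function below (a hypothetical helper, plain Python with no database involved) enumerates every split of inserted, updated, and unchanged rows that is consistent with the 1/2/0 rule quoted above.

```python
def possible_outcomes(total_rows, affected_rows):
    """List every (inserted, updated, unchanged) split matching the affected-rows value."""
    outcomes = []
    for updated in range(total_rows + 1):
        # Each insert contributes 1 and each update contributes 2 to affected_rows.
        inserted = affected_rows - 2 * updated
        unchanged = total_rows - inserted - updated
        if inserted >= 0 and unchanged >= 0:
            outcomes.append((inserted, updated, unchanged))
    return outcomes


# Three rows sent, affected-rows value of 2.
outcomes = possible_outcomes(3, 2)
# → [(2, 0, 1), (0, 1, 2)]
```

With three rows and an affected-rows value of 2, the statement may have inserted two rows or updated one, and rowcount alone cannot tell which.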
I am using the Python-MySQL (MySQLdb) library to insert values into a database. I want to avoid duplicate entries from being inserted into the database, so I have added the unique constraint to that column in MySQL. I am checking for duplicates in the title column. In my Python script, I am using the following statement:
cursor.execute ("""INSERT INTO `database` (title, introduction) VALUES (%s, %s)""", (title, pure_introduction))
Now when a duplicate entry is added to the database, it will produce an error. I do not want an error message to appear; I just want that if a duplicate entry is found then it should simply not enter that value into the database. How do I do this?
You can utilize the INSERT IGNORE syntax to suppress this type of error.
If you use the IGNORE keyword, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.
In your case, the query would become:
INSERT IGNORE INTO `database` (title, introduction) VALUES (%s, %s)
Aside from what @Andy suggested (which should really be posted as an answer), you can also catch the exception in Python and silence it:
try:
    cursor.execute ("""INSERT INTO `database` (title, introduction) VALUES (%s, %s)""", (title, pure_introduction))
except MySQLdb.IntegrityError:
    pass # or maybe at least log?
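For a version that can be run without a MySQL server, here is the same idea sketched with the built-in sqlite3 module, whose `sqlite3.IntegrityError` plays the role of `MySQLdb.IntegrityError`. The `articles` table and the `insert_unless_duplicate` helper are hypothetical. Keep in mind that an integrity error can also come from other constraint violations (NOT NULL, for example), so silencing it hides those too.

```python
import sqlite3


def insert_unless_duplicate(conn, title, introduction):
    """Insert the row and return True, or return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO articles (title, introduction) VALUES (?, ?)",
            (title, introduction),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised for the UNIQUE violation on title (and for other constraint failures).
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT UNIQUE, introduction TEXT)")

first = insert_unless_duplicate(conn, "Hello", "first version")
second = insert_unless_duplicate(conn, "Hello", "second version")
kept = conn.execute("SELECT introduction FROM articles").fetchall()
```

The second call is rejected and the original row is left untouched.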
here is the table
CREATE TABLE IF NOT EXISTS kompas_url
(
id BIGINT(20) NOT NULL AUTO_INCREMENT,
url VARCHAR(1000),
created_date datetime,
modified_date datetime,
PRIMARY KEY(id)
)
I am trying to INSERT into the kompas_url table only if the url does not exist yet
any idea?
thanks
You can either find out whether it's in there first, by SELECTing by url, or you can make the url field unique:
CREATE TABLE IF NOT EXISTS kompas_url
...
url VARCHAR(1000) UNIQUE,
...
)
This will stop MySQL from inserting a duplicate row, but it will also report an error when you try and insert. This isn't good—although we can handle the error, it might disguise others. To get around this, we use the ON DUPLICATE KEY UPDATE syntax:
INSERT INTO kompas_url (url, created_date, modified_date)
VALUES ('http://example.com', NOW(), NOW())
ON DUPLICATE KEY UPDATE modified_date = NOW()
This allows us to provide an UPDATE statement in the case of a duplicate value in a unique field (this can include your primary key). In this case, we probably want to update the modified_date field with the current date.
EDIT: As suggested by ~unutbu, if you don't want to change anything on a duplicate, you can use the INSERT IGNORE syntax. This simply works as follows:
INSERT IGNORE INTO kompas_url (url, created_date, modified_date)
VALUES ('http://example.com', NOW(), NOW())
This simply turns certain kinds of errors into warnings, most usefully the error that states there will be a duplicate unique entry. If you place the keyword IGNORE into your statement, you won't get an error; the duplicate row will simply be skipped. In complex queries, this may also hide other errors that might be useful though, so it's best to make doubly sure your code is correct if you want to use it.
I'm using PyGreSQL to access my DB. In the use case I'm currently working on, I am trying to insert a record into a table and return the last rowid... aka the value that the DB created for my ID field:
create table job_runners (
id SERIAL PRIMARY KEY,
hostname varchar(100) not null,
is_available boolean default FALSE
);
sql = "insert into job_runners (hostname) values ('localhost')"
When I used the db.insert(), which made the most sense, I received an "AttributeError". And when I tried db.query(sql) I get nothing but an OID.
Q: Using PyGreSQL what is the best way to insert records and return the value of the ID field without doing any additional reads or queries?
INSERT INTO job_runners
(hostname,is_available) VALUES ('localhost',true)
RETURNING id
That said, I have no idea about pygresql, but by what you've already written, I guess it's db.query() that you want to use here.
The documentation in PyGreSQL says that if you call dbconn.query() with an insert/update statement, it will return the OID. It goes on to say something about lists of OIDs when there are multiple rows involved.
First of all, I found that the OID features did not work. I suppose knowing the version numbers of the libs and tools would have helped; however, I was not trying to return the OID.
Finally, by appending "returning id", as suggested by @hacker, pygresql simply did the right thing and returned a record-set with the ID in the resulting dictionary (see code below).
sql = "insert into job_runners (hostname) values ('localhost') returning id"
rv = dbconn.query(sql)
id = rv.dictresult()[0]['id']
Assuming you have a cursor object cur:
cur.execute("INSERT INTO job_runners (hostname) VALUES (%(hostname)s) RETURNING id",
{'hostname': 'localhost'})
id = cur.fetchone()[0]
This ensures PyGreSQL correctly escapes the input string, preventing SQL injection.