time complexity to get auto incremented row id in mysql - python

I am a newbie at mysql and databases.
I have a simple question. I have created a table with an integer id column that is auto-incremented. After each insert I get the id of the last inserted row (in Python, using cursor.lastrowid or connection.insert_id()). I wanted to know: what is the time complexity in MySQL of getting this value?
I am guessing it's O(1), as the database should be storing this value somewhere and updating it after each insert?
Thanks.

cursor.lastrowid will return the value from the single insert on that cursor;
see: http://www.python.org/dev/peps/pep-0249/
connection.insert_id() has to make a separate call to get the last insert id, and would be slightly slower.
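A minimal sketch of both approaches, assuming the MySQLdb (mysqlclient) driver since the question mentions both cursor.lastrowid and connection.insert_id(); the connection parameters and the items table are made-up placeholders:

    # Minimal sketch, assuming the MySQLdb (mysqlclient) driver; the
    # connection parameters and the `items` table are placeholders.
    import MySQLdb

    conn = MySQLdb.connect(user="user", passwd="secret", db="test")
    cur = conn.cursor()

    cur.execute("INSERT INTO items (name) VALUES (%s)", ("widget",))

    # cursor.lastrowid is filled in from the INSERT's result packet, so
    # reading it is effectively O(1) -- no extra query is issued.
    print(cur.lastrowid)

    # connection.insert_id() reports the last insert id for this connection;
    # the value is tracked per connection, so other sessions cannot
    # overwrite it between your INSERT and this call.
    print(conn.insert_id())

    conn.commit()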

Related

Are there disadvantages to making all columns except the primary key column in a table a unique index?

I want to avoid making duplicate records, but on some occasions, when updating a record, the values I receive are exactly the same as the record's current version. This results in 0 affected rows, which is a value I retain to help me determine whether I need to insert a new transaction.
I've tried using a SELECT statement to look for the exact transaction, but some fields (out of many) can be NULL, which doesn't work well when all my WHERE clauses are built as 'field1 = %s'; for NULL values I'd need something like 'field1 IS NULL' instead to get an accurate result back (a NULL-safe alternative is sketched after this question).
My last thought is using a unique index on all of the columns except the one for the table's primary key, but I'm not too familiar with using unique indexes. Should I be able to update these records after the fact? Are there risks to consider when implementing this solution?
Or is there another way I can tell whether I have an unchanged transaction or a new one when provided with values to update with?
The language I'm using is Python with mysql.connector
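One possible way around the NULL comparison problem is MySQL's NULL-safe equality operator <=>, which treats NULL <=> NULL as true, so the same parameterized WHERE clause works whether a value is NULL or not. Below is a rough sketch using mysql.connector; the transactions table and its columns are hypothetical, not from the question:

    # Rough sketch: detect an identical existing row even when some values
    # are NULL, using MySQL's NULL-safe equality operator <=>.
    # The table and column names are hypothetical.
    import mysql.connector

    conn = mysql.connector.connect(user="user", password="pw", database="test")
    cur = conn.cursor()

    values = ("abc", None)  # None is sent as SQL NULL

    # field <=> %s is true when both sides are NULL, unlike field = %s,
    # so one parameterized query covers NULL and non-NULL values alike.
    cur.execute(
        "SELECT COUNT(*) FROM transactions WHERE field1 <=> %s AND field2 <=> %s",
        values,
    )
    (existing,) = cur.fetchone()

    if existing == 0:
        cur.execute(
            "INSERT INTO transactions (field1, field2) VALUES (%s, %s)",
            values,
        )
        conn.commit()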

Insertion into SQL Server based on condition

I have a table in SQL Server, and the table already has data for the month of November. I have to insert data for the previous months, January through October. I have the data in a spreadsheet. I want to do a bulk insert using Python. I have successfully established the connection to the server using Python and am able to access the table. However, I don't know how to insert data above the rows that are already present in the table on the server. The table doesn't have any constraints, primary keys or indexes.
I am not sure whether an insertion based on such a condition is possible. If it is possible, kindly share some clues.
Notes: I don't have access to SSIS. I can't do the insertion using "BULK INSERT" because I can't map my shared drive to the SQL Server machine. That's why I have decided to use a Python script to do the operation.
SQL Server Management Studio is just the GUI for interacting with SQL Server.
However, I don't know how to insert data above the rows that are already present in the table on the server
Tables are ordered or structured based on the clustered index. Since you don't have one (you said there aren't any PKs or indexes), inserting the records "below" or "above" won't happen. A table without a clustered index is called a heap, which is what you have.
Thus, just insert the data. The order will be determined by any ORDER BY clause you place on a statement (at least, the order of the results) or by the clustered index on the table if you create one.
I assume you think your data is ordered because, by chance, when you run select * from table your results appear to be in the same order each time. However, this blog will show you that this isn't guaranteed, and it elaborates on the fact that your results truly aren't ordered without an ORDER BY clause.
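A minimal sketch of the "just insert, then ORDER BY when you read" idea, assuming pyodbc for the connection (the question only says a Python connection was established); the DSN, the monthly_data table and the report_date column are placeholders:

    # Rough sketch with pyodbc: the DSN, table and column names are assumed.
    import pyodbc

    conn = pyodbc.connect("DSN=my_sql_server;Trusted_Connection=yes")
    cur = conn.cursor()

    rows = [
        ("2019-01-15", 100.0),
        ("2019-02-15", 110.0),
        # ... remaining months loaded from the spreadsheet
    ]

    # In a heap, rows have no inherent position, so insert order is irrelevant.
    cur.executemany(
        "INSERT INTO monthly_data (report_date, amount) VALUES (?, ?)", rows
    )
    conn.commit()

    # The only reliable ordering is the one you ask for at query time.
    cur.execute(
        "SELECT report_date, amount FROM monthly_data ORDER BY report_date"
    )
    for row in cur.fetchall():
        print(row)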

Insert hash id in database based on auto-increment id (needed for fast search?)

I'm writing a simple flask-restful API and I need to insert some resources into the database. I want to have a hash id visible in the URL, like /api/resource/hSkR3V9aS, rather than just a simple auto-increment id, /api/resource/34.
My first thought was to use Hashids and just generate the hash_id from the auto-increment id and store both values in the database, but the problem is that I would have to first INSERT a new row of data, GET the id and then UPDATE the hash_id field.
My second attempt was to generate the hash_id (e.g. sha1) not from the id but from some other field that I'm passing to the database, and use it as the primary key (getting rid of the auto-increment id), but I fear that searching for and comparing a string each time rather than an int will be much, much slower.
What is the best way to achieve the desired hash_id-based URL along with acceptable speed of database SELECT queries?
I think this is the most related stack question, but it doesn't answer my question.
Major technology details: Python 3.6, flask_mysqldb library, MySQL database
Please let me know if I omitted some information and I will provide it.
I think I found a decent solution myself in this answer
Use cursor.lastrowid to get the last row ID inserted on the cursor object, or connection.insert_id() to get the ID from the last insert on that connection.
It's per-connection based so there is no fear that I'll have 2 rows with the same ID.
I'll now use the previously mentioned Hashids library and return the hashed value to the client. Hashids can also be decoded, and I'll do that each time I get a request from a URL with this hash id included.
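A rough sketch of that flow; the salt, the resources table and the helper functions are placeholders, and the cursor is assumed to come from flask_mysqldb's mysql.connection.cursor():

    # Rough sketch: encode cursor.lastrowid for URLs and decode it on requests.
    # Salt, table and column names are placeholders, not from the question.
    from hashids import Hashids

    hashids = Hashids(salt="change-me", min_length=9)

    def create_resource(cursor, name):
        cursor.execute("INSERT INTO resources (name) VALUES (%s)", (name,))
        # lastrowid is per connection, so concurrent clients cannot clash.
        return hashids.encode(cursor.lastrowid)  # e.g. something like 'hSkR3V9aS'

    def get_resource(cursor, hash_id):
        decoded = hashids.decode(hash_id)  # empty tuple if the hash is invalid
        if not decoded:
            return None
        cursor.execute("SELECT id, name FROM resources WHERE id = %s", (decoded[0],))
        return cursor.fetchone()
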
Also, I found out that the MongoDB database generates this kind of hashed id by itself; maybe that is a solution for someone else with a similar problem.

Update field with no-value

I have a table in a PostgreSQL database.
I'm writing data to this table (using some computation with Python and psycopg2 to write results down in a specific column in that table).
I need to update some existing cell of that column.
Until now, I was able either to delete the complete row before writing this single cell (because all the other cells in the row were written back at the same time), or to delete the entire column for the same reason.
Now I can't do that anymore, because it would mean a long computation time to rebuild either the row or the column for only a few new values written to some cells.
I know the update command. It works well for that.
But if I had existing values in some cells, and a new computation no longer gives a result for those cells, I would like to "clear" the existing values to keep the table up to date with the last computation I've done.
Is there a simple way to do that? UPDATE doesn't seem to work (it seems to keep the old values).
To be clear, I'm using psycopg2 to write things to my table.
You simply update the cell with the value NULL in SQL; psycopg2 will write NULL to the database when you update your column with Python's None.
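A minimal sketch of that with psycopg2; the results table, its columns and the connection string are placeholders:

    # Minimal sketch: passing Python's None as a parameter writes SQL NULL.
    # Table, column and connection details are placeholders.
    import psycopg2

    conn = psycopg2.connect("dbname=test user=postgres")
    cur = conn.cursor()

    new_value = None  # the new computation produced no result for this cell

    cur.execute(
        "UPDATE results SET value = %s WHERE id = %s",
        (new_value, 42),  # None -> NULL, clearing the old value
    )
    conn.commit()
    cur.close()
    conn.close()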

Fast number of rows in Sqlite

I have a single table in an Sqlite DB, with many rows. I need to get the number of rows (total count of items in the table).
I tried select count(*) from table, but that seems to access each row and is super slow.
I also tried select max(rowid) from table. That's fast, but not really safe -- ids can be re-used, the table can be empty, etc. It's more of a hack.
Any ideas on how to find the table size quickly and cleanly?
Using Python 2.5's sqlite3 version 2.3.2, which uses Sqlite engine 3.4.0.
Do you have any kind of index on a not-null column (for example a primary key)? If yes, the index can be scanned (which hopefully does not take that long). If not, a full table scan is the only way to count all rows.
Another way to get the number of rows in a table is to use a trigger that stores the current row count in another table (each insert operation increments a counter).
That way, inserting a new record will be a little slower, but you can immediately get the number of rows.
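A rough sketch of that trigger idea with Python's sqlite3; the items and row_counts table names are made up:

    # Rough sketch: triggers keep a one-row counter table in sync, so the
    # count is a constant-time lookup. Table names here are made up.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE row_counts (table_name TEXT PRIMARY KEY, n INTEGER);
        INSERT INTO row_counts VALUES ('items', 0);

        CREATE TRIGGER items_insert AFTER INSERT ON items
        BEGIN
            UPDATE row_counts SET n = n + 1 WHERE table_name = 'items';
        END;

        CREATE TRIGGER items_delete AFTER DELETE ON items
        BEGIN
            UPDATE row_counts SET n = n - 1 WHERE table_name = 'items';
        END;
    """)

    conn.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])

    # Reading the counter touches a single row, regardless of table size.
    print(conn.execute(
        "SELECT n FROM row_counts WHERE table_name = 'items'"
    ).fetchone()[0])  # 3
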
To follow up on Thilo's answer, as a data point: I have a SQLite table with 2.3 million rows. Using select count(*) from table, it took over 3 seconds to count the rows. I also tried SELECT count(rowid) FROM table (thinking that rowid is a default, primary indexed key), but that was no faster. Then I made an index on one of the fields in the database (just an arbitrary field, but I chose an integer field because I knew from past experience that indexes on short fields can be very fast, I think because a copy of the value is stored in the index itself). SELECT count(my_short_field) FROM table brought the time down to less than a second.
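A small sketch of that comparison with Python's sqlite3; the database file, table and column names are placeholders, and note that count(column) only matches count(*) when the column has no NULLs:

    # Sketch: compare count(*) with a count over a short indexed column.
    # File, table and column names are placeholders; count(my_short_field)
    # skips NULLs, so it equals count(*) only when the column is never NULL.
    import sqlite3
    import time

    conn = sqlite3.connect("big.db")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_short ON big_table(my_short_field)")

    for query in ("SELECT count(*) FROM big_table",
                  "SELECT count(my_short_field) FROM big_table"):
        start = time.time()
        n = conn.execute(query).fetchone()[0]
        print("%s -> %d rows in %.3fs" % (query, n, time.time() - start))
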
