Select a random row from the table using Python

Below is my table. I use MySQL for the database queries.
Structure of the table
I want to print questions randomly by taking the questions from the table. How can I do that using Python?

from random import randint
num = randint(1,5)
Then db query:
SELECT question FROM your_table WHERE ques_id = num;
Alternatively:
SELECT question FROM your_table LIMIT num-1, 1;
num would be a random number between 1 and 5; replace num in the query and it returns only one row. Be aware that LIMIT starts from index 0, so the first argument should be num-1 rather than num; the second argument is always 1 because you only want one row per query.
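Putting the two pieces together in Python, a minimal sketch might look like the following (assuming the mysql-connector-python driver and placeholder connection details; the table and column names come from the queries above):

from random import randint
import mysql.connector

# Hypothetical connection details; adjust to your own database.
conn = mysql.connector.connect(user='user', password='secret', database='quiz')
cursor = conn.cursor()

num = randint(1, 5)  # random question id between 1 and 5

# Let the driver substitute the value instead of pasting it into the SQL string.
cursor.execute("SELECT question FROM your_table WHERE ques_id = %s", (num,))
row = cursor.fetchone()
if row:
    print(row[0])

cursor.close()
conn.close()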

If all the ids are in order, get the maximum one and use the random library to get a random number from 1 to the max id in the database.
from random import randint
random_id = randint(1,my_max_id)
then use random_id to get the item from the database.
If you have not set up your Python MySQL connection yet, you can refer to this:
How do I connect to a MySQL Database in Python?
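A minimal sketch of that approach, assuming a DB-API cursor is already open (as in the linked question) and that the ques_id values are contiguous:

from random import randint

# Find the largest question id, then pick a random id in [1, max].
cursor.execute("SELECT MAX(ques_id) FROM your_table")
my_max_id = cursor.fetchone()[0]

random_id = randint(1, my_max_id)
cursor.execute("SELECT question FROM your_table WHERE ques_id = %s", (random_id,))
print(cursor.fetchone()[0])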

You could do it at the database level (in MySQL) and thus gain some extra speed, by doing the work at a lower level of software.
In MySQL, you can fetch all the questions you are going to show, already in random order:
SELECT qus_id, question FROM your_table ORDER BY RAND();
Then, in Python, show them by iterating over the records that MySQL has already shuffled:
for question in rows:
    show_question(question)
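Here, rows would come from a normal DB-API fetch of the shuffled result; a sketch, assuming an open MySQL cursor (show_question is whatever display logic you use):

cursor.execute("SELECT qus_id, question FROM your_table ORDER BY RAND()")
rows = cursor.fetchall()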
Any "Random" operation is costly to process, so the lower the software level at which it is calculated, the more optimal your program will be.

Related

Python 3 update sqlite column by an amount specified in a variable

Scenario: A quiz program with questions worth different amounts of points.
Sqlite database with a table Table1 with a field RunningTotal of type Int.
I'm looking to update RunningTotal by the quantity 'updateby' passed to the function. This is a numerical value (but may arrive as a string, so I'm converting it to an integer to be sure).
tableid is used to identify which row to update.
e.g. (non-working code; the error is that updateby is not a column name):
def UpdateRunningTotal(tableid, updateby):
    updateby = int(updateby)
    conn.execute("UPDATE Table1 RunningTotal=RunningTotal+updateby WHERE tableid=?", (tableid,))
I know if I put the following it works to increment the field by 1, but as a function i want more flexibility to increment by different amounts.
conn.execute("UPDATE Table1 RunningTotal=RunningTotal+1 WHERE tableid=?", (tableid,))
I'm trying to avoid doing a SELECT statement to read the current value of RunningTotal, do the math on that, and then use that result in the UPDATE statement...that seems inefficient to me (but may not be?)
conn.execute("UPDATE Table1 set RunningTotal=RunningTotal+? WHERE tableid=?", (updateby, tableid,))
Use this statement. I have checked and it works fine: it updates the existing value in the database by RunningTotal + updateby.
Hope your issue gets resolved.
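Wrapped into the function from the question, a minimal sketch could look like this (the connection setup and database file name are placeholders):

import sqlite3

conn = sqlite3.connect("quiz.db")  # hypothetical database file

def UpdateRunningTotal(tableid, updateby):
    updateby = int(updateby)  # make sure we add a number, not a string
    # Both the increment and the row id are passed as bound parameters.
    conn.execute(
        "UPDATE Table1 SET RunningTotal = RunningTotal + ? WHERE tableid = ?",
        (updateby, tableid),
    )
    conn.commit()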

Python Sqlite: How would you randomly select a value-specific row?

I want to be able to randomly select rows that have a certain value. For example, I have a student table in SQLite that stores different characteristics (e.g. gender). I want to randomly pick a student that is male using Python. I have looked at other questions (e.g. Select a random row from the table using Python), but they aren't relevant to value-specific rows. How would I do this?
No Python-specific syntax is needed: SQLite has a random() function:
select *
from users
where gender == 'M'
order by random()
limit 1
For performance see this: https://stackoverflow.com/a/24591688/788700
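Calling this from Python with the sqlite3 module might look like the sketch below; the table and column names (students, gender) are assumptions based on the question, and the gender value is bound as a parameter:

import sqlite3

conn = sqlite3.connect("school.db")  # hypothetical database file

def random_student(gender):
    # Bind the value instead of pasting it into the SQL string.
    return conn.execute(
        "SELECT * FROM students WHERE gender = ? ORDER BY random() LIMIT 1",
        (gender,),
    ).fetchone()

print(random_student("M"))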

Python, SQL: How to update multiple rows and columns in a single trip around the database?

Hello StackEx community.
I am implementing a relational database using SQLite interfaced with Python. My table consists of 5 attributes with around a million tuples.
To avoid large number of database queries, I wish to execute a single query that updates 2 attributes of multiple tuples. These updated values depend on the tuples' Primary Key value and so, are different for each tuple.
I am trying something like the following in Python 2.7:
stmt= 'UPDATE Users SET Userid (?,?), Neighbours (?,?) WHERE Username IN (?,?)'
cursor.execute(stmt, [(_id1, _Ngbr1, _name1), (_id2, _Ngbr2, _name2)])
In other words, I am trying to update the rows that have Primary Keys _name1 and _name2 by substituting the Neighbours and Userid columns with corresponding values. The execution of the two statements returns the following error:
OperationalError: near "(": syntax error
I am reluctant to use executemany() because I want to reduce the number of trips across the database.
I have been struggling with this issue for a couple of hours now but couldn't figure out either the error or an alternative on the web. Please help.
Thanks in advance.
If the column that is used to look up the row to update is properly indexed, then executing multiple UPDATE statements would be likely to be more efficient than a single statement, because in the latter case the database would probably need to scan all rows.
Anyway, if you really want to do this, you can use CASE expressions (and explicitly numbered parameters, to avoid duplicates):
UPDATE Users
SET Userid = CASE Username
                 WHEN ?5 THEN ?1
                 WHEN ?6 THEN ?2
             END,
    Neighbours = CASE Username
                     WHEN ?5 THEN ?3
                     WHEN ?6 THEN ?4
                 END
WHERE Username IN (?5, ?6);
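For reference, a sketch of executing that statement from Python's sqlite3 module; the numbered placeholders ?1..?6 refer to positions in the parameter tuple, and the values below are made up for illustration:

import sqlite3

conn = sqlite3.connect("users.db")  # hypothetical database file

sql = """
UPDATE Users
SET Userid = CASE Username WHEN ?5 THEN ?1 WHEN ?6 THEN ?2 END,
    Neighbours = CASE Username WHEN ?5 THEN ?3 WHEN ?6 THEN ?4 END
WHERE Username IN (?5, ?6)
"""

# (?1, ?2, ?3, ?4, ?5, ?6) = (_id1, _id2, _Ngbr1, _Ngbr2, _name1, _name2)
params = (101, 102, "ngbr_a", "ngbr_b", "alice", "bob")
conn.execute(sql, params)
conn.commit()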

Bigquery query limits upper and lower bounds

On MySQL I would enter the following query, but running the same on Google BigQuery throws an error for the upper limit. How do I specify limits on a query? Say I have a query that returns 20 results and I want results between 5 and 10 only, how should I frame the query on Google BigQuery?
For example:
SELECT id,
COUNT(total) AS total
FROM ABC.data
GROUP BY id
ORDER BY count DESC
LIMIT 5,10;
If I only put "LIMIT 5" on the end of the query, I get the top 5, and if I put "LIMIT 10" I get the top 10, but what syntax do I use to get between 5 and 10?
Could someone please shed some light on this?
Any help is much appreciated.
Thanks and have a great day.
I would use window functions...
something like
select * from
  (select id, total, row_number() over (order by total desc) as rnb
   from
     (SELECT id,
             COUNT(total) AS total
      FROM ABC.data
      GROUP BY id))
where rnb >= 5 and rnb <= 10
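If you are running this from Python, one way to execute the query is with the google-cloud-bigquery client library (a sketch; it assumes credentials are already configured and reuses the question's table and column names):

from google.cloud import bigquery

client = bigquery.Client()  # assumes default credentials and project

sql = """
select * from
  (select id, total, row_number() over (order by total desc) as rnb
   from
     (select id, count(total) as total from ABC.data group by id))
where rnb >= 5 and rnb <= 10
"""

for row in client.query(sql).result():
    print(row.id, row.total)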
The windowing function answer is a good one, but I thought I'd give another option that involves how your result is fetched rather than how the query is run.
If you only need the first N rows you can add a LIMIT N to your query. But if you don't need the first M rows, you can change how you fetch the results. If you're using the Java API, you can use the setStartIndex() method on either the TableData.list() or the Jobs.getQueryResults() call to only fetch rows starting from a particular index.
That question makes no sense for an ever-changing dataset. If you have a one-second delay between when you ask for the first 5 and the next 5, the data could have changed: its order is now different and you will miss data or get duplicate results. So databases like BigTable have a method for doing one query of the data and giving you the result set in groups. If that were the case, what you are looking for is called query cursors; see the documentation on them.
But since you said the data does not change, fetch() will work just fine. fetch() has two options you will want to take note of: limit and offset. 'limit' is the maximum number of results to return; if set to None, all available results will be retrieved. 'offset' is how many results to skip.
Check out other options here: https://developers.google.com/appengine/docs/python/datastore/queryclass#Query_fetch

How to distinctly bulk update all objects of a django model without iterating over them in python?

Basically can we achieve the same result without doing this:
from my_app import models
for prd, count in x.iteritems():
    models.AggregatedResult.objects.filter(product=prd).update(linked_epp_count=count)
?
As is evident, x is a dictionary whose keys match AggregatedResult's product field, and each value is the count I wish to update. It takes 2-3 minutes to run on a test table with fewer than 15k rows; the real table is currently ~200k rows and is expected to grow up to a million. So I need help.
The easiest (but not the safest) way is to use a raw SQL query.
Something like:
from django.db import connection, transaction

cursor = connection.cursor()

for prd, count in x.iteritems():
    query = """
        UPDATE {table}
        SET {column} = {value}
        WHERE {condition} = {condition_value}""".format(
        table=AggregatedResult._meta.db_table,
        condition='product',
        condition_value=prd,
        column='linked_epp_count',
        value=count,
    )
    cursor.execute(query)

transaction.commit_unless_managed()
Warning: Not tested and extremely vulnerable to sql-injections. Use at your own risk
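A middle ground, if you stay with raw SQL, is to let the database driver bind the values instead of formatting them into the string; this removes the injection risk while still issuing plain UPDATEs (a sketch, assuming a backend that uses %s placeholders such as MySQL or PostgreSQL):

from django.db import connection

sql = "UPDATE {table} SET linked_epp_count = %s WHERE product = %s".format(
    table=AggregatedResult._meta.db_table
)

# One round trip per row is still needed, but the values are safely bound.
cursor = connection.cursor()
cursor.executemany(sql, [(count, prd) for prd, count in x.iteritems()])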
An alternative (much safer) approach would be to first load the contents of x into a temporary table, then issue just one raw query to update. Assuming the temp table for x is temp_prod:
update aggregated_result ar
set linked_epp_count=tp.count
from temp_prod tp
where ar.product = tp.product
How to upload the data from x to the temp table is something that I'm not very proficient with, so it's left for you. :)
