I have an SQLite database of members of an organisation (fewer than 200 people). Now I'm trying to write a wx app that will search the database and return some contact information in a wx.grid. The app will have two TextCtrls, one for the first name and one for the last name. What I want is to be able to type just one or a few letters in the TextCtrls and have results start coming back. So, if I'm searching for "John Smith", I type "Jo" in the first TextCtrl and that returns every John (or anyone else whose name starts with those letters). There will be no "search" button; instead, it will start searching whenever I press a key.
One way to solve this would be to query the database with something like "SELECT * FROM contactlistview WHERE forname LIKE 'Jo%'", but that seems like a bad idea (very database-heavy to do that for every keystroke?). Instead I thought of using fetchall() on a query like "SELECT * FROM contactlistview" and then, for every keystroke, searching the list of tuples that the query has returned. And that is my problem: searching a list is not that difficult, but how can I search a list of tuples with wildcards?
selected = [t for t in all_data if t[1].startswith('Jo')]  # assuming t[1] is the forname column
But measure, don't guess. I think that in some cases the query would be faster, especially if you have many records. Maybe you should run a query on the first character, and then filter on the Python side, since you already have the results.
I think that generally, you shouldn't be afraid of giving tasks to a database. It's quite possible that the LIKE clause will be very fast. Sqlite is implemented in fairly robust C code, and will happily deal with queries like this.
If you're worried about sending too many requests, why not send a query once a user has entered a threshold of characters, such as three?
A list comprehension is probably the best way to return the result if you want to do additional filtering.
If you are searching for a string matching the start using LIKE, e.g. 'abc%' (rather than anywhere in the string, '%abc%'), the search should be quite fast if you have an index on the field, as the db can use the index to help find the matches.
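A minimal sketch of that indexed prefix search with plain sqlite3 (the view name and the forname column are from the question; the base table contactlist, the db file name, and the wx wiring are assumptions):

import sqlite3

conn = sqlite3.connect('members.db')  # hypothetical file name

# The index goes on the base table (a view itself cannot be indexed);
# it lets SQLite answer 'Jo%'-style prefix queries without a full scan,
# subject to SQLite's LIKE-optimization rules on case and collation.
conn.execute('CREATE INDEX IF NOT EXISTS idx_forname ON contactlist (forname)')

def search_prefix(prefix):
    """Return all rows whose forname starts with prefix."""
    return conn.execute(
        'SELECT * FROM contactlistview WHERE forname LIKE ?',
        (prefix + '%',),
    ).fetchall()

# In the wx app this would run on every keystroke, e.g.:
#   textctrl.Bind(wx.EVT_TEXT, lambda evt: refresh_grid(
#       search_prefix(textctrl.GetValue())))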
What I want to achieve is: if a user searches for "laptp", the database should return results containing the actual word "Laptop". Similarly, if a user enters "ambroidery", the database should return results containing either "embroidery" or "embroidered". Hope that clears it up!
So what I tried: I went through the whole Django documentation, and the closest thing I found is "Trigram Similarity" search. I followed the documentation and tried this:
from django.contrib.postgres.search import TrigramSimilarity

data = 'silk'
results = Product.objects.annotate(
    similarity=TrigramSimilarity('description', data)
).filter(similarity__gt=0.3).order_by('-similarity')
In my database I have Products whose description contains the word "silky", but every time I run this query I get an empty queryset. Even when I set data to "silky", I still get an empty queryset.
So first of all, please tell me whether this is the right approach for what I want to achieve, and secondly, if it is, why is it returning an empty queryset?
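One setup detail worth checking (an assumption, since the question doesn't show the migrations): TrigramSimilarity relies on PostgreSQL's pg_trgm extension, which Django can enable through a migration operation. A minimal sketch, with a hypothetical app label and dependency:

from django.contrib.postgres.operations import TrigramExtension
from django.db import migrations

class Migration(migrations.Migration):
    # 'shop' and '0001_initial' are hypothetical names for this sketch.
    dependencies = [('shop', '0001_initial')]
    operations = [TrigramExtension()]

Note also that TrigramSimilarity scores the whole field value against the search term, so a short term like 'silk' compared with a long description will tend to score well below 0.3, which by itself can produce an empty queryset.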
I would like to search my database for a value. I know you can do this for a key with db.search(), but there does not seem to be any similar function for searching for a value.
I've tried using the contains() function, but I have the same issue: it checks whether the key is contained in the database. I would like to know whether a certain value is contained in the database.
I would like to do something like this that would search for values in tinydb
db.search('value')
If I was able to execute the above command and get the value (if it exists) or nothing if it doesn't, that would be ideal. Alternatively, if the above returned True or False accordingly, that would be fine as well.
I don't know if this is exactly what you are looking for, but with the following command you can check for a specific field value:
from tinydb import Query
User = Query()
db.search(User.field_name == 'value')
I'm new here (doing some reading to see if TinyDB would even be applicable for my use case), so perhaps I'm wrong, and I'm also aware that this question is a little old. But I wonder if you can't address this by iterating over each field and searching within it for your value, as in the sketch below. Then you could get the key or field wherein a value match was located.
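A rough sketch of that iteration idea (the database file name is hypothetical; db.all() and doc_id are standard TinyDB API):

from tinydb import TinyDB

db = TinyDB('db.json')  # hypothetical file

def find_value(db, value):
    """Return (doc_id, field) pairs for every field that equals value."""
    hits = []
    for doc in db.all():          # every document in the default table
        for field, stored in doc.items():
            if stored == value:
                hits.append((doc.doc_id, field))
    return hits

# find_value(db, 'value') -> e.g. [(1, 'field_name')], or [] if absent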
Hello StackEx community.
I am implementing a relational database using SQLite interfaced with Python. My table consists of 5 attributes with around a million tuples.
To avoid a large number of database queries, I wish to execute a single query that updates 2 attributes of multiple tuples. These updated values depend on the tuples' Primary Key values and so are different for each tuple.
I am trying something like the following in Python 2.7:
stmt= 'UPDATE Users SET Userid (?,?), Neighbours (?,?) WHERE Username IN (?,?)'
cursor.execute(stmt, [(_id1, _Ngbr1, _name1), (_id2, _Ngbr2, _name2)])
In other words, I am trying to update the rows that have Primary Keys _name1 and _name2 by substituting the Neighbours and Userid columns with corresponding values. The execution of the two statements returns the following error:
OperationalError: near "(": syntax error
I am reluctant to use executemany() because I want to reduce the number of trips across the database.
I have been struggling with this issue for a couple of hours now but couldn't figure out either the error or an alternative on the web. Please help.
Thanks in advance.
If the column that is used to look up the rows to update is properly indexed, then executing multiple UPDATE statements would likely be more efficient than a single statement, because in the latter case the database would probably need to scan all rows.
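For comparison, a sketch of that multiple-UPDATE route with executemany, reusing the question's variable names; it is one prepared statement run once per parameter row:

rows = [(_id1, _Ngbr1, _name1), (_id2, _Ngbr2, _name2)]
cursor.executemany(
    'UPDATE Users SET Userid = ?, Neighbours = ? WHERE Username = ?',
    rows,
)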
Anyway, if you really want to do this, you can use CASE expressions (and explicitly numbered parameters, to avoid duplicates):
UPDATE Users
SET Userid = CASE Username
                 WHEN ?5 THEN ?1
                 WHEN ?6 THEN ?2
             END,
    Neighbours = CASE Username
                     WHEN ?5 THEN ?3
                     WHEN ?6 THEN ?4
                 END
WHERE Username IN (?5, ?6);
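Called from Python, that might look like the following sketch; sqlite3 should bind a plain sequence by the ?N indexes, so the tuple order has to match the numbering (variable names are from the question):

stmt = (
    'UPDATE Users '
    'SET Userid = CASE Username WHEN ?5 THEN ?1 WHEN ?6 THEN ?2 END, '
    '    Neighbours = CASE Username WHEN ?5 THEN ?3 WHEN ?6 THEN ?4 END '
    'WHERE Username IN (?5, ?6)'
)
# Tuple order follows the ?N numbering: ids, neighbours, then usernames.
cursor.execute(stmt, (_id1, _id2, _Ngbr1, _Ngbr2, _name1, _name2))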
I have a Postgres 9.3 table that has a column called id as its primary key; id is char(9) and allows only lowercase a-z and 0-9. I use Python with psycopg to insert into this table.
When I need to insert into this table, I call a Python function get_new_id(); my question is: how do I make get_new_id() efficient?
I have the following solutions, none of them satisfy me.
a) Pre-generate a lot of ids and store them in some table; when I need a new id, SELECT one from that table, delete it from the table, and return it. The downside of this solution is the maintenance it needs: each get_new_id() call also has to run a SELECT COUNT to find out whether more ids need to be generated and put into the table.
b) When get_new_id() gets called, it generates a random id and passes it to a stored procedure to check whether the id is already in use; if not, we are good, if yes, do b) again. The downsides of this solution are that the failure rate may grow as the table gets bigger, and that two get_new_id() calls in two processes can generate the same id, say 1234567; since 1234567 is not used as a PKEY yet, both pass the check, and on insert one process will fail.
I think this is a pretty old problem, what's the perfect solution?
Edit
I think this has been answered, see Jon Clements' comment.
Offtopic because you already have a char(9) datatype:
I would use an UUID when a random string is needed, it's a standard and almost any programming language (including Python) can generate UUIDs for you.
PostgreSQL can also do it for you, using the uuid-ossp extension.
select left(md5(random()::text || now()), 9);
left
-----------
c4c384561
Make the id the primary key and just try the insert. If an exception is thrown, catch it and retry. Nothing fancy about it. Why only 9 characters? Make it the full 32.
Check this answer for how to make it smaller: https://stackoverflow.com/a/15982876/131874
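A minimal sketch of that insert-and-retry idea with psycopg2 (conn is an open connection; the things table and its columns are hypothetical; psycopg2.errors.UniqueViolation needs psycopg2 2.8+):

import random
import string

import psycopg2
from psycopg2 import errors

ALPHABET = string.ascii_lowercase + string.digits

def random_id(length=9):
    # Draw from lowercase a-z0-9, matching the char(9) constraint.
    return ''.join(random.choice(ALPHABET) for _ in range(length))

def insert_with_retry(conn, name):
    while True:
        try:
            with conn.cursor() as cur:
                cur.execute(
                    'INSERT INTO things (id, name) VALUES (%s, %s)',
                    (random_id(), name),
                )
            conn.commit()
            return
        except errors.UniqueViolation:
            conn.rollback()  # id collision: try again with a fresh id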
How is it possible to implement an efficient large Sqlite db search (more than 90000 entries)?
I'm using Python and SQLObject ORM:
import re
...

def search1():
    cr = re.compile(ur'foo')
    for item in Item.select():
        if cr.search(item.name) or cr.search(item.skim):
            print item.name
This function runs in more than 30 seconds. How should I make it run faster?
UPD: The test:
for item in Item.select():
    pass
... takes almost the same time as my initial function (0:00:33.093141 to 0:00:33.322414). So the regexps eat no time.
A Sqlite3 shell query:
select '' from item where name like '%foo%';
runs in about a second. So the main time is consumed by the ORM's inefficient data retrieval from the db. I guess SQLObject grabs entire rows here, while Sqlite touches only the necessary fields.
The best way would be to rework your logic to do the selection in the database instead of in your python program.
Instead of doing Item.select(), you should rework it to do Item.select("""name LIKE ....
If you do this, and make sure you have the name and skim columns indexed, it will return very quickly. 90000 entries is not a large database.
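With SQLObject that could look like the following sketch (assuming the LIKE and OR helpers from sqlobject.sqlbuilder; name and skim are the question's columns):

from sqlobject.sqlbuilder import LIKE, OR

def search1():
    where = OR(LIKE(Item.q.name, '%foo%'), LIKE(Item.q.skim, '%foo%'))
    for item in Item.select(where):
        print item.name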
30 seconds to fetch 90,000 rows might not be all that bad.
Have you benchmarked the time required to do the following?
for item in Item.select():
    pass
Just to see if the time is DB time, network time or application time?
If your SQLite DB is physically very large, you could be looking at -- simply -- a lot of physical I/O to read all that database stuff in.
If you really need to use a regular expression, there's not really anything you can do to speed that up tremendously.
The best thing would be to write an sqlite function that performs the comparison for you in the db engine, instead of Python.
You could also switch to a db server like PostgreSQL, which has support for SIMILAR TO.
http://www.postgresql.org/docs/8.3/static/functions-matching.html
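Along the lines of the sqlite-function suggestion above, sqlite3 lets you register a Python callable that SQLite's REGEXP operator invokes ('X REGEXP Y' calls regexp(Y, X), pattern first); a sketch with a hypothetical db file:

import re
import sqlite3

conn = sqlite3.connect('items.db')  # hypothetical file

def regexp(pattern, value):
    # Pattern arrives first because X REGEXP Y maps to regexp(Y, X).
    return value is not None and re.search(pattern, value) is not None

conn.create_function('regexp', 2, regexp)
rows = conn.execute(
    'SELECT name FROM item WHERE name REGEXP ? OR skim REGEXP ?',
    ('foo', 'foo'),
).fetchall()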
I would definitely take Reed's suggestion of pushing the filter into the SQL (forget the index part, though).
I do not think that selecting only specified fields rather than all fields makes much difference (unless you have a lot of large fields). I would bet that SQLObject creates/instantiates 80K objects and puts them into a Session/UnitOfWork for tracking. This could definitely take some time.
Also, if you do not need the objects in your session, there must be a way to select just the fields you need using custom query creation, so that no Item objects are created, only tuples.
Initially, doing regex via Python was considered for y_serial, but that was dropped in favor of SQLite's GLOB (which is far faster).
GLOB is similar to LIKE, except that its syntax is more conventional: * instead of %, ? instead of _.
See the Endnotes at http://yserial.sourceforge.net/ for more details.
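For reference, a GLOB query through sqlite3 looks like this sketch (GLOB is case sensitive; table and column names follow the shell example above):

import sqlite3

conn = sqlite3.connect('items.db')  # hypothetical file
rows = conn.execute(
    'SELECT name FROM item WHERE name GLOB ?', ('*foo*',)
).fetchall()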
Given your example, and expanding on Reed's answer, your code could look a bit like the following:
import sqlalchemy.sql.expression as expr
...

def search1():
    searchStr = ur'foo'
    whereClause = expr.or_(itemsTable.c.nameColumn.contains(searchStr),
                           itemsTable.c.skimColumn.contains(searchStr))
    # Assumes old-style bound metadata so the select can execute directly.
    for item in itemsTable.select().where(whereClause).execute():
        print item.name
which translates to
SELECT * FROM items WHERE name LIKE '%foo%' or skim LIKE '%foo%'
This will have the database do all the filtering work for you instead of fetching all 90000 records and doing possibly two regex operations on each record.
You can find some info on the .contains() method here.
As well as the SQLAlchemy SQL Expression Language Tutorial here.
Of course, the example above assumes variable names for your itemsTable and the columns it has (nameColumn and skimColumn).