How to filter with "contains"?

I am trying to filter a set of objects using this query:
baseSet = ThreadedComment.objects.filter(tree_path__contains = baseT.comment_ptr_id)
but it returns some objects that should not be there.
For example, when baseT.comment_ptr_id is 1, it returns items with these tree_path values:
comment_ptr_id=1 treepath = 0000000001
comment_ptr_id=3 treepath = 0000000001/0000000003
comment_ptr_id=4 treepath = 0000000001/0000000003/0000000004
comment_ptr_id=8 treepath = 0000000001/0000000003/0000000004/0000000008
comment_ptr_id=10 treepath = 0000000006/0000000010
comment_ptr_id=11 treepath = 0000000011
The last two should not be included, but because their tree_path contains "1",
the filter returns them as well.
How can I write a regex filter that excludes those items?

Why not do
baseSet = ThreadedComment.objects.filter(tree_path__contains = ('%010i' % int(baseT.comment_ptr_id)))
so that the search string for id=1 will be "0000000001" and won't be a substring of "0000000011"?
EDIT: As per the comment below, it might be better to use COMMENT_PATH_DIGITS. This is a little messier because you're using formatting to set a formatting tag. It looks like this:
tree_path__contains = ('%%0%ii' % COMMENT_PATH_DIGITS % int(baseT.comment_ptr_id))

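The regexp would be '(^|/)0*%d(/|$)' % baseT.comment_ptr_id, and you use it with the tree_path__regex lookup. It matches the id only as a complete path segment, optionally preceded by zeros.
Read about MPTT (modified preorder tree traversal) for alternatives to this approach.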
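To see why this pattern excludes the unwanted rows, here is a minimal sketch that applies it to the tree paths from the question using Python's re module. The tree_paths dictionary and the segment_pattern helper are made up for illustration; in Django the pattern would be passed to tree_path__regex and evaluated by the database, whose regex dialect may differ slightly.

```python
import re


def segment_pattern(comment_id):
    # Hypothetical helper: match the id as a whole path segment,
    # allowing any number of leading zeros.
    return r'(^|/)0*%d(/|$)' % comment_id


# Sample data copied from the question (comment_ptr_id -> tree_path).
tree_paths = {
    1: "0000000001",
    3: "0000000001/0000000003",
    4: "0000000001/0000000003/0000000004",
    8: "0000000001/0000000003/0000000004/0000000008",
    10: "0000000006/0000000010",
    11: "0000000011",
}

pattern = re.compile(segment_pattern(1))
matching_ids = sorted(
    comment_id
    for comment_id, path in tree_paths.items()
    if pattern.search(path)
)
print(matching_ids)
```

Ids 10 and 11 drop out because the "1" in their paths is followed by another digit rather than by "/" or the end of the string.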

Related

Generate a sequence in Python for barcoding items

I am trying to generate barcodes in an app to tag the products which includes 3 things:
Batch no. (GRN ID)
Product ID
serial ID
Something like this:
def get(self, request, *args, **kwargs):
    pk = self.kwargs['pk']
    grn = Grn.objects.filter(pk=pk)[0]
    grn_prod = grn.items.all()
    items = []
    for i in grn_prod:
        for j in range(i.item_quantity):
            items.append("YNT" + str(pk) + str(i.item.pk) + str(j + 1))
It generates a sequence like the following:
YNT55232
That is fine, but when scanning it, finding the item ID or serial ID becomes a problem, because the item ID could be 23, 523, 3, etc.
To solve this I want to use a fixed number of digits for the GRN, product and serial IDs, something like this:
GRN Barcode    GRN ID    Product ID    Serial ID
YNT            000X      000X          0000X
I cannot figure out how to prepend zeros to the IDs.

You can use str.format in Python. It is commonly used to format several variables at once.
If you want to format this: "YNT" + str(pk) + str(i.item.pk) + str(j + 1)
you can use format as below:
'YNT{:04d}{:04d}{:05d}'.format(int(pk), i.item.pk, j + 1)
In case you do not know: each {} is filled by the corresponding argument of format(), in order. The int(pk) call is there because a pk read from the URL kwargs may arrive as a string, and the d format code only accepts integers.
As you want pk and i.item.pk to be four characters wide, you add :04d, which pads the number on the left with zeros. For instance,
if pk = 1, it becomes 0001, and if it is 101, it becomes 0101.
The same goes for j+1 with :05d: 1 becomes 00001, and 101 becomes 00101.
If you have not used format in Python, I suggest you learn it. It is really helpful for formatting variables.
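Fixed widths also make the barcode decodable again, which was the original scanning problem. The following is a minimal sketch of that round trip; make_barcode and parse_barcode are hypothetical helper names, and the 4/4/5 widths are taken from the layout in the question.

```python
PREFIX = "YNT"


def make_barcode(grn_id, product_id, serial):
    # int() guards against ids that arrive as strings (e.g. from URL kwargs).
    return "{}{:04d}{:04d}{:05d}".format(
        PREFIX, int(grn_id), int(product_id), int(serial)
    )


def parse_barcode(barcode):
    # Slice the fixed-width fields back out: 4 + 4 + 5 digits after the prefix.
    body = barcode[len(PREFIX):]
    return int(body[0:4]), int(body[4:8]), int(body[8:13])


barcode = make_barcode(5, 523, 2)
print(barcode)
print(parse_barcode(barcode))
```

The layout only stays unambiguous while every id fits its width; a GRN or product id above 9999 would overflow its field, so the widths need to be chosen with the expected id range in mind.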

The zfill method does exactly this.
For a string s, s.zfill(5) pads it on the left with zeros to at least 5 characters; for an integer ID, convert it first, for example str(serial).zfill(5).
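A short illustration of zfill on values like the ones in the question (the variable names are made up for this sketch):

```python
serial = 7
padded_serial = str(serial).zfill(5)

# Values that are already wider than the requested width are left unchanged.
long_value = str(123456).zfill(5)

barcode = "YNT" + str(5).zfill(4) + str(23).zfill(4) + padded_serial
print(padded_serial, long_value, barcode)
```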

There are different ways to achieve string formatting.
%-formatting
In the Python documentation, the entry for the string method zfill mentioned above is followed by a section on
printf-style string formatting using the % operator (modulo):
barcode = 'YNT%(gnr)03d%(product)03d%(serial)04d' % {'gnr': 123, 'product': 456, 'serial': 7890}
print(barcode)
produces YNT1234567890.
f-strings (since Python 3.6)
You can also use the Python 3 way with f-strings:
# barcode components, given readable names
gnr = pk
product = i.item.pk
serial = j + 1
# the f-string replaces each named variable with its formatted value
barcode = f"YNT{gnr:03}{product:03}{serial:04}"
items.append(barcode)
It uses a format specifier after the variable name:
:0x pads the value on the left with zeros to a total width of x.
Note: I always give the template variables (here, the barcode components) clear names, either by
putting them in a map with descriptive keys (as in the %-formatting example above), or
putting them in separate variables with names describing each of them (as in the f-string example).
That way we can use a readable template, also called a format literal, such as:
"{component_1} before {component_2} then {the_rest}"

How do you iterate over a set or a list in Flask and PyMongo?

I have produced a set of matching IDs from a database collection that looks like this:
{ObjectId('5feafffbb4cf9e627842b1d9'), ObjectId('5feaffcfb4cf9e627842b1d8'), ObjectId('5feb247f1bb7a1297060342e')}
Each ObjectId represents an ID on a collection in the DB.
I got that list by doing this: (which incidentally I also think I am doing wrong, but I don't yet know another way)
# Find all question IDs
question_list = list(mongo.db.questions.find())
all_questions = []
for x in question_list:
    all_questions.append(x["_id"])
# Find all con IDs that match the question IDs
con_id = list(mongo.db.cons.find())
con_id_match = []
for y in con_id:
    con_id_match.append(y["question_id"])
matches = set(con_id_match).intersection(all_questions)
print("matches", matches)
print("all_questions", all_questions)
print("con_id_match", con_id_match)
That returns all the IDs that have a match, such as the three at the top of this post. I show what each print statement outputs at the bottom of this post.
Now I want to get each ObjectId separately as a variable so I can search for it in the collection:
mongo.db.cons.find_one({"con": matches})
Here matches (which will probably need to be a new variable) should be a single ObjectId from the set, one per iteration.
So, how do I iterate over the ObjectIds in matches one at a time? I tried a for loop, but it threw an error, and I guess I am writing it wrong for a set. Thanks for the help.
Print Statements:
**matches** {ObjectId('5feafffbb4cf9e627842b1d9'), ObjectId('5feaffcfb4cf9e627842b1d8'), ObjectId('5feb247f1bb7a1297060342e')}
**all_questions** [ObjectId('5feafb52ae1b389f59423a91'), ObjectId('5feafb64ae1b389f59423a92'), ObjectId('5feaffcfb4cf9e627842b1d8'), ObjectId('5feafffbb4cf9e627842b1d9'), ObjectId('5feb247f1bb7a1297060342e'), ObjectId('6009b6e42b74a187c02ba9d7'), ObjectId('6010822e08050e32c64f2975'), ObjectId('601d125b3c4d9705f3a9720d')]
**con_id_match** [ObjectId('5feb247f1bb7a1297060342e'), ObjectId('5feafffbb4cf9e627842b1d9'), ObjectId('5feaffcfb4cf9e627842b1d8')]

Usually you can just use the find method, which yields documents one by one, and filter the documents in Python while iterating:
# fetch only ids
question_ids = {question['_id'] for question in mongo.db.questions.find({}, {'_id': 1})}
matches = []
for con in mongo.db.cons.find():
    con_id = con['question_id']
    if con_id in question_ids:
        matches.append(con_id)
        # you can process matched and loaded con here
print(matches)
If you have a huge amount of data, take a look at the aggregation framework.
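On the narrower question of walking through the set: a set can be iterated with a plain for loop just like a list, only without a guaranteed order. The sketch below builds the query documents without touching a database; plain strings stand in for the ObjectId values, and the "question_id" field name is taken from the question's code. $in is MongoDB's standard operator for matching any value in a list.

```python
# Stand-ins for the ObjectId values, so the sketch runs without a database.
matches = {
    "5feafffbb4cf9e627842b1d9",
    "5feaffcfb4cf9e627842b1d8",
    "5feb247f1bb7a1297060342e",
}

# One query document per id; iterating a set works like iterating a list.
per_id_queries = [{"question_id": question_id} for question_id in matches]

# Alternatively, a single query that matches any of the ids at once.
combined_query = {"question_id": {"$in": sorted(matches)}}

# With a live connection these would be used roughly as follows:
#   for query in per_id_queries:
#       con = mongo.db.cons.find_one(query)
#   cons = list(mongo.db.cons.find(combined_query))
print(per_id_queries)
print(combined_query)
```

The combined form is usually preferable because it needs one round trip to the database instead of one per id.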

Django: how to use a variable key filter with __range

date_one = form.cleaned_data.get('date_one')
date_two = form.cleaned_data.get('date_two')
date_type = form.cleaned_data.get('date_type')
search = MyClass.objects.filter(date_type__range(date_one, date_two))
My model has two different date columns (created and expires). The user can filter between two dates and can choose whether to filter by creation or expiration date.
I could write two queries using if, but I really want to know how to do it the way I'm asking.
How can I do this, given that the key before __range is a variable? I tried (**{ filter: search_string }), but it does not seem to be compatible with __range.

Try this:
filter_dict = {"{}__range".format(date_type): [date_one, date_two]}
search = MyClass.objects.filter(**filter_dict)

The thing you attempted is almost correct!
Lookups are not functions (so it's not foo__range(start, end)), but they are keyword arguments: foo__range=(start, end)
So you would have:
date_one = form.cleaned_data.get('date_one')
date_two = form.cleaned_data.get('date_two')
date_type = form.cleaned_data.get('date_type')
query_kwargs = {
    "{}__range".format(date_type): (date_one, date_two)
}
search = MyClass.objects.filter(**query_kwargs)
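Because date_type comes from user input, it may be worth checking it against the known columns before building the lookup. This is a minimal sketch of that idea without Django itself; build_range_filter and ALLOWED_DATE_FIELDS are hypothetical names, and the two field names come from the question.

```python
import datetime

# Assumption: these are the only two date columns the user may filter on.
ALLOWED_DATE_FIELDS = {"created", "expires"}


def build_range_filter(date_type, date_one, date_two):
    # Reject anything that is not a known date column.
    if date_type not in ALLOWED_DATE_FIELDS:
        raise ValueError("unsupported date field: %r" % (date_type,))
    # Lookups are keyword arguments, so the key is built as a string.
    return {"{}__range".format(date_type): (date_one, date_two)}


query_kwargs = build_range_filter(
    "created", datetime.date(2021, 1, 1), datetime.date(2021, 1, 31)
)
print(query_kwargs)
# In the view this would then be used as:
#   search = MyClass.objects.filter(**query_kwargs)
```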

Make a list with a name that is only known after the program runs

I want to make a list and call it a name which I only know after I run the program:
For example:
#making shelfs
group_number = 1
group_name = 'group' + str(group_number)
print(group_name)
group_name will be: group1
Now I want to make an empty list called group1. How can I do that?

Usually you just put this into a dictionary:
d = {group_name:[]}
Now you have access to your list via the dictionary. e.g.:
d['group1'].append('Hello World!')
The alternative is to modify the result of the globals() function (which is a dictionary). This is definitely bad practice and should be avoided, but I include it here as it's always nice to know more about the tool you're working with:
globals()[group_name] = []
group1.append("Hello World!")

You are wanting to create a pseudo-namespace of variables starting with "group". Why not use a dict instead?
#making shelfs
groups = {}
group_number = 1
name = str(group_number)
groups[name] = [] # or whatever
print(groups[name])
This is subtly different from @mgilson's answer because I am trying to encourage you to create new namespaces for each collection of related objects.
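When the group names are only known at run time and there are several of them, collections.defaultdict saves creating each empty list by hand. A small sketch, with made-up item strings:

```python
from collections import defaultdict

# Each missing key gets a fresh empty list on first access.
groups = defaultdict(list)

for group_number in range(1, 4):
    group_name = 'group' + str(group_number)
    groups[group_name].append('item for group %d' % group_number)

print(groups['group1'])
print(sorted(groups))
```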

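You can do this:
locals()['my_variable_name'] = _whatever_you_wish_
or
globals()['my_variable_name'] = _whatever_you_wish_
or
vars()['my_variable_name'] = _whatever_you_wish_
Note that inside a function, writing to the dictionary returned by locals() (or by vars() with no argument) is not guaranteed to create or change a local variable; only the globals() form reliably creates a module-level name.
Google the remaining differences yourself :P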

Python sorting question - given list of ['url', 'tag1', 'tag2',..]s and search specification ['tag3', 'tag1',...], return relevant url list

I'm quite new to programming so I'm sure there's a terser way to pose this, but I'm trying to create a personal bookmarking program. Given multiple urls each with a list of tags ordered by relevance, I want to be able to create a search consisting of a list of tags that returns a list of most relevant urls. My first solution, below, is to give the first tag a value of 1, the second 2, and so on & let the python list sort function do the rest. 2 questions:
1) Is there a much more elegant/efficient way of doing this (embarrass me!)
2) Are there other general approaches to sorting by relevance, given inputs like the ones above?
Much obliged.
# Given a list of saved urls each with a corresponding user-generated taglist
# (ordered by relevance), the user enters a "search" list-of-tags, and is
# returned a sorted list of urls.
# Generate sample "content" linked-list-dictionary. The rationale is to
# be able to add things like 'title' etc at later stages and to
# treat each url/note as in independent entity. But a single dictionary
# approach like "note['url1']=['b','a','c','d']" might work better?
content = []
note = {'url':'url1', 'taglist':['b','a','c','d']}
content.append(note)
note = {'url':'url2', 'taglist':['c','a','b','d']}
content.append(note)
note = {'url':'url3', 'taglist':['a','b','c','d']}
content.append(note)
note = {'url':'url4', 'taglist':['a','b','d','c']}
content.append(note)
note = {'url':'url5', 'taglist':['d','a','c','b']}
content.append(note)
# An example search term of tags, ordered by importance
# I'm using a dictionary with an ordinal number system
# This seems clumsy
search = {'d':1,'a':2,'b':3}
# Create a tagCloud with one entry for each tag that occurs
tagCloud = []
for note in content:
    for tag in note['taglist']:
        if tagCloud.count(tag) == 0:
            tagCloud.append(tag)
# Create a dictionary that associates an integer value denoting
# relevance (1 is most relevant etc) for each existing tag
d={}
for tag in tagCloud:
    try:
        d[tag]=search[tag]
    except KeyError:
        d[tag]=100
# Create a [[relevance, tag],[],[],...] result list & sort
result=[]
for note in content:
    resultNote=[]
    for tag in note['taglist']:
        resultNote.append([d[tag],tag])
    resultNote.append(note['url'])
    result.append(resultNote)
result.sort()
# Remove the relevance values & recreate a list containing
# the url string followed by corresponding tags.
# Its so hacky i've forgotten how it works!
# It's mostly for display, but suggestions on "best-practice"
# intermediate-form data storage?
finalResult=[]
for note in result:
    temp=[]
    temp.append(note.pop())
    for tag in note:
        temp.append(tag[1])
    finalResult.append(temp)
print("Content: ", content)
print("Search: ", search)
print("Final Result: ", finalResult)

1) Is there a much more elegant/efficient way of doing this (embarrass me!)
Sure thing. The basic idea: quit trying to tell Python what to do, and just ask it for what you want.
content = [
    {'url':'url1', 'taglist':['b','a','c','d']},
    {'url':'url2', 'taglist':['c','a','b','d']},
    {'url':'url3', 'taglist':['a','b','c','d']},
    {'url':'url4', 'taglist':['a','b','d','c']},
    {'url':'url5', 'taglist':['d','a','c','b']}
]
search = {'d' : 1, 'a' : 2, 'b' : 3}
# We can create the tag cloud like this:
# tagCloud = set(sum((note['taglist'] for note in content), []))
# But we don't actually need it: instead, we'll just use a default value
# when looking things up in the 'search' dict.
# Create a [[relevance, tag],[],[],...] result list & sort
result = sorted(
    [
        [search.get(tag, 100), tag]
        for tag in note['taglist']
    ] + [[note['url']]]
    # The result will look like [ [relevance, tag],... , [url] ]
    # Note that the url is wrapped in a list too. This makes the
    # last processing step easier: we just take the last element of
    # each nested list.
    for note in content
)
# Remove the relevance values & recreate a list containing
# the corresponding tags followed by the url string.
finalResult = [
    [x[-1] for x in note]
    for note in result
]
print("Content: ", content)
print("Search: ", search)
print("Final Result: ", finalResult)
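If only the ordered URLs are needed, the same ranking can be expressed with a sort key instead of building and then unpacking nested lists. This is a sketch of that variant, not the answer's own code: relevance_key and rank are hypothetical names, the search is given as an ordered list rather than a dict, and 100 is kept as the default for unsearched tags.

```python
content = [
    {'url': 'url1', 'taglist': ['b', 'a', 'c', 'd']},
    {'url': 'url2', 'taglist': ['c', 'a', 'b', 'd']},
    {'url': 'url3', 'taglist': ['a', 'b', 'c', 'd']},
    {'url': 'url4', 'taglist': ['a', 'b', 'd', 'c']},
    {'url': 'url5', 'taglist': ['d', 'a', 'c', 'b']},
]

# Search tags ordered by importance; the position becomes the relevance value.
search_tags = ['d', 'a', 'b']
rank = {tag: position for position, tag in enumerate(search_tags, start=1)}


def relevance_key(note):
    # Tags that are not part of the search get the default value 100.
    return [rank.get(tag, 100) for tag in note['taglist']]


ranked_urls = [note['url'] for note in sorted(content, key=relevance_key)]
print(ranked_urls)
```

The key lists are compared element by element, so a note whose first tag is the most important search tag sorts first, exactly as in the nested-list version.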

I suggest you also give a weight to each tag, depending on how rare it is (e.g. a “tarantula” tag would weigh more than a “nature” tag¹). For a given URL, rare tags that are common with other URLs should mark a stronger relevance, while frequently used tags of the given URL not existing in another URL should mark down the relevance.
It's easy to convert the rules I describe above as calculations of a numerical relevance for every other URL.
¹ unless all your URLs are related to “tarantulas”, of course :)
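The answer does not give a formula, so the following is only one possible reading of the rarity idea: an inverse-frequency weight per tag (a swapped-in, IDF-style choice), used to score URLs against a list of search tags. It covers the "rare tags weigh more" half of the suggestion only. The sample notes, tag_weights and score are all made up for this sketch.

```python
import math
from collections import Counter


def tag_weights(notes):
    # Rarer tags get a larger weight; a tag present on every note gets 1.0.
    total = len(notes)
    counts = Counter(tag for note in notes for tag in set(note['taglist']))
    return {tag: math.log(total / count) + 1.0 for tag, count in counts.items()}


def score(note, search_tags, weights):
    # Sum the weights of the search tags that this note actually carries.
    return sum(weights.get(tag, 0.0) for tag in search_tags if tag in note['taglist'])


notes = [
    {'url': 'url1', 'taglist': ['nature', 'tarantula']},
    {'url': 'url2', 'taglist': ['nature', 'forest']},
    {'url': 'url3', 'taglist': ['nature']},
]

weights = tag_weights(notes)
ranked = sorted(
    notes,
    key=lambda note: score(note, ['nature', 'tarantula'], weights),
    reverse=True,
)
print([note['url'] for note in ranked])
```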
