How to find all cells matching a regex with gspread?

I am very new to programming and I am using the Python gspread module to use a Google Sheet as a database.
There's a method in that module called sheet.findall(query, row, column), which is great, but there's one issue: the query parameter only looks for an exact match, meaning that if I write "DDG", it will not return a cell with the value "DDG-87".
After reading the documentation, I found out that you can use Python regular expressions for the query parameter, so I tried that, but there's a problem: the second parameter of re.findall is the string to search in, and here the whole expression is itself the search, as shown below:
search = sheet.findall(re.findall("[DDG]", The where to search goes here))
As you can see, the whole variable (search) is the search call itself, and therefore I cannot specify where to search.
I have tried passing search as the second parameter of re.findall, but obviously that doesn't work.
Any idea how I can set the second parameter of re.findall() to refer to the sheet itself, or what else I can do so that the function matches cells that contain the text rather than only exact matches?
Thank you.

From the gspread docs:
Find all cells matching a regexp:
criteria_re = re.compile(r'(Small|Room-tiering) rug')
cell_list = worksheet.findall(criteria_re)
So the following should work in your case:
criteria_re = re.compile(r'DDG.*')
search = sheet.findall(criteria_re)
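To illustrate why the compiled pattern behaves differently from the "[DDG]" attempt in the question, here is a minimal sketch that uses only the standard re module against a hypothetical list of cell values. It does not call gspread at all, and it assumes the pattern is searched within each cell's text; the names cell_values, contains_ddg, and so on are made up for the example. Note that "[DDG]" is a character class: it matches any single "D" or "G", not the sequence "DDG".

```python
import re

# Hypothetical cell values standing in for one column of a worksheet.
cell_values = ["DDG-87", "DDG", "ABC-12", "XDDG-1", "G"]

# Matches the literal sequence "DDG" anywhere in the text.
sequence_re = re.compile(r"DDG")

# Character class: matches any single "D" or "G" (probably not what was intended).
char_class_re = re.compile(r"[DDG]")

# Anchored at the start: only values that begin with "DDG".
prefix_re = re.compile(r"^DDG")

contains_ddg = [v for v in cell_values if sequence_re.search(v)]
char_class_hits = [v for v in cell_values if char_class_re.search(v)]
starts_with_ddg = [v for v in cell_values if prefix_re.search(v)]

print(contains_ddg)
print(char_class_hits)
print(starts_with_ddg)
```

If only cells that begin with "DDG" are wanted, anchoring the pattern with ^ as in prefix_re is one option to try with sheet.findall as well.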

Related

How to get Document Name from DocumentReference in Firestore Python

I have a document reference that I am retrieving from a query on my Firestore database. I want to use the DocumentReference as a query parameter for another query. However, when I do that, I get:
TypeError: sequence item 1: expected str instance, DocumentReference found
This makes sense, because I am trying to pass a DocumentReference in my update statement:
db.collection("Teams").document(team).update("Dictionary here") # team is a DocumentReference
Is there a way to get the document name from a DocumentReference? Now before you mark this as duplicate: I tried looking at the docs here, and the question here, although the docs were so confusing and the question had no answer.
Any help is appreciated, Thank You in advance!

Yes, split the .refPath. The document "name" is always the last element after the split; something like lodash _.last() can work, or any other technique that picks the last element of the array.
Note, by the way, that the refPath is the full path to the document. This is extremely useful (as in: I use it a lot) when you find documents via collectionGroup(), because it lets you parse out which parent document(s) and collection(s) a particular document came from.
Also note that there is a pseudo-field __name__ available (really an alias of documentID()). In spite of its name(s), it returns the FULL PATH (i.e. refPath) to the document, NOT the document ID by itself.

I think I figured it out: by doing team.path.split("/")[1] I could get the document name. This only works for documents in a top-level collection, though, not for documents in subcollections, so if anyone has a better solution, please go ahead. Thanks!
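A small sketch of the "take the last path element" idea, written against a plain path string so it runs without a Firestore connection. The helper name document_name and the sample paths are hypothetical. Taking the last segment instead of index 1 also covers documents nested in subcollections. As far as I recall, the Python client's DocumentReference also exposes an id attribute that returns this value directly, which would be worth checking in the client documentation before relying on string splitting.

```python
def document_name(path: str) -> str:
    """Return the last segment of a slash-separated document path."""
    # rsplit with maxsplit=1 splits only at the final "/", so nesting depth
    # does not matter.
    return path.rsplit("/", 1)[-1]


# Hypothetical paths: a top-level document and one inside a subcollection.
top_level = document_name("Teams/alpha")
nested = document_name("Teams/alpha/Players/p1")

print(top_level, nested)
```

With a real reference, the call would be document_name(team.path), assuming team is the DocumentReference from the question.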

Does "in" do the same thing as str.contains()?

I'm new to Python but am very confused as to how this code works:
Correct code I don't understand:
I don't understand how, in the function, you can just write '.org' in domain to check whether the referrer_domain is an organization. I thought you would have to filter via .str.contains() to be able to see if the domain includes .org or .com.
I originally coded:
dot_org = data[data['referrer_domain'].str.contains('.org')]
dot_com = data[data['referrer_domain'].str.contains('.com')]
def domain_type(type):
    if type in dot_org['referrer_domain']:
        return 'organization'
    elif type in dot_com['referrer_domain']:
        return 'company'
    else:
        return 'other'
data['new_column'] = data['referrer_domain'].apply(domain_type)
But this ended up labeling all of the rows in the new column I created as "other".
Is anyone able to explain why the code in the picture works, but why the code above doesn't?

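First, you should not use type as a variable name: it is not a reserved word, but it shadows the built-in type function.
Aside from that, there is no str.contains method in plain Python; .str.contains() belongs to pandas Series, not to ordinary strings. The official way of checking whether a string contains another string is the in operator. Inside apply, each domain is a plain string, so '.org' in domain works. Your version fails because in applied to a pandas Series checks the index labels, not the values, so the domain string is never found and every row falls through to 'other'.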
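As a minimal sketch of the substring check described above, the function below uses the in operator on plain strings. It is presumably similar to the code in the missing picture, but that is an assumption; the parameter name domain and the sample values are made up for illustration.

```python
def domain_type(domain):
    """Classify a referrer domain by substring check on a plain string."""
    if ".org" in domain:
        return "organization"
    elif ".com" in domain:
        return "company"
    else:
        return "other"


# Hypothetical sample values.
labels = [domain_type(d) for d in ["wikipedia.org", "example.com", "school.edu"]]
print(labels)
```

With pandas, the same function could be applied per row via data['referrer_domain'].apply(domain_type), as in the question.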

How to delete a document from the index by its path in Whoosh

First I add documents to the index like this:
writer.add_document(title=doc_path.split(os.sep)[-1], path=doc_path, content=text, textdata=text)
Then I need to delete one of them completely from the index by its path. The documentation says there are a few methods for this that are not low-level:
delete_by_term(fieldname, termtext)
Deletes any documents where the given (indexed) field contains the
given term. This is mostly useful for ID or KEYWORD fields.
delete_by_query(query)
Deletes any documents that match the given query.
but I can't find a convenient method where I can specify the path of the document and just remove it. There is also a low-level method that takes an internal doc_number, which I would have to obtain somehow.
Can anyone advise me on the best way to accomplish this?

from whoosh.index import open_dir
ix = open_dir('/my_index_dir_path/..')
writer = ix.writer()
# First argument is the field name, second is the value to match.
writer.delete_by_term('path', doc_path)
writer.commit()
The delete_by_term method does exactly what I need. Note that the first argument is the text string 'path' (the field name), and then comes the actual path. My mistake was passing the actual path instead of the field name.

Flask SQLAlchemy Contains/Ilike producing different results?

I am trying to query a column from a database with contains and ilike, but they produce different results. Any idea why?
My current code:
search = 'nel'
find = Clients.query.filter(Clients.lastName.ilike(search)).all()
# THE ABOVE LINE PRODUCES 0 RESULTS
find = Clients.query.filter(Clients.lastName.contains(search)).all()
# THE ABOVE LINE PRODUCES THE DESIRED RESULTS
for row in find:
    print(row.lastName)
My concern is: am I missing something? I have read that contains does not always work either. Is there a better way to do what I am doing?

For ilike and like, you need to include wildcards in your search like this:
Clients.lastName.ilike(r"%{}%".format(search))
As the Postgres docs say:
LIKE pattern matching always covers the entire string. Therefore, to match a sequence anywhere within a string, the pattern must start and end with a percent sign.
The other difference is that contains adds the wildcards for you but uses LIKE, which is case-sensitive in Postgres, while ilike is case-insensitive.
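The whole-string behaviour of LIKE can be seen in a small, self-contained sketch. It deliberately uses the standard library's sqlite3 module with an in-memory table instead of Flask-SQLAlchemy or Postgres, so the table, rows, and the find helper are all made up for illustration. One caveat: SQLite's LIKE is case-insensitive for ASCII by default, so it behaves more like Postgres's ILIKE here.

```python
import sqlite3

# In-memory database with a hypothetical clients table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (lastName TEXT)")
conn.executemany(
    "INSERT INTO clients (lastName) VALUES (?)",
    [("Nelson",), ("Chanel",), ("Smith",)],
)

search = "nel"


def find(pattern):
    """Return last names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT lastName FROM clients WHERE lastName LIKE ? ORDER BY lastName",
        (pattern,),
    )
    return [row[0] for row in rows]


# Without wildcards the pattern must match the entire value.
exact = find(search)

# With % on both sides the pattern may match anywhere in the value.
anywhere = find("%{}%".format(search))

print(exact)
print(anywhere)
```

The first query finds nothing because no last name is exactly "nel", which mirrors the zero results from ilike(search) in the question.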

Google app engine, full text search for empty (None) field

I'd like to use Google App Engine full text search to find items in an index that have their logo set to None.
I tried:
"NOT logo_url:''"
Is there any way to write such a query, or do I have to add another property such as has_logo?

You cannot filter by non-existent values, by the nature of full text search indexes.
You would need to create a column/property such as "no_logo" to be able to do this.
As an option, you can define a default for empty values, for example the string "None", and then search like this:
logo_url: None
That is how I would do it.

Have you tried with:
NOT logo_url: Null
