I'm using App Engine, Python, v1.9.23.290
Currently I'm doing Alpha testing before opening up the app to the public.
I'm finding that some items "randomly" disappear from the search index.
I'm looking at one particular item where a user entered the item a week ago.
The search index was updated.
The item showed up as expected in searches.
The NDB entity has not been "touched"/modified since last week.
This morning it is not in the index.
I don't have a code sample to share, because there is no "error".
Is this a common problem with a common solution?
To clarify:
When a user creates/edits an NDB entity, I update the item index thusly:
doc = search.Document(doc_id=str(this_item.key.id()), fields=fields)
search_index = search.Index(name="ItemIndex")
try:
    search_index.put(doc)
except search.Error:
    logging.exception('Put failed on search index ItemIndex')
All is fine. But the item 'disappeared' from the index.
With only a dozen items in the index I've had this happen a couple of times in the last week.
If it never happens for anybody else, I guess that is a good sign. I just have to find where the error is in my code.
If somebody else has had this problem, any indication as to the problem would be a great help.
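In case it helps with diagnosis, this is roughly how I have been checking whether a specific document is still present (a minimal sketch using the standard google.appengine.api.search API; the logging call is just what I happen to use):

from google.appengine.api import search
import logging

search_index = search.Index(name="ItemIndex")
doc_id = str(this_item.key.id())  # same id used when the document was put
doc = search_index.get(doc_id)    # returns None if the document is not in the index
if doc is None:
    logging.warning('Document %s missing from ItemIndex', doc_id)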
I am trying to recover the Spotify IDs of some tracks for which I have the title and artist, and while trying to complete the task I came across a strange situation.
The track is the following:
"artist" : "Noisettes",
"track" : "Don't Upset The Rhythm (Go Baby Go)"
The track exists and seems to be written the same way (no extra spaces or strange characters). I manually found the Spotify ID of the track ("6Pfp47eUtnj2D1LMMtmDne"), but when I perform the search specifying this query parameter
q=artist%3ANoisettes+track%3ADon%27t+Upset+The+Rhythm+%28Go+Baby+Go%29&type=track
via the Search for Item endpoint:
https://developer.spotify.com/documentation/web-api/reference/#/operations/search
it returns a response with 0 items, meaning it didn't match any item.
Do you have an idea why this happens?
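For reference, this is roughly how I am building the request (a sketch using the requests library; the token handling is simplified and the variable names are my own):

import requests

token = "..."  # a valid OAuth access token
params = {
    "q": "artist:Noisettes track:Don't Upset The Rhythm (Go Baby Go)",
    "type": "track",
    "limit": 5,
}
resp = requests.get(
    "https://api.spotify.com/v1/search",
    headers={"Authorization": "Bearer " + token},
    params=params,
)
print(resp.json()["tracks"]["items"])  # comes back empty for this query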
So I'm trying to build a tool to transfer tickets that I sell. When a sale comes into my POS, I do an API call for the section, row, and seat numbers ordered (as well as other information, obviously). Using the section, row, and seat number, I want to plug those values into a contains(text()) statement in order to find and select the right tickets on the host site.
Here is a sample of how the tickets are laid out:
And here is a screenshot (sorry if this is inconvenient) of the DOM related to one of the rows above:
Given this, how should I structure my contains(text()) statement so that it is able to find and select the correct seats? I am very new to/inexperienced with automation. I messed around with it a few months ago with some success and have managed to get a tool that gets me right up to selecting the seats, but the "div" path confuses me when it comes to searching for text that is tied to other text.
I tried the following structure:
for i in range(int(lowseat), int(highseat)):
    web.find_element_by_xpath('//*[contains (text(), "'+section+'")]/following-sibling::[contains text(), "'+row+'")]/following-sibling::[contains text(), "'+str(i)+'")]').click()
to no avail. Can someone explain how to structure these statements so that they search for section, row, and seat number correctly?
Thanks!
Also, here is a screenshot with more context of the button, in case it's needed. The button is highlighted in sky blue:
You can't use text() for that because the text is in nested elements. You probably want to map all of these into dicts and select with a filter; a rough sketch of that idea is below.
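Something like this could work (an untested sketch; it assumes each ticket is rendered as a button whose visible text contains the section, row, and seat, so both the selector and the regex are guesses you would need to adapt to the real DOM):

import re

buttons = driver.find_elements_by_css_selector('button')
tickets = []
for b in buttons:
    # pull "Section ... Row ... Seat ..." out of the button's visible text
    m = re.search(r'Section\s+(\S+).*?Row\s+(\S+).*?Seat\s+(\d+)', b.text, re.S)
    if m:
        tickets.append({'section': m.group(1), 'row': m.group(2),
                        'seat': int(m.group(3)), 'element': b})

# keep only the seats in the ordered range, then click them
wanted = [t for t in tickets
          if t['section'] == section and t['row'] == row
          and int(lowseat) <= t['seat'] < int(highseat)]
for t in wanted:
    t['element'].click()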
Update
Here's an idea for a lazy way to do this (untested):
button = driver.execute_script('''
    return [...document.querySelectorAll('button')].find(b => {
        return b.innerText.match(/Section 107\b.*Row P.*Seat 10\b/)
    })
''')
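If that returns an element, Selenium hands it back as a regular WebElement, so it should be clickable from Python (with a guard for the not-found case):

if button is not None:
    button.click()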
I have a document reference that I am retrieving from a query on my Firestore database. I want to use the DocumentReference as a query parameter for another query. However, when I do that, it says
TypeError: sequence item 1: expected str instance, DocumentReference found
This makes sense, because I am trying to pass a DocumentReference in my update statement:
db.collection("Teams").document(team).update("Dictionary here") # team is a DocumentReference
Is there a way to get the document name from a DocumentReference? Now, before you mark this as a duplicate: I tried looking at the docs here, and at the question here, but the docs were confusing and the question had no answer.
Any help is appreciated, Thank You in advance!
Yes, split the .refPath. The document "name" is always the last element after the split; something like lodash's _.last() can do it, or any other technique that picks the last element of the array.
Note, by the way, that the refPath is the full path to the document. This is extremely useful (as in: I use it a lot) when you find documents via collectionGroup(), because it lets you parse out the parent document(s)/collection(s) a particular document came from.
Also note: there is a pseudo-field __name__ available (really an alias of documentId()). In spite of its name(s), it returns the FULL PATH (i.e. the refPath) to the document, NOT the documentID by itself.
I think I figured it out: by doing team.path.split("/")[1] I could get the document name. This might not work for all Firestore layouts (like subcollections), though, so if anyone has a better solution, please go ahead. Thanks!
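As a follow-up sketch (assuming the google-cloud-firestore Python client, which is what the db/team objects above look like): a DocumentReference already exposes .id, the last path segment, so it also works for documents in subcollections, and .path gives the full path string. The field in the update call below is just a made-up example.

doc_name = team.id      # e.g. "teamA" for a reference to Teams/teamA
full_path = team.path   # e.g. "Teams/teamA"
db.collection("Teams").document(doc_name).update({"points": 10})  # hypothetical field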
I need to grab a url from a text file.
The URL is stored in a string like so: 'URL=http://example.net'.
Is there any way I could grab everything after the = char up until the . in '.net'?
Could I use the re module?
text = """A key feature of effective analytics infrastructure in healthcare is a metadata-driven architecture. In this article, three best practice scenarios are discussed: https://www.healthcatalyst.com/clinical-applications-of-machine-learning-in-healthcare Automating ETL processes so data analysts have more time to listen and help end users , https://www.google.com/, https://www.facebook.com/, https://twitter.com
code below catches all urls in text and returns urls in list."""
urls = re.findall('(?:(?:https?|ftp):\/\/)?[\w/\-?=%.]+\.[\w/\-?=%.]+', text)
print(urls)
output:
[
'https://www.healthcatalyst.com/clinical-applications-of-machine-learning-in-healthcare',
'https://www.google.com/',
'https://www.facebook.com/',
'https://twitter.com'
]
I don't have much information, but I will try to help with what I've got. I'm assuming that URL= is part of the string; in that case you can do this:
re.findall(r'URL=(.*?)\.', STRINGNAMEHERE)
Let me go into a bit more detail about (.*?): the dot means any character (except the newline character), the star means zero or more occurrences, and the ? after the star makes the match non-greedy, so it matches as little as possible before the next part of the pattern. The brackets place it all into a group. All of this together basically means it will find everything in between URL= and the (escaped) dot.
You don't need RegEx'es (the re module) for such a simple task.
If the string you have is of the form:
'URL=http://example.net'
Then you can solve this using basic Python in numerous ways, one of them being:
file_line = 'URL=http://example.net'
start_position = file_line.find('=') + 1 # this gives you the first position after =
end_position = file_line.find('.')
# this extracts from the start_position up to but not including end_position
url = file_line[start_position:end_position]
Of course, this is only going to extract one URL. Assuming that you're working with a larger text from which you want to extract all URLs, you'll want to put this logic into a function so that you can reuse it and build around it (achieve iteration via while or for loops and, depending on how you're iterating, keep track of the position of the last extracted URL, and so on).
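A rough sketch of that idea (the function name and the sample string with two URLs are made up for illustration):

def extract_urls(text):
    urls = []
    position = 0
    while True:
        start = text.find('URL=', position)
        if start == -1:
            break
        start += len('URL=')
        end = text.find('.', start)
        if end == -1:
            break
        urls.append(text[start:end])
        position = end + 1
    return urls

print(extract_urls('URL=http://example.net and later URL=http://foo.org'))
# prints ['http://example', 'http://foo']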
Word of advice
This question has been answered quite a lot on this forum, by very skilled people, in numerous ways, for instance: here, here, here and here, to a level of detail that you'd be amazed. And these are not all, I just picked the first few that popped up in my search results.
Given that (at the time of posting this question) you're a new contributor to this site, my friendly advice would be to invest some effort into finding such answers. It's a crucial skill, that you can't do without in the world of programming.
Remember, that whatever problem it is that you are encountering, there is a very high chance that somebody on this forum had already encountered it, and received an answer, you just need to find it.
Please try this. It worked for me.
import re
s='url=http://example.net'
print(re.findall(r"=(.*)\.",s)[0])
I am running into exceptions when I try to search for data from a list of values in the search bar. I would like to capture these exceptions and continue with the rest of the loop. Is there a way I could do this? I am getting two kinds of exceptions, one above the search bar and one below. I am currently using Selenium to log in and get the necessary details.
Error Messages:
Above the search bar:
Your search returned more than 100 results. Only the first 100 results will be displayed. Please select 'Reset' and refine the search criteria for specific results. (29)
Employer Number is not a valid . Minimum length should be 8. (890)
Error message below the search bar:
No records found...
This is my code:
for i in ids:
    driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[16]/td[1]/a').click()
    driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[4]/td[3]/a').click()
    # searching for an id
    driver.find_element_by_xpath('//*[@id="ctl00_ctl00_cphMain_cphMain_txtEmprAcctNu"]').send_keys(i)
    driver.find_element_by_id('ctl00_ctl00_cphMain_cphMain_btnSearch').click()
    driver.find_element_by_xpath('//*[@id="ctl00_ctl00_cphMain_cphMain_grdAgentEmprResults"]/tbody/tr[2]/td[1]/a').click()
    # navigating to the employee details
    driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[8]/td[3]/a').click()
    driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[4]/td[1]/a').click()
After the above code runs, if there is an error or mismatch I get the mentioned exceptions and the code shuts down. How do I capture those exceptions and continue with the code? If I could do something similar to the way I am capturing the date below, that would be really helpful.
# copying and storing the date
subdate = driver.find_element_by_id('ctl00_ctl00_cphMain_cphMain_frmViewAccountProfile_lblSubjectivityDate').text
subjectivitydate.append(subdate)
# exiting the current employee details
driver.find_element_by_id('ctl00_ctl00_cphMain_ULinkButton4').click()
sleep(1)
Edited Code:
for i in ids:
    try:
        driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[16]/td[1]/a').click()
        driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[4]/td[3]/a').click()
        # searching for an id
        driver.find_element_by_xpath('//*[@id="ctl00_ctl00_cphMain_cphMain_txtEmprAcctNu"]').send_keys(i)
        driver.find_element_by_id('ctl00_ctl00_cphMain_cphMain_btnSearch').click()
        driver.find_element_by_xpath('//*[@id="ctl00_ctl00_cphMain_cphMain_grdAgentEmprResults"]/tbody/tr[2]/td[1]/a').click()
        # navigating to the employee profile
        driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[8]/td[3]/a').click()
        driver.find_element_by_xpath('//*[@id="print_area"]/table/tbody/tr[4]/td[1]/a').click()
        # copying and storing the date
        subdate = driver.find_element_by_id('ctl00_ctl00_cphMain_cphMain_frmViewAccountProfile_lblSubjectivityDate').text
        subjectivitydate.append(subdate)
        # exiting the current employee details
        driver.find_element_by_id('ctl00_ctl00_cphMain_ULinkButton4').click()
        sleep(1)
    except:
        continue
How do I restart the loop?
Regards,
Ren
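A minimal sketch of the try/except pattern being asked about (assuming Selenium's NoSuchElementException is what gets raised when a lookup fails; any navigation needed to get back to the search page after a failure is omitted):

from selenium.common.exceptions import NoSuchElementException

for i in ids:
    try:
        # ... the same search and navigation steps as above ...
        subdate = driver.find_element_by_id('ctl00_ctl00_cphMain_cphMain_frmViewAccountProfile_lblSubjectivityDate').text
        subjectivitydate.append(subdate)
    except NoSuchElementException:
        # the expected element was not there (bad id / no results): note it and move on
        print('skipping id', i)
        continue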