python Search for text in a gtk textview


I have looked around and expected this to be really simple, but for some reason I have only found parts of what I need.
I have made a text editor with an entry box that searches for whatever is typed into it. The problem is that it only finds the first match in the text view, and I can't get it to move on to the next one.
I want it to behave like the find function in a text document.
def search(found):
    search_str = findentry.get_text()
    start_iter = textbuffer.get_start_iter()
    found = start_iter.forward_search(search_str, 0, None)
    if found:
        match_start, match_end = found
        textbuffer.select_range(match_start, match_end)
I thought I could add a "search next" button that runs the forward search again from somewhere further on, for example by adding 1 to a variable.
How can I make it search forwards and backwards?

You are using get_start_iter(), which returns the first position in the text buffer, so every search starts from the top. You probably want to start from match_end, the position where the match from the previous search ends.
Assuming you return found from search and pass it back in on the next call, you can replace the line:
start_iter = textbuffer.get_start_iter()
by
start_iter = found[1] if found else textbuffer.get_start_iter()
The first time, or whenever you want to reset the search, you can pass found=None.
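A minimal sketch of how "find next" and "find previous" could build on this idea. It assumes textbuffer is a Gtk.TextBuffer and that a failed search returns None, as the question's code implies. The function name find_match and its backwards parameter are hypothetical; forward_search and backward_search are Gtk.TextIter methods, and the flags value 0 is kept from the question.

```python
def find_match(textbuffer, search_str, found=None, backwards=False):
    """Select the next (or previous) match and return it, or None.

    `found` is the (match_start, match_end) pair returned by the previous
    call. Pass None to start from the buffer start (or from the buffer end
    when searching backwards).
    """
    if backwards:
        # Continue from the start of the previous match, or from the buffer end.
        start_iter = found[0] if found else textbuffer.get_end_iter()
        result = start_iter.backward_search(search_str, 0, None)
    else:
        # Continue from the end of the previous match, or from the buffer start.
        start_iter = found[1] if found else textbuffer.get_start_iter()
        result = start_iter.forward_search(search_str, 0, None)
    if result:
        match_start, match_end = result
        textbuffer.select_range(match_start, match_end)
    return result
```

The caller keeps the returned value between button clicks and passes it back in. When the function returns None, there are no more matches in that direction, and passing found=None on the next call wraps the search around.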

Related

How to use Python Fitz detect Hyphen when using search_for?

I'm new to the Fitz library and am working on a project where I need to find a string in a PDF page. I'm running into a case where the text I'm searching for is hyphenated across a line break on the page. I am aware of the TEXT_DEHYPHENATE flag that I can pass to the search_for function, but that doesn't work for me (as shown in the image here https://postimg.cc/zHZPdd6v ). I get no matches when I search for the hyphenated string.
Python Script
import fitz  # PyMuPDF

LOC = "./test.pdf"
doc = fitz.open(LOC)
page = doc[1]
print(page.get_text())
found = page.search_for("lowcost", flags=fitz.TEXT_DEHYPHENATE)
print("DONE")
print(len(found))
found = page.search_for("low-cost", flags=fitz.TEXT_DEHYPHENATE)
print("DONE")
print(len(found))
found = page.search_for("low cost", flags=fitz.TEXT_DEHYPHENATE)
print("DONE")
print(len(found))
for rect in found:
    print(rect)
Output
Abstract
The objective of “XXXXXXXXXXXXXXXXXX” was design and assemble a low-
cost and efficient tool.
DONE
0
DONE
0
DONE
0
Can someone please point me to how I might be able to detect the hyphen in my file? Thank you!

Your first approach should work, look here:
# insert some hyphenated text
page.insert_textbox((100,100,300,300),"The objective of 'xxx' was design and assemble a low-\ncost and efficient tool.")
157.94699853658676
# now search for it again
page.search_for("lowcost") # 2 rectangles!
[Rect(159.3009796142578, 116.24800109863281, 175.8009796142578, 131.36199951171875),
Rect(100.0, 132.49501037597656, 120.17399597167969, 147.6090087890625)]
# each containing a text portion with hyphen removed
for rect in page.search_for("lowcost"):
print(page.get_textbox(rect))
low
cost
Without the original file there is no way to tell why it fails for you.
Are you sure there really is text there, and not an image or some other hiccup?
Edited: As per the comment of user KJ below: PyMuPDF's C base library MuPDF regards all of the Unicode characters '-', 0xAD, 0x2010 and 0x2011 as hyphens in this context. They should all work the same. I just reconfirmed it in an example.
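One way to check whether the line really ends in one of those characters is to inspect the extracted text directly. The sketch below is plain Python and does not call PyMuPDF; the helper name line_end_report is hypothetical, and the set of code points is copied from the list in the answer above rather than taken from MuPDF's source.

```python
import unicodedata

# Code points the answer above lists as hyphens for dehyphenation.
HYPHEN_CODEPOINTS = {0x2D, 0xAD, 0x2010, 0x2011}


def line_end_report(text):
    """Describe the last character of every line that ends in a non-alphanumeric character.

    Returns a list of (line_number, code_point, unicode_name, is_listed_hyphen).
    """
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.rstrip(" \t")
        if not line or line[-1].isalnum():
            continue  # skip empty lines and lines ending in a letter or digit
        last = line[-1]
        name = unicodedata.name(last, "UNKNOWN")
        listed = ord(last) in HYPHEN_CODEPOINTS
        report.append((number, f"U+{ord(last):04X}", name, listed))
    return report
```

Calling line_end_report(page.get_text()) on the page from the question would show which character follows "low". If it is not one of the listed code points (an en dash, for example), that could explain why the dehyphenating search finds nothing.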

How to make bolding text in the text widget work optimally?

I want bolding text to work as intended. For example, suppose you bold a selected word but mistakenly leave out its last letter, and then select the whole word to correct the mistake: instead of bolding the whole word, the code below changes its weight back to normal. Specifically, I'm talking about this way of doing it:
def bold_it():
    bold_font = font.Font(my_text, my_text.cget("font"))
    bold_font.configure(weight="bold")
    my_text.tag_configure("bold", font=bold_font)
    current_tags = my_text.tag_names("sel.first")
    if "bold" in current_tags:
        my_text.tag_remove("bold", "sel.first", "sel.last")
    else:
        my_text.tag_add("bold", "sel.first", "sel.last")
I am fully aware of what the problem is: it lies in the current_tags variable. tag_names("sel.first") only looks at the tags at the first selected position, so it returns "bold" even when only part of the selection is bold. In turn, this makes the if statement remove the bold tag instead of applying it.
So my question is, how do you fix this, or optimize this?
Codemy.com did a video on this, and this question is based on this video, https://www.youtube.com/watch?v=X6zqePBPDVU.
I tried utilizing the tag_ranges() method so I could get two indexes instead of just where the selecting begins, but it did not work because tag_names() accepts only one argument.
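A minimal sketch of one way to test the whole selection instead of only its first position. It assumes that Tk reports contiguous bold text as a single range, so the selection is fully bold only when one "bold" range covers it from sel.first to sel.last. The helper names selection_is_fully_tagged and toggle_tag_on_selection are hypothetical; tag_ranges, compare, tag_add and tag_remove are methods of the tkinter Text widget.

```python
def selection_is_fully_tagged(text, tag="bold"):
    """Return True if a single range of `tag` covers the whole selection."""
    ranges = text.tag_ranges(tag)  # flat sequence: start1, end1, start2, end2, ...
    for start, end in zip(ranges[0::2], ranges[1::2]):
        if (text.compare(str(start), "<=", "sel.first")
                and text.compare(str(end), ">=", "sel.last")):
            return True
    return False


def toggle_tag_on_selection(text, tag="bold"):
    """Remove `tag` only if the entire selection has it; otherwise apply it."""
    if selection_is_fully_tagged(text, tag):
        text.tag_remove(tag, "sel.first", "sel.last")
    else:
        text.tag_add(tag, "sel.first", "sel.last")
```

In bold_it, the last five lines could then become a call to toggle_tag_on_selection(my_text). As in the original, this raises an error when nothing is selected, because sel.first does not exist in that case.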

Why are my charFormat styles only working on selections, and only those made in a specific direction?

I've been trying to be more explicit in my assignment of character formats for a text editor so that I can understand what I might be able to customize with my current skill range. While the basic copy-paste versions of my format methods worked pretty well, the version below keeps working and then not working in frustrating ways, and I need help figuring out what might be causing it.
The editor was originally intended to be a WYSIWYG editor styled via tags for documentation. Qt's confusing use of Html hasn't made that easy.
My basic flow is to extract a copy of the current format, check its current state, invert it, and reapply the format to the position or selection it was extracted from.
# textEdit is a QTextEdit with a loaded document.
# This function is one of several related pairs called by a switchboard.
# Its intent is to invert the italic state of the current position/selection.
def toggle_italic_text(textEdit):
    # Get the cursor, and the format's state at its current selection/position.
    cursor = textEdit.textCursor()
    charFormat = cursor.charFormat()
    currentState = charFormat.fontItalic()
    # Invert the state within the format.
    print(currentState)
    charFormat.setFontItalic(not currentState)
    print(charFormat.fontItalic())
    # Reapply the format to the cursor's current selection/position.
    cursor.mergeCharFormat(charFormat)
When I first implemented it, this worked fine. Now it only works on selections, and even then it seems to identify the wrong state depending on which direction I make the selection in. After experimenting with it, it appears that if I make a selection to the right, it inverts correctly. If I make a selection to the left, it doesn't.
When trying to assign it to a position without a selection, the printed state changes from False to True, which is desired, yet the effect doesn't apply as I type. If I run it repeatedly in place, it continues to change from False to True, meaning the change is being lost.
The function is being called consistently and running through completely. The stored state of the charFormat copy does change.
Why has this pattern stopped working? Am I using charFormats wrong? Why does the direction of selection change the results?
As far as what changed on my end, I had been getting lost in my styling efforts after needing to apply styles through QFonts, QCharFormats, QPalette, and CSS stylesheets (and doc.defaultStylesheet) targeting both widgets and html tags. I desperately wanted my styles to be controlled through one approach, but couldn't figure out the hierarchy or find an approach that applied widely enough. In the end, I stripped out everything except for the stylesheet assigned to the window.
If there's no issue with the code itself, I'm really hoping for hints at what might be disrupting things. It took me awhile to get used to the idea that cursors and formats are copies meant to be changed and reapplied, while the document and its blocks are the real structure.

The important thing that must be considered about QTextCursor.charFormat() is this:
Returns the format of the character immediately before the cursor position().
So, not only does this not work well with selections that include multiple character formats, but you also have to consider the cursor position, which depends on how the selection was made: it could be at the beginning (so it returns the format of the character before the selection) or at the end (so it returns the format of the last character in the selection).
If you want to invert the state based on the current cursor position (if at the beginning, use the first character, if at the end, use the last), then you can use the following:
def toggle_italic_text(self):
    cursor = self.textEdit.textCursor()
    if not cursor.hasSelection():
        charFormat = cursor.charFormat()
        charFormat.setFontItalic(not charFormat.fontItalic())
        cursor.setCharFormat(charFormat)
        # in this case, the cursor has to be applied to the textEdit to ensure
        # that the following typed characters use the new format
        self.textEdit.setTextCursor(cursor)
        return
    start = cursor.selectionStart()
    end = cursor.selectionEnd()
    newCursor = QtGui.QTextCursor(self.textEdit.document())
    newCursor.setPosition(start)
    if cursor.position() == start:
        cursor.setPosition(start + 1)
    charFormat = cursor.charFormat()
    charFormat.setFontItalic(not charFormat.fontItalic())
    newCursor.setPosition(end, cursor.KeepAnchor)
    newCursor.mergeCharFormat(charFormat)
If you want to invert all states in the selection, you need to cycle through all characters.
While you could just change the char format for each character, that wouldn't be a very good thing for very large selections, so the solution is to apply the italic only when the char format actually changes from the previous state, and when at the end of the selection.
def toggle_italic_text(self):
    # ...
    start = cursor.selectionStart()
    end = cursor.selectionEnd()
    newCursor = QtGui.QTextCursor(self.textEdit.document())
    newCursor.setPosition(start)
    cursor.setPosition(start)
    prevState = cursor.charFormat().fontItalic()
    while cursor.position() < end:
        cursor.movePosition(cursor.Right)
        charFormat = cursor.charFormat()
        if charFormat.fontItalic() != prevState or cursor.position() >= end:
            newPos = cursor.position()
            if cursor.position() < end:
                newPos -= 1
            newCursor.setPosition(newPos, cursor.KeepAnchor)
            charFormat.setFontItalic(not prevState)
            newCursor.mergeCharFormat(charFormat)
            prevState = not prevState
            newCursor.setPosition(cursor.position() - 1)

Pyqt: Get text under cursor

How can I get the text under the cursor? So if I hover over it and the word was "hi" I could read it? I think I need to do something with QTextCursor.WordUnderCursor but I am not really sure what. Any help?
This is what I am trying to work with right now:
textCursor = text.cursorForPosition(event.pos());
textCursor.select(QTextCursor.WordUnderCursor);
text.setTextCursor(textCursor);
word = textCursor.selectedText();
I have it selecting the text right now just so I can see it.
Edit 2:
What I am really trying to do is display a tooltip over certain words in the text.

Unfortunately, I can't test this at the moment, so this is a best guess at what you need. This is based on some code I wrote that had a textfield that showed errors in a tooltip as you typed, but should work.
You've already got code to select the word under the hover over, you just need the tooltip in the right spot.
textCursor = text.cursorForPosition(event.pos())
textCursor.select(QTextCursor.WordUnderCursor)
text.setTextCursor(textCursor)
word = textCursor.selectedText()
if meetsSomeCondition(word):
    toolTipText = toolTipFromWord(word)
    # Put the hover over in an easy to read spot
    pos = text.cursorRect(text.textCursor()).bottomRight()
    # The pos could also be set to event.pos() if you want it directly under the mouse
    pos = text.mapToGlobal(pos)
    QtGui.QToolTip.showText(pos, toolTipText)
I've left meetsSomeCondition() and toolTipFromWord() up to you to fill in as you don't describe those, but they are pretty descriptive in what needs to go there.
Regarding your comment on doing it without selecting the word, the easiest way to do this is to cache the cursor before you select a new one and then set it back. You can do this by calling QTextEdit.textCursor() and then setting it like you did previously.
Like so:
oldCur = text.textCursor()
textCursor.select(QTextCursor.WordUnderCursor) # line from above
text.setTextCursor(textCursor) # line from above
word = textCursor.selectedText() # line from above
text.setTextCursor(oldCur)
# if condition as above

Finding links in file, keeps repeating same link

I'm a bit new to Python, but I have taken a HS level Java class. I'm trying to write a Python script that will take all the torrent links in my Humble Bundle downloads page and spit them out into a .txt file. I'm currently trying to get it to read all of them and print them, but I can't seem to get it to look past the first one. I've tried some different loops, and some of them spit it out once, others continuously spit out the same one over and over. Here is my code.
f = open("Humble Bundle.htm").read()
pos = f.find('torrents.humblebundle.com') #just to initialize it for the loop
end = f.find('.torrent') #same here
pos1 = f.find('torrents.humblebundle.com') #first time it appears
end1 = f.rfind('.torrent') #last time it appears
while pos >= pos1 and end <= end1:
    pos = f.find('torrents.humblebundle.com')
    end = f.find('.torrent')
    link = f[pos:end+8]  # the link in String form
    print(link)
I would like help in both my current issue and on how to continue to the final script. This is my first post here, but I've researched what I could before giving up and asking for help. Thanks for your time.

You can find more information about the find method at http://docs.python.org/2/library/string.html#string.find
The problem is that these two lines always return the same values for pos and end, because the function always gets the same arguments:
pos = f.find('torrents.humblebundle.com')
end = f.find('.torrent')
The find method has an optional second parameter, start, which tells it where to begin searching for the given string. So if you change your code to:
pos = f.find('torrents.humblebundle.com', pos+1)
end = f.find('.torrent', end+1)
each call continues from just after the previous hit, and it should work.
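A minimal sketch of a complete loop built on the start parameter. The function name extract_links and its prefix and suffix parameters are hypothetical. It stops when find returns -1 and looks for each '.torrent' only after the prefix it belongs to, so the two positions cannot drift apart.

```python
def extract_links(html, prefix="torrents.humblebundle.com", suffix=".torrent"):
    """Collect every substring that starts with `prefix` and ends with `suffix`."""
    links = []
    pos = html.find(prefix)
    while pos != -1:
        # Look for the suffix only after the current prefix.
        end = html.find(suffix, pos)
        if end == -1:
            break
        links.append(html[pos:end + len(suffix)])
        # Resume the search after this link.
        pos = html.find(prefix, end + len(suffix))
    return links
```

The returned list can then be written to a .txt file one link per line, for example with "\n".join(links).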

You can try a regular expression here:
import re
f = open('Humble Bundle.htm').read()
# .*? is non-greedy, so each match stops at the first '.torrent' after the prefix
pattern = re.compile(r'torrents\.humblebundle\.com.*?\.torrent')
print(re.findall(pattern, f))
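Whether the pattern uses a greedy `.*` or a non-greedy `.*?` matters when several links share one line. The sketch below compares the two on a made-up sample string; the sample and the variable names are assumptions for illustration only.

```python
import re

# Made-up sample with two links on the same line.
sample = ('x torrents.humblebundle.com/a.torrent '
          'y torrents.humblebundle.com/b.torrent z')

# Greedy: .* runs to the last '.torrent' on the line, giving one long match.
greedy = re.findall(r'torrents\.humblebundle\.com.*\.torrent', sample)

# Non-greedy: .*? stops at the first '.torrent', giving one match per link.
lazy = re.findall(r'torrents\.humblebundle\.com.*?\.torrent', sample)

print(len(greedy))  # 1
print(len(lazy))    # 2
```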
