Python - take words from text [closed] - python

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
I need to parse a text, but the words I am looking for may be split across two lines.
For example, the word "computer" can have "comp" at the end of one line and "uter" at the beginning of the next (without any white space). I want to print that I have found the word "computer".
What is the best way to do this, given that I need an optimized algorithm, not something that checks each letter of the word "computer"?

Try using this kind of format:
word in "".join(line.strip() for line in text)
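As a minimal sketch of the expression above, assuming text is any iterable of lines (a list here, but an open file would work the same way). The helper name contains_word is hypothetical, introduced only for illustration.

```python
def contains_word(lines, word):
    """Return True if `word` appears once all lines are glued together."""
    # strip() removes the newline and any surrounding whitespace, so a word
    # split as "comp" / "uter" becomes "computer" in the joined string.
    joined = "".join(line.strip() for line in lines)
    return word in joined


text = ["I bought a new comp\n", "uter yesterday\n"]
print(contains_word(text, "computer"))
```

Because the lines are joined without a separator, two unrelated words at a line boundary are also glued together, which can produce false matches; that trade-off seems acceptable only if, as in the question, words are split with no whitespace.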

Related

Regex: Ignore the first match of continuous same characters [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 5 years ago.
I'm trying to ignore the first \ in a string in a continuous sequence by using regex. So that when there is only \, we will not match it. When there are two \\, we only match one \.
For example:
I got:
{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/tuchong.fullscreen\\/29016715_tt\"}
What regex can I use to turn it into:
{"url":"http:\/\/p1.pstatp.com\/origin\/tuchong.fullscreen\/29016715_tt"}
How to make this happen?
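This regex should suit your needs; it matches a backslash that is not immediately preceded by another backslash, so replacing every match with an empty string drops the first backslash of each run: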
(?<!\\)\\
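The question does not name a language, so the following is only a sketch of how the pattern could be applied with Python's re module. The function name drop_first_backslash is hypothetical.

```python
import re

# A backslash that is NOT preceded by another backslash (negative lookbehind).
FIRST_BACKSLASH = re.compile(r"(?<!\\)\\")


def drop_first_backslash(s):
    """Remove the first backslash of every run of backslashes."""
    return FIRST_BACKSLASH.sub("", s)


raw = r'{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/tuchong.fullscreen\\/29016715_tt\"}'
print(drop_first_backslash(raw))
```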

If a string word contains particular characters and remove the word that contains the characters [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Suppose I have a data as follows,
data['sentences']
This is a sentence
Donald Trump
Machine Learning
Python is good
I want to search for patterns of characters and, if one is found, remove the whole word that contains it.
Suppose I want to remove words containing "enc", "ood" and "ump"; the output should be:
data['sentences']
This is a
Donald
Machine Learning
Python is
I tried the following, using re.sub:
re.sub("enc", "", y)
But this gives output like "This is a sente". I am not sure how to remove the entire word.
Can anybody help me do this in Python? I want an efficient way to do it, because I need to run this on nearly 1 billion records using PySpark.
Thanks
Add \w* before and after the substring so that the match extends to the whole word:
re.sub(r'\w*enc\w*', '', y)
This replaces the specified string, together with any word characters around it (i.e. the whole word that contains it), with an empty string.
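A possible extension of this idea to all three substrings from the question, as a sketch in plain Python rather than PySpark. The names SUBSTRINGS, WORD_RE and remove_words are hypothetical, and the whitespace clean-up step is an addition that the answer does not mention.

```python
import re

SUBSTRINGS = ["enc", "ood", "ump"]

# One alternation covering every substring, wrapped in \w* on both sides so
# the match grows to the whole word. re.escape guards against metacharacters.
WORD_RE = re.compile(
    r"\w*(?:" + "|".join(re.escape(s) for s in SUBSTRINGS) + r")\w*"
)


def remove_words(sentence):
    """Drop every word containing one of SUBSTRINGS, then tidy the spaces."""
    stripped = WORD_RE.sub("", sentence)
    # Removing a word leaves stray spaces behind; split/join collapses them.
    return " ".join(stripped.split())


for s in ["This is a sentence", "Donald Trump", "Machine Learning", "Python is good"]:
    print(remove_words(s))
```

Compiling a single pattern once avoids running a separate substitution per substring, which may matter at the scale mentioned in the question.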

Is there a way to generate words that sound similar to a given dictionary word without using a corpus? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I am trying to use phonetic algorithms like Soundex and/or Metaphone to generate words that sound similar to a given dictionary word. Do I have to have a corpus of all dictionary words for doing that? Is there another way to generate words that sound similar to a given word without using a corpus? I am trying to do it in Python.
If you don't use a corpus, then you will probably have to manually define a set of rules to split a word into phonetic parts and then find the list of close phonemes. This can generate similar-sounding words, but most of them won't exist. If you want to generate close-sounding words that do exist, then you necessarily need a corpus.
You didn't specify the goal of your task, but you may be interested in the work of Will Leben, "Sounder I" (and II and III), and in Jabberwocky sentences.
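To illustrate the rule-based approach, here is a toy sketch that swaps spellings within hand-written groups. The SIMILAR table is invented for this example and is far cruder than real phoneme rules; the function name variants is also hypothetical.

```python
# Hypothetical, hand-written groups of spellings that can sound alike.
SIMILAR = [("f", "ph"), ("k", "c", "ck"), ("s", "z"), ("i", "y")]


def variants(word):
    """Return spellings obtained by one substitution within a SIMILAR group."""
    out = set()
    for group in SIMILAR:
        for src in group:
            start = word.find(src)
            while start != -1:
                for dst in group:
                    if dst != src:
                        out.add(word[:start] + dst + word[start + len(src):])
                start = word.find(src, start + 1)
    return sorted(out)


print(variants("fish"))  # ['fizh', 'fysh', 'phish']
```

Most of the generated strings are not dictionary words, which matches the point above: filtering them down to real words would require a corpus.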

How to keep python from breaking one word over 2 lines? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
What I mean by the title is how can I transform this:
Example Example Example Example Example Example Example Example Example Exam
ple Example
Into this:
Example Example Example Example Example Example Example Example Example
Example Example
Any ideas?
Use the textwrap module.
From the docs:
textwrap.wrap(text[, width[, ...]])
Wraps the single paragraph in text (a string) so every
line is at most width characters long. Returns a list
of output lines, without final newlines.
You would need to get the width of whatever you are printing to (probably the terminal) and pass it as width. To do it by hand instead, keep track of the length of the current line as you add words, and insert a line break before any word that would run past the width. For getting the terminal width, see "How to get Linux console window width in Python".
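A short sketch combining the two pieces, assuming Python 3, where the standard library's shutil.get_terminal_size reports the terminal width. The helper name wrap_to_width and the 80-column fallback are choices made for this example.

```python
import shutil
import textwrap


def wrap_to_width(text, width=None):
    """Wrap `text` on word boundaries; default to the terminal width."""
    if width is None:
        # Falls back to 80 columns when the size cannot be determined.
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    # fill() is wrap() followed by "\n".join(...).
    return textwrap.fill(text, width=width)


print(wrap_to_width(" ".join(["Example"] * 11), width=76))
```

With a width of 76, the eleven words from the question come out as nine on the first line and two on the second, with no word broken in the middle.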

Indent by exactly one space [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I have written a large block of code in Python that I just realized is indented exactly three spaces (not four!). Using Notepad++ as my IDE, I cannot find any way to indent exactly one additional space to make it line up with everything else.
I imagine there is some way to write a macro to shift everything by one space, but I have little intention of mastering Notepad++'s macros just for this one case. Perhaps there is even a setting I missed?
Is there a non-manual way to indent to the proper alignment (adding one space)?
Just to write up the comment as an answer (as asked by the OP).
You just need to do a find and replace with a regular expression that matches three space characters at the beginning of a line and replaces them with four. So the pattern to match would be something like ^\s{3}(?=\S), where the lookahead keeps the first non-space character out of the match so that it is not replaced.
Option 1:
Search > Replace
Search Mode: Regular expression
Find what: ^\s
Replace with: <2 space characters>
Option 2:
Do a block select of all the columns. For block select, use ALT + SHIFT followed by dragging your mouse all the way from start to end
Add as many spaces as you want
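Outside the editor, the same shift could be scripted. The following is a sketch under the assumption that every indented line should gain exactly one space; the function name add_one_space is hypothetical. It matches a literal space rather than \s so that blank lines are left untouched.

```python
import re


def add_one_space(source):
    """Add one space to every line that already starts with a space."""
    # With re.MULTILINE, ^ anchors at the start of each line. Replacing the
    # first leading space with two spaces shifts the line right by one.
    return re.sub(r"^ ", "  ", source, flags=re.MULTILINE)


code = "   a = 1\n   if a:\n       b = 2\n\nx = 3\n"
print(add_one_space(code))
```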
