Text Indexing and Slicing - python

I am supposed to transform each sentence such that we only keep the words between the third and the third-last word (inclusive) and skip every second word on the way.
Text in jane_eyre_sentences.txt:
My feet they are sore and my limbs they are weary
Long is the way and the mountains are wild
Soon will the twilight close moonless and dreary
Over the path of the poor orphan child
My code is shown below:
for line in open("jane_eyre_sentences.txt"):
    line_strip = line.rstrip()
    words = line_strip.split()
    if len(words)%2 == 0:
        print(" ".join(words[2:-4:2]), ""+ "".join(words[-3]))
    else:
        print(" ".join(words[2:-3:2]),""+ "".join(words[-3]))
My Output:
they sore my they
the and mountains
the moonless
path poor
Expected Output:
they sore my they
the and mountains
the close
path the

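You are appending the wrong word for lines with an even number of words. There the slice words[2:-4:2] stops just before words[-4], so the next word to keep is words[-4], not words[-3]. You must change this line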
print(" ".join(words[2:-4:2]), ""+ "".join(words[-3]))
to
print(" ".join(words[2:-4:2]), ""+ "".join(words[-4]))
You can also get rid of the unnecessary empty string and the second join as it is a single word anyway:
print(" ".join(words[2:-4:2]), words[-4])
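As a possible simplification, a single slice appears to cover both the even and the odd case: words[2:-2:2] starts at the third word, cannot go past the third-last word, and takes every second word on the way. Below is a minimal sketch of that idea; the helper name keep_middle_words is made up for illustration and is not part of the original code.

```python
def keep_middle_words(sentence):
    """Return every second word from the third word up to the third-last word."""
    words = sentence.split()
    # Index 2 is the third word. A stop of -2 excludes the last two words,
    # so the third-last word (index -3) is the last one that can be picked.
    return " ".join(words[2:-2:2])


sentences = [
    "My feet they are sore and my limbs they are weary",
    "Long is the way and the mountains are wild",
    "Soon will the twilight close moonless and dreary",
    "Over the path of the poor orphan child",
]
for sentence in sentences:
    print(keep_middle_words(sentence))
```

With this version the even/odd branch is no longer needed, because the step of 2 decides on its own whether the third-last word is reached.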

Related

Stop at nth element?

Not sure how to add a condition that makes the output stop at the third-last word.
I am currently working on printing specific words from this text:
My feet they are sore and my limbs they are weary
Long is the way and the mountains are wild
I managed to print out the output of:
they sore my they weary
the and mountains wild
But what I am trying to print is:
they sore my they
the and mountains
which stops at the third-last word of each sentence.
The code:
for line in open("testing.txt"):
    low = line.lower()
    words = low.split()
    n = 2
    print(" ".join(words[n::n]))
Thanks for reading.
You just need to slice from n to -2 instead of to the end of the list. A stop index of -2 excludes the last two words, so the third-last word is the last one that can be selected:
print(" ".join(words[n:-2:n]))
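Alternatively, drop the last two words first and keep the original slice:
for line in open("testing.txt"):
    low = line.lower()
    words = low.split()
    words = words[0:-2]  # remove the last two words
    n = 2
    print(" ".join(words[n::n]))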

Is there a way of getting this string down to 3 words?

There are multiple problems with the code I posted below. As I said in my previous post, I'm new to coding and have some trouble finding things by myself :(
My goal is to take user input, narrow it down to 3 words by size and then sort them alphabetically. Am I doing this right?
Probably not, because it prints the words with extra quotes and commas. For example, with "i like eating cake" as input, the output is:
"'cake',", "'eating'", "'i',", "'like',"
But I want it to be:
cake, eating, like
Any help is much appreciated.
input = input(" ")
prohibited = {'this','although','and','as','because','but','even if','he','and','however','cosmos','an','a','is','what','question :','question','[',']',',','cosmo',' ',' ',' '}
processedinput = [word for word in re.split("\W+",input) if word.lower() not in prohibited]
processed = processedinput
processed.sort(key = len)
processed = re.sub('[\[\]]','',repr(processedinput)) #removes brackets
keywords = processed
keywords = keywords.split()
keywords.sort(key=str.lower)
keywords.sort()
keywords = re.sub('[\[\]]','',repr(keywords))
str(keywords)
print(keywords)
The first issue with your code is input = input(). The problem with this is that input is the name of the function you are calling, but you are overwriting input with the user's string. Consequently, if you tried to call input() again, it would fail with TypeError: 'str' object is not callable.
The second issue is that you are misunderstanding lists. In the code below, tokens is a list, not a string. Each element in the list is a string. So there is no need to strip out brackets and such. You can simply order the list (that part of your code was correct) in reverse order of length, then print the first three words.
Code:
import re
user_input = input(" ")
prohibited = {'this','although','and','as','because','but','even if','he','and','however','cosmos','an','a','is','what','question :','question','[',']',',','cosmo',' ',' ',' '}
tokens = [word for word in re.split(r"\W+", user_input) if word.lower() not in prohibited]
tokens.sort(key=len, reverse=True)
print(tokens[0], end=', ')
print(tokens[1], end=', ')
print(tokens[2])
Input:
i like eating cake
Output:
eating, like, cake
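The question also asked for the three words to be sorted alphabetically (cake, eating, like), which the code above does not do. Here is a hedged sketch of one way to add that step. The function name three_longest_sorted and the shortened stop-word set are assumptions made for this example, not part of the original code.

```python
import re

# A shortened stop-word set, assumed here only to keep the example small.
PROHIBITED = {"this", "although", "and", "as", "because", "but", "he",
              "however", "an", "a", "is", "what", "question"}


def three_longest_sorted(text):
    """Pick the three longest allowed words, then sort them alphabetically."""
    # re.split can yield empty strings at the edges, so drop those as well.
    tokens = [w for w in re.split(r"\W+", text)
              if w and w.lower() not in PROHIBITED]
    # sorted() is stable, so equally long words keep their input order.
    longest = sorted(tokens, key=len, reverse=True)[:3]
    return sorted(longest, key=str.lower)


print(", ".join(three_longest_sorted("i like eating cake")))
```

Slicing with [:3] instead of indexing tokens[0], tokens[1], tokens[2] also avoids an IndexError when the input has fewer than three usable words.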

How to return to original formatting

I have broken down lines of text file into individual words to check if they are in a dictionary. I now want to return/print the words back in the same lines.
I have tried editing the positions in my loop as I know I have the lines broken down already. I have thought that maybe I have to use a pop or remove function. I cannot use swap function.
def replace_mode(text_list,misspelling):
    for line in text_list:
        word = line.split(' ')
        for element in word:
            if element in misspelling.keys():
                print(misspelling[element], end=(' '))
            else:
                print(element, end=(' '))
It is printing in a single line:
"joe and his family went to the zoo the other day the zoo had many animals including an elephant the elephant was being too dramatic though after they walked around joe left the zoo"
I want the processed text to be back in its original format(4 lines):
joe and his family went to the zoo the other day
the zooo had many animals including an elofent
the elaphant was being too dramati though
after they walked around joe left the zo
Add this line right after your last print(element, end=(' ')) statement, at the same indentation level as the for element in word: loop (so it runs once per line, not once per word):
print()
This will print a newline at the end of each of the original lines, right after you've finished processing every word from that line but before you've moved on to the next line.
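If you would rather build the corrected lines than print word by word, something like the following sketch might work. It assumes misspelling is a dict mapping each wrong spelling to its correction; the function name replace_misspellings and the sample corrections are made up for illustration.

```python
def replace_misspellings(text_list, misspelling):
    """Return a new list of lines with known misspellings replaced."""
    corrected_lines = []
    for line in text_list:
        # dict.get falls back to the original word when no correction exists.
        fixed_words = [misspelling.get(word, word) for word in line.split(" ")]
        # Joining per line keeps the original line structure intact.
        corrected_lines.append(" ".join(fixed_words))
    return corrected_lines


lines = ["the zooo had many animals including an elofent",
         "after they walked around joe left the zo"]
# Hypothetical corrections, only for this example.
corrections = {"zooo": "zoo", "elofent": "elephant", "zo": "zoo"}
print("\n".join(replace_misspellings(lines, corrections)))
```

Returning a list of lines keeps the printing separate from the replacing, so the same result can be written to a file or checked in a test.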

Splitting elements within a list and separate strings, then counting the length

If I have several lines of text, such that
"Jane, I don't like cavillers or questioners; besides, there is something truly forbidding in a child taking up her elders in that manner.
Be seated somewhere; and until you can speak pleasantly, remain silent."
I mounted into the window- seat: gathering up my feet, I sat cross-legged, like a Turk; and, having drawn the red moreen curtain nearly close, I was shrined in double retirement.
and I want to split the 'string' or sentences for each line by the ";" punctuation, I would do
for line in open("jane_eyre_sentences.txt"):
    words = line.strip("\n")
    words_split = words.split(";")
However, now I would get strings of text such that,
["Jane, I don't like cavillers or questioners", 'besides, there is something truly forbidding in a child taking up her elders in that manner.']
['Be seated somewhere', 'and until you can speak pleasantly, remain silent."']
['I mounted into the window- seat: gathering up my feet, I sat cross-legged, like a Turk', 'and, having drawn the red moreen curtain nearly close, I was shrined in double retirement.']
So it has now created two separate elements in this list.
How would I actually separate this list.
I know I need a 'for' loop because it needs to process through all the lines. I will need to use another 'split' method, however I have tried "\n" as well as ',' but it will not generate an answer, and the python thing says "AttributeError: 'list' object has no attribute 'split'". What would this mean?
Once I separate into separate strings, I want to calculate the length of each string, so i would do len(), etc.
The error means you called split on the list that the first split returned; split only exists on strings, so call it on each element instead. You can iterate over the parts of each line like this:
for line in open("jane_eyre_sentences.txt"):
    words = line.strip("\n")
    for sentence_part in words.split(";"):
        print(sentence_part)  # will print each element of the list
        print(len(sentence_part))  # will print the length of that sentence part
Alternatively, if you just need the length of each of the parts:
for line in open("jane_eyre_sentences.txt"):
    words = line.strip("\n")
    sentence_part_lengths = [len(sentence_part) for sentence_part in words.split(";")]
Edit: With further information from your second post.
for count, line in enumerate(open("jane_eyre_sentences.txt")):
    words = line.strip("\n")
    if ";" in words:
        wordssplit = words.split(";")
        # Pair each part with its number of words
        number_of_words_per_split = [(x, len(x.split())) for x in wordssplit]
        print("Line {}: ".format(count), number_of_words_per_split)
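To make the difference between the list and its string elements explicit, here is a small sketch that wraps the idea in a function. The name part_lengths and its separator parameter are hypothetical, chosen only for this example.

```python
def part_lengths(line, separator=";"):
    """Split one line on the separator and measure each part."""
    # str.split returns a list; string methods such as strip() and split()
    # must be called on each element, not on the list itself.
    parts = [part.strip() for part in line.split(separator)]
    # For every part: the text, its character count, and its word count.
    return [(part, len(part), len(part.split())) for part in parts]


sample = "Be seated somewhere; and until you can speak pleasantly, remain silent."
for part, n_chars, n_words in part_lengths(sample):
    print(n_chars, n_words, part)
```

Whether you want the character count or the word count depends on what "length" should mean in your exercise, so the sketch returns both.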

Accumulating Characters in Python

So I have this textfile, and in that file it goes like this... (just a bit of it)
"The truest love that ever heart
Felt at its kindled core
Did through each vein in quickened start
The tide of being pour
Her coming was my hope each day
Her parting was my pain
The chance that did her steps delay
Was ice in every vein
I dreamed it would be nameless bliss
As I loved loved to be
And to this object did I press
As blind as eagerly
But wide as pathless was the space
That lay our lives between
And dangerous as the foamy race
Of ocean surges green
And haunted as a robber path
Through wilderness or wood
For Might and Right and Woe and Wrath
Between our spirits stood
I dangers dared I hindrance scorned
I omens did defy
Whatever menaced harassed warned
I passed impetuous by
On sped my rainbow fast as light
I flew as in a dream
For glorious rose upon my sight
That child of Shower and Gleam"
Now I want to calculate the total length of the words without the letter 'e' in each line of text. So the first line should give 4, then 5, then 17, etc.
My current code is
for line in open("textname.txt"):
    line_strip = line.strip()
    line_strip_split = line_strip.split()
    for word in line_strip_split:
        if "e" not in word:
            word_e = word
            print (len(word_e))
My explanation is: Strip each word from each other by removing spaces, so it becomes ['Felt','at','its','kindled','core'], etc. Then we split each word because we can regard it individually when removing words with 'e'?. So we want words without e, then print the length of the string.
HOWEVER, this separates each word into a different line by splitting then separating the string? So this doesn't add all the words together in each line but separates it, so the answer becomes "4 / 2 / 3"
Try this:
for line in open("textname.txt"):
    line_strip = line.strip()
    line_strip_split = line_strip.split()
    words_with_no_e = []
    for word in line_strip_split:
        if "e" not in word:
            # Add words without an 'e' to a new list
            words_with_no_e.append(word)
    # ''.join() returns all the elements of the list concatenated
    # len() then counts the characters in that string
    print(len(''.join(words_with_no_e)))
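For each line, this appends every word without an 'e' to a new list, concatenates those words, and prints the length of the result. The print sits outside the inner loop, so you get one number per line instead of one per word.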
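The same total can probably be computed without building the intermediate list, by summing the word lengths directly. This is only a sketch of that variant; the function name length_without_e_words is made up here, and like the code above it only checks for a lowercase 'e'.

```python
def length_without_e_words(line):
    """Total number of characters in the words of `line` that contain no 'e'."""
    # Sum the length of each qualifying word instead of joining them first.
    return sum(len(word) for word in line.split() if "e" not in word)


poem_lines = [
    "The truest love that ever heart",
    "Felt at its kindled core",
    "Did through each vein in quickened start",
]
for poem_line in poem_lines:
    print(length_without_e_words(poem_line))
```

For the first three lines of the poem this gives 4, 5 and 17, matching the numbers expected in the question.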
