Reversing a text and find same words - python

I have to write some code in Python that will read all the words from a text, reverse them, and find which of them are the same in normal and reversed form. So far, I've done this:
filename=raw_input("enter the file name: ")
fop=open(filename)
for line in fop:
    words=line.split()
li=[]
li.extend(words)
size=len(li)
for i in range(0,size/2):
    li[i], li[size-1-i] = li[size-1-i], li[i]
''.join(li)
but it doesn't work: if I give it a text with more than one line, it only processes the last line and doesn't actually seem to reverse anything. Some help please?

You can just do the following. You can check whether a word reads the same reversed with word == word[::-1], where word[::-1] is a slice with a step of -1 that yields the reversed string:
filename=raw_input("enter the file name: ")
with open(filename) as f:
    for line in f:
        for word in line.split():
            if word == word[::-1]:
                print word

If you want to print each palindrome only once, you can use a set comprehension:
print '\n'.join({w for w in open('file.txt').read().split() if w==w[::-1]})
Note that my answer doesn't filter out single letters, punctuation, etc.; in other words, it depends on a loose and broad definition of what a word is.
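If the loose definition of a word is a problem, a stricter filter could look roughly like the sketch below. It is written for Python 3 and is only an illustration, not part of the answer above: the helper name find_palindromes, the min_length cutoff, and the choices to strip surrounding punctuation and ignore case are all assumptions.

```python
import string


def find_palindromes(text, min_length=2):
    """Return the distinct palindromic words in text, sorted alphabetically.

    Hypothetical helper: stripping punctuation, lowercasing, and the
    min_length cutoff are illustrative assumptions.
    """
    found = set()
    for raw in text.split():
        # Remove punctuation from both ends, e.g. "kayak." -> "kayak".
        word = raw.strip(string.punctuation).lower()
        if len(word) >= min_length and word == word[::-1]:
            found.add(word)
    return sorted(found)


sample = "Anna saw a level road, then a kayak. Level again!"
print(find_palindromes(sample))
```

To run it on a file, the text could be read with open(filename).read() and passed in as the first argument.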

Related

extracting words from a string without using the .split() function

I wrote this to get a list of the words in a given string.
data=str(input("string"))
L=[]
word=""
for i in data:
    if i.isalpha() :
        word+=i
    elif :
        L.append(word)
        word=""
But when I run this code, it doesn't show the last word!
You can simply split words on a string using str.split() method, here is a demo:
data = input("string: ")
words = data.split()
L = []
for word in words:
    if word.isalpha():
        L.append(word)
print(L)
Note that .split() splits a string on any whitespace character by default. If you want, for example, to split on commas instead, you can simply use data.split(",").
You are not getting the last word into the list because it is not followed by a non-alphabetic character, so the loop never reaches the else branch that saves the word to the list.
Let's correct your code a little. I assume you want to check the words in the string rather than the characters (what you are doing right now is checking each character, not each word):
data = input("Input the string: ")  # you don't need to cast to str (input() returns a string)
data = data + ' '  # trailing non-letter so that the last word is saved
l = []  # variable names should be lowercase
word = ""
for i in data:
    if i.isalpha():
        word += i
    else:  # use else, not elif, when no condition is provided
        if word:  # skip empty strings produced by consecutive non-letters
            l.append(word)
        word = ""  # reset so the next word starts empty
print(l)
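For comparison, the same kind of word list can be built with a regular expression instead of a character loop. This is a sketch under assumptions: the function name extract_words is made up for illustration, and the pattern [A-Za-z]+ only matches ASCII letters, whereas str.isalpha() also accepts letters from other alphabets.

```python
import re


def extract_words(data):
    """Return every maximal run of ASCII letters in data, in order.

    Hypothetical helper; the ASCII-only pattern is an assumption.
    """
    return re.findall(r"[A-Za-z]+", data)


print(extract_words("hello, world! last"))
```

Because re.findall collects complete matches, the last word is included even when nothing follows it.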

How to remove values in list which contain alphabetical characters?

I am reading a .dat file and the first few lines are just metadata before it gets to the actual data. A shortened example of the .dat file is below.
&SRS
SRSRUN=266128,SRSDAT=20180202,SRSTIM=122132,
fc.fcY=0.9000
&END
energy rc ai2
8945.016 301.32 6.7959
8955.497 301.18 6.8382
8955.989 301.18 6.8407
8956.990 301.16 6.8469
Or as the list:
[' &SRS\n', ' SRSRUN=266128,SRSDAT=20180202,SRSTIM=122132,\n', 'fc.fcY=0.9000\n', '\n', ' &END\n', 'energy\trc\tai2\n', '8945.016\t301.32\t6.7959\n', '8955.497\t301.18\t6.8382\n', '8955.989\t301.18\t6.8407\n', '8956.990\t301.16\t6.8469\n']
I tried this previously, but it raises an error:
def import_absorptionscan(file_path,start,end):
    for i in range(start,end):
        lines=[]
        f=open(file_path+str(i)+'.dat', 'r')
        for line in f:
            lines.append(line)
        for line in lines:
            for c in line:
                if c.isalpha():
                    lines.remove(line)
        print lines
But I get this error: ValueError: list.remove(x): x not in list
I started looking through Stack Overflow, but most of what came up was how to strip alphabetical characters from a string, so I asked this question.
This produces a list of strings, with each string making up one line in the file. I want to remove any string that contains alphabetic characters, as this should remove all the metadata and leave just the data. Any help would be appreciated, thank you.
I have a suspicion you will want a more robust rule than "does the string contain a letter?", but you can use a regular expression to check:
re.search("[a-zA-Z]", line)
You'll probably want to take a look at the regular expression docs.
Additionally, you can use the built-in any() function to check for letters. In place of your inner for loop, use:
if any(word.isalpha() for word in line.split()):
Notice that this treats a mixed token such as "ver9" as containing no letters, because "ver9".isalpha() is False. If this is a problem, check each character instead:
line_is_meta = False
for word in line.split():
    if any(letter.isalpha() for letter in word):
        line_is_meta = True
        break
if not line_is_meta:
    lines.append(line)
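Putting the idea together, one way to avoid the ValueError is to build a new list rather than removing items from the list being iterated over (the original code calls remove once per letter, so the second call for the same line fails). The sketch below is an illustration only: keep_data_lines is a hypothetical name, and it assumes the data values never contain letters, so numbers in scientific notation such as 1e5 would be dropped too.

```python
def keep_data_lines(lines):
    """Return the non-blank lines that contain no alphabetic characters.

    Hypothetical helper; assumes data values never contain letters.
    """
    return [line for line in lines
            if line.strip() and not any(c.isalpha() for c in line)]


raw = [' &SRS\n', ' SRSRUN=266128,SRSDAT=20180202,SRSTIM=122132,\n',
       'fc.fcY=0.9000\n', '\n', ' &END\n', 'energy\trc\tai2\n',
       '8945.016\t301.32\t6.7959\n', '8955.497\t301.18\t6.8382\n']

# Convert each surviving line into a list of floats.
rows = [[float(value) for value in line.split()]
        for line in keep_data_lines(raw)]
print(rows)
```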

How do I output the acronym on one line

I am following the hands-on Python tutorials from Loyola University, and for one exercise I am supposed to get a phrase from the user, capitalize the first letter of each word, and print the acronym on one line.
I have figured out how to print the acronym but I can't figure out how to print all the letters on one line.
letters = []
line = input('?:')
letters.append(line)
for l in line.split():
    print(l[0].upper())
Pass end='' to your print function to suppress the newline character, viz:
for l in line.split():
    print(l[0].upper(), end='')
print()
Your question would be better if you shared the code you are using so far, I'm just guessing that you have saved the capital letters into a list.
You want the string method .join(), which is called on a separator string (the part before the .) and joins a list of items with that separator between them. For an acronym you'd want an empty string as the separator, e.g.
l = ['A','A','R','P']
acronym = ''.join(l)
print(acronym)
You could make a string variable at the beginning: string = "".
Then, instead of doing print(l[0].upper()), just append to the string: string += l[0].upper()
Lastly, print(string)
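The approaches above can be combined into a single expression that joins the capitalized first letters directly. This is a minimal sketch; the function name acronym is an assumption made for illustration.

```python
def acronym(phrase):
    """Join the upper-cased first letter of each whitespace-separated word."""
    return ''.join(word[0].upper() for word in phrase.split())


print(acronym("random access memory"))
```

In the exercise, the phrase would come from input('?:') rather than a literal string.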

Removing \n from myFile

I am trying to create a dictionary of lists in which each key is an anagram (the sorted letters) and each value is a list of all the possible words that can be formed from that anagram.
So my dict should contain something like this:
{'aaelnprt': ['parental', 'paternal', 'prenatal'], 'ailrv': ['rival']}
The possible words are in a .txt file, where every word is separated by a newline. Example:
Sad
Dad
Fruit
Pizza
Which leads to a problem when I try to code it.
with open ("word_list.txt") as myFile:
    for word in myFile:
        if word[0] == "v": ##Interested in only word starting with "v"
            word_sorted = ''.join(sorted(word)) ##Get the anagram
            for keys in list(dictonary.keys()):
                if keys == word_sorted: ##Heres the problem, it doesn't get inside here as theres extra characters in <word_sorted> possible "\n" due to the linebreak of myfi
                    print(word_sorted)
                    dictonary[word_sorted].append(word)
If every word in "word_list.txt" is followed by '\n' then you can just use slicing to get rid of the last char of the word.
word_sorted = ''.join(sorted(word[:-1]))
But if the last word in "word_list.txt" isn't followed by '\n', then you should use rstrip().
word_sorted = ''.join(sorted(word.rstrip()))
The slice method is slightly more efficient, but for this application I doubt you'll notice the difference, so you might as well just play safe & use rstrip().
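The difference between the two approaches shows up when the last line has no trailing newline. A small demonstration, with made-up sample words:

```python
# Two lines as they might come from a file whose last line lacks "\n".
lines = ["vase\n", "vote"]

sliced = [w[:-1] for w in lines]            # always drops the last character
stripped = [w.rstrip("\n") for w in lines]  # only drops trailing newlines

print(sliced)
print(stripped)
```

Slicing silently truncates "vote" to "vot", while rstrip leaves it intact.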
Use rstrip(), it removes the \n character.
...
...
keys == word_sorted.rstrip()
...
You should try using the .rstrip() method in your code; it will remove the "\n".
strip only removes characters from the beginning or end of a string.
Use rstrip() to remove \n character
You can also use the replace method to replace the newline with something else.
str2 = str.replace("\n", "")
So, I see a few problems here. How is anything getting into the dictionary? I see no assignments. Obviously you've only provided us a snippet, so maybe that's elsewhere.
You're also using a loop over the keys when you could be using in (it's more efficient, truly it is).
with open("word_list.txt") as myFile:
    for word in myFile:
        if word[0] == "v":  ## Interested in only words starting with "v"
            word = word.rstrip()  # drop the trailing "\n" so it is neither sorted nor stored
            word_sorted = ''.join(sorted(word))  ## Get the anagram key
            if word_sorted in dictionary:
                print(word_sorted)
                dictionary[word_sorted].append(word)
            else:
                # The case where we don't find an anagram in our dict
                dictionary[word_sorted] = [word]
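The if/else bookkeeping can also be avoided with collections.defaultdict, which creates the empty list the first time a key is used. The sketch below is one possible arrangement, not the asker's actual program: the function name group_anagrams, the first_letter parameter, and the sample words are assumptions.

```python
from collections import defaultdict


def group_anagrams(words, first_letter="v"):
    """Group words by their sorted letters, keeping only words that
    start with first_letter. Hypothetical helper for illustration."""
    groups = defaultdict(list)
    for raw in words:
        word = raw.strip()  # removes the trailing "\n", if any
        if word.startswith(first_letter):
            groups[''.join(sorted(word))].append(word)
    return dict(groups)


sample = ["vase\n", "veil\n", "vile\n", "rival\n", "viral"]
print(group_anagrams(sample))
```

Since the function accepts any iterable of lines, an open file object could be passed in place of the sample list.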

How to search for words containing certain letters in a txt file with Python?

Look at the code below. It finds the letter 'b' in the text file and prints all the words containing the letter 'b', right?
x = open("text file", "r")
for line in x:
    if "b" and in line: print line
searchfile.close()
Now here is my problem. I would like to search for not just one letter but several.
For example, a and b both have to be in the same word.
Then I want to print the list of words containing both letters.
And I'd like to have the user decide what the letters should be.
How do I do that?
Now I've come up with something new after reading an answer.
x = open("text file", "r")
for line in x:
    if "b" in line and "c" in line and "r" in line: print line
Would this work instead?
And how do I make the user enter the letters?
No, your code (apart from being syntactically incorrect) would print every line that has a "b", not the words.
To do what you want, we need more information about the text file. Supposing words are separated by whitespace, you could do something like this:
x = open("file", "r")
words = [w for w in x.read().split() if "a" in w and "b" in w]
You could use sets for this:
letters = set(('l','e'))
for line in open('file'):
    if letters <= set(line):
        print line
In the above, letters <= set(line) tests whether every element of letters is present in the set consisting of the unique letters of line.
First you need to split the contents of the file into a list of words. To do this you need to split on line breaks and on spaces, and possibly on hyphens too; I don't really know. You might want to use re.split, depending on how complicated the requirements are. But for this example let's just go:
words = []
with open('file.txt', 'r') as f:
    for line in f:
        words += line.split()  # no argument: splits on any whitespace, including the newline
Now it will help efficiency if we only have to scan words once and presumably you only want a word to appear once in the final list anyway, so we cast this list as a set
words = set(words)
Then to get only those selected_words containing all of the letters in some other iterable letters:
selected_words = [word for word in words if
                  [letter for letter in letters if letter in word] == list(letters)]  # list() so a string of letters compares correctly
I think that should work. Any thoughts on efficiency? I don't know the details of how those list comprehensions run.
x = open("text file", "r")
letters = raw_input('Enter the letters to match: ') # "ro" would match "copper" and "word"
letters = letters.lower()
for line in x:
    for word in line.split():
        if all(l in word.lower() for l in letters): # could optimize with sets if needed
            print word
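The set-based optimization mentioned in the comment above could look roughly like this in Python 3. It is a sketch, not a definitive implementation: the function name words_with_letters and the sample sentence are assumptions, and duplicates are skipped so each matching word is reported once.

```python
def words_with_letters(text, letters):
    """Return the distinct words in text that contain every letter in
    letters, ignoring case. Hypothetical helper for illustration."""
    required = set(letters.lower())
    matches = []
    for word in text.split():
        # Subset test: every required letter must occur in the word.
        if required <= set(word.lower()) and word not in matches:
            matches.append(word)
    return matches


print(words_with_letters("The copper word rose over Rome", "ro"))
```

In an interactive script, letters would come from input('Enter the letters to match: ') and text from reading the file.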
