I am trying to find all full stops in a body of text and insert a space in front of each one. I am only allowed to use for loops; otherwise I would use a regular expression. I can't understand why the following for loop does not replace the original letter with my new assignment. My code is as follows:
text = 'I have a ruler and bought for 25 gpb.'
text_split = text.split()
for word in text_split:
    for letter in word:
        if letter == '.':
            letter = ' .'
If anyone could help, it would be greatly appreciated.
letter = ' .' just rebinds the name letter to a different string. The original object bound to letter is unchanged (and can't be changed even in theory; str is an immutable type). A similar problem prevents changing the original str in text_split bound to word on each loop.
For this specific case, you just want:
text_split = [word.replace('.', ' .') for word in text_split]
or the slightly longer spelled out version (that modifies text_split in place instead of replacing it with a new list of modified str):
for i, word in enumerate(text_split):
    text_split[i] = word.replace('.', ' .')
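The difference between rebinding a loop variable and assigning into the list can be checked directly. The following is a minimal sketch that reuses the sample sentence from the question; the names unchanged and spaced are illustrative only and do not come from the original code.

```python
text = 'I have a ruler and bought for 25 gpb.'

# Rebinding the loop variable: the list itself is never touched.
unchanged = text.split()
for word in unchanged:
    word = word.replace('.', ' .')  # only the name `word` points at the new str

# Assigning through the index: the list slot is replaced.
spaced = text.split()
for i, word in enumerate(spaced):
    spaced[i] = word.replace('.', ' .')

print(unchanged[-1])     # gpb.
print(spaced[-1])        # gpb .
print(' '.join(spaced))  # I have a ruler and bought for 25 gpb .
```

Joining the list back together at the end gives the modified sentence as a single string.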
Related
Given a string, I have to reverse every word, but keeping them in their places.
I tried:
def backward_string_by_word(text):
    for word in text.split():
        text = text.replace(word, word[::-1])
    return text
But if I have the string Ciao oaiC, then when it tries to reverse the second word, that word is identical to the first one after it has already been reversed, so it gets replaced again. How can I avoid this?
You can do it in one line with join and a generator expression:
text = "test abc 123"
text_reversed_words = " ".join(word[::-1] for word in text.split())
s.replace(x, y) is not the correct method to use here:
It does two things:
find x in s
replace it with y
But you do not really need to find anything here, since you already have the word you want to replace. The problem is that replace starts searching for x from the beginning of the string each time, not from the position you are currently at, so it finds the word you have already replaced, not the one you want to replace next.
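The failure described above can be reproduced with the exact string from the question. This is a small sketch that runs the question's function unchanged; only the variable name result is added for illustration.

```python
def backward_string_by_word(text):
    # Original approach from the question: replace() searches the whole string each time.
    for word in text.split():
        text = text.replace(word, word[::-1])
    return text

result = backward_string_by_word("Ciao oaiC")
print(result)  # Ciao Ciao (the intended result was "oaiC Ciao")
```

The first pass turns the string into "oaiC oaiC", and the second pass then replaces both occurrences of "oaiC", which is why neither word ends up reversed relative to the other.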
The simplest solution is to collect the reversed words in a list, and then build a new string out of this list by concatenating all reversed words. You can concatenate a list of strings and separate them with spaces by using ' '.join().
def backward_string_by_word(text):
    reversed_words = []
    for word in text.split():
        reversed_words.append(word[::-1])
    return ' '.join(reversed_words)
If you have understood this, you can also write it more concisely by skipping the intermediate list with a generator expression:
def backward_string_by_word(text):
    return ' '.join(word[::-1] for word in text.split())
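One caveat with split() followed by ' '.join() is that runs of spaces, tabs and newlines are collapsed into single spaces. If the original spacing has to survive, one possible variation (not part of the answer above, and using a regular expression, which the question does not mention) is to split with a capturing group so the whitespace pieces are kept. The function name backward_string_keep_spacing is made up for this sketch.

```python
import re

def backward_string_keep_spacing(text):
    # The capturing group makes re.split keep the whitespace runs in the result.
    parts = re.split(r'(\s+)', text)
    # Reverse only the non-whitespace pieces; whitespace is passed through as is.
    return ''.join(p if p.isspace() else p[::-1] for p in parts)

print(repr(backward_string_keep_spacing("Ciao  oaiC\tabc")))  # 'oaiC  Ciao\tcba'
```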
Splitting a string converts it to a list. You can just reassign each value of that list to the reverse of that item. See below:
text = "The cat tac in the hat"
def backwards(text):
    split_word = text.split()
    for i in range(len(split_word)):
        split_word[i] = split_word[i][::-1]
    return ' '.join(split_word)
print(backwards(text))
How is one of the following versions different from the other?
The following code capitalizes the first letter of each word in the string:
s = ' '.join(i[0].upper() + i[1:] for i in s.split())
The following code prints only the last word, with every character separated by a space:
for i in s.split():
    s=' '.join(i[0].upper()+i[1:])
print s
For completeness and for people who find this question via a search engine, the proper way to capitalize the first letter of every word in a string is to use the title method.
>>> capitalize_me = 'hello stackoverlow, how are you?'
>>> capitalize_me.title()
'Hello Stackoverlow, How Are You?'
for i in s.split():
At this point i is a word.
s = ' '.join(i[0].upper() + i[1:])
Here, i[0] is the first character of the word and i[1:] is the rest of it, so this line is shorthand for s = ' '.join(capitalized_s), where capitalized_s is the single capitalized word. The str.join() method takes a single iterable as its argument. In this case the iterable is a string, but that makes no difference: for something such as ' '.join("this"), str.join() iterates through each element of the iterable (each character of the string) and puts a space between each pair. Result: t h i s
There is, however, an easier way to do what you want: s = s.title()
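The behaviours discussed above can be compared side by side. This is only a sketch with a made-up sample string; it also shows one documented quirk of title(), which treats an apostrophe as a word boundary, and string.capwords from the standard library as a possible alternative (note that capwords also lowercases the rest of each word).

```python
import string

s = "they're here"

print(' '.join("this"))                                   # t h i s
print(' '.join(w[0].upper() + w[1:] for w in s.split()))  # They're Here
print(s.title())                                          # They'Re Here
print(string.capwords(s))                                 # They're Here
```

Whether title() is good enough therefore depends on whether the text can contain apostrophes or similar characters inside words.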
def make_new_words(start_word):
    """create new words from given start word and returns new words"""
    new_words=[]
    for letter in start_word:
        pass
    #for letter in alphabet:
        #do something to change letters
        #new_words.append(new_word)
I have a three-letter input word, for example car, which is the start word.
I then have to create new words by replacing one letter at a time with every letter of the alphabet. Using my example car, I want to create the words aar, bar, car, dar, ear,..., zar. Then create the words car, cbr, ccr, cdr, cer,..., czr. Finally caa, cab, cac, cad, cae,..., caz.
I don't really know what the for loop should look like. I was thinking about creating some sort of alphabet list and by looping through that creating new words but I don't know how to choose what parts of the original word should remain. The new words can be appended to a list to be returned.
import string
def make_new_words(start_word):
    """create new words from given start word and returns new words"""
    new_words = []
    for i, letter in enumerate(start_word):
        word_as_list = list(start_word)
        for char in string.ascii_lowercase:
            word_as_list[i] = char
            new_words.append("".join(word_as_list))
    return new_words
lowercase is just a string containing the lowercase letters (Python 2 only; Python 3 spells it string.ascii_lowercase).
We want to change each letter of the original word (here w), so we iterate over the letters of w. We mostly need the index of each letter, so the for loop runs over enumerate(w).
First of all, strings in Python are immutable, so we build a list x from w, because lists are mutable.
Next comes a second, inner loop over the lowercase letters: we change the current element of the list x accordingly and print the result. Because x has been changed, it must be reset before the next inner loop, which is why x is rebuilt from w at the top of the outer loop.
Because we want to print a string rather than the characters in a list, we use the join method of the empty string '', which glues together the elements of x using, of course, the empty string as the separator.
I have not reported the output but it's exactly what you've asked for, just try...
from string import lowercase
w = 'car'
for i, _ in enumerate(w):
    x = list(w)
    for s in lowercase:
        x[i] = s
        print ''.join(x)
import string
all_letters = string.ascii_lowercase
def make_new_words(start_word):
    for index, letter in enumerate(start_word):
        template = start_word[:index] + '{}' + start_word[index+1:]
        for new_letter in all_letters:
            print template.format(new_letter)
You can do this with two loops, by looping over the word and then looping over a range for all letters. By keeping an index for the first loop, you can use a slice to construct your new strings:
for index, _ in enumerate(start_word):  # enumerate yields (index, letter) pairs
    for let in range(ord('a'), ord('z')+1):
        new_words.append(start_word[:index] + chr(let) + start_word[index+1:])
This could work as a brute-force approach, although you might end up with some performance issues when you go to try it with longer words.
It also sounds like you might want to constrain it only to words that exist in a dictionary at some point, which is a whole other can of worms.
But for right now, for three-letter words, you're on the right track, although I worry that the question might be a little too specific for Stack Overflow.
First, you will probably have more success if you loop through the index for the word, rather than the letter:
alphabet = 'abcdefghijklmnopqrstuvwxyz'
for i in range(len(start_word)):
Then, you can use a slice to grab the letters before and after the index.
    for letter in alphabet:
        new_word = start_word[:i] + letter + start_word[i + 1:]
Another approach is given above, which converts the string to a list. That works around the fact that Python disallows simply setting start_word[i] = letter, because strings are immutable.
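Putting the two fragments of the slicing approach together might look like the following. This is a sketch that assumes the function should return the list of new words, as the question's docstring suggests, and that only lowercase ASCII letters are wanted.

```python
def make_new_words(start_word):
    """Return every word made by replacing one letter of start_word at a time."""
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    new_words = []
    for i in range(len(start_word)):
        for letter in alphabet:
            # Keep everything before and after position i, swap in the new letter.
            new_words.append(start_word[:i] + letter + start_word[i + 1:])
    return new_words

words = make_new_words('car')
print(len(words))   # 78
print(words[:3])    # ['aar', 'bar', 'car']
print(words[-1])    # caz
```

Note that the start word itself appears once per position (three times for car), which matches the lists given in the question.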
This is a homework question. I need to define a function that takes a word and letter and deletes all occurrences of that letter in the word. I can't use stuff like regex or the string library. I've tried...
def delete(word,letter):
    word = []
    char = ""
    if char != letter:
        word+=char
    return word
and
def delete(word,letter):
    word = []
    char = ""
    if char != letter: #I also tried "if char not letter" for both
        word = word.append(char)
    return word
Both don't give any output. What am I doing wrong?
Well, look at your functions closely:
def delete(word,letter):
    word = []
    char = ""
    if char != letter:
        word+=char # or `word = word.append(char)` in 2nd version
    return word
So, the function gets a word and a letter passed in. The first thing you do is throw away the word, because you are overwriting the local variable with a different value (a new empty list). Next, you initialize an empty string char and compare its content (it’s empty) with the passed letter. If they are not equal, i.e. if letter is not an empty string, the empty string in char is added to the (empty list) word. And then word is returned.
Also note that you cannot add a string to a list with +. The + operation on lists is only implemented to combine two lists (word += char happens to run, because += extends the list with the characters of the string), so the intent of your append version is definitely less wrong, although word.append(char) returns None, so assigning its result back to word loses the list. Given that you want a string as a result, it makes more sense to just store the result as one to begin with.
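The list operations mentioned above behave slightly differently from one another, which is easy to check. This is a small sketch with throwaway values; the names word and result are reused here only for illustration.

```python
word = []
try:
    word = word + "a"          # list + str is not defined
except TypeError as exc:
    print("TypeError:", exc)

word += "ab"                   # += extends the list with each character
print(word)                    # ['a', 'b']

result = word.append("c")      # append mutates in place and returns None
print(result)                  # None
print(word)                    # ['a', 'b', 'c']
```

So word = word.append(char) would leave word set to None, which is one more reason to build the result under a separate name.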
Instead of adding an empty string to an empty string/list when something completely unrelated to the passed word happens, what you rather want to do is keep the original word intact and somehow look at each character. You basically want to loop through the word and keep all characters that are not the passed letter; something like this:
def delete(word, letter):
    newWord = '' # let's not overwrite the passed word
    for char in word:
        # `char` is now each character of the original word.
        # Here you now need to decide if you want to keep the
        # character for `newWord` or not.
        pass # placeholder so the skeleton runs; replace with your decision
    return newWord
The for var in something will basically take the sequence something and execute the loop body for each value of that sequence, identified using the variable var. Strings are sequences of characters, so the loop variable will contain a single character and the loop body is executed for each character within the string.
You're not doing anything with word passed to your function. Ultimately, you need to iterate over the word passed into your function (for character in word: doSomething_with_character) and build your output from that.
def delete(word, ch):
    # join the surviving characters: in Python 3, filter returns an iterator, not a str
    return ''.join(filter(lambda c: c != ch, word))
Basically, just a linear pass over the string, dropping out letters that match ch.
filter is a higher-order function: it takes a function and an iterable. A string is an iterable, and iterating over it yields the characters it contains. filter drops the elements of the iterable for which the function returns False.
In this case, we filter out all characters that are equal to the passed ch argument.
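For reference, here is how the filter approach might be written so that it returns a string on Python 3, where filter yields a lazy iterator rather than a str. This is a sketch; the second function, delete_with_generator, is a hypothetical name for the same linear pass written as a generator expression.

```python
def delete(word, ch):
    kept = filter(lambda c: c != ch, word)  # lazy iterator in Python 3
    return ''.join(kept)

def delete_with_generator(word, ch):
    # Same linear pass, written as a generator expression.
    return ''.join(c for c in word if c != ch)

print(delete("banana", "a"))                 # bnn
print(delete_with_generator("banana", "a"))  # bnn
```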
I like the functional style of #TC1 and #user2041448, which is worth understanding. Here's another implementation:
def delete( letter, string ):
    s2 = []
    for c in string:
        if c!=letter:
            s2.append( c )
    return ''.join(s2)
Your first function uses + operator with a list which probably isn't the most appropriate choice. The + operator should probably be reserved for strings (and use .append() function with lists).
If the intent is to return a string, assign "" instead of [], and use + operators.
If the intent is to return a list of characters assign [], and use .append() function.
Change the name of the variable you are using to construct the returned value.
Assigning anything to word gets rid of the content that was given to the function as an argument,
so make it result = [] or result = "", etc.
Also:
The way you seem to be attempting to solve this requires you to loop over the characters in the original string, but the code you posted does not loop at all.
You could use a for loop with this type of structure:
for characterVar in stringVar:
    controlled-code-here
code-after-loop
You can (and should) change the names, of course, but I named them in a way that should help you understand. In your case stringVar would be replaced with word, and you would append or add characterVar to result if it isn't the deleted character. Any code that you wish to be contained in the loop must be indented. The first unindented line following the control line indicates to Python that the code comes after the loop.
This is what I came up with:
def delete(word, letter):
    new_word = ""
    for i in word:
        if i != letter:
            new_word += i
    return new_word
I had some code that worked fine for removing punctuation and numbers using regular expressions in Python. I had to change the code a bit so that a stop list worked, which is not particularly important. Anyway, now the punctuation isn't being removed and, quite frankly, I'm stumped as to why.
import re
import nltk
# Quran subset
filename = raw_input('Enter name of file to convert to ARFF with extension, eg. name.txt: ')
# create list of lower case words
word_list = re.split('\s+', file(filename).read().lower())
print 'Words in text:', len(word_list)
# punctuation and numbers to be removed
punctuation = re.compile(r'[-.?!,":;()|0-9]')
for word in word_list:
    word = punctuation.sub("", word)
print word_list
Any pointers on why it's not working would be great. I'm no expert in Python, so it's probably something ridiculously stupid. Thanks.
Change
for word in word_list:
    word = punctuation.sub("", word)
to
word_list = [punctuation.sub("", word) for word in word_list]
Assignment to word in the for loop above simply changes the value referenced by this temporary variable. It does not alter word_list.
You're not updating your word list. Try
for i, word in enumerate(word_list):
    word_list[i] = punctuation.sub("", word)
Remember that although word starts off as a reference to the string object in the word_list, assignment rebinds the name word to the new string object returned by the sub function. It doesn't change the originally referenced object.
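The rebinding behaviour described above can be seen with the same pattern on a short sample. This is a sketch written for Python 3 with a made-up input string instead of a file; the final step that drops empty entries is an assumption about what is wanted, not something the question asked for.

```python
import re

punctuation = re.compile(r'[-.?!,":;()|0-9]')
word_list = "hello, world! it's 2 (two) words.".split()

for word in word_list:
    word = punctuation.sub("", word)   # rebinds `word`; word_list is untouched
print(word_list[0])                    # hello,

for i, word in enumerate(word_list):
    word_list[i] = punctuation.sub("", word)
# Drop entries that became empty once digits/punctuation were stripped.
word_list = [w for w in word_list if w]
print(word_list)  # ['hello', 'world', "it's", 'two', 'words']
```

The apostrophe survives because it is not in the character class; add it to the pattern if it should be removed as well.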