Python Counting words

I want to write a program that selects every third letter of a sentence (starting from the first letter) and prints those letters with spaces between them. I can't get it to work. It should run like this:
Message? pbaynatnahproarnsm
p y t h o n
The code I have so far is:
p = raw_input("Message? ")
count = 3
p.count()
print p
Can you please help me out with this? Thanks.

Grabbing every third letter is easy with Python slice notation:
In [5]: x = 'pbaynatnahproarnsm'
In [6]: x[::3]
Out[6]: 'python'
You can then add a space in between each letter using str.join:
In [7]: ' '.join(x[::3])
Out[7]: 'p y t h o n'
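
The two steps can be wrapped in a small helper. This is only a minimal sketch: the name every_nth and its step and sep parameters are made up for illustration and do not come from the answer above.

```python
def every_nth(message, step=3, sep=" "):
    """Return every `step`-th character of `message`, joined by `sep`."""
    # message[::step] starts at index 0 and jumps `step` characters each time.
    return sep.join(message[::step])


print(every_nth("pbaynatnahproarnsm"))  # p y t h o n
```

Keeping the step as a parameter means the same helper can pick every second or fourth letter without further changes.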

Related

How to split string while keeping \n

I want to write the first letter of every word while keeping the line breaks, but when I turn the list into a string it is written on one line, like this: "I w t w f l o e I w l s s". I want the output to look like this: "I w t \n w t f l \n o e i \n w l \n s s".
r = '''I want to
write the first letter
of every item
while linebreak
stay same'''
list_of_words = r.split()
m = [x[0] for x in list_of_words]
string = ' '.join([str(item) for item in m])
print(string)

You are splitting all the lines in a single go, so you lose the information about where each line ends. You need to create a list of lists to preserve it.
Calling split() with no argument splits on any whitespace, which includes both ' ' and '\n'.
r = '''I want to
write the first letter
of every item
while linebreak
stay same'''
list_of_words = [i.split() for i in r.split('\n')]
m = [[y[0] for y in x] for x in list_of_words]
string = '\n'.join([' '.join(x) for x in m])
print(string)
I w t
w t f l
o e i
w l
s s

Via regexp
r = '''I want to
write the first letter
of every item
while linebreak
stay same'''
import re
string = re.sub(r"(.)\S+(\s)", r"\1\2", r + " ")[:-1]
print(string)
Output:
I t
w t f l
o e i
w l
s s
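
Note that the first line of the output above has lost the "w" of "want": the one-letter word "I" cannot match (.)\S+, so the match starts at the following space instead. One possible variant, offered as a sketch and not as the answerer's method, captures the first non-space character of every word and never touches the whitespace:

```python
import re

r = '''I want to
write the first letter
of every item
while linebreak
stay same'''

# (\S) captures the first character of a word and \S* consumes the rest.
# Spaces and newlines are never matched, so they are preserved as they are.
string = re.sub(r"(\S)\S*", r"\1", r)
print(string)
```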

You are taking the first letter of each word in the list and then joining them, so you are not keeping track of the \n in the string.
You could do this instead:
list_of_words = r.split('\n')
m = [[x[0] for x in y.split()] for y in list_of_words]
for i in m:
    string = ' '.join(i)
    print(string)
Output
I w t
w t f l
o e i
w l
s s

Here is a solution using a while loop:
r = '''I want to
write the first letter
of every item
while linebreak
stay same'''
total_lines = len(r.splitlines())
line_no = 0
while line_no < total_lines:
    words_line = r.splitlines()[line_no]
    list_of_words = words_line.split()
    m = [x[0] for x in list_of_words]
    print(' '.join([str(item) for item in m]))
    line_no = line_no + 1

Since many valid methods were already provided, here's a compact way of doing the same task without str.split(), which creates intermediate lists in memory (not that this is a problem in this case).
This method uses str.isspace() inside a list comprehension to do everything in one line. Here r is the input string from the question:
string = "".join([r[i] for i in range(len(r)) if r[i].isspace() or r[i-1].isspace() or i == 0])
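
The same rule (keep a character if it is whitespace or if the character before it is whitespace) can be spelled out as a loop, which may be easier to follow. This is a sketch: first_letters is a hypothetical helper name, and it tracks the previous character in a variable instead of indexing with i-1.

```python
def first_letters(text):
    """Keep all whitespace plus the first character of each word."""
    kept = []
    previous = " "  # treat the start of the text as if preceded by whitespace
    for ch in text:
        if ch.isspace() or previous.isspace():
            kept.append(ch)
        previous = ch
    return "".join(kept)


print(first_letters("I want to\nwrite the first letter"))
```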

Python create a street address table

Very new to programming, very new to Python, I take different tasks online. The goal is to accomplish a lot without relying on external libraries.
One task I couldn't do today is this one:
Given a street name and a user provided number, create a table of user_provided_number columns and output the name of the street. Then, in the same table create the same output but reverse the street address. The space between the street addresses should be replaced with a "|". If the street name is too short to complete the row, render "?" for each remaining space.
Scenario Example:
Street address: Mystreet road, user provided number: 6
Expected output:
M y s t r e
e t | r o a
d | d a o r
| t e e r t
s y M ? ? ?
So far I managed to do the following:
strAddress = input("What's your street address?")
givenNumber = input("What's your favourite number from 1 to 10?")
reverseAddress = strAddress[::-1]
splitAddress = list(strAddress)
for row in range(0,int(len(strAddress)/givenNumber)):
    for element in range(0,givenNumber):
        print (splitAddress[element], end=' ')
    print()
Why is this "array"(?) printing the same elements on each row? Assuming that the user provided "4" as their number, from the code I wrote I expected an output like that:
M y s t
r e e t
r o a
d
however the output is:
M y s t
M y s t
M y s t

First of all, you should convert givenNumber with int(), since input() always returns a string. You can also append the reversed address to strAddress itself to make it easier to access. splitAddress isn't needed, since a string supports len() and indexing the same way a list does here.
In your first loop you iterate over len(strAddress)/givenNumber rows, which isn't enough: the address has to be printed twice (the second time reversed), and the last row has to be filled with ? characters, so the row count must be rounded up. Without the math library, this can be done as shown below.
Lastly, splitAddress[element] accesses index element of the address, which runs from 0 to givenNumber - 1 on every row, so row has to be taken into account to print the later elements.
strAddress = input("What's your street address?")
givenNumber = int(input("What's your favourite number from 1 to 10?"))
strAddress += '|' + strAddress[::-1]
strAddress = strAddress.replace(' ', '|')
lines_to_print = len(strAddress)//givenNumber + (len(strAddress)%givenNumber>0)
for row in range(lines_to_print):
    for element in range(givenNumber):
        if row*givenNumber + element < len(strAddress):
            print(strAddress[row*givenNumber + element], end=' ')
        else:
            print('? ', end='')
    print()
Output for Mystreet road and 6
M y s t r e
e t | r o a
d | d a o r
| t e e r t
s y M ? ? ?

Your issue is that the nested loop starts back at 0 every time and ends at the same place every time. With your current code, the outer loop only determines how many times the inner loop runs; it has no influence on which elements the inner loop prints. To fix this, you could use for element in range(givenNumber*row, givenNumber*(row+1)), since row starts at 0.

You never progress through the street address. row takes on values 0, 1, 2; but you never use those values to move along the address string. Look at what you print:
for element in range(0,givenNumber):
    print (splitAddress[element], end=' ')
This prints the same four characters, regardless of the row value. Instead, you need to truly split the address into rows and print those. Alternately, you can compute the correct indices for each row: givenNumber*row + element.
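
The index computation can be sketched as follows. The values are hard-coded instead of read with input(), columns stands in for givenNumber, and the rows are collected in a list (lines, a name introduced here) so they can be inspected before printing.

```python
address = "Mystreet road"
columns = 4  # stands in for givenNumber

# Ceiling division without the math module.
row_count = -(-len(address) // columns)

lines = []
for row in range(row_count):
    cells = []
    for element in range(columns):
        index = columns * row + element  # position in the whole string
        if index < len(address):  # the last row may be shorter
            cells.append(address[index])
    lines.append(" ".join(cells))

print("\n".join(lines))
```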

Another solution would be to just build your string (replace characters, reverse it, ...) and then print this string character by character for each defined row. In order to calculate the number of filling characters for the last row, you could make use of the modulo operator with negative numbers.
Say your final string (chars) is 27 characters long and the given cell number (givenNumber) is 7. This would result in -27 % 7 = 1. So in this case one filling character would need to be added. chars += charFill * numCharFill will then just add the filling character x times at the end.
With an index you can then go through your string step by step and configure the output as required.
# strAddress = input("What's your street address?")
# givenNumber = int(input("What's your favourite number from 1 to 10?"))
strAddress = "Mystreet road"
givenNumber = 6
charFill = "?"  # char to fill last row
chars = strAddress.replace(" ","|")  # replace spaces in strAddress
chars += "|" + chars[::-1]  # add reverse chars
numCharFill = -len(chars)%givenNumber  # modulo of negative number
chars += charFill * numCharFill  # add fill character x times
index = 0
for char in chars:
    if index > 0 and not index%givenNumber:
        print()
    print(chars[index], end=' ')
    index = index + 1
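
The padding trick can also be packaged as a function that returns the rows instead of printing them one character at a time. This is a sketch under assumptions: address_table and its parameter names are hypothetical, and slicing is used in place of the running index.

```python
def address_table(address, columns, fill="?"):
    """Return the table rows as strings (hypothetical helper)."""
    # Forward address, a separator, then the reversed address; spaces become "|".
    chars = (address + " " + address[::-1]).replace(" ", "|")
    # -len(chars) % columns is the number of cells missing from the last row.
    chars += fill * (-len(chars) % columns)
    return [" ".join(chars[i:i + columns]) for i in range(0, len(chars), columns)]


for row in address_table("Mystreet road", 6):
    print(row)
```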

Try:
strAddress = input("What's your street address?\n")
givenNumber = int(input("What's your favourite number from 1 to 10?\n"))
charGroupSize = len(strAddress)/givenNumber
charGroups = [strAddress[i:i+givenNumber] for i in range(0, len(strAddress), givenNumber)]
for group in charGroups:
    for char in group:
        print (char, end=' ')
    print()
Output:
What's your street address?
Mystreet road
What's your favourite number from 1 to 10?
4
M y s t
r e e t
r o a
d

(basic) Print each letter in Indexing

a = 'Hello World!'
print(a[0])
then I will get 'H'.
But is there any way I can get all the letters in a separately without typing print many times?

This code:
a = 'Hello World!'
print(*a, sep=" ")
will print:
H e l l o   W o r l d !
This code:
a = 'Hello World!'
print(*a, sep="\n")
will print:
H
e
l
l
o
W
o
r
l
d
!

Are you asking how to iterate?
for letter in a:
    print(letter, end='')
This assigns each item in the iterable a in turn to the variable letter for the duration of the indented block, and executes the block as many times as there are items.
Equivalently, you can use an index into the string:
for idx in range(len(a)):
    print(a[idx], end='')
range(n) simply produces the numbers 0, 1, 2, ... n-1 (you can optionally make it start from 1 or any other integer, count in a different direction, etc).
If you want to keep a newline after each letter, you can take out the end='' keyword parameter.
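
When both the position and the letter are needed, the built-in enumerate gives them together, which avoids indexing by hand. A short sketch, with pairs as a name introduced here for illustration:

```python
a = 'Hello World!'

# enumerate yields (index, item) tuples for any iterable.
pairs = list(enumerate(a))

for idx, letter in pairs:
    print(idx, letter)
```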

Python programming - beginner

I have to write code that reads every third letter and puts a space between each letter. My code creates the spaces, but it also prints a space after the last letter. This is my code:
msg = input("Message? ")
length = len(msg)
for i in range (0, length, 3):
    x = msg[i]
    print(x, end=" ")
My output was:
Message?
I enter:
cxohawalkldflghemwnsegfaeap
I get back
c h a l l e n g e
when the output isn't meant to have the last " " after the e.
I have read that adding print(" ".join(x)) should give me the output I need, but when I put it in it just gives me an error. Please help, and thank you.

In Python, strings are one kind of data structure called sequences. Sequences support slicing, which is a simple way of expressing things like "from nth", "to nth" and "every nth". The syntax is sequence[from_index:to_index:stride]. One does not even need a for loop for that.
We can get every third character easily by omitting from_index and to_index and using a stride of 3:
>>> msg = input("Message? ")
Message? cxohawalkldflghemwnsegfaeap
>>> every_3th = msg[::3]
>>> every_3th
'challenge'
Now, we just need to insert spaces between the letters. separator.join(iterable) joins the elements of iterable in order, with the given separator in between. A string is an iterable whose elements are the individual characters.
Thus we can do:
>>> answer = ' '.join(every_3th)
>>> answer
'c h a l l e n g e'
For the final code we can omit the intermediate variables and still have a quite readable two-liner:
>>> msg = input('Message? ')
>>> print(' '.join(msg[::3]))
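
To see why the loop in the question leaves a trailing space while join does not, the two approaches can be compared side by side. This sketch hard-codes the message instead of calling input(), and builds strings (looped and joined, names chosen here) instead of printing directly, so the difference is visible with repr.

```python
msg = "cxohawalkldflghemwnsegfaeap"

# Loop version, as in the question: every letter is followed by a space,
# including the last one.
looped = ""
for i in range(0, len(msg), 3):
    looped += msg[i] + " "

# join only puts the separator between elements, never after the last one.
joined = " ".join(msg[::3])

print(repr(looped))  # 'c h a l l e n g e '
print(repr(joined))  # 'c h a l l e n g e'
```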

Try
>>> print(" ".join([msg[i] for i in range(0, len(msg), 3)]))
c h a l l e n g e

Python fastest way to remove single spaces from spaced out letters in string

I have a document with some lines that have spaced out letters which I want to remove.
The problem is that the strings don't all follow the same rules. Some have just one space everywhere, including between the words, and some have two or three spaces between the words.
Examples:
"H e l l o g u y s"
"H e l l o  g u y s"
"H e l l o   g u y s"
all the above should be converted to --> "Hello guys"
"T h i s i s P a g e 1" --> "This is Page 1"
I wrote a script to remove every second space but not if next letter is numeric or capital. It's working almost OK, since the processed text is German and almost every time the words begin with capital letters... almost.
Anyways I'm not satisfied with it. So I'm asking if there is a neat function for my problem.
import re
text = text.strip()  # remove spaces from start and end
out = text
if text.count(' ') >= (len(text)/2)-1:
    out = ''
    idx = 0
    for c in text:
        # keep a space only before a digit, whitespace or capital letter, or after a hyphen
        if c != ' ' or re.match(r'[0-9]|\s|[A-Z0-9ÄÜÖ§€]', text[idx+1]) or (idx > 0 and text[idx-1] == '-'):
            out += c
        idx += 1
text = out

Not the most original answer but I've seen that your problem almost matches this one.
I have taken unutbu's answer, slightly modified it to solve your queries with enchant. If you have any other dictionary, you can use that instead.
import enchant
d = enchant.Dict("en_US")  # or de_DE
def find_words(instring, prefix = ''):
    if not instring:
        return []
    if (not prefix) and (d.check(instring)):
        return [instring]
    prefix, suffix = prefix + instring[0], instring[1:]
    solutions = []
    # Case 1: prefix in solution
    if d.check(prefix):
        try:
            solutions.append([prefix] + find_words(suffix, ''))
        except ValueError:
            pass
    # Case 2: prefix not in solution
    try:
        solutions.append(find_words(suffix, prefix))
    except ValueError:
        pass
    if solutions:
        return sorted(solutions,
                      key = lambda solution: [len(word) for word in solution],
                      reverse = True)[0]
    else:
        raise ValueError('no solution')
inp = "H e l l o g u y s T h i s i s P a g e 1"
newInp = inp.replace(" ", "")
print(find_words(newInp))
This outputs:
['Hello', 'guys', 'This', 'is', 'Page', '1']
The linked page certainly is a good starting point for some pragmatic solutions. However, I think a proper solution should use n-grams. This solution could be modified to make use of multiple whitespaces as well, since they might indicate the presence of a word boundary.
Edit:
You can also have a look at Generic Human's solution using a dictionary with relative word frequencies.

You can check whether a word is an English word and then split the words. You could use a dedicated spellchecking library like PyEnchant.
For example:
import enchant
d = enchant.Dict("en_US")
d.check("Hello")
This will be a good starter. But there is the problem with "Expertsexchange".

Converting "H e l l o g u y s" might be very hard, or outside the scope of this site. But if you want to convert strings like "H e l l o  g u y s", where the number of spaces between words differs from the number of spaces between letters, you can use the following code:
>>> import re
>>> s1="H e l l o  g u y s"
>>> s2="H e l l o   g u y s"
>>> ' '.join([''.join(i.split()) for i in re.split(r' {2,}',s2)])
'Hello guys'
>>> ' '.join([''.join(i.split()) for i in re.split(r' {2,}',s1)])
'Hello guys'
This code uses a regular expression (' {2,}') to split the string into words wherever there are two or more consecutive spaces.
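
The same idea can be wrapped in a function. This is a sketch, with collapse_spaced as a hypothetical name; the last call shows the limitation: when words are separated by a single space too, there is nothing to split on and the words run together.

```python
import re


def collapse_spaced(text):
    """Join spaced-out letters; runs of 2+ spaces are treated as word gaps."""
    words = re.split(r" {2,}", text.strip())
    # Inside each word, drop the single spaces between the letters.
    return " ".join("".join(word.split()) for word in words)


print(collapse_spaced("H e l l o  g u y s"))           # Hello guys
print(collapse_spaced("T h i s   i s   P a g e   1"))  # This is Page 1
print(collapse_spaced("H e l l o g u y s"))            # Helloguys
```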

Demo
This is an algorithm that could do it. Not battle-tested, but just an idea.
d = ['this', 'is', 'page', 'hello', 'guys']
m = ["H e l l o g u y s", "T h i s i s P a g e 1", "H e l l o g u y s", "H e l l o g u y s"]
j = ''.join(m[0].split()).lower()
temp = []
fix = []
for i in j:
    temp.append(i)
    s = ''.join(temp)
    if s in d:
        fix.append(s)
        del temp[:]
    if i.isdigit():
        fix.append(i)
print(' '.join(fix))
Prints the following:
this is page 1, hello guys with your supplied test inputs.
Extending
You can use this dictionary which has words on each line, convert it to a list and play around from there.
Issues
As Martijn suggested, what would you do when you encounter "E x p e r t s e x c h a n g e"? In such scenarios, using n-gram probabilities would be an appropriate solution. For this you would have to look into NLP (Natural Language Processing), but I assume you don't want to go that far.

You cannot do this - the situation where valid word boundaries are represented the same way as spaces which should be removed is theoretically the same situation where you have no spaces at all in the text.
So you can "reduce" your problem to the problem of re-inserting word boundary spaces in a text with no spaces at all - which is just as impossible, because even with a dictionary containing every valid word - which you do not have -, you can either go for a greedy match and insert too few spaces, or go for a non-greedy match and insert too many.
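
The ambiguity is easy to reproduce. The sketch below lists every way to split a string into dictionary words; segmentations and toy_dictionary are names invented for this illustration, and the word list is deliberately tiny. Even with it, the input has two valid readings, and nothing in the text itself says which one was meant.

```python
def segmentations(text, words):
    """Yield every way to split `text` into entries of `words`."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in words:
            # Recurse on the remainder and prepend the word found so far.
            for rest in segmentations(text[end:], words):
                yield [head] + rest


toy_dictionary = {"expert", "experts", "sex", "exchange", "change"}
for split in segmentations("expertsexchange", toy_dictionary):
    print(" ".join(split))
```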
