String concatenation acting crazy - python

I'm trying to append a number to each word from a .txt list, e.g. 'word' + str(1) should give 'word1',
with this code:
def function(item):
    for i in range(2):
        print item + str(i)
with open('dump/test.txt') as f:
    for line in f:
        function(str(line))
I'll just use a txt file containing just two words ('this', 'that').
What I get is:
this
0
this
1
that0
that1
What I was expecting:
this0
this1
that0
that1
It works fine if I just use function('this') and function('that') but why doesn't it work with the txt input?
--- Edit:
Solved, thank you!
The problem was caused by newline characters in the strings received.
Solution: see Answers

You should change
print item + str(i)
to
print item.rstrip() + str(i)
This will remove any newline characters in the strings received in function.
A couple of other tips:
A better way to print data is to use the .format method, e.g. in your case:
print '{}{}'.format(item.strip(), i)
This method is very flexible if you have a more complicated task.
All rows read from a file are strings - you don't have to call str() on them.

The first line Python reads is this\n, so appending 0 and 1 gives this\n0 and this\n1. The second line has no newline after it because it is the end of the file (judging from the output you get), so appending works as expected there.
To remove the \n from the right end of the string, use rstrip('\n'):
print (item.rstrip('\n') + str(i))
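To see the newline for yourself, a small sketch may help. It is written for Python 3 (the question uses Python 2, where print is a statement). io.StringIO stands in for dump/test.txt, and the helper name numbered is made up for illustration:

```python
import io

def numbered(item, count=2):
    """Return item with 0..count-1 appended, after removing a trailing newline."""
    word = item.rstrip('\n')
    return [word + str(i) for i in range(count)]

# Stand-in for the real file: the last line has no trailing newline
fake_file = io.StringIO("this\nthat")
lines = list(fake_file)

# repr() makes the hidden newline visible
print(repr(lines[0]), repr(lines[1]))

for line in lines:
    for result in numbered(line):
        print(result)
```

Printing repr(line) is a quick way to spot invisible characters whenever string output looks broken.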

The first line is actually "this\n", which is why the output is broken across two lines. Strip the newline character before passing the string to the function.

Related

Weird error in loop not understanding why the range is out of index

I was asked to:-
Write a program to input a string and then using a function change(), create a new string
with all the consonants deleted from the string. The function should take in the string as
a parameter and return the converted string.
my code :-
str=input("enter a string: ")
def change(stri):
    for i in range(0,len(stri)):
        for e in ['a','e','i','o','u','A','E','I','O','U']:
            if stri[i]==e:
                if i==len(stri)-1:
                    stri = stri[0:i-1] + "" + stri[i: ]
                else:
                    stri = stri[0:i] + "" + stri[i+1: ]
            else:
                continue
    return stri
str=change(str)
print(str)
output :-
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    str=change(str)
  File "main.py", line 5, in change
    if stri[i]==e:
IndexError: string index out of range
This happens for any string.
Could someone please help me out? This is an important project for me.
As said by Johny Mopp: "Let's say len(stri) == 10. Your for loop is then for i in range(0, 10):. But then in the loop you modify stri so that its new length is less than 10. You will now get an index out of range error at if stri[i]==e: because i is too large"
Use this instead:
text=input("enter a string: ")
vowels = ['a','e','i','o','u','A','E','I','O','U']
for x in vowels:
    text = text.replace(x,"")
print(text)
Generally speaking, don't try to reinvent the wheel. Make use of all functions Python offers you.
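The failure described in the quote can be reproduced in a few lines. This is only a sketch: the function name remove_vowels_in_place is made up, and the function is buggy on purpose to show that range() is evaluated once, before the string shrinks:

```python
def remove_vowels_in_place(text):
    """Buggy on purpose: shrinks text while looping over its original length."""
    for i in range(len(text)):          # the range is fixed from the original length
        if text[i] in "aeiouAEIOU":     # fails once i is past the shortened string
            text = text[:i] + text[i + 1:]
    return text

try:
    remove_vowels_in_place("audio")
except IndexError as exc:
    print("IndexError:", exc)

# A string without vowels never shrinks, so no error occurs
print(remove_vowels_in_place("xyz"))
```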
The issue you are facing is common when deleting from the same object that you are iterating. Eventually you delete a part of the object that is already queued up as part of a future iteration in the for loop and you get an error.
The best way around this, in your case, is to write your vowel only string out to a new variable keeping your stri variable intact as it was passed into your function.
A quick rewrite of your code with the addition of the new variable to catch your output string would look like:
vowels=['a','e','i','o','u','A','E','I','O','U']
stri='This is a test string'
stro=''
for character in stri:
    if character in vowels:
        stro=stro+character
print(stro)
I believe the issue is that you change the string inside the for loop. You delete letters, but the loop still runs over the original range (range(0, len(stri)) is evaluated once and is not updated when stri changes). Try it with a word that has no vowels: you will not get any error.
But there is a much simpler way of doing this if I understand your task correctly.
def change2(str):
    return ''.join([letter for letter in str if letter not in ['a','e','i','o','u','A','E','I','O','U']])
print(change2(input('enter a string: ')))
The join method creates the new string and returns it. The separator placed between the list elements is the string on which join() is called. In this example, it is an empty string.
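As a small illustration of how the separator works, here is join() called on two different strings (the list letters is just sample data):

```python
letters = ['p', 'y', 't', 'h', 'n']

no_separator = ''.join(letters)     # elements are glued together directly
with_dashes = '-'.join(letters)     # '-' is placed between the elements

print(no_separator)
print(with_dashes)
```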
You should build the result in a new string variable instead of changing the string that was passed to the function. Shrinking that string while looping over its original length is what causes the error.
str=input("enter a string: ")
def change(stri):
    nstr = ''  # the result is collected here; stri is never modified
    for i in range(0,len(stri)):
        # keep the character only if it is not a vowel
        if stri[i] not in ['a','e','i','o','u','A','E','I','O','U']:
            nstr = nstr + stri[i]
    return nstr
str=change(str)
print(str)

python 3 parsing a semicolon separated very long string to remove each second element

I'm pretty new to Python and am looking for a way to get the following result from a long string.
I'm reading in lines of a text file where each line looks like this:
; 2:55:12;PuffDG;66,81; Puff4OG;66,75; Puff3OG;35,38;
After processing, the data should be stored in another text file in this form
(short example):
2:55:12;66,81;66,75;35,38;
the real string is much longer but always with the same pattern
; 2:55:12;PuffDG;66,81; Puff4OG;66,75; Puff3OG;35,38; Puff2OG;30,25; Puff1OG;29,25; PuffFB;23,50; ....
So this means: remove the leading semicolon,
keep the second element,
remove the third element,
keep the fourth element,
remove the fifth element,
keep the sixth element,
and so on.
The number of elements can vary, so I guess that as a first step I have to parse the string to get the number of elements, then loop through the string and assign each part that should be kept to a variable.
I have tried some variations of .split(), but with no success.
Would it be easier to store all elements in a list and then loop through the list, keeping and dropping elements?
If yes, what would this look like, so that at the end I have stored a file with
lines like this:
2:55:12 ; 66,81 ; 66,75 ; 35,38 ;
2:56:12 ; 67,15 ; 74;16 ; 39,15 ;
etc. ....
best regards Stefan
This solution works independently of the content between the semicolons
One line, though it's a bit messier:
result = ' ; '.join(string.split(';')[1::2])
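Getting rid of the leading semicolon:
Just slice it off! This drops the first two characters, the semicolon and the space after it:
string = string[2:]
Splitting by semicolon and keeping every second element:
Given the sliced string, we can split by semicolon:
arr = string.split(';')[::2]
The [::2] means to take every second element, starting with index 0. Because the leading semicolon is already gone, this keeps the timestamp and every value (the second, fourth, sixth, etcetera elements of the original line) and drops the names in between.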
Resulting string
To produce the string result you want, simply .join:
result = ' ; '.join(arr)
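Putting the pieces together for whole lines could look roughly like this. It is a sketch, not a definitive implementation: the function name keep_every_second and the list sample_lines are made up, and it applies [1::2] to the unsliced line, as in the one-liner above. In real use the list would be replaced by iterating over the opened input file:

```python
def keep_every_second(line):
    """Keep the 2nd, 4th, 6th, ... semicolon-separated fields of one line."""
    # On the raw line, index 0 is the empty text before the leading semicolon
    fields = line.strip().split(';')[1::2]
    return ' ; '.join(field.strip() for field in fields) + ' ;'

# Stand-in for the lines of the input file
sample_lines = [
    "; 2:55:12;PuffDG;66,81; Puff4OG;66,75; Puff3OG;35,38;\n",
]

for line in sample_lines:
    print(keep_every_second(line))
```

Stripping each field removes the stray space in front of the timestamp, so the result matches the layout asked for in the question.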
A regex-based solution, which operates on the original input:
import re
inp = "; 2:55:12;PuffDG;66,81; Puff4OG;66,75; Puff3OG;35,38;"
# remove every field that starts with an uppercase letter, then drop the leading "; "
output = re.sub(r'\s*[A-Z][^;]*?;', '', inp)[2:]
print(output)
This prints:
2:55:12;66,81;66,75;35,38;
This shows how to do it for one line of input, provided the same pattern repeats every time:
input_str = "; 2:55:12;PuffDG;66,81; Puff4OG;66,75; Puff3OG;35,38;"
output_list = input_str.split(';')[1::2] # create list with the values of interest
# open the output file for writing; the with block closes it when done
with open('output.txt', 'w') as f:
    for out in output_list:
        f.write(f"{out.strip()} ; ")
    # end line
    f.write("\n")
Thank you very much for the quick response. You are awesome.
Your solutions are very compact.
In the meantime I found another solution, but it needs more lines of code.
best regards Stefan
I'm not familiar with how to insert code as a proper code section,
so I've added it as plain text:
fobj = open(r"C:\Users\Stefan\AppData\Local\Programs\Python\Python38-32\Heizung_2min.log")
wobj = open(r"C:\Users\Stefan\AppData\Local\Programs\Python\Python38-32\Heizung_number_2min.log","w")
for line in fobj:
    # use the line from the loop; calling fobj.readline() here would skip every other line
    TextLine = line
    print(TextLine)
    myList = TextLine.split(';')
    TextLine = ""
    # keep only the elements at odd indexes (second, fourth, ...)
    for index, item in enumerate(myList):
        if index % 2 == 1:
            TextLine += item
            TextLine += ";"
    TextLine += '\n'
    print(TextLine)
    wobj.write(TextLine)
fobj.close()
wobj.close()

Get some string before " not in all lines python

I have entries in a txt file with this structure:
Some sentence.
Some other "other" sentence.
Some other smth "other" sentence.
In original:
Камиш-Бурунський залізорудний комбінат
Відкрите акціонерне товариство "Кар'єр мармуровий"
Закрите акціонерне товариство "Кар'єр мармуровий"
I want to extract everything before " and write to another file. I want the result to be:
Some other
Some other smth
Відкрите акціонерне товариство
Закрите акціонерне товариство
I have done this:
f=codecs.open('organization.txt','r+','utf-8')
text=f.read()
words_sp=text.split()
for line in text:
    before_keyword, after_keyword = line.split(u'"',1)
    before_word=before_keyword.split()[0]
    encoded=before_word.encode('cp1251')
    print encoded
But it doesn't work, since the file has lines that don't contain ". How can I improve my code to make it work?
There are two problems. First you must use the splitlines() function to break a string into lines. (What you have will iterate one character at a time.) Secondly, the following code will fail when split returns a single item:
before_keyword, after_keyword = line.split(u'"',1)
The following works for me:
for line in text.splitlines():
    if u'"' in line:
        before_keyword, after_keyword = line.split(u'"',1)
        ... etc. ...
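An alternative that avoids the unpacking error altogether is str.partition, which always returns three parts. The sketch below is written for Python 3 and uses the English sample lines from the question; the function name text_before_quote is made up for illustration:

```python
def text_before_quote(line):
    """Return the text before the first double quote, or None if the line has none."""
    before, quote, _after = line.partition('"')
    # quote is an empty string when the line contains no double quote
    return before.strip() if quote else None

lines = [
    'Some sentence.',
    'Some other "other" sentence.',
    'Some other smth "other" sentence.',
]

for line in lines:
    result = text_before_quote(line)
    if result is not None:
        print(result)
```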

python, string.replace() and \n

(Edit: the script seems to work for others here trying to help. Is it because I'm running python 2.7? I'm really at a loss...)
I have a raw text file of a book I am trying to tag with pages.
Say the text file is:
some words on this line,
1
DOCUMENT TITLE some more words here too.
2
DOCUMENT TITLE and finally still more words.
I am trying to use python to modify the example text to read:
some words on this line,
</pg>
<pg n=2>some more words here too,
</pg>
<pg n=3>and finally still more words.
My strategy is to load the text file as a string. Build search-for and a replace-with strings corresponding to a list of numbers. Replace all instances in string, and write to a new file.
Here is the code I've written:
from sys import argv
script, input, output = argv
textin = open(input,'r')
bookstring = textin.read()
textin.close()
pages = []
x = 1
while x<400:
    pages.append(x)
    x = x + 1
pagedel = "DOCUMENT TITLE"
for i in pages:
    pgdel = "%d\n%s" % (i, pagedel)
    nplus = i + 1
    htmlpg = "</p>\n<p n=%d>" % nplus
    bookstring = bookstring.replace(pgdel, htmlpg)
textout = open(output, 'w')
textout.write(bookstring)
textout.close()
print "Updates to %s printed to %s" % (input, output)
The script runs without error, but it also makes no changes whatsoever to the input text. It simply reprints it character for character.
Does my mistake have to do with the hard return? \n? Any help greatly appreciated.
In python, strings are immutable, and thus replace returns the replaced output instead of replacing the string in place.
You must do:
bookstring = bookstring.replace(pgdel, htmlpg)
You've also forgotten to call the function close(). See how you have textin.close? You have to call it with parentheses, like open:
textin.close()
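The point about immutability can be checked with a short sketch. The variable names and the sample text are made up, modelled on the example in the question:

```python
page_marker = "1\nDOCUMENT TITLE"
book = "some words on this line,\n1\nDOCUMENT TITLE some more words here too."

# The return value is thrown away here, so book itself stays unchanged
book.replace(page_marker, "</pg>\n<pg n=2>")
still_has_marker = page_marker in book

# Keeping the return value is what actually gives you the modified text
tagged = book.replace(page_marker, "</pg>\n<pg n=2>")

print(still_has_marker)
print(tagged)
```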
Your code works for me, but I might just add some more tips:
Input is a built-in function, so perhaps try renaming that. Although it works normally, it might not for you.
When running the script, don't forget to put the .txt ending:
$ python myscript.py file1.txt file2.txt
Make sure when testing your script to clear the contents of file2.
I hope these help!
Here's an entirely different approach that uses re (the re module must be imported for this to work):
import re
doctitle = False
newstr = ''
page = 1
for line in bookstring.splitlines():
    res = re.match('^\\d+', line)
    if doctitle:
        newstr += '<pg n=' + str(page) + '>' + re.sub('^DOCUMENT TITLE ', '', line)
        doctitle = False
    elif res:
        doctitle = True
        page += 1
        newstr += '\n</pg>\n'
    else:
        newstr += line
print newstr
Since no one knows what's going on, it's worth a try.

Get a value from a string in python

Program Details:
I am writing a program for python that will need to look through a text file for the line:
Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.
Problem:
Then after the program has found that line, it will then store the line into an array and get the value 19.612545, from f = 19.612545.
Question:
I so far have been able to store the line into an array after I have found it. However I am having trouble as to what to use after I have stored the string to search through the string, and then extract the information from variable f. Does anyone have any suggestions or tips on how to possibly accomplish this?
Depending upon how you want to go at it, CosmicComputer is right to refer you to Regular Expressions. If your syntax is this simple, you could always do something like:
line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'
splitByComma=line.split(',')
fValue = splitByComma[1].replace('f= ', '').strip()
print(fValue)
Results in 19.612545 being printed (still a string though).
Split your line by commas, grab the 2nd chunk, and break out the f value. Error checking and conversions left up to you!
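One possible way to add the conversion and error checking is sketched below. The function name extract_f is made up, and returning None for a missing or malformed value is just one choice among several:

```python
def extract_f(line):
    """Return the value after 'f=' as a float, or None if it cannot be found."""
    for chunk in line.split(','):
        # Split each chunk into the part before and after the first '='
        name, sep, value = chunk.strip().partition('=')
        if sep and name.strip() == 'f':
            try:
                return float(value.strip())
            except ValueError:
                return None  # 'f=' was present but not followed by a number
    return None

line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'
print(extract_f(line))
```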
Using regular expressions here is madness. Just use string.find as follows (where string is the name of the variable that holds your string):
index = string.find('f=')
index = index + 2  # skip over 'f='
string = string[index:]  # cuts off the things that you don't need
string = string.split(',')  # splits the remaining string at the commas
your_value = string[0].strip()  # extracts the first field and drops the leading space
I know it's ugly, but it's nothing compared with RE.
