I'm trying to remove a lot of stuff from a text file to rewrite it.
The text file has several hundred items, each consisting of 6 lines.
I got my code working to a point where it puts all lines in an array, identifies the only 2 important lines in every item and deletes the whitespace, but any further stripping gives me the following error:
'list' object has no attribute 'strip'
Here is my code:
x = 0
y = 0
names = []
colors = []
array = []
with open("AA_Ivory.txt", "r") as ins:
    for line in ins:
        array.append(line)
def Function (currentElement, lineInSkinElement):
    name = ""
    color = ""
    string = array[currentElement]
    if lineInSkinElement == 1:
        string = [string.strip()]
        # string = [string.strip()]
        # name = [str.strip("\n")]
        # name = [str.strip(";")]
        # name = [str.strip(" ")]
        # name = [str.strip("=")]
        names.append(name)
        return name
    # if lineInSkinElement == 2:
    #     color = [str.strip("\t")]
    #     color = [str.strip("\n")]
    #     color = [str.strip(";")]
    #     color = [str.strip(" ")]
    #     color = [str.strip("=")]
    #     colors.append(color)
    #     return color
    print "I got called %s times" % currentElement
    print lineInSkinElement
    print currentElement
for val in array:
    Function(x, y)
    x = x +1
    y = x % 6
#print names
#print colors
In the if statement for the names, deleting the first # will give me the error.
I tried converting the list item to string, but then I get extra [] around the string.
The if statement for color can be ignored, I know it's faulty and trying to fix this is what got me to my current issue.
but then I get extra [] around the string
You can loop through the list to get at the string inside it. For example:
for item in string:
    item = item.strip("\n")
    item = item.strip(";")
    item = item.strip(" ")
    item = item.strip("=")
    names.append(item)
return names
This will get you to the string within the list and you can append the stripped string.
If this isn't what you were looking for, post some of the data you're working with to clarify.
Alright, I found the solution. It was a rather dumb mistake of mine. The error occurred because the [] around the strip call turned the result into a list. Removing them fixed it. Feeling relieved now, a bit stupid, but relieved.
You can also do that in one line using the following code.
item = item.strip("\n").strip("=").strip(";").strip()
The last strip will strip the white spaces.
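One caveat with chaining: each strip only removes characters from the two ends of the string as it stands at that point, so the order of the calls matters. As a minimal sketch (the sample line and the helper name clean are made up for illustration, not taken from the asker's file), passing all unwanted characters to a single strip call avoids that:

```python
def clean(line):
    # Hypothetical helper: strip every unwanted character from both ends in one pass.
    return line.strip(" \t\n;=")

sample = "\tname = ;\n"  # made-up line, only for illustration

chained = sample.strip("\n").strip("=").strip(";").strip()
print(repr(chained))        # the '=' survives, because it was not at the end when strip("=") ran
print(repr(clean(sample)))  # all listed characters are removed from both ends
```

Note that strip never touches characters in the middle of the string; for that, replace would be needed.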
Yet again, I do not understand an error I keep encountering. Here is my code:
s = input()
name = input()
splits = s.split(" ")
i = 0
for i in range(len(splits)):
    if(splits[i] == name):
        break
print(splits[i+1])
Here is the error:
Traceback (most recent call last):
File "main.py", line 15, in <module>
print(splits[i+1])
IndexError: list index out of range
I am not sure why [i+1] returns as out of range. What did I screw up this time? I appreciate the help in advance as I don't get much guidance from my instructor or TA. You folks rock here!
Edit: I apologize I did not include a desired outcome.
The input is:
Joe,123-5432 Linda,983-4123 Frank,867-5309
Frank
The output is supposed to be:
867-5309
s = 'Hello'
name = 'Goodbye'
splits = s.split()  # with no argument, split() splits on whitespace: ['Hello'] - notice the single value
for i in range(1):  # because splits has a single item in the list
    if 'Hello' == 'Goodbye':
        break
print(splits[i+1])  # this will not work because the only valid index in splits is 0
# same as
s = [0,1]
print(s[2])
I advise to come up with a desired output when you are asking a question on StackOverflow so others will have a better understanding of your problem.
Also, as mentioned by Hossein in the comment, try using print() in each line of your code to see if you get what you expect.
-- Update --
s_input = 'Joe,123-5432 Linda,983-4123 Frank,867-5309'
to_search = 'Frank'
# split the input
splitted_input = s_input.split()
for item_pair in splitted_input:
    # split the pair by a comma
    pair_split = item_pair.split(',')
    name = pair_split[0]
    number = pair_split[1]
    if name == to_search:
        print(name, number)
Frank 867-5309
The problem is that after the split you have 3 elements, but I assume you expected 6 (3 pairs of a name and a number). No element equals the name on its own, so the loop never breaks, i ends at the last index, and splits[i+1] is out of range. You need to split on whitespace to separate the people from one another, and then split each entry on the comma to separate the number from the name.
But I would suggest using:
def find_number(string_to_search, name_to_find):
    splits = string_to_search.split()
    for item in splits:
        if name_to_find.lower() in item.lower():
            return item.split(',')[1]
find_number('Joe,123-5432 Linda,983-4123 Frank,867-5309', 'Frank')
# '867-5309'
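If several names need to be looked up in the same input, another option is to build a dictionary once and query it. This is only a sketch, assuming every entry has the form name,number; the function name build_phone_book is made up for illustration. Unlike the substring check above, it matches names exactly.

```python
def build_phone_book(entries):
    # Map each name to its number; entries are separated by whitespace.
    phone_book = {}
    for pair in entries.split():
        name, _, number = pair.partition(",")  # split on the first comma only
        phone_book[name] = number
    return phone_book

phone_book = build_phone_book("Joe,123-5432 Linda,983-4123 Frank,867-5309")
print(phone_book.get("Frank", "not found"))
```

Using get with a default avoids a KeyError when the name is not in the input.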
So I have a list of lists that I need to parse through and manipulate the contents of. There are strings of numbers and words in the sublists, and I want to change the numbers into integers. I don't think it's relevant but I'll mention it just in case: my original data came from a CSV that I split on newlines, and then split again on commas.
What my code looks like:
def prep_data(data):
    list = data.split('\n') #Splits data on newline
    list = list[1:-1] #Gets rid of header and last row, which is an empty string
    prepped = []
    for x in list:
        prepped.append(x.split(','))
    for item in prepped: #Converts the item into an int if it is able to be converted
        for x in item:
            try:
                item[x] = int(item[x])
            except:
                pass
    return prepped
I tried to loop through every sublist in prepped and change the type of the values in them, but it doesn't seem like the loop does anything as the prep_data returns the same thing as it did before I implemented that for loop.
I think I see what is wrong: for x in item gives you the values in the sublist, not their positions, so item[x] tries to index a list with a string. That raises a TypeError, which the bare except silently swallows, so nothing is ever converted.
def prep_data(data):
    list = data.split('\n') #Splits data on newline
    list = list[1:-1] #Gets rid of header and last row, which is an empty string
    prepped = []
    for x in list:
        prepped.append(x.split(','))
    for i in range(len(prepped)): #Converts the item into an int if it is able to be converted
        item = prepped[i]
        for j in range(len(item)):
            try:
                item[j] = int(item[j])
            except ValueError:
                pass
    return prepped
I can't run this on the machine I'm on right now, but indexing by position and catching only ValueError should let the conversions go through while leaving the non-numeric strings as they are.
I'm not sure about your function, because maybe I didn't understand your input data, but you could try something like the following. If you only pass in the except block, you can't tell which values failed to convert:
def parse_data(raw_data):
    data_lines = raw_data.split('\n') #Splits data on newline
    data_rows_without_header = data_lines[1:-1] #Gets rid of header and last row, which is an empty string
    parsed_data = []
    for raw_row in data_rows_without_header:
        splited_row = raw_row.split(',')
        parsed_row = []
        for value in splited_row:
            try:
                parsed_row.append(int(value))
            except ValueError:
                print("The value '{}' is not castable".format(value))
                parsed_row.append(value) # if cast fails, add the string as it is
        parsed_data.append(parsed_row)
    return parsed_data
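The same idea can be written more compactly by moving the conversion into a small helper. This is a sketch under assumptions: the helper names and the sample CSV text are invented for illustration, and the data is assumed to have one header line.

```python
def to_int_if_possible(value):
    # Return an int when the string is a valid integer, else the string unchanged.
    try:
        return int(value)
    except ValueError:
        return value

def prep_rows(data):
    lines = data.splitlines()[1:]  # drop the header line
    # Skip empty lines, split each row on commas, convert what can be converted.
    return [[to_int_if_possible(v) for v in line.split(",")] for line in lines if line]

sample = "name,age,score\nAnn,31,7\nBob,n/a,12\n"  # made-up data
print(prep_rows(sample))
```

Because splitlines does not produce a trailing empty string, there is no need to slice off the last row.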
I have this code that I am trying to decompress. First, I have compressed the code which is all working but then when I go onto decompressing it there is a ValueError.
List.append(dic[int(bob)])
ValueError: invalid literal for int() with base 10: '1,2,3,4,5,6,7,8,9,'
This is the code...
def menu():
    print("..........................................................")
    para = input("Please enter a paragraph.")
    print()
    s = para.split() # splits sentence
    another = [0] # will gradually hold all of the numbers repeated or not
    index = [] # empty index
    word_dictionary = [] # names word_dictionary variable
    for count, i in enumerate(s): # has a count and an index for enumerating the split sentence
        if s.count(i) < 2: # if it is repeated
            another.append(max(another) + 1) # adds the other count to another
        else: # if is has not been repeated
            another.append(s.index(i) +1) # adds the index (i) to another
    new = " " # this has been added because other wise the numbers would be 01234567891011121341513161718192320214
    another.remove(0) # takes away the 0 so that it doesn't have a 0 at the start
    for word in s: # for every word in the list
        if word not in word_dictionary: # If it's not in word_dictionary
            word_dictionary.append(word) # adds it to the dicitonary
        else: # otherwise
            s.remove(word) # it will remove the word
    fo = open("indx.txt","w+") # opens file
    for index in another: # for each i in another
        index= str(index) # it will turn it into a string
        fo.write(index) # adds the index to the file
        fo.write(new) # adds a space
    fo.close() # closes file
    fo=open("words.txt", "w+") # names a file sentence
    for word in word_dictionary:
        fo.write(str(word )) # adds sentence to the file
        fo.write(new)
    fo.close() # closes file
menu()
index=open("indx.txt","r+").read()
dic=open("words.txt","r+").read()
index= index.split()
dic = dic.split()
Num=0
List=[]
while Num != len(index):
    bob=index[Num]
    List.append(dic[int(bob)])
    Num+=1
print (List)
The problem is down on line 50. with ' List.append(dic[int(bob)])'.
Is there a way to get the Error message to stop popping up and for the code to output the sentence as inputted above?
Latest error message has occurred:
List.append(dic[int(bob)])
IndexError: list index out of range
When I run the code, I input "This is a sentence. This is another sentence, with commas."
The issue is that index= index.split() splits on whitespace by default, and, as the exception shows, your numbers are separated by commas.
Without seeing index.txt I can't be certain if it will fix all of your indexes, but for the issue in OP, you can fix it by specifying what to split on, namely a comma:
index= index.split(',')
To your second issue, List.append(dic[int(bob)]) IndexError: list index out of range comes down to an off-by-one error:
Your indexes start at 1, not 0, so you are off by one when reconstituting your array.
This can be fixed with:
List.append(dic[int(bob) - 1])
Additionally you're doing a lot more work than you need to. This:
fo = open("indx.txt","w+") # opens file
for index in another: # for each i in another
    index= str(index) # it will turn it into a string
    fo.write(index) # adds the index to the file
    fo.write(new) # adds a space
fo.close() # closes file
is equivalent to:
with open("indx.txt","w") as fo:
    for index in another:
        fo.write(str(index) + new)
and this:
Num=0
List=[]
while Num != len(index):
    bob=index[Num]
    List.append(dic[int(bob)])
    Num+=1
is equivalent to
List = []
for item in index:
    List.append(dic[int(item)])
Also, take a moment to review PEP-8 and try to follow those standards. Your code is very difficult to read because it doesn't follow them. I fixed the formatting on your comments so StackOverflow's parser could parse your code, but most of them only add clutter.
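To see the off-by-one handling in one place, here is a minimal sketch of the whole round trip kept in memory, without the files. It is not the asker's exact algorithm: the function names are made up, and each word is stored as its 1-based position in the list of unique words, which is an assumption about what the compression is meant to do.

```python
def compress(sentence):
    unique_words = []  # each distinct word, in order of first appearance
    positions = []     # 1-based position of every word in unique_words
    for word in sentence.split():
        if word not in unique_words:
            unique_words.append(word)
        positions.append(unique_words.index(word) + 1)
    return unique_words, positions

def decompress(unique_words, positions):
    # Subtract 1 because the stored positions start at 1, not 0.
    return " ".join(unique_words[p - 1] for p in positions)

text = "This is a sentence. This is another sentence, with commas."
words, positions = compress(text)
restored = decompress(words, positions)
print(positions)
print(restored == text)
```

Rejoining with single spaces only reproduces the input exactly when the original words were separated by single spaces.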
I have a file in which I am trying to replace parts of a line with another word.
A line looks like bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212
I need to delete everything but bob123#bobscarshop.com, but I need to match 23rh32o3hro2rh2 with 23rh32o3hro2rh2:poniacvibe, from a different text file, and place poniacvibe in front of bob123#bobscarshop.com
so it would look like this: bob123#bobscarshop.com:poniacvibe
I've had a hard time trying to go about doing this, but I think I would have to split the bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212 with data.split(":"), but some of the lines have a (:) in a spot where I don't want the line to be split, if that makes any sense...
If anyone could help I would really appreciate it.
ok, it looks to me like you are using a colon : to separate your strings.
in this case you can use .split(":") to break your strings into their component substrings
eg:
firststring = "bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212"
print(firststring.split(":"))
would give:
['bobkeiser', 'bob123#bobscarshop.com', '0.0.0.0.0', '23rh32o3hro2rh2', '234212']
and assuming your substrings will always be in the same order, and the same number of substrings in the main string you could then do:
firststring = "bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212"
firstdata = firststring.split(":")
secondstring = "23rh32o3hro2rh2:poniacvibe"
seconddata = secondstring.split(":")
if firstdata[3] == seconddata[0]:
    outputdata = firstdata
    outputdata.insert(1,seconddata[1])
    outputstring = ""
    for item in outputdata:
        if outputstring == "":
            outputstring = item
        else:
            outputstring = outputstring + ":" + item
what this does is:
extract the bits of the strings into lists
see if the "23rh32o3hro2rh2" string can be found in the second list
find the corresponding part of the second list
create a list to contain the output data and put the first list into it
insert the "poniacvibe" string before "bob123#bobscarshop.com"
stitch the outputdata list back into a string using the colon as the separator
the reason your strings need to be the same length is because the index is being used to find the relevant strings rather than trying to use some form of string type matching (which gets much more complex)
if you can keep your data in this form it gets much simpler.
to protect against malformed data (lists too short) you can explicitly test for them before you start using len(list) to see how many elements are in it.
or you could let it run and catch the exception, however in this case you could end up with unintended results, as it may try to match the wrong elements from the list.
hope this helps
James
EDIT:
ok so if you are trying to match up a long list of strings from files you would probably want something along the lines of:
firstfile = open("firstfile.txt", mode = "r")
secondfile= open("secondfile.txt",mode = "r")
first_raw_data = firstfile.readlines()
firstfile.close()
second_raw_data = secondfile.readlines()
secondfile.close()
first_data = []
for item in first_raw_data:
    first_data.append(item.replace("\n","").split(":"))
second_data = []
for item in second_raw_data:
    second_data.append(item.replace("\n","").split(":"))
output_strings = []
for item in first_data:
    searchstring = item[3]
    for entry in second_data:
        if searchstring == entry[0]:
            output_data = item
            output_string = ""
            output_data.insert(1,entry[1])
            for data in output_data:
                if output_string == "":
                    output_string = data
                else:
                    output_string = output_string + ":" + data
            output_strings.append(output_string)
            break
for entry in output_strings:
    print(entry)
this should achieve what you're after and, as a proof of concept, will print the resulting list of strings for you.
if you have any questions feel free to ask.
James
Second edit:
to make this output the results into a file change the last two lines to:
outputfile = open("outputfile.txt", mode = "w")
for entry in output_strings:
    outputfile.write(entry+"\n")
outputfile.close()
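If the goal is only the email:word form from the question, a dictionary lookup may be simpler than the nested loops. The following is a sketch under stated assumptions, not a drop-in replacement: the function names are made up, the email is assumed to be the second field, and the matching key is assumed to be the second-to-last field. Counting the key from the end of the line means an extra colon in one of the middle fields does not shift it.

```python
def build_lookup(second_lines):
    # Map each key from the second file to the word that follows it.
    lookup = {}
    for line in second_lines:
        key, _, word = line.strip().partition(":")
        if key:
            lookup[key] = word
    return lookup

def merge_lines(first_lines, lookup):
    merged = []
    for line in first_lines:
        fields = line.strip().split(":")
        if len(fields) < 5:
            continue  # skip malformed lines that are too short
        email = fields[1]  # assumption: email is the second field
        key = fields[-2]   # assumption: key is the second-to-last field
        if key in lookup:
            merged.append(email + ":" + lookup[key])
    return merged

first_lines = ["bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212\n"]
second_lines = ["23rh32o3hro2rh2:poniacvibe\n"]
print(merge_lines(first_lines, build_lookup(second_lines)))
```

In practice first_lines and second_lines would come from readlines() on the two files.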
This is my code:
import re
with open("C:\\Corpora\\record-13.txt") as f:
    concepts = f.readlines()
j = 0
for line in concepts:
    PATTERN = re.compile(r'''((?:[^ "]|"[^"]*")+)''')
    TokCurrLineCon = PATTERN.split(line)[1::2]
    temp = TokCurrLineCon[1].split(':')
    StartLineNum[j] = temp[0]
    StartOffset[j] = temp[1]
    temp = TokCurrLineCon[2].split('||')
    EndOfCon[j] = temp[0]
    TypeOfCon[j] = temp[1]
    temp = EndOfCon[j].split(':')
    EndLineNum[j] = temp[0]
    EndOffset[j] = temp[1]
    temp = TypeOfCon[j].split('"')
    TypeOfCon[j] = temp[1]
    j +=1
I need 5 lists at the end (StartLineNum, StartOffset, EndLineNum, EndOffset, TypeOfCon), but when I run it I get this error:
StartLineNum[j] = temp[0]
TypeError: 'str' object does not support item assignment
Any idea how to fix it?
The error message is telling you that StartLineNum is a str, so StartLineNum[j] = <anything> is illegal.
From your description, it sounds like you expected StartLineNum to be a list. So presumably the problem is that you constructed a string instead of a list somewhere in the code above. Since we can't see that code, we can't fix it, beyond saying that you should create a list if you want a list.
However, I suspect there's another problem in your code. For this to work, StartLineNum would have to be not just a list, but a list that's already got as many members as the file has lines. But you can't know how many that is until you've read the whole file in. A better solution would be to use the append method on lists. (Then you don't need the j variable, either.) For example:
StartLineNum = []
for line in concepts:
    # blah blah
    StartLineNum.append(temp[0])
    # etc.
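Applied to all five lists, the append approach could look like the sketch below. The sample line is hypothetical (the real file format is not shown in the question, so it is only guessed from the splits in the code), the function and variable names are made up, and findall is used in place of split(...)[1::2] to collect the tokens.

```python
import re

# Same pattern as in the question: tokens separated by spaces, with quoted parts kept whole.
TOKEN_PATTERN = re.compile(r'''((?:[^ "]|"[^"]*")+)''')

def parse_concepts(lines):
    start_line_nums, start_offsets = [], []
    end_line_nums, end_offsets = [], []
    concept_types = []
    for line in lines:
        tokens = TOKEN_PATTERN.findall(line.strip())
        start_line, start_offset = tokens[1].split(":")
        end_part, type_part = tokens[2].split("||")
        end_line, end_offset = end_part.split(":")
        start_line_nums.append(start_line)
        start_offsets.append(start_offset)
        end_line_nums.append(end_line)
        end_offsets.append(end_offset)
        concept_types.append(type_part.split('"')[1])  # text between the quotes
    return start_line_nums, start_offsets, end_line_nums, end_offsets, concept_types

sample = ['c="chest pain" 12:3 12:4||t="problem"']  # hypothetical line format
result = parse_concepts(sample)
print(result)
```

Each list grows by one entry per line, so no counter variable and no pre-sized lists are needed.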