This is my code:
import re
with open("C:\\Corpora\\record-13.txt") as f:
    concepts = f.readlines()
j = 0
for line in concepts:
    PATTERN = re.compile(r'''((?:[^ "]|"[^"]*")+)''')
    TokCurrLineCon = PATTERN.split(line)[1::2]
    temp = TokCurrLineCon[1].split(':')
    StartLineNum[j] = temp[0]
    StartOffset[j] = temp[1]
    temp = TokCurrLineCon[2].split('||')
    EndOfCon[j] = temp[0]
    TypeOfCon[j] = temp[1]
    temp = EndOfCon[j].split(':')
    EndLineNum[j] = temp[0]
    EndOffset[j] = temp[1]
    temp = TypeOfCon[j].split('"')
    TypeOfCon[j] = temp[1]
    j +=1
I need 5 lists at the end (StartLineNum, StartOffset, EndLineNum, EndOffset, TypeOfCon), but when I run it I get this error:
StartLineNum[j] = temp[0]
TypeError: 'str' object does not support item assignment
Any idea how to fix it?
The error message is telling you that StartLineNum is a str, so StartLineNum[j] = <anything> is illegal.
From your description, it sounds like you expected StartLineNum to be a list. So presumably the problem is that you constructed a string instead of a list somewhere in the code above. Since we can't see that code, we can't fix it, beyond saying that you should create a list if you want a list.
However, I suspect there's another problem in your code. For this to work, StartLineNum would have to be not just a list, but a list that's already got as many members as the file has lines. But you can't know how many that is until you've read the whole file in. A better solution would be to use the append method on lists. (Then you don't need the j variable, either.) For example:
StartLineNum = []
for line in concepts:
    # blah blah
    StartLineNum.append(temp[0])
    # etc.
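A fuller sketch of the append approach is shown here. The two sample lines are hypothetical: the real file is not shown in the question, so the layout is only inferred from how the question's code splits each line. The snake_case list names and the use of findall in place of split(...)[1::2] are also choices made for this example.

```python
import re

# Tokens are runs of non-space characters; quoted substrings may contain spaces.
PATTERN = re.compile(r'(?:[^ "]|"[^"]*")+')

# Hypothetical sample lines; the real file's layout is assumed, not known.
concepts = [
    'c="chest pain" 12:3 12:4||t="problem"\n',
    'c="aspirin" 20:0 20:0||t="treatment"\n',
]

# Start with empty lists and let them grow one entry per line.
start_line_num, start_offset = [], []
end_line_num, end_offset, type_of_con = [], [], []

for line in concepts:
    tokens = PATTERN.findall(line.strip())
    start = tokens[1].split(':')                # e.g. ['12', '3']
    end_part, type_part = tokens[2].split('||')
    end = end_part.split(':')                   # e.g. ['12', '4']
    start_line_num.append(start[0])
    start_offset.append(start[1])
    end_line_num.append(end[0])
    end_offset.append(end[1])
    type_of_con.append(type_part.split('"')[1])

print(start_line_num, start_offset, end_line_num, end_offset, type_of_con)
```

Because every list is extended with append, no counter variable is needed and the lists never have to be pre-sized.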
Let's say I have a ton of HTML with no newlines. I want to get each element into a list.
input = "<head><title>Example Title</title></head>"
a_list = ["<head>", "<title>Example Title</title>", "</head>"]
Something like that, splitting between each ><.
But in Python, I don't know of a way to do that. I can only split at that string, which removes it from the output. I want to keep it, and split between the two angle brackets.
How can this be done?
Edit: Preferably, this would be done without adding the characters back in to the ends of each list item.
# initial input
a = "<head><title>Example Title</title></head>"
# split list
b = a.split('><')
# remove extra character from first and last elements
# because the split only removes >< pairs.
b[0] = b[0][1:]
b[-1] = b[-1][:-1]
# initialize new list
a_list = []
# fill new list with formatted elements
for i in range(len(b)):
    a_list.append('<{}>'.format(b[i]))
This produces the list above in Python 2.7.2, and it should work in Python 3 as well.
You can try this:
import re
a = "<head><title>Example Title</title></head>"
data = re.split("><", a)
new_data = [data[0]+">"]+["<" + i+">" for i in data[1:-1]] + ["<"+data[-1]]
Output:
['<head>', '<title>Example Title</title>', '</head>']
The shortest approach using re.findall() function on extended example:
import re
# extended html string
s = "<head><title>Example Title</title></head><body>hello, <b>Python</b></body>"
result = re.findall(r'(<[^>]+>[^<>]+</[^>]+>|<[^>]+>)', s)
print(result)
The output:
['<head>', '<title>Example Title</title>', '</head>', '<body>', '<b>Python</b>', '</body>']
Based on the answers by other people, I made this.
It isn't as clean as I had wanted, but it seems to work. I had originally wanted to not re-add the characters after split.
Here, I got rid of one extra argument by combining the two characters into a string. Anyways,
def split_between(string, chars):
    # Compare lengths with !=; "is not" tests object identity, not equality.
    if len(chars) != 2:
        raise IndexError("Argument chars must contain two characters.")
    result_list = [chars[1] + line + chars[0] for line in string.split(chars)]
    result_list[0] = result_list[0][1:]
    result_list[-1] = result_list[-1][:-1]
    return result_list
Credit goes to @cforeman and @Ajax1234.
Or even simpler, this:
input = "<head><title>Example Title</title></head>"
print(['<'+elem if elem[0]!='<' else elem for elem in [elem+'>' if elem[-1]!='>' else elem for elem in input.split('><') ]])
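One more option, offered as a sketch rather than a definitive answer: a zero-width split, which keeps both angle brackets without re-adding anything. It assumes Python 3.7 or later, where re.split accepts patterns that match the empty string. The variable names are made up for this example.

```python
import re

html = "<head><title>Example Title</title></head>"

# (?<=>) requires a ">" just before the split point and (?=<) a "<" just
# after it, so the split happens between the brackets and consumes neither.
parts = re.split(r"(?<=>)(?=<)", html)
print(parts)
```

Like the plain split('><') answers, this splits at every >< pair, so adjacent nested tags such as <b><i> would also be separated.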
I'm trying to remove a lot of stuff from a text file to rewrite it.
The text file has several hundred items, each consisting of 6 lines.
I got my code working to the point where it puts all lines in an array, identifies the only 2 important lines in every item, and deletes the whitespace, but any further stripping gives me the following error:
'list' object has no attribute 'strip'
Here is my code:
x = 0
y = 0
names = []
colors = []
array = []
with open("AA_Ivory.txt", "r") as ins:
    for line in ins:
        array.append(line)
def Function (currentElement, lineInSkinElement):
    name = ""
    color = ""
    string = array[currentElement]
    if lineInSkinElement == 1:
        string = [string.strip()]
        # string = [string.strip()]
        # name = [str.strip("\n")]
        # name = [str.strip(";")]
        # name = [str.strip(" ")]
        # name = [str.strip("=")]
        names.append(name)
        return name
    # if lineInSkinElement == 2:
    #     color = [str.strip("\t")]
    #     color = [str.strip("\n")]
    #     color = [str.strip(";")]
    #     color = [str.strip(" ")]
    #     color = [str.strip("=")]
    #     colors.append(color)
    #     return color
    print "I got called %s times" % currentElement
    print lineInSkinElement
    print currentElement
for val in array:
    Function(x, y)
    x = x +1
    y = x % 6
#print names
#print colors
In the if statement for the names, deleting the first # will give me the error.
I tried converting the list item to string, but then I get extra [] around the string.
The if statement for color can be ignored, I know it's faulty and trying to fix this is what got me to my current issue.
but then I get extra [] around the string
You can loop through this to get around the listed string. For example:
for item in string:  # string is a one-element list at this point
    item = item.strip("\n")
    item = item.strip(";")
    item = item.strip(" ")
    item = item.strip("=")
    name = item
names.append(name)
return name
This will get you to the string within the list and you can append the stripped string.
If this isn't what you were looking for, post some of the data you're working with to clarify.
Alright, I found the solution. It was a rather dumb mistake of mine. The error occurred because the [] around the strip call turned the result into a list. Removing them fixed it. Feeling relieved now, a bit stupid, but relieved.
You can also do that in one line using the following code.
item = item.strip("\n").strip("=").strip(";").strip()
The last strip(), called with no arguments, removes leading and trailing whitespace.
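A related sketch: strip also accepts a set of characters, so a single call can remove any mix of them from both ends. The sample line below is hypothetical, since the question does not show the file's contents.

```python
# Hypothetical raw line; the real file's contents are not shown in the question.
raw = " name = ;\n"

# Chained calls run in order, so characters hidden behind others can survive:
# here "=" is still behind ";" when strip("=") runs.
chained = raw.strip("\n").strip("=").strip(";").strip()

# One call with every unwanted character strips them in any combination.
cleaned = raw.strip(" \t\n;=")

print(repr(chained))
print(repr(cleaned))
```

With the chained version the order of the calls matters, which is why the trailing "=" is left behind in this sample.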
I have two binary data files and I want to replace the contents of part of the second binary data file
This is the sample code I have so far
Binary_file1 = open("File1.yuv","rb")
Binary_file2 = open("File2.yuv","rb")
data1 = Binary_file1.read()
data2 = Binary_file2.read()
bytes = iter(data1)
for i in range(4, 10):
    data2[i] = next(bytes)
It fails at the line where I assign next(bytes) to data2[i], with an error saying “'str' object does not support item assignment”.
What I don't understand is why this is a string object, and how I can resolve this error. Any help would be appreciated.
Please note that the binary files here are huge, and I would like to avoid creating duplicate files because I always run into memory issues.
You opened the file and read it, so data2 is a string (in Python 2, read() returns a str even for a file opened in "rb" mode). Strings do not support item assignment.
Instead, you could do:
data2 = data2[:i] + next(bytes) + data2[i + 1:]
Strings cannot be changed inplace (i.e. they are immutable). Try this:
a = 'abcde'
a[2] = 'F'
You will get an error. But, this will work.
a = a.replace(a[2], 'F')
You might be better off building a new string, then slicing it into your data2.
newstring = ''
for i in range(4, 10):
    newstring += next(bytes)
data2 = data2.replace(data2[4:10], newstring)
Of course, the problem here is that data2[4:10] may not be unique within data2, in which case you will have multiple replacements. So, the following may be even better:
data2 = data2[:4] + newstring + data2[10:]
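Since the files are huge, another possibility is to skip reading the second file into memory at all and overwrite the bytes on disk. This is a minimal sketch under assumptions: it takes the first six bytes of File1.yuv as the replacement (as the question's loop does) and writes them over bytes 4 to 9 of File2.yuv. The temporary directory and the file contents are stand-ins created only so the example runs on its own.

```python
import os
import tempfile

# Stand-in files; the real .yuv files are assumed to exist already.
tmp_dir = tempfile.mkdtemp()
path1 = os.path.join(tmp_dir, "File1.yuv")
path2 = os.path.join(tmp_dir, "File2.yuv")
with open(path1, "wb") as f:
    f.write(b"ABCDEFGHIJ")
with open(path2, "wb") as f:
    f.write(b"0123456789abcdef")

# Read only the six bytes that are needed from the first file.
with open(path1, "rb") as src:
    patch = src.read(6)

# "r+b" opens for reading and writing without truncating the file.
with open(path2, "r+b") as dst:
    dst.seek(4)       # move to byte offset 4
    dst.write(patch)  # overwrite bytes 4..9 in place

with open(path2, "rb") as f:
    result = f.read()
print(result)
```

Note that this modifies File2.yuv itself, so it only fits if changing the file on disk is acceptable. If the data has to stay in memory, wrapping it in a bytearray gives a mutable sequence that does support item and slice assignment.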
I wrote a script for system automation, but I'm getting the error described in the title. My code below is the relevant portion of the script. What is the problem?
import csv
import os
DIR = "C:/Users/Administrator/Desktop/key_list.csv"
def Customer_List(csv):
    customer = open(DIR)
    for line in customer:
        row = []
        (row['MEM_ID'],
         row['MEM_SQ'],
         row['X_AUTH_USER'],
         row['X_AUTH_KEY'],
         row['X_STORAGE_URL'],
         row['ACCESSKEY'],
         row['ACCESSKEYID'],
         row['ACCESSKEY1'],
         row['ACCESSKEYID1'],
         row['ACCESSKEY2'],
         row['ACCESSKEYID2'])=line.split()
        if csv == row['MEM_ID']:
            customer.close()
            return(row)
        else:
            print ("Not search for ID")
            return([])
id_input = input("Please input the Customer ID(Email): ")
result = Customer_List(id_input)
if result:
    print ("iD: " + id['MEM_ID']
For the line
line.split()
What are you splitting on? Looks like a CSV, so try
line.split(',')
Example:
"one,two,three".split() # returns one element ["one,two,three"]
"one,two,three".split(',') # returns three elements ["one", "two", "three"]
As @TigerhawkT3 mentions, it would be better to use the csv module. Incredibly quick and easy method available here.
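A minimal sketch of the csv module approach, assuming the file is comma-separated with no header row. The field list is shortened to three of the question's eleven columns, and the function name find_customer and the sample rows are invented for illustration.

```python
import csv
import io

# Shortened for the example; the question's file has eleven columns.
FIELDS = ["MEM_ID", "MEM_SQ", "X_AUTH_USER"]

def find_customer(lines, mem_id):
    """Return the first row whose MEM_ID matches, or None if there is none."""
    reader = csv.DictReader(lines, fieldnames=FIELDS)
    for row in reader:
        if row["MEM_ID"] == mem_id:
            return row
    return None

# Hypothetical data standing in for key_list.csv.
sample = io.StringIO("a@example.com,1,user_a\nb@example.com,2,user_b\n")
result = find_customer(sample, "b@example.com")
print(result["MEM_SQ"] if result else "ID not found")
```

DictReader builds a dictionary per row, so there is no tuple unpacking that can fail on a short line, and the "not found" case is only reported after every row has been checked.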
The error message is fairly self-explanatory
(a,b,c,d,e) = line.split()
expects line.split() to yield 5 elements, but in your case, it is only yielding 1 element. This could be because the data is not in the format you expect, a rogue malformed line, or maybe an empty line - there's no way to know.
To see what line is causing the issue, you could add some debug statements like this:
if len(line.split()) != 11:
    print(line)
As Martin suggests, you might also be splitting on the wrong delimiter.
Looks like something is wrong with your data; it isn't in the format you are expecting. It could be a newline character or a blank space in the data that is tripping up your code.
I have a file in which I am trying to replace parts of a line with another word.
A line looks like bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212
I need to delete everything but bob123#bobscarshop.com, but I also need to match 23rh32o3hro2rh2 with 23rh32o3hro2rh2:poniacvibe from a different text file and place poniacvibe in front of bob123#bobscarshop.com
so it would look like this: bob123#bobscarshop.com:poniacvibe
I've had a hard time trying to go about doing this. I think I would have to split bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212 with data.split(":"), but some of the lines have a colon (:) in a spot where I don't want the line to be split, if that makes any sense...
If anyone could help I would really appreciate it.
ok, it looks to me like you are using a colon : to separate your strings.
in this case you can use .split(":") to break your strings into their component substrings
eg:
firststring = "bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212"
print(firststring.split(":"))
would give:
['bobkeiser', 'bob123#bobscarshop.com', '0.0.0.0.0', '23rh32o3hro2rh2', '234212']
and assuming your substrings will always be in the same order, and the same number of substrings in the main string you could then do:
firststring = "bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212"
firstdata = firststring.split(":")
secondstring = "23rh32o3hro2rh2:poniacvibe"
seconddata = secondstring.split(":")
if firstdata[3] == seconddata[0]:
    outputdata = firstdata
    outputdata.insert(1,seconddata[1])
    outputstring = ""
    for item in outputdata:
        if outputstring == "":
            outputstring = item
        else:
            outputstring = outputstring + ":" + item
what this does is:
extract the bits of the strings into lists
see if the "23rh32o3hro2rh2" string can be found in the second list
find the corresponding part of the second list
create a list to contain the output data and put the first list into it
insert the "poniacvibe" string before "bob123#bobscarshop.com"
stitch the outputdata list back into a string using the colon as the separator
the reason your strings need to be the same length is because the index is being used to find the relevant strings rather than trying to use some form of string type matching (which gets much more complex)
if you can keep your data in this form it gets much simpler.
to protect against malformed data (lists too short) you can explicitly test for them before you start using len(list) to see how many elements are in it.
or you could let it run and catch the exception, however in this case you could end up with unintended results, as it may try to match the wrong elements from the list.
hope this helps
James
EDIT:
ok so if you are trying to match up a long list of strings from files you would probably want something along the lines of:
firstfile = open("firstfile.txt", mode = "r")
secondfile= open("secondfile.txt",mode = "r")
first_raw_data = firstfile.readlines()
firstfile.close()
second_raw_data = secondfile.readlines()
secondfile.close()
first_data = []
for item in first_raw_data:
    first_data.append(item.replace("\n","").split(":"))
second_data = []
for item in second_raw_data:
    second_data.append(item.replace("\n","").split(":"))
output_strings = []
for item in first_data:
    searchstring = item[3]
    for entry in second_data:
        if searchstring == entry[0]:
            output_data = item
            output_string = ""
            output_data.insert(1,entry[1])
            for data in output_data:
                if output_string == "":
                    output_string = data
                else:
                    output_string = output_string + ":" + data
            output_strings.append(output_string)
            break
for entry in output_strings:
    print(entry)
this should achieve what you're after and, as a proof of concept, will print the resulting list of strings for you.
if you have any questions feel free to ask.
James
Second edit:
to make this output the results into a file change the last two lines to:
outputfile = open("outputfile.txt", mode = "w")
for entry in output_strings:
    outputfile.write(entry+"\n")
outputfile.close()
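For long files, the nested loop can be swapped for a dictionary lookup. This is a sketch of that variation, not a replacement for the code above: the function name merge_lines is made up, the inputs are lists of lines as returned by readlines(), and it keeps the same output layout (the new word inserted at index 1). Lines with fewer than four fields are skipped, which is one way to do the length check mentioned earlier.

```python
def merge_lines(first_lines, second_lines):
    # Map each key (e.g. "23rh32o3hro2rh2") to the word that follows it.
    lookup = {}
    for line in second_lines:
        key, sep, value = line.strip().partition(":")
        if sep:  # ignore lines that contain no colon at all
            lookup[key] = value

    merged = []
    for line in first_lines:
        parts = line.strip().split(":")
        if len(parts) < 4:
            continue  # malformed line: too few fields to hold a key
        if parts[3] in lookup:
            parts.insert(1, lookup[parts[3]])
            merged.append(":".join(parts))
    return merged

first = ["bobkeiser:bob123#bobscarshop.com:0.0.0.0.0:23rh32o3hro2rh2:234212\n"]
second = ["23rh32o3hro2rh2:poniacvibe\n"]
merged = merge_lines(first, second)
print(merged)
```

Here ":".join(parts) does the same stitching as the loop that builds output_string, and the dictionary makes each lookup a single step instead of a scan through the whole second file.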