I have the above-mentioned error in s1="some very long string............"
Does anyone know what I am doing wrong?
You are not putting a " before the end of the line.
Use """ if you want to do this:
""" a very long string ......
....that can span multiple lines
"""
I had this problem - I eventually worked out that the reason was that I'd included \ characters in the string. If you have any of these, "escape" them with \\ and it should work fine.
(Assuming you don't have/want line breaks in your string...)
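For what it's worth, a minimal sketch of the failure and the fix (the string here is just filler):
# Broken: the trailing \ escapes the closing quote, so Python never sees the end of the string
#   s1 = "some very long string ending in a backslash\"
# Fixed: escape the backslash itself
s1 = "some very long string ending in a backslash\\"
print(s1)  # ...ending in a backslash\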
How long is this string really?
I suspect there is a limit to how long a line read from a file or from the command line can be, and because the end of the line gets chopped off, the parser sees something like s1="some very long string.......... (without a closing ") and thus throws a parsing error.
You can split a long string over multiple lines by escaping the line breaks in your source, like this:
s1="some very long string.....\
...\
...."
In my situation, I had \r\n in my single-quoted dictionary strings. I replaced all instances of \r with \\r and \n with \\n and it fixed my issue, properly returning escaped line breaks in the eval'ed dict.
ast.literal_eval(my_str.replace('\r','\\r').replace('\n','\\n'))
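For instance, a fuller sketch of that round trip, with a made-up dict string (my_str below is purely illustrative):
import ast

# A dict repr captured with real CR/LF characters inside the quotes;
# feeding it straight to literal_eval raises the EOL SyntaxError.
my_str = "{'note': 'line one\r\nline two'}"

# Turn the raw CR/LF characters back into the escape sequences \r and \n ...
fixed = my_str.replace('\r', '\\r').replace('\n', '\\n')

# ... so literal_eval can parse the dict again.
result = ast.literal_eval(fixed)
print(result)  # {'note': 'line one\r\nline two'}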
I faced a similar problem. I had a string which contained a path to a folder in Windows, e.g. C:\Users\ The problem is that \ is an escape character, so in order to use it in strings you need to add one more \.
Incorrect: C:\Users\
Correct: C:\\Users\\
You can try this:
s = r'long\annoying\path'
I too had this problem. Although there are answers here already, I want to add an important point: after the \ there should not be any empty spaces. Be aware of it.
I also had this exact error message. For me the problem was fixed by adding a " \".
It turns out that my long string, broken into about eight lines with " \" at the very end of each, was missing a " \" on one line.
Python IDLE didn't specify a line number that this error was on, but it red-highlighted a totally correct variable assignment statement, throwing me off. The actual misshapen string statement (multiple lines long with " \") was adjacent to the statement being highlighted. Maybe this will help someone else.
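To make the pattern concrete, here is a minimal sketch of the " \" style described above (the text is just filler):
total = "the first chunk of a long string " \
        "the second chunk " \
        "the third chunk"
print(total)  # one single-line string, thanks to implicit literal concatenation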
In my case, I use Windows so I have to use double quotes instead of single.
C:\Users\Dr. Printer>python -mtimeit -s"a = 0"
100000000 loops, best of 3: 0.011 usec per loop
In my case with Mac OS X, I had the following statement:
model.export_srcpkg(platform, toolchain, 'mymodel_pkg.zip', 'mymodel.dylib’)
I was getting the error:
File "<stdin>", line 1
model.export_srcpkg(platform, toolchain, 'mymodel_pkg.zip', 'mymodel.dylib’)
^
SyntaxError: EOL while scanning string literal
After I changed it to:
model.export_srcpkg(platform, toolchain, "mymodel_pkg.zip", "mymodel.dylib")
It worked. The original line ended with a typographic quote (’) instead of a plain ASCII quote ('), so Python never saw the string literal being closed.
In my case, I forgot the closing quote (' or ") at the end of the string, e.g. 'ABC' or "ABC".
I was getting this error in a PostgreSQL function. I had a long SQL statement which I broke into multiple lines with \ for better readability. However, that was the problem. I removed them all and put the statement on one line to fix the issue. I was using pgAdmin III.
Your variable (s1) spans multiple lines. In order to do this (i.e. you want your string to span multiple lines), you have to use triple quotes (""").
s1="""some very long
string............"""
In this case, three single quotes or three double quotes will both work!
For example:
"""Parameters:
...Type something.....
.....finishing statement"""
OR
'''Parameters:
...Type something.....
.....finishing statement'''
I had faced the same problem while accessing a hard drive directory.
Then I solved it in this way:
import os
os.startfile(r"D:\folder_name\file_name")  # running a shortcut; the raw string stops \f being read as an escape
os.startfile("F:")  # accessing a directory
All code below was tested with Python 3.8.3
Simplest -- just use triple quotes.
Either single:
long_string = '''some
very
long
string
............'''
or double:
long_string = """some
very
long
string
............"""
Note: triple-quoted strings retain indentation. This means that
long_string = """some
very
long
string
............"""
and
long_string = """some
very
long
string
............"""
or even just
long_string = """
some
very
long
string
............"""
are not the same.
There is a textwrap.dedent function in the standard library to deal with this, though working with it is outside the scope of this question.
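For reference, a minimal sketch of textwrap.dedent (purely illustrative, not part of the original point):
import textwrap

long_string = textwrap.dedent("""\
    some
    very
    long
    string
    ............""")
# dedent removes the common leading whitespace from every line,
# so long_string carries no indentation even though the source code is indented.
print(long_string)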
You can, as well, use \n inside a string that resides on a single line:
long_string = "some \nvery \nlong \nstring \n............"
Also, if you don't need any line feeds (i.e. newlines) in your string, you can use \ inside a regular string:
long_string = "some \
very \
long \
string \
............"
Most of the previous answers are correct. My answer is very similar to aaronasterling's: you can also use three single quotes:
s1='''some very long string............'''
So I'm trying to parse a bunch of citations from a text file using the re module in Python 3.4 (on, if it matters, a Mac running Mavericks). Here's some minimal code. Note that there are two commented lines: they represent two alternative searches. (Obviously, the little one, r'Rawls', is the one that works.)
import re

def makeRefList(reffile):
    print(reffile)
    # namepattern = r'(^[A-Z1][A-Za-z1]*-?[A-Za-z1]*),.*( \(?\d\d\d\d[a-z]?[.)])'
    # namepattern = r'Rawls'
    refsTuplesList = re.findall(namepattern, reffile, re.MULTILINE)
    print(refsTuplesList)
The string in question is ugly, and so I stuck it in a gist: https://gist.github.com/paultopia/6c48c398a42d4834f2ae
As noted, the search string r'Rawls' produces expected output ['Rawls', 'Rawls']. However, the other search string just produces an empty list.
I've confirmed this regex (partially) works using the regex101 tester. Confirmation here: https://regex101.com/r/kP4nO0/1 -- it matches what I expect it to match. Since it works in the tester, it should work in the code, right?
(n.b. I copied the text from terminal output from the first print command, then manually replaced \n characters in the string with carriage returns for regex101.)
One possible issue is that Python has added the bytes prefix (is the little b called a "flag"?) to the string. This is an artifact of my attempt to convert the text from utf-8 to ascii, and I haven't figured out how to make it go away.
Yet re clearly is able to parse strings in that form. I know this because I'm converting two text files from utf-8 to ascii, and the following code works perfectly fine on the other string, converted from the other text file, which also has a little b in front of it:
def makeCiteList(citefile):
    print(citefile)
    citepattern = r'[\s(][A-Z1][A-Za-z1]*-?[A-Za-z1]*[ ,]? \(?\d\d\d\d[a-z]?[\s.,)]'
    rawCitelist = re.findall(citepattern, citefile)
    cleanCitelist = cleanup(rawCitelist)
    finalCiteList = list(set(cleanCitelist))
    print(finalCiteList)
    return finalCiteList
The other chunk of text, which the code immediately above matches correctly: https://gist.github.com/paultopia/a12eba2752638389b2ee
The only hypothesis I can come up with is that the first, broken, regex is puking on the combination of newline characters and the string being treated as a bytes object, even though a) I know the regex is correct for newlines (confirmed by the linked regex101), and b) I know it's matching the strings (confirmed by the successful match on the other string).
If that's true, though, I don't know what to do about it.
Thus, questions:
1) Is my hypothesis right that it's the combination of newlines and b that blows up my regex? If not, what is?
2) How do I fix that?
a) replace the newlines with something in the string?
b) rewrite the regex somehow?
c) somehow get rid of that b and make it into a normal string again? (how?)
thanks!
Addition
In case this is a problem I need to fix upstream, here's the code I'm using to get the text files and convert to ascii, replacing non-ascii characters:
This function gets called on utf-8 .txt files saved by TextWrangler in Mavericks:
def makeCorpoi(citefile, reffile):
    citebox = open(citefile, 'r')
    refbox = open(reffile, 'r')
    citecorpus = citebox.read()
    refcorpus = refbox.read()
    citebox.close()
    refbox.close()
    corpoi = [str(citecorpus), str(refcorpus)]
    return corpoi
and then this function gets called on each element of the list the above function returns.
import codecs

def conv2ASCII(bigstring):
    def convHandler(error):
        return ('1FOREIGN', error.start + 1)
    codecs.register_error('foreign', convHandler)
    bigstring = bigstring.encode('ascii', 'foreign')
    stringstring = str(bigstring)
    return stringstring
Aah. I've tracked it down and answered my own question. Apparently one needs to call a decode method on the encoded bytes, rather than just wrapping them in str(). The following code produces an actual string, with newlines and everything, out the other end (though now I have to fix a bunch of other bugs before I can figure out if the final output is as expected):
def conv2ASCII(bigstring):
    def convHandler(error):
        return ('1FOREIGN', error.start + 1)
    codecs.register_error('foreign', convHandler)
    bigstring = bigstring.encode('ascii', 'foreign')
    newstring = bigstring.decode('ascii', 'foreign')
    return newstring
Apparently the str() function doesn't do the same job, for reasons that are mysterious to me. This is despite an answer here, How to make new line commands work in a .txt file opened from the internet?, which suggests that it does.
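A quick sketch of why str() on bytes isn't enough (assuming Python 3; the text is just filler):
data = "line one\nline two".encode('ascii')

print(str(data))             # b'line one\nline two' -- the repr, with a literal backslash-n
print(data.decode('ascii'))  # the real string, with an actual newline in it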
I am trying to get some data in a list of dictionaries.
The data comes from a csv file so it's all string.
The keys in the file all have double quotes, but since these are all strings anyway, I want to remove them so they look like this in the dictionary:
{'key':value}
instead of this
{'"key"':value}
I tried simply using string = string[1:-1], but this doesn't work...
Here is my code:
csvDelimiter = ","
tsvDelimiter = "\t"
dataOutput = []
dataFile = open("browser-ww-monthly-201305-201405.csv","r")
for line in dataFile:
line = line[:-1] # Removes \n after every line
data = line.split(csvDelimiter)
for i in data:
if type(i) == str: # Doesn't work, I also tried if isinstance(i, str)
# but that didn't work either.
print i
i = i[1:-1]
print i
dataOutput.append({data[0] : data[1]})
dataFile.close()
print "Data output:\n"
print dataOutput
All the prints I get from print i are good, without double quotes, but when I append data to dataOutput, the quotes are back!
Any idea how to make them disappear forever?
Strip it. For example:
data[0].strip('"')
However, when reading CSV files, the best approach is to use the built-in csv module. It takes care of this for you.
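A minimal sketch of what that might look like with the csv module (reusing the file name from the question; untested against your actual data):
import csv

dataOutput = []
with open("browser-ww-monthly-201305-201405.csv", "r") as dataFile:
    reader = csv.reader(dataFile)  # csv handles the delimiter and the surrounding quotes for you
    for row in reader:
        if len(row) >= 2:
            dataOutput.append({row[0]: row[1]})

print(dataOutput)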
As noted in the comments, when dealing with CSV files you truly ought to use Python's built-in csv module (linking to Python 2 docs since it seems that's what you're using).
Another thing to note is that when you do:
data = line.split(csvDelimiter)
every item in the returned list, if the list is not empty, will be a string. There's no sense in doing a type check in the loop (though if there were a reason to, you would use isinstance). I don't know what "didn't work" about it, though it's possible you were using unicode strings. On Python 2 you can usually use isinstance(..., basestring), where basestring is a base class for both str and unicode. On Python 3 just use str unless you know you're dealing with bytes.
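For example, a quick Python 2 sketch of that check (value is just a placeholder):
value = u"example"
print(isinstance(value, str))         # False: it's a unicode string
print(isinstance(value, basestring))  # True: basestring covers both str and unicode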
You said: "I tried simply using string = string[1:-1], but this doesn't work...". It seems to work fine for me:
In [101]: s="'word'"
In [102]: s[1:-1]
Out[102]: 'word'
I want to split a string using "},{" as the delimiter. I have tried various things but none of them work.
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
Split it into something like this:
2,1,6,4,5,1
8,1,4,9,6,6,7,0
6,1,2,3,9
2,3,5,4,3
string.split("},{") works at the Python console but if I write a Python script in which do this operation it does not work.
You need to assign the result of string.split("},{") to a new variable. For example:
string2 = string.split("},{")
I think that is the reason you think it works at the console but not in scripts. In the console it just prints out the return value, but in the script you want to make sure you use the returned value.
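A tiny sketch of that difference (placeholder string):
s = "a},{b},{c"
s.split("},{")          # at the interactive prompt this line echoes ['a', 'b', 'c'] on its own
parts = s.split("},{")  # in a script nothing is shown unless you keep the result...
print(parts)            # ...and print it explicitly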
You need to return the string back to the caller. Assigning to the string parameter doesn't change the caller's variable, so those changes are lost.
def convert2list(string):
    string = string.strip()
    # Drop the outer "{{" and "}}" before splitting.
    string = string[2:len(string) - 2].split("},{")
    # Return to caller.
    return string

# Grab return value.
converted = convert2list("{{1,2},{3,4}}")
You could do it in steps:
Split at commas to get "{...}" strings.
Remove leading and trailing curly braces.
It might not be the most Pythonic or efficient, but it's general and doable.
I was taking the input from the console in the form of arguments to the script.
So when I passed the input as {{2,4,5},{1,9,4,8,6,6,7},{1,2,3},{2,3}} it was not coming through properly in arg[1], so the split was basically splitting on an empty string.
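If that is what happened, the likely culprit is the shell brace-expanding the unquoted argument; here is a small sketch of reading it safely (myscript.py and the quoting are my assumptions, not the original setup):
# Run as:  python myscript.py '{{2,4,5},{1,9,4,8,6,6,7},{1,2,3},{2,3}}'
# Quoting the argument stops a bash-like shell from brace-expanding it into several words.
import sys

raw = sys.argv[1].strip()
print(raw[2:-2].split("},{"))  # ['2,4,5', '1,9,4,8,6,6,7', '1,2,3', '2,3']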
If I run the below code from a script file (in Python 2.7):
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
print string.split("},{")
Then the output I got is:
['2,1,6,4,5,1', '8,1,4,9,6,6,7,0', '6,1,2,3,9', '2,3,5,4,3 ']
And the below code also works fine:
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
def convert2list(string):
string=string.strip()
string=string[:len(string)].split("},{")
print string
convert2list(string)
Use this:
This will split the string, treating },{ as the delimiter, and print the list items with line breaks.
string = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"

for each in string.split('},{'):
    print each
Output:
2,1,6,4,5,1
8,1,4,9,6,6,7,0
6,1,2,3,9
2,3,5,4,3
If you want to print only the split items in the list, you can use this simple print option.
string = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"
print string.split('},{')
Output:
['2,1,6,4,5,1', '8,1,4,9,6,6,7,0', '6,1,2,3,9', '2,3,5,4,3']
Quite simply, you have to use the split() method with "},{" as the delimiter, then print the elements (because the result will be a list), like the following:
parts = string.split("},{")
for i in range(len(parts)):
    print(parts[i])