I found this strange problem when I was trying to add comments to my code. I used a triple-quoted string as a comment, but the program crashed with the following error:
IndentationError: unexpected indent
When I use # to comment the triple-quoted strings, everything works normally. Does anyone know the reason behind this error and how I could fix it?
My Code:
#This programs show that comments using # rather than """ """
def main():
    print("let's do something")
#Try using hashtag to comment this block to get code working
'''
Note following block gives you a non-sense indent error
The next step would be to consider how to get all the words from spam and ham
folder from different directory. My suggestion would be do it twice and then
concentrate two lists
Frist think about the most efficient way
For example, we might need to get rid off the duplicated words in the beginning
The thoughts of writing the algorithem to create the dictionary
Method-1:
1. To append all the list from the email all-together
2. Eliminate those duplicated words
cons: the list might become super large
I Choose method-2 to save the memory
Method-2:
1. kill the duplicated words in each string
2. Only append elements that is not already in the dictionary
Note:
1. In this case, the length of feature actually was determined by the
training cohorts, as we used the different English terms to decide feature
cons: the process time might be super long
'''
    def wtf_python(var1, var2):
        var3 = var1 + var2 + (var1*var2)
        return var3
    wtfRst1 = wtf_python(1,2)
    wtfRst2 = wtf_python(3,4)
    rstAll = { "wtfRst1" : wtfRst1,
               "wtfRst2" : wtfRst2
             }
    return(rstAll)
if __name__ == "__main__":
    mainRst = main()
    print("wtfRst1 is :\n", mainRst['wtfRst1'])
    print("wtfRst2 is :\n", mainRst['wtfRst2'])
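The fix:
Indent the triple-quoted string so that it sits inside the function definition.
The reason:
A triple-quoted string is an ordinary Python expression statement, not a comment, so it must be indented like any other statement in the function body. A string at column 0 ends the body of main(), and the indented def that follows it then raises IndentationError: unexpected indent.
Hence: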
def main():
    print("let's do something")
    #Try using hashtag to comment this block to get code working
    '''
Note following block gives you a non-sense indent error
The next step would be to consider how to get all the words from spam and ham
folder from different directory. My suggestion would be do it twice and then
concentrate two lists
Frist think about the most efficient way
For example, we might need to get rid off the duplicated words in the beginning
The thoughts of writing the algorithem to create the dictionary
Method-1:
1. To append all the list from the email all-together
2. Eliminate those duplicated words
cons: the list might become super large
I Choose method-2 to save the memory
Method-2:
1. kill the duplicated words in each string
2. Only append elements that is not already in the dictionary
Note:
1. In this case, the length of feature actually was determined by the
training cohorts, as we used the different English terms to decide feature
cons: the process time might be super long
    '''
    def wtf_python(var1, var2):
        var3 = var1 + var2 + (var1*var2)
        return var3
    wtfRst1 = wtf_python(1,2)
    wtfRst2 = wtf_python(3,4)
    rstAll = { "wtfRst1" : wtfRst1,
               "wtfRst2" : wtfRst2
             }
    return(rstAll)
if __name__ == "__main__":
    mainRst = main()
    print("wtfRst1 is :\n", mainRst['wtfRst1'])
    print("wtfRst2 is :\n", mainRst['wtfRst2'])
OUTPUT:
let's do something
wtfRst1 is :
5
wtfRst2 is :
19
You should push the indentation level of your triple-quoted string one level to the right.
Although triple-quoted strings are often used as comments, they are normal Python expressions, so they should follow the language's syntax.
Triple-quoted strings used as comments must be valid Python string statements, and statements must be properly indented.
Python sees the multi-line string and evaluates it, but since you don't assign it to a variable, the string is thrown away immediately.
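If you want to convince yourself of this without editing your file, here is a minimal sketch. The names bad_source, good_source and compiles are made up for the illustration; the only real API used is the built-in compile(). It compiles two tiny programs that differ only in where the triple-quoted string is indented.

```python
# Two tiny programs that differ only in the indentation of the string.
bad_source = (
    "def main():\n"
    "    print('start')\n"
    "'''\n"
    "a string at column 0 ends the function body\n"
    "'''\n"
    "    return 1\n"          # now this indent is unexpected
)
good_source = (
    "def main():\n"
    "    print('start')\n"
    "    '''\n"
    "    an indented string is a harmless expression statement\n"
    "    '''\n"
    "    return 1\n"
)

def compiles(source):
    """Return True if the source parses, False on an IndentationError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False

print(compiles(bad_source))
print(compiles(good_source))
```

A # comment never has this problem, because comments are dropped before indentation is checked.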
I am very new to Python and am looking for assistance to where I am going wrong with an assignment. I have attempted different ways to approach the problem but keep getting stuck at the same point(s):
Problem 1: When I am trying to create a list of words from a file, I keep making a list for the words per line rather than the entire file
Problem 2: When I try and combine the lists I keep receiving "None" for my result or Nonetype errors [which I think means I have added the None's together(?)].
The assignment is:
#8.4 Open the file romeo.txt and read it line by line. For each line, split the line into a list of words using the split() method. The program should build a list of words. For each word on each line check to see if the word is already in the list and if not append it to the list. When the program completes, sort and print the resulting words in alphabetical order. You can download the sample data at http://www.py4e.com/code3/romeo.txt
My current code which is giving me a Nonetype error is:
poem = input("enter file:")
play = open(poem)
lst= list()
for line in play:
    line=line.rstrip()
    word=line.split()
    if not word in lst:
        lst= lst.append(word)
print(lst.sort())
If someone could just talk me through where I am going wrong that will be greatly appreciated!
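Your problem is lst = lst.append(word): list.append() changes the list in place and returns None, so after the first append lst is None. The same goes for print(lst.sort()), which prints None. Note also that line.split() returns a list, so you need an inner loop to add the words one by one:
poem = input("enter file:")
with open(poem) as f:
    lines = f.read().split('\n')  # you can also use readlines()
lst = []
for line in lines:
    words = line.split()
    for word in words:
        # append each word only once, as the assignment asks
        if word not in lst:
            lst.append(word)
lst.sort()  # sorts in place and returns None, so print the list itself
print(lst)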
Problem 1: When I am trying to create a list of words from a file, I keep making a list for the words per line rather than the entire file
You are doing play = open(poem) and then for line in play:, which processes the file line by line. If you want to process the whole content at once, do:
play = open(poem)
content = play.read()
words = content.split()
Please always remember to close the file after you have used it, i.e. do
play.close()
unless you use the context-manager way (i.e. with open(poem) as f:), which closes the file for you.
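Putting the two points together, a rough sketch of the context-manager version could look like this. The function name unique_sorted_words and the throwaway demo file are my own inventions, not part of the assignment.

```python
import os
import tempfile

def unique_sorted_words(path):
    """Return the distinct whitespace-separated words in a file, sorted."""
    with open(path) as f:        # the file is closed when the block ends
        content = f.read()       # whole file at once
    return sorted(set(content.split()))

# Demo on a small temporary file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("But soft what light\nIt is the east and Juliet is the sun\n")
    demo_path = tmp.name

demo_words = unique_sorted_words(demo_path)
os.remove(demo_path)
print(demo_words)
```

Capitalised words sort before lowercase ones here, because the comparison is by character code.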
Just to help you get into Python a little more:
You can:
1. Read the whole file at once (if it fits in RAM; if it is too big, read it in reasonably sized chunks, one after another)
2. Split the data you read into words and
3. Use set() or dict() to remove duplicates
Along the way, don't forget to pay attention to upper and lower case,
if you need distinct words, not just distinct strings
This will work in Py2 and Py3 as long as you do something about input() function in Py2 or use quotes when entering the path, so:
filename = input("Filename: ")
f = open(filename)
c = f.read()
f.close()
words = set(x.lower() for x in c.split()) # This will make a set of lower case words, with no repetitions
# This equals to:
#words = set()
#for x in c.split():
# words.add(x.lower())
# set() is an unordered datatype ignoring duplicate items
# and it mimics mathematical sets and their properties (unions, intersections, ...)
# it is also fast as hell.
# Checking a list() each time for existence of the word is slow as hell
#----
# OK, you need a sorted list, so:
words = sorted(words)
# Or step-by-step:
#words = list(words)
#words.sort()
# Now words is your list
As for your errors, do not worry, they are common at the beginning in almost any object-oriented language.
Others explained them well in their comments, but so that this answer is not lacking:
Always pay attention to which functions or methods operate on the datatype in place and return None (list.sort(), list.append(), list.insert(), set.add()...) and which ones return a new version of the datatype (sorted(), str.lower()...).
If you run into a similar situation again, use help() in the interactive shell to see what exactly a function you used does.
>>> help(list.append)
>>> help(list.sort)
>>> help(str.lower)
>>> # Or any short documentation you need
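The in-place versus new-object distinction is easy to check for yourself. A small sketch, with variable names invented for the example:

```python
words = ["soft", "But", "what"]

# In-place methods change the list and return None.
append_result = words.append("light")
sort_result = words.sort()

# sorted() and str.lower() leave the original alone and return a new object.
lowered = [w.lower() for w in words]
sorted_copy = sorted(lowered)

print(append_result, sort_result)
print(words)
print(sorted_copy)
```

So writing lst = lst.append(word) replaces your list with None, which is where the NoneType errors come from.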
Python, especially Python 3.x, is strict about operations between different types, but some combinations are defined and may do something you did not expect.
E.g. you can do:
print(40*"x")
It will print out 40 'x' characters, because it will create a string of 40 characters.
But:
print([1, 2, 3]+None)
will, logically, not work (it raises a TypeError), which is what is happening somewhere in the rest of your code.
In some languages like JavaScript (terrible stuff) this will work perfectly well:
v = "abc "+123+" def";
Inserting the 123 seamlessly into the string. Which is useful, but a programming nightmare and nonsense from another viewing angle.
Also, the Py2 assumption that you can mix unicode and byte strings and have them cast automatically no longer holds in Py3.
I.e. this is a TypeError:
print(b"abc"+"def")
because b"abc" is bytes() and "def" (or u"def") is str() in Py3 (what was unicode() in Py2).
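In Py3 you have to convert explicitly in one direction or the other. A short sketch, assuming ASCII is the right encoding for the data, which is only an assumption for this example:

```python
raw = b"abc"
text = "def"

# Mixing the two types directly fails in Python 3.
try:
    raw + text
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Convert explicitly, choosing which type you want to end up with.
as_text = raw.decode("ascii") + text     # bytes -> str
as_bytes = raw + text.encode("ascii")    # str -> bytes

print(mixed_ok, as_text, as_bytes)
```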
Enjoy Python, it is the best!
If I were to take a dictionary, such as
living_beings= {"Reptile":"Snake","mammal":"whale", "Other":"bird"}
and wished to search for individual characters (such as "a"), e.g.
for i in living_beings:
    if "a" in living_beings:
        print("a is here")
would there be an efficient (fastest-running) method of doing this?
The input is simply searching as outlined above (although my approach didn't work).
My (failed) code goes as follows:
animals=[]
for row in reader: #'reader' is simply what was in the dictionary
    animals.append(row) #I tried to turn it into a list to sort it that way
for i in range(1, len(animals)):
    r= animals[i]
    for i in r:
        if i== "a": #My attempt to find "a". This is obviously False as i= one of the strings in
            k=i.replace("'","/") #this is my attempt at the further bit, for a bit of context
            test= animals.append(k)
            print(test)
In case you were wondering,
The next step would be to insert a character- "/"- before that letter (in this case "a"), although this is a slightly different problem and so not linked with my question and is simply there to give a greater understanding of the problem.
EDIT
I have found another error relating to the dictionary. If a value in the dictionary contains an apostrophe ('), the output is affected: that particular word is printed in double quotes ("") rather than the normal single quotes. EXAMPLE: living_beings= {"Reptile":"Snake's","mammal":"whale", "Other":"bird"} and if you use the following code (which I need to):
new= []
for i in living_beings:
    r=living_beings[i]
    new.append(r)
then the output is "Snake's", 'whale', 'bird' (note the difference between the first and the other outputs). So my question is: how do I stop the apostrophes affecting the output?
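A likely explanation for the EDIT, offered as a sketch rather than a definitive answer: printing a list shows the repr() of each string, and repr() switches to double quotes when the string itself contains a single quote. The data is not changed, only how it is displayed. Joining the strings yourself avoids the quotes entirely. The variable names shown_as_list and shown_joined are invented for the example.

```python
living_beings = {"Reptile": "Snake's", "mammal": "whale", "Other": "bird"}
new = list(living_beings.values())

# print(new) displays the repr of the list, quotes included.
shown_as_list = repr(new)

# Joining prints the strings themselves, without any quotes.
shown_joined = ", ".join(new)

print(shown_as_list)
print(shown_joined)
```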
My approach would be to use a dict comprehension to map over the dictionary and replace every occurrence of 'a' by '/a'.
I don't think there are significant performance improvements that can be made from there. Your algorithm will be linear with regard to the total number of characters in the keys and values of the dict, as you need to traverse the whole dictionary whatever the input.
living_beings= {"Reptile":"Snake","mammal":"whale", "Other":"bird"}
new_dict = {
    kind.replace('a', '/a'): animal.replace('a', '/a')
    for kind, animal in living_beings.items()
}
# new_dict: {"Reptile":"Sn/ake","m/amm/al":"wh/ale", "Other":"bird"}
You could maybe optimize with a more convoluted solution that loops through the dict to mutate it instead of creating a new one, but in general I recommend not trying to do such things in Python. Just write good code, with good practices, and let Python do the optimization under the hood. After all this is what the Zen of Python tells us: Simple is better than complex.
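For the search part of the question on its own, the in operator on each string is probably all you need. A minimal sketch, where entries_containing is a hypothetical helper name:

```python
living_beings = {"Reptile": "Snake", "mammal": "whale", "Other": "bird"}

def entries_containing(mapping, char):
    """Return the items whose key or value contains the given character."""
    return {k: v for k, v in mapping.items() if char in k or char in v}

hits = entries_containing(living_beings, "a")
has_a = any("a" in animal for animal in living_beings.values())

print(hits)
print(has_a)
```

Note that "a" in living_beings only checks whether "a" is one of the keys, which is why the loop in the question never printed anything.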
This can also be done using a regular expression search, e.g.:
import re
living_beings = {"Reptile":"Snake","mammal":"whale", "Other":"bird"}
re_contains_a = re.compile(r'a')
for key, word in living_beings.items():
    match = re_contains_a.search(word)
    if match:
        print(key, match.start())
The match object returned by search() can then be used to find the location of the matched text, e.g. with match.start().
I'm writing some code that reads words from a text file and sorts them into a dictionary. It actually all runs fine, but for reference here it is:
def find_words(file_name, delimiter = " "):
    """
    A function for finding the number of individual words, and the most popular words, in a given file.
    The process will stop at any line in the file that starts with the word 'finish'.
    If there is no finish point, the process will go to the end of the file.
    Inputs: file_name: Name of file you want to read from, e.g. "mywords.txt"
            delimiter: The way the words in the file are separated e.g. " " or ", "
                     : Delimiter will default to " " if left blank.
    Output: Dictionary with all the words contained in the given file, and how many times each word appears.
    """
    words = []
    dictt = {}
    with open(file_name, 'r') as wordfile:
        for line in wordfile:
            words = line.split(delimiter)
            if words[0]=="finish":
                break
            # This next part is for filling the dictionary
            # and correctly counting the amount of times each word appears.
            for i in range(len(words)):
                a = words[i]
                if a=="\n" or a=="":
                    continue
                elif a not in dictt:
                    dictt[words[i]] = 1
                else:
                    dictt[words[i]] = int(dictt.get(a)) + 1
    return dictt
The problem is that it only works if the arguments are given as string literals, e.g, this works:
test = find_words("hello.txt", " " )
But this doesn't:
test = find_words(hello.txt, )
The error message is undefined name 'hello'
I don't know how to alter the function arguments such that I can enter them without speech marks.
Thanks!
Simple, you define that name:
class hello:
    txt = "hello.txt"
But joking aside, all the argument values in a function call are expressions. If you want to pass a string literally you'll have to make a string literal, using the quotes. Python is not a text preprocessor like m4 or cpp, and expects the entire program text to follow its syntax.
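To make the point concrete, here is a small sketch. file_label and hello_path are names invented for the illustration, and eval() is used only so that the failing call can be caught and shown.

```python
def file_label(file_name):
    return "reading " + file_name

hello_path = "hello.txt"                 # a name bound to a string

from_literal = file_label("hello.txt")   # string literal as the argument
from_variable = file_label(hello_path)   # any expression that yields a string

# Without quotes, hello.txt means "attribute txt of the name hello".
try:
    eval("file_label(hello.txt)", {"file_label": file_label})
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(from_literal, from_variable, error_name)
```

Both successful calls pass the same string. What matters is the value the argument expression evaluates to, not how it was spelled.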
So it turns out I just misunderstood what was being asked. I've had it clarified by the course leader now.
As I am now fully aware, a function definition needs to be told when a string is being entered, hence the quote marks being required.
I admit full ignorance over my depth of understanding of how it all works - I thought you could pretty much put any assortment of letters and/or numbers in as an argument and then you can manipulate them within the function definition.
My ignorance may stem from the fact that I'm quite new to Python, having learned my coding basics on C++ where, if I remember correctly (it was well over a year ago), functions are defined with each argument being specifically set up as their type, e.g.
int max(int num1, int num2)
Whereas in Python you don't quite do it like that.
Thanks for the attempts at help (and ridicule!)
Problem is sorted now.
Given a string s containing (syntactically valid) Python source code, how can I split s into an array whose elements are the strings corresponding to the Python "statements" in s?
I put scare-quotes around "statements" because this term does not capture exactly what I'm looking for. Rather than trying to come up with a more accurate wording, here's an example. Compare the following two ipython interactions:
In [1]: if 1 > 0:
......:     pass
......:
In [2]: if 1 > 0
  File "<ipython-input-1082-0b411f095922>", line 1
    if 1 > 0
            ^
SyntaxError: invalid syntax
In the first interaction, after the first [RETURN] statement, ipython processes the input if 1 > 0: without objection, even though it is still incomplete (i.e. it is not a full Python statement). In contrast, in the second interaction, the input is not only incomplete (in this sense), but also not acceptable to ipython.
As a second, more complete example, suppose the file foo.py contains the following Python source code:
def print_vertically(s):
    '''A pretty useless procedure.
    Prints the characters in its argument one per line.
    '''
    for c in s:
        print c
greeting = ('hello '
            'world'.
            upper())
print_vertically(greeting)
Now, if I ran the following snippet, featuring the desired split_python_source function:
src = open('foo.py').read()
for i, s in enumerate(split_python_source(src)):
    print '%d. >>>%s<<<' % (i, s)
the output would look like this:
0. >>>def print_vertically(s):<<<
1. >>>    '''A pretty useless procedure.
    Prints the characters in its argument one per line.
    '''<<<
2. >>>    for c in s:<<<
3. >>>        print c<<<
4. >>>greeting = ('hello '
            'world'.
            upper())<<<
5. >>>print_vertically(greeting)<<<
As you can see, in this splitting, for c in s: (for example) gets assigned to its own item, rather than being part of some "compound statement."
In fact, I don't have a very precise specification for how the splitting should be done, as long as it is done "at the joints" (like ipython does).
I'm not familiar with the internals of the Python lexer (though almost certainly many people on SO are :), but my guess is that you're basically looking for lines, with one important exception : paired open-close delimiters that can span multiple lines.
As a quick and dirty first pass, you might be able to start with something that splits a piece of code on newlines, and then you could merge successive lines that are found to contain paired delimiters -- parentheses (), braces {}, brackets [], and quotes '', ''' ''' are the ones that come to mind.
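Rather than merging delimiters by hand, one alternative is to let the standard library tokenizer do the pairing. This swaps in a different technique from the one described above. The tokenize module emits a NEWLINE token only at the end of a logical line and uses NL for line breaks inside brackets or on blank lines. Below is a rough sketch for Python 3 source. split_logical_lines is a hypothetical name, the demo uses print(c) instead of the Python 2 print c, and full-line comments are simply dropped.

```python
import io
import tokenize

# Token types that never start or end a logical line on their own.
_SKIPPED = {
    tokenize.NL,
    tokenize.COMMENT,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}

def split_logical_lines(source):
    """Split Python source into the text of each logical line."""
    physical = source.splitlines(keepends=True)
    pieces = []
    first_row = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if first_row is None:
            first_row = tok.start[0]   # row of the first real token
        if tok.type == tokenize.NEWLINE:
            # NEWLINE closes the logical line; tok.start[0] is its last row.
            chunk = "".join(physical[first_row - 1:tok.start[0]])
            pieces.append(chunk.rstrip("\n"))
            first_row = None
    return pieces

demo_source = (
    "def print_vertically(s):\n"
    "    '''A pretty useless procedure.\n"
    "    Prints the characters in its argument one per line.\n"
    "    '''\n"
    "    for c in s:\n"
    "        print(c)\n"
    "greeting = ('hello '\n"
    "            'world'.\n"
    "            upper())\n"
    "print_vertically(greeting)\n"
)
pieces = split_logical_lines(demo_source)
for i, piece in enumerate(pieces):
    print("%d. >>>%s<<<" % (i, piece))
```

Because whole physical lines are copied, two statements separated by a semicolon on one line would show up together in more than one piece, so treat this as a starting point only.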
I am working on the letter distribution problem from HP code wars 2012. I keep getting an error message that says "invalid character in identifier". What does this mean and how can it be fixed?
Here is the page with the information.
import string
def text_analyzer(text):
'''The text to be parsed and
the number of occurrences of the letters given back
be. Punctuation marks, and I ignore the EOF
simple. The function is thus very limited.
'''
result = {}
# Processing
for a in string.ascii_lowercase:
result [a] = text.lower (). count (a)
return result
def analysis_result (results):
# I look at the data
keys = analysis.keys ()
values \u200b\u200b= list(analysis.values \u200b\u200b())
values.sort (reverse = True )
# I turn to the dictionary and
# Must avoid that letters will be overwritten
w2 = {}
list = []
for key in keys:
item = w2.get (results [key], 0 )
if item = = 0 :
w2 [analysis results [key]] = [key]
else :
item.append (key)
w2 [analysis results [key]] = item
# We get the keys
keys = list (w2.keys ())
keys.sort (reverse = True )
for key in keys:
list = w2 [key]
liste.sort ()
for a in list:
print (a.upper (), "*" * key)
text = """I have a dream that one day this nation will rise up and live out the true
meaning of its creed: "We hold these truths to be self-evident, that all men
are created equal. "I have a dream that my four little children will one day
live in a nation where they will not be Judged by the color of their skin but
by the content of their character.
# # # """
analysis result = text_analyzer (text)
analysis_results (results)
The error SyntaxError: invalid character in identifier means you have some character in the middle of a variable name, function, etc. that's not a letter, number, or underscore. The actual error message will look something like this:
File "invalchar.py", line 23
values = list(analysis.values ())
^
SyntaxError: invalid character in identifier
That tells you what the actual problem is, so you don't have to guess "where do I have an invalid character"? Well, if you look at that line, you've got a bunch of non-printing garbage characters in there. Take them out, and you'll get past this.
If you want to know what the actual garbage characters are, I copied the offending line from your code and pasted it into a string in a Python interpreter:
>>> s=' values = list(analysis.values ())'
>>> s
' values \u200b\u200b= list(analysis.values \u200b\u200b())'
So, that's \u200b, or ZERO WIDTH SPACE. That explains why you can't see it on the page. Most commonly, you get these because you've copied some formatted (not plain-text) code off a site like StackOverflow or a wiki, or out of a PDF file.
If your editor doesn't give you a way to find and fix those characters, just delete and retype the line.
Of course you've also got at least two IndentationErrors from not indenting things, and at least one more SyntaxError from stray spaces (like = = instead of ==) or underscores turned into spaces (like analysis results instead of analysis_results).
The question is, how did you get your code into this state? If you're using something like Microsoft Word as a code editor, that's your problem. Use a text editor. If not… well, whatever the root problem is that caused you to end up with these garbage characters, broken indentation, and extra spaces, fix that, before you try to fix your code.
If your keyboard is set to English US (International) rather than English US the double quotation marks don't work. This is why the single quotation marks worked in your case.
Similar to the previous answers, the problem is some character (possibly invisible) that the Python interpreter doesn't recognize. Because this is often due to copy-pasting code, re-typing the line is one option.
But if you don't want to re-type the line, you can paste your code into this tool or something similar (Google "show unicode characters online"), and it will reveal any non-standard characters. For example,
s=' values = list(analysis.values ())'
becomes
s=' values U+200B U+200B = list(analysis.values U+200B U+200B ())'
You can then delete the non-standard characters from the string.
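If you would rather not paste your code into a website, the same check is a few lines of Python. A sketch, assuming that anything outside ASCII is worth looking at (which will also flag legitimate non-ASCII text in strings and comments); find_suspicious_chars is a made-up name:

```python
import unicodedata

def find_suspicious_chars(source):
    """List (line, column, code point, name) for every non-ASCII character."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for col, ch in enumerate(line, 1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((lineno, col, "U+%04X" % ord(ch), name))
    return hits

sample = "values \u200b\u200b= list(x)\ny = 1 \u2212 2\n"
hits = find_suspicious_chars(sample)
for hit in hits:
    print(hit)
```

Line and column numbers are 1-based, so they match what most editors display.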
Check your quotation marks carefully. Sometimes double quotation marks don't work properly, depending on your keyboard layout.
I got a similar issue. My solution was to change minus character from:
—
to
-
I sometimes get that error when I type with a Chinese input method.
When it comes to punctuation marks, you do not notice that you are actually typing the Chinese version instead of the English version.
The interpreter will give you an error message, but for human eyes it is hard to notice the difference.
For example, "，" (U+FF0C, fullwidth comma) in Chinese and "," in English.
So be careful with your language setting.
Not sure this is right on, but when I copied some code from a paper on using pgmpy and pasted it into the editor under Spyder, I kept getting the "invalid character in identifier" error, though it didn't look bad to me. The particular line was:
grade_cpd = TabularCPD(variable='G',\
For no good reason I replaced the ' with " throughout the code and it worked. Not sure why, but it did work.
A little bit late but I got the same error and I realized that it was because I copied some code from a PDF. Check the difference between these two:
-
−
The first one is from hitting the minus key on the keyboard, and the second is from a LaTeX-generated PDF.
This error occurs mainly when copy-pasting the code. Try editing/replacing minus(-), bracket({) symbols.
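If there are many such characters, replacing them by hand gets tedious. Here is a sketch of a bulk clean-up using str.translate. The table PASTE_FIXES and the function normalize_pasted_code are my own names, and the choice of characters is an assumption based on the ones mentioned in these answers, not a complete list.

```python
# Look-alike characters that commonly sneak in when pasting code,
# mapped to the plain ASCII character that was probably intended.
PASTE_FIXES = str.maketrans({
    "\u2212": "-",   # MINUS SIGN (e.g. from LaTeX-generated PDFs)
    "\u2013": "-",   # EN DASH
    "\u2014": "-",   # EM DASH
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
    "\uff0c": ",",   # FULLWIDTH COMMA (Chinese input methods)
    "\u00a0": " ",   # NO-BREAK SPACE
    "\u200b": None,  # ZERO WIDTH SPACE: delete outright
})

def normalize_pasted_code(text):
    """Replace common look-alike characters with their ASCII equivalents."""
    return text.translate(PASTE_FIXES)

pasted = "values \u200b\u200b= total \u2212 1\u00a0# \u201cok\u201d"
cleaned = normalize_pasted_code(pasted)
print(cleaned)
```

Be aware that this also rewrites those characters inside string literals and comments, where they may have been intentional, so review the result before saving it.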
You don't get a good error message in IDLE if you just Run the module. Try typing an import command from within IDLE shell, and you'll get a much more informative error message. I had the same error and that made all the difference.
(And yes, I'd copied the code from an ebook and it was full of invisible "wrong" characters.)
My solution was to switch my Mac keyboard from Unicode to U.S. English.
It was similar for me as well, after copying the code from my email.
def update(self, k=1, step = 2):
    if self.start.get() and not self.is_paused.get(): U+A0
        x_data.append([i for i in range(0,k,1)][-1])
        y = [i for i in range(0,k,step)][-1]
There was an additional U+A0 (no-break space) character, which I found with the tool recommended by @Jacob Stern.