Learning Python the Hard Way: Example 5 - python

The following gives a syntax error:
my eyes = 'Brown' my_hair = 'Brown'
print "Hes got %s and %s hair" % (my_eyes, my_hair)
The only way this seems to work is if I put Brown, Brown in the last parenthesis.

The assignment is the problem: two assignments cannot share one line like that. Either put each on its own line or unpack a tuple of strings into two variables. In addition, Python variable names cannot contain spaces, so you'll want to use an underscore in my_eyes.
my_eyes, my_hair = 'Brown', 'Brown' # unpacking tuple here
Also, I suggest you use the format method, which is generally preferred in newer code. The % style is older, though it is not formally deprecated.
print "He's got {0} and {1} hair".format(my_eyes, my_hair)
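For reference, here is a minimal Python 3 sketch of the fixes discussed above. It assumes Python 3 (where print is a function), and the message wording with "eyes" added is an assumption made for readability, not the book's exact text.

```python
# Variable names cannot contain spaces, so use an underscore.
my_eyes, my_hair = 'Brown', 'Brown'  # tuple unpacking: one value per name

# Old %-style formatting: the tuple goes right after the % operator.
percent_version = "He's got %s eyes and %s hair." % (my_eyes, my_hair)

# The same message built with str.format.
format_version = "He's got {0} eyes and {1} hair.".format(my_eyes, my_hair)

print(percent_version)
print(format_version)
```

Both lines print the same sentence, so either style works once the assignment itself is valid.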

The problem turned out to be that the period at the end of the print statement was outside of the parenthesis. This now works: % (eyes, hair). The format version also works now.

Here are your variables:
name = "some name"
Age = 57
Height = 64
Weight = 135
Eyes = "brown"
Teeth = "white"
Hair = "brown"
To print a string with variables, use str.format.
print "Let's talk about {}".format(name)
print "She's {} inches tall".format(Height)
... So on
Make sure that your variables contain no spaces. They're case sensitive too :)

Related

Trying to edit string using .replace with Python

I have a string:
some_string = "I rode my bike 100' North toward the train station"
I want to change the (' North ) part to (' N ), so that that part reads as (...my bike 100' N toward the...) etc.
Right now I'm trying:
some_string = some_string.replace("' North ", "' N ")
But it just stays the same.
I don't want to use anything tricky like .replace('orth', '') because I want it to work with longer sentences that might include instances of 'North' but no apostrophe nearby.
Why isn't my first method working?
Please help!
EDIT:
So I am getting that first string by searching within another string.
Python, for some reason, returns it so that the apostrophe is a different kind of apostrophe!? Maybe to distinguish it from the single quotes that are not escaped.
some_string = '’'
^ It looks like that (copied and pasted it). Where does that come from? How would I type it out using my keyboard? Wtf!
EDIT 2:
I am getting the first string from Adobe PDF. I think it is formatted as a "fancy quote" that you get by holding down Alt and typing 0146 on number pad!!!
In Python (and in most high-level programming languages), strings are immutable: you cannot change one in place, but you can produce another string.
So, to achieve your goal, here is my suggestion:
some_string = "I rode my bike 100' North toward the train station"
some_string = some_string.replace("' North ", "' N ") # assign the new string to the old string
print(some_string)
Output: "I rode my bike 100' N toward the train station"
This will work if you assign a new variable.
some_string = "I rode my bike 100' North toward the train station"
new_string = some_string.replace("' North ", "' N ")
print(new_string)
>> I rode my bike 100' N toward the train station
Strings are immutable, meaning they can't be changed; accordingly, this method returns a new string rather than editing in place. You simply need to assign the result of the code you already have to a variable. You can even assign it to the same variable, so that the variable name now points to a different string (i.e. the one you want).
Try
some_string=some_string.replace("' North ", "' N ")
instead.
Note that the documentation at https://docs.python.org/3/library/stdtypes.html says:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
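As a small illustration of the documented behaviour, the sketch below uses a made-up string (the variable names are hypothetical) to show the optional count argument and that the original string is never modified.

```python
directions = "North then North then North"

# With count=1 only the first occurrence is replaced.
first_only = directions.replace("North", "N", 1)

# Without count every occurrence is replaced.
all_of_them = directions.replace("North", "N")

print(first_only)
print(all_of_them)
print(directions)  # the original string is unchanged
```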
I call bs on that edit, it works properly, see https://ideone.com/8Mossq
some_string = "I rode my bike 100' North toward the train station"
some_string = some_string.replace("' North ", "' N ")
print(some_string)
Result:
I rode my bike 100' N toward the train station
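The second edit in the question points at the likely cause: text copied from a PDF often contains the typographic apostrophe U+2019 (the character produced by Alt+0146 on Windows) rather than the ASCII one. The sketch below assumes that is the character involved; it normalizes the text first and then applies the replacement from the question.

```python
RIGHT_SINGLE_QUOTE = "\u2019"  # the "fancy" apostrophe

# Assumed input: the sentence from the question, but with the PDF apostrophe.
pdf_text = "I rode my bike 100\u2019 North toward the train station"

# The ASCII pattern does not match the typographic character,
# so this call returns the text unchanged.
unchanged = pdf_text.replace("' North ", "' N ")

# Normalize the apostrophe first, then do the replacement.
normalized = pdf_text.replace(RIGHT_SINGLE_QUOTE, "'")
fixed = normalized.replace("' North ", "' N ")

print(fixed)
```

Normalizing once, right after the text is read, keeps every later pattern written with plain keyboard characters.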

String formatting in Python -textwrapper

I have a list People = ['Tom', 'Mike', 'Carol'] and I want to print it in the format below.
People Tom
       Mike
       Carol
Basically it's the title (People), some fixed tabs, and then the list of names, each on a new line (it sort of looks like a table). I want to achieve this using textwrap in Python, but if there is any other possibility, I am open to that as well.
dedented_text = textwrap.dedent(' This is a test sentecne to see how well the textwrap works and hence using it here to test it').strip()
for width in [ 20 ]:
    h='{0: <4}'.format('f')
    k='{0: >4}'.format(textwrap.fill(dedented_text, width=20))
    print h+k
Output:
f   This is a test
sentecne to see how
well the textwrap
works and hence
using it here to
test it
Above I added my code for printing the category and a sentence, but I'm not able to achieve what I want.
Just print the first line separately then print the others in a loop:
print 'People ' + people[0]
for i in range(1, len(people)):
    print ' ' * 7 + people[i]
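The same idea can be generalized so the indent is derived from the title instead of being hardcoded as 7. This is only a sketch in Python 3; format_titled_list and its gap parameter are hypothetical names introduced for illustration.

```python
def format_titled_list(title, items, gap=1):
    """Return lines with the title on the first row and items in one column."""
    width = len(title) + gap  # column where the items start
    lines = []
    for index, item in enumerate(items):
        label = title if index == 0 else ""  # title only on the first row
        lines.append("{:<{width}}{}".format(label, item, width=width))
    return lines


people = ["Tom", "Mike", "Carol"]
print("\n".join(format_titled_list("People", people)))
```

Because the width comes from len(title), a longer title such as "Attendees" keeps the names aligned without further changes.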

Why I would use ".format()" instead of "+"?

I was wondering why I should use something like this:
name = "Doe"
surname = "John"
print("He is {0} {1}".format(surname, name))
Instead of:
name = "Doe"
surname = "John"
print("He is " + surname + " " + name)
For starters, try doing this with +:
>>> concatenate_me = (1,2,99999,100,600, 80)
>>>'{0} {0} {2} {2} {1} {2} {3} {5} {5} {4} {0} {2}'.format(*concatenate_me)
.format() benefits:
It uses placeholders such as {0}, {1} and {2}. With .format, the arguments passed are substituted into their respective placeholders (based on their order). This allows you to re-use arguments, as seen in the example above.
Each replacement field can carry a format specification (after the :). This specification gives you control over many properties of each substitution you make, and there's a whole mini-language for it.
Additionally, .format is a method, which you can pass as an argument when needed. In Python 3 it is called advanced string formatting as it is much more powerful than simple concatenation.
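A few common uses of the format specification are sketched below. The values and variable names are made up for illustration; the specifiers themselves are standard parts of the mini-language.

```python
right_aligned = "{:>8}".format("abc")     # pad on the left to width 8
centered = "{:*^9}".format("mid")         # center in width 9, fill with *
two_decimals = "{:.2f}".format(3.14159)   # fixed-point, two decimals
thousands = "{:,}".format(1234567)        # comma as thousands separator
padded_int = "{:05d}".format(42)          # zero-pad an integer to width 5

for value in (right_aligned, centered, two_decimals, thousands, padded_int):
    print(repr(value))
```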
You can do some pretty wild and flexible things if you really want using the .format function as well, for instance:
>>> import sys
>>> 'Python {0.version_info[0]:!<13.2%}'.format(sys)
'Python 300.00%!!!!!!'
And one further example with a dictionary, to display its ability to take keyword arguments:
>>> my_dict = {'adjective': 'cool', 'function': 'format'}
>>> "Look how awesome my {adjective} Python {function} skills are!".format(**my_dict)
'Look how awesome my cool Python format skills are!'
There are some further examples and uses in the Python docs.
format is much more powerful, and as you can see in the other answer, you can do a lot of cool things with it. However, I would like to add that format is not the fastest (at least in Python 3.4 on Ubuntu 14.04). For simple formatting, plus notation is faster. For example:
import timeit
print(timeit.timeit("name = \"Doe\"; surname = \"John\"; 'He is {0} {1}'.format(surname, name)", number=100000))
# 0.04642631400201935
print(timeit.timeit("name = \"Doe\"; surname = \"John\"; \"He is\" + surname + \" \" + name", number=100000))
# 0.01718082799925469

Replace integer with number of spaces

If I have these names:
bob = "Bob 1"
james = "James 2"
longname = "longname 3"
And printing these gives me:
Bob 1
James 2
longname 3
How can I make sure that the numbers would be aligned (without using \t or tabs or anything)? Like this:
Bob     1
James   2
longname3
This is a good use for a format string, which can specify a width for a field to be filled with a character (including spaces). But, you'll have to split() your strings first if they're in the format at the top of the post. For example:
"{: <10}{}".format(*bob.split())
# output: 'Bob       1'
The < means left align, and the space before it is the character that will be used to "fill" the "empty" part of the field. It doesn't have to be a space. 10 is the field width in characters, and the : is just to prevent it from thinking that <10 is supposed to be the name of the argument to insert here.
Based on your example, it looks like you want the width to be based on the longest name. In which case you don't want to hardcode 10 like I just did. Instead you want to get the longest length. Here's a better example:
names_and_nums = [x.split() for x in (bob, james, longname)]
longest_length = max(len(name) for (name, num) in names_and_nums)
format_str = "{: <" + str(longest_length) + "}{}"
for name, num in names_and_nums:
    print(format_str.format(name, num))
See: Format specification docs
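As a variation on the approach above, the width can be passed as a nested replacement field instead of building the format string by concatenation. This sketch assumes Python 3 and adds one extra column of padding after the longest name, which is a choice made here and not part of the original answer.

```python
rows = ["Bob 1", "James 2", "longname 3"]

# Split each "name number" string into its two parts.
pairs = [row.split() for row in rows]

# Width of the name column: longest name plus one space of padding.
width = max(len(name) for name, _ in pairs) + 1

# {:<{width}} left-aligns the name in a field whose width is filled in at call time.
aligned = ["{:<{width}}{}".format(name, num, width=width) for name, num in pairs]

print("\n".join(aligned))
```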

Python parsing

I'm trying to parse the title tag in an RSS 2.0 feed into three different variables for each entry in that feed. Using ElementTree I've already parsed the RSS so that I can print each title [minus the trailing )] with the code below:
feed = getfeed("http://www.tourfilter.com/dallas/rss/by_concert_date")
for item in feed:
    print repr(item.title[0:-1])
I include that because, as you can see, the item.title is a repr() data type, which I don't know much about.
A particular repr(item.title[0:-1]) printed in the interactive window looks like this:
'randy travis (Billy Bobs 3/21'
'Michael Schenker Group (House of Blues Dallas 3/26'
The user selects a band and I hope to, after parsing each item.title into 3 variables (one each for band, venue, and date... or possibly an array or I don't know...) select only those related to the band selected. Then they are sent to Google for geocoding, but that's another story.
I've seen some examples of regex and I'm reading about them, but it seems very complicated. Is it? I thought maybe someone here would have some insight as to exactly how to do this in an intelligent way. Should I use the re module? Does it matter that the output is currently repr()s? Is there a better way? I was thinking I'd use a loop like this (this is my pseudoPython, just kind of notes I'm writing):
list = bandRaw,venue,date,latLong
for item in feed:
    parse item.title for bandRaw, venue, date
    if bandRaw == str(band)
        send venue name + ", Dallas, TX" to google for geocoding
        return lat,long
        list = list + return character + bandRaw + "," + venue + "," + date + "," + lat + "," + long
    else
In the end, I need to have the chosen entries in a .csv (comma-delimited) file looking like this:
band,venue,date,lat,long
randy travis,Billy Bobs,3/21,1234.5678,1234.5678
Michael Schenker Group,House of Blues Dallas,3/26,4321.8765,4321.8765
I hope this isn't too much to ask. I'll be looking into it on my own, just thought I should post here to make sure it got answered.
So, the question is, how do I best parse each repr(item.title[0:-1]) in the feed into the 3 separate values that I can then concatenate into a .csv file?
Don't let regex scare you off... it's well worth learning.
Given the examples above, you might try putting the trailing parenthesis back in, and then using this pattern:
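import re
s = 'Michael Schenker Group (House of Blues Dallas 3/26)'
# raw string so the backslashes reach the regex engine untouched
pat = re.compile(r'([\w\s]+)\(([\w\s]+)(\d+/\d+)\)')
info = pat.match(s)
print info.groups()
('Michael Schenker Group ', 'House of Blues Dallas ', '3/26')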
To get at each group individually, just call them on the info object:
print info.group(1) # or info.groups()[0]
# strip() removes the trailing space captured by [\w\s]+
print '"%s","%s","%s"' % tuple(g.strip() for g in info.groups())
"Michael Schenker Group","House of Blues Dallas","3/26"
The hard thing about regex in this case is making sure you know all the known possible characters in the title. If there are non-alpha chars in the 'Michael Schenker Group' part, you'll have to adjust the regex for that part to allow them.
The pattern above breaks down as follows, which is parsed left to right:
([\w\s]+) : Match any word or space characters (the plus symbol indicates that there should be one or more such characters). The parentheses mean that the match will be captured as a group. This is the "Michael Schenker Group " part. If there can be numbers and dashes here, you'll want to modify the pieces between the square brackets, which are the possible characters for the set.
\( : A literal parenthesis. The backslash escapes the parenthesis, since otherwise it counts as a regex command. This is the "(" part of the string.
([\w\s]+) : Same as the one above, but this time matches the "House of Blues Dallas " part. In parentheses so they will be captured as the second group.
(\d+/\d+) : Matches the digits 3 and 26 with a slash in the middle. In parentheses so they will be captured as the third group.
\) : Closing parenthesis for the above.
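The breakdown above can also be written into the pattern itself with re.VERBOSE and named groups. This is a sketch in Python 3, not the answer's original pattern: it uses lazy .+? instead of [\w\s]+ so names containing punctuation still match, makes the closing parenthesis optional, and parse_title is a hypothetical helper name.

```python
import re

TITLE_PATTERN = re.compile(r"""
    (?P<band>.+?)\s*      # band name, as short as possible, before the bracket
    \(                    # literal opening parenthesis
    (?P<venue>.+?)\s+     # venue name, up to the whitespace before the date
    (?P<date>\d+/\d+)     # date such as 3/26
    \)?                   # optional closing parenthesis
    $                     # nothing may follow
""", re.VERBOSE)


def parse_title(title):
    """Return (band, venue, date), or None if the title does not fit."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return None
    return match.group("band"), match.group("venue"), match.group("date")


print(parse_title("Michael Schenker Group (House of Blues Dallas 3/26)"))
print(parse_title("randy travis (Billy Bobs 3/21"))
```

Returning None for titles that do not fit lets the caller skip malformed feed entries instead of raising an exception.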
The Python introduction to regular expressions is quite good, and you might want to spend an evening going over it: http://docs.python.org/library/re.html#module-re. Also check Dive Into Python, which has a friendly introduction: http://diveintopython3.ep.io/regular-expressions.html.
EDIT: See zacherates below, who has some nice edits. Two heads are better than one!
Regular expressions are a great solution to this problem:
>>> import re
>>> s = 'Michael Schenker Group (House of Blues Dallas 3/26'
>>> re.match(r'(.*) \((.*) (\d+/\d+)', s).groups()
('Michael Schenker Group', 'House of Blues Dallas', '3/26')
As a side note, you might want to look at the Universal Feed Parser for handling the RSS parsing as feeds have a bad habit of being malformed.
Edit
In regards to your comment... The strings occasionally being wrapped in "s rather than 's has to do with the fact that you're using repr. The repr of a string is usually delimited with 's, unless that string contains one or more 's, where instead it uses "s so that the 's don't have to be escaped:
>>> "Hello there"
'Hello there'
>>> "it's not its"
"it's not its"
Notice the different quote styles.
Regarding the repr(item.title[0:-1]) part, I'm not sure where you got that from, but I'm pretty sure you can simply use item.title. All you're doing is removing the last character from the string and then calling repr() on it, which only adds quotes for display.
Your code should look something like this:
import re
import feedparser # from www.feedparser.org
import geocoders # from GeoPy
us = geocoders.GeocoderDotUS()
feedurl = "http://www.tourfilter.com/dallas/rss/by_concert_date"
feed = feedparser.parse(feedurl)
lines = []
for entry in feed.entries:
    m = re.search(r'(.*) \((.*) (\d+/\d+)\)', entry.title)
    if m:
        bandRaw, venue, date = m.groups()
        if band == bandRaw: # band is the name the user selected
            place, (lat, lng) = us.geocode(venue + ", Dallas, TX")
            # str() guards against lat and lng being numbers, which join() rejects
            lines.append(",".join([band, venue, date, str(lat), str(lng)]))
result = "\n".join(lines)
EDIT: replaced list with lines as the var name. list is a builtin and should not be used as a variable name. Sorry.
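Since the goal is a .csv file, the standard csv module is an alternative to joining with commas by hand, because it quotes fields that themselves contain commas. The sketch below assumes Python 3; rows_to_csv is a hypothetical helper, and the sample row reuses the placeholder values from the question.

```python
import csv
import io


def rows_to_csv(rows):
    """Return CSV text with a header line followed by one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["band", "venue", "date", "lat", "long"])
    writer.writerows(rows)  # numbers are converted to text automatically
    return buffer.getvalue()


sample_rows = [("randy travis", "Billy Bobs", "3/21", 1234.5678, 1234.5678)]
print(rows_to_csv(sample_rows))
```

To write to disk instead, the same writer can wrap a file opened with newline="".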
