Python 3: Alignment and commas

Python 3: Alignment and commas - python

For my Concepts of Programming course in college, we need to use functions in Python 3 to create a data table that involves numbers in the thousands. We need to include commas in the numbers. I would like to use spacing in the format. Is there a way to keep spacing to keep the table aligned and still be able to use commas in the variable?

Like this?
n = 1303344095
"{:15,d}".format(n)
Yields:
' 1,303,344,095'
So you can provide a field width specification, and then the comma specification, then the type specification, within the overall format spec. It also works with floating point numbers, albeit with an uglier syntax:
f = 1299.21
"{:10,.2f}".format(f)
Yields:
' 1,299.21'

Related

Maintain underscore separators' positions when printing f-string

If I have an amount like $40,000.00 I might want to make this more readable in code by writing the number with underscore separators
deposit = 40_000_00
I know I can use the following syntactic sugar for formatting f-strings but this prints a thousands-separated integer (4_000_000)
print(f"{deposit:_}")
# 4_000_000
Are the underscores that annotate the integer stored at any point for downstream use, or would I simply need to use an appropriate class with a __repr__ to achieve the desired effect?

This is impossible. As detailed in the PEP515 that you cited:
The underscores have no semantic meaning, and literals are parsed as if the underscores were absent.
Thus the resulting python object has no knowledge that those underscores where ever there.
If you need to store a custom separator, be explicit. Use a string.
deposit = '40_000_00'

Are the underscores that annotate the integer stored at any point for downstream use
No, they're purely a syntactic convenience.
You should just format your values using a real formatter:
import locale
locale.setlocale(locale.LC_ALL , 'en_CA') # E.g. for Canada
print(locale.format_string('%d', 1234567, grouping=True)) # 1,234,567
As a bonus, it can be made it be correct of the user's locale (e.g. many places use spaces for separators in stead of commans, and commas instead of the period for the decimal point).

loading complex numbers from yaml file into python

I have used the fix presented in YAML loads 5e-6 as string and not a number to correctly load into python numbers represented in scientific notation in a yaml file.
I also have complex numbers in my yaml file. These numbers may have varying amounts of whitespace (e.g., 2+2j, or 2 + 2j, etc). The complex numbers are currently being read in as strings (in the same way that the numbers in scientific notation were read in as strings prior to the fix referenced above). I would like to know how to modify the add_implicit_resolver argument in the fix to correctly read in complex numbers. Ideally, I'd like to continue to use pyyaml.
More specifically, if I have an entry in my yaml file such as:
offset: 2 + 1j
I would like this to be recognized as the complex number (of class 'complex'):
2+1j
Currently, the value in the python dictionary corresponding to the key 'offset' is a string:
'2 + 1j'
which I have to manually convert into a complex number via:
complex('2 + 1j'.replace(' ', ''))
I'm looking to automate this process by modifying the argument to the add_implicit_resolver using the same strategy as in the link above for dealing with numbers in scientific notation.
As for a spec, yes, in general, the real and imaginary parts of the complex number may be in scientific notation: (e.g., ' 2e-3 + 1.3e-4j'). I am fine with restricting the format to have a trailing j for the imaginary part. No tabs, no linefeeds, just individual spaces. Real or imaginary part can be missing and can be expressed as integers or floats.

python .format including float formatting

I have been aware of .format for a long while now but when I tried to include float formatting in a string like below:
"ab_{}_cd{}_e_{}_{}_{}_val={0:.5f}_{}".format(string1,str(2),string2,string3,string4,str(0.12345678),string5)
In the above example string1-5 denote random strings included for this example.
Returned the following error ValueError: cannot switch from automatic field numbering to manual field specification
After some searching this question does seem to hold the solution Using .format() to format a list with field width arguments
The above link shows I need to format all of the {} in the string to achieve the formatting I want. I have glanced at official documentation https://docs.python.org/2/library/string.html and https://docs.python.org/2/library/stdtypes.html#str.format and there isn't much in the way of explaining what I'm looking for.
Desired output:
An explanation as to how I can convert the automatic field specification of the .format option to a manual field specification with only formatting the float variable I have supplied and leaving all other string1-5 variables unformatted. Whilst I could just use something like round() or a numpy equivalent this might cause issues and I feel like I would learn better with the .format example.

in your code remove the zero before the rounded digit
"ab_{}_cd{}_e_{}_{}_{}_val={:.5f}_{}".format(string1,str(2),string2,string3,string4,str(0.12345678),string5)
NOTE: why it did not work is , you can either mention index in {} as {0},{1} or leave all of them, but you cannot keep few {} with index and few without any index.

Are there other ways to format strings other then comma, percent, plus sign?

I've been looking around and I've been unable to find a definitive answer to this question: what's the recommended way to print variables in Python?
So far, I've seen three ways: using commas, using percent signs, or using plus signs:
>>> a = "hello"
>>> b = "world"
>>> print a, "to the", b
hello to the world
>>> print "%s to the %s" % (a, b)
hello to the world
>>> print a + " to the " + b
hello to the world
Each method seems to have its pros and cons.
Commas allow to write the variable directly and add spaces, as well as automatically perform a string conversion if needed. But I seem to remember that good coding practices say that it's best to separate your variables from your text.
Percent signs allow that, though they require to use a list when there's more than one variable, and you have to write the type of the variable (though it seems able to convert even if the variable type isn't the same, like trying to print a number with %s).
Plus signs seem to be the "worst" as they mix variables and text, and don't convert on the fly; though maybe it is necessary to have more control on your variable from time to time.
I've looked around and it seems some of those methods may be obsolete nowadays. Since they all seem to work and each have their pros and cons, I'm wondering: is there a recommended method, or do they all depend on the context?

Including the values from identifiers inside a string is called string formatting. You can handle string formatting in different ways with various pros and cons.
Using string concatenation (+)
Con: You must manually convert objects to strings
Pro: The objects appear where you want to place the into the string
Con: The final layout may not be clear due to breaking the string literal
Using template strings (i.e. $bash-style substitution):
Pro: You may be familiar with shell variable expansion
Pro: Conversion to string is done automatically
Pro: Final layout is clear.
Con: You cannot specify how to perform the conversion
Using %-style formatting:
Pro: similar to formatting with C's printf.
Pro: conversions are done for you
Pro: you can specify different type of conversions, with some options (e.g. precision for floats)
Pro: The final layout is clear
Pro: You can also specify the name of the elements to substitute as in: %(name)s.
Con: You cannot customize handling of format specifiers.
Con: There are some corner cases that can puzzle you. To avoid them you should always use either tuple or dict as argument.
Using str.format:
All the pros of %-style formatting (except that it is not similar to printf)
Similar to .NET String.Format
Pro: You can manually specify numbered fields which allows you to use a positional argument multiple times
Pro: More options in the format specifiers
Pro: You can customize the formatting specifiers in custom types
The commas do not do string-formatting. They are part of the print statement statement syntax.
They have a softspace "feature" which is gone in python3 since print is a function now:
>>> print 'something\t', 'other'
something other
>>> print 'something\tother'
something other
Note how the above outputs are exactly equivalent even though the first one used comma.
This is because the comma doesn't introduce whitespace in certain situations (e.g. right after a tab or a newline).
In python3 this doesn't happen:
>>> print('something\t', 'other')
something other
>>> print('something\tother') # note the difference in spacing.
something other
Since python2.6 the preferred way of doing string formatting is using the str.format method. It was meant to replace the %-style formatting, even though currently there are no plans (and I don't there will ever be) to remove %-style formatting.

string.format() basics
Here are a couple of example of basic string substitution, the {} is the placeholder for the substituted variables. If no format is specified, it will insert and format as a string.
s1 = "so much depends upon {}".format("a red wheel barrow")
s2 = "glazed with {} water beside the {} chickens".format("rain", "white")
You can also use the numeric position of the variables and change them in the strings, this gives some flexibility when doing the formatting, if you made a mistake in the order you can easily correct without shuffling all variables around.
s1 = " {0} is better than {1} ".format("emacs", "vim")
s2 = " {1} is better than {0} ".format("emacs", "vim")
The format() function offers a fair amount of additional features and capabilities, here are a few useful tips and tricks using .format()
Named Arguments
You can use the new string format as a templating engine and use named arguments, instead of requiring a strict order.
madlib = " I {verb} the {object} off the {place} ".format(verb="took", object="cheese", place="table")
>>> I took the cheese off the table
Reuse Same Variable Multiple Times
Using the % formatter, requires a strict ordering of variables, the .format() method allows you to put them in any order as we saw above in the basics, but also allows for reuse.
str = "Oh {0}, {0}! wherefore art thou {0}?".format("Romeo")
>>> Oh Romeo, Romeo! wherefore art thou Romeo?
Use Format as a Function
You can use .format as a function which allows for some separation of text and formatting from code. For example at the beginning of your program you could include all your formats and then use later. This also could be a nice way to handle internationalization which not only requires different text but often requires different formats for numbers.
email_f = "Your email address was {email}".format
print(email_f(email="bob#example.com"))
Escaping Braces
If you need to use braces when using str.format(), just double up
print(" The {} set is often represented as {{0}} ".format("empty"))
>>> The empty set is often represented as {0}

the question is, wether you want print variables (case 1) or want to output formatted text (case 2). Case one is good and easy to use, mostly for debug output.
If you like to say something in a defined way, formatting is the better choice. '+' is not the pythonic way of string maipulation.
An alternative to % is "{0} to the {1}".format(a,b) and is the preferred way of formatting since Python 3.

Depends a bit on which version.
Python 2 will be simply:
print 'string'
print 345
print 'string'+(str(345))
print ''
Python 3 requires parentheses (wish it didn't personally)
print ('string')
print (345)
print ('string'+(str(345))
Also the most foolproof method to do it is to convert everything into a variable:
a = 'string'
b = 345
c = str(345)
d = a + c

finding all float literals in Python code

I am trying to find all occurrences of a literal float value in Python code. Can I do that in Komodo (or in any other way)?
In other words, I want to find every line where something like 0.0 or 1.5 or 1e5 is used, assuming it is interpreted by Python as a float literal (so no comments, for example).
I'm using Komodo 6.0 with Python 3.1.
If possible, a way to find string and integer literals would be nice to have as well.

Our SD Source Code Search Engine (SCSE) can easily do this.
SCSE is a tool for searching large source code bases, much faster than grep, by indexing the elements of the source code languages of interest. Queries can then be posed, which use the index to enable fast location of search hits. Queries and hits are displayed in a GUI, and a click on a hit will show the block of source code containing the hit.
The SCSE knows the lexical structure of each language it has indexed with the precision as that langauge's compiler. (It uses front ends from family of accurate programming language processors; this family is pretty large and happens to include the OP's target langauge of Python/Perl/Java/...). Thus it knows exactly where identifiers, comments, and literals (integral, float, character or string) are, and exactly their content.
SCSE queries are composed of commands representing sequences of language elements of interest. The query
'for' ... I '=' N=103
finds a for keyword near ("...") an arbitrary identifier(I) which is initialized ("=") with the numeric value ("N") of 103. Because SCSE understands the language structure, it ignores language-whitespace between the tokens, e.g., it can find this regardless off intervening blanks, whitespace, newlines or comments.
The query tokens I, N, F, S, C represent I(dentifier), Natural (number), F(loat), S(tring) and C(omment) respectively. The OP's original question, of finding all the floats, is thus the nearly trivial query
F
Similarly for finding all String literals ("S") and integral literals ("N"). If you wanted to find just copies of values near Pi, you'd add low and upper bound constraints:
F>3.14<3.16
(It is pretty funny to run this on large Fortran codes; you see all kinds of bad approximations of Pi).
SCSE won't find a Float in a comment or a string, because it intimately knows the difference. Writing a grep-style expression to handle all the strange combinations to eliminate whitespace or surrounding quotes and commente delimiters should be obviously a lot more painful. Grep ain't the way to do this.

You could do that by selecting what you need with regular expressions.
This command (run it on a terminal) should do the trick:
sed -r "s/^([^#]*)#.*$/\1/g" YOUR_FILE | grep -P "[^'\"\w]-?[1-9]\d*[.e]\d*[^'\"\w]"
You'll probably need to tweak it to get a better result.
`sed' cuts out comments, while grep selects only lines containing (a small subsect of - the expression I gave is not perfect) float values...
Hope it helps.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.