This should be easy.
Here's my array (rather, a method of generating representative test arrays):
>>> ri = numpy.random.randint
>>> ri2 = lambda x: ''.join(ri(0,9,x).astype('S'))
>>> a = array([float(ri2(x)+ '.' + ri2(y)) for x,y in ri(1,10,(10,2))])
>>> a
array([ 7.99914000e+01, 2.08000000e+01, 3.94000000e+02,
4.66100000e+03, 5.00000000e+00, 1.72575100e+03,
3.91500000e+02, 1.90610000e+04, 1.16247000e+04,
3.53920000e+02])
I want a list of strings where '\n'.join(list_o_strings) would print:
79.9914
20.8
394.0
4661.0
5.0
1725.751
391.5
19061.0
11624.7
353.92
I want to space pad to the left and the right (but no more than necessary).
I want a zero after the decimal if that is all that is after the decimal.
I do not want scientific notation.
..and I do not want to lose any significant digits. (in 353.98000000000002 the 2 is not significant)
Yeah, it's nice to want..
Python 2.5's %g, %fx.x, etc. are either befuddling me, or can't do it.
I have not tried import decimal yet. I can't see that NumPy does it either (although array.__str__ and array.__repr__ are decimal aligned, they sometimes return scientific notation).
Oh, and speed counts. I'm dealing with big arrays here.
My current solution approaches are:
to str(a) and parse off NumPy's brackets
to str(e) each element in the array and split('.') then pad and reconstruct
to a.astype('S'+str(i)) where i is the max(len(str(a))), then pad
It seems like there should be some off-the-shelf solution out there... (but not required)
The top suggestion fails when the dtype is float64:
>>> a
array([ 5.50056103e+02, 6.77383566e+03, 6.01001513e+05,
3.55425142e+08, 7.07254875e+05, 8.83174744e+02,
8.22320510e+01, 4.25076609e+08, 6.28662635e+07,
1.56503068e+02])
>>> ut0 = re.compile(r'(\d)0+$')
>>> thelist = [ut0.sub(r'\1', "%12f" % x) for x in a]
>>> print '\n'.join(thelist)
550.056103
6773.835663
601001.513
355425141.8471
707254.875038
883.174744
82.232051
425076608.7676
62866263.55
156.503068
Sorry, but after thorough investigation I can't find any way to perform the task you require without a minimum of post-processing (to strip off the trailing zeros you don't want to see); something like:
import re
ut0 = re.compile(r'(\d)0+$')
thelist = [ut0.sub(r'\1', "%12f" % x) for x in a]
print '\n'.join(thelist)
is speedy and concise, but breaks your constraint of being "off-the-shelf" -- it is, instead, a modular combination of general formatting (which almost does what you want but leaves trailing zero you want to hide) and a RE to remove undesired trailing zeros. Practically, I think it does exactly what you require, but your conditions as stated are, I believe, over-constrained.
Edit: original question was edited to specify more significant digits, require no extra leading space beyond what's required for the largest number, and provide a new example (where my previous suggestion, above, doesn't match the desired output). The work of removing leading whitespace that's common to a bunch of strings is best performed with textwrap.dedent -- but that works on a single string (with newlines) while the required output is a list of strings. No problem, we'll just put the lines together, dedent them, and split them up again:
import re
import textwrap
ut0 = re.compile(r'(\d)0+$')
a = [ 5.50056103e+02, 6.77383566e+03, 6.01001513e+05,
      3.55425142e+08, 7.07254875e+05, 8.83174744e+02,
      8.22320510e+01, 4.25076609e+08, 6.28662635e+07,
      1.56503068e+02]
thelist = textwrap.dedent(
    '\n'.join(ut0.sub(r'\1', "%20f" % x) for x in a)).splitlines()
print '\n'.join(thelist)
emits:
550.056103
6773.83566
601001.513
355425142.0
707254.875
883.174744
82.232051
425076609.0
62866263.5
156.503068
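For readers on Python 3, the same idea (general fixed-point formatting plus a regex to drop trailing zeros) can be sketched as below. Instead of textwrap.dedent it pads each side of the decimal point separately, which also gives the right-hand padding the question asks for. The name format_column and its decimals parameter are illustrative assumptions, not part of the answer above.

```python
import re

# A digit followed by a run of zeros at the end of the string: keep the digit.
_TRAILING_ZEROS = re.compile(r'(\d)0+$')

def format_column(values, decimals=6):
    """Return fixed-point strings with trailing zeros stripped and decimal
    points aligned. Assumes decimals >= 1, so every string contains a '.'."""
    stripped = [_TRAILING_ZEROS.sub(r'\1', '%.*f' % (decimals, v)) for v in values]
    # Split at the decimal point so each side can be padded on its own.
    parts = [s.split('.') for s in stripped]
    int_width = max(len(whole) for whole, _ in parts)
    frac_width = max(len(frac) for _, frac in parts)
    return [whole.rjust(int_width) + '.' + frac.ljust(frac_width)
            for whole, frac in parts]

if __name__ == "__main__":
    print('\n'.join(format_column([79.9914, 20.8, 394.0, 4661.0])))
```

Like the original, this rounds to a fixed number of decimals first, so digits beyond that are lost.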
Python's string formatting can either print out only the necessary decimals (with %g) or use a fixed number of decimals (with %f). However, you want to print out only the necessary decimals, except that a whole number should get one decimal, and that makes it complex.
This means you would end up with something like:
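import math

def printarr(arr):
    for x in arr:
        if math.floor(x) == x:
            # Whole number: keep exactly one decimal.
            res = '%.1f' % x
        else:
            # Otherwise up to 10 significant digits, no trailing zeros.
            res = '%.10g' % x
        # Pad on the left so the decimal point lands in column 15.
        print "%*s" % (15 - res.find('.') + len(res), res)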
This first creates a string with one decimal if the value is a whole number, or with automatic decimals (up to 10 significant digits) if it is not. It then prints the string, padded so that the decimal points are aligned.
Probably, though, numpy actually does what you want, because you typically do want it to be in exponential mode if it's too long.
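On Python 2.7 / 3.1 or later, where repr() of a float gives the shortest string that round-trips, one possible way to meet both the "no scientific notation" and "no lost significant digits" constraints is to route repr() through decimal.Decimal and format the result with 'f'. This is a minimal sketch that assumes finite values; plain_float is a hypothetical helper name, not something from NumPy or the standard library.

```python
from decimal import Decimal

def plain_float(x):
    """Shortest round-trip digits of x in positional (non-scientific) notation.

    Assumes x is finite; inf and nan are not handled.
    """
    # float() turns NumPy scalars into plain Python floats before repr().
    text = format(Decimal(repr(float(x))), 'f')
    # Whole numbers that repr() wrote with an exponent come back without a
    # decimal point, so add the '.0' the question asks for.
    return text if '.' in text else text + '.0'

if __name__ == "__main__":
    for value in (353.98, 394.0, 1e22, 1e-07):
        print(plain_float(value))
```

This runs per element in pure Python, so it is unlikely to be fast on very large arrays.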
Related
I would like to format my floats with a fixed amount of digits. Right now I'm doing the following
format="%6.6g"
print(format%0.00215165)
print(format%1.23260)
print(format%145.5655)
But this outputs
0.00215165
1.2326
145.565
I also tried format="%6.6f" but it doesn't really give what I want either...
0.002152
1.232600
145.565500
What would be a good way to format the numbers so that all of them have exactly width 6 (and no spaces) like so ?
0.002152
1.232600
145.5655
This is complicated because you want the precision (number of decimals) to depend on the available space, while the general thrust of floating-point formatting is to make the number of significant digits depend on the available space. To do what you want you need a function that computes the desired number of decimals from the log of the number. There isn't, so far as I know, a built-in function that will do this for you.
import math

def decimals(v):
    return max(0, min(6, 6 - int(math.log10(abs(v))))) if v else 6
This simply takes the base-10 log of the number and truncates it to an int, so 10-99 -> 1, 100-999 -> 2, etc. You then use that
result to work out the precision to which the number needs to be formatted. In practice the
function is more complex because of the corner cases: what to do with negative numbers, numbers that underflow, etc.
For simplicity I've deliberately left your figure of 6 decimals hard-coded 3 times in the function.
Then formatting isn't so hard:
>>> v = 0.00215165
>>> "{0:.{1}f}".format(v, decimals(v))
'0.002152'
>>> v2 = 1.23260
>>> "{0:.{1}f}".format(v2, decimals(v2))
'1.232600'
>>> v3 = 145.5655
>>> "{0:.{1}f}".format(v3, decimals(v3))
'145.5655'
>>> vz = 0e0 # behaviour with zero
>>> "{0:.{1}f}".format(vz, decimals(vz))
'0.000000'
>>> vu = 1e-10 # behaviour with underflow
>>> "{0:.{1}f}".format(vu, decimals(vu))
'0.000000'
>>> vo = 1234567 # behaviour when nearly out of space
>>> "{0:.{1}f}".format(vo, decimals(vo))
'1234567'
>>> voo = 12345678 # behaviour when all out of space
>>> "{0:.{1}f}".format(voo, decimals(voo))
'12345678'
You can use %-notation for this instead of a call to format but it is not very obvious or intuitive:
>>> "%.*f" % (decimals(v), v)
'0.002152'
You don't say what you want done with negative numbers. What this approach does is to take an extra
character to display the minus sign. If you don't want that then you need to reduce the number of
decimals for negative numbers.
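As a rough sketch of that last point, the function can take the number of decimals as a parameter and give up one decimal for the minus sign. The places parameter and the fixed_width wrapper are assumptions added for illustration; the truncating log10 logic is the same as above.

```python
import math

def decimals(v, places=6):
    """Number of decimals to use so the output stays places + 2 characters wide."""
    # A negative number spends one character on the minus sign.
    limit = places - (1 if v < 0 else 0)
    if v == 0:
        return limit
    digits = int(math.log10(abs(v)))  # int() truncates toward zero
    return max(0, min(limit, limit - digits))

def fixed_width(v, places=6):
    """Format v in fixed-point notation using decimals() for the precision."""
    return "{0:.{1}f}".format(v, decimals(v, places))

if __name__ == "__main__":
    for value in (0.00215165, 1.23260, 145.5655, -145.5, -0.002):
        print(fixed_width(value))
```

Values too large to fit still overflow the width, exactly as in the "all out of space" case above.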
Context
We display percentage values to agents in our app without trailing zeros (50% is much easier to quickly scan than is 50.000%), and hitherto we've just used quantize to sort of brute force normalize the value to remove trailing zeros.
This morning I decided to look into using Decimal.normalize instead, but ran into this:
Given the decimal value:
>>> value = Decimal('50.000')
Normalizing that value:
>>> value = value.normalize()
Results in:
>>> value
Decimal('5E+1')
I understand the value is the same:
>>> Decimal('5E+1') == Decimal('50')
True
But from a non-technical user's perspective, 5E+1 is basically meaningless.
Question
Is there a way to convert Decimal('5E+1') to Decimal('50')?
Note
I'm not looking to do anything that would change the value of the Decimal (e.g., removing decimal places altogether), since the value could be e.g., Decimal('33.333'). IOW, don't confuse my 50.000 example as meaning that we're only dealing with whole numbers.
For the purposes of output formatting, you can print your normalized Decimal objects with the f format specifier. (While the format string docs say this defaults to a precision of 6, this does not appear to be the case for Decimal objects.)
>>> print('{:f}%'.format(decimal.Decimal('50.000').normalize()))
50%
>>> print('{:f}%'.format(decimal.Decimal('50.003').normalize()))
50.003%
>>> print('{:f}%'.format(decimal.Decimal('1.23456789').normalize()))
1.23456789%
If for some reason, you really want to make a new Decimal object with different precision, you can do that by just calling Decimal on the f format output, but it sounds like you're dealing with an output format problem, not something you should change the internal representation for.
>>> Decimal('{:f}'.format(Decimal('5E+1')))
Decimal('50')
>>>
>>> Decimal('{:f}'.format(Decimal('50.000').normalize()))
Decimal('50')
>>> Decimal('{:f}'.format(Decimal('50.003').normalize()))
Decimal('50.003')
>>> Decimal('{:f}'.format(Decimal('1.23456789').normalize()))
Decimal('1.23456789')
According to the Python 3.9 docs, this is how to do it: https://docs.python.org/3.9/library/decimal.html#decimal-faq
from decimal import Decimal

def remove_exponent(d):
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
Add Decimal(0) to your result.
Decimal('50.000').normalize()
# Decimal('5E+1')
Decimal('50.000').normalize() + Decimal(0)
# Decimal('50')
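To tie the suggestions in this thread together, here is a small sketch that applies the remove_exponent recipe from the decimal FAQ and then formats with 'f' for display. The percent_label helper is a hypothetical name for the display step described in the question.

```python
from decimal import Decimal

def remove_exponent(d):
    """Recipe from the decimal FAQ: drop trailing zeros without leaving an exponent."""
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()

def percent_label(d):
    """Render a Decimal as a percentage string with no trailing zeros."""
    return '{:f}%'.format(remove_exponent(d))

if __name__ == "__main__":
    for raw in ('50.000', '33.3330', '5E+1'):
        print(percent_label(Decimal(raw)))
```

Formatting with 'f' alone is enough when the only goal is output; remove_exponent matters when the Decimal itself should not carry an exponent.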
Suppose I have a float x = 3.1234. I want to print this number centered in a string, padded with spaces on the left and right of x. The string length and the precision of x are both variable. If the string length is 10 and the precision is 2, the output should be "   3.14   ". Is there a function in Python that can do this?
This is really nicely documented at https://docs.python.org/3.6/library/string.html#format-specification-mini-language
But since you clearly didn't have time to google for it:
>>> x = 3.1234
>>> length=10
>>> precision=2
>>> f"{x:^{length}.{precision}}"
'   3.1    '
I'm afraid your notion of precision doesn't agree with Python's in the default case. You can fix it by specifying fixed point formatting instead of the default general formatting:
>>> f"{x:^{length}.{precision}f}"
'   3.12   '
This notation is more perspicuous than calling the method str.format(). But in Python 3.5 and earlier you need to do this instead:
>>> "{x:^{length}.{precision}f}".format(x=x, length=length, precision=precision)
But no amount of fiddling with the format is going to make 3.1234 come out as 3.14. I suspect that that was an error in the question, but if you really meant it, then there is no alternative but adjust the value of x before formatting it. Here is one way to do that:
>>> from decimal import *
>>> (Decimal(x) / Decimal ('0.02')).quantize(Decimal('1'), rounding=ROUND_UP) * Decimal('0.02')
Decimal('3.14')
This divides your number into a whole number of chunks of size 0.02, rounding up where necessary, then multiplies by 0.02 again to get the value you want.
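If this formatting is needed in several places, the f-string can be wrapped in a small function. This is only a sketch; centered is a hypothetical name, and it requires Python 3.6 or later because of the f-string.

```python
def centered(x, length, precision):
    """Format x with `precision` decimals, centered in at least `length` columns."""
    # '^' centers, the nested fields supply the width and precision,
    # and 'f' forces fixed-point notation.
    return f"{x:^{length}.{precision}f}"

if __name__ == "__main__":
    print(repr(centered(3.1234, 10, 2)))
```

When the padding cannot be split evenly, the extra space goes on the right, and a length smaller than the formatted number is simply ignored.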
I need to record SerialNumber(s) on an object. We enter many objects. Most serial numbers are strings - the numbers aren't used numerically, just as unique identifiers - but they are often sequential. Further, leading zeros are important due to unique id status of serial number.
When doing data entry, it's nice to just enter the first "sequential" serial number (eg 000123) and then the number of items (eg 5) to get the desired output. That way we can enter data in bulk, as shown below:
Obj1.serial = 000123
Obj2.serial = 000124
Obj3.serial = 000125
Obj4.serial = 000126
Obj5.serial = 000127
The problem is that when you take the first number-as-string, turn it into an integer and increment it, you lose the leading zeros.
Not all serials are sequential - not all are even numbers (eg FDM-434\RRTASDVI908)
But those that are, I would like to automate entry.
In python, what is the most elegant way to check for leading zeros (*and, I guess, edge cases like 0009999) in a string before iterating, and then re-application of those zeros after increment?
I have a solution to this problem but it isn't elegant. In fact, it's the most boring and blunt alg possible.
Is there an elegant solution to this problem?
EDIT
To clarify the question, I want the serial to have the same number of digits after the increment.
So, in most cases, this will mean reapplying the same number of leading zeros. BUT in some edge cases the number of leading zeros will be decremented. eg: 009 -> 010; 0099 -> 0100
Try str.zfill():
>>> s = "000123"
>>> i = int(s)
>>> i
123
>>> n = 6
>>> str(i).zfill(n)
'000123'
I develop my comment here, Obj1.serial being a string:
Obj1.serial = "000123"
('%0'+str(len(Obj1.serial))+'d') % (1+int(Obj1.serial))
It's like @owen-s's answer, '%06d' % n: print the number and pad with leading zeros.
Regarding '%d' % n, it's just one way of printing. From PEP3101:
In Python 3.0, the % operator is supplemented by a more powerful
string formatting method, format(). Support for the str.format()
method has been backported to Python 2.6.
So you may want to use format instead… Anyway, you have an integer at the right of the % sign, and it will replace the %d inside the left string.
'%06d' means print a minimum of 6 (6) digits (d) long, fill with 0 (0) if necessary.
As Obj1.serial is a string, you have to convert it to an integer before the increment: 1+int(Obj1.serial). And because the right side takes an integer, we can leave it like that.
Now, for the left part, as we can't hard code 6, we have to take the length of Obj1.serial. But this is an integer, so we have to convert it back to a string, and concatenate to the rest of the expression %0 6 d : '%0'+str(len(Obj1.serial))+'d'. Thus
('%0'+str(len(Obj1.serial))+'d') % (1+int(Obj1.serial))
Now, with format (format-specification):
'{0:06}'.format(n)
is replaced in the same way by
('{0:0'+str(len(Obj1.serial))+'}').format(1+int(Obj1.serial))
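As an aside, both formatting styles can take the width as an argument, which avoids building the format string by concatenation. A short sketch, with serial standing in for Obj1.serial:

```python
serial = "000123"          # stand-in for Obj1.serial
width = len(serial)
n = int(serial) + 1

# str.format: a nested replacement field supplies the width.
new_style = "{0:0{1}d}".format(n, width)

# %-formatting: '*' reads the width from the argument tuple.
old_style = "%0*d" % (width, n)

print(new_style, old_style)
```

Both expressions zero-pad to the length of the original string.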
You could check the length of the string ahead of time, then use rjust to pad to the same length afterwards:
>>> s = "000123"
>>> len_s = len(s)
>>> i = int(s)
>>> i
123
>>> str(i).rjust(len_s, "0")
'000123'
You can check a serial number for all digits using:
if serial.isdigit():
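Putting the pieces from this thread together, a bulk-entry helper might look like the sketch below. The function name serial_run and its error handling are assumptions for illustration; it only handles purely numeric serials and leaves everything else to manual entry.

```python
def serial_run(first, count):
    """Return `count` consecutive serials starting at `first`, keeping its width."""
    if not first.isdigit():
        raise ValueError("not a purely numeric serial: %r" % first)
    width = len(first)   # remember the width before the zeros are lost
    start = int(first)
    # zfill pads back to the original width; if the number outgrows that
    # width (e.g. 999 -> 1000) the result is simply one digit longer.
    return [str(start + k).zfill(width) for k in range(count)]

if __name__ == "__main__":
    print(serial_run("000123", 5))
```

The edge cases from the question fall out naturally: 009 is followed by 010, and 0099 by 0100.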
Along the lines of my previous question, How do I convert unicode characters to floats in Python? , I would like to find a more elegant solution to calculating the value of a string that contains unicode numeric values.
For example, take the strings "1⅕" and "1 ⅕". I would like these to resolve to 1.2
I know that I can iterate through the string by character, check for unicodedata.category(x) == "No" on each character, and convert the unicode characters by unicodedata.numeric(x). I would then have to split the string and sum the values. However, this seems rather hacky and unstable. Is there a more elegant solution for this in Python?
I think this is what you want...
import unicodedata
def eval_unicode(s):
    # sum all the unicode fractions
    u = sum(map(unicodedata.numeric, filter(lambda x: unicodedata.category(x)=="No",s)))
    # eval the regular digits (with optional dot) as a float, or default to 0
    n = float("".join(filter(lambda x:x.isdigit() or x==".", s)) or 0)
    return n+u
or the "comprehensive" solution, for those who prefer that style:
import unicodedata
def eval_unicode(s):
    # sum all the unicode fractions
    u = sum(unicodedata.numeric(i) for i in s if unicodedata.category(i)=="No")
    # eval the regular digits (with optional dot) as a float, or default to 0
    n = float("".join(i for i in s if i.isdigit() or i==".") or 0)
    return n+u
But beware, there are many Unicode values that seem not to have a numeric value assigned in Python (for example, ⅜⅝ don't work... or maybe it's just a matter of my keyboard xD).
Another note on the implementation: it's "too robust". It will work even with malformed numbers like "123½3 ½" and will evaluate it to 1234.0... but it won't work if there is more than one dot.
>>> import unicodedata
>>> b = '10 ⅕'
>>> int(b[:-1]) + unicodedata.numeric(b[-1])
10.2
def convert_dubious_strings(s):
    try:
        return int(s)
    except ValueError:  # also covers UnicodeEncodeError, its subclass
        return int(s[:-1]) + unicodedata.numeric(s[-1])
and if it might have no integer part, then another try-except sub-block needs to be added.
This might be sufficient for you, depending on the strange edge cases you want to deal with:
import unicodedata

val = 0
for c in my_unicode_string:
    if unicodedata.category(c) == 'No':
        cval = unicodedata.numeric(c)
    elif c.isdigit():
        cval = int(c)
    else:
        continue
    # A whole-number character is treated as the next digit of the number.
    if cval == int(cval):
        val *= 10
    val += cval
print(val)
Whole digits are assumed to be another digit in the number, fractional characters are assumed to be fractions to add to the number. Doesn't do the right thing with spaces between digits, repeated fractions, etc.
I think you'll need a regular expression, explicitly listing the characters that you want to support. Not all numerical characters are suitable for the kind of composition that you envision - for example, what should be the numerical value of
u"4\N{CIRCLED NUMBER FORTY TWO}2\N{SUPERSCRIPT SIX}"
???
Do
for i in range(65536):
    if unicodedata.category(unichr(i)) == 'No':
        print hex(i), unicodedata.name(unichr(i))
and go through the list defining which ones you really want to support.
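A minimal Python 3 sketch of the regular-expression approach, assuming the only supported inputs are an optional ASCII integer followed by an optional single vulgar fraction. The whitelist (U+00BC to U+00BE and U+2153 to U+215E) and the name parse_mixed are illustrative choices, not a recommendation of which characters to support.

```python
import re
import unicodedata

# Hypothetical whitelist: the fractions 1/4, 1/2, 3/4 (U+00BC..U+00BE) and
# the thirds, fifths, sixths and eighths (U+2153..U+215E).
FRACTION_CLASS = "[\u00bc-\u00be\u2153-\u215e]"
MIXED = re.compile(r"^\s*([0-9]+)?\s*(" + FRACTION_CLASS + r")?\s*$")

def parse_mixed(text):
    """Parse strings such as '1', a lone fraction, or an integer plus a fraction."""
    m = MIXED.match(text)
    if not m or (m.group(1) is None and m.group(2) is None):
        raise ValueError("not a mixed number: %r" % text)
    whole = int(m.group(1)) if m.group(1) else 0
    frac = unicodedata.numeric(m.group(2)) if m.group(2) else 0.0
    return whole + frac

if __name__ == "__main__":
    print(parse_mixed("1\u2155"), parse_mixed("1 \u2155"))
```

Unlike the summing approaches above, anything outside the pattern (digits after the fraction, repeated fractions, circled or superscript numbers) is rejected with a ValueError rather than silently combined.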