loading complex numbers from yaml file into python

I have used the fix presented in "YAML loads 5e-6 as string and not a number" to correctly load numbers written in scientific notation from a YAML file into Python.
I also have complex numbers in my yaml file. These numbers may have varying amounts of whitespace (e.g., 2+2j, or 2 + 2j, etc). The complex numbers are currently being read in as strings (in the same way that the numbers in scientific notation were read in as strings prior to the fix referenced above). I would like to know how to modify the add_implicit_resolver argument in the fix to correctly read in complex numbers. Ideally, I'd like to continue to use pyyaml.
More specifically, if I have an entry in my yaml file such as:
offset: 2 + 1j
I would like this to be recognized as the complex number (of class 'complex'):
2+1j
Currently, the value in the python dictionary corresponding to the key 'offset' is a string:
'2 + 1j'
which I have to manually convert into a complex number via:
complex('2 + 1j'.replace(' ', ''))
I'm looking to automate this process by modifying the argument to the add_implicit_resolver using the same strategy as in the link above for dealing with numbers in scientific notation.
As for a spec, yes, in general, the real and imaginary parts of the complex number may be in scientific notation: (e.g., ' 2e-3 + 1.3e-4j'). I am fine with restricting the format to have a trailing j for the imaginary part. No tabs, no linefeeds, just individual spaces. Real or imaginary part can be missing and can be expressed as integers or floats.
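One way to approach this with PyYAML (a sketch only; the !complex tag name, the regex, and the constructor below are my own choices built to the spec above, not anything PyYAML provides out of the box):

import re
import yaml

# Regex for "real +/- imagj" with an optional real part, optional scientific
# notation, and single spaces allowed around the sign, per the spec above.
complex_pattern = re.compile(
    r'''^[ ]*
        (?:[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?[ ]*[-+][ ]*)?   # optional real part and sign
        [-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?[jJ]                # imaginary part ending in j
        [ ]*$''',
    re.VERBOSE,
)

def complex_constructor(loader, node):
    # complex() rejects embedded spaces, so strip them first.
    return complex(loader.construct_scalar(node).replace(' ', ''))

yaml.SafeLoader.add_implicit_resolver('!complex', complex_pattern, None)
yaml.SafeLoader.add_constructor('!complex', complex_constructor)

data = yaml.safe_load('offset: 2 + 1j')
print(data['offset'], type(data['offset']))   # (2+1j) <class 'complex'>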

Related

Removing scientific-notation from number in openpyxl

When a number is long in my excel document, Excel formats the cell value to scientific notation (ex 1.234567e+5) while the true number still exists in the formula bar at the top of the document (ex 123456789012).
I want to convert this number to a string for my own purposes, but when I do, the scientific notation is captured, rather than the true number. How can I assure that it's the true number that is being converted to a string?
Python will ignore the formatting that Excel uses for anything other than dates and times, so you should just be able to convert the number to a string. You will, however, be limited by Excel's precision. The OOXML file format is not suitable for some tasks, notably those with historical dates or high-precision times.
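For example (a sketch; the filename and cell reference are just placeholders):

from openpyxl import load_workbook

wb = load_workbook("book.xlsx")   # placeholder filename
ws = wb.active
value = ws["A1"].value            # openpyxl returns the stored number, not Excel's displayed text
print(str(value))                 # e.g. '123456789012'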

Context Free Grammar Differentiate Integer and Floating Point Constants

I am writing an LR(1) parser, and I've been basing my test grammar off of the C language. I've looked at the grammar for both C and Python:
https://www.lysator.liu.se/c/ANSI-C-grammar-y.html
https://docs.python.org/3/reference/grammar.html
C seems to use the symbol CONSTANT for integer and floating point constants, and Python uses NUMBER.
What I'm wondering is why are these not separated into individual symbols such as INT and FLOAT so that they can later be put into separate nodes in the Abstract Syntax Tree?
Since we already know what type of number it is after the lexer has parsed it, why merge them into a generic 'NUMBER' and later try to figure out which one it is again?
Being able to handle some special cases earlier does not simplify things, since you still need the same code in a different place later. For example, consider the code y + z. Python doesn't know what that is, other than that at run time it will invoke y.__add__(z). The code to generate that isn't going away. That same code can take 3 + z and just as easily generate (3).__add__(z). So it doesn't really simplify anything to distinguish between y + z and 3 + z during parsing. (The same logic holds if y is a float literal instead of an identifier.)
Now consider something like 3.0 + 5. Separate code exists to replace this with 8.0 instead of (3.0).__add__(5) prior to byte-code compilation, because 1) it's simple to do and 2) it is demonstrably better than invoking a function at run time. However, this still isn't done by the parser. It is done by an optimizer that runs over the tree looking for things like NUMBER + NUMBER. Once that is found, the optimizer can determine whether the NUMBERs are ints or floats and produce the appropriate sum to include in the code. This is simpler than having to handle four different bits of parse tree: INT + FLOAT, FLOAT + INT, FLOAT + FLOAT, and INT + INT.
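As a quick illustration of that last point in CPython (exact bytecode differs between versions, but the folded constant is visible either way):

import dis

# The optimizer folds the constant expression before execution: the
# disassembly loads the single constant 8.0 instead of adding 3.0 and 5.
dis.dis(compile("3.0 + 5", "<example>", "eval"))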

Creating a binary fraction in python

I am trying to create a binary number that contains a fraction. Something like this:
0b110.101
However, this gives a syntax error.
0b110
works fine though. How do you create a binary number that is not an integer?
"How do you create a binary number that is not an integer?"
Binary is a text representation of a number. Strings are used to store text, so one would use something like the following:
"110.101"
But I think you misstated your question. You don't want the binary representation of a number, you want the number itself. 110.101 in base 2 represents the number six and five eighths. There are infinitely many ways to create that number, including the following:
6 + 5/8
6.625
That said, I suspect you'd prefer to see the binary representation of the number in the source. Unfortunately, Python does not have binary literals with a fractional part. You could use either of the following:
0b110101 / (1 << 3)
bin_to_num("110.101")
Writing bin_to_num is left as an exercise to the reader.
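For completeness, here is one possible shape for that helper (a sketch only, not part of any library):

def bin_to_num(s):
    # Split "110.101" into its integer and fractional parts.
    int_part, _, frac_part = s.partition(".")
    value = int(int_part, 2) if int_part else 0
    for i, bit in enumerate(frac_part, start=1):
        value += int(bit) / 2 ** i   # each fractional bit is worth 1/2**position
    return value

print(bin_to_num("110.101"))   # 6.625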
You can use the binary-fractions package.
This lets you convert binary-fraction strings into numbers and vice versa.
Example:
>>> from binary_fractions import Binary
>>> str(Binary(6.625))
'0b110.101'
>>> float(Binary("0b110.101"))
6.625
It has many more helper functions to manipulate binary strings such as: shift, add, fill, to_exponential, invert...
PS: Shameless plug, I'm the author of this package.

Python 3: Alignment and commas

For my Concepts of Programming course in college, we need to use functions in Python 3 to create a data table that involves numbers in the thousands. We need to include commas in the numbers. I would like to use spacing in the format. Is there a way to keep spacing to keep the table aligned and still be able to use commas in the variable?
Like this?
n = 1303344095
"{:15,d}".format(n)
Yields:
'  1,303,344,095'
So you can provide a field width specification, and then the comma specification, then the type specification, within the overall format spec. It also works with floating point numbers, albeit with an uglier syntax:
f = 1299.21
"{:10,.2f}".format(f)
Yields:
'  1,299.21'
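The same format specifications work inside f-strings on Python 3.6+:

n = 1303344095
f = 1299.21
print(f"{n:15,d}")     # prints   1,303,344,095  (padded to 15 characters)
print(f"{f:10,.2f}")   # prints     1,299.21     (padded to 10 characters)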

Python default behavior of str(x)

I am depending on some code that uses the Decimal class because it needs precision to a certain number of decimal places. Some of the functions allow inputs to be floats because of the way that it interfaces with other parts of the codebase. To convert them to decimal objects, it uses things like
mydec = decimal.Decimal(str(x))
where x is the float taken as input. My question is, does anyone know what the standard is for the 'str' method as applied to floats?
For example, take the number 2.1234512. It is stored internally as 2.12345119999999999 because of how floats are represented.
>>> x = 2.12345119999999999
>>> x
2.1234511999999999
>>> str(x)
'2.1234512'
Ok, str(x) in this case is doing something like '%.6f' % x. This is a problem with the way my code converts to decimals. Take the following:
>>> d = decimal.Decimal('2.12345119999999999')
>>> ds = decimal.Decimal(str(2.12345119999999999))
>>> d - ds
Decimal('-1E-17')
So if I have the float, 2.12345119999999999, and I want to pass it to Decimal, converting it to a string using str() gets me the wrong answer. I need to know what are the rules for str(x) that determine what the formatting will be, because I need to determine whether this code needs to be re-written to avoid this error (note that it might be OK, because, for example, the code might round to the 10th decimal place once we have a decimal object)
There must be some set of rules in python's docs that hopefully someone here can point me to. Thanks!
In the Python source, look in "Include/floatobject.h". The precision for the string conversion is set a few lines from the top, after a comment with some explanation of the choice:
/* The str() precision PyFloat_STR_PRECISION is chosen so that in most cases,
the rounding noise created by various operations is suppressed, while
giving plenty of precision for practical use. */
#define PyFloat_STR_PRECISION 12
You have the option of rebuilding, if you need something different. Any changes will change formatting of floats and complex numbers. See ./Objects/complexobject.c and ./Objects/floatobject.c. Also, you can compare the difference between how repr and str convert doubles in these two files.
There are a couple of issues worth discussing here, but the summary is: you cannot extract information that is not already stored on your system.
If you've taken a decimal number and stored it as a floating point, you'll have lost information, since most decimal (base 10) numbers with a finite number of digits cannot be stored using a finite number of digits in base 2 (binary).
As was mentioned, str(a_float) will really call a_float.__str__(). As the documentation states, the purpose of that method is to
return a string containing a nicely printable representation of an object
There's no particular definition for the float case. My opinion is that, for your purposes, you should consider __str__'s behavior to be undefined, since there is no official documentation for it; the current implementation can change at any time.
If you don't have the original strings, there's no way to extract the missing digits of the decimal representation from the float objects. All you can do is round predictably, using string formatting (which you mention):
Decimal( "{0:.5f}".format(a_float) )
You can also remove 0s on the right with resulting_string.rstrip("0").
Again, this method does not recover the information that has been lost.
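A quick illustration of rounding a float into a Decimal predictably (note that on Python 3, str() of a float returns the same shortest round-trip string as repr(), so the fixed PyFloat_STR_PRECISION quoted above no longer applies):

from decimal import Decimal

x = 2.1234512
print(Decimal(x))                         # the exact stored binary value, with many digits
print(Decimal("{0:.5f}".format(x)))       # Decimal('2.12345'), rounded predictably
print("{0:.10f}".format(x).rstrip("0"))   # '2.1234512', trailing zeros removed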
