I have data structured as follows:
['1404407396000',
'484745869385011200',
'0',
'1922149633',
"The nurse from the university said I couldn't go if I don't get another measles immunization...",
'-117.14384195',
'32.8110777']
I want to write this data to a CSV file, but when I do, Python converts the numbers to scientific notation (e.g. 1.404E12).
I am using the following function to convert the list of lists to a csv:
def list_to_csv(data, name_of_csv_string):
    """
    This function takes the list of lists created from the twitter data and
    writes it to a csv.
    data - List of lists
    name_of_csv_string - What do you think this could be?
    """
    import csv
    with open(name_of_csv_string + ".csv", "wb") as f:
        writer = csv.writer(f)
        writer.writerows(data)
How can I avoid this?
By using the format specification mini language described here:
https://docs.python.org/2/library/string.html
Search for: 7.1.3.1. Format Specification Mini-Language
Use string formatting.
writer.writerows([["%f" % v if isinstance(v, float) else v for v in row] for row in data])
There are various formatting options you can check out here.
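If some cells really are floats rather than strings, one option is to format them before they reach the writer. Here is a minimal sketch in Python 3, assuming a hypothetical helper format_cell and sample values of my own (neither comes from the question). The ".0f" spec means fixed-point notation with no decimals, so pick another spec for values that need them:

```python
import csv
import io


def format_cell(value, spec=".0f"):
    """Render floats in fixed-point notation; pass other values through."""
    if isinstance(value, float):
        return format(value, spec)
    return value


# Illustrative row: a float timestamp next to ordinary strings.
row = [1404407396000.0, "484745869385011200", "some text"]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
csv.writer(buffer).writerow([format_cell(v) for v in row])
csv_text = buffer.getvalue()
print(csv_text)
```

Values that are already strings, as in the question's sample row, pass through untouched.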
In my case, it was Microsoft Excel that was converting the numbers to scientific notation (even in the formula bar, the numbers were in scientific notation).
Try opening the CSV file in Notepad or another plain text editor to check whether the numbers are saved as integers. In my case, Notepad showed normal integers, while Excel displayed them in scientific notation.
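The same check can be done from Python by looking at the raw text the csv module produces. This is only a sketch in Python 3: it uses an in-memory buffer in place of the real file, with a few values copied from the question:

```python
import csv
import io

# Values copied from the question; they are already strings.
row = ["1404407396000", "484745869385011200", "0", "-117.14384195"]

buffer = io.StringIO()  # in-memory stand-in for the .csv file
csv.writer(buffer).writerow(row)
raw_text = buffer.getvalue()

# The digits are written exactly as given; any scientific notation seen
# later comes from the program used to view the file.
print(repr(raw_text))
```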
I have used the fix presented in "YAML loads 5e-6 as string and not a number" to correctly load numbers written in scientific notation from a YAML file into Python.
I also have complex numbers in my yaml file. These numbers may have varying amounts of whitespace (e.g., 2+2j, or 2 + 2j, etc). The complex numbers are currently being read in as strings (in the same way that the numbers in scientific notation were read in as strings prior to the fix referenced above). I would like to know how to modify the add_implicit_resolver argument in the fix to correctly read in complex numbers. Ideally, I'd like to continue to use pyyaml.
More specifically, if I have an entry in my yaml file such as:
offset: 2 + 1j
I would like this to be recognized as the complex number (of class 'complex'):
2+1j
Currently, the value in the python dictionary corresponding to the key 'offset' is a string:
'2 + 1j'
which I have to manually convert into a complex number via:
complex('2 + 1j'.replace(' ', ''))
I'm looking to automate this process by modifying the argument to the add_implicit_resolver using the same strategy as in the link above for dealing with numbers in scientific notation.
As for a spec, yes, in general, the real and imaginary parts of the complex number may be in scientific notation: (e.g., ' 2e-3 + 1.3e-4j'). I am fine with restricting the format to have a trailing j for the imaginary part. No tabs, no linefeeds, just individual spaces. Real or imaginary part can be missing and can be expressed as integers or floats.
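To make the spec above concrete, here is a rough sketch of a pattern that accepts it, using only the standard library. The names COMPLEX_RE and parse_complex are my own, and I am assuming that a value must contain an imaginary part with a trailing j to count as complex (a bare real number is left to the normal int/float handling). In PyYAML, a compiled pattern like this is the kind of thing passed as the regexp to add_implicit_resolver, with a constructor doing the conversion; that wiring is left out here.

```python
import re

# One unsigned integer or float, optionally in scientific notation.
_NUM = r"(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"

# Assumed format: optional real part, then an imaginary part ending in j,
# with optional spaces around the signs and at both ends.
COMPLEX_RE = re.compile(
    r"^ *(?:"
    rf"[-+]? *{_NUM} *[-+] *{_NUM}j"  # real and imaginary part
    rf"|[-+]? *{_NUM}j"               # imaginary part only
    r") *$"
)


def parse_complex(text):
    """Convert text such as '2 + 1j' to a complex number, or raise ValueError."""
    if not COMPLEX_RE.match(text):
        raise ValueError("not a complex literal: %r" % text)
    # complex() does not accept inner spaces, so strip them after validating.
    return complex(text.replace(" ", ""))


print(parse_complex("2 + 1j"))
print(parse_complex(" 2e-3 + 1.3e-4j"))
```

Validating with the pattern first matters because blindly removing spaces would turn something like '2 3j' into '23j'.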
When a number is long in my Excel document, Excel formats the cell value in scientific notation (e.g. 1.23457E+11) while the true number still exists in the formula bar at the top of the document (e.g. 123456789012).
I want to convert this number to a string for my own purposes, but when I do, the scientific notation is captured rather than the true number. How can I ensure that it's the true number that is being converted to a string?
Python will ignore the formatting that Excel uses for anything other than dates and times, so you should just be able to convert the number to a string. You will, however, be limited by Excel's precision. The OOXML file format is not suitable for some tasks, notably those with historical dates or high-precision times.
I want to go through the files in my folder, identify them, and rename them according to a list of rules I have in an Excel spreadsheet.
I load the needed libraries,
I make my directory the working directory,
I read in the Excel file (using xlrd),
and when I try to read the data by columns, e.g.:
fname = metadata.col_values(0, start_rowx=1, end_rowx=None)
the list of values comes with a u in front of each of them (I guess for unicode), such as:
fname = [u'file1', u'file2'] and so on
How can I convert fname to a list of ASCII strings?
I'm not sure what the big issue behind having unicode filenames is, but assuming that all of your characters are ascii-valid characters the following should do it. This solution will just ignore anything that's non-ascii, but it's worth thinking about why you're doing this in the first place:
ascii_string = unicode_string.encode("ascii", "ignore")
Specifically, for converting a whole list I would use a list comprehension:
ascii_list = [old_string.encode("ascii", "ignore") for old_string in fname]
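For what it's worth, here is how the same idea looks in Python 3, where encode returns bytes and a decode is needed to get text back. The sample names are my own; the last one is there to show that "ignore" silently drops non-ASCII characters:

```python
# Sample names (hypothetical); the last one has a non-ASCII character.
names = ["file1", "file2", "caf\u00e9"]

# Encode with "ignore" to drop non-ASCII characters, then decode back to str.
ascii_names = [name.encode("ascii", "ignore").decode("ascii") for name in names]

print(ascii_names)
```

Since characters are dropped without warning, two different names could end up identical, which matters if they are used for renaming files.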
The u at the front is just a visual item to show you, when you print the string, what the underlying representation is. It's like the single-quotes around the strings when you print that list--they are there to show you something about the object being printed (specifically, that it's a string), but they aren't actually a part of the object.
In the case of the u, it's saying it's a unicode object. When you use the string internally, that u on the outside doesn't exist, just like the single-quotes. Try opening a file and writing the strings there, and you'll see that the u and the single-quotes don't show up, because they're not actually part of the underlying string objects.
with open(r'C:\test\foo.bar', 'w') as f:
    for item in fname:
        f.write(item)
        f.write('\n')
If you really need to print strings without the u at the start, you can convert them to ASCII with u'unicode stuff'.encode('ascii'), but honestly I doubt this is something that actually matters for what you're doing.
You could also just use Python 3, where Unicode is the default and the u isn't normally printed.
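A small Python 3 sketch of the difference between how a string is displayed and what it contains. The variable names are mine; a bytes object is used as the Python 3 analogue, since it shows a b prefix the same way Python 2 shows u:

```python
name = "file1"

shown_in_list = repr(name)  # how the item looks when its list is printed
shown_alone = str(name)     # what print(name) or f.write(name) emits

# The quotes are display only; the string itself has five characters.
print(shown_in_list, shown_alone, len(name))

# Same story for the prefix on a bytes object.
raw = b"file1"
print(repr(raw), len(raw))
```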
I have an excel spreadsheet that contains 1984, which xlrd handles as a number type, and thus gives me the value back as the float 1984.0. I want to get the original value as it appears in the spreadsheet, as a string "1984". How do I get this?
So internally in Excel, that 1984 is stored as a decimal number, so 1984.0 is correct. You could have changed the number formatting to show it as 1984.00, or whatever.
So are you asking how to query the cell formatting to tell that the number format is no decimals? If so you might look into using the formatting_info=True parameter of open_workbook
sheet = open_workbook(
    'types.xls', formatting_info=True
).sheet_by_index(0)
Have you come across the python-excel.pdf document from http://www.python-excel.org/ ?
It is a pretty good tutorial for learning to use xlrd and xlwt. Unfortunately, they say:
We've already seen that open_workbook has a parameter to load formatting information from Excel files. When this is done, all the formatting information is available, but the details of how it is presented are beyond the scope of this tutorial.
If cell.ctype == xlrd.XL_CELL_NUMBER, then Excel is storing 1984 as a float and you would need to convert it to a string in Python.
In Excel:
="1984" would be a string
'1984 would be a string; note that the ' is not displayed
1984 is a number
The only kind of number is a float. The formatting attached to the cell determines if it represents a date, a decimal, or an integer. Look up the format string, and hopefully it will let you discern how the number is to be displayed.
Use string formatting:
"%d" % mynumber
>>> "%d" % 1984.0
'1984'
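One caveat with "%d" is that it silently truncates a value that does have a fractional part. A minimal sketch of a more cautious conversion, assuming a hypothetical helper cell_to_text that receives the raw cell value (a float for numeric cells):

```python
def cell_to_text(value):
    """Drop the trailing .0 from whole-number floats; use str() otherwise."""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


print(cell_to_text(1984.0))
print(cell_to_text(19.84))
```

This does not consult the cell's number format, so it cannot tell 1984 apart from a cell deliberately displayed as 1984.00.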
I wrote a simple function to write to a text file, like this:
def write_func(var):
    var = str(var)
    myfile.write(var)

a = 5
b = 5
c = a + b
write_func(c)
This writes the output to the desired file.
Now I want the output in another format, say:
write_func("Output is :"+c)
so that the output has a meaningful label in the file. How do I do it?
And why is it that I can't write an integer to a file? Do I have to do int = str(int) before writing to a file?
You can't add/concatenate a string and integer directly.
If you do anything more complicated than "string :"+str(number), I would strongly recommend using string formatting:
write_func('Output is: %i' % (c))
Python is a strongly typed language. This means, among other things, that you cannot concatenate a string and an integer. Therefore you'll have to convert the integer to string before concatenating. This can be done using a format string (as Nick T suggested) or passing the integer to the built in str function (as NullUserException suggested).
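A short sketch of the failure and the usual ways around it, written for Python 3; the variable names are only for illustration:

```python
c = 5 + 5

try:
    message = "Output is: " + c  # str + int is not allowed
except TypeError as error:
    failure = str(error)

# Three equivalent ways to build the string.
with_str = "Output is: " + str(c)
with_percent = "Output is: %i" % c
with_format = "Output is: {}".format(c)

print(with_str, with_percent, with_format)
```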
Simple, you do:
write_func('Output is' + str(c))
You have to convert c to a string before you can concatenate it with another string. Then you can also remove this line from your function:
var = str(var)
And why is it that I can't write an integer to a file? Do I have to do int = str(int) before writing to a file?
You can write binary data to a file, but byte representations of numbers aren't really human readable. -2 for example is 0xfffffffe in a 2's complement 32-bit integer. It's even worse when the number is a float: 2.1 is 0x40066666.
If you plan on having a human-readable file, you need to write human-readable characters to it. In an ASCII file, '0.5' isn't a number (at least not as a computer understands numbers), but instead the characters '0', '.' and '5'. And that's why you need to convert your numbers to strings.
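The byte patterns mentioned above can be reproduced with the standard struct module. This is just an illustration in Python 3, using big-endian 32-bit layouts so the hex digits read in the same order as in the text:

```python
import struct

int_bytes = struct.pack(">i", -2)      # 32-bit two's complement integer
float_bytes = struct.pack(">f", 2.1)   # 32-bit IEEE 754 float
text_bytes = str(2.1).encode("ascii")  # the characters '2', '.', '1'

print(int_bytes.hex())    # binary form of -2
print(float_bytes.hex())  # binary form of 2.1
print(text_bytes)         # human-readable form of 2.1
```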
From http://docs.python.org/library/stdtypes.html#file.write
file.write(str)
Write a string to the file. There is no return value. Due to buffering, the string may not actually show up in the file until the flush() or close() method is called.
Note how the documentation specifies that write's argument must be a string.
So you should create a string yourself before passing it to file.write().
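A quick way to see this in Python 3, using an in-memory text buffer as a stand-in for an open file (the names here are mine):

```python
import io

out = io.StringIO()  # in-memory stand-in for an open text file

try:
    out.write(10)    # not a string, so this raises TypeError
    rejected = False
except TypeError:
    rejected = True

out.write(str(10))   # converting first works
out.write("\n")

print(rejected, repr(out.getvalue()))
```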