How do I convert my CSV file to the right format? - python

When I read and print the CSV file I downloaded, I get the following results.
As you can see, the result is printed in a weird format.
If I want to print a specific column, here is the error message I get.
I believe the format of the winedata.csv file is wrong, because my code works for other CSV files. How do I convert my csv file to the right format?

Your winedata.csv file is separated by semi-colons, rather than commas. Therefore, you need to provide the sep option to pd.read_csv as follows:
wine_data = pd.read_csv("winedata.csv", sep=";")
You will then be able to access your pH column as:
wine_data["pH"]

Related

Trouble reading CSV file using pandas

I'm working on a data analysis project and I wanted to read data from CSV files using pandas. I read the first CSV file and it was fine, but the second one gave me a UTF-8 encoding error. I exported the file to CSV and encoded it as UTF-8 in the Numbers spreadsheet app. However, the data frame is not in the expected format. Any idea why?
the original CSV file in Numbers
It looks like your file is semicolon-separated, not comma-separated.
To fix this, add the sep=';' parameter to the pd.read_csv call.
pd.read_csv("mitb.csv", sep=';')
Try adding the correct delimiter, in this case ";", to read the csv.
mitb = pd.read_csv('mitb.csv', sep=";")
The file is semicolon-separated, and the decimal mark is a comma rather than a dot:
df = pd.read_csv('mitb.csv', sep=';', decimal=',')
And please do not upload images of code/data/errors.
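For completeness, a short sketch combining both hints (the file name mitb.csv comes from the answers above):
import pandas as pd

# Semicolon field separator and comma decimal mark, common in European CSV exports.
mitb = pd.read_csv("mitb.csv", sep=";", decimal=",")

# Confirm that numeric columns were parsed as numbers rather than strings.
print(mitb.dtypes)
print(mitb.head())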

Reading a .dat file in Python corrupts the file

I have a .dat file of 3D coordinates that I am trying to read like this:
path = '/path/to/dat-file.dat'
data_content = [i.strip().split() for i in open(path, encoding='ISO-8859-1').readlines()]
print(data_content)
This is my output:
...\xad\x05jR|APcAoSNvA07\x9...
It's basically a long cryptic line with accented letters as well as letters from the Cyrillic alphabet.
Is the way I'm opening this file corrupting it? Where am I going wrong?
.dat files contain binary data, not text data, but print() only works with text data. The file is not being corrupted, you are just printing the data in the wrong format. If you want to print the data inside the .dat file and get something meaningful, then you will need to use a library that understands the .dat file format.
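As a rough illustration, you can open the file in binary mode and inspect the raw bytes instead of decoding them as text; if the coordinates happen to be stored as packed floats, something like numpy.fromfile can recover them, but the dtype and layout below are assumptions, not facts about this particular file:
import numpy as np

path = '/path/to/dat-file.dat'

# Read the raw bytes without any text decoding.
with open(path, 'rb') as f:
    raw = f.read()
print(len(raw), raw[:32])  # size and first few bytes, for inspection

# Only if the file is a flat dump of 64-bit floats would this recover
# the 3D coordinates; the dtype and shape here are assumptions.
coords = np.fromfile(path, dtype=np.float64)
if coords.size % 3 == 0:
    coords = coords.reshape(-1, 3)
print(coords[:5])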

When I save as CSV, the file contains two separators which are comma and pipe

I have two tables, named result and result1. The only difference between them is the 'TITLE' column: in one it is a plain str, and in the other it is a str wrapped in double quotes.
table: result and result1
My computer's Excel default separator is ','. The pandas to_csv default is also ','. Now, when I check result and result1 in the CSV files using the filter, I want my whole file to look like this:
result.csv sample
But some lines look like this:
result.csv bad lines
Looking closer, I found that some lines are separated by a pipe:
result.csv bad lines closer look
My questions are:
1. Why, when I save as CSV, does the file contain two separators, comma and pipe?
2. How can I solve this problem? I checked my data in Python and it is all good.
Can anybody help? I would really appreciate it!
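The write code is not shown in the question, so the following is only a minimal reproduction sketch; the DataFrame contents and column names are assumptions:
import pandas as pd

# Hypothetical data: one TITLE is a plain string, the other is wrapped in double quotes.
result = pd.DataFrame({"TITLE": ["Some title"], "VALUE": [1]})
result1 = pd.DataFrame({"TITLE": ['"Some title"'], "VALUE": [1]})

# to_csv uses "," as the separator by default; no pipe is written here.
result.to_csv("result.csv", index=False)
result1.to_csv("result1.csv", index=False)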

strings converted to floats in a csv file

When I write a certain fraction to a CSV file it gets automatically calculated, whereas my requirement is to keep it as it is.
This is my try:
import csv
ft = "-1/-1.5"  # or -1/-1.5 (removing the quotes)
print(ft)
with open("outputfile.csv", "w", newline="") as infile:
    writer = csv.writer(infile)
    writer.writerow([ft])
Console prints it when in quotes:
-1/-1.5
However, when I write the same thing to a CSV file, it becomes the following, whether I use quotes or not.
0.666666667
How can I write it to a CSV file as -1/-1.5?
See the image below (this is what I'm getting right now):
If I use a ' in a cell and then write the value, the output serves the purpose. Can I not do it programmatically?
As the OP mentioned in the comments, using:
writer.writerow([f"'{ft}"])
will add a leading apostrophe to the CSV output so that MS Excel displays the value of ft as a string, retaining its original format.
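A complete sketch of that approach (reusing the file name from the question):
import csv

ft = "-1/-1.5"

with open("outputfile.csv", "w", newline="") as outfile:
    writer = csv.writer(outfile)
    # The leading apostrophe makes Excel treat the cell as text,
    # so the fraction is displayed as-is instead of being evaluated.
    writer.writerow([f"'{ft}"])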

Keeping format of text (.txt) files when reading and rewriting

I have a .txt file containing formatting elements such as \n for line breaks, which I want to read and then rewrite, up to a specific line, to a new .txt file. My code looks like this:
with open(filename) as f:
    content = f.readlines()
with open("lf.txt", "w") as file1:
    file1.write(str(content))
    file1.close
The output file lf.txt is produced, but it throws away the formatting of the input file. Is there a way to keep the formatting of the input file when rewriting it to a new file?
You converted content to a string, while it's really a list of strings (lines).
Use join to convert the lines back to a string:
file1.write(''.join(content))
join is a string method; in the example it is called on an empty string. The string that calls join is used as the separator between the joined pieces. Here no separator is needed, so the lines are simply concatenated, preserving their original \n line breaks.
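Since the question also mentions copying only up to a specific line, a small sketch of that (the stop line number and input file name are assumptions) could be:
filename = "input.txt"  # assumption: path to the source file
stop_line = 10          # assumption: copy only the first 10 lines

with open(filename) as f:
    content = f.readlines()  # each line keeps its trailing "\n"

with open("lf.txt", "w") as file1:
    # Joining with an empty separator preserves the original line breaks.
    file1.write(''.join(content[:stop_line]))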
