I have code that updates CSVs from a server. It gets the data using:
a = urllib.urlopen(url)
data = a.read().strip()
Then I append the data to the CSV with:
f = open(filename+".csv", "ab")
f.write(ndata)
f.close()
The problem is that, at random, a line in the CSV gets written like this (or a line break appears somewhere along the CSV):
2,,,,,
015-04-21 13:00:00,18,998,50,31,2293
instead of its usual form:
2015-04-21 13:00:00,6,1007,29,25,2394
2015-04-21 13:00:00,7,1004,47,26,2522
I tried printing my data in the shell after the program ran, and the broken CSV entry actually appeared normal there.
Hope you guys can help me out. Thanks.
Running Python 2.7.9 on Windows 8.1.
What actions are performed on your "ndata" variable?
You should use the csv module to manage CSV files: https://docs.python.org/2/library/csv.html
Edit after comment:
If you do not want to use the csv module I linked to, then instead of
a = urllib.urlopen(url)
data = a.read().strip()
ndata = data.split('\n')
f.write('\n'.join(ndata[1:]))
you should do this:
a = urllib.urlopen(url)
f.writelines(a.readlines()[1:])
I don't see any reason explaining your random unwanted "\n" if you are sure that your incoming data is correct. Do you handle very long lines?
I recommend you use the csv module to read your input: you'll be sure to have valid CSV content if your input is correct.
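For example, a minimal sketch of the whole fetch-and-append step using the csv module (the url and filename values here are placeholders, not taken from your code):

import csv
import urllib

url = "http://example.com/data.csv"  # placeholder
filename = "output"                  # placeholder

response = urllib.urlopen(url)
reader = csv.reader(response)
next(reader)  # skip the header row, as your ndata[1:] did

with open(filename + ".csv", "ab") as f:
    writer = csv.writer(f)
    for row in reader:
        writer.writerow(row)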
I'm looking to edit a Minecraft Windows 10 level.dat file in Python. I've tried using the packages nbt and pyanvil, but I get the error OSError: Not a gzipped file. If I print open("level.dat", "rb").read() I get a lot of nonsensical data. It seems like it needs to be decoded somehow, but I don't know what decoding it needs. How can I open (and ideally edit) one of these files?
To read the data, just do:
from nbt import nbt
nbtfile = nbt.NBTFile("level.dat", 'rb')
print(nbtfile) # Here you should get a TAG_Compound('Data')
print(nbtfile["Data"].tag_info()) # Data came from the line above
for tag in nbtfile["Data"].tags: # This loop will show us each entry
    print(tag.tag_info())
As for editing:
# Writing data (changing the difficulty value)
nbtfile["Data"]["Difficulty"].value = 2
print(nbtfile["Data"]["Difficulty"].tag_info())
nbtfile.write_file("level.dat")
EDIT:
It looks like Mojang doesn't use the same format for Java and Bedrock: Bedrock's level.dat file is stored in little-endian format and uses uncompressed UTF-8.
As an alternative, Amulet-NBT is a Python library written in Cython for reading and editing NBT files (supposedly it works with Bedrock too).
Nbtlib also seems to work, as long as you set byteorder="little" when loading the file.
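For reference, a minimal sketch of the nbtlib route (untested here; it assumes nbtlib is installed and that your level.dat has no extra header before the NBT payload):

import nbtlib

# Bedrock NBT is little-endian, so override the default byte order.
nbt_file = nbtlib.load('level.dat', byteorder='little')
print(nbt_file)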
Let me know if you need more help.
You'll have to give the path either relative to the current working directory:
path/to/file.dat
Or you can use the absolute path to the file:
C:/user/dir/path/to/file.dat
Read the data, replace the values, and then write it back:
# Read in the file
with open('file.dat', 'r') as file:
    filedata = file.read()

# Replace the target string (both arguments here are placeholders)
filedata = filedata.replace('old value', 'your replacement or edit')

# Write the file out again
with open('file.dat', 'w') as file:
    file.write(filedata)
(I apologize for my limited English, which may make the question unclear.)
I'm using Python to write data sent from an Arduino into a CSV file. I want around 200 values per group, one group per row, with each value in its own column. The data from my Arduino comes as a number followed by a comma (for example: 123,144,135,...), but in the CSV file each number gets split into separate columns (1, 2, 3 in different columns instead of 123 in one column), and when I open the file in Notepad the data looks like "1,2,3","1,4,4",...
I tried different delimiters like \t and space; \t looks fine when I view the file in Excel, but it still doesn't work in Notepad (a tab between every two numbers).
I also tried deleting the "," in the Arduino code, but that doesn't help either.
In the writerows() call, I tried data, str(data), and str(data)+",", with not much difference.
I even changed my laptop's list-separator setting from "," to "\t", but that doesn't help.
The Arduino part:
Serial.print(value);
Serial.print(",");
The Python part:
while True:
    try:
        ser_bytes = ser.readline()
        decoded_bytes = ser_bytes.decode('utf-8')
        print(decoded_bytes)
        #decoded_bytes = decoded_bytes.strip('|')
        with open("test_data.csv","a",newline="") as f:
            writer = csv.writer(f,delimiter=",")
            writer.writerows([str(decoded_bytes),])
I've searched a lot about the CSV format, but I still can't see why the code doesn't work.
Thank you for the help.
You're right, I think I didn't totally get what your question was, but here are some ideas. To get correct CSV output, you have to change your code to something like this:
while True:
    try:
        ser_bytes = ser.readline()
        # split the line you got from your arduino into a list of values
        decoded_bytes = ser_bytes.decode('utf-8').strip().split(",")
        print(decoded_bytes)
        with open("test_data.csv", "a", newline="") as f:
            writer = csv.writer(f, delimiter=",")
            writer.writerow(decoded_bytes)
    except KeyboardInterrupt:
        break
This way you should get correct CSV output, and every line you get from the Arduino is written to a line in the file.
Some additional thoughts: since you're already getting a CSV-style line from your Arduino, you could write it to the file directly, without splitting and using the csv writer. The csv.writer route is actually a little overkill, but it probably doesn't matter that much ;)
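A minimal sketch of that direct approach (the serial port name and baud rate are assumptions):

import serial

ser = serial.Serial('COM3', 9600)  # port and baud rate are assumptions

with open("test_data.csv", "a") as f:
    while True:
        line = ser.readline().decode('utf-8').strip()
        if line:
            # the Arduino already sends comma-separated values,
            # so each line can be appended as-is
            f.write(line + "\n")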
I'm fairly new to Python and have a question as to why the following code doesn't produce any output in the CSV file. The code is as follows:
import csv
import urllib2
url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
response = urllib2.urlopen(url)
cr = csv.reader(response)
for row in cr:
    with open("AusCentralbank.csv", "wb") as f:
        writer = csv.writer(f)
        writer.writerows(row)
Cheers.
Edit:
Brien and Albert solved the initial issue I had. However, I now have one further question. The CSV file listed above comes from http://www.rba.gov.au/statistics/tables/#interest-rates, under zero-coupon "Interest Rates - Analytical Series - 2009 to Current - F17", and is the F17 Yields CSV. It has 5 workbooks, and I actually just want to gather the data in the 5th workbook. Is there a way I could do this? Cheers.
I could only test my code using Python 3. However, the only difference should be urllib2; hence I am using urllib.request for opening the desired URL.
The variable html is of type bytes and can generally be written to a file in binary mode. Additionally, your source is a CSV file already, so there should be no need to convert it somehow:
#!/usr/bin/env python3
# coding: utf-8
import urllib.request
url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
response = urllib.request.urlopen(url)
html = response.read()
with open('output.csv', 'wb') as f:
    f.write(html)
It is probably because of your opening mode.
According to the documentation:
'w' for only writing (an existing file with the same name will be erased)
You should use append mode ('a') to add to the end of the file:
'a' opens the file for appending; any data written to the file is automatically added to the end.
Also, since the file you are trying to download is already a CSV file, you don't need to convert it.
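For instance, a sketch of your original loop with append mode (note it still reopens the file once per row; opening it once before the loop, as the other answers do, is simpler):

import csv
import urllib2

url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
cr = csv.reader(urllib2.urlopen(url))

for row in cr:
    with open("AusCentralbank.csv", "ab") as f:
        writer = csv.writer(f)
        writer.writerow(row)  # writerow, not writerows, for a single row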
@albert had a great answer. I've gone ahead and converted it to the equivalent Python 2.x code. You were doing a bit too much work in your original program; since the file was already a CSV, you didn't need to do any special work to turn it into one.
import urllib2
url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
response = urllib2.urlopen(url)
html = response.read()
with open('AusCentralbank.csv', 'wb') as f:
    f.write(html)
I just tried to update a program I wrote, and I needed to add another pickle file. So I created the blank .pkl and then used this command to open it (just as I did with all my others):
with open('tryagain.pkl', 'r') as input:
    self.open_multi_clock = pickle.load(input)
Only this time around I keep getting this really weird error for no obvious reason:
cPickle.UnpicklingError: invalid load key, 'Γ'.
The pickle file does contain the necessary information to be loaded; it is an exact match to other blank .pkl's that I have, and they load fine. I don't know what that last key in the error means, but I suspect it could give me some insight if I did.
So I have figured out the solution to this problem, and I thought I'd take the time to list some examples of what to do and what not to do when using pickle files. Firstly, the solution was simply to make a plain old .txt file and dump the pickle data to it.
If you are under the impression that you have to actually make a new file and save it with a .pkl ending, you would be wrong. I was creating my .pkl's with Notepad++ and saving them as .pkl's. From my experience this works sometimes and sometimes it doesn't; if you're semi-new to programming, this may cause a fair amount of confusion, as it did for me. All that being said, I recommend just using plain old .txt files. It's the information stored inside the file, not necessarily the extension, that is important here.
#Notice file hasn't been pickled.
#What not to do. No need to name the file .pkl yourself.
with open('tryagain.pkl', 'r') as input:
    self.open_multi_clock = pickle.load(input)
The proper way:
#Pickle your new file
with open(filename, 'wb') as output:
    pickle.dump(obj, output, -1)
#Now open in binary mode with the original .txt ext. DON'T RENAME.
with open('tryagain.txt', 'rb') as input:
    self.open_multi_clock = pickle.load(input)
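Put together as a self-contained sketch (the dictionary is just placeholder data):

import pickle

# Dump in binary mode; pickle creates the file itself,
# so there is no need to pre-create it in an editor.
with open('tryagain.txt', 'wb') as output:
    pickle.dump({'clock': 1}, output, -1)

# Load in binary mode as well.
with open('tryagain.txt', 'rb') as input:
    open_multi_clock = pickle.load(input)

print(open_multi_clock)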
I'm going to guess the pickled data is throwing off portability via the characters it outputs. I'd suggest base64-encoding the pickled data before writing it to the file. Here's what I ran:
import base64
import pickle

value_p = pickle.dumps("abdfg")
value_p_b64 = base64.b64encode(value_p)

f = file("output.pkl", "w+")
f.write(value_p_b64)
f.close()

readable = ""  # initialize before accumulating
for line in open("output.pkl", 'r'):
    readable += pickle.loads(base64.b64decode(line))
>>> readable
'abdfg'
How can I tell Python to open a CSV file and merge all the columns per line into new lines in a new TXT file?
To explain:
I'm trying to download a bunch of member profiles from a website, for a research project. To do this, I want to write a list of all the URLs in a TXT file.
The URLs are akin to this: website.com-name-country-title-id.html
I have written a script that takes all these bits of information for each member and saves them in columns (name/country/title/id), in a CSV file, like this:
mark japan rookie married
john sweden expert single
suzy germany rookie married
etc...
Now I want to open this CSV and write a TXT file with lines like these:
www.website.com/mark-japan-rookie-married.html
www.website.com/john-sweden-expert-single.html
www.website.com/suzy-germany-rookie-married.html
etc...
Here's the code I have so far. As you can probably tell, I barely know what I'm doing, so help will be greatly appreciated!
import csv
x = "http://website.com/"
y = ".html"
csvFile=csv.DictReader(open("NameCountryTitleId.csv")) #This file is stored on my computer
file = open("urls.txt", "wb")
for row in csvFile:
    strArgument=str(row['name'])+"-"+str(row['country'])+"-"+str(row['title'])+"-"+str(row['id'])
    try:
        file.write(x + strArgument + y)
    except:
        print(strArgument)
file.close()
I don't get any error messages after running this, but the TXT file is completely empty.
Rather than using a DictReader, use a regular reader to make it easier to join the row:
import csv
url_format = "http://website.com/{}.html"
csv_file = 'NameCountryTitleId.csv'
urls_file = 'urls.txt'
with open(csv_file, 'rb') as infh, open(urls_file, 'w') as outfh:
    reader = csv.reader(infh)
    for row in reader:
        url = url_format.format('-'.join(row))
        outfh.write(url + '\n')
The with statement ensures the files are closed properly again when the code completes.
Further changes I made:
- In Python 2, open CSV files in binary mode; the csv module handles line endings itself, because correctly quoted column data can have embedded newlines in it. Regular text files should still be opened in text mode, though.
- When writing lines to a file, remember to add a newline character to delineate lines.
- Using a string format (str.format()) is far more flexible than using string concatenation.
- str.join() lets you join a sequence of strings together with a separator (see the short demo below).
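A quick illustration of those two pieces together:

row = ['mark', 'japan', 'rookie', 'married']
print("http://website.com/{}.html".format('-'.join(row)))
# prints: http://website.com/mark-japan-rookie-married.html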
It's actually quite simple: you are working with strings, yet the file you are opening to write to is opened in bytes mode, so every write fails and prints to the screen instead. Try changing this line:
file = open("urls.txt", "wb")
to this:
file = open("urls.txt", "w")
EDIT:
I stand corrected. However, I would like to point out that, in the absence of newlines or some other separator, how do you intend to use the URLs later on? If you put newlines between each URL, they would be easy to recover.
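For example, a one-line tweak to the write call in the original loop:

file.write(x + strArgument + y + "\n")  # a newline after each URL makes them easy to read back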