I have the following code:
matrix_file = open("abc.txt", "rU")
matrix = matrix_file.readlines()
keys = matrix[0]
vals = [line[1:] for line in matrix[1:]]
ea=open("abc_format.txt",'w')
ea.seek(0)
ea.write(vals)
ea.close()
However I am getting the following error:
TypeError: expected a character buffer object
How do I buffer the output and what data type is the variable vals?
vals is a list. If you want to write a list of strings to a file, as opposed to an individual string, use writelines:
ea=open("abc_format.txt",'w')
ea.seek(0)
ea.writelines(vals)
ea.close()
Note that this will not insert newlines for you (although in your specific case your strings already end in newlines, as pointed out in the comments). If you need to add newlines you could do the following as an example:
ea=open("abc_format.txt",'w')
ea.seek(0)
ea.writelines([line+'\n' for line in vals])
ea.close()
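As a minimal sketch of the difference between write and writelines, assuming Python 3 and some made-up sample lines in place of the contents of abc.txt (the temporary path is also just for illustration):

```python
import os
import tempfile

# Hypothetical sample data standing in for the lines read from abc.txt.
vals = ["1 2 3\n", "4 5 6\n"]

# Write into a throwaway directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "abc_format.txt")

with open(path, "w") as ea:
    try:
        ea.write(vals)       # a list is not a string, so this raises TypeError
    except TypeError as exc:
        error_message = str(exc)
    ea.writelines(vals)      # writelines accepts any iterable of strings

with open(path) as ea:
    written = ea.read()

print(error_message)
print(written)
```

Because each string in vals already ends in a newline, writelines reproduces the lines as they are.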
The write function will only handle characters or bytes. To write arbitrary objects, use python's pickle library. Write with pickle.dump(), read them back with pickle.load().
But if what you're really after is writing something in the same format as your input, you'll have to write out the matrix values and newlines yourself.
for line in vals:
    ea.write(line)
ea.close()
You've now written a file that looks like abc.txt, except that the first row and the first character of each line have been removed. (You dropped those when constructing vals.)
Somehow I doubt this is what you intended, since you chose to name it abc_format.txt, but anyway this is how you write out a list of lines of text.
You cannot "write" objects to files. Rather, use the pickle module:
matrix_file = open("abc.txt", "rU")
matrix = matrix_file.readlines()
keys = matrix[0]
vals = [line[1:] for line in matrix[1:]]
# pickling begins!
import pickle
f = open("abc_format.txt", "wb")  # pickle needs the file opened for writing, in binary mode
pickle.dump(vals, f)  # call with (object, file)
f.close()
Then read it like this:
import pickle
f = open("abc_format.txt", "rb")  # read back in binary mode
vals = pickle.load(f) #exactly the same list
f.close()
You can do this with almost any kind of object, your own or built-in. A plain file object only accepts strings and bytes; Python's open() function just opens the file, much like opening it in Notepad would.
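A minimal round-trip sketch, assuming Python 3 and made-up sample data (the file name vals.pickle and the temporary directory are only for illustration). Note that the file is opened in binary mode for both writing and reading:

```python
import os
import pickle
import tempfile

# Hypothetical sample data standing in for the vals list from the question.
vals = ["1 2 3\n", "4 5 6\n"]

path = os.path.join(tempfile.mkdtemp(), "vals.pickle")

# pickle writes bytes, so the file is opened with "wb".
with open(path, "wb") as f:
    pickle.dump(vals, f)

# Reading back also needs binary mode.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == vals)
```

The restored object is an equal copy of the original list, not the same object.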
To answer your first question, vals is a list, because anything in [operation(i) for i in iterated_over] is a list comprehension, and list comprehensions make lists. To see what the type of any object is, just use the type() function; e.g. type([1,4,3])
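A small sketch of checking the type, using made-up lines in place of the real file contents:

```python
# Hypothetical stand-in for matrix_file.readlines().
matrix = ["a b c\n", "x1 2 3\n", "y4 5 6\n"]

# Same comprehension as in the question: skip the first row,
# then drop the first character of every remaining line.
vals = [line[1:] for line in matrix[1:]]

vals_type = type(vals)
print(vals_type)
```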
Examples: https://repl.it/qKI/3
Documentation here:
https://docs.python.org/2/library/pickle.html and https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions
First of all, instead of opening and closing the file separately, you can use a with statement, which closes the file automatically. As for the error: as it says, the write method only accepts a character buffer object, so you need to convert your list to a string.
For example, you can use the join method, which joins the items of an iterable with a given delimiter and returns the concatenated string.
with open("abc.txt", "rU") as f, open("abc_format.txt", 'w') as out:
    matrix = f.readlines()
    keys = matrix[0]
    vals = [line[1:] for line in matrix[1:]]
    # each line already ends in '\n', so join with an empty delimiter
    out.write(''.join(vals))
Also, as a more Pythonic way, since file objects are iterators, you can read the first line by calling next() on the file and pass the remaining lines to join:
with open("abc.txt", "rU") as f, open("abc_format.txt", 'w') as out:
    keys = next(f)  # first line (the header)
    # the remaining lines keep their trailing '\n'
    out.write(''.join(f))
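One thing to watch with join is that lines read from a file keep their trailing newline. A small sketch with made-up lines shows what each delimiter produces:

```python
# Hypothetical lines, as they would come from iterating over a file.
lines = ["1 2 3\n", "4 5 6\n"]

# An empty delimiter reproduces the lines unchanged.
joined_empty = "".join(lines)

# A '\n' delimiter adds a second newline between lines, i.e. a blank line.
joined_newline = "\n".join(lines)

print(repr(joined_empty))
print(repr(joined_newline))
```

If the lines have had their newlines stripped beforehand, joining with '\n' is the appropriate choice instead.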
Recently I was dealing with CSV files and tried to replace the NULL bytes in a CSV file with empty strings to make the CSV reader work. I referred to this answer but decided to do it in a different way.
The original solution is like this:
with open(file) as f:
    reader = csv.reader(x.replace('\0','') for x in f)
    print([x for x in reader])
But I decided to do it like this:
with open(file) as f:
    for line in f:
        line.replace('\0','')
    f.seek(0)
    reader = csv.reader(f)
    print([x for x in reader])
My approach does not work the way the original one does. Why is that?
Thank you for your time!
Take a look at the official doc of the replace function in python:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
In your implementation, you are calling replace but not capturing the returned string anywhere, so each line stays unchanged and the file itself is never modified.
You could instead replace the contents of the whole file and store the result in another variable or, if the file is large, perform your operation inside the for loop itself.
However, the reference implementation you showed before looks better: it uses a generator that yields the replaced lines as you need them. You should stick with that.
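A short sketch of both points, using an in-memory io.StringIO with made-up contents in place of the real CSV file:

```python
import csv
import io

# Hypothetical CSV content containing NULL bytes.
raw = io.StringIO("a\0,b\nc,d\0\n")

line = "a\0,b\n"
line.replace("\0", "")            # the returned copy is thrown away
unchanged = line                  # so line still contains the NULL byte
cleaned = line.replace("\0", "")  # capture the result to keep it

# The generator cleans each line lazily as csv.reader asks for it.
rows = list(csv.reader(x.replace("\0", "") for x in raw))

print(repr(unchanged), repr(cleaned))
print(rows)
```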
I have a csv which contains text like
AAABBBBCCCDDDDDDD
EEEFFFRRRTTTHHHYY
When I run the code below:
rows = csv.reader(csvfile)
for row in rows:
    print(" ".join('%s' %row for row in rows))
it prints the following:
['AAABBBBCCCDDDDDDD']
['EEEFFFRRRTTTHHHYY']
But I want to display it as a single run of text, like below:
AAABBBBCCCDDDDDDDEEEFFFRRRTTTHHHYY
Is there anything wrong in the code?
Your example looks like you simply need
with open(csvfile) as inputfile: # misnomer; not really proper CSV
    for row in inputfile:
        print(row.rstrip('\n'), end='')
The example you provided doesn't look like a CSV file. It looks like a simple text file. Then you could have something as simple as:
Input.txt
AAABBBBCCCDDDDDDD
EEEFFFRRRTTTHHHYY
Solution.py
input_filename = "Input.txt"
with open(input_filename) as input_file:
    print("".join(x.rstrip('\n') for x in input_file))
This takes advantage of the following:
A file object can be iterated over. Each iteration gives you the next line.
Every line received from the file has a newline character at its end. Since you don't seem to want it, we use the .rstrip() method to remove it.
The .join() method accepts any iterable, even a generator expression.
The generator expression creates an iterable that .join() accepts, using .rstrip() to format every line coming from the input file.
EDIT: OK, let's break my answer down further:
When you open a file you can iterate over it. In the simplest terms, that means you can loop over it (for line in input_file: ...).
Beyond that, from one iterator you can create another by transforming each element. This is what a list comprehension or, in the case I have chosen, a generator expression does. So the expression (x.rstrip('\n') for x in input_file) is an iterator that takes every element of input_file and applies .rstrip('\n') to it.
The string method .join() glues together the elements provided by an iterator, using the string it is called on as the separator. Since I use an empty string here, there is no separator. I pass it the iterator defined above.
I then print() the string returned by the .join() operation.
I made a minor correction to my answer because of an edge case: any space or tab characters at the end of a line in the input file would have been removed had I used x.rstrip() instead of x.rstrip('\n').
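The edge case can be seen in a small sketch that uses an in-memory io.StringIO in place of Input.txt, with a trailing space added to the second line purely for illustration:

```python
import io

# Hypothetical file contents; note the space before the final newline.
input_file = io.StringIO("AAABBBBCCCDDDDDDD\nEEEFFFRRRTTTHHHYY \n")

# rstrip('\n') removes only the newline, so the trailing space survives.
joined = "".join(x.rstrip("\n") for x in input_file)

# Rewind and compare with rstrip(), which removes all trailing whitespace.
input_file.seek(0)
stripped_all = "".join(x.rstrip() for x in input_file)

print(repr(joined))
print(repr(stripped_all))
```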
You could start with an empty string, and for every row read from the csv file, remove the newline at the end and add the contents to the empty string.
joined = ""
with open(csvfile) as f:
    for row in f:
        joined = joined + row.replace("\n","")
print(joined)
Output:
>> AAABBBBCCCDDDDDDDEEEFFFRRRTTTHHHYY
Here is the code I modified from previous code.
But, I got this error:
TypeError: must be str not list in f1.write(head)
This is the part of code that is producing this error:
from itertools import islice
with open("input.txt") as myfile:
    head = list(islice(myfile, 3))
f1.write(head)
f1.close()
Well, you have it right, using islice(filename, n) will get you the first n lines of file filename. The problem here is when you try and write these lines to another file.
The error is pretty intuitive (I've added the full error one receives in this case):
TypeError: write() argument must be str, not list
This is because f.write() accepts strings as parameters, not list types.
So, instead of dumping the list as is, write the contents of it in your other file using a for loop:
from itertools import islice

with open("input.txt", "r") as myfile:
    head = list(islice(myfile, 3))

# always remember, use files in a with statement
with open("output.txt", "w") as f2:
    for item in head:
        f2.write(item)
Granted that the contents of the list are all of type str this works like a charm; if not, you just need to wrap each item in the for loop in an str() call to make sure it is converted to a string.
If you want an approach that doesn't require a loop, you could always consider using f.writelines() instead of f.write() (and, take a look at Jon's comment for another tip with writelines).
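A minimal sketch of the writelines variant, assuming Python 3; it creates its own small input file in a temporary directory, so the file contents here are made up:

```python
import os
import tempfile
from itertools import islice

# Set up a throwaway input file with hypothetical contents.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.txt")
dst = os.path.join(workdir, "output.txt")
with open(src, "w") as f:
    f.write("one\ntwo\nthree\nfour\n")

# writelines accepts any iterable of strings, so the islice object
# can be passed straight through without building a list first.
with open(src) as myfile, open(dst, "w") as f2:
    f2.writelines(islice(myfile, 3))

with open(dst) as f:
    result = f.read()

print(result)
```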
I am downloading JSON files from an API, and I use the following code to write the JSON. Each item in the loop gives me a JSON file. I need to save it and extract entities from the appended JSON file using a loop.
for item in style_ls:
    dat = get_json(api, item)
    specs_dict[item] = dat
    with open("specs_append.txt", "a") as myfile:
        json.dump(dat, myfile)
        myfile.close()
    print(item)
with open("specs_data.txt", "w") as myfile:
    json.dump(specs_dict, myfile)
    myfile.close()
I know that I cannot get valid JSON from specs_append.txt, but I can get it from specs_data.txt. I am writing the first file only because my program needs at least 3-4 days to complete and there is a high chance that my system will shut down in that time. So is there any way I can do this efficiently?
If not, is there any way I can extract the objects from the <{JSON}{JSON}> format of specs_append.txt (which is not valid JSON)?
If not, should I write specs_dict to a txt file every time through the loop, so that even if the program gets terminated I can restart from that point in the loop and still get valid JSON?
I suggest several possible solutions.
One solution is to write custom code to slurp in the input file. I would suggest putting a special line before each JSON object in the file, such as: ###
Then you could write code like this:
import json

SPECIAL_LINE = '###\n'  # assumed marker line written before each JSON object

def json_get_objects(f):
    temp = ''
    line = next(f)  # pull first line
    assert line == SPECIAL_LINE
    for line in f:
        if line != SPECIAL_LINE:
            temp += line
        else:
            # found special marker, temp now contains a complete JSON object
            j = json.loads(temp)
            yield j
            temp = ''
    # after loop done, yield up last JSON object
    if temp:
        j = json.loads(temp)
        yield j

with open("specs_data.txt", "r") as f:
    for j in json_get_objects(f):
        pass  # do something with JSON object j
Two notes on this. First, I am simply appending to a string over and over; this used to be a very slow way to do this in Python, so if you are using a very old version of Python, don't do it this way unless your JSON objects are very small. Second, I wrote code to split the input and yield up JSON objects one at a time, but you could also use a guaranteed-unique string, slurp in all the data with a single call to f.read() and then split on your guaranteed-unique string using the str.split() method function.
Another solution would be to write the whole file as a valid JSON list of valid JSON objects. Write the file like this:
{"mylist":[
# first JSON object, followed by a comma
# second JSON object, followed by a comma
# third JSON object
]}
This would require your file appending code to open the file with writing permission, and seek to the last ] in the file before writing a comma plus newline, then the new JSON object on the end, and then finally writing ]} to close out the file. If you do it this way, you can use json.loads() to slurp the whole thing in and have a list of JSON objects.
Finally, I suggest that maybe you should just use a database. Use SQLite or something and just throw the JSON strings in to a table. If you choose this, I suggest using an ORM to make your life simple, rather than writing SQL commands by hand.
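As a rough sketch of the database idea, here is a version that uses the standard-library sqlite3 module directly rather than an ORM. The table name specs, its columns, and the helper names save_spec and load_specs are all hypothetical, and an in-memory database is used so the example is self-contained:

```python
import json
import sqlite3

# ":memory:" keeps the sketch self-contained; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS specs (item TEXT PRIMARY KEY, payload TEXT)"
)

def save_spec(item, dat):
    """Store one JSON-serializable object under its item key."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO specs (item, payload) VALUES (?, ?)",
            (item, json.dumps(dat)),
        )

def load_specs():
    """Return everything stored so far as a dict of decoded objects."""
    cursor = conn.execute("SELECT item, payload FROM specs")
    return {item: json.loads(payload) for item, payload in cursor}

# Made-up sample data standing in for the API responses.
save_spec("style-1", {"foo": 0})
save_spec("style-2", {"bar": 1})
specs = load_specs()
print(specs)
```

Each object is committed as soon as it is saved, so an interrupted run keeps whatever was stored before the interruption.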
Personally, I favor the first suggestion: write in a special line like ###, then have custom code to split the input on those marks and then get the JSON objects.
EDIT: Okay, the first suggestion was sort of assuming that the JSON was formatted for human readability, with a bunch of short lines:
{
    "foo": 0,
    "bar": 1,
    "baz": 2
}
But it's all run together as one big long line:
{"foo":0,"bar":1,"baz":2}
Here are three ways to fix this.
0) write a newline before the ### and after it, like so:
###
{"foo":0,"bar":1,"baz":2}
###
{"foo":0,"bar":1,"baz":2}
Then each input line will alternately be ### or a complete JSON object.
1) As long as SPECIAL_LINE is completely unique (never appears inside a string in the JSON) you can do this:
with open("specs_data.txt", "r") as f:
    temp = f.read()  # read entire file contents
lst = temp.split(SPECIAL_LINE)
# skip empty pieces, e.g. the one before a leading marker
json_objects = [json.loads(x) for x in lst if x.strip()]
for j in json_objects:
    pass  # do something with JSON object j
The .split() method function can split up the temp string into JSON objects for you.
2) If you are certain that each JSON object will never have a newline character inside it, you could simply write JSON objects to the file, one after another, putting a newline after each; then assume that each line is a JSON object:
import json

def json_get_objects(f):
    for line in f:
        if line.strip():
            yield json.loads(line)

with open("specs_data.txt", "r") as f:
    for j in json_get_objects(f):
        pass  # do something with JSON object j
I like the simplicity of option (2), but I like the reliability of option (0). If a newline ever got written in as part of a JSON object, option (0) would still work, but option (2) would error.
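For option (2), the writing side might look like the sketch below. The helper name append_json_line, the temporary path, and the sample objects are hypothetical. It relies on json.dumps, called without an indent argument, producing a single line: newlines inside string values are escaped as \n in the output.

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "specs_append.txt")

def append_json_line(path, obj):
    """Append one JSON object to the file, followed by a newline."""
    with open(path, "a") as f:
        f.write(json.dumps(obj) + "\n")

def json_get_objects(f):
    """Yield one decoded object per non-empty line."""
    for line in f:
        if line.strip():
            yield json.loads(line)

# Made-up sample objects; the second has a newline inside a string value.
append_json_line(path, {"foo": 0})
append_json_line(path, {"bar": "line1\nline2"})

with open(path) as f:
    objects = list(json_get_objects(f))

print(objects)
```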
Again, you can also simply use an actual database (SQLite) with an ORM and let the database worry about the details.
Good luck.
Append json data to a dict on every loop.
In the end dump this dict as a json and write it to a file.
To give you an idea of how to append data to a dict:
>>> d1 = {'suku':12}
>>> t1 = {'suku1':212}
>>> d1.update(t1)
>>> d1
{'suku1': 212, 'suku': 12}
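If the program may be interrupted, one variation is to rewrite the complete dict after every update, so the file on disk is always valid JSON. The sketch below assumes this is acceptable for the size of the data; the helper name checkpoint, the ".tmp" suffix, and the sample values are hypothetical:

```python
import json
import os
import tempfile

def checkpoint(path, data):
    """Write data as JSON to a temporary file, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(data, f)
    os.replace(tmp_path, path)  # the old file is replaced only once the new one is complete

path = os.path.join(tempfile.mkdtemp(), "specs_data.txt")

specs_dict = {}
# Made-up items standing in for the API responses.
for item, dat in [("suku", {"value": 12}), ("suku1", {"value": 212})]:
    specs_dict.update({item: dat})
    checkpoint(path, specs_dict)

with open(path) as f:
    reloaded = json.load(f)

print(reloaded)
```

Rewriting the whole file on every iteration gets slower as the dict grows, which is the trade-off against the append-only approaches.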
I am very new at programming. I have the following problem.
I want to take some floats from a .txt file, and add them to a Python list as strings, with a comma between them, like this:
.TXT:
194220.00 38.4397984 S 061.1720742 W 0.035
194315.00 38.4398243 S 061.1721378 W 0.036
Python:
myList = ('38.4397984,061.1720742','38.4398243,061.1721378')
Does anybody know how to do this? Thank you!
There are three key pieces you'll need to do this. You'll need to know how to open files, how to iterate through the lines while the file is open, and how to split each line into its fields.
Once you know all these things, it's as simple as concatenating the pieces you want and adding them to your list.
my_list = []
with open('path/to/my/file.txt') as f:
    for line in f:
        words = line.split()
        # fields 1 and 3 are the two coordinates; join them with a comma
        my_list.append(words[1] + ',' + words[3])
print(my_list)
Python has a built-in function open(fileName, mode) that returns a file object.
fileName is a string with the name of the file.
mode is another string that states how the file will be used, e.g. 'r' for reading and 'w' for writing.
f = open('file.txt', 'r')
This creates a file object in the variable f. f now has different methods you can use to read the data in the file. The most common is f.read(size), where size is optional.
text = f.read()
This saves the data in the variable text.
Now you want to split the string. A string is an object and has a method called split() that creates a list of words from the string, separated by whitespace.
myList = text.split()
In your question you gave us a tuple, which, judging from the variable name, I am not sure is what you were looking for. Make sure to read up on the difference between a tuple and a list. The procedure to build a tuple is a bit different.
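Putting the pieces together, here is a small sketch that works on the two sample lines from the question held in a string, so no file is needed. It assumes every data line has at least four whitespace-separated fields, and it also shows how to get a tuple at the end if that is what you need:

```python
# The two sample lines from the question, held in a string for illustration.
text = """194220.00 38.4397984 S 061.1720742 W 0.035
194315.00 38.4398243 S 061.1721378 W 0.036
"""

my_list = []
for line in text.splitlines():
    words = line.split()
    if len(words) >= 4:  # skip blank or malformed lines
        # fields 1 and 3 hold the two coordinates
        my_list.append(words[1] + "," + words[3])

# A tuple can be built from the finished list.
my_tuple = tuple(my_list)

print(my_list)
print(my_tuple)
```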