How to correctly append binary files in python - python

I am trying to create a binary file (called textsnew) and then append two previously created binary files to it. When I print the resulting file (textsnew), it only shows the first file appended to it, not the second one. I do however see that the size of the new file (textsnew) is the sum of the two appended files. Maybe I'm opening it incorrectly? This is my code:
with open("/path/textsnew", "ab") as myfile, open("/path/names", "rb") as file2:
    myfile.write(file2.read())
with open("/path/textsnew", "ab") as myfile, open("/path/namesthree", "rb") as file2:
    myfile.write(file2.read())
This is the code for reading the file:
import pickle
infile1 = open('/path/textsnew','rb')
names1 = pickle.load(infile1)
print (names1)

Open the new file, write its data.
Then, while the new file is still open (in append mode), open the second file, read its data and immediately write that data to the first file.
Then repeat the procedure for the third file.
Everything is done in binary mode, of course, although the same approach works just as well with text files. Linux, macOS and other Unix-like systems don't even really care about the difference.
This also assumes that each file is small enough to be read into memory in one go, as in your question (read() with no argument reads the whole file). Otherwise, you would need to create a loop around the read/write parts and copy the data in chunks.
with open('/path/textsnew', 'ab') as fpout:
    fpout.write(data)  # data: the new file's own bytes, defined elsewhere
    with open('/path/names', 'rb') as fpin:
        fpout.write(fpin.read())
    with open('/path/namesthree', 'rb') as fpin:
        fpout.write(fpin.read())
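One detail the question's reading code may be running into: pickle.load() returns a single pickled object per call, so one call on the concatenated file yields only the first object even though the bytes of the second are present. The sketch below assumes each appended file holds pickled objects stored back to back; the helper names append_files and load_all_pickles are made up for illustration. It also uses shutil.copyfileobj so the copy happens in chunks rather than with one large read().

```python
import pickle
import shutil


def append_files(dest_path, source_paths):
    """Append each source file to dest_path, copying in chunks."""
    with open(dest_path, "ab") as fpout:
        for path in source_paths:
            with open(path, "rb") as fpin:
                shutil.copyfileobj(fpin, fpout)


def load_all_pickles(path):
    """Return every pickled object stored back to back in path."""
    objects = []
    with open(path, "rb") as infile:
        while True:
            try:
                # Each call consumes exactly one pickled object.
                objects.append(pickle.load(infile))
            except EOFError:
                # Raised when no further data is left in the file.
                break
    return objects
```

With the paths from the question, this would be called as append_files("/path/textsnew", ["/path/names", "/path/namesthree"]) followed by load_all_pickles("/path/textsnew").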

Related

Python script which opens the exe file and copy the data from exe file based on offset condition and writing the extracted data to other file

with open(r'C:\users\desktop\Jhansi\parsing\out.exe', 'rb') as input_file:
    with open('output.bin', 'wb') as output_file:
        for line in input_file:
            output_file.write(line)
In the above script I need to add a condition: take only the data between the offsets 00000200 and 00000400 and store this extracted data in a separate file. I am attaching an image of the input file.
I need one pointer for opening the input file.
I need a second pointer for writing the extracted data to a separate file.
[Image: input file]
First of all, you are opening the files in binary mode, so reading and writing lines does not make (much) sense. You want to read and write bytes.
This does what you want, I think:
>>> with open('some_pathname', 'rb') as input_file:
...     input_file.seek(offset_start)
...     num_bytes = offset_end - offset_start
...     bytes_read = input_file.read(num_bytes)
...
>>> with open('another_pathname', 'wb') as output_file:
...     output_file.write(bytes_read)
The interactive interpreter is a great way to explore the Python language. You can use the help() function to find out what you can and can't do with any functions and objects you pass to it. Have a look at help(open), help(input_file) and help(bytes_read) to better understand the above code snippet.
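As a minimal sketch, the same idea can be wrapped in a function. The name extract_range is hypothetical, and the sketch assumes the end offset is exclusive (the byte at offset_end itself is not copied); adjust by one if the range in the question is meant to be inclusive.

```python
def extract_range(src_path, dst_path, offset_start, offset_end):
    """Copy the bytes in [offset_start, offset_end) from src_path to dst_path."""
    with open(src_path, "rb") as input_file:
        input_file.seek(offset_start)  # jump straight to the first wanted byte
        chunk = input_file.read(offset_end - offset_start)
    with open(dst_path, "wb") as output_file:
        output_file.write(chunk)
    # Fewer bytes than requested means the file ended before offset_end.
    return len(chunk)
```

For the offsets in the question, the call would look like extract_range("out.exe", "output.bin", 0x200, 0x400), since the values shown in a hex dump are hexadecimal.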

replacing characters in existing large file

I have a large file on my local disk whose first line contains a fixed-length string. I need to programmatically replace that fixed-length string using Python without reading the whole file into memory.
I have tried opening the file in append mode and seeking to position 0, then replacing the string, which is 9 bytes long. The code I tried is below.
with open("largefile.txt", 'a') as f:
    f.seek(0, 0)
    f.write("123456789")
I think you just want to open the file for writing without truncating it, which would be r+. To make this reproducible, we first create a file that matches this format:
with open('many_lines.txt', 'w') as fd:
    print('abcdefghi', file=fd)
    for i in range(10000):
        print(f'line {i:09}', file=fd)
Then we basically do what you were doing, but with the correct mode:
with open('many_lines.txt', 'r+') as fd:
    print('123456789', file=fd)
Or you can use write directly:
with open('many_lines.txt', 'r+') as fd:
    fd.write('123456789')
Note: I'm opening in r+ so that you'll get a FileNotFoundError if the file doesn't exist (or the filename is misspelled) rather than just blindly creating a tiny file.
The open modes are copied directly from the C/POSIX API for fopen, so your use of a triggers the behaviour described there:
Subsequent writes to the file will always end up at the then current end of file, irrespective of any intervening fseek(3) or similar
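The difference between the two modes can be checked with a small experiment. This is only a sketch; the helper name overwrite_prefix is invented for the demonstration, and the file it is pointed at should be a small scratch file.

```python
def overwrite_prefix(path, text, mode):
    """Open path in the given mode, seek to the start, write text, return the result."""
    with open(path, mode) as f:
        f.seek(0)
        f.write(text)
    with open(path) as f:
        return f.read()
```

On a file containing "abcdefghi\nrest\n", calling overwrite_prefix(path, "123456789", "a") returns "abcdefghi\nrest\n123456789" (the seek is ignored and the text lands at the end), whereas the same call with "r+" on a fresh copy returns "123456789\nrest\n".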

Selectively replacing csv header names

I have been searching for a solution for this and haven't been able to find one. I have a directory of folders which contain multiple, very-large csv files. I'm looping through each csv in each folder in the directory to replace values of certain headers. I need the headers to be consistent (from file to file) in order to run a different script to process all the data properly.
I found this solution that I thought would work: change first line of a file in python.
However this is not working as expected. My code:
from_file = open(filepath)
# for line in f:
# if
data = from_file.readline()
# print(data)
# with open(filepath, "w") as f:
print 'DBG: replacing in file', filepath
# s = s.replace(search_pattern, replacement)
for i in range(len(search_pattern)):
    data = re.sub(search_pattern[i], replacement[i], data)
# data = re.sub(search_pattern, replacement, data)
to_file = open(filepath, mode="w")
to_file.write(data)
shutil.copyfileobj(from_file, to_file)
I want to replace the header values in search_pattern with values in replacement without saving or writing to a different file - I want to modify the file. I have also tried
shutil.copyfileobj(from_file, to_file, -1)
As I understand it that should copy the whole file rather than breaking it up in chunks, but it doesn't seem to have an effect on my output. Is it possible that the csv is just too big?
I haven't been able to determine a different way to do this or make this way work. Any help would be greatly appreciated!
The answer you copied from "change first line of a file in python" doesn't work on Windows.
On Linux, you can open a file for reading and writing at the same time. The system ensures that there's no conflict, but behind the scenes, 2 different file objects are being handled. This method is also very unsafe: if the program crashes while reading/writing (power off, disk full), the file has a great chance of being truncated or corrupted.
Anyway, on Windows, you cannot open a file for reading and writing at the same time using 2 handles. It just destroys the contents of the file.
So there are 2 options, which are portable and safe.
First option: create a new file in the same directory and, once the copy is complete, delete the original file and rename the new one.
Like this:
import os
import shutil
filepath = "test.txt"
with open(filepath) as from_file, open(filepath+".new","w") as to_file:
    data = from_file.readline()
    to_file.write("something else\n")
    shutil.copyfileobj(from_file, to_file)
os.remove(filepath)
os.rename(filepath+".new",filepath)
This doesn't take much longer, because the rename operation is instantaneous. Besides, if the program/computer crashes at any point, one of the files (old or new) is valid, so it's safe.
Second option: if the old and new header have the same length, open the file in read/write mode.
Like this:
filepath = "test.txt"
with open(filepath,"r+") as rw_file:
    data = rw_file.readline()
    data = "h"*(len(data)-1) + "\n"
    rw_file.seek(0)
    rw_file.write(data)
Here we read the first line, replace it with the same number of h characters, rewind the file and write the line back, overwriting the previous contents while keeping the rest of the lines. This is also safe, and even if the file is huge, it's very fast. The only constraint is that the replacement must be exactly the same size as the original (otherwise you would either leave remainders of the previous data or overwrite the start of the next line(s), since no data is shifted).
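Applied to the header-renaming problem in the question, the copy-and-rename approach might look like the sketch below. The function name rewrite_header is hypothetical, and two details differ from the snippet above and are assumptions of this sketch: the new file is created with tempfile.NamedTemporaryFile in the same directory, and os.replace is used for the final swap so that the delete and rename happen in one call.

```python
import os
import re
import shutil
import tempfile


def rewrite_header(filepath, search_patterns, replacements):
    """Rewrite only the first line of filepath; stream the rest unchanged."""
    directory = os.path.dirname(os.path.abspath(filepath))
    with open(filepath, newline="") as src:
        header = src.readline()
        for pattern, repl in zip(search_patterns, replacements):
            header = re.sub(pattern, repl, header)
        # Temporary file in the same directory, so the final swap
        # stays on one filesystem.
        with tempfile.NamedTemporaryFile(
            "w", dir=directory, delete=False, newline=""
        ) as dst:
            dst.write(header)
            shutil.copyfileobj(src, dst)  # copies the remaining data in chunks
    # Both files are closed here; swap the new file in for the old one.
    os.replace(dst.name, filepath)
```

Because the rest of the file is streamed through copyfileobj, memory use stays small regardless of how large the csv is.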

Delete the contents of a file before writing to it (in Python)?

I'm trying my hand at this Rosalind problem and am running into an issue. I believe everything in my code is correct, but it obviously isn't, as it's not running as intended. I want to delete the contents of the file and then write some text to that file. The program writes the text that I want it to, but it doesn't first delete the initial contents.
def ini5(file):
    raw = open(file, "r+")
    raw2 = (raw.read()).split("\n")
    clean = raw2[1::2]
    raw.truncate()
    for line in clean:
        raw.write(line)
        print(line)
I've seen:
How to delete the contents of a file before writing into it in a python script?
But my problem still persists. What am I doing wrong?
truncate() truncates at the current position. Per its documentation, emphasis added:
Resize the stream to the given size in bytes (or the current position if size is not specified).
After a read(), the current position is the end of the file. If you want to truncate and rewrite with that same file handle, you need to perform a seek(0) to move back to the beginning.
Thus:
raw = open(file, "r+")
contents = raw.read().split("\n")
raw.seek(0) # <- This is the missing piece
raw.truncate()
raw.write('New contents\n')
(You could also have called raw.truncate(0), but this would have left the pointer, and thus the location for future writes, at a position other than the start of the file, making your file sparse when you started writing to it at that position.)
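Put together with the logic from the question, a minimal sketch could look like this. The function name keep_even_lines is made up, and joining the kept lines with newlines is an assumption of the sketch (the original loop wrote them without separators).

```python
def keep_even_lines(path):
    """Rewrite path in place, keeping only lines 2, 4, 6, ... (1-based)."""
    with open(path, "r+") as f:
        lines = f.read().split("\n")
        kept = lines[1::2]
        f.seek(0)     # move back to the start first
        f.truncate()  # truncates at the current position, now 0
        f.write("\n".join(kept))
    return kept
```

For a file containing "a\nb\nc\nd\n", this leaves "b\nd" in the file and returns ['b', 'd'].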
If you want to completely overwrite the old data in the file, you should open the file in a different mode.
It should be:
raw = open(file, "w") # or "wb"
To resolve your problem, first read the file's contents:
with open(file, "r") as f: # or "rb"
    file_data = f.read()
# And then:
raw = open(file, "w")
That is, reopen the file in write mode afterwards. This way, you will not append your text to the file; you will write only your data to it.
Read about file modes here.

python clear csv file

How can I clear a complete csv file with Python? Most forum entries that cover the issue of deleting rows/columns basically say to write the stuff you want to keep into a new file. I need to completely clear a file. How can I do that?
Basically you want to truncate the file, this can be any file. In this case it's a csv file so:
filename = "filewithcontents.csv"
# opening the file with w+ mode truncates the file
f = open(filename, "w+")
f.close()
Your question is rather strange, but I'll interpret it literally. Clearing a file is not the same as deleting it.
You want to open a file object to the CSV file, and then truncate the file, bringing it to zero length.
f = open("filename.csv", "w")
f.truncate()
f.close()
If you want to delete it instead, that's just an OS filesystem call:
import os
os.remove("filename.csv")
The Python csv module is only for reading and writing whole CSV files but not for manipulating them. If you need to filter data from file then you have to read it, create a new csv file and write the filtered rows back to new file.
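The read, filter and write-to-a-new-file pattern described above could be sketched as follows. The function name filter_csv and the keep_row callback are hypothetical, and the sketch assumes the first row of the file is a header that should always be copied.

```python
import csv


def filter_csv(src_path, dst_path, keep_row):
    """Copy the header and every row for which keep_row(row) is true."""
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # assumed header row, copied as-is
        for row in reader:
            if keep_row(row):
                writer.writerow(row)
```

For example, filter_csv("in.csv", "out.csv", lambda row: row[0] != "") would drop every row whose first column is empty.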
