Missing contents in .txt file after shutil.make_archive

Sysinfo = open('SystemInformation.txt', 'w')
Sysinfo.write("something useful",)
Sysinfo.close
#a handful more processes occur here
os.chdir(dstFolder)
shutil.make_archive('filename', 'zip', srcFolder)
I have the above code and everything zips up just fine except for the SystemInformation.txt file I created. When I open it up after extracting the .zip file it is completely blank. The odd part to me is that the same file in the source folder before it gets zipped is completely fine.

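Make sure you actually call the function. Sysinfo.close without parentheses only references the method and never runs it, so the file is still open and its buffered contents may not be on disk when shutil.make_archive reads it. The copy in the source folder looks fine afterwards because the buffer is flushed when the script exits. You are missing the following:
Sysinfo.close()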
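A with block avoids this class of mistake, because the file is closed when the block ends. The following is a minimal sketch, not the asker's full script: the function name write_and_archive and its parameters are hypothetical, and it passes a full base name to shutil.make_archive instead of calling os.chdir.

```python
import os
import shutil


def write_and_archive(src_folder, dst_folder, archive_name="filename"):
    """Write SystemInformation.txt into src_folder, then zip src_folder."""
    info_path = os.path.join(src_folder, "SystemInformation.txt")
    # The with block closes (and therefore flushes) the file on exit,
    # so the text is on disk before make_archive reads the folder.
    with open(info_path, "w") as sysinfo:
        sysinfo.write("something useful")
    # A full base name places the archive in dst_folder without os.chdir.
    base_name = os.path.join(dst_folder, archive_name)
    return shutil.make_archive(base_name, "zip", src_folder)
```

shutil.make_archive returns the path of the archive it created, so the caller can open or move the .zip file directly.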

Related

Python: Iterate through directory to find specific text file, open it, write some of its contents to a new file, close it, and move on to the next dir

I have a script that takes an input text file then finds data in it, puts that data as a variable, then later I call that variable to write to a new file. This snippet of code is just for reading the txt file and storing the data from it as variables.
searchfile = open('C://Users//Me//DynamicFolder//report//summary.txt','r', encoding='utf-8')
slab_count=0
slab_number=[]
slab_total=0
for line in searchfile:
    if "Slab" in line:
        slab_num = ([float(s) for s in re.findall(r'[-+]?(?:\d*\.\d+|\d+)', line)])
        slab_percent = slab_num[-1]
        slab_number.append(slab_percent)
        slab_count=slab_count+1
slab_total=0
for slab_percent in slab_number:
    slab_total+=slab_percent
searchfile.close()
I am using xlsxwriter to write the variables to an excel doc.
My question is: how do I iterate this over the subdirectories of a given directory to find summary.txt when the folder names are dynamic?
So C://Users//Me//DynamicFolder//report//summary.txt is the path to one of the files. There are several folders, which I am calling DynamicFolder here, that another process puts there, and their names change all the time. I need this script to go into each of those dynamic folders and into a subdirectory called report; that name is static and always the same. So each dynamic folder has a subdirectory called report, and the report folder contains a file called summary.txt. I am trying to go through each dynamic folder into report > summary.txt, then open those txt files and write data from them.
How do I iterate or loop this? Right now I have 18 folders with those DynamicFolder names, which will change when they are overwritten. How can I make this snippet of code iterate through them?
for path in Path('C://Users//Me//DynamicFolder//report//summary.txt').rglob('summary.txt'):
The report folder is not the only folder with a summary.txt file, but it's the only folder with the file I want. The code above pulls ALL summary.txt files from all subdirectories under the DynamicFolder (not just the report folder). I am wondering if I can make this match JUST the 'report' subdirectories under the DynamicFolders, and somehow use it to iterate the rest of my code?
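One possible approach, sketched here under the assumption that every dynamic folder sits directly under a single parent directory: use Path.glob with a pattern in which * stands for exactly one folder level, so only <parent>/<dynamic folder>/report/summary.txt matches. The function name find_report_summaries and the root argument are hypothetical.

```python
from pathlib import Path


def find_report_summaries(root):
    """Return summary.txt paths located at <root>/<any folder>/report/."""
    # '*' matches a single directory level, so summary.txt files in other
    # subfolders of a dynamic folder (or nested deeper) are not returned.
    return sorted(Path(root).glob('*/report/summary.txt'))
```

The existing parsing code could then run once per result, for example inside a loop such as for path in find_report_summaries('C://Users//Me'), opening each path instead of the hard-coded one.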

A file when you run creates a replica of itself and deletes the original file. (Python)

I tried writing code where the file, when run, creates a replica of itself and deletes the original file.
Here is my code:
import shutil
import os
loc=os.getcwd()
shutil.move("./aa/test.py", loc, copy_function=shutil.copy2)
But the issues with this are:
This code is usable only once; to use it again, I need to change the name of the file or delete the newly created file and then run it again.
Also, if I run it inside a folder, it always creates the new file outside the folder (one directory up from the executing program).
How do I fix this?
Some notes:
The copy should be made at the exact place where the original file was.
The folder was empty, containing just this file. The file doesn't need to be in a folder; I just used that as a test case.
Yes, I understand that if I delete the original file it should stop working. I actually have a picture in my mind of how it should work:
First, a new file with exactly the same content will be made in the same path as the original file (probably with a different name).
Then, the original file will be deleted and the second file (the copy of the original) will be renamed to the exact name and extension of the deleted original.
This should repeat every time I run the .py file (containing this code), making the code portable and suitable for multiple uses.
Maybe the code to be executed after the file deletion can be stored in a memory cache (I guess?).

Easiest way (in pseudo code):
Get name of current script.
Read all contents in memory.
Delete current script.
Write memory contents into new file with the same name.
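The pseudo code above could be sketched in Python roughly as follows. This is an illustration only: the function name rewrite_in_place is hypothetical, and it takes the script path as an argument so that it can be tried on any file.

```python
import os


def rewrite_in_place(script_path):
    """Replace script_path with a fresh file holding the same bytes."""
    # Read all contents into memory.
    with open(script_path, "rb") as original:
        contents = original.read()
    # Delete the original file.
    os.remove(script_path)
    # Write the memory contents into a new file with the same name.
    with open(script_path, "wb") as replica:
        replica.write(contents)
    return script_path
```

To apply it to the running script, the last line of that script could call rewrite_in_place(os.path.abspath(__file__)), assuming the script is run as a regular .py file so that __file__ is defined.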

This code is usable only once; to use it again, I need to change the name of the file or delete the newly created file and then run it again.
That is, of course, because the file then has a different name. You could approach this by keeping no other files in that folder, or by always prefixing the filename in the same way, so that you can find the file even though its name keeps changing.
Also, if I run it inside a folder, it always creates the new file outside the folder (one directory up from the executing program).
That is because you move it from ./aa to ./. You can take the path of the file and reuse it, apart from the filename, and then the copy will be in the same folder.

Hey TheKaushikGoswami,
I believe your code does exactly what you told it to and, as everybody before me stated, it surely only works once. :)
I would like to throw in another idea:
First off, I personally believe that shutil.move is more of a method for actually moving a file into another directory, as you did by accident.
https://docs.python.org/3/library/shutil.html#shutil.move
So why not first parameterize your folder (which makes GUI/command-line access easier), then copy to a temporary file, and then copy back from that temporary file? That way you won't get caught by errors raised when you try to create files that already exist, and you have easy-to-understand code!
Like so:
import shutil
import os
try:
    os.mkdir('./aa/')
except FileExistsError:
    print('Folder already exists!')
dest= './aa/'
file= 'test.py'
copypath = dest + 'tmp' + file
srcpath = dest + file
shutil.copy2(srcpath, copypath, follow_symlinks=True)
os.remove(srcpath)
shutil.copy2(copypath, srcpath, follow_symlinks=True)
os.remove(copypath)
But may I ask what your use case is, since as far as I can tell it doesn't change anything other than creating an exact copy of the same file?

os.listdir() adds characters to the beginning of file name?

I had a quick google of this but couldn't find anything. I'm using os to get a list of all the file names in the current working directory using the following code:
path = os.getcwd()
files = os.listdir(path)
The list of files returns fine, but the last element has an extra '~$' that isn't in the actual file name. For example:
files
['File1.xlsx', 'File2.xlsx', '~$File3.xlsx']
This is then causing an issue when I iterate through these files to try and import them, as I get the error of:
[Errno 2] No such file or directory: 'C:\\Users\\$File3.xlsx'
If anyone knows why this happens and how I can fix/prevent it, that would be great!

Just thought I'd answer in case anyone else has this issue.
It's nothing to do with os. It happened because I had File3 open in Excel while pulling the list of file names. I've found out that opening a Microsoft document creates a temporary 'lock' file, denoted by a '~$' prefix (this is how it can re-open unsaved data if it crashes, etc.).
I found the below from here:
"The files you are describing are so-called owner files (sometimes referred to as "lock" files). An owner file is created when you work with a document ... and it should be deleted when you save your document and exit."
There's also a SO question about this within Microsoft files, which can be found here
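To keep these owner files out of the import loop, the listing can be filtered on the '~$' prefix. A minimal sketch; the function name list_real_files is hypothetical:

```python
import os


def list_real_files(path):
    """Return the names in path, skipping Office owner ('lock') files."""
    return sorted(
        name for name in os.listdir(path)
        # Owner files are named after the open document with a '~$' prefix.
        if not name.startswith("~$")
    )
```

Closing the workbook in Excel before running the script should also make the extra entry disappear, since the owner file is normally deleted on exit.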

Why Does a Strange File Show Up in Directory When Using os.walk()?

The project is written in Pycharm on Windows 10.
I wrote a program that grabs .docx files from a directory and searches for information. At the end of the list of file names I get this file: "~$640188.docx"
I get this error when it hits this file:
raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file
This error happens when I pass the file '~$640188.docx' to the docx2txt process method:
text = docx2txt.process(r'C:\path\to\folder\~$640188.docx')
From what I can see, this file does not exist in the directory I'm searching nor anywhere on my computer. The other strange part is that yesterday I wasn't getting this error.
I know there are sometimes "hidden" files in directories and I ran into those before on my mac (specifically '.DS_Store') but this is a .docx file.
I currently have an ugly solution, which says "don't run the code if you run into '~$640188.docx'". My concern is that this will become more of a problem when I dump 11000 files into the directory.
Where does this file come from?
Below is the code for reference
import docx2txt
import os
check_files = []
for dir, subdir, files in os.walk(r'C:\path\to\folder'):
    for file in files:
        check_files.append(file)
for file in check_files:
    print("file: {0}".format(file))
    text = docx2txt.process(r'C:\path\to\folder\{0}'.format(file))

Hidden .docx files starting with ~$ are simply temporary files created by Word while a file is actively open and being edited; the first two characters of the respective parent file's name are replaced with ~$. They are usually deleted once you save and close a document, but sometimes they manage to stick around after you quit anyway. Since they are designed to be temporary complements to a proper .docx file, they do not necessarily have the correct zip package structure at all times.
You will do well to skip those. Checking if the file name starts with '~' should be good enough. Just add the following filtering:
check_files2 = [fl for fl in check_files if fl[0] != '~']
for file in check_files2:
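The same filter can also be applied while walking the tree. The sketch below is one way to do it, assuming only .docx files are of interest; the function name iter_docx_paths is hypothetical. It also joins each name with the directory os.walk found it in, so files in subfolders resolve to the right path.

```python
import os


def iter_docx_paths(root):
    """Yield full paths of .docx files under root, skipping '~$' owner files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("~$") or not name.lower().endswith(".docx"):
                continue
            # Join with dirpath so files in subfolders resolve correctly.
            yield os.path.join(dirpath, name)
```

Each yielded path could then be passed to docx2txt.process in place of the formatted string.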

python zipfile.ZipFile() method generates 20G zip file from 6M original

I am running a Python program (v2.7) which zips output so that it can be emailed.
Usually this works as expected, but occasionally the zipped file is so huge that the machine runs out of disk space. Yet when I zip the file manually using the Finder, it works fine.
In this case, the 6MB file gets zipped down to a 1.6MB file using the Finder, but the Python zip method generated a 20GB file. Here is the code where the zipping is happening:
zip = zipfile.ZipFile(zipfilename,"w",zipfile.ZIP_DEFLATED)
for f in os.listdir("."):
    if fnmatch.fnmatch(f,"*final*"):
        zip.write(f)
zip.close()
Is there a way to fix this or at least avoid generating a gigantic file?

Are you perhaps creating that zip file in the same directory, so that the program then tries to add the zip file itself to the archive?
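If that is the cause, one way to guard against it is to skip the output archive explicitly while looping. This is a sketch under that assumption, not a confirmed diagnosis; the function name zip_matching and its parameters are hypothetical.

```python
import fnmatch
import os
import zipfile


def zip_matching(directory, zip_path, pattern="*final*"):
    """Zip files in directory matching pattern, never adding the archive itself."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(os.listdir(directory)):
            full = os.path.join(directory, name)
            # Skip the output archive (it may match the pattern) and non-files.
            if os.path.abspath(full) == zip_abs or not os.path.isfile(full):
                continue
            if fnmatch.fnmatch(name, pattern):
                archive.write(full, arcname=name)
    return zip_path
```

Writing the archive to a different directory, or giving it a name that cannot match the pattern, would avoid the problem as well.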

Is this on Linux? I think you may be including hidden files and folders.
