So I'm trying to write a script that allows me to import the most recently modified file from a directory. I've looked at the glob and os.listdir functions, but they don't seem to do it (I get errors). Any thoughts?
import os
import glob
newest = max(glob.iglob('Directory Name'), key=os.path.getctime)
print(newest)
f = open(newest, 'r')
I get an error:
max() arg is an empty sequence
Would something like os.stat work better?
Try:
newest = max(glob.iglob('Directory Name/*'), key=os.path.getctime)
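For context, here is a minimal sketch of the corrected pattern used end to end; the directory name is still a placeholder, and the empty-directory guard is an addition to avoid the ValueError mentioned above:

import glob
import os

# the trailing /* is what makes glob return the files inside the directory
files = glob.glob('Directory Name/*')

if files:  # guard against "max() arg is an empty sequence" when nothing matches
    newest = max(files, key=os.path.getctime)
    print(newest)
    with open(newest, 'r') as f:
        contents = f.read()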
I am trying to grab all of the mp3 files in my Downloads directory (after procedurally downloading them) and move them to a new folder. However, any time I try to use glob to grab a list of the available .mp3 files, I have to glob twice for it to work properly (the first time it runs it returns an empty list). Does anyone know what I am doing wrong here?
import glob
import os
import shutil
newpath = r'localpath/MP3s'
if not os.path.exists(newpath):
    os.makedirs(newpath)

list_of_files = glob.glob('localpath/Downloads/*.mp3')
for i in list_of_files:
    shutil.move(i, newpath)
This turned out to be a timing issue. The files I was trying to access were still in the process of downloading, which is why the glob was returning empty. I inserted a time.sleep(5) before the glob, and it is now running smoothly.
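For illustration, a minimal sketch of where that sleep would go, using the same placeholder paths as the code above:

import glob
import shutil
import time

time.sleep(5)  # give the downloads a moment to finish before globbing

list_of_files = glob.glob('localpath/Downloads/*.mp3')
for i in list_of_files:
    shutil.move(i, 'localpath/MP3s')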
May I suggest an alternate approach:
from pathlib import Path
from shutil import move
music = Path("./soundtrack")
# absolute paths work too
newMusic = Path("./newsoundtrack")

# create the folder specified in newMusic (exist_ok avoids an error if it already exists)
newMusic.mkdir(exist_ok=True)

# selects everything in the source folder
list_of_files = music.glob("*")
# if you want to select only mp3s, do:
# list_of_files = music.glob("*.mp3")

for file in list_of_files:
    move(str(file), str(newMusic))
I am trying to get the file name of the latest file in a directory which has a couple hundred files, on a network drive.
Basically the idea is to snip the file name (it's the date/time the file was downloaded, e.g. xyz201912191455.csv) and paste it into a config file every time the script is run.
Now list_of_files usually runs in about a second, but latest_file takes about 100 seconds, which is extremely slow.
Is there a faster way to extract the information about the latest file?
The code sample is below:
import os
import glob
import time
from configparser import ConfigParser
import configparser
list_of_files = glob.glob('filepath\*', recursive=True)
latest_file = max(list_of_files, key=os.path.getctime)
list_of_files2 = glob.glob('filepath\*', recursive=True)
latest_file2 = max(list_of_files2, key=os.path.getctime)
If the filenames already include the datetime, why bother getting their stat information? And if the names are like xyz201912191455.csv, one can use [-16:-4] to extract 201912191455, and as these are zero-padded they will sort lexicographically in numerical order. Also, recursive=True is not needed here, as the pattern does not have a ** in it.
list_of_files = glob.glob('filepath\*')
latest_file = max(list_of_files, key=lambda n: n[-16:-4])
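Since the question also mentions pasting the name into a config file, here is a rough sketch of how that could look with configparser; the settings.ini file name and the section/option names are assumptions for illustration, not from the original post:

import glob
from configparser import ConfigParser

list_of_files = glob.glob('filepath\*')
latest_file = max(list_of_files, key=lambda n: n[-16:-4])

config = ConfigParser()
config.read('settings.ini')  # assumed config file name
if not config.has_section('downloads'):
    config.add_section('downloads')  # assumed section name
config.set('downloads', 'latest_file', latest_file)
with open('settings.ini', 'w') as configfile:
    config.write(configfile)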
I have tried this solution:
How to get the latest file in a folder using python
The code I tried is:
import glob
import os
list_of_files = glob.glob('/path/to/folder/**/*.csv')
latest_file = max(list_of_files, key=os.path.getctime)
print (latest_file)
The output I received was based on the timestamps Windows keeps for the files.
But I maintain a separate log of the files written to each sub-folder.
When I opened that log, I saw that the last updated file was not the one the Python code reported.
I was shocked, as my complete process depends on the last file written.
Kindly let me know what I can do to get the last updated file through Python.
I want to read the file which was updated last, but as Windows does not seem to keep the last-modified time the way I expect, I am not seeing any other way out.
Does anyone have another way to look at it?
On Linux, os.path.getctime() returns the time of the file's last change (its ctime, which in practice usually tracks modification), but on Windows it returns the creation time. You need to use os.path.getmtime to get the modification time on Windows.
import glob
import os
# note: pass recursive=True if you need ** to match sub-folders at any depth
list_of_files = glob.glob('/path/to/folder/**/*.csv')
latest_file = max(list_of_files, key=os.path.getmtime)
print(latest_file)
This code should work for you.
os.path.getctime is the creation time of the file on Windows - it seems you want os.path.getmtime, which is the modification time of the file, so, try:
latest_file = max(list_of_files, key=os.path.getmtime)
and see if that does what you want.
I have some code that will find the newest file in a directory and append a time stamp to the file name. It works great as long as there is a file in the directory to rename. If there isn't, I am getting:
"ValueError: max() arg is an empty sequence"
Here's my code:
import os
import glob
import datetime
now = datetime.datetime.now()
append = now.strftime("%H%M%S")
newest = max(glob.iglob('1234_fileName*.LOG'), key=os.path.getmtime)
newfile = append + "_" + newest
os.rename(newest, newfile)
Any suggestions for simplifying the code would be appreciated as well as explaining how to only run if a "1234_fileName*.LOG" (note the wildcard) file is detected.
What I need this program to do is run periodically (I can use Task Scheduler for that) and check for a new file. If there is a new file, append the hours, minutes and seconds to its name.
Thanks!
You could use glob.glob() that returns a list instead of glob.iglob() that returns an iterator:
files = glob.glob('1234_fileName*.LOG')
if files:
    newest = max(files, key=os.path.getmtime)
    newfile = append + "_" + newest
    os.rename(newest, newfile)
Both glob() and iglob() use os.listdir() under the hood, so there is no difference for a single directory.
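A quick illustration of why the list matters for the emptiness check: an iterator returned by iglob() is always truthy, so it cannot be tested with a plain if, while the list from glob() can:

import glob

files_iter = glob.iglob('1234_fileName*.LOG')
files_list = glob.glob('1234_fileName*.LOG')

print(bool(files_iter))  # True even when nothing matches - generator objects are always truthy
print(bool(files_list))  # False when nothing matches, so "if files:" works as intended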
max() is complaining that you're asking for the largest of 0 items, and throwing a ValueError. You'll have to catch it. This will continue to throw any IOErrors that might occur:
import os, glob, datetime
try:
    app = datetime.datetime.now().strftime("%H%M%S")
    newest = max(glob.iglob("1234_filename*.LOG"), key=os.path.getmtime)
    newfile = app + "_" + newest
    os.rename(newest, newfile)
except ValueError:
    pass
os.access allows you to check access rights before file operations; an example is given in its documentation.
Also, it's fine to just do things inside a try .. except IOError.
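For illustration, a minimal sketch of checking access before the rename; the file name here is only a placeholder:

import os

filename = "1234_fileName_example.LOG"  # placeholder name

# os.access checks whether the current process has the requested permissions
if os.access(filename, os.R_OK | os.W_OK):
    os.rename(filename, "renamed_" + filename)
else:
    print("no read/write access to", filename)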
Within my script it's very rare that I run into this problem where I'm trying to move a file into a new folder that already happens to have a file with the same name, but it just happened. My current code uses the shutil.move method, but it errors out with the duplicate file names. I was hoping I could use a simple if statement to check whether the source is already in the destination and change the name slightly, but I can't get that to work either. I also read another post on here that used the distutils module for this issue, but that one gives me an attribute error. Any other ideas people may have for this?
I added some sample code below. There is already a file called 'file.txt' in the 'C:\data\new' directory. The error given is: Destination path already exists.
import shutil
myfile = r"C:\data\file.txt"
newpath = r"C:\data\new"
shutil.move(myfile, newpath)
You can just check that the file exists with os.path.exists and then remove it if it does.
import os
import shutil
myfile = r"C:\data\file.txt"
newpath = r"C:\data\new"
# build the path the file would occupy after the move and check whether it already exists
check_existence = os.path.join(newpath, os.path.basename(myfile))
if os.path.exists(check_existence):
    os.remove(check_existence)

shutil.move(myfile, newpath)
In Python 3.4 you can try the pathlib module. This is just an example so you can rewrite this to be more efficient/use variables:
import pathlib
import shutil
myfile = r"C:\data\file.txt"
newpath = r"C:\data\new"
# check the path the file would have inside the destination folder
p = pathlib.Path(newpath) / pathlib.Path(myfile).name
if not p.exists():
    shutil.move(myfile, newpath)
# use an else: here to handle your edge case
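If you would rather keep both files than overwrite or skip, here is a rough sketch of the "change the name slightly" idea from the question; the counter-suffix naming scheme is just an assumption:

import os
import shutil

myfile = r"C:\data\file.txt"
newpath = r"C:\data\new"

name, ext = os.path.splitext(os.path.basename(myfile))
destination = os.path.join(newpath, name + ext)

# append a counter to the name until we find one that is free in the destination
counter = 1
while os.path.exists(destination):
    destination = os.path.join(newpath, "{}_{}{}".format(name, counter, ext))
    counter += 1

shutil.move(myfile, destination)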