Programmatically moving files in Python

I'm trying to simply move files from folder path1 to folder fpath.
import os
import shutil
path1 = '/home/user/Downloads'
file_dir = os.listdir(path1)
fpath = '/home/user/music'
for file in file_dir:
    if file.endswith('.mp3'):
        shutil.move(os.path.join(file_dir,file), os.path.join(fpath, file))
... but I get this error
TypeError: expected str, bytes or os.PathLike object, not list

First of all, avoid file as a variable name: it was a builtin in Python 2 and still reads like one, so consider using f instead.
Also notice that in the shutil.move line I've changed your os.path.join(file_dir, f) to os.path.join(path1, f). file_dir is a list, not the name of the directory you're looking for; that value is stored in your path1 variable.
Altogether, it looks like this:
import os
import shutil
path1 = '/home/user/Downloads'
file_dir = os.listdir(path1)
fpath = '/home/user/music'
for f in file_dir:
    if f.endswith('.mp3'):
        shutil.move(os.path.join(path1, f), os.path.join(fpath, f))

You have confused your variable purposes from one line to the next. You've also over-built your file path construction.
You set up file_dir as a list of all the files in path1. That works fine through your for loop, where you iterate through that list. shutil.move requires two paths, each a simple string (or path-like object). Look at how you construct your source path:
os.path.join(file_dir,file)
Remember, file_dir is a list of the files in path1, and file is one of the entries in that list. What are you trying to do here? Do you perhaps mean to join path1 with file?
NOTE: Using pre-defined names as variables is bad practice. file was a built-in type in Python 2. Instead, use f or local_file, perhaps.

Read the error message carefully. file_dir is a list, so you cannot pass it to os.path.join. You probably want to write:
shutil.move(os.path.join(path1, f), os.path.join(fpath, f))
I suggest giving variables meaningful names, like:
file_list = os.listdir(path1)
This way you will not join a file list with a path :)
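For comparison, here is a minimal sketch of the same move written with pathlib. The function name move_by_suffix and its parameters are hypothetical, and creating the target folder when it is missing is an assumption that the question does not ask for.

```python
from pathlib import Path
import shutil


def move_by_suffix(src_dir, dst_dir, suffix=".mp3"):
    """Move every file in src_dir whose name ends with suffix into dst_dir."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # assumption: create the target if missing
    moved = []
    # sorted() reads the whole listing first, so files are not moved
    # out of the directory while it is still being iterated.
    for entry in sorted(src.iterdir()):
        if entry.is_file() and entry.name.endswith(suffix):
            target = dst / entry.name
            shutil.move(str(entry), str(target))
            moved.append(target.name)
    return moved
```

Calling move_by_suffix('/home/user/Downloads', '/home/user/music') would then return the list of names that were moved.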

Related

Rename the filename using python

I have a folder containing multiple files, and I want to rename some of them. For example: PB report December21 North.xlsb, PB report November21 North.xlsb and so on. They all start the same way: PB report. I would like to rename them to keep only PB report and the month, for example PB report December.
I have tried this code:
import os
path = r'C://Users//greencolor//Desktop//Autoreport//Load_attachments//'
for filename in os.listdir(path):
    if filename.startswith("PB report"):
        os.rename(filename, filename[:-8])
The -8 means I want to cut off the last 8 characters of the name.
I get this error:
FileNotFoundError: [WinError 2] The system cannot find the file specified
Any suggestion?
You need the full path when renaming a file with os.rename.
Replace:
os.rename(filename, filename[:-8])
With:
filename_part, extension = os.path.splitext(filename)
os.rename(path+filename, path+filename_part[:-8]+extension)
The problem is likely that the file cannot be found because the directory is not specified. You need to add the path to the file name, and slice the name before the extension so that .xlsb is kept:
import os
path = r'C://Users//greencolor//Desktop//Autoreport//Load_attachments//'
for filename in os.listdir(path):
    if filename.startswith("PB report"):
        stem, ext = os.path.splitext(filename)
        os.rename(os.path.join(path, filename), os.path.join(path, stem[:-8] + ext))
This is a classic example of how manipulating paths with os/os.path is just not convenient. This is why pathlib exists. By treating paths as objects rather than strings, everything becomes more sensible. By combining path.iterdir() and path.rename() you can achieve what you want like this:
from pathlib import Path
path = Path(r'your path')
for file in path.iterdir():
    if file.name.startswith("PB report"):
        file.rename(file.with_stem(file.stem[:-8]))
Note that stem means the filename without the extension and that with_stem was added in Python 3.9. So for older versions you can still use with_name:
file.rename(file.with_name(file.stem[:-8] + file.suffix))
Where suffix is the extension of the file.
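Before renaming anything on disk, it can help to check the slicing on the name alone. The sketch below does that with PurePath, which never touches the file system. The helper name trimmed_name is hypothetical, and the default of 8 characters comes from the "21 North" tail in the question's example names.

```python
from pathlib import PurePath


def trimmed_name(filename, n_chars=8):
    """Return filename with the last n_chars of its stem removed, extension kept."""
    p = PurePath(filename)
    return p.stem[:-n_chars] + p.suffix


print(trimmed_name("PB report December21 North.xlsb"))  # PB report December.xlsb
```

Once the printed names look right, the same expression can be passed to with_name in the loop above.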

Find efficiently a file with unknown extension

I have a problem that feels easy, but I cannot come up with a satisfying solution.
I have a file structure with a directory containing a very large number of files. The file names are just their index with an unknown extension. For example, the 10th file is "10.pdf" and the 42nd file is "42.png". There can be many different extensions.
I need to access the i-th file from python, given index i but not knowing the extension. This will happen a lot, so I should be able to do it efficiently.
Here are the partial solutions I could think about:
I can glob the pattern f"{i}.*"
However, I think glob will check every file in the directory? This will be very slow for a large number of files.
I can save and preload the full name in a dict, in a JSON file like {..., 10: "10.pdf", ...}
This works, but I have to load and keep track of another heavy object. This feels wrong somehow...
If I have a list of all allowed extensions, I can just test all possibilities. This feels weird and unnecessary, but that's my best guess for now.
What do you think? Is one of these proposals the correct way to do it?
As I understand it, you only need the file name without the extension. So one way is to split the extension off each file name, for example:
import os
path = r"Enter your folder's path here"
file_dict = {}
for file in os.listdir(path):
    full_path = os.path.join(path, file)  # os.listdir returns bare names, not paths
    if os.path.isfile(full_path):  # os.listdir returns both files and folders
        file_name, ext = os.path.splitext(file)
        print(file_name, ext)
For example, if your file is '10.pdf' then file_name='10' and ext='.pdf'. Then, inside the loop, you can add it to a dictionary for later lookups:
file_dict[file_name] = os.path.join(path, file)
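The dictionary idea can be packaged as a function that scans the directory once and is then queried many times. This is only a sketch: build_index is a hypothetical name, and using os.scandir instead of os.listdir is my substitution, chosen because each entry can usually report whether it is a file without a separate os.path.isfile call.

```python
import os


def build_index(directory):
    """Map each file's stem (name without extension) to its full path."""
    index = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():  # skip sub-folders
                stem, _ext = os.path.splitext(entry.name)
                index[stem] = entry.path
    return index
```

After index = build_index(path), looking up index["10"] is a single dictionary access, regardless of how many files the folder holds. If two files share a stem, the one scanned last wins.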
Another way is to use regular expressions via the re module. If your names follow a pattern (even a complex one), re works well. You write your desired pattern, for example:
import os
import re
path = r"Enter your folder's path here"
file_dict = {}
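for file in os.listdir(path):
    if os.path.isfile(os.path.join(path, file)):
        # greedy (.*) takes everything up to the last dot; the rest is the extension
        mo = re.search(r'(.*)\.(.*)', file)
        if mo:  # names without a dot do not match
            file_name, ext = mo.groups()
            print(file_name, ext)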

Recursively find and copy files from many folders

I have a list of file names that I want to search for recursively in many folders.
An example of the filename list is ['A_010720_X.txt','B_120720_Y.txt']
An example of the folder structure is below; I can also provide it as lists, e.g. ['A','B'] and ['2020-07-01','2020-07-12']. The "DL" part is the same for all.
C:\A\2020-07-01\DL
C:\B\2020-07-12\DL
etc
I have tried shutil, but it doesn't work well for my requirement because I can only pass in a full file name, not a wildcard. The code below works, but only with an absolute path and full file name and no wildcards, so it only gives me A_010720_X.txt.
I believe the way to go would be glob or pathlib, which I have not used before, and I cannot find good examples similar to my use case.
import os
import shutil
filenames_i_want = ['A_010720_X.txt','B_120720_Y.txt']
RootDir1 = r'C:\A\2020-07-01\DL'
TargetFolder = r'C:\ELK\LOGS\ATH\DEST'
for root, dirs, files in os.walk(os.path.normpath(RootDir1), topdown=False):
    for name in files:
        if name in filenames_i_want:
            print("Found")
            SourceFolder = os.path.join(root, name)
            shutil.copy2(SourceFolder, TargetFolder)
I think this should do what you need, assuming they are all .txt files.
import glob
import os
import shutil
filenames_i_want = ['A_010720_X.txt','B_120720_Y.txt']
TargetFolder = r'C:\ELK\LOGS\ATH\DEST'
all_files = []
for directory in ['A', 'B']:
    # raw string so the backslashes are not treated as escape sequences
    files = glob.glob(r'C:\{}\*\DL\*.txt'.format(directory))
    all_files.extend(files)  # extend keeps one flat list of paths
for file in all_files:
    # glob returns full paths, so compare on the base name
    if os.path.basename(file) in filenames_i_want:
        shutil.copy2(file, TargetFolder)
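Since the question also mentions pathlib, here is a rough equivalent using Path.rglob. The function name copy_wanted and its parameters are hypothetical, and the sketch assumes the target folder lies outside the searched roots so that copies are not picked up again.

```python
from pathlib import Path
import shutil


def copy_wanted(root_dirs, wanted_names, target_dir):
    """Copy files whose base name is in wanted_names from under each root into target_dir."""
    wanted = set(wanted_names)  # set membership is faster than scanning a list
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for root in root_dirs:
        # rglob("*") walks the whole tree below root
        for candidate in Path(root).rglob("*"):
            if candidate.is_file() and candidate.name in wanted:
                shutil.copy2(candidate, target / candidate.name)
                copied.append(candidate.name)
    return sorted(copied)
```

With the folders from the question this could be called as copy_wanted([r'C:\A', r'C:\B'], filenames_i_want, TargetFolder). Narrowing the pattern, for example to rglob("*.txt"), would reduce the number of entries that are inspected.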

Removing a period from several file names using Python

I've found several related posts on this, but when I try the suggested code I keep getting "The system cannot find the file specified". I imagine it's some kind of path problem. There are several folders within the "Cust" folder, each of those folders has several files, and some of the file names contain a "." that I need to remove. Any idea what I have wrong here?
customer_folders_path = r"C:\Users\All\Documents\Cust"
for directname, directnames, files in os.walk(customer_folders_path):
    for file in files:
        filename_split = os.path.splitext(file)
        filename_zero = filename_split[0]
        if "." in filename_zero:
            os.rename(filename_zero, filename_zero.replace(".", ""))
When you use os.walk and then iterate through the files, remember that you are only iterating through file names - not the full path (which is what is needed by os.rename in order to function properly). You can adjust by adding the full path to the file itself, which in your case would be represented by joining directname and filename_zero together using os.path.join:
os.rename(os.path.join(directname, filename_zero),
os.path.join(directname, filename_zero.replace(".", "")))
Also, not sure if you use it elsewhere, but you could remove your filename_split variable and define filename_zero as filename_zero = os.path.splitext(file)[0], which does the same thing. Your raw string r"C:\Users\All\Documents\Cust" is already valid; writing it with forward slashes as "C:/Users/All/Documents/Cust" is an equivalent alternative that Python also accepts on Windows.
EDIT: As intelligently pointed out by @bozdoz, when you split off the suffix you lose the original file name, so the file can't be found. Here is an example that should work in your situation:
import os
customer_folders_path = "C:/Users/All/Documents/Cust"
for directname, directnames, files in os.walk(customer_folders_path):
    for f in files:
        # Split the file into the filename and the extension, saving
        # as separate variables
        filename, ext = os.path.splitext(f)
        if "." in filename:
            # If a '.' is in the name, rename, appending the suffix
            # to the new file
            new_name = filename.replace(".", "")
            os.rename(
                os.path.join(directname, f),
                os.path.join(directname, new_name + ext))
You need to use the original filename as the first parameter to os.rename, and skip files whose name has no period apart from the extension. How about:
customer_folders_path = r"C:\Users\All\Documents\Cust"
for directname, directnames, files in os.walk(customer_folders_path):
    for fn in files:
        stem, ext = os.path.splitext(fn)
        if '.' in stem:
            fullname = os.path.join(directname, fn)
            os.rename(fullname, os.path.join(directname, stem.replace('.', '') + ext))
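To preview the result before touching any files, the name transformation can be separated from the renaming. The helper below is a hypothetical sketch (the name without_inner_periods is mine); it only computes the new name and does not call os.rename.

```python
import os


def without_inner_periods(name):
    """Remove periods from the stem of name, keeping the final extension."""
    stem, ext = os.path.splitext(name)
    return stem.replace(".", "") + ext


for sample in ["report.v2.final.txt", "notes.txt", "README"]:
    print(sample, "->", without_inner_periods(sample))
```

Names whose only period is the one before the extension come back unchanged, so they can be skipped in the os.walk loop by comparing the old and new names.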

Fast way to read filename from directory?

Given a local directory structure of /foo/bar, and assuming that a given path contains exactly one file (filename and content does not matter), what is a reasonably fast way to get the filename of that single file (NOT the file content)?
Take the first element of os.listdir():
import os
os.listdir('/foo/bar')[0]
Well, I know this code works...
for file in os.listdir('.'):
    pass  # do something with file
You can also use glob:
import glob
print(glob.glob("/path/*")[0])
os.path.basename will return the file name for you,
so if you already have the full path of the file you can use it directly:
os.path.basename("/foo/bar/file.file")
Or you can loop through the files in the folder and print all the names:
file_src = "/foo/bar/"
for x in os.listdir(file_src):
    print(os.path.basename(x))
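Another option, sketched here as an assumption rather than a measured result, is os.scandir: it returns a lazy iterator, so taking only the first entry avoids building a full list of names. The function name single_filename is hypothetical.

```python
import os


def single_filename(directory):
    """Return the name of the first entry in directory, or None if it is empty."""
    with os.scandir(directory) as entries:
        first = next(entries, None)  # stop after the first entry
    return first.name if first is not None else None
```

For a directory that really holds exactly one file, single_filename('/foo/bar') returns that file's name; whether it is noticeably faster than os.listdir('/foo/bar')[0] for a single entry would need to be timed.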
