Move files that end with .pdf to a selected folder (Python)


I will run my script on my Mac.
My root is '/Users/johnle/Desktop/'.
The purpose of the code is to move a large number of files.
My desktop holds lots of .pdf files. I want to move them to '/Users/johnle/Desktop/PDF'.
So: '/Users/johnle/Desktop/file.pdf' -> '/Users/johnle/Desktop/PDF/'
This is my code in Python:
import os
import shutil
def moveFile(root, number_of_files, to):
    list_of_file = os.listdir(root)
    list_of_file.sort()
    for file in list_of_file:
        name = root + str(file)
        dest = to + str(file)
        shutil.move(name, dest)

You can use the glob and shutil modules. For example:
import glob
import shutil
for f in glob.glob('/Users/johnle/Desktop/*.pdf'):
    shutil.copy(f, '/Users/johnle/Desktop/PDF')
(This code hasn't been tested.)
Note: my code copies files. If you want to move them, replace shutil.copy with shutil.move.

In case you have .pdf files with inconsistent casing on their extensions (e.g. .PDF, .pdf, .PdF, ...), you can use something like this:
import os
import shutil
SOURCE_DIR = '/Users/johnle/Desktop/'
DEST_DIR = '/Users/johnle/Desktop/PDF/'
for fname in os.listdir(SOURCE_DIR):
    if fname.lower().endswith('.pdf'):
        shutil.move(os.path.join(SOURCE_DIR, fname), DEST_DIR)
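The loop above can be wrapped in a small function. This is only a sketch: the function name move_pdfs, the choice to create the destination folder, and the choice to skip files whose name already exists at the destination are assumptions, not part of the original answer.

```python
from pathlib import Path
import shutil


def move_pdfs(source_dir, dest_dir):
    """Move every regular file ending in .pdf (any casing) from source_dir to dest_dir."""
    source = Path(source_dir)
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() builds the full listing first, so moving files does not disturb iteration.
    for entry in sorted(source.iterdir()):
        # Skip directories (including the destination itself) and non-PDF files.
        if entry.is_file() and entry.suffix.lower() == ".pdf":
            target = dest / entry.name
            if target.exists():
                # Assumption: leave name clashes alone rather than overwrite.
                continue
            shutil.move(str(entry), str(target))
            moved.append(entry.name)
    return moved
```

With the paths from the question, the call would be move_pdfs('/Users/johnle/Desktop', '/Users/johnle/Desktop/PDF'); it returns the names of the files it moved.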

The os module has lots of fun toys like this for manipulating files and other OS-related operations.
You can use the rename function in the os module to move a file to a new location.
import os
os.mkdir(<path>)  # creates a new folder at the specified path
os.rename(<original/current path>, <new path>)
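A concrete version of the two calls above might look like the following. It is a sketch under assumptions: the helper name move_with_rename is made up, and os.makedirs with exist_ok=True is swapped in for os.mkdir so that an already existing folder is not an error. Keep in mind that os.rename can fail when the source and the destination are on different filesystems, which is a case shutil.move handles.

```python
import os


def move_with_rename(src, dest_dir):
    """Move the file src into dest_dir using os.rename, creating dest_dir if needed."""
    # exist_ok=True avoids the FileExistsError that os.mkdir raises
    # when the folder is already there.
    os.makedirs(dest_dir, exist_ok=True)
    # Keep the original file name and only change the containing folder.
    target = os.path.join(dest_dir, os.path.basename(src))
    os.rename(src, target)
    return target
```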

Related

How to modify this script so that all of my files are not deleted when trying to delete files that do not have XML files with them?

I am trying to delete all .JPG files that do not have .xml files with the same name attached to them. However, when I run this script, all of my files are deleted in my directory and not just the desired images. How can I change this script so that I can just delete the images without corresponding .xml files?
Note: The only files I have in the directory are .JPG and .XML
import os
from tqdm import tqdm
path = 'C:\\users\\my_username\\path_to_directory_with_xml_and_jpg_images'
files = os.listdir(path)
for file in tqdm(files):
    filename, filetype = file.split('.')
    if filetype == 'xml':
        continue
    imgfile = os.path.join(path, file)
    xmlfile = os.path.join(path, filename + '.xml')
    if not os.path.exists(xmlfile):
        print('{} deleted.'.format(imgfile))
        os.remove(imgfile)

It's hard to tell why your code doesn't work as we don't know the exact contents of the directory. But a simpler way to do what you want could be to use the amazing pathlib library (Python >= 3.4). The method Path.with_suffix() will make the task quite easy, together with Path.glob():
from pathlib import Path
path = Path('C:\\users\\my_username\\path_to_directory_with_xml_and_jpg_images')
for imgfile in path.glob("*.jpg"):
    xmlfile = imgfile.with_suffix(".xml")
    if not xmlfile.exists():
        imgfile.unlink()
        print(imgfile, 'deleted.')
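Before deleting anything, it can help to list the candidates first. The sketch below is a dry run: it only returns the images that have no XML file with the same name and deletes nothing. The function name find_images_without_xml is hypothetical, and comparing extensions case-insensitively is an assumption based on the question mentioning .JPG and .XML in upper case.

```python
from pathlib import Path


def find_images_without_xml(directory):
    """Return the JPG files in directory that have no XML file with the same stem."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    # Collect the stems of all XML files, matching the extension in any casing.
    xml_stems = {p.stem for p in files if p.suffix.lower() == ".xml"}
    # Path.stem only strips the last extension, so names with extra dots are fine.
    return sorted(
        p for p in files
        if p.suffix.lower() == ".jpg" and p.stem not in xml_stems
    )
```

Once the returned list looks right, each path in it can be removed with its unlink() method.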

Copy pdfs in a directory to folders with the same name as the PDFs

I want to put the PDFs I have in a directory into the folders with the same name. Those folders have already been created and are in the same directory as the PDF files I want to move into them.
I am relatively new to Python and have not gotten very far with the code. Currently, when I run the code below, it only prints the .pdf files and not the subfolders in the directory (that is beside the point, but I am not sure why I can't see the subfolders).
import os
from shutil import copyfile
path_to_files = "C:\\tmp\\all_files_converted\\"
def copy_documents(file_path):
    for f in os.listdir(file_path):
        print(f)
copy_documents(path_to_files)
folders in directory C://tmp//all_files_converted//
pdf files with the same name as the folders in the same directory C://tmp//all_files_converted//

You can use pathlib and shutil to perform this:
from pathlib import Path
from shutil import move
path_to_files = Path(r"C:\tmp\all_files_converted")
for pdf_path in path_to_files.glob("*.pdf"):
    dir_path = path_to_files / pdf_path.stem
    dir_path.mkdir(exist_ok=True)  # the folders already exist, so don't fail on them
    move(pdf_path, dir_path / pdf_path.name)

You can use shutil.move(src, dst)
import shutil
shutil.move(src, dst)

import os
from shutil import copyfile
from glob import glob
path_to_files = "pasta"
def copy_documents(path_to_files):
    # os.path is a module to work with file paths
    # Using the module glob to list all pdf files of a folder
    for file_path in glob(os.path.join(path_to_files, "*.pdf")):
        # basename will return the filename without the rest of the path ie: "something.pdf"
        pdf_file_name = os.path.basename(file_path)
        dest_folder = os.path.join(path_to_files, pdf_file_name[:-4])
        print(f"Copy {file_path} to {dest_folder}")
        copyfile(file_path, os.path.join(dest_folder, pdf_file_name))
copy_documents(path_to_files)
I recommend reading https://docs.python.org/3/library/os.path.html and https://docs.python.org/3/library/glob.html for more info.
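Since the question says the folders already exist, a variant could refuse to create anything and simply report the PDFs that have no matching folder. This is a sketch only; the function name and the moved/skipped return value are assumptions made for illustration.

```python
from pathlib import Path
import shutil


def file_pdfs_into_matching_folders(base_dir):
    """Move each PDF in base_dir into the existing folder that shares its name."""
    base = Path(base_dir)
    moved, skipped = [], []
    # sorted() builds the list up front, so moving files does not affect the loop.
    for pdf in sorted(base.glob("*.pdf")):
        if not pdf.is_file():
            continue
        folder = base / pdf.stem
        if folder.is_dir():
            shutil.move(str(pdf), str(folder / pdf.name))
            moved.append(pdf.name)
        else:
            # Assumption: a PDF without a matching folder is left where it is.
            skipped.append(pdf.name)
    return moved, skipped
```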

Get files from specific folders in python

I have the following directory structure with the following files:
Folder_One
├─file1.txt
├─file1.doc
└─file2.txt
Folder_Two
├─file2.txt
├─file2.doc
└─file3.txt
I would like to get only the .txt files from each folder listed. Example:
Folder_One-> file1.txt and file2.txt
Folder_Two-> file2.txt and file3.txt
Note: This entire directory is inside a folder called dataset. My code looks like this, but I believe something is missing. Can someone help me?
path_dataset = "./dataset/"
filedataset = os.listdir(path_dataset)
for i in filedataset:
    pasta = ''
    pasta = pasta.join(i)
    for file in glob.glob(path_dataset+"*.txt"):
        print(file)

Using pathlib:
from pathlib import Path
for path in Path('dataset').rglob('*.txt'):
    print(path.name)
Using glob:
import glob
for x in glob.glob('dataset/**/*.txt', recursive=True):
    print(x)

You can use the re module to check that a filename ends with .txt.
import re
import os
path_dataset = "./dataset/"
l = os.listdir(path_dataset)
for e in l:
    if os.path.isdir("./dataset/" + e):
        ll = os.listdir(path_dataset + e)
        for file in ll:
            if re.match(r".*\.txt$", file):
                print(e + '->' + file)

Another option is to find all the files with the os module (an advantage if you already use this module):
import os
# get the current directory; you may also provide an absolute path
path = os.getcwd()
# walk recursively through all folders and gather information
for root, dirs, files in os.walk(path):
    # keep only the files whose name ends with .txt
    check = [f for f in files if f.endswith(".txt")]
    if check:
        print(root, check)
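To get the grouping asked for in the question (folder name mapped to its .txt files), the results can be collected into a dictionary. This is a sketch assuming the layout from the question, with the .txt files exactly one level below dataset; the function name txt_files_by_folder is made up.

```python
from collections import defaultdict
from pathlib import Path


def txt_files_by_folder(dataset_dir):
    """Map each immediate subfolder of dataset_dir to the .txt file names inside it."""
    grouped = defaultdict(list)
    # "*/*.txt" matches .txt files exactly one folder below dataset_dir.
    for path in sorted(Path(dataset_dir).glob("*/*.txt")):
        grouped[path.parent.name].append(path.name)
    return dict(grouped)
```

Calling txt_files_by_folder("./dataset") returns a dictionary keyed by folder name, which can then be printed in the Folder_One-> ... form.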

Keeping renamed text files in original folder

This is my current code (from a Jupyter notebook) for renaming some text files.
The issue is that when I run the code, the renamed files are placed in my current working Jupyter folder. I would like the files to stay in the original folder.
import glob
import os
path = 'C:\data_research\text_test\*.txt'
files = glob.glob(r'C:\data_research\text_test\*.txt')
for file in files:
    os.rename(file, file[-27:])

You should only change the name and keep the path the same. Your filenames will not always be longer than 27 characters, so hard-coding that number into your code is not ideal. What you want is something that separates the name from the path, whatever the name and whatever the path. Something like:
import os
import glob
path = r'C:\data_research\text_test\*.txt'  # raw string, so \t is not read as a tab
files = glob.glob(path)
for file in files:
    old_name = os.path.basename(file)  # now this is just the name of your file
    # now you can do something with the name... here I'll just add new_ to it.
    new_name = 'new_' + old_name  # or do something else with it
    new_file = os.path.join(os.path.dirname(file), new_name)  # now we put the path and the name together again
    os.rename(file, new_file)  # and now we rename.
On Windows, os.path already uses ntpath under the hood, so the same code works there.

file[-27:] takes the last 27 characters of the path, so unless all of your filenames are exactly 27 characters long, it will not give you the name you want. When it does succeed, you've stripped off the target directory name, so the file is moved to your current directory. os.path has utilities to manage file names and you should use them:
import glob
import os
path = r'C:\data_research\text_test\*.txt'
files = glob.glob(path)
for file in files:
    dirname, basename = os.path.split(file)
    # I don't know how you want to rename so I made something up
    newname = basename + '.bak'
    os.rename(file, os.path.join(dirname, newname))
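The same idea can be written with pathlib, where Path.with_name swaps the file name and keeps the folder. This is a sketch; the helper name rename_in_place is hypothetical, and how the new name is derived is left to the caller.

```python
from pathlib import Path


def rename_in_place(file_path, new_name):
    """Rename a file while keeping it in its original folder."""
    path = Path(file_path)
    # with_name replaces only the final component, so the parent folder is unchanged.
    target = path.with_name(new_name)
    path.rename(target)
    return target
```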

Move child folder contents to parent folder in python

I have a specific problem in python. Below is my folder structure.
dstfolder/slave1/slave
I want the contents of the 'slave' folder to be moved to 'slave1' (the parent folder). Once they are moved,
the 'slave' folder should be deleted. shutil.move does not seem to help.
Please let me know how to do it.

Example using the os and shutil modules:
from os.path import join
from os import listdir, rmdir
from shutil import move
root = 'dstfolder/slave1'
for filename in listdir(join(root, 'slave')):
    move(join(root, 'slave', filename), join(root, filename))
rmdir(join(root, 'slave'))

I needed something a little more generic, i.e. move all the files from all the [sub]+folders into the root folder.
For example start with:
root_folder
|----test1.txt
|----1
     |----test2.txt
|----2
     |----test3.txt
And end up with:
root_folder
|----test1.txt
|----test2.txt
|----test3.txt
A quick recursive function does the trick:
import os, shutil, sys
def move_to_root_folder(root_path, cur_path):
    for filename in os.listdir(cur_path):
        if os.path.isfile(os.path.join(cur_path, filename)):
            shutil.move(os.path.join(cur_path, filename), os.path.join(root_path, filename))
        elif os.path.isdir(os.path.join(cur_path, filename)):
            move_to_root_folder(root_path, os.path.join(cur_path, filename))
        else:
            sys.exit("Should never reach here.")
    # remove empty folders
    if cur_path != root_path:
        os.rmdir(cur_path)
You will usually call it with the same argument for root_path and cur_path, e.g. move_to_root_folder(os.getcwd(), os.getcwd()) if you want to try it in the Python environment.
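A non-recursive variant of the same flattening can be sketched with os.walk. The function name flatten and the decision to raise an error on a name clash instead of overwriting are assumptions, not part of the answer above. If a clash is hit, the files moved before it stay moved.

```python
import os
import shutil


def flatten(root_path):
    """Move every file below root_path into root_path, then remove the emptied folders."""
    root_abs = os.path.abspath(root_path)
    # topdown=False visits the deepest folders first, so each folder has
    # no subfolders left by the time it is removed.
    for current, _dirs, files in os.walk(root_abs, topdown=False):
        if current == root_abs:
            continue
        for name in files:
            target = os.path.join(root_abs, name)
            if os.path.exists(target):
                # Assumption: stop rather than overwrite a file with the same name.
                raise FileExistsError(f"{target} already exists")
            shutil.move(os.path.join(current, name), target)
        os.rmdir(current)
```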

The problem might be with the path you specified in the shutil.move function.
Try this code:
import os
import shutil
for r, d, f in os.walk("slave1"):
    for files in f:
        filepath = os.path.join(os.getcwd(), "slave1", "slave", files)
        destpath = os.path.join(os.getcwd(), "slave1")
        shutil.copy(filepath, destpath)
shutil.rmtree(os.path.join(os.getcwd(), "slave1", "slave"))
Paste it into a .py file in dstfolder, so that slave1 and this file sit side by side, and then run it. It worked for me.

Use this if files share the same name; the new file names will have the folder names joined by '_'.
import shutil
import os
source = 'path to folder'
def recursive_copy(path):
    for f in sorted(os.listdir(os.path.join(os.getcwd(), path))):
        file = os.path.join(path, f)
        if os.path.isfile(file):
            temp = os.path.split(path)
            f_name = '_'.join(temp)
            file_name = f_name + '_' + f
            shutil.move(file, file_name)
        else:
            recursive_copy(file)
recursive_copy(source)

Maybe you could change into the slave directory, and then
execute system('mv .........')
That would work, wouldn't it?
