Related
Ok so i'm trying to create a script that does the following: Searches a directory for known hashes. Here is my first script:
Hash.py
import hashlib
from functools import partial
#call another python script
execfile("knownHashes.py")
def md5sum(filename):
with open(filename, mode='rb') as f:
d = hashlib.md5()
for buf in iter(partial(f.read, 128), b''):
d.update(buf)
return d.hexdigest()
print "Hash of is: "
print(md5sum('photo.jpg'))
if md5List == md5sum:
print "Match"
knownHashes.py
print ("Call worked\n")
md5List = "01071709f67193b295beb7eab6e66646" + "5d41402abc4b2a76b9719d911017c592"
The problem at the moment is that I manually have to type in the file I want to find out the hash of where it says photo.jpg. Also, The I haven't got the md5List to work yet.
I want the script to eventually work like this:
python hash.py <directory>
1 match
cookies.jpg matches hash
So how can I get the script to search a directory rather than manually type in what file to hash? Also, how can I fix the md5List because that is wrong?
You can get a list of files in the current working directory using the following. This is the directory that you run the script from.
import os
#Get list of files in working directory
files_list = os.listdir(os.getcwd())
You can iterate through the list using a for loop:
for file in files_list:
#do something
As equinoxel also mentioned below, you can use os.walk() as well.
Simple little gist should solve most of your problems. Understandable if you don't like using OOP for this problem, but I believe all of the important conceptual pieces are here in a pretty clean, concise representation. Let me know if you have any questions.
class PyGrep:
def __init__(self, directory):
self.directory = directory
def grab_all_files_with_ending(self, file_ending):
"""Will return absolute paths to all files with given file ending in self.directory"""
walk_results = os.walk(self.directory)
file_check = lambda walk: len(walk[2]) > 0
ending_prelim = lambda walk: file_ending in " ".join(walk[2])
relevant_results = (entry for entry in walk_results if file_check(entry) and ending_prelim(entry))
return (self.grab_files_from_os_walk(result, file_ending) for result in relevant_results)
def grab_files_from_os_walk(self, os_walk_tuple, file_ending):
format_check = lambda file_name: file_ending in file_name
directory, subfolders, file_paths = os_walk_tuple
return [os.path.join(directory, file_path) for file_path in file_paths if format_check(file_path)]
I need to pass a file path name to a module. How do I build the file path from a directory name, base filename, and a file format string?
The directory may or may not exist at the time of call.
For example:
dir_name='/home/me/dev/my_reports'
base_filename='daily_report'
format = 'pdf'
I need to create a string '/home/me/dev/my_reports/daily_report.pdf'
Concatenating the pieces manually doesn't seem to be a good way. I tried os.path.join:
join(dir_name,base_filename,format)
but it gives
/home/me/dev/my_reports/daily_report/pdf
This works fine:
os.path.join(dir_name, base_filename + '.' + filename_suffix)
Keep in mind that os.path.join() exists only because different operating systems use different path separator characters. It smooths over that difference so cross-platform code doesn't have to be cluttered with special cases for each OS. There is no need to do this for file name "extensions" (see footnote) because they are always preceded by a dot character, on every OS.
If using a function anyway makes you feel better (and you like needlessly complicating your code), you can do this:
os.path.join(dir_name, '.'.join((base_filename, filename_suffix)))
If you prefer to keep your code clean, simply include the dot in the suffix:
suffix = '.pdf'
os.path.join(dir_name, base_filename + suffix)
That approach also happens to be compatible with the suffix conventions in pathlib, which was introduced in python 3.4 a few years after this question was asked. New code that doesn't require backward compatibility can do this:
suffix = '.pdf'
pathlib.PurePath(dir_name, base_filename + suffix)
You might be tempted to use the shorter Path() instead of PurePath() if you're only handling paths for the local OS. I would question that choice, given the cross-platform issues behind the original question.
Warning: Do not use pathlib's with_suffix() for this purpose. That method will corrupt base_filename if it ever contains a dot.
Footnote: Outside of Microsoft operating systems, there is no such thing as a file name "extension". Its presence on Windows comes from MS-DOS and FAT, which borrowed it from CP/M, which has been dead for decades. That dot-plus-three-letters that many of us are accustomed to seeing is just part of the file name on every other modern OS, where it has no built-in meaning.
If you are fortunate enough to be running Python 3.4+, you can use pathlib:
>>> from pathlib import Path
>>> dirname = '/home/reports'
>>> filename = 'daily'
>>> suffix = '.pdf'
>>> Path(dirname, filename).with_suffix(suffix)
PosixPath('/home/reports/daily.pdf')
Um, why not just:
>>> import os
>>> os.path.join(dir_name, base_filename + "." + format)
'/home/me/dev/my_reports/daily_report.pdf'
Is not it better to add the format in the base filename?
dir_name='/home/me/dev/my_reports/'
base_filename='daily_report.pdf'
os.path.join(dir_name, base_filename)
Just use os.path.join to join your path with the filename and extension. Use sys.argv to access arguments passed to the script when executing it:
#!/usr/bin/env python3
# coding: utf-8
# import netCDF4 as nc
import numpy as np
import numpy.ma as ma
import csv as csv
import os.path
import sys
basedir = '/data/reu_data/soil_moisture/'
suffix = 'nc'
def read_fid(filename):
fid = nc.MFDataset(filename,'r')
fid.close()
return fid
def read_var(file, varname):
fid = nc.Dataset(file, 'r')
out = fid.variables[varname][:]
fid.close()
return out
if __name__ == '__main__':
if len(sys.argv) < 2:
print('Please specify a year')
else:
filename = os.path.join(basedir, '.'.join((sys.argv[1], suffix)))
time = read_var(ncf, 'time')
lat = read_var(ncf, 'lat')
lon = read_var(ncf, 'lon')
soil = read_var(ncf, 'soilw')
Simply run the script like:
# on windows-based systems
python script.py year
# on unix-based systems
./script.py year
from pathlib import Path
# Build paths inside the project like this: BASE_DIR / 'subdir'.
BASE_DIR = Path(__file__).resolve().parent.parent
TEMPLATE_PATH = Path.joinpath(BASE_DIR,"templates")
print(TEMPLATE_PATH)
Adding code below for better understanding:
import os
def createfile(name, location, extension):
print(name, extension, location)
#starting creating a file with some dummy contents
path = os.path.join(location, name + '.' + extension)
f = open(path, "a")
f.write("Your contents!! or whatever you want to put inside this file.")
f.close()
print("File creation is successful!!")
def readfile(name, location, extension):
#open and read the file after the appending:
path = os.path.join(location, name + '.' + extension)
f = open(path, "r")
print(f.read())
#pass the parameters here
createfile('test','./','txt')
readfile('test','./','txt')
Let's say I have two paths: the first one (may be file or folder path): file_path, and the second one (may only be a folder path): folder_path. And I want to determine whether an object collocated with file_path is inside of the object collocated with folder_path.
I have an idea of doing this:
import os
...
def is_inside(file_path, folder_path):
full_file_path = os.path.realpath(file_path)
full_folder_path = os.path.realpath(folder_path)
return full_folder_path.startswith(full_file_path)
but I'm afraid there are some pitfalls in this approach. Also I think there must be a prettier way to do this.
The solution must work on Linux but it would be great if you propose me some cross-platform trick.
Use os.path.commonprefix. Here's an example based on your idea.
import os.path as _osp
def is_inside(file_path, folder_path):
full_file_path = _osp.realpath(file_path)
full_folder_path = _osp.realpath(folder_path)
return _osp.commonprefix([full_file_path, full_folder_path]) == \
full_folder_path
Parse the file name from the file path and do
os.path.exists(full_folder_path + '/' + file_name)
I want to change a.txt to b.kml.
Use os.rename:
import os
os.rename('a.txt', 'b.kml')
Usage:
os.rename('from.extension.whatever','to.another.extension')
File may be inside a directory, in that case specify the path:
import os
old_file = os.path.join("directory", "a.txt")
new_file = os.path.join("directory", "b.kml")
os.rename(old_file, new_file)
As of Python 3.4 one can use the pathlib module to solve this.
If you happen to be on an older version, you can use the backported version found here
Let's assume you are not in the root path (just to add a bit of difficulty to it) you want to rename, and have to provide a full path, we can look at this:
some_path = 'a/b/c/the_file.extension'
So, you can take your path and create a Path object out of it:
from pathlib import Path
p = Path(some_path)
Just to provide some information around this object we have now, we can extract things out of it. For example, if for whatever reason we want to rename the file by modifying the filename from the_file to the_file_1, then we can get the filename part:
name_without_extension = p.stem
And still hold the extension in hand as well:
ext = p.suffix
We can perform our modification with a simple string manipulation:
Python 3.6 and greater make use of f-strings!
new_file_name = f"{name_without_extension}_1"
Otherwise:
new_file_name = "{}_{}".format(name_without_extension, 1)
And now we can perform our rename by calling the rename method on the path object we created and appending the ext to complete the proper rename structure we want:
p.rename(Path(p.parent, new_file_name + ext))
More shortly to showcase its simplicity:
Python 3.6+:
from pathlib import Path
p = Path(some_path)
p.rename(Path(p.parent, f"{p.stem}_1_{p.suffix}"))
Versions less than Python 3.6 use the string format method instead:
from pathlib import Path
p = Path(some_path)
p.rename(Path(p.parent, "{}_{}_{}".format(p.stem, 1, p.suffix))
import shutil
shutil.move('a.txt', 'b.kml')
This will work to rename or move a file.
os.rename(old, new)
This is found in the Python docs: http://docs.python.org/library/os.html
As of Python version 3.3 and later, it is generally preferred to use os.replace instead of os.rename so FileExistsError is not raised if the destination file already exists.
assert os.path.isfile('old.txt')
assert os.path.isfile('new.txt')
os.rename('old.txt', 'new.txt')
# Raises FileExistsError
os.replace('old.txt', 'new.txt')
# Does not raise exception
assert not os.path.isfile('old.txt')
assert os.path.isfile('new.txt')
See the documentation.
Use os.rename. But you have to pass full path of both files to the function. If I have a file a.txt on my desktop so I will do and also I have to give full of renamed file too.
os.rename('C:\\Users\\Desktop\\a.txt', 'C:\\Users\\Desktop\\b.kml')
One important point to note here, we should check if any files exists with the new filename.
suppose if b.kml file exists then renaming other file with the same filename leads to deletion of existing b.kml.
import os
if not os.path.exists('b.kml'):
os.rename('a.txt','b.kml')
import os
# Set the path
path = 'a\\b\\c'
# save current working directory
saved_cwd = os.getcwd()
# change your cwd to the directory which contains files
os.chdir(path)
os.rename('a.txt', 'b.klm')
# moving back to the directory you were in
os.chdir(saved_cwd)
Using the Pathlib library's Path.rename instead of os.rename:
import pathlib
original_path = pathlib.Path('a.txt')
new_path = original_path.rename('b.kml')
Here is an example using pathlib only without touching os which changes the names of all files in a directory, based on a string replace operation without using also string concatenation:
from pathlib import Path
path = Path('/talend/studio/plugins/org.talend.designer.components.bigdata_7.3.1.20200214_1052\components/tMongoDB44Connection')
for p in path.glob("tMongoDBConnection*"):
new_name = p.name.replace("tMongoDBConnection", "tMongoDB44Connection")
new_name = p.parent/new_name
p.rename(new_name)
import shutil
import os
files = os.listdir("./pics/")
for key in range(0, len(files)):
print files[key]
shutil.move("./pics/" + files[key],"./pics/img" + str(key) + ".jpeg")
This should do it. python 3+
How to change the first letter of filename in a directory:
import os
path = "/"
for file in os.listdir(path):
os.rename(path + file, path + file.lower().capitalize())
then = os.listdir(path)
print(then)
If you are Using Windows and you want to rename your 1000s of files in a folder then:
You can use the below code. (Python3)
import os
path = os.chdir(input("Enter the path of the Your Image Folder : ")) #Here put the path of your folder where your images are stored
image_name = input("Enter your Image name : ") #Here, enter the name you want your images to have
i = 0
for file in os.listdir(path):
new_file_name = image_name+"_" + str(i) + ".jpg" #here you can change the extention of your renmamed file.
os.rename(file,new_file_name)
i = i + 1
input("Renamed all Images!!")
os.chdir(r"D:\Folder1\Folder2")
os.rename(src,dst)
#src and dst should be inside Folder2
import os
import re
from pathlib import Path
for f in os.listdir(training_data_dir2):
for file in os.listdir( training_data_dir2 + '/' + f):
oldfile= Path(training_data_dir2 + '/' + f + '/' + file)
newfile = Path(training_data_dir2 + '/' + f + '/' + file[49:])
p=oldfile
p.rename(newfile)
You can use os.system to invoke terminal to accomplish the task:
os.system('mv oldfile newfile')
In Python, is there a portable and simple way to test if an executable program exists?
By simple I mean something like the which command which would be just perfect. I don't want to search PATH manually or something involving trying to execute it with Popen & al and see if it fails (that's what I'm doing now, but imagine it's launchmissiles)
I know this is an ancient question, but you can use distutils.spawn.find_executable. This has been documented since python 2.4 and has existed since python 1.6.
import distutils.spawn
distutils.spawn.find_executable("notepad.exe")
Also, Python 3.3 now offers shutil.which().
Easiest way I can think of:
def which(program):
import os
def is_exe(fpath):
return os.path.isfile(fpath) and os.access(fpath, os.X_OK)
fpath, fname = os.path.split(program)
if fpath:
if is_exe(program):
return program
else:
for path in os.environ["PATH"].split(os.pathsep):
exe_file = os.path.join(path, program)
if is_exe(exe_file):
return exe_file
return None
Edit: Updated code sample to include logic for handling case where provided argument is already a full path to the executable, i.e. "which /bin/ls". This mimics the behavior of the UNIX 'which' command.
Edit: Updated to use os.path.isfile() instead of os.path.exists() per comments.
Edit: path.strip('"') seems like the wrong thing to do here. Neither Windows nor POSIX appear to encourage quoted PATH items.
Use shutil.which() from Python's wonderful standard library.
Batteries included!
For python 3.3 and later:
import shutil
command = 'ls'
shutil.which(command) is not None
As a one-liner of Jan-Philip Gehrcke Answer:
cmd_exists = lambda x: shutil.which(x) is not None
As a def:
def cmd_exists(cmd):
return shutil.which(cmd) is not None
For python 3.2 and earlier:
my_command = 'ls'
any(
(
os.access(os.path.join(path, my_command), os.X_OK)
and os.path.isfile(os.path.join(path, my_command)
)
for path in os.environ["PATH"].split(os.pathsep)
)
This is a one-liner of Jay's Answer, Also here as a lambda func:
cmd_exists = lambda x: any((os.access(os.path.join(path, x), os.X_OK) and os.path.isfile(os.path.join(path, x))) for path in os.environ["PATH"].split(os.pathsep))
cmd_exists('ls')
Or lastly, indented as a function:
def cmd_exists(cmd, path=None):
""" test if path contains an executable file with name
"""
if path is None:
path = os.environ["PATH"].split(os.pathsep)
for prefix in path:
filename = os.path.join(prefix, cmd)
executable = os.access(filename, os.X_OK)
is_not_directory = os.path.isfile(filename)
if executable and is_not_directory:
return True
return False
Just remember to specify the file extension on windows. Otherwise, you have to write a much complicated is_exe for windows using PATHEXT environment variable. You may just want to use FindPath.
OTOH, why are you even bothering to search for the executable? The operating system will do it for you as part of popen call & will raise an exception if the executable is not found. All you need to do is catch the correct exception for given OS. Note that on Windows, subprocess.Popen(exe, shell=True) will fail silently if exe is not found.
Incorporating PATHEXT into the above implementation of which (in Jay's answer):
def which(program):
def is_exe(fpath):
return os.path.exists(fpath) and os.access(fpath, os.X_OK) and os.path.isfile(fpath)
def ext_candidates(fpath):
yield fpath
for ext in os.environ.get("PATHEXT", "").split(os.pathsep):
yield fpath + ext
fpath, fname = os.path.split(program)
if fpath:
if is_exe(program):
return program
else:
for path in os.environ["PATH"].split(os.pathsep):
exe_file = os.path.join(path, program)
for candidate in ext_candidates(exe_file):
if is_exe(candidate):
return candidate
return None
For *nix platforms (Linux and OS X)
This seems to be working for me:
Edited to work on Linux, thanks to Mestreion
def cmd_exists(cmd):
return subprocess.call("type " + cmd, shell=True,
stdout=subprocess.PIPE, stderr=subprocess.PIPE) == 0
What we're doing here is using the builtin command type and checking the exit code. If there's no such command, type will exit with 1 (or a non-zero status code anyway).
The bit about stdout and stderr is just to silence the output of the type command, since we're only interested in the exit status code.
Example usage:
>>> cmd_exists("jsmin")
True
>>> cmd_exists("cssmin")
False
>>> cmd_exists("ls")
True
>>> cmd_exists("dir")
False
>>> cmd_exists("node")
True
>>> cmd_exists("steam")
False
On the basis that it is easier to ask forgiveness than permission (and, importantly, that the command is safe to run) I would just try to use it and catch the error (OSError in this case - I checked for file does not exist and file is not executable and they both give OSError).
It helps if the executable has something like a --version or --help flag that is a quick no-op.
import subprocess
myexec = "python2.8"
try:
subprocess.call([myexec, '--version']
except OSError:
print "%s not found on path" % myexec
This is not a general solution, but will be the easiest way for a lot of use cases - those where the code needs to look for a single well known executable which is safe to run, or at least safe to run with a given flag.
See os.path module for some useful functions on pathnames. To check if an existing file is executable, use os.access(path, mode), with the os.X_OK mode.
os.X_OK
Value to include in the mode parameter of access() to determine if path can be executed.
EDIT: The suggested which() implementations are missing one clue - using os.path.join() to build full file names.
I know that I'm being a bit of a necromancer here, but I stumbled across this question and the accepted solution didn't work for me for all cases Thought it might be useful to submit anyway. In particular, the "executable" mode detection, and the requirement of supplying the file extension. Furthermore, both python3.3's shutil.which (uses PATHEXT) and python2.4+'s distutils.spawn.find_executable (just tries adding '.exe') only work in a subset of cases.
So I wrote a "super" version (based on the accepted answer, and the PATHEXT suggestion from Suraj). This version of which does the task a bit more thoroughly, and tries a series of "broadphase" breadth-first techniques first, and eventually tries more fine-grained searches over the PATH space:
import os
import sys
import stat
import tempfile
def is_case_sensitive_filesystem():
tmphandle, tmppath = tempfile.mkstemp()
is_insensitive = os.path.exists(tmppath.upper())
os.close(tmphandle)
os.remove(tmppath)
return not is_insensitive
_IS_CASE_SENSITIVE_FILESYSTEM = is_case_sensitive_filesystem()
def which(program, case_sensitive=_IS_CASE_SENSITIVE_FILESYSTEM):
""" Simulates unix `which` command. Returns absolute path if program found """
def is_exe(fpath):
""" Return true if fpath is a file we have access to that is executable """
accessmode = os.F_OK | os.X_OK
if os.path.exists(fpath) and os.access(fpath, accessmode) and not os.path.isdir(fpath):
filemode = os.stat(fpath).st_mode
ret = bool(filemode & stat.S_IXUSR or filemode & stat.S_IXGRP or filemode & stat.S_IXOTH)
return ret
def list_file_exts(directory, search_filename=None, ignore_case=True):
""" Return list of (filename, extension) tuples which match the search_filename"""
if ignore_case:
search_filename = search_filename.lower()
for root, dirs, files in os.walk(path):
for f in files:
filename, extension = os.path.splitext(f)
if ignore_case:
filename = filename.lower()
if not search_filename or filename == search_filename:
yield (filename, extension)
break
fpath, fname = os.path.split(program)
# is a path: try direct program path
if fpath:
if is_exe(program):
return program
elif "win" in sys.platform:
# isnt a path: try fname in current directory on windows
if is_exe(fname):
return program
paths = [path.strip('"') for path in os.environ.get("PATH", "").split(os.pathsep)]
exe_exts = [ext for ext in os.environ.get("PATHEXT", "").split(os.pathsep)]
if not case_sensitive:
exe_exts = map(str.lower, exe_exts)
# try append program path per directory
for path in paths:
exe_file = os.path.join(path, program)
if is_exe(exe_file):
return exe_file
# try with known executable extensions per program path per directory
for path in paths:
filepath = os.path.join(path, program)
for extension in exe_exts:
exe_file = filepath+extension
if is_exe(exe_file):
return exe_file
# try search program name with "soft" extension search
if len(os.path.splitext(fname)[1]) == 0:
for path in paths:
file_exts = list_file_exts(path, fname, not case_sensitive)
for file_ext in file_exts:
filename = "".join(file_ext)
exe_file = os.path.join(path, filename)
if is_exe(exe_file):
return exe_file
return None
Usage looks like this:
>>> which.which("meld")
'C:\\Program Files (x86)\\Meld\\meld\\meld.exe'
The accepted solution did not work for me in this case, since there were files like meld.1, meld.ico, meld.doap, etc also in the directory, one of which were returned instead (presumably since lexicographically first) because the executable test in the accepted answer was incomplete and giving false positives.
This seems simple enough and works both in python 2 and 3
try: subprocess.check_output('which executable',shell=True)
except: sys.exit('ERROR: executable not found')
Added windows support
def which(program):
path_ext = [""];
ext_list = None
if sys.platform == "win32":
ext_list = [ext.lower() for ext in os.environ["PATHEXT"].split(";")]
def is_exe(fpath):
exe = os.path.isfile(fpath) and os.access(fpath, os.X_OK)
# search for executable under windows
if not exe:
if ext_list:
for ext in ext_list:
exe_path = "%s%s" % (fpath,ext)
if os.path.isfile(exe_path) and os.access(exe_path, os.X_OK):
path_ext[0] = ext
return True
return False
return exe
fpath, fname = os.path.split(program)
if fpath:
if is_exe(program):
return "%s%s" % (program, path_ext[0])
else:
for path in os.environ["PATH"].split(os.pathsep):
path = path.strip('"')
exe_file = os.path.join(path, program)
if is_exe(exe_file):
return "%s%s" % (exe_file, path_ext[0])
return None
An important question is "Why do you need to test if executable exist?" Maybe you don't? ;-)
Recently I needed this functionality to launch viewer for PNG file. I wanted to iterate over some predefined viewers and run the first that exists. Fortunately, I came across os.startfile. It's much better! Simple, portable and uses the default viewer on the system:
>>> os.startfile('yourfile.png')
Update: I was wrong about os.startfile being portable... It's Windows only. On Mac you have to run open command. And xdg_open on Unix. There's a Python issue on adding Mac and Unix support for os.startfile.
You can try the external lib called "sh" (http://amoffat.github.io/sh/).
import sh
print sh.which('ls') # prints '/bin/ls' depending on your setup
print sh.which('xxx') # prints None
So basically you want to find a file in mounted filesystem (not necessarily in PATH directories only) and check if it is executable. This translates to following plan:
enumerate all files in locally mounted filesystems
match results with name pattern
for each file found check if it is executable
I'd say, doing this in a portable way will require lots of computing power and time. Is it really what you need?
There is a which.py script in a standard Python distribution (e.g. on Windows '\PythonXX\Tools\Scripts\which.py').
EDIT: which.py depends on ls therefore it is not cross-platform.