I made a script in the past to mass rename any file greater than x characters in a directory. When I made that script I had a source directory which you would need to input manually. Any file that was over x characters in that directory would be stripped of its extension, renamed, then the extension would be re-added and it would use os.path.join to join the source and the newly created filename+ext. I'm now making another script and used os.path.join("Folder in the current dir", "file in that dir"). Because this worked, I'm guessing that when os.path.join is called with just a folder name and no full path in its first parameter, it starts its search from the directory that the script was run in? Just wondering if this is correct.
os.path.join has nothing to do with any actual filesystem, and does not "start" anywhere. It simply joins two arbitrary paths, whether they exist or not.
What os.path.join does is just join path elements in a system-compatible way, taking the platform's directory separator character, etc., into account. It's a simple string manipulation tool.
So the returned result simply starts from whatever you give to it as the first argument.
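A minimal sketch of that point. The folder and file names are made up for illustration, and neither needs to exist:

```python
import os

# Neither "no_such_folder" nor "missing.txt" exists; join() never checks.
relative = os.path.join("no_such_folder", "missing.txt")

# The result is still a relative path. It only gets a starting point when
# something that touches the filesystem (open(), os.rename(), ...) resolves
# it against the current working directory, which is not necessarily the
# directory that contains the script.
absolute = os.path.abspath(relative)

print(relative)
print(os.path.isabs(relative))  # False
print(absolute)
```

So the second script worked because the relative result was later resolved against the current working directory, not because os.path.join searched anywhere.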
Related
I'm splitting a CSV file based on the column "ColumnName". How can I make all the CSV files created save into a specified path?
data = pd.read_csv(r'C:\Users\...\Output.csv')
for (ColumnName), group in data.groupby(['ColumnName']):
    group.to_csv('{ColumnName}.csv', index=False)
Thanks
pandas.DataFrame.to_csv() takes a string path as input to write to said path.
With your current code group.to_csv('{ColumnName}.csv', index=False), {ColumnName} is being interpreted as literal text. If you want variable substitution here, Python has several methods; two of them are:
f-strings - Introduced in Python 3.6
group.to_csv(f'{ColumnName}.csv', index=False)
str.format
group.to_csv('{}.csv'.format(ColumnName), index=False)
Specifying path
Following on from this, if you want to specify more than just the file name, you can give either the full file path or a path relative to the current directory.
Providing full file path
Full file paths describe the location starting from the root context. On Windows this means a path such as rf'C:\Users\mycsvfolder\{ColumnName}.csv' (the r prefix keeps the backslashes from being read as escape sequences). Passing the full path to to_csv() writes the file there.
Note: On Linux, the root context starts at /. So, for example, /Users/myuser/mycsvfolder/file.csv would be a full file path.
Providing a relative file path
Relative file paths are resolved against the current folder. For example, to write to a folder within the current folder you can specify f'mycsvfolder/{ColumnName}.csv', and the file will be written to that folder inside the current directory. This is also why f'{ColumnName}.csv' writes a file to the current directory: paths are relative to the current directory unless otherwise specified.
Note when writing to folder
You will need to create folders before writing to them in most cases. Some write functions do provide folder creation functionality however.
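As a rough sketch of the two notes above, the helper below creates the target folder if needed and builds the output path for one group. The name csv_path_for and the folder "mycsvfolder" are hypothetical, chosen only for illustration:

```python
import os


def csv_path_for(folder, column_value):
    """Return folder/<column_value>.csv, creating folder first if needed."""
    # exist_ok=True means no error is raised if the folder already exists.
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, f"{column_value}.csv")


# Hypothetical usage with the groupby loop from the question:
# for column_value, group in data.groupby("ColumnName"):
#     group.to_csv(csv_path_for("mycsvfolder", column_value), index=False)
```

The folder argument can be either a full path or a path relative to the current directory, as described above.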
Additional material regarding paths, specifically in Python.
I'm new to Python and I'm trying to access a file with a full path represented by the following:
'X:/01 File Folder/MorePath/Data/Test/myfile.txt'
Every time I try to build the full string using os.path.join, it ends up slicing out everything between the drive letter and the second path string, like so:
import os
basePath = 'X:/01 File Folder/MorePath'
restofPath = '/Data/Test/myfile.txt'
fullPath = os.path.join(basePath,restofPath)
gives me:
'X:/Data/Test/myfile.txt'
as the fullPath name.
Can anyone tell me what I'm doing wrong? Does it have something to do with the digits near the beginning of the base path name?
The / at the beginning of your restofPath means "start at the root directory." So os.path.join() helpfully does that for you.
If you don't want it to do that, write your restofPath as a relative directory, i.e., Data/Test/myfile.txt, rather than an absolute one.
If you are getting your restofPath from somewhere outside your program (user input, config file, etc.), and you always want to treat it as relative even if the user is so gauche as to start the path with a slash, you can use restofPath.lstrip(r"\/").
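The behaviour can be reproduced on any platform with ntpath, the Windows implementation behind os.path. This is only a sketch using the paths from the question:

```python
import ntpath  # Windows flavour of os.path, importable on any platform

base_path = 'X:/01 File Folder/MorePath'
rest_of_path = '/Data/Test/myfile.txt'

# A rooted second argument discards the directories of the first one;
# only the drive letter survives.
discarded = ntpath.join(base_path, rest_of_path)

# Stripping leading slashes and backslashes makes the second part relative,
# so the base directories are kept.
kept = ntpath.join(base_path, rest_of_path.lstrip(r"\/"))

print(discarded)  # X:/Data/Test/myfile.txt
print(kept)
```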
I am using os.walk to run through a tree of directories, check for some input files, and then run a program if the proper inputs are there. I notice I am having a problem because of the way that os.walk is evaluating the root variable in the loop:
for root, dirs, files in os.walk('.'):  # I use '.' because I want the walk to
                                        # start where I run the script. And it
                                        # will/can change
    if "input.file" in files:
        infile = os.path.join(root, "input.file")
        subprocess.check_output("myprog input.file", shell=True)
        # if an input file is found, store the path to it
        # and run the program
This is giving me an issue because the infile string looks like this
./path/to/input.file
When it needs to look like this for the program to be able to find it
/home/start/of/walk/path/to/input.file
I want to know if there is a better method/ a different way to use os.walk such that I can leave the starting directory arbitrary, but still be able to use the full path to any files that it finds for me. Thanks
The program I am using is written by me in C++, and I suppose I could modify it as well. But I am not asking how to do that here. To clarify, this question is about Python's os.walk and related topics, which is why there are no examples of my C++ code.
Instead of using ., convert it to the absolute path by using os.path.abspath("."). That will convert your current path to an absolute path before you begin.
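A minimal sketch of that suggestion. The helper name find_inputs is hypothetical, and "input.file" is the file name from the question:

```python
import os


def find_inputs(start=".", name="input.file"):
    """Yield the absolute path of every file called `name` under `start`."""
    # Because the walk starts from an absolute path, every `root` that
    # os.walk produces is absolute too, and so is each joined result.
    for root, dirs, files in os.walk(os.path.abspath(start)):
        if name in files:
            yield os.path.join(root, name)
```

Each yielded path could then be handed to the external program, for example as subprocess.check_output(["myprog", infile]), assuming myprog accepts the input path as an argument.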
I have a script that will pull files from two directories back, so the script resides at:
/folder2/folder1/folder0/script.py
and the files that will be processed will be in folder2.
I can get back one level with "..//" (I'm making a Windows executable with cx_Freeze) but I'm thinking this isn't the best way to do this.
I am setting an input directory and an output directory. I want to keep the paths relative to the location of the script so that "folder2" can be moved without screwing up the functionality of the script or force rewriting of it.
thanks
First you get the directory where your script is located, like so:
current_dir = os.path.dirname(os.path.realpath(__file__))
And then, if you know you will always use the directory two levels above, just use:
target_dir = os.path.join(current_dir, '..', '..')
From there you can manipulate files from the target_dir as you please.
Edit
From adsmith, instead of joining two ".." paths together, you can instead define target_dir as:
target_dir = os.path.sep.join(current_dir.split(os.path.sep)[:-2])
Which will simply cut off the last two directories in your path, instead of them ending in a few uglier ".."s. So, the first method would look something like:
/path/to/folder2/folder1/directory/../..
Whereas the second implementation would be:
/path/to/folder2/
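A third option, sketched here under the assumption that a purely textual clean-up is acceptable, keeps the ".." join but collapses it with os.path.normpath, or uses pathlib. The function names are hypothetical, and os.path.abspath is used instead of realpath, so symlinks are not resolved:

```python
import os
from pathlib import Path


def two_levels_up(script_path):
    """Return the directory two levels above the one containing script_path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # normpath collapses the ".." components textually.
    return os.path.normpath(os.path.join(script_dir, "..", ".."))


def two_levels_up_pathlib(script_path):
    """Same result via pathlib; parents[0] is the script's own directory."""
    return str(Path(os.path.abspath(script_path)).parents[2])
```

Inside a script, either function would be called as two_levels_up(__file__).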
I would use your suggested method of os.chdir(r'..\..') to make sure your current working directory is in folder2. I'm not really sure what you're asking though, so maybe clarify why you think this ISN'T the right solution?
So, I want to create a simple script to create directories based upon the file names contained within a certain folder.
My method looks like this:
def make_new_folders(filenames, destination):
    """
    Take a list of presets and create new directories using mkdir
    """
    for filename in filenames:
        path = '"%s/%s/"' % (destination, filename)
        subprocess.call(["mkdir", path])
For some reason I can't get the command to work.
If I pass in a file named "Test Folder", I get an error such as:
mkdir: "/Users/soundteam/Desktop/PlayGround/Test Folder: No such file or directory
Printing the 'path' variable results in:
"/Users/soundteam/Desktop/PlayGround/Test Folder/"
Can anyone point me in the right direction?
First of all, you should use os.path.join() to glue your path parts together because it works cross-platform.
Furthermore, there are built-in commands like os.mkdir or os.makedirs (which is really cool because it's recursive) to create folders. Creating a subprocess is expensive and, in this case, not a good idea.
In your example you're passing double quotes ("destination/filename") to subprocess, which you don't have to do. A terminal needs double quotes when a file or folder name contains whitespace, but subprocess takes care of that for you when the arguments are given as a list.
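A sketch of the function from the question rewritten along these lines, with os.path.join and os.makedirs in place of the mkdir subprocess. Returning the list of created paths is an addition for illustration, not part of the original:

```python
import os


def make_new_folders(filenames, destination):
    """Create one directory under destination for each name in filenames."""
    created = []
    for filename in filenames:
        # No quoting is needed: the name goes straight to the OS,
        # so spaces are not a problem.
        path = os.path.join(destination, filename)
        # Creates missing parent directories too; exist_ok avoids an
        # error when the directory is already there.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```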
You don't need the double quotes. subprocess passes the parameters directly to the process, so you don't need to prepare them for parsing by a shell. You also don't need the trailing slash, and should use os.path.join to combine path components:
path = os.path.join(destination, filename)
EDIT: You should accept @Fabian's answer, which explains that you don't need subprocess at all (I knew that).