how to easily overwrite paths in python - python

I have two paths to a file, like below:
old_path_1 = 'old_path_1/12374994/12324515/000000.dcm'
old_path_2 = 'old_path_2/07-20-2016-DDSM-74994/1_full_24515/000000.dcm'
I want to build a new path like the one below in order to create a .csv file that contains the correct paths to all images.
new_path = 'old_path_1/07-20-2016-DDSM-74994/1_full_24515/000000.dcm'
** Only 12374994/12324515 must be replaced by 07-20-2016-DDSM-74994/1_full_24515.
I have to do this because there are some inconsistencies in the paths of the original files. Could anyone show me a simpler way to do this in Python?
This is what I did:
old_path_1.split('/')[0]+ '/' + old_path_2.split('/')[1]+'/' +old_path_2.split('/')[2]+'/' +old_path_1.split('/')[3]
Is there a better way?

I think your question needs some more explanation about the general case you're dealing with.
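However, if this is the only case you're dealing with, then you only need to replace the '2' in old_path_2 with a '1'. Strings are immutable in Python, so assigning to new_path[9] directly raises a TypeError; go through a list of characters instead:
chars = list(old_path_2)
chars[9] = '1'  # index 9 is the '2' in 'old_path_2'
new_path = ''.join(chars)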
Or, if you're looking for a one liner:
new_path = old_path_1[:10] + old_path_2[10:]
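For the more general case, where the first directory name does not always have the same length, a minimal sketch with pathlib might look like the following. It assumes, as in the question, that both paths use forward slashes, that the first component and the file name come from the first path, and that everything in between comes from the second path. The helper name merge_paths is hypothetical.

```python
from pathlib import PurePosixPath


def merge_paths(base_path, donor_path):
    """Keep the first component and file name of base_path,
    and take the middle directories from donor_path."""
    base = PurePosixPath(base_path)
    donor = PurePosixPath(donor_path)
    # parts[0] is the top-level directory, parts[1:-1] are the
    # intermediate directories, and .name is the file name.
    return str(PurePosixPath(base.parts[0], *donor.parts[1:-1], base.name))


old_path_1 = 'old_path_1/12374994/12324515/000000.dcm'
old_path_2 = 'old_path_2/07-20-2016-DDSM-74994/1_full_24515/000000.dcm'
print(merge_paths(old_path_1, old_path_2))
```

Unlike slicing at a fixed character index, this keeps working if the top-level directory name changes length.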

Related

How to make subfolders inside a parameterized folder?

I made a folder, and inside it there are 100 subfolders whose names are built from parameters. Now I want to create one subfolder inside each of these 100 subfolders, but whatever I try does not work.
I added a simple example.
number=[1,2,3]
for i in range (len(number)):
Name = 'GD_%d'%(number[i])
os.mkdir('C:/Temp/t2_t1_18/'+Name) #till this works fine
subfolder_name='S1_%d'%(number[i])
#This does not work and idea somehow not correct
os.mkdir(os.path.join('C:/Temp/t2_t1_18/Name'+subfolder_name))
Some Notes
It is better not to use string concatenation when concatenating paths.
Since you just need the numbers it is better to iterate over them, instead of using range
You can take a look at python's new way of formatting https://realpython.com/python-f-strings/
Assuming I got your question right and you want to create a subdirectory in the newly created directory, I would do something like this:
import os
numbers = [1,2,3]
main_dir = os.path.normpath('C:/Temp/t2_t1_18/')
for number in numbers:
    dir_name = f'GD_{number}'
    # dir_name = 'GD_{}'.format(number) # python < 3.6
    dir_path = os.path.join(main_dir, dir_name)
    os.mkdir(dir_path)
    subdir_name = f'S1_{number}'
    subdir_path = os.path.join(dir_path, subdir_name)
    os.mkdir(subdir_path)
There is a better answer to your question already.
In your example this should be an easy solution (if your Python version is sufficient):
from pathlib import Path
numbers = (1, 2, 3, 4)
for n in numbers:
    Path(f"C:/Temp/t2_t1_18/GD_{n}/S1_{n}").mkdir(parents=True, exist_ok=True)
I'm not certain I understand what you're trying to do, but here is a version of your code that is cleaned up a bit. It assumes the C:\Temp directory exists, and will create 3 folders in C:\Temp, and 1 subfolder in each of those 3 folders.
import os
numbers = [1,2,3]
base_path = os.path.join('C:/', 'Temp')
for number in numbers:
    # create the directory C:\Temp\{name}
    os.mkdir(os.path.join(base_path, f'GD_{number}'))
    # create the directory C:\Temp\{name}\{subfolder_name}
    os.mkdir(os.path.join(base_path, f'GD_{number}', f'S1_{number}'))
Some Notes and Tips:
Indentation is part of the syntax in python, so make sure you indent every line that is in a code block (such as your for loop)
There are many ways to format strings, I like f-strings (a.k.a. string interpolation) which were introduced in python 3.6. If you're using an earlier version of python, either update, or use a different string formatting method. Whatever you choose, be consistent.
It is a good idea to use os.path.join() when working with paths, as you were trying to do. I expanded the use of this method in the code above.
As another answer pointed out, you can simply iterate over your numbers collection instead of using range() and indexing.
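One more point that may be worth knowing: os.mkdir() raises FileExistsError if the directory is already there, so rerunning the scripts above fails on the second run. A minimal sketch using os.makedirs() with exist_ok=True avoids that and creates the parent and the subfolder in one call. The function name make_nested_dirs is hypothetical, and a temporary directory stands in for C:/Temp/t2_t1_18 so the example can run anywhere.

```python
import os
import tempfile


def make_nested_dirs(base_dir, numbers):
    """Create base_dir/GD_<n>/S1_<n> for every n and return the paths."""
    created = []
    for number in numbers:
        subdir_path = os.path.join(base_dir, f'GD_{number}', f'S1_{number}')
        # makedirs creates missing parents; exist_ok=True makes reruns safe
        os.makedirs(subdir_path, exist_ok=True)
        created.append(subdir_path)
    return created


# Stand-in for the real base folder, removed again when the block exits
with tempfile.TemporaryDirectory() as base_dir:
    for path in make_nested_dirs(base_dir, [1, 2, 3]):
        print(os.path.relpath(path, base_dir))
```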

Add character in a link

I want to add some characters to a file path.
The path is C:\Users\user\Documents\test.csv and I want to turn it into C:\Users\user\Documents\test_new.csv.
So, as you can see, I added _new to the filename.
Should I extract the name with Path(path).name and then use a regex? What is the best way to do that?
Since you said you want to "add" _new rather than rename the file, here is a small solution: just two lines of code apart from the variable and the call. It may look dense because I compressed the code to keep it short. You can also change the keyword and the extension through the function's arguments.
PATH = "C:\\User\\Folder\\file.csv"
def new_name(path, ext="csv", keyword="_new"):
    print('\\'.join(path.split("\\")[:-1])+"\\"+path.split("\\")[-1].split(".")[0] + keyword + "." + ext)
new_name(PATH)
Here's a solution using the os module:
import os
path = r"C:\User\Folder\file.csv"
root, ext = os.path.splitext(path)
new_path = f'{root}_new{ext}'
And here's one using pathlib:
import pathlib
path = pathlib.Path(r"C:\User\Folder\file.csv")
# with_stem() requires Python 3.9 or newer
new_path = str(path.with_stem(path.stem + '_new'))
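For Python versions before 3.9, where with_stem() is not available, a minimal sketch using with_name() gives the same result. It uses PureWindowsPath so that the backslash-separated path from the question is parsed the same way on any operating system; the helper name add_to_stem is hypothetical.

```python
from pathlib import PureWindowsPath


def add_to_stem(path, suffix="_new"):
    """Insert suffix between the file's stem and its extension."""
    p = PureWindowsPath(path)
    # p.stem is 'test', p.suffix is '.csv'; with_name swaps the last component
    return str(p.with_name(p.stem + suffix + p.suffix))


print(add_to_stem(r"C:\Users\user\Documents\test.csv"))
# prints C:\Users\user\Documents\test_new.csv
```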

access a file with not fully known filename in python

I have a huge database of files whose names are like:
XYZ-ABC-K09235D1-20151220-5H1E2H4A.txt
XYZ-ABC-W8D2S5G5-20151225-HG2EK4GE.txt
XYZ-ABC-ME2C5K32-20160206-DD8BA4R6.txt
etc...
Names have all the same structure:
'XYZ-ABC-' + 8 random chars + '-' + '%Y%m%d' + '-' + 8 random chars + '.txt'
Now, I need to open a file, given the date. The point is that, I don't know the exact name of the file, as there are some random chars within. For instance, for datetime 12/05/2014 I know the filename will be something like
XYZ-ABC-????????-20140512-????????.txt
but I don't know the exact name to pass to open(). What would be the best way to do this? (I thought about first creating a list of all filenames, but I don't know whether that is a good technique or whether it's better to use something like glob...) Thank you in advance.
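You can use the following code, replacing "prefix" and 'otherstring' with the parts of the name you know (for example "XYZ-ABC-" and the date string). Note that the result is a list of all matching names:
import os
fileName = [filename for filename in os.listdir('.') if filename.startswith("prefix") and 'otherstring' in filename]
Hope this helps!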
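Since the question mentions glob, here is a minimal sketch of that approach. In a glob pattern each ? matches exactly one character, so the eight unknown characters on either side of the date can be written as ????????. The function name find_files_for_date and its directory argument are assumptions for illustration.

```python
import glob
import os
from datetime import date


def find_files_for_date(directory, day):
    """Return all files in directory whose name carries the given date."""
    # '?' matches exactly one character; the date is formatted as YYYYMMDD
    pattern = f"XYZ-ABC-????????-{day:%Y%m%d}-????????.txt"
    return sorted(glob.glob(os.path.join(directory, pattern)))


matches = find_files_for_date('.', date(2014, 5, 12))
if matches:
    # Open the first match; there may be more than one file per date
    with open(matches[0]) as f:
        first_line = f.readline()
```

The function returns a list because nothing in the naming scheme guarantees a single file per date; an empty list means no file exists for that day.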

os.path.join producing an extra forward slash

I am trying to join an absolute path and variable folder path depending on the variable run. However when I use the following code it inserts a forward slash after a string, which I don't require. How can I remove the slash after Folder_?
import os
currentwd = os.getcwd()
folder = '001'
run_folder = os.path.join(currentwd, 'Folder_', folder)
print(run_folder)
The output I get using this code is:
/home/xkr/Workspace/Folder_/001
You are asking os.path.join() to take multiple path elements and join them. It is doing its job.
Don't use os.path.join() to produce filenames; just use concatenation:
run_folder = os.path.join(currentwd, 'Folder_' + folder)
or use string formatting; the latter gives you nice features such as automatic zero-padding of integers:
folder = 1
run_folder = os.path.join(currentwd, 'Folder_{:03d}'.format(folder))
That way you can increment folder past 10 or 100 and still have the correct number of leading zeros.
Note that you don't have to use os.getcwd(); you could also use os.path.abspath(), it'll make relative paths absolute based on the current working directory:
run_folder = os.path.abspath('Folder_' + folder)
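To see how the zero-padding behaves across several runs, here is a small sketch. The helper name run_folder_path is hypothetical, and the base directory is simply the one from the question's output.

```python
import os


def run_folder_path(base_dir, run_number):
    """Build base_dir/Folder_NNN with the run number padded to 3 digits."""
    # The prefix and number form one path component, so no extra slash appears
    return os.path.join(base_dir, 'Folder_{:03d}'.format(run_number))


for run in (1, 10, 100):
    print(os.path.basename(run_folder_path('/home/xkr/Workspace', run)))
# prints Folder_001, Folder_010 and Folder_100 on separate lines
```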

python - check if list entry is contained in string

I have a script where I'm walking through a list of passed (via a csv file) paths. I'm wondering how I can determine if the path I'm currently working with has been previously managed (as a subdirectory of the parent).
I'm keeping a list of managed paths like this:
pathsManaged = ['/a/b', '/c/d', '/e']
So when/if the next path is '/e/a', I want to check in the list if a parent of this path is present in the pathsManaged list.
My attempt so far:
if any(currPath in x for x in pathsManaged):
    print('subdir of already managed path')
This doesn't seem to be working, though. Am I expecting too much from the any command? Are there any other shortcuts that I could use for this type of look-up?
Thanks
Perhaps:
from os.path import dirname
def parents(p):
    # yield each ancestor directory of p, up to the root
    while len(p) > 1:
        p = dirname(p)
        yield p
pathsManaged = ['/a/b', '/c/d', '/e']
currPath = '/e/a'
if any(p in pathsManaged for p in parents(currPath)):
    print('subdir of already managed path')
prints:
subdir of already managed path
Assuming that pathsManaged contains absolute paths (otherwise I think all bets are off), then you could make currPath an absolute path and see if it starts with any of the paths in pathsManaged. Or in Python:
import os
def is_path_already_managed(currPath):
    return any(
        os.path.abspath(currPath).startswith(managed_path)
        for managed_path in pathsManaged)
Also conceptually I feel pathsManaged should be a set, not a list.
If I understand you correctly, you want to check whether any entry of pathsManaged is part of currPath, but you are doing it the other way around.
Depending on what you want, one of these should work for you:
any(x in currPath for x in pathsManaged)
any(currPath.startswith(x) for x in pathsManaged)
os.path.dirname(currPath) in pathsManaged
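One caveat with the substring and startswith() checks is that they compare characters, not path components, so '/e' would also match an unrelated path such as '/elephant'. A minimal sketch that compares whole components with pathlib is shown below. It assumes all paths are absolute POSIX-style paths, follows the suggestion above to store the managed paths in a set, and uses the hypothetical function name is_under_managed_path.

```python
from pathlib import PurePosixPath


def is_under_managed_path(curr_path, managed_paths):
    """Return True if any ancestor directory of curr_path is managed."""
    managed = {PurePosixPath(p) for p in managed_paths}
    # .parents yields every ancestor, e.g. /e/a -> /e, /
    return any(parent in managed for parent in PurePosixPath(curr_path).parents)


pathsManaged = ['/a/b', '/c/d', '/e']
print(is_under_managed_path('/e/a', pathsManaged))       # True
print(is_under_managed_path('/elephant', pathsManaged))  # False
```

Note that a path is not treated as a subdirectory of itself here, so '/e' alone returns False; add an equality check if that case should count as managed.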
