Issue running python script in separate folder - python

I have a directory structured as follows:
application
├── app
│   └── folder
│       ├── file_1.py
│       └── Model_data
│           └── data.csv
└── app2
    └── some_folder
        └── file_2.py
I want to import a function from file_1 inside file_2. I use:
from application.app.folder.file_1 import load_data
t = load_data()
The problem is that this raises an error. Inside load_data I use pandas to read CSV data from a sub-folder:
df = pd.read_csv('Model_data/data.csv')
This fails with a "file doesn't exist" error.
How do I resolve this?
file_1 runs fine when I run it from within its own directory.

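A relative path such as 'Model_data/data.csv' is resolved against the current working directory, not against the folder that holds file_1.py. You can try changing it to an absolute path, for example C:/application/app/folder/Model_data/data.csv.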

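You can build the path relative to file_1.py itself:
from pathlib import Path
import pandas as pd

def load_data():
    # Path(__file__) points at file_1.py, wherever the script was started from
    file_1_path = Path(__file__)
    filename = file_1_path.parent / "Model_data" / "data.csv"
    df = pd.read_csv(filename)
    return df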
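The same idea can be wrapped in a small helper so every data file is located the same way. This is only a sketch: `data_file` is a hypothetical name, and the path in the demo call is the layout from the question, not a path that has to exist.

```python
from pathlib import Path


def data_file(module_file, *parts):
    """Build a path relative to the folder that contains module_file."""
    return Path(module_file).resolve().parent.joinpath(*parts)


# Inside file_1.py this would be called as:
#     csv_path = data_file(__file__, "Model_data", "data.csv")
#     df = pd.read_csv(csv_path)
example = data_file("/application/app/folder/file_1.py", "Model_data", "data.csv")
print(example.name)         # data.csv
print(example.parent.name)  # Model_data
```

Because the result depends only on where file_1.py lives, importing it from file_2.py or running it directly should find the same CSV.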

Related

Using glob recursion to get sub directories and files containing CSVs

I am trying to concat multiple CSVs that live in subfolders of my parent directory.
/ParentDirectory
│
│
├───SubFolder 1
│       test1.csv
│
├───SubFolder 2
│       test2.csv
│
├───SubFolder 3
│       test3.csv
│       test4.csv
│
├───SubFolder 4
│       test5.csv
When I do
import pandas as pd
import glob
files = glob.glob('/ParentDirectory/*.csv', recursive=True)
df = pd.concat([pd.read_csv(fp) for fp in files], ignore_index=True)
I get ValueError: No objects to concatenate.
But if I select a specific sub folder, it works:
files = glob.glob('/ParentDirectory/SubFolder 3/*.csv', recursive=True)
How come glob isn't able to go down a directory and get the CSVs within each folder of the parent directory?
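recursive=True only has an effect when the pattern contains **. Try:
files = glob.glob('/ParentDirectory/**/*.csv', recursive=True)
Or, since the CSVs all sit exactly one level below the parent:
files = glob.glob('/ParentDirectory/*/*.csv')
The second pattern doesn't need to be recursive, but it does need a wildcard for the subdirectory.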
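To see the difference between the three patterns, here is a small self-contained check. It builds a throwaway tree with `tempfile` (the folder and file names just mirror the question) instead of touching /ParentDirectory:

```python
import glob
import os
import tempfile

# Throwaway tree: parent/SubFolder 1/test1.csv and parent/SubFolder 2/test2.csv
parent = tempfile.mkdtemp()
for sub, name in [("SubFolder 1", "test1.csv"), ("SubFolder 2", "test2.csv")]:
    os.makedirs(os.path.join(parent, sub))
    with open(os.path.join(parent, sub, name), "w") as fh:
        fh.write("a,b\n1,2\n")

top_only = glob.glob(os.path.join(parent, "*.csv"), recursive=True)
one_level = glob.glob(os.path.join(parent, "*", "*.csv"))
any_depth = glob.glob(os.path.join(parent, "**", "*.csv"), recursive=True)

print(len(top_only))   # 0: recursive=True does nothing without "**"
print(len(one_level))  # 2
print(len(any_depth))  # 2
```

The empty first result is what leads to "No objects to concatenate", since pd.concat receives an empty list.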

How do I match the file name from different directories and replace the partial filename with the actual filename?

So I have a slightly complicated issue that I need some help with :(
In Directory 1, I have the filenames as follows:
00HFP.mp4
0AMBV.mp4
2D5GN.mp4
3HVKR.mp4
3IJGQ.mp4
In Directory 2, I did some processing to the mp4s and got some output files:
_0HFP.usd
_AMBV.usd
_D5GN.usd
_HVKR.usd
_IJGQ.usd
For some reason, the programme I'm using replaces the first number/character with an underscore for some files. Other files are generally left alone. But I need the filenames to match :( How do I do a mass renaming (over 500 files) based on this partial naming using python script? So like for example: _0HFP.usd should become 00HFP.usd since there's a 00HFP.mp4 file in Directory 1.
Please help :( Thank you!
I tried this (as suggested by Corralien), but it still doesn't work for me :(
dir1 = pathlib.Path('./mnt/d/Downloads/Charades_v1_480/charades_18Jan/done/')
dir2 = pathlib.Path('./mnt/d/Downloads/Charades_v1_480/charades_18Jan_anim/pt-charades-output/')
print('i am here')
for f1 in dir1.glob('*.mp4'):
    print(f1)
    f2 = dir2 / f'_{f1.stem[1:]}.usd'
    if f2.exists():
        f2.rename(dir2 / f'{f1.stem}.usd')
Suppose the following directories:
Dir1
├── 00HFP.mp4
├── 0AMBV.mp4
├── 2D5GN.mp4
├── 3HVKR.mp4
└── 3IJGQ.mp4
Dir2
├── _0HFP.usd
├── _AMBV.usd
├── _D5GN.usd
├── _HVKR.usd
└── _IJGQ.usd
Try:
import pathlib
dir1 = pathlib.Path('./Dir1')
dir2 = pathlib.Path('./Dir2')
for f1 in dir1.glob('*.mp4'):
    f2 = dir2 / f'_{f1.stem[1:]}.usd'
    if f2.exists():
        f2.rename(dir2 / f'{f1.stem}.usd')
After processing:
Dir1
├── 00HFP.mp4
├── 0AMBV.mp4
├── 2D5GN.mp4
├── 3HVKR.mp4
└── 3IJGQ.mp4
Dir2
├── 00HFP.usd
├── 0AMBV.usd
├── 2D5GN.usd
├── 3HVKR.usd
└── 3IJGQ.usd
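With 500+ files it may be worth listing the planned renames before applying them. A possible sketch of the same loop split into a planning step and a rename step; `plan_renames` is a hypothetical helper, and the demo uses a temporary directory rather than the real folders:

```python
import pathlib
import tempfile


def plan_renames(dir1, dir2):
    """Return (old_path, new_path) pairs without touching the disk."""
    plan = []
    for f1 in sorted(pathlib.Path(dir1).glob("*.mp4")):
        f2 = pathlib.Path(dir2) / f"_{f1.stem[1:]}.usd"
        if f2.exists():
            plan.append((f2, f2.with_name(f"{f1.stem}.usd")))
    return plan


# Small self-contained demo in a temporary directory.
root = pathlib.Path(tempfile.mkdtemp())
dir1, dir2 = root / "Dir1", root / "Dir2"
dir1.mkdir()
dir2.mkdir()
for stem in ("00HFP", "0AMBV"):
    (dir1 / f"{stem}.mp4").touch()
    (dir2 / f"_{stem[1:]}.usd").touch()

plan = plan_renames(dir1, dir2)
for old, new in plan:
    print(old.name, "->", new.name)
    old.rename(new)
```

If the plan comes back empty, checking dir1.exists() and dir2.exists() is a reasonable first step: a path that starts with ./ is resolved relative to the current working directory.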

Finding the difference between two paths

I'm writing a script that will allow me to move a folder and fix an XML based project file.
I'm getting from the user the source and destination paths and saving them in a pathlib.Path object.
My question is, how can I use the 2 paths given by the user to find a relative path to the XML project file in order to replace all appearances of this path?
I have tried to use the relative_to function, but because the project file is not a parent directory, I get an error
Traceback (most recent call last):
  File "KeilMoveFile.py", line 50, in <module>
    fix_keil_project(keilPrjFile, objToCopy)
  File "KeilMoveFile.py", line 29, in fix_keil_project
    print(line.replace(str(SrcDstPath.src.relative_to(prjFilePath)),
  File "C:\Program Files\Python38\lib\pathlib.py", line 884, in relative_to
    raise ValueError("{!r} does not start with {!r}"
ValueError: 'SI\\SI_Boot\\SiBoot' does not start with 'SI\\SI_Boot\\MDK-ARM\\SI_Boot.uvprojx'
My current project layout:
.
├── _libs
│   ├── src
│   └── inc
└── MDK_arm
    └── projectFile
The desired project layout:
.
├── _libs
│   └── Application
│       ├── src
│       └── inc
├── MDK_arm
└── projectFile
The code I'm currently running to fix the project file:
def fix_keil_project(prjFilePath, SrcDstPath):
    with fileinput.FileInput(prjFilePath, inplace=True, backup='.bak') as file:
        dstStrToReplace = str(SrcDstPath.dst.relative_to(prjFilePath))
        srcStrToReplace = str(SrcDstPath.src.relative_to(prjFilePath))
        for line in file:
            print(line.replace(srcStrToReplace, dstStrToReplace), end='')
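The problem is that relative_to() only handles the case where one path lies inside the other; it cannot produce a result that needs `..`.
On the Python 3.8 shown in your traceback you will have to use os.path.relpath() instead of pathlib. (Python 3.12 added a walk_up=True argument to relative_to() that allows `..` in the result.)
The error shows two paths, 'SI\\SI_Boot\\SiBoot' and 'SI\\SI_Boot\\MDK-ARM\\SI_Boot.uvprojx', so I use them in the examples. Because I'm on Linux, I tested with / instead of \\.
pathlib raises an error: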
import pathlib
path1 = 'SI/SI_Boot/SiBoot'
path2 = 'SI/SI_Boot/MDK-ARM/SI_Boot.uvprojx'
src = pathlib.Path(path1)
dst = pathlib.Path(path2)
print(src.relative_to(dst))
#print(dst.relative_to(src))
Result: (error like in your question)
ValueError: 'SI/SI_Boot/SiBoot' does not start with 'SI/SI_Boot/MDK-ARM/SI_Boot.uvprojx'
But os.path.relpath gives the expected result:
import os
path1 = 'SI/SI_Boot/SiBoot'
path2 = 'SI/SI_Boot/MDK-ARM/SI_Boot.uvprojx'
print(os.path.relpath(path1, path2))
print(os.path.relpath(path2, path1))
Result:
../../SiBoot
../MDK-ARM/SI_Boot.uvprojx
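One thing to watch: os.path.relpath treats its second argument as a directory. If the project file stores paths relative to the folder it lives in (an assumption here, the question doesn't say), the start point should be the project file's parent rather than the file itself. A minimal sketch with the two paths from the error message:

```python
import os
from pathlib import Path

src = Path("SI/SI_Boot/SiBoot")
project_file = Path("SI/SI_Boot/MDK-ARM/SI_Boot.uvprojx")

# os.path.relpath treats its second argument as a directory, so pass the
# folder that holds the project file rather than the file itself.
rel = Path(os.path.relpath(src, start=project_file.parent))
print(rel.as_posix())  # ../SiBoot
```

Wrapping the result in Path again keeps the rest of the script on pathlib objects, and as_posix() prints forward slashes on any platform.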

Rename part of filenames in sub-folders python

I have to rename images in a main directory that contains sub-folders. The script I'm using right now does part of the job, but not exactly what I need, and I can't find a way to do it properly. This is what I have now:
maindir #my example origin
├── Sub1
│   ├── example01.jpg
│   ├── example02.jpg
│   └── example03.jpg
└── Sub2
    ├── example01.jpg
    ├── example02.jpg
    └── example03.jpg
My script produces this:
maindir
├── Sub1
│   ├── Sub1_example01.jpg
│   ├── Sub1_example02.jpg
│   └── Sub1_example03.jpg
└── Sub2
    ├── Sub2_example01.jpg
    ├── Sub2_example02.jpg
    └── Sub2_example03.jpg
And I would like to get this: replace the letters in my filenames with the sub-folder name and keep the original numbers of my jpg files:
maindir
├── Sub1
│   ├── Sub1_01.jpg
│   ├── Sub1_02.jpg
│   └── Sub1_03.jpg
└── Sub2
    ├── Sub2_01.jpg
    ├── Sub2_02.jpg
    └── Sub2_03.jpg
Here is the code I'm using:
from os import walk, path, rename
parent = "F:\\PS\\maindir"
for dirpath, _, files in walk(parent):
    for f in files:
        rename(path.join(dirpath, f), path.join(dirpath, path.split(dirpath)[-1] + '_' + f))
What do I have to change here to get my result?
Instead of this line:
rename(path.join(dirpath, f), path.join(dirpath, path.split(dirpath)[-1] + '_' + f))
generate a new name using str.replace:
newf = f.replace("example", path.basename(dirpath) + "_")
then
rename(path.join(dirpath, f), path.join(dirpath, newf))
Of course, if you don't know the extension or the "prefix" of the input file and only want to keep the number and extension, there's a way:
import re
number = (re.findall(r"\d+", f) or ['000'])[0]
This extracts the first number from the name and falls back to 000 if none is found.
Then rebuild newf from the folder name, the extracted number and the original extension (path.splitext already returns the extension with its leading dot, so the format string must not add another one):
newf = "{}_{}{}".format(path.basename(dirpath), number, path.splitext(f)[1])
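Put together, the name-building part could look like the sketch below. `new_name` is a hypothetical helper, and the "000" fallback is the one suggested above; it only computes the name, so it can be checked before any file is renamed.

```python
import re
from os import path


def new_name(dirpath, filename):
    """Return '<folder>_<number><ext>', e.g. Sub1_01.jpg for Sub1/example01.jpg."""
    match = re.search(r"\d+", filename)
    number = match.group() if match else "000"
    ext = path.splitext(filename)[1]  # includes the leading dot
    return "{}_{}{}".format(path.basename(dirpath), number, ext)


print(new_name(path.join("maindir", "Sub1"), "example01.jpg"))  # Sub1_01.jpg
print(new_name(path.join("maindir", "Sub2"), "cover.jpg"))      # Sub2_000.jpg
```

Inside the walk loop this would be used as rename(path.join(dirpath, f), path.join(dirpath, new_name(dirpath, f))). Note that two files without a number in the same folder would both map to the 000 name.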

How to read folder structure and assign it to datastructure?

I'm only starting with Python, and I'm trying to accomplish the following:
I have a folder structure (simplified):
.
├── folder1
│ ├── file1
│ └── file2
├── folder2
│ └── file3
└── folder3
├── file4
├── file5
└── file6
I'd like to read the filenames into some kind of data structure that can distinguish which files are from the same folder. I've used glob for the single-folder case, but is it possible to get, for example, the following data structure using glob?
files = [{file1, folder1}, {file2, folder1}, {file3, folder2}...]
I assume you'd rather get this kind of structure:
files = {folder1: [file1, file2], folder2: [file3], ...}
The following code will do the trick:
import os
rootDir = '.'
files = {}
for dirName, subdirList, fileList in os.walk(rootDir):
    files[dirName] = fileList
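Note that os.walk also visits the root itself and yields keys like './folder1'. If bare folder names are preferred as keys, one possible variation is sketched below; the demo tree is created in a temporary directory, and skipping folders without files is a choice made for this example, not something the question asks for.

```python
import os
import tempfile
from pathlib import Path

# Build a small demo tree: root/folder1/{file1,file2}, root/folder2/{file3}
root = Path(tempfile.mkdtemp())
for folder, names in {"folder1": ["file1", "file2"], "folder2": ["file3"]}.items():
    (root / folder).mkdir()
    for name in names:
        (root / folder / name).touch()

files = {}
for dir_name, _subdirs, file_list in os.walk(root):
    if file_list:  # skip folders that contain no files (here: the root)
        files[os.path.relpath(dir_name, root)] = sorted(file_list)

print(files)
```

Using os.path.relpath for the key keeps nested folders distinguishable (for example 'folder1/sub' rather than just 'sub').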
