I am a beginner Python user and use Visual Studio Code as my editor. Recently I wrote a Python file that identifies all the file/directory names at the same level, then outputs .txt files listing all the names that match my rule.
I remember that last month, when I ran this Python file from Visual Studio Code, the output files appeared in the parent folder (one level up). But today there are no output files after running the script from Visual Studio Code. Because of this, I double-clicked the Python file to run it without Visual Studio Code, and saw the output files at the same level as my Python file.
So my problems are:
How can I make sure the output files are created when running the Python file with Visual Studio Code?
How can I generate the output files at the same level as the Python file being run?
Code:
import os

CurrentScriptDir = os.path.dirname(os.path.realpath(__file__))
All_DirName = []
for root, dirs, files in os.walk(CurrentScriptDir):
    for each_dir in dirs:
        All_DirName.append(each_dir)

for Each_DirName in All_DirName:
    Each_DirName_Split = Each_DirName.split('_')
    if Each_DirName_Split[3] == 'twc':
        unitname = "_".join(Each_DirName_Split[0:-1])
        with open(unitname + ".txt", "a") as file:
            file.write(Each_DirName + "_K3" + "\n")
            file.close()
    else:
        next
tl;dr;
Below is how I would write your program while adhering to the original code's flow. An explanation follows, and I can update this answer if you provide more details.
To avoid confusion with paths, I would suggest simply requiring the user to provide one when running the script. The path provided by the user is the path that gets scanned, and it is also where all text files the script creates are placed; the cwd and the location of the script file are then irrelevant.
import os
import sys

# Usage:
#   python Program.py <path>

def find_twc_folders(path):
    for root, dirs, files in os.walk(path):
        for dir in dirs:
            parts = dir.split('_')
            if len(parts) == 4 and parts[3] == 'twc':  # 'a_twc', 'a_b_c_twc_d', etc. are skipped
                with open(os.path.join(path, dir[:-4] + '.txt'), 'a') as file:  # name with '_twc' removed
                    file.write(dir + '_K3\n')

if __name__ == '__main__':
    if len(sys.argv) > 1:
        find_twc_folders(sys.argv[1])
    else:
        find_twc_folders(os.path.dirname(os.path.realpath(__file__)))
(EDIT: Changed to use the script's directory if the program is called with no args.)
Folder setup:
Given the following directory setup, with your current working directory (cwd) in the VSCode terminal being one level above root:
PS C:\Users\there\source\repos\SO\75241788> tree /f
C:.
├───.vscode
└───root
│ Program.py
│
├───0_duplicate_path_twc
├───1a_one_two_three
│ ├───0_duplicate_path_twc
│ ├───2a_one_two_three
│ │ ├───0_duplicate_path_twc
│ │ ├───3a_one_two_three
│ │ └───3b_one_two_twc
│ └───2b_one_two_twc
├───1b_one_two_twc
│ ├───2a_one_two_three
│ ├───2b_one_two_three
│ ├───2c_one_two_twc
│ └───2d_one_two_twc
└───1c_one_two_twc
A dry run gives us the following, after replacing the actual file operations with print():
PS C:\Users\there\source\repos\SO\75241788> python root/Program.py
CurrentScriptDir: C:\Users\there\source\repos\SO\75241788\root
in "0_duplicate_path_twc" # <- in top level directory
in "1a_one_two_three"
in "1b_one_two_twc"
open 1b_one_two.txt
print: 1b_one_two_twc_K3\n
in "1c_one_two_twc"
open 1c_one_two.txt
print: 1c_one_two_twc_K3\n
in "0_duplicate_path_twc" # <- in sub level directory
in "2a_one_two_three"
# ...
In the current implementation, you are only pushing the directory name into your array, not the full path. An unqualified relative path is resolved against the cwd by the OS, so your script will create all files at the location you see in your terminal, to the left of the >.
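To see that behaviour in isolation, here is a small sketch (using a temporary directory rather than your real project) showing that an unqualified file name resolves against the cwd, wherever that happens to be:

```python
import os
import tempfile

# change the cwd, much like launching the script from a different terminal location
os.chdir(tempfile.gettempdir())

# an unqualified name, like the one passed to open(), resolves against the cwd
resolved = os.path.abspath("unitname.txt")
print(resolved)  # lands in the temp directory, not next to any script
```

This is why prepending the script's own directory (or a user-supplied path) is needed for predictable output locations.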
Operating on folder names alone in this manner also means identically named folders at different levels will result in multiple (duplicate) entries being added to the same file.
Code fixes
The final else in your program is unnecessary, as your for loop moves on to the next iteration anyway. As mentioned by @rioV8, next is also being used incorrectly here. He also pointed out that there is no need to close the file explicitly, since with does that for you.
As it stands, removing the unneeded All_DirName array, removing the last 3 lines previously mentioned, moving your join operation inline, and prepending your file paths with CurrentScriptDir results in:
import os

CurrentScriptDir = os.path.dirname(os.path.realpath(__file__))
for root, dirs, files in os.walk(CurrentScriptDir):
    for each_dir in dirs:
        Each_DirName_Split = each_dir.split('_')
        # todo: check length > 3 first (or) compare last index instead
        if Each_DirName_Split[3] == 'twc':
            unitname = "_".join(Each_DirName_Split[0:-1])
            with open(os.path.join(CurrentScriptDir, unitname + '.txt'), 'a') as file:
                file.write(each_dir + '_K3\n')
...And running it in the aforementioned setup will walk all folders found in the folder the script is located in, saving all files to that same folder as well.
EDIT: Added os.path.join(CurrentScriptDir, ...) in the previous code example to ensure the files are written next to the source program, regardless of the current working directory.
I have created a Python project that updates an Excel data file for my work. The problem is that I need to create a config file so that other people can use my project without changing the code.
My code:
import pandas as pd
import datetime
timestr = datetime.date.today().strftime('%d%m%Y')
Buyerpath = 'https://asd.cvs'
Sellerpath = 'https://dsa.csv'
Onlinepath = 'https://sda.csv'
Totalpath = pd.DataFrame({'BuyValue': Buyerpath.Buytotal,
                          'SellerValue': Sellerpath.Sellertotal,
                          'OnlineValue': Onlinepath.onlinetotal})
Totalpath.to_excel(index=False, excel_writer=r'C:\Users\Mike\Desktop\Result\ResultTotal'+timestr+'.xlsx')
I need the config file to allow other people to use my Python code and save the Excel output in the output folder.
I think what you're asking is: how do I hide strings/variables that are specific to my local machine? Correct?
The easiest way to do this is with another Python file:
my-project
├── __init__.py
├── main.py
└── secrets.py
Put your strings into your secrets file, then import the values into your main.py file:
from .secrets import secret_val
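If you would rather keep the configuration out of Python entirely, the standard library's configparser reads plain INI files. A minimal sketch follows; the section and key names (and the config.ini layout) are made up for illustration:

```python
import configparser

# contents of a hypothetical config.ini shipped next to the script
INI_TEXT = """
[paths]
buyer = https://asd.cvs
output_dir = C:\\Users\\Mike\\Desktop\\Result
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)  # with a real file, use: config.read("config.ini")

buyer_path = config["paths"]["buyer"]
output_dir = config["paths"]["output_dir"]
print(buyer_path)
```

Each user then edits config.ini for their own machine instead of touching the code.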
I'm trying to load CSV data into a TensorFlow Dataset object, but I don't know how to associate the labels with the CSV files given my directory structure.
I've got a directory structure like:
gesture_data/
├── train/
│ └── gesture{0001..9999}/ <- each directory name is the label
│ └── {timestamp}.txt <- each file is an observation associated with that label
├── test/
└── valid/
Despite having a .txt extension, all the files in gesture_data/{test,train,valid}/gesture{0001..9999}/*.txt are CSV files, with a format like:
│ File: train/gesture0002/2022-05-24T01:59:08.244689+02:00.txt
───────┼─────────────────────────────────────────────────────────────
1 │ 0,391,478,528,374,495,471,405,471,438,396,510,473,401,475,192,383,516,501,412,496,453,395,496,445,376,479,470,402,488,445
2 │ 19,402,488,514,371,494,471,407,472,441,390,514,475,406,488,185,395,499,496,399,488,451,409,490,463,382,490,467,403,487,467
3 │ 40,404,490,526,372,484,487,408,472,441,395,506,477,406,474,193,398,496,504,414,493,459,405,476,446,393,495,467,399,473,447
4 │ 56,400,491,525,370,479,486,386,457,439,383,511,466,406,473,192,398,505,503,411,476,450,412,494,461,389,491,467,397,483,392
5 │ 82,391,478,524,371,483,486,408,473,437,394,513,456,410,483,186,397,500,494,398,491,442,402,490,468,386,495,452,386,491,409
... about 200 more lines after this
Where the first value on a line is milliseconds since the start of recording, and after that are 30 sensor readings taken at that millisecond offset.
Each file is one observation, and the directory the file is in is the label of that observation. So all the files under gesture0001 should have the label gesture0001, all the files under gesture0002 should have the label gesture0002, and so on.
I can't see how to do that easily without making my own custom mapping, but this seems like a common data format and directory structure, so I'd imagine there'd be an easier way to do it?
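For reference, the custom mapping I could write myself would just derive the label from the parent directory name, something like this sketch (the file name here is made up):

```python
from pathlib import Path

def label_for(observation_path):
    # the immediate parent directory name is the label, e.g. "gesture0002"
    return Path(observation_path).parent.name

label = label_for("gesture_data/train/gesture0002/2022-05-24T01-59-08.txt")
print(label)  # gesture0002
```

But I'd prefer something built into the tf.data pipeline if it exists.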
Currently I read in the files like:
gesture_ds = tf.data.experimental.make_csv_dataset(
    file_pattern="../gesture_data/train/*/*.txt",
    header=False,
    column_names=['millis'] + fingers,  # `fingers` is an array labeling each of the sensor measurements
    batch_size=10,
    num_epochs=1,
    num_parallel_reads=20,
    shuffle_buffer_size=10000
)
But I don't know how to label the data from here. I found the label_name parameter to make_csv_dataset but that requires the label name to be one of the columns of the CSV file.
I can restructure the CSV file to include the label name as a column, but I'm expecting a lot of data and don't want to bloat the files if I can possibly help it.
Thanks!
My goal is to append 9 excel files together that exist in different directories. I have a directory tree with the following structure:
Big Folder
|
├── folder_1/
| ├── file1.xls
| ├── file2.xls
| └── file3.xls
|
├── folder_2/
| ├── file4.xls
| ├── file5.xls
| └── file6.xls
|
├── folder_3/
| ├── file7.xls
| ├── file8.xls
| └── file9.xls
I successfully wrote a loop that appends file1, file2, and file3 together within folder_1. My idea is to nest this loop inside another loop that iterates through each folder. I'm currently trying to use os.walk to accomplish this, but am running into the following error in folder_1:
[Errno 2 No such file or directory]
Do community members have recommendations on how to extend this loop to execute in each directory? Thanks!
It is hard for me to know how you have implemented the program without some sort of code to work with; however, I believe you have misused the os.walk() method, please read about it here.
I would use the os.walk() method in the following way to get the paths to the various files in the current directory and its subdirectories:
import os
all_files = [(path, files) for path, dirs, files in os.walk(".")]
and then get all the files which end with ".xls", like so:
all_xls_files = [
    os.path.join(path, xls_file)
    for (path, xls_files_list) in all_files
    for xls_file in xls_files_list
    if xls_file.endswith(".xls")
]
This is equivalent to:
all_xls_files = []
for (path, xls_files_list) in all_files:
    for xls_file in xls_files_list:
        if xls_file.endswith(".xls"):
            all_xls_files.append(os.path.join(path, xls_file))
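As a side note, the same list can be built in one step with the standard library's glob module and a recursive pattern. The sketch below runs on a small throwaway directory tree so the result is predictable:

```python
import glob
import os
import tempfile

# build a tiny throwaway tree: two folders, one .xls file each
root = tempfile.mkdtemp()
for folder in ("folder_1", "folder_2"):
    os.makedirs(os.path.join(root, folder))
    open(os.path.join(root, folder, "file.xls"), "w").close()

# '**' with recursive=True descends into subdirectories, much like os.walk
xls_files = glob.glob(os.path.join(root, "**", "*.xls"), recursive=True)
print(sorted(os.path.basename(p) for p in xls_files))  # ['file.xls', 'file.xls']
```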
Once you obtain the list of Excel files with their paths, you can open them with:
with open("my_output_file", "w") as output_file:
    for file in all_xls_files:
        with open(file) as f:
            # Do your append here