Im splitting a CSV file based on column "ColumnName". How can I make all the CSV files created save into a specified path?
data = pd.read_csv(r'C:\Users\...\Output.csv')
for (ColumnName), group in data.groupby(['ColumnName']):
group.to_csv('{ColumnName}.csv', index=False)
Thanks
pandas.DataFrame.to_csv() takes a string path as input to write to said path.
With your current code group.to_csv('{ColumnName}.csv', index=False), {ColumnName} is being interpreted as a normal string. If you wanted variable substition in this case Python has many methods, two would be:
f-strings - Introduced in Python 3.6
group.to_csv('{ColumnName}.csv', index=False)
str.format
group.to_csv('{}.csv'.format(ColumnName), index=False)
Specifying path
Following this. If you're looking to specify more than just the file name, you are able to specify the full file path or the file path relative to the current directory.
Providing full file path
Full file paths require describing the path from the root context. In windows this would be providing a path such as f'C:\Users\mycsvfolder\{ColumnName}.csv'. Providing the full path to to_csv() will have the file written there.
Note In linux, root context starts at /. So for example /Users/myuser/mycsvfolder/file.csv would be the full file path.
Providing a relative file path
Relative file paths take into account the current folder. For example, to instead write to a folder within the current folder you are able to specify f'mycsvfolder/{ColumnName}.csv' and the file will be written to the specified folder in the current directory. It's with this method that writing f'{ColumnName}.csv' will write a file, but to the current directory as work is relative to the current directory unless otherwise specified.
Note when writing to folder
You will need to create folders before writing to them in most cases. Some write functions do provide folder creation functionality however.
Additional material regarding paths, specifically in Python.
Related
I have a script in python, I want to import a csv from another folder. how can I do this? (for example, my .py is in a folder and I want to reach the data from the desktop)
First of all, you need to understand how relative and absolute paths work.
I write an example using relative paths. I have two folders in desktop called scripts which includes python files and csvs which includes csv files. So, the code would be:
df = pd.read_csv('../csvs/file.csv)
The path means:
.. (previous folder, in this case, desktop folder).
/csvs (csvs folder).
/file.csv (the csv file).
If you are on Windows:
Right-click on the file on your desktop, and go to its properties.
You should see a Location: tag that has a structure similar to this: C:\Users\<user_name>\Desktop
Then you can define the file path as a variable in Python as:
file_path = r'C:\Users\<your_user_name>\Desktop\<your_file_name>.csv'
To read it:
df = pd.read_csv(file_path)
Obviously, always try to use relative paths instead of absolute paths like this in your code. Investing some time into learning the Pathlib module would greatly help you.
My work needs me to collect some file names and their generating time.
I am using the fileName = tkinter.filedialog.askopenfilenames() to realize the function, that the program pops up a window to ask for selecting files, then I can get the files' pathes and then use fileGeneratedTime = datetime.datetime.fromtimestamp(os.path.getmtime(fileName)) to get the files' generating time.
But now here comes the problem. When I want to get the path of a .lnk file, however, it returns the path of the file which the .lnk file is pointing to. It is OK to run the program on the origin computer that has the .lnk files, but when I copy the .lnk files to other computers, the program says FileNotFoundError.
So, is there any parameters that can make the fileName = tkinter.filedialog.askopenfilenames() returns the .lnk file itself's path (not the path of the file which the .lnk file points to)? Or is there any other ways to realize the same function?
Thanks for your answering!
I have 2 directories. I had a python program located in dir_1 writing to a .txt file also in dir_1. I meant to create them in dir_2, but when I move them both to dir_2, the python program, instead of writing to the existing .txt file that is now with it in dir_2, creates a new .txt file in dir_1 and writes to it. How do I fix this? I'm very new to programming and python and googling didn't help me out, probably because I didn't know what exactly to search.
with open('guest_book.txt', 'w') as file:
while True:
name = input('Please enter your name: ')
if name == 'q':
break
else:
print(f"Hello, {name.title()}!\nYou have been added to the guest"
f"book")
file.write(f"{name.title()}\n")
Python writes to the file location you supply it with. If this file location is a relative path, then it will create files relative to the directory of the script. I.e. when you move the script then the .txt file will be created relative to the new directiory.
On the other hand, if you provide an absolute path, then it does not matter where the script is located / where you execute it from. Instead, it will create the file at that location always.
From the sounds of it, you are using an absolute path when you want a relative path.
So change from something like /home/bob/file.txt (Linux) or C:\\Users\Bob\file.txt (Win) to simply file.txt or even ./file.txt.
Update: Since you were using a relative location all along, the problem will lie with the context that you are executing the script from. Your code is not the issue here, it is how you are executing it.
As vlovero suggests, maybe your IDE is not executing the new file in its new location?
One way you can test this robustly is to navigate to dir_2 in a terminal and run
python your_program_name.py
This will execute the script in the dir_2 location.
Since you have not specified an absolute path, your program is then specifying a directory relative to the current working directory (if instead, for example, you had specified a path such as '../guest_book.txt', you would have been specifying a directory one level above the current working directory). So let's imagine your OS is Linux and the Python program resides in /my_home/programs:
cd /my_home/data # this is the current working directory
python ../programs/your_program.py
The current working directory when the program is executed is /home/my_home/data even though the program being executed resides in /my_home/programs, and thus the output file will be created in the /my_home/data directory. os.getcwd() can be called to tell you what the current working directory is.
I'm doing a Python course on Udacity. And there is a class called Rename Troubles, which presents the following incomplete code:
import os
def rename_files():
file_list = os.listdir("C:\Users\Nick\Desktop\temp")
print (file_list)
for file_name in file_list:
os.rename(file_name, file_name.translate(None, "0123456789"))
rename_files()
As the instructor explains it, this will return an error because Python is not attempting to rename files in the right folder. He then proceeds to check the "current working directory", and then goes on to specify to Python which directory to rename files in.
This makes no sense to me. We are using the for loop to specifically tell Python that we want to rename the contents of file_list, which we have just pointed to the directory we need, in rename_files(). So why does it not attempt to rename in that folder? Why do we still need to figure out cwd and then change it?? The code looks entirely logical without any of that.
Look closely at what os.listdir() gives you. It returns only a list of names, not full paths.
You'll then proceed to os.rename one of those names, which will be interpreted as a relative path, relative to whatever your current working directory is.
Instead of messing with the current working directory, you can os.path.join() the path that you're searching to the front of both arguments to os.rename().
I think your code needs some formatting help.
The basic issue is that os.listdir() returns names relative to the directory specified (in this case an absolute path). But your script can be running from any directory. Manipulate the file names passed to os.rename() to account for this.
Look into relative and absolute paths, listdir returns names relative to the path (in this case absolute path) provided to listdir. os.rename is then given this relative name and unless the app's current working directory (usually the directory you launched the app from) is the same as provided to listdir this will fail.
There are a couple of alternative ways of handling this, changing the current working directory:
os.chdir("C:\Users\Nick\Desktop\temp")
for file_name in os.listdir(os.getcwd()):
os.rename(file_name, file_name.translate(None, "0123456789"))
Or use absolute paths:
directory = "C:\Users\Nick\Desktop\temp"
for file_name in os.listdir(directory):
old_file_path = os.path.join(directory, file_name)
new_file_path = os.path.join(directory, file_name.translate(None, "0123456789"))
os.rename(old_file_path, new_file_path)
You can get a file list from ANY existing directory - i.e.
os.listdir("C:\Users\Nick\Desktop\temp")
or
os.listdir("C:\Users\Nick\Desktop")
or
os.listdir("C:\Users\Nick")
etc.
The instance of the Python interpreter that you're using to run your code is being executed in a directory that is independent of any directory for which you're trying to get information. So, in order to rename the correct file, you need to specify the full path to that file (or the relative path from wherever you're running your Python interpreter).
I made a script in the past to mass rename any file greater than x characters in a directory. When I made that script I had a source directory which you would need to input manually. Any file that was over x characters in that directory would be stripped of it's extension, renamed, then the extension would be re added and it would use os.path.join to join the source and the newly created filename+ext. I'm now making another script and used os.path.join("Folder in the current dir", "file in that dir"). Because this worked I'm guessing that when os.path.join is called with just a foldername and no full path in it's first parameter it starts it's search from the directory that the script it was run in? Just wondering if this is correct.
os.path.join has nothing to do with any actual filesystem, and does not "start" anywhere. It simply joins two arbitrary paths, whether they exist or not.
What os.path.join does is to just join path elements the system-compatible way, taking into effect the particular directory separator character, etc., into account. It's a simple string manipulation tool.
So the returned result simply starts from whatever you give to it as the first argument.