Python Setup: Disabling Path Length Limit Pros and Cons?

I recently installed Python 3.7 and at the end of the setup, there is the option to "Disable path length limit". I don't know whether or not I should do this.
What are the pros and cons of doing this? Just from the sound of it, it seems like you should always disable it.

I recommend selecting that option and thereby removing the path length limit. It will potentially save you time in the future debugging an avoidable issue.
Here is an anecdote of how I came to know about it:
During the compilation of my program (C# code on a Windows machine), I started getting the following error:
error MSB3541: Files has invalid value "long\path\filename". The specified path,
file name, or both are too long. The fully qualified file name must be less than
260 characters, and the directory name must be less than 248 characters.
This error was preventing me from building my project, and the only apparent solution was to shorten my path/file names. It turns out this is a long-standing limitation of the Windows API (the 260-character MAX_PATH limit): Why does the 260 character path length limit exist in Windows?
After a couple of decades with the limitation baked into the Windows API (Unix-based systems never had it), it has finally been addressed in Windows 10 (https://learn.microsoft.com/en-us/windows/desktop/FileIO/naming-a-file#maximum-path-length-limitation), but long-path support is not enabled automatically and needs a registry (or group policy) setting. The option in the Python installer makes that change for you, saving you a lot of headache.
Do note that enabling long paths can:
a) break compatibility of your programs with older versions of Windows (and earlier Windows 10 builds that predate long-path support) when they use long file/directory names and paths.
b) break programs on Windows 10 machines that do not have this option enabled, when they use long file/directory names and paths.
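If you want to check which way your machine is currently configured, here is a minimal sketch (assuming a standard Windows 10 install): it reads the LongPathsEnabled value under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem, which is the setting the installer option is generally understood to toggle.

    # Hedged sketch: read the Windows long-path registry setting (Windows only).
    import winreg

    def long_paths_enabled():
        key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            try:
                value, _ = winreg.QueryValueEx(key, "LongPathsEnabled")
            except FileNotFoundError:
                return False  # value absent: the limit is still in effect
            return value == 1

    if __name__ == "__main__":
        print("Long paths enabled:", long_paths_enabled())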

To answer both of your questions:
Should you disable this?
The quick answer is that it doesn't matter much: the setting only comes into play when you work with paths longer than 260 characters, which most people never do.
What are the pros and cons of disabling the path length limit?
Pros
you won't get an error when working with filepaths longer than 260 characters, so there's less to worry about regarding path length
it can make debugging easier
Cons
technically, disabling the limit has no negative side effects of its own, but:
if you work in a team, it might introduce bugs where code works on your machine but not on a teammate's, because you have the path limit disabled and they don't
disabling it can have negative human-behaviour side effects: allowing long paths could promote bad naming habits in your team regarding path names and folder structure, whereas a limit forces people to shorten their paths.
E.g. I've worked in teams with paths like this, and allowing them longer names would have resulted in less readable filepaths:
c:/project_name/unity/files/assets/UI/UI_2.0/levelname/season2_levelname/release_season2_levelname_ui_2/PROJECT_S2_MENU_UI/PROJECT_S2_hover_button_shadow_ui/PROJECT_S2_hover_button_shadow_ui_blue/PROJECT_S2_hover_button_shadow_ui_blue.asset
Explanation
To understand the pros and cons, it helps to understand what the path length limit is.
Windows path length
You probably already know that a Windows path is a string that represents where to find a file or folder.
e.g. C:\Program Files\7-Zip
Longer folder or file names result in a longer string.
e.g. C:\Program Files\Microsoft Update Health Tools
More folders nested inside other folders also result in a longer string.
e.g. C:\Program Files\Microsoft Update Health Tools\Logs
File path length errors
If you have a lot of folders nested inside each other, with long names, you might run into an error when trying to use such a path in your code.
This is because Windows has a path length limit. An update in Windows 10 lets you disable this limitation, but it doesn't do so by default.
Disabling this limitation allows your computer to use longer paths without errors.
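As a rough, hedged illustration of the kind of failure you can hit while the limit is still in place (exact behaviour depends on your Windows version and settings; C:\temp is just an assumed writable location):

    # Sketch: build a path well over 260 characters and try to create a file there.
    # With the limit active, os.makedirs/open typically fail with an OSError;
    # with long paths enabled they usually succeed.
    import os

    base = r"C:\temp"  # assumed writable directory
    deep = os.path.join(base, *["a_fairly_long_folder_name"] * 12)  # > 260 characters

    try:
        os.makedirs(deep, exist_ok=True)
        with open(os.path.join(deep, "test.txt"), "w") as f:
            f.write("hello")
        print("Long path worked:", len(deep), "characters")
    except OSError as exc:
        print("Long path failed:", exc)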
Why does this happen?
The old Windows API promised that if you wrote your application correctly, it would continue to work in the future.
If Windows were simply to start allowing paths longer than 260 characters, then your existing application (which used the Windows API correctly) could fail.
Microsoft did create a way to use the full ~32,767-character path names, but it had to create a new API contract to do it. This is the opt-in change in Windows 10.
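That new contract is also what the \\?\ path prefix opts into: an application that explicitly prefixes an absolute, normalized path tells Windows it can handle paths beyond 260 characters. A small, hedged sketch of building such a path in Python (to_extended_path is a hypothetical helper, not a standard function):

    # Sketch: the \\?\ prefix asks the Windows API to skip the MAX_PATH check.
    # It only makes sense on absolute, normalized paths (no ".." segments, no "/").
    import os

    def to_extended_path(path):
        absolute = os.path.abspath(path)
        if absolute.startswith("\\\\?\\"):
            return absolute
        return "\\\\?\\" + absolute

    print(to_extended_path(r"C:\temp\some\deep\folder\file.txt"))
    # \\?\C:\temp\some\deep\folder\file.txt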
read more on why

I am keeping this simple and straightforward.
The "Disable path length limit" option refers to the maximum length of the file paths that Windows can handle. Removing this limit allows for longer file paths, which can be useful if you are working with files that have very long names or are stored in deeply nested directories. However, it can also cause compatibility issues with some programs, particularly older ones that may not be designed to support long file paths.
In general, it's usually not necessary to remove the path length limit unless you have a specific need for it. If you're not sure whether you need it, it's probably best to leave the limit in place.

Generally, it's not a good idea to disable it, especially if you have programs that could potentially break upon disabling it.
I have a lot of older programs. If I disabled the limit and later forgot about it, I would have to remember (and figure out how) to re-enable it, and re-enabling it could break any program that had started relying on long file paths in its scripts. For me that makes having it off unhelpful, and quite possibly a waste of time and debugging effort.
But to defend its existence, in certain environments it can be helpful, especially in environments where making subfolders upon subfolders is key. Particularly, this is helpful when making a game with a lot of assets. But again, there are many ways to shorten subfolders (and files), and doing that makes it generally easier to type out the path if you aren't copy-and-pasting everywhere. (For example, C:\my_game\assts\01\plyr\walk_01.png is easier to type than C:\my_epic_game_featuring_my_awesome_character\assets\…)
If you have a virtual machine or just another OS to try this on, where you don't have to worry about specific programs breaking when you disable the path limit, it'd probably be useful to have it off; but for everything else, just be wary of its potential to create more bugs than it fixes.

Related

In my "small" python exe GUI program the tcl folder has 820 files (mostly tzdata). Any chance of reducing this number?

As stated in the title: I have a "small" Python exe GUI program generated by PyInstaller, which creates a tcl folder that has 820 files (mostly tzdata). Any chance of reducing this number?
It takes a long time to copy the program because of all the tiny files.
I've used the datetime library. I just need the date and time to pop up on a pdf that I'm printing, so it doesn't need to be that fancy. I just need the time on the computer :)
I can use "--onefile" to just get the .exe, but that takes too long to open.
Program is only for Windows atm.
You can almost certainly delete the http1.0 and opt0.4 directories outright. They're obsolete packages included for backward compatibility only.
The *.tcl and tclIndex files should be left (except for parray.tcl, which you likely don't need).
Of the encoding, msgs and tzdata directories, if you're deploying in a restricted set of locations, you can delete a lot of that; you only need the encodings, message catalogs and timezone definitions that you actually use when running. Thus, if you're only supporting English speakers in the USA, you can delete a very large fraction of the files. (If you're not using Tcl to format or parse dates at all, you don't need any timezone definitions.) The main encoding that you must retain is the one that the scripts are written in! (NB: support for the UTF-8 and ISO8859-1 encodings, and the UTF-16-derived ones used for talking to the Windows API, is built directly into Tcl; you can't remove support for them.)
Which things you can remove depend on your application and where you deploy it. That's why we can't tell you outright which files to delete.
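If you do work out which directories are safe to drop, a hedged sketch like this can prune them from a one-folder PyInstaller build after the fact; dist/yourapp and the list of removable directories are assumptions you must adapt to your own build, and you should re-test the program after deleting anything:

    # Sketch: prune Tcl data directories from a PyInstaller one-folder build.
    import shutil
    from pathlib import Path

    dist_tcl = Path("dist/yourapp/tcl")  # assumed location of the bundled tcl folder
    removable = ["tzdata", "msgs", "http1.0", "opt0.4"]  # keep anything you actually use

    for name in removable:
        target = dist_tcl / name
        if target.is_dir():
            shutil.rmtree(target)
            print("removed", target)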
Generally the 'blunt' approach is to attack the problem by deleting the files (or some of them) and seeing if your program works as intended without any bugs. That can sometimes be rather complicated and time consuming, and sometimes not even possible.
Libraries like PyInstaller and cx_Freeze tend to be very inclusive of files that you don't even need, so that the program is guaranteed to work.
Generally I advise you to create an installer for your program (e.g. with Inno Setup); that will look much more professional and will diminish your current problem.
Also, Python supports zipped libraries, which can drastically decrease the size of the app for some libraries. See one of my own questions on the topic: Python3 compiled App Decrease Size with zip?.
Have fun!

Interruption of an os.rename in Python

I made a script in Python that renames all files and folders (it does not recurse) in the "." directory: the directory the script is kept in. It happened that I ran the script in a directory that contained no files and only one directory, let's say imp, with path .\imp. While the program was renaming it, the electricity went off and the job was interrupted (sorry, I didn't have a UPS).
Now, as the name suggests, assume imp contains important data. The renaming process also took quite a long time (compared to others) before the electricity went off, even though all it was renaming was one folder. After this ordeal, is any data corrupted, lost or anything else?
Just to make this more useful: what happens if os.rename is forced to stop while it is doing its job? How does the effect differ between files and folders?
Details
Python Version - 2.7.10
Operating System - Windows 10 Pro
You are using Windows, which means you are (probably) on NTFS. NTFS is a modern, journaling file system. It should not corrupt or lose any data, though it's possible that only some of the changes that constitute a rename have been applied (for instance, the filename might change without updating the modification time, or vice-versa). It is also possible that none of those changes have been applied.
Note the word "should" is not the same as "will." NTFS should not lose data in this fashion, and if it does, it's a bug. But because all software has bugs, it is important to keep backups of files you care about.
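If you want the renaming script itself to be easier to reason about after an interruption, one hedged idea (a general journaling sketch, not something from the answer above) is to record each rename before performing it, so you can see afterwards exactly which steps completed:

    # Sketch: rename everything in a directory, logging each step to a journal file
    # so that after a crash or power loss you can see which renames finished.
    import os

    def rename_all(directory, journal_path="rename_journal.txt"):
        with open(journal_path, "a") as journal:  # append: keep entries from earlier runs
            for name in os.listdir(directory):
                old = os.path.join(directory, name)
                new = os.path.join(directory, "renamed_" + name)  # example naming scheme
                if os.path.abspath(old) == os.path.abspath(journal_path):
                    continue  # don't rename the journal itself
                journal.write("START %s -> %s\n" % (old, new))
                journal.flush()
                os.fsync(journal.fileno())  # get the intent onto disk before renaming
                os.rename(old, new)
                journal.write("DONE  %s\n" % new)

    rename_all(".")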

Should I delete temporary files created by my script?

It's a common question, not specific to any language or platform: who is responsible for a file created in the system's $TEMP folder?
If it's my duty, why should I care where I put the file? I could place it anywhere with the same result.
If it's the OS's responsibility, can I forget about the file right after use?
Thanks and sorry for my basic English.
As a general rule, you should remove the temporary files that you create.
Recall that the $TEMP directory is a shared resource that other programs can use. Failure to remove the temporary files will have an impact on the other programs that use $TEMP.
What kind of impact? That will depend upon the other programs. If those other programs create a lot of temporary files, their execution will be slower: creating a new temporary file takes longer because the directory has to be scanned on each creation to ensure that the file name is unique.
Consider the following (based on real events) ...
In years past, my group at work had to use the Intel C Compiler. We found that over time it appeared to be slowing down: running our sanity tests with it took longer and longer, and so did building/compiling a single C file. We tracked the problem down.
ICC was opening, stat'ing and reading every file under $TEMP. For what purpose, I know not. Although the argument can be made that the problem lay with ICC, the existence of the files under $TEMP was slowing it and our development team down. Deleting those temporary files resulted in the sanity checks running in less than half an hour instead of over two hours, a significant time saver.
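In Python specifically, the standard library can do the cleanup for you, so temporary files never pile up in $TEMP in the first place; a minimal sketch:

    # Sketch: temporary files and directories that remove themselves.
    import tempfile
    from pathlib import Path

    # A temporary directory that is deleted, with its contents, when the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        scratch = Path(tmp_dir) / "scratch.txt"
        scratch.write_text("intermediate results")
        # ... work with the file ...
    # tmp_dir and everything in it is gone here.

    # A single temporary file, deleted automatically when it is closed.
    with tempfile.NamedTemporaryFile(mode="w+", suffix=".tmp") as tmp_file:
        tmp_file.write("more scratch data")
        tmp_file.seek(0)
        print(tmp_file.read())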
Hope this helps.
There is no standard and no common rules. In most OSs, the files in the temporary folder will pile up. Some systems try to prevent this by deleting files in there automatically after some time but that sometimes causes grief, for example with long running processes or crash backups.
The reason for $TEMP to exist is that many programs (especially in early times when RAM was scarce) needed a place to store temporary data since "super computers" in the 1970s had only a few KB of RAM (yes, N*1024 bytes where N is << 100 - you couldn't even fit the image of your mouse cursor into that). Around 1980, 64KB was a lot.
The solution was a folder where anyone could write. Security wasn't an issue at the time, memory was.
Over time, OSs started to get better systems to create temporary files and to clean them up but backwards compatibility prevented a clean, "work for all" solution.
So even though you know where the data ends up, you are responsible to clean up the files after yourself. To make error analysis easier, I tend to write my code in such a way that files are only deleted when everything is fine - that way, I can look at intermediate results to figure out what is wrong. But logging is often a better and safer solution.
Related: Memory prices 1957-2014. 12 KB of RAM cost US $4,680 in 1973.

Where should I write a user-specific log file to (and be XDG base directory compatible)?

By default, pip logs errors into "~/.pip/pip.log". Pip has an option to change the log path, and I'd like to put the log file somewhere besides ~/.pip so as not to clutter up my home directory. Where should I put it and be XDG base dir compatible?
Right now I'm considering one of these:
$XDG_DATA_HOME (typically $HOME/.local/share)
$XDG_CACHE_HOME (typically $HOME/.cache)
This is, for the moment, unclear.
Different software seems to handle this in different ways (imsettings puts it in $XDG_CACHE_HOME, profanity in $XDG_DATA_HOME).
Debian, however, has a proposal which I can get behind (emphasis mine):
This is a recurring request/complaint (see this or this) on the xdg-freedesktop mailing list: to introduce another directory for state information that does not belong in any of the existing categories (see also home-dir.proposal). Examples of this kind of information are:
history files of shells, repls, anything that uses libreadline
logfiles
state of application windows on exit
recently opened files
last time application was run
emacs: bookmarks, ido last directories, backups, auto-save files, auto-save-list
The above example information is not essential data. However it should still persist on reboots of the system unlike cache data that a user might consider putting in a TMPFS. On the other hand the data is rather volatile and does not make sense to be checked into a VCS. The files are also not the data files that an application works on.
A default folder for a future STATE category might be: $HOME/.local/state
This would effectively introduce another environment variable since $XDG_DATA_HOME usually points to $HOME/.local/share and this hypothetical environment variable ($XDG_STATE_HOME?) would point to $HOME/.local/state
If you really want to adhere to the current standard I would place my log files in $XDG_CACHE_HOME since log files aren't required to run the program.
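A hedged sketch of how that reasoning might look in code, preferring $XDG_STATE_HOME when it is set and otherwise falling back to $XDG_CACHE_HOME (the log_dir helper and its app_name argument are illustrative, not part of any standard API):

    # Sketch: pick a per-user log directory following the XDG discussion above.
    import os
    from pathlib import Path

    def log_dir(app_name):
        state_home = os.environ.get("XDG_STATE_HOME")  # newer "state" proposal
        if state_home:
            base = Path(state_home)
        else:
            base = Path(os.environ.get("XDG_CACHE_HOME", str(Path.home() / ".cache")))
        path = base / app_name
        path.mkdir(parents=True, exist_ok=True)
        return path

    print(log_dir("pip"))  # e.g. ~/.cache/pip when neither variable is set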

File changed event network share

What I want to do:
I have two directories, each containing about 90,000 xml and bak files.
I need the xml files in the two folders to stay in sync whenever a file changes (the newer one should be copied, of course).
The problem is:
Because of the huge number of files, and the fact that one of the directories is a network share, I can't just loop through the directory and compare os.path.getmtime(file) values.
Even watchdog and PyQt don't work (I tried the solutions from here and here).
The question:
Is there any other way to get a file-changed event (on Windows systems) that works for this configuration without looping through all those files?
So I finally found the solution:
I changed some of my network share settings and used the FileSystemWatcher.
To prevent files from being synced again as a side effect of the sync itself, I use an MD5 file hash.
The code I use can be found on pastebin (it's quick and dirty, and contains just the parts mentioned in the question here).
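The hashing part of that approach needs only the standard library; a minimal, hedged sketch (md5_of and copy_if_changed are hypothetical helpers, not the pastebin code) that copies a file only when its content actually differs:

    # Sketch: hash-compare two files and copy only when the contents differ, so a
    # sync triggered by our own copy doesn't bounce back and forth.
    import hashlib
    import shutil
    from pathlib import Path

    def md5_of(path, chunk_size=1 << 20):
        digest = hashlib.md5()
        with Path(path).open("rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def copy_if_changed(src, dst):
        if Path(dst).exists() and md5_of(src) == md5_of(dst):
            return False  # identical content: nothing to do
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        return True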
The code in https://stackoverflow.com/a/12345282/976427 seems to work for me when passed a network share.
I'm risking giving an answer that's way off here (you didn't specify requirements regarding speed, etc.), but... Dropbox would do exactly that for you for free, and would require writing no code at all.
Of course it might not suit your needs if you require real-time syncing, or if you want to avoid "sharing" your files with a third party (although you can encrypt them first).
Can you use the second option on this page?
http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
From mentioning watchdog, I assume you are running under Linux. For the local machine inotify can help, but for the network share you are out of luck.
Mercurial's inotify extension http://hgbook.red-bean.com/read/adding-functionality-with-extensions.html has the same limitation.
In a similar situation (10K+ files) I used a cloned Mercurial repository with inotify on both the server and the local machine. They automatically committed and notified each other of changes. It had a slight delay (no problem in my case) but as a benefit kept a full history of changes and easily resynced after one of the systems had been down.
