In Python, how can I tell whether a given file is being used?

I have a list of files and only one of them is in use at a time, so I want to know which file is being used by a specific program. I can use Unlocker to find out which file is in use, as mentioned in the question I linked, but I want a programmatic way, so that my program can find out by itself. Is there any way?
In particular, the plain open function, in whatever r/w mode, CAN access the in-use file, and Python won't throw any exception. I can only tell which file is being used with Unlocker.
I have found that opening the file in w mode with Python's open takes the access away from that specific program, and the program then stops working properly. At that moment I open Unlocker and can see two processes accessing the file. Is there any 'weak' method that only detects whether the file is being used?

I'm not sure which of these two you want to find:
Are there any existing HANDLEs for a given file, like the handle and Process Explorer tools show?
Are there any existing locked HANDLEs for a given file, like the Unlocker tool shows?
But either way, the answer is similar.
Obviously it's doable, or those tools couldn't do it, right? Unfortunately, there is nothing in the Python stdlib that can help. So, how do you do it?
You will need to access Windows API functions, through pywin32, ctypes, or otherwise.
There are two ways to go about it. The first, mucking about with the NT kernel object APIs, is much harder, and only really needed if you need to work with very old versions of Windows. The second, NtQuerySystemInformation, is the one you probably want.
The details are pretty complicated (and not well documented), but they're explained in a CodeProject sample program and on the Sysinternals forums.
If you don't understand the code on those pages, or how to call it from Python, you really shouldn't be doing this. If you get the gist, but have questions, or get stuck, of course you can ask for help here (or at the sysinternals forums or elsewhere).
However, even if you have used ctypes for Windows before, there are a few caveats you may not know about:
Many of these functions either aren't documented, or the documentation doesn't tell you which DLL to find them in. They will all be in either ntdll or kernel32; in some cases, however, the documented NtFooBar name is just an alias for the real ZwFooBar function, so if you don't find NtFooBar in either DLL, look for ZwFooBar.
At least on older versions of Windows, ZwQuerySystemInformation does not act as you'd expect: you cannot call it with a 0 SystemInformationLength, check the ReturnLength, allocate a buffer of that size, and try again. The only thing you can do is start with a buffer with enough room for, say, 8 handles, try that, see if you get a STATUS_INFO_LENGTH_MISMATCH error, increase that number by 8, and try again until it succeeds (or fails with a different error). The code looks something like this (assuming you've defined the SYSTEM_HANDLE structure):
# SYSTEM_HANDLE is assumed to be defined already (see further down)
from ctypes import Structure, byref, c_ulong, sizeof, windll

ntdll = windll.ntdll
ntdll.ZwQuerySystemInformation.restype = c_ulong   # NTSTATUS, kept unsigned so the comparison works
SystemHandleInformation = 16                       # SYSTEM_INFORMATION_CLASS value
STATUS_INFO_LENGTH_MISMATCH = 0xc0000004

def get_system_handles():
    i = 8
    while True:
        class SYSTEM_HANDLE_INFORMATION(Structure):
            _fields_ = [('HandleCount', c_ulong),
                        ('Handles', SYSTEM_HANDLE * i)]
        buf = SYSTEM_HANDLE_INFORMATION()
        return_length = c_ulong(0)
        rc = ntdll.ZwQuerySystemInformation(SystemHandleInformation,
                                            byref(buf), sizeof(buf),
                                            byref(return_length))
        if rc == STATUS_INFO_LENGTH_MISMATCH:
            i += 8
            continue
        elif rc == 0:
            return buf.Handles[:buf.HandleCount]
        else:
            raise SomeKindOfError(rc)   # substitute your own exception type
Finally, the documentation doesn't really explain this anywhere, but the way to get from a HANDLE that you know is a file to a pathname is a bit convoluted. Just using NtQueryObject(ObjectNameInformation) returns you a kernel object space pathname, which you then have to map to either a DOS pathname, a possibly-UNC normal NT pathname, or a \\?\ pathname. Of course the first doesn't work for files on network drives without a mapped drive letter; neither of the first two works for files with very long pathnames.
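To give a flavor of just that last mapping step, here's a minimal sketch (my own, not taken from the CodeProject sample) that translates a kernel device path such as \Device\HarddiskVolume2\foo.txt into a drive-letter path using QueryDosDeviceW. It only handles locally mapped drive letters, which is exactly the limitation described above:
import string
from ctypes import windll, create_unicode_buffer

def device_path_to_dos_path(kernel_path):
    buf = create_unicode_buffer(1024)
    for letter in string.ascii_uppercase:
        drive = u'%s:' % letter
        # QueryDosDeviceW maps u'C:' to its device name, e.g. \Device\HarddiskVolume2
        if windll.kernel32.QueryDosDeviceW(drive, buf, 1024):
            device = buf.value
            if kernel_path.lower().startswith(device.lower() + u'\\'):
                return drive + kernel_path[len(device):]
    return None   # UNC paths, unmapped network drives, etc. need extra handling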
Of course there's a simpler alternative: Just drive handle, Unlocker, or some other command-line tool via subprocess.
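For example, here's a minimal sketch that drives the Sysinternals handle tool. It assumes handle.exe is on your PATH and that you've accepted its EULA once; its exit status and output format aren't guaranteed to stay stable across versions, so treat the check as illustrative only:
import subprocess

path = r'C:\Path\To\File\To\Check.txt'
# handle.exe prints one line per process holding an open handle whose name matches
output = subprocess.check_output(['handle.exe', path], universal_newlines=True)
if path.lower() in output.lower():
    print('%s is in use' % path)
else:
    print('no open handles found for %s' % path)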
Or, somewhere in between, build the CodeProject project linked above and just open its DLL via ctypes and call its GetOpenedFiles method, and it will do the hard work for you.
Since the project builds a normal WinDLL-style DLL, you can call it in the normal ctypes way. If you've never used ctypes, the examples in the docs show you almost everything you need to know, but I'll give some pseudocode to get you started.
First, we need to create an OF_CALLBACK type for the callback function you're going to write, as described in Callback functions. Since the prototype for OF_CALLBACK is defined in a .h file that I can't get to here, I'm just guessing at it; you'll have to look at the real version and translate it yourself. But your code is going to look something like this:
from ctypes import windll, c_int, c_size_t, WINFUNCTYPE
from ctypes.wintypes import LPCWSTR, HANDLE

UINT_PTR = c_size_t   # ctypes.wintypes has no UINT_PTR; c_size_t has the same width

# assuming int (*OF_CALLBACK)(HANDLE, int, LPCWSTR, UINT_PTR)
OF_CALLBACK = WINFUNCTYPE(c_int, HANDLE, c_int, LPCWSTR, UINT_PTR)

def my_callback(handle, namelen, name, context):
    # do whatever you want with each handle, and whatever other args you get
    return 1   # guessing that nonzero means "keep enumerating"

my_of_callback = OF_CALLBACK(my_callback)

OpenFileFinder = windll.OpenFileFinder
# Might be GetOpenedFilesW, following the usual A/W convention, as explained in the ctypes docs
OpenFileFinder.GetOpenedFiles.argtypes = (LPCWSTR, c_int, OF_CALLBACK, UINT_PTR)
OpenFileFinder.GetOpenedFiles.restype = None
OpenFileFinder.GetOpenedFiles(u'C:\\Path\\To\\File\\To\\Check.txt', 0, my_of_callback, 0)
It's quite possible that what you get in the callback is actually a pointer to some kind of structure, or a list of structures, that you're going to have to define yourself; it's likely the same SYSTEM_HANDLE you'd need for calling the Nt/Zw functions directly, so let me show that as an example. If you get back a SYSTEM_HANDLE *, not a HANDLE, in the callback, it's as simple as this:
from ctypes import POINTER, Structure, WINFUNCTYPE, c_int, c_size_t, c_void_p
from ctypes.wintypes import BYTE, DWORD, WORD, LPCWSTR

PVOID = c_void_p      # ctypes.wintypes has no PVOID
UINT_PTR = c_size_t

class SYSTEM_HANDLE(Structure):
    _fields_ = [('dwProcessId', DWORD),
                ('bObjectType', BYTE),
                ('bFlags', BYTE),
                ('wValue', WORD),
                ('pAddress', PVOID),
                ('GrantedAccess', DWORD)]

# assuming int (*OF_CALLBACK)(SYSTEM_HANDLE *, int, LPCWSTR, UINT_PTR)
OF_CALLBACK = WINFUNCTYPE(c_int, POINTER(SYSTEM_HANDLE), c_int, LPCWSTR, UINT_PTR)

You could use a try/except block, as in check if a file is open in Python:
try:
    f = open("file.txt", "r")
except IOError:
    print "This file is already in use"

Related

Unix tools to parse a file on the command line

I have a python script that looks like the following that I want to transform:
import sys
# more imports

''' some comments '''

class Foo:
    def _helper1():
        etc.
    def _helper2():
        etc.
    def foo1():
        d = { a:3, b:2, c:4 }
        etc.
    def foo2():
        d = { a:2, b:2, c:7 }
        etc.
    def foo3():
        d = { a:3, b:2, c:7 }
        etc.
    etc.

if __name__ == "__main__":
    etc.
I'd like to be able to parse JUST the foo*() functions and keep just the ones that have certain attributes, like d={a:3, b:2}. Obviously keep everything else that is non-foo*() so the transformation will still run. The foo*() functions will be well defined, though d may have different keys and values.
Is there some set of unix tools I can use to do this through chaining? I can use grep to identify foo but how would I scan the next couple of lines to apply the keep or reject portion of my logic?
Edit: note, I'm trying to see if it's reasonable to do this with command-line tools before writing a custom parser. I know how to write the parser.
You haven't specified your problem with enough detail to recommend a particular solution, but there are many tools and techniques that will handle this type of problem.
As I understand this, you want to
Identify the boundaries of your class
Identify the methods within the class
Remove the methods lacking certain textual features
My general approach to this would be a script with logic based on "open old and new files; write everything you read from the old file, unless it's one of the foo*() methods you've decided to drop."
You can blithely write things until you get into the class (one flag) and start finding methods (another flag). The one slightly tricky part here is the buffering: you need to keep the text of each method until you know whether it contains the target text. You can either read in the entire method (minor parsing task) and search that for the target, or simply hold lines of text until you find the target (then return to your write-it-all mode) or run off the end (empty the buffer without writing).
This is simple enough that you could cobble together a script in any handy language to handle the problem. UNIX provides a variety of tools; in that paradigm I'd use awk. However, I recommend a read-friendly tool, such as Python or Perl (a rough Python sketch follows below). If you want to move formally into the world of parsing, I suggest a trivial Lex-YACC couplet: you can have very simple tokens (perhaps even complete lines, depending on your coding style) and actions (write line, hold line, set status flag, flush buffer, etc.).
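Here's that hold-the-lines-until-you-decide logic as a rough Python sketch. The attribute test is just a regex over the buffered text and the method boundary is inferred from indentation, so adjust both to your real code:
#!/usr/bin/env python
# Rough sketch of the buffer-until-decided approach: copy every line straight
# through, but hold each foo*() method until we know whether to keep it.
import re
import sys

TARGET = re.compile(r'a:\s*3,\s*b:\s*2')        # the attributes worth keeping
FOO_DEF = re.compile(r'^(\s*)def foo\w*\(')      # start of a foo*() method

def filter_lines(lines):
    buf = []        # held lines of the current foo*() method
    indent = ''     # indentation of its "def" line
    for line in lines:
        m = FOO_DEF.match(line)
        if buf:
            # blank lines and deeper-indented lines are still inside the method
            inside = (not line.strip()) or line.startswith(indent + '    ')
            if inside and not m:
                buf.append(line)
                continue
            # the method ended: emit it only if the target text appeared
            if any(TARGET.search(held) for held in buf):
                for held in buf:
                    yield held
            buf = []
        if m:
            buf, indent = [line], m.group(1)
        else:
            yield line
    if buf and any(TARGET.search(held) for held in buf):
        for held in buf:
            yield held

if __name__ == '__main__':
    sys.stdout.writelines(filter_lines(sys.stdin))
Run it as a filter, e.g. python filterfoo.py < old.py > new.py.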
Is that enough to get you moving?

Deciphering large program flow in Python

I'm in the process of learning how a large (356-file), convoluted Python program is set up. Besides manually reading through and parsing the code, are there any good methods for following program flow?
There are two methods which I think would be useful:
Something similar to Bash's "set -x"
Something that displays which file outputs each line of output
Are there any methods to do the above, or any other ways that you have found useful?
I don't know if this is actually a good idea, but since I once wrote a hook to display the file and line before each line of output to stdout, I might as well give it to you…
import inspect, sys

class WrapStdout(object):
    _stdout = sys.stdout
    def write(self, buf):
        frame = sys._getframe(1)
        try:
            f = inspect.getsourcefile(frame)
        except TypeError:
            f = 'unknown'
        l = frame.f_lineno
        self._stdout.write('{}:{}:{}'.format(f, l, buf))
    def flush(self):
        self._stdout.flush()

sys.stdout = WrapStdout()
Just save that as a module, and after you import it, every chunk of stdout will be prefixed with file and line number.
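For example, if you save the hook above as wrapstdout.py (a filename I'm making up), using it is just:
import wrapstdout    # importing it installs the hook as a side effect

print('hello')       # comes out roughly as: /path/to/yourscript.py:3:hello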
Of course this will get pretty ugly if:
Anyone tries to print partial lines (using stdout.write directly, print with a trailing comma in 2.x, or end='' in 3.x).
You mix Unicode and non-Unicode in 2.x.
Any of the source files have long pathnames.
etc.
But all the tricky deep-Python-magic bits are there; you can build on top of it pretty easily.
It could be very tedious, but using a debugger to trace the flow of execution, instruction by instruction, could probably help you to some extent.
import pdb
pdb.set_trace()
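You can also start the whole program under the debugger instead of editing the source, either with python -m pdb yourprogram.py from the shell, or from Python; the module and entry-point names below are just placeholders:
import pdb

import yourmainmodule   # placeholder for the program's real entry-point module

# runs the statement under pdb, stopping before the first line so you can step with n/s/c
pdb.run('yourmainmodule.main()')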
You could look for a cross-reference program. There is an old program called pyxr that does this. The aim of a cross-reference is to let you know how classes refer to each other. Some IDEs also do this sort of thing.
I'd recommend running the program inside an IDE like pydev or pycharm. Being able to stop the program and inspect its state can be very helpful.

Entering console inputs from within a Python file

In my Python file, I have made a GUI widget that takes some inputs from the user. I have also imported a module that takes some input using raw_input(). I have to use this module as it is; I have no right to change it. When I run my Python file, it asks me for the inputs (due to the raw_input() of the imported module). I want to use the GUI widget inputs in its place.
How can I pass the user input (taken from the widget) to the raw_input() of the imported module?
First, if importing it directly into your script isn't actually a requirement (and it's hard to imagine why it would be), you can just run the module (or a simple script wrapped around it) as a separate process, using subprocess or pexpect.
Let's make this concrete. Say you want to use this silly module foo.py:
def bar():
    x = raw_input("Gimme a string")
    y = raw_input("Gimme another")
    return 'Got two strings: {}, {}'.format(x, y)
First write a trivial foo_wrapper.py:
import foo
print(foo.bar())
Now, instead of calling foo.bar() directly in your real script, run foo_wrapper as a child process.
I'm going to assume that you already have the input you want to send it in a string, because that makes the irrelevant parts of the answer simpler (in fact, it makes them possible—if you wanted to use some GUI code for that, there's really no way I could show you how unless you first tell us which GUI library you're using).
So:
import subprocess
import sys

foo_input = 'String 1\nString 2\n'
p = subprocess.Popen([sys.executable, 'foo_wrapper.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
foo_output, _ = p.communicate(foo_input)
Of course in real life you'll want to use an appropriate path for foo_wrapper.py instead of assuming that it's in the current working directory, but this should be enough to illustrate the idea.
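For instance, one way to build that path (a sketch that assumes foo_wrapper.py sits next to the calling script):
import os

# locate foo_wrapper.py relative to this script rather than the current working directory
wrapper_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'foo_wrapper.py')
Then pass wrapper_path to Popen instead of the bare filename.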
Meanwhile, if "I have no right to change it" just means "I don't (and shouldn't) have checkin rights to the foo project's github site or the relevant subtree on our company's P4 server" or whatever, there's a really easy answer: Fork it, and change the fork.
Even if it's got a weak copyleft license like LGPL: fork it, change the fork, publish your fork under the same license as the original, then use your fork.
If you're depending on the foo package being installed on every target system, and can't depend on your replacement foo being installed instead, that's a bit more of a problem. But if the function or method that actually calls raw_input is just a small fraction of the actual code in foo, you can fix that by monkeypatching foo at runtime.
And that leads to the last-ditch possibility: You can always monkeypatch raw_input itself.
Again, I'm going to assume that you already have the input you need to give it to make things simpler.
So, first you write a replacement function:
foo_input = ['String 1\n', 'String 2\n']

def fake_raw_input(prompt):
    return foo_input.pop(0)   # hand back the canned inputs in order
Now, there are two ways you can patch this in. Usually, you want to do this:
import foo
foo.raw_input = fake_raw_input
This means any code in foo that calls raw_input will see the function you crammed into its module globals instead of the normal builtin. Unless it does something really funky (like looking up the builtin directly and copying it to a local variable or something), this is the answer.
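Putting those pieces together with the toy foo module from above (just a sketch):
import foo

foo.raw_input = fake_raw_input   # patch after importing foo, before anything calls bar()
print(foo.bar())                 # bar() now consumes foo_input instead of prompting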
If you need to handle one of those really funky edge cases, and you don't mind doing something questionable, you can do this:
import __builtin__
__builtin__.raw_input = fake_raw_input
You must do this before the first import foo anywhere in your program. Also, it's not clear whether this is intentionally guaranteed to work, accidentally guaranteed to work (and should be fixed in the future), or not guaranteed to work. But it does work (at least for CPython 2.5-2.7, which is what you're probably using).

Python: Windows System File

In Python, how can I identify a file that is a "Windows system file"? From the command line I can do this with the following command:
ATTRIB "c:\file_path_name.txt"
If the output has the "S" character, then it's a Windows system file. I cannot figure out the equivalent in Python. A few examples of similar queries look like this:
Is a file writeable?
import os

filePath = r'c:\testfile.txt'
if os.access(filePath, os.W_OK):
    print 'writable'
else:
    print 'not writable'
another way...
import os
import stat

filePath = r'c:\testfile.txt'
attr = os.stat(filePath)[0]
if not attr & stat.S_IWRITE:
    print 'not writable'
else:
    print 'writable'
But I can't find a function or enum to identify a windows system file. Hopefully there's a built in way to do this. I'd prefer not to have to use win32com or another external module.
The reason I want to do this is because I am using os.walk to copy files from one drive to another. If there was a way to walk the directory tree while ignoring system files that may work too.
Thanks for reading.
Here are the solutions I came up with based on the answer:
Using win32api:
import win32api
import win32con

filePath = r'c:\test_file_path.txt'
if not win32api.GetFileAttributes(filePath) & win32con.FILE_ATTRIBUTE_SYSTEM:
    print filePath, 'is not a windows system file'
else:
    print filePath, 'is a windows system file'
and using ctypes:
import ctypes
import ctypes.wintypes as types

# From pywin32
FILE_ATTRIBUTE_SYSTEM = 0x4

kernel32dll = ctypes.windll.kernel32

class WIN32_FILE_ATTRIBUTE_DATA(ctypes.Structure):
    _fields_ = [("dwFileAttributes", types.DWORD),
                ("ftCreationTime", types.FILETIME),
                ("ftLastAccessTime", types.FILETIME),
                ("ftLastWriteTime", types.FILETIME),
                ("nFileSizeHigh", types.DWORD),
                ("nFileSizeLow", types.DWORD)]

def isWindowsSystemFile(pFilepath):
    GetFileExInfoStandard = 0
    GetFileAttributesEx = kernel32dll.GetFileAttributesExA
    GetFileAttributesEx.restype = ctypes.c_int
    # I can't figure out the correct args here
    #GetFileAttributesEx.argtypes = [ctypes.c_char, ctypes.c_int, WIN32_FILE_ATTRIBUTE_DATA]
    wfad = WIN32_FILE_ATTRIBUTE_DATA()
    GetFileAttributesEx(pFilepath, GetFileExInfoStandard, ctypes.byref(wfad))
    return wfad.dwFileAttributes & FILE_ATTRIBUTE_SYSTEM

filePath = r'c:\test_file_path.txt'
if not isWindowsSystemFile(filePath):
    print filePath, 'is not a windows system file'
else:
    print filePath, 'is a windows system file'
I wonder if pasting the constant "FILE_ATTRIBUTE_SYSTEM" into my code is legit, or whether I can get its value using ctypes as well?
But I can't find a function or enum to identify a windows system file. Hopefully there's a built in way to do this.
There is no such thing. Python's file abstraction doesn't have any notion of "system file", so it doesn't give you any way to get it. Also, Python's stat is a very thin wrapper around the stat or _stat functions in Microsoft's C runtime library, which doesn't have any notion of "system file" either. The reason is that both Python's file abstraction and Microsoft's C library are designed to be "pretty much like POSIX".
Of course Windows also has a completely different abstraction for files. But this one isn't exposed by the open, stat, etc. functions; rather, there's a completely parallel set of functions like CreateFile, GetFileAttributes, etc. And you have to call those if you want that information.
I'd prefer not to have to use win32com or another external module.
Well, you don't need win32com, because this is just Windows API, not COM.
But win32api is the easiest way to do it. It provides a nice wrapper around GetFileAttributesEx, which is the function you want to call.
If you don't want to use an external module, you can always call Windows API functions via ctypes instead. Or use subprocess to run command-line tools (like ATTRIB—or, if you prefer, like DIR /S /A-S to let Windows do the recursive-walk-skipping-system-files bit for you…).
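For the subprocess route, here's a minimal sketch of the ATTRIB approach. The parsing assumes ATTRIB's default English output, where the attribute letters appear in the columns before the pathname, so treat it as illustrative only:
import subprocess

def is_system_file(path):
    # e.g. ATTRIB prints something like "A  SH       C:\pagefile.sys"
    output = subprocess.check_output(['attrib', path], universal_newlines=True)
    idx = output.lower().find(path.lower())
    flags = output[:idx] if idx != -1 else ''
    return 'S' in flags

print(is_system_file(r'c:\testfile.txt'))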
The ctypes docs show how to call Windows API functions, but it's a little tricky the first time.
First you need to go to the MSDN page to find out which DLL you need to load (kernel32), whether your function has separate A and W variants (it does), and what values to pass for any constants (you have to follow a link to another page, and know how C enums work, to find out that GetFileExInfoStandard is 0), and then you need to figure out how to define any structs necessary. In this case, something like this:
from ctypes import *

kernel = windll.kernel32
GetFileExInfoStandard = 0
GetFileAttributesEx = kernel.GetFileAttributesExW   # the W (Unicode) variant
GetFileAttributesEx.restype = c_int
GetFileAttributesEx.argtypes = None   # ... fill these in yourself
If you really want to avoid using win32api, you can do the work to finish the ctypes wrapper yourself. Personally, I'd use win32api.
Meanwhile:
The reason I want to do this is because I am using os.walk to copy files from one drive to another. If there was a way to walk the directory tree while ignoring system files that may work too.
For that case, especially given your complaint that checking each file was too slow, you probably don't want to use os.walk either. Instead, use FindFirstFileEx, and do the recursion manually. You can distinguish files and directories without having to stat (or GetFileAttributesEx) each file (which os.walk does under the covers), you can filter out system files directly inside the find function instead of having to stat each file, etc.
Again, the options are the same: use win32api if you want it to be easy, use ctypes otherwise.
But in this case, I'd take a look at Ben Hoyt's betterwalk, because he's already done 99% of the ctypes-wrapping, and 95% of the rest of the code, that you want.

How to read an LV2 .ttl file in Python?

I have an LV2 plugin and I want to use Python to extract its metadata - plugin name, description, list of control and audio ports and specification of each port.
With LADSPA the instructions were pretty clear, although a bit difficult to implement in Python: I just needed to call the ladspa_descriptor() function. Now with LV2 there's a .ttl file, simpler to access but more complicated to parse.
Is there any python library that will make this job simple?
The LV2 documentation generation tools use RDFLib. It is probably the most popular RDF interface for Python, though it does much more than just parse Turtle. It is a good choice if performance is not an issue, but it is unfortunately quite slow.
If you need to actually instantiate and use plugins, you probably want to use an existing LV2 implementation. As Steve mentioned, Lilv is for this. It is not limited to any static default location, but will look in all the locations in LV2_PATH. You can set this environment variable to whatever you want before calling Lilv and it will only look in those locations. Alternatively, if you want to specifically load just one bundle at a time, there is a function for that: lilv_world_load_bundle().
There are SWIG-based Python bindings included with Lilv, but they stop short of actually allowing you to process data. However there is a project to wrap Lilv that allows processing of audio using scipy arrays: http://pyslv2.sourceforge.net/ (despite the name they are indeed Lilv bindings and not bindings for its predecessor SLV2)
That said, if you only need to get static information from the Turtle files, involving C libraries is probably more trouble than it is worth. One of the big advantages of using standard data files is ease of use with existing tools. To get the number of ports on a plugin, you simply need to count the number of triples that match the pattern (plugin, lv2:port, *). Here is an example Python script that prints the number of ports of a plugin, given the file to read and the plugin URI as command line arguments:
#!/usr/bin/env python
import rdflib
import sys
lv2 = rdflib.Namespace('http://lv2plug.in/ns/lv2core#')
path = sys.argv[1]
plugin = rdflib.URIRef(sys.argv[2])
model = rdflib.ConjunctiveGraph()
model.parse(path, format='n3')
num_ports = 0
for i in model.triples((plugin, lv2.port, None)):
    num_ports += 1
print('%s has %u ports' % (plugin, num_ports))
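To try it out, you would run the script with the bundle's .ttl file and the plugin URI as arguments, e.g. python count_ports.py amp.ttl http://example.org/amp (both names here are just placeholders).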
This is how to get the number of ports each plugin supports:
import lilv

w = lilv.World()
w.load_all()
for p in w.get_all_plugins():
    print p.get_name().as_string(), p.get_num_ports()
At least, this is all I got while trying to figure this out.
