On Windows, I want to copy a bunch of files over a network with Python. Sometimes the network stops responding and the copy stalls. I want to detect when that happens and skip the file in question. By asking a related question, I found out about the CopyFileEx function, which accepts a callback function that can abort the file copy.
The implementation in Python looks like this:
import win32file

def Win32_CopyFileEx( ExistingFileName, NewFileName, Canc = False):
    win32file.CopyFileEx(
        ExistingFileName, # PyUNICODE | File to be copied
        NewFileName, # PyUNICODE | Place to which it will be copied
        Win32_CopyFileEx_ProgressRoutine, # CopyProgressRoutine | A python function that receives progress updates, can be None
        Data = None, # object | An arbitrary object to be passed to the callback function
        Cancel = Canc, # boolean | Pass True to cancel a restartable copy that was previously interrupted
        CopyFlags = win32file.COPY_FILE_RESTARTABLE, # int | Combination of COPY_FILE_* flags
        Transaction = None # PyHANDLE | Handle to a transaction as returned by win32transaction::CreateTransaction
    )
From the documentation of the CopyFileEx function, I can see two possibilities of cancelation of a running copy.
pbCancel [in, optional] If this flag is set to TRUE during the copy operation, the operation is canceled. Otherwise, the copy
operation will continue to completion.
I could not figure out how to do that. I tried calling the same function again with the same file names but with the cancel flag set to TRUE, but that leads to an error because the file in question is in use by another process.
Another possibility seems to be the callback function:
lpProgressRoutine [in, optional] The address of a callback function of
type LPPROGRESS_ROUTINE that is called each time another portion of
the file has been copied. This parameter can be NULL. For more
information on the progress callback function, see the
CopyProgressRoutine function.
The documentation of this progress routine states that the callback is called either when the copy starts or when a chunk of the file has finished copying. The callback can cancel the copy by returning 1 or 2 (cancel or stop). However, the callback does not seem to be called while the copy of a chunk is stalled.
So my question is: how can I cancel this copy on a per-file basis when it is stalled?
win32file.CopyFileEx doesn't allow passing Cancel as anything but a boolean or integer value. In the API it's an LPBOOL pointer, which allows the caller to set its value concurrently in another thread. You'll have to use ctypes, Cython, or a C extension to get this level of control. Below I've written an example using ctypes.
If canceling the copy doesn't work because the thread is blocked on synchronous I/O, you can try calling CancelIoEx on the file handles that you're passed in the progress routine, or CancelSynchronousIo to cancel all synchronous I/O for the thread. These I/O cancel functions were added in Windows Vista. They're not available in Windows XP, in case you're still supporting it.
import ctypes
from ctypes import wintypes
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
COPY_FILE_FAIL_IF_EXISTS = 0x0001
COPY_FILE_RESTARTABLE = 0x0002
COPY_FILE_OPEN_SOURCE_FOR_WRITE = 0x0004
COPY_FILE_ALLOW_DECRYPTED_DESTINATION = 0x0008
COPY_FILE_COPY_SYMLINK = 0x0800
COPY_FILE_NO_BUFFERING = 0x1000
CALLBACK_CHUNK_FINISHED = 0
CALLBACK_STREAM_SWITCH = 1
PROGRESS_CONTINUE = 0
PROGRESS_CANCEL = 1
PROGRESS_STOP = 2
PROGRESS_QUIET = 3
ERROR_REQUEST_ABORTED = 0x04D3
if not hasattr(wintypes, 'LPBOOL'):
    wintypes.LPBOOL = ctypes.POINTER(wintypes.BOOL)

def _check_bool(result, func, args):
    if not result:
        raise ctypes.WinError(ctypes.get_last_error())
    return args
LPPROGRESS_ROUTINE = ctypes.WINFUNCTYPE(
wintypes.DWORD, # _Retval_
wintypes.LARGE_INTEGER, # _In_ TotalFileSize
wintypes.LARGE_INTEGER, # _In_ TotalBytesTransferred
wintypes.LARGE_INTEGER, # _In_ StreamSize
wintypes.LARGE_INTEGER, # _In_ StreamBytesTransferred
wintypes.DWORD, # _In_ dwStreamNumber
wintypes.DWORD, # _In_ dwCallbackReason
wintypes.HANDLE, # _In_ hSourceFile
wintypes.HANDLE, # _In_ hDestinationFile
wintypes.LPVOID) # _In_opt_ lpData
kernel32.CopyFileExW.errcheck = _check_bool
kernel32.CopyFileExW.argtypes = (
wintypes.LPCWSTR, # _In_ lpExistingFileName
wintypes.LPCWSTR, # _In_ lpNewFileName
LPPROGRESS_ROUTINE, # _In_opt_ lpProgressRoutine
wintypes.LPVOID, # _In_opt_ lpData
wintypes.LPBOOL, # _In_opt_ pbCancel
wintypes.DWORD) # _In_ dwCopyFlags
@LPPROGRESS_ROUTINE
def debug_progress(tsize, ttrnsfr, stsize, sttrnsfr, stnum, reason,
                   hsrc, hdst, data):
    print('ttrnsfr: %d, stnum: %d, stsize: %d, sttrnsfr: %d, reason: %d' %
          (ttrnsfr, stnum, stsize, sttrnsfr, reason))
    return PROGRESS_CONTINUE

def copy_file(src, dst, cancel=None, flags=0,
              cbprogress=None, data=None):
    # accept a plain int/bool or a wintypes.BOOL that another thread can set
    if isinstance(cancel, int):
        cancel = ctypes.byref(wintypes.BOOL(cancel))
    elif cancel is not None:
        cancel = ctypes.byref(cancel)
    if cbprogress is None:
        cbprogress = LPPROGRESS_ROUTINE()
    kernel32.CopyFileExW(src, dst, cbprogress, data, cancel, flags)
Example
if __name__ == '__main__':
    import os
    import tempfile
    import threading
    src_fd, src = tempfile.mkstemp()
    os.write(src_fd, os.urandom(16 * 2 ** 20))
    os.close(src_fd)
    dst = tempfile.mktemp()
    cancel = wintypes.BOOL(False)
    t = threading.Timer(0.001, type(cancel).value.__set__, (cancel, True))
    t.start()
    try:
        copy_file(src, dst, cancel, cbprogress=debug_progress)
    except OSError as e:
        print(e)
        assert e.winerror == ERROR_REQUEST_ABORTED
    finally:
        if os.path.exists(src):
            os.remove(src)
        if os.path.exists(dst):
            os.remove(dst)
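The example above sets the cancel flag from a fixed timer. For the stalled-copy case in the question, one possible approach is to set the flag only when the progress routine has not reported anything for a while. The following is a minimal sketch of that idea in plain Python; `StallWatchdog`, its methods, and the injectable `clock` parameter are hypothetical names introduced here, not part of the Windows API or of the code above.

```python
import threading
import time


class StallWatchdog:
    """Set a cancel flag when no progress has been reported for `timeout` seconds.

    Hypothetical helper: `cancel_flag` is any object with a writable `.value`
    attribute, such as the wintypes.BOOL passed to copy_file() above.
    """

    def __init__(self, cancel_flag, timeout, clock=time.monotonic):
        self._cancel_flag = cancel_flag
        self._timeout = timeout
        self._clock = clock
        self._last_progress = clock()
        self._lock = threading.Lock()

    def touch(self):
        """Call from the progress routine each time a chunk finishes."""
        with self._lock:
            self._last_progress = self._clock()

    def check(self):
        """Return True (and set the cancel flag) if the copy looks stalled."""
        with self._lock:
            stalled = self._clock() - self._last_progress > self._timeout
        if stalled:
            self._cancel_flag.value = True
        return stalled

    def run(self, done, interval=0.5):
        """Poll until `done` (a threading.Event) is set or a stall is seen."""
        while not done.wait(interval):
            if self.check():
                break
```

The progress routine would call `touch()` before returning PROGRESS_CONTINUE, and `run()` would execute in a separate thread for the duration of each copy. If the copying thread is blocked inside synchronous I/O, the flag may only be noticed once that I/O returns, which is where the I/O cancel functions mentioned earlier come in.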
I was trying to craft a response to a question about streaming audio from an HTTP server and then playing it with PyGame. I had the code mostly complete, but hit an error where the PyGame music functions tried to seek() on the urllib.HTTPResponse object.
According to the urllib docs, the urllib.HTTPResponse object (since v3.5) is an io.BufferedIOBase. I expected this would make the stream seek()able; however, it does not.
Is there a way to wrap the io.BufferedIOBase such that it is smart enough to buffer enough data to handle the seek operation?
import pygame
import urllib.request
import io

# Window size
WINDOW_WIDTH = 400
WINDOW_HEIGHT = 400
# background colour
SKY_BLUE = (161, 255, 254)

### Begin the streaming of a file
### Return the urlib.HTTPResponse, a file-like-object
def openURL( url ):
    result = None
    try:
        http_response = urllib.request.urlopen( url )
        print( "streamHTTP() - Fetching URL [%s]" % ( http_response.geturl() ) )
        print( "streamHTTP() - Response Status [%d] / [%s]" % ( http_response.status, http_response.reason ) )
        result = http_response
    except:
        print( "streamHTTP() - Error Fetching URL [%s]" % ( url ) )
    return result

### MAIN
pygame.init()
window = pygame.display.set_mode( ( WINDOW_WIDTH, WINDOW_HEIGHT ) )
pygame.display.set_caption("Music Streamer")
clock = pygame.time.Clock()
done = False
while not done:
    # Handle user-input
    for event in pygame.event.get():
        if ( event.type == pygame.QUIT ):
            done = True
    # Keys
    keys = pygame.key.get_pressed()
    if ( keys[pygame.K_UP] ):
        if ( pygame.mixer.music.get_busy() ):
            print("busy")
        else:
            print("play")
            remote_music = openURL( 'http://127.0.0.1/example.wav' )
            if ( remote_music != None and remote_music.status == 200 ):
                pygame.mixer.music.load( io.BufferedReader( remote_music ) )
                pygame.mixer.music.play()
    # Re-draw the screen
    window.fill( SKY_BLUE )
    # Update the window, but not more than 60fps
    pygame.display.flip()
    clock.tick_busy_loop( 60 )
pygame.quit()
When this code runs, and Up is pushed, it fails with the error:
streamHTTP() - Fetching URL [http://127.0.0.1/example.wav]
streamHTTP() - Response Status [200] / [OK]
io.UnsupportedOperation: seek
io.UnsupportedOperation: File or stream is not seekable.
io.UnsupportedOperation: seek
io.UnsupportedOperation: File or stream is not seekable.
Traceback (most recent call last):
File "./sound_stream.py", line 57, in <module>
pygame.mixer.music.load( io.BufferedReader( remote_music ) )
pygame.error: Unknown WAVE format
I also tried re-opening the io stream, and various other re-implementations of the same sort of thing.
Seeking seeking
According to the urlib docs, the urllib.HTTPResponse object (since v3.5) is an io.BufferedIOBase. I expected this would make the stream seek()able, however it does not.
That's correct. The io.BufferedIOBase interface doesn't guarantee the I/O object is seekable. For HTTPResponse objects, IOBase.seekable() returns False:
>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org/get")
>>> response
<http.client.HTTPResponse object at 0x110870ca0>
>>> response.seekable()
False
That's because the BufferedIOBase implementation offered by HTTPResponse is wrapping a socket object, and sockets are not seekable either.
You can't wrap a BufferedIOBase object in a BufferedReader object and add seeking support. The Buffered* wrapper objects can only wrap RawIOBase types, and they rely on the wrapped object to provide seeking support. You would have to emulate seeking at the raw I/O level, see below.
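That the wrapper relies on the wrapped object for seeking can be checked without any network access. This is a small sketch using a made-up `ForwardOnlyRaw` class (a hypothetical stand-in for a socket-backed stream, not anything from urllib) that is readable but leaves `seekable()` at its default of False:

```python
import io


class ForwardOnlyRaw(io.RawIOBase):
    """Hypothetical stand-in for a socket-backed stream: readable, not seekable."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self._pos = 0

    def readable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        # copy the next slice of the payload into the caller's buffer
        chunk = self._payload[self._pos:self._pos + len(buffer)]
        buffer[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


buffered = io.BufferedReader(ForwardOnlyRaw(b"RIFF....WAVE"))
print(buffered.seekable())  # the wrapper reports what the raw object reports
try:
    buffered.seek(0)
except io.UnsupportedOperation as exc:
    print("seek failed:", exc)
```

Wrapping adds buffering for reads, but the seek request is still refused because the raw object cannot honour it.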
You can still provide the same functionality at a higher level, but take into account that seeking on remote data is a lot more involved; this isn't a simple "change an OS variable that represents a file position on disk" operation. For larger remote file data, seeking without backing the whole file on disk locally could be as sophisticated as using HTTP range requests and local (in-memory or on-disk) buffers to balance sound play-back performance against local data storage. Doing this correctly for a wide range of use-cases can be a lot of effort, so it is certainly not part of the Python standard library.
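To make the HTTP range request idea concrete, here is a minimal sketch of how such a request could be built with urllib. The helper name `range_request` is an assumption for illustration, and the URL is the placeholder from the question; nothing is sent over the network by this code.

```python
from typing import Optional
from urllib.request import Request


def range_request(url: str, start: int, end: Optional[int] = None) -> Request:
    """Build a request for bytes start..end (inclusive); open-ended if end is None."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    last = "" if end is None else str(end)
    request = Request(url)
    request.add_header("Range", f"bytes={start}-{last}")
    return request


# Placeholder URL from the question; nothing is sent until urlopen() is called.
req = range_request("http://127.0.0.1/example.wav", 4096, 8191)
print(req.get_header("Range"))
```

A server that honours the header answers with status 206 and a Content-Range header describing what it sent; a server that ignores it answers with status 200 and the whole body, so a caller has to check which one it got.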
If your sound files are small
If your HTTP-sourced sound files are small enough (a few MB at most) then just read the whole response into an in-memory io.BytesIO() file object. I really do not think it is worth making this more complicated than that, because the moment you have enough data to make that worth pursuing your files are large enough to take up too much memory!
So this would be more than enough if your sound files are smaller (no more than a few MB):
from io import BytesIO
import urllib.error
import urllib.request

def open_url(url):
    try:
        http_response = urllib.request.urlopen(url)
        print(f"streamHTTP() - Fetching URL [{http_response.geturl()}]")
        print(f"streamHTTP() - Response Status [{http_response.status}] / [{http_response.reason}]")
    except urllib.error.URLError:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return
    if http_response.status != 200:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return
    # read the whole body into memory; BytesIO is fully seekable
    return BytesIO(http_response.read())
This doesn't require writing a wrapper object, and because BytesIO is a native implementation, once the data is fully copied over, access to the data is faster than any Python-code wrapper could ever give you.
Note that this returns a BytesIO file object, so you no longer need to test for the response status:
remote_music = open_url('http://127.0.0.1/example.wav')
if remote_music is not None:
    pygame.mixer.music.load(remote_music)
    pygame.mixer.music.play()
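The reason the BytesIO approach works is that the in-memory object supports the seeks a loader performs while probing the file format. The sketch below illustrates this with the standard library `wave` module standing in for the mixer's format detection (an assumption made so the example runs without PyGame or a server); `make_silent_wav` is a hypothetical helper that generates test data.

```python
import io
import wave


def make_silent_wav(frames: int = 800, rate: int = 8000) -> bytes:
    """Return the bytes of a tiny mono 16-bit WAV file (illustrative test data)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * frames)
    return buffer.getvalue()


# Stands in for BytesIO(http_response.read()) from open_url() above.
payload = io.BytesIO(make_silent_wav())
with wave.open(payload, "rb") as reader:
    info = (reader.getnchannels(), reader.getsampwidth(), reader.getnframes())
payload.seek(0)  # rewind, as a loader that probed the header would
print(info, payload.seekable())
```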
If they are more than a few MB
Once you go beyond a few megabytes, you could try pre-loading the data into a local file object. You can make this more sophisticated by using a thread to have shutil.copyfileobj() copy most of the data into that file in the background and give the file to PyGame after loading just an initial amount of data.
By using an actual file object, you can actually help performance here, as PyGame will try to minimize interjecting itself between the SDL mixer and the file data. If there is an actual file on disk with a file number (the OS-level identifier for a stream, something that the SDL mixer library can make use of), then PyGame will operate directly on that and so minimize blocking the GIL (which in turn will help the Python portions of your game perform better!). And if you pass in a filename (just a string), then PyGame gets out of the way entirely and leaves all file operations over to the SDL library.
Here's such an implementation; this should, on normal Python interpreter exit, clean up the downloaded files automatically. It returns a filename for PyGame to work on, and finalizing downloading the data is done in a thread after the initial few KB has been buffered. It will avoid loading the same URL more than once, and I've made it thread-safe:
import shutil
import urllib.error
import urllib.request
from tempfile import NamedTemporaryFile
from threading import Lock, Thread

INITIAL_BUFFER = 1024 * 8  # 8kb initial file read to start URL-backed files
_url_files_lock = Lock()
# stores open NamedTemporaryFile objects, keeping them 'alive'
# removing entries from here causes the file data to be deleted.
_url_files = {}

def open_url(url):
    with _url_files_lock:
        if url in _url_files:
            return _url_files[url].name
    try:
        http_response = urllib.request.urlopen(url)
        print(f"streamHTTP() - Fetching URL [{http_response.geturl()}]")
        print(f"streamHTTP() - Response Status [{http_response.status}] / [{http_response.reason}]")
    except urllib.error.URLError:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return
    if http_response.status != 200:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return
    fileobj = NamedTemporaryFile()
    content_length = http_response.getheader("Content-Length")
    if content_length is not None:
        try:
            content_length = int(content_length)
        except ValueError:
            content_length = None
        if content_length:
            # create sparse file of full length
            fileobj.seek(content_length - 1)
            fileobj.write(b"\0")
            fileobj.seek(0)
    fileobj.write(http_response.read(INITIAL_BUFFER))
    with _url_files_lock:
        if url in _url_files:
            # another thread raced us to this point, we lost, return their
            # result after cleaning up here
            fileobj.close()
            http_response.close()
            return _url_files[url].name
        # store the file object for this URL; this keeps the file
        # open and so readable if you have the filename.
        _url_files[url] = fileobj

    def copy_response_remainder():
        # copies file data from response to disk, for all data past INITIAL_BUFFER
        with http_response:
            shutil.copyfileobj(http_response, fileobj)

    t = Thread(daemon=True, target=copy_response_remainder)
    t.start()
    return fileobj.name
Like the BytesIO() solution, the above returns either None or a value ready to pass to pygame.mixer.music.load().
The above will probably not work if you try to immediately set an advanced playing position in your sound files, as later data may not yet have been copied into the file. It's a trade-off.
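If an advanced starting position matters, one way to soften that trade-off is to let the reader wait until the background copy has reached the offset it needs. The following is only a sketch of the bookkeeping involved; `DownloadProgress` and its methods are hypothetical names, and wiring it in would mean replacing the single `shutil.copyfileobj()` call with a loop that writes chunks and reports each one.

```python
import threading
from typing import Optional


class DownloadProgress:
    """Track how many bytes have reached disk so a reader can wait for an offset.

    Hypothetical helper, not part of the code above: the copying thread calls
    advance() after each write, and finish() when the response is exhausted.
    """

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._written = 0
        self._finished = False

    def advance(self, count: int) -> None:
        with self._cond:
            self._written += count
            self._cond.notify_all()

    def finish(self) -> None:
        with self._cond:
            self._finished = True
            self._cond.notify_all()

    def wait_for(self, offset: int, timeout: Optional[float] = None) -> bool:
        """Block until `offset` bytes are available; False on timeout or short file."""
        with self._cond:
            self._cond.wait_for(
                lambda: self._written >= offset or self._finished, timeout
            )
            return self._written >= offset
```

A caller would check `wait_for(byte_offset, timeout=...)` before asking the mixer to start at a later position, and fall back to playing from the beginning if it returns False.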
Seeking and finding third party libraries
If you need to have full seeking support on remote URLs and don't want to use on-disk space for them and don't want to have to worry about their size, you don't need to re-invent the HTTP-as-seekable-file wheel here. You could use an existing project that offers the same functionality. I found two that offer io.BufferedIOBase-based implementations:
smart_open
httpio
Both use HTTP Range requests to implement seeking support. Just use httpio.open(URL) or smart_open.open(URL) and pass that directly to pygame.mixer.music.load(); if the URL can't be opened, you can catch that by handling the IOError exception:
from smart_open import open as url_open  # or from httpio import open

try:
    remote_music = url_open('http://127.0.0.1/example.wav')
except IOError:
    pass
else:
    pygame.mixer.music.load(remote_music)
    pygame.mixer.music.play()
smart_open uses an in-memory buffer to satisfy reads of a fixed size, but creates a new HTTP Range request for every call to seek that changes the current file position, so performance may vary. Since the SDL mixer executes a few seeks on audio files to determine their type, I expect this to be a little slower.
httpio can buffer blocks of data and so might handle seeks better, but from a brief glance at the source code, when actually setting a buffer size the cached blocks are never evicted from memory again so you'd end up with the whole file in memory, eventually.
Implementing seeking ourselves, via io.RawIOBase
And finally, because I'm not able to find efficient HTTP-Range-backed I/O implementations, I wrote my own. The following implements the io.RawIOBase interface, specifically so you can then wrap the object in an io.BufferedReader() and so delegate caching to a caching buffer that will be managed correctly when seeking:
import io
from copy import deepcopy
from functools import wraps
from typing import cast, overload, Callable, Optional, Tuple, TypeVar, Union
from urllib.request import urlopen, Request
T = TypeVar("T")

@overload
def _check_closed(_f: T) -> T: ...
@overload
def _check_closed(*, connect: bool, default: Union[bytes, int]) -> Callable[[T], T]: ...
def _check_closed(
    _f: Optional[T] = None,
    *,
    connect: bool = False,
    default: Optional[Union[bytes, int]] = None,
) -> Union[T, Callable[[T], T]]:
    def decorator(f: T) -> T:
        @wraps(cast(Callable, f))
        def wrapper(self, *args, **kwargs):
            if self.closed:
                raise ValueError("I/O operation on closed file.")
            # parenthesised so that self._fp.closed is never read when _fp is None
            if connect and (self._fp is None or self._fp.closed):
                self._connect()
                if self._fp is None:
                    # outside the seekable range, exit early
                    return default
            try:
                return f(self, *args, **kwargs)
            except Exception:
                self.close()
                raise
            finally:
                if (
                    self._fp is not None
                    and self._range_end
                    and self._pos >= self._range_end
                ):
                    self._fp.close()
                    del self._fp
        return cast(T, wrapper)

    if _f is not None:
        return decorator(_f)
    return decorator
def _parse_content_range(
    content_range: str
) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Parse a Content-Range header into a (start, end, length) tuple"""
    units, *range_spec = content_range.split(None, 1)
    if units != "bytes" or not range_spec:
        return (None, None, None)
    start_end, _, size = range_spec[0].partition("/")
    try:
        length: Optional[int] = int(size)
    except ValueError:
        length = None
    start_val, has_start_end, end_val = start_end.partition("-")
    start = end = None
    if has_start_end:
        try:
            start, end = int(start_val), int(end_val)
        except ValueError:
            pass
    return (start, end, length)
class HTTPRawIO(io.RawIOBase):
    """Wrap a HTTP socket to handle seeking via HTTP Range"""

    url: str
    closed: bool = False
    _pos: int = 0
    _size: Optional[int] = None
    _range_end: Optional[int] = None
    _fp: Optional[io.RawIOBase] = None

    def __init__(self, url_or_request: Union[Request, str]) -> None:
        if isinstance(url_or_request, str):
            self._request = Request(url_or_request)
        else:
            # copy request objects to avoid sharing state
            self._request = deepcopy(url_or_request)
        self.url = self._request.full_url
        self._connect(initial=True)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def close(self) -> None:
        if self.closed:
            return
        if self._fp:
            self._fp.close()
            del self._fp
        self.closed = True

    @_check_closed
    def tell(self) -> int:
        return self._pos
    def _connect(self, initial: bool = False) -> None:
        if self._fp is not None:
            self._fp.close()
        if self._size is not None and self._pos >= self._size:
            # can't read past the end
            return
        request = self._request
        request.add_unredirected_header("Range", f"bytes={self._pos}-")
        response = urlopen(request)
        self.url = response.geturl()  # could have been redirected
        if response.status not in (200, 206):
            raise OSError(
                f"Failed to open {self.url}: "
                f"{response.status} ({response.reason})"
            )
        if initial:
            # verify that the server supports range requests. Capture the
            # content length if available
            if response.getheader("Accept-Ranges") != "bytes":
                raise OSError(
                    f"Resource doesn't support range requests: {self.url}"
                )
            try:
                length = int(response.getheader("Content-Length", ""))
                if length >= 0:
                    self._size = length
            except ValueError:
                pass
        # validate the range we are being served
        start, end, length = _parse_content_range(
            response.getheader("Content-Range", "")
        )
        if self._size is None:
            self._size = length
        if (start is not None and start != self._pos) or (
            length is not None and length != self._size
        ):
            # non-sensical range response
            raise OSError(
                f"Resource at {self.url} served invalid range: pos is "
                f"{self._pos}, range {start}-{end}/{length}"
            )
        if self._size and end is not None and end + 1 < self._size:
            # incomplete range, not reaching all the way to the end
            self._range_end = end
        else:
            self._range_end = None
        fp = cast(io.BufferedIOBase, response.fp)  # typeshed doesn't name fp
        self._fp = fp.detach()  # assume responsibility for the raw socket IO
    @_check_closed
    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        relative_to = {
            io.SEEK_SET: 0,
            io.SEEK_CUR: self._pos,
            io.SEEK_END: self._size,
        }.get(whence)
        if relative_to is None:
            if whence == io.SEEK_END:
                raise IOError(
                    f"Can't seek from end on unsized resource {self.url}"
                )
            raise ValueError(f"whence value {whence} unsupported")
        if -offset > relative_to:  # can't seek to a point before the start
            raise OSError(22, "Invalid argument")
        self._pos = relative_to + offset
        # there is no point in optimising an existing connection
        # by reading from it if seeking forward below some threshold.
        # Use a BufferedReader to avoid seeking by small amounts or by 0
        if self._fp:
            self._fp.close()
            del self._fp
        return self._pos

    # all read* methods delegate to the SocketIO object (itself a RawIO
    # implementation).
    @_check_closed(connect=True, default=b"")
    def read(self, size: int = -1) -> Optional[bytes]:
        assert self._fp is not None  # show type checkers we already checked
        res = self._fp.read(size)
        if res is not None:
            self._pos += len(res)
        return res

    @_check_closed(connect=True, default=b"")
    def readall(self) -> bytes:
        assert self._fp is not None  # show type checkers we already checked
        res = self._fp.readall()
        self._pos += len(res)
        return res

    @_check_closed(connect=True, default=0)
    def readinto(self, buffer: bytearray) -> Optional[int]:
        assert self._fp is not None  # show type checkers we already checked
        n = self._fp.readinto(buffer)
        self._pos += n or 0
        return n
Remember that this is a RawIOBase object, which you really want to wrap in a BufferedReader(). Doing so in open_url() looks like this:
def open_url(url, *args, **kwargs):
    return io.BufferedReader(HTTPRawIO(url), *args, **kwargs)
This gives you fully buffered I/O, with full seeking support, over a remote URL, and the BufferedReader implementation will minimise resetting the HTTP connection when seeking. I've found that when using this with the PyGame mixer, only a single HTTP connection is made, as all the test seeks are within the default 8KB buffer.
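The buffering effect described here can be observed without a server. The sketch below uses `CountingRaw`, a hypothetical in-memory stand-in for HTTPRawIO that counts calls to its `seek()` method, and sets the buffer size to 8 KB explicitly. It reflects behaviour seen with CPython's BufferedReader; other implementations may differ.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory stand-in for HTTPRawIO that counts how often seek() is called."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self._pos = 0
        self.seek_calls = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        self.seek_calls += 1
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos,
                io.SEEK_END: len(self._payload)}[whence]
        self._pos = base + offset
        return self._pos

    def readinto(self, buffer) -> int:
        chunk = self._payload[self._pos:self._pos + len(buffer)]
        buffer[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


payload = bytes(range(256)) * 400              # 102,400 bytes of sample data
raw = CountingRaw(payload)
reader = io.BufferedReader(raw, buffer_size=8192)
reader.read(16)                                # fills the buffer from the raw object
reader.seek(4)                                 # target is still inside the buffer
near = raw.seek_calls
reader.seek(50_000)                            # target is outside the buffer
far = raw.seek_calls
print(near, far)
```

With HTTPRawIO in place of `CountingRaw`, each raw-level seek drops the current connection, so the count of raw seeks is a rough proxy for how many new range requests a series of reads and seeks would trigger.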
If you're fine with using the requests module (which supports streaming) instead of urllib, you could use a wrapper like this:
from io import BytesIO, SEEK_END, SEEK_SET

class ResponseStream(object):
    def __init__(self, request_iterator):
        self._bytes = BytesIO()
        self._iterator = request_iterator

    def _load_all(self):
        self._bytes.seek(0, SEEK_END)
        for chunk in self._iterator:
            self._bytes.write(chunk)

    def _load_until(self, goal_position):
        current_position = self._bytes.seek(0, SEEK_END)
        while current_position < goal_position:
            try:
                # write() returns the number of bytes written, so accumulate it
                current_position += self._bytes.write(next(self._iterator))
            except StopIteration:
                break

    def tell(self):
        return self._bytes.tell()

    def read(self, size=None):
        left_off_at = self._bytes.tell()
        if size is None:
            self._load_all()
        else:
            goal_position = left_off_at + size
            self._load_until(goal_position)
        self._bytes.seek(left_off_at)
        return self._bytes.read(size)

    def seek(self, position, whence=SEEK_SET):
        if whence == SEEK_END:
            # the end is only known once everything has been downloaded
            self._load_all()
        return self._bytes.seek(position, whence)
Then I guess you can do something like this:
from threading import Thread

import pygame
import requests

WINDOW_WIDTH = 400
WINDOW_HEIGHT = 400
SKY_BLUE = (161, 255, 254)
URL = 'http://localhost:8000/example.wav'
pygame.init()
window = pygame.display.set_mode( ( WINDOW_WIDTH, WINDOW_HEIGHT ) )
pygame.display.set_caption("Music Streamer")
clock = pygame.time.Clock()
done = False
font = pygame.font.SysFont(None, 32)
state = 0

def play_music():
    global state  # reset the module-level flag if the request fails
    response = requests.get(URL, stream=True)
    if (response.status_code == 200):
        stream = ResponseStream(response.iter_content(64))
        pygame.mixer.music.load(stream)
        pygame.mixer.music.play()
    else:
        state = 0

while not done:
    for event in pygame.event.get():
        if ( event.type == pygame.QUIT ):
            done = True
        if event.type == pygame.KEYDOWN and state == 0:
            Thread(target=play_music).start()
            state = 1
    window.fill( SKY_BLUE )
    window.blit(font.render(str(pygame.time.get_ticks()), True, (0,0,0)), (32, 32))
    pygame.display.flip()
    clock.tick_busy_loop( 60 )
pygame.quit()
This uses a Thread to start streaming, so the main loop is not blocked while the request is made.
I'm not sure this works 100%, but give it a try.
I'm trying to call RegisterWaitForSingleObject on a file handle in Python to check asynchronously whether there's data. My understanding is that I can create a callback function in Python, pass it to RegisterWaitForSingleObject, and as soon as there's data to read, the callback function will be called.
Here's my current implementation:
def _decl(name, ret=None, args=(), module=kernel32):
    fn = getattr(module, name)
    fn.restype = ret
    fn.argtypes = args
    return fn

WAITORTIMERCALLBACK = ctypes.WINFUNCTYPE(
    None, # return value: VOID
    ctypes.wintypes.LPVOID, # PVOID lpParameter
    ctypes.wintypes.BOOL # BOOLEAN TimerOrWaitFired
)
new_wait_object = HANDLE()
RegisterWaitForSingleObject = _decl(
    "RegisterWaitForSingleObject",
    BOOL,
    (ctypes.POINTER(HANDLE), HANDLE, WAITORTIMERCALLBACK, LPVOID, DWORD, DWORD)
)

def waitortimercallback(lp_parameter, timer_fired):
    print('Data available')

c_waitortimercallback = WAITORTIMERCALLBACK(waitortimercallback)
result = RegisterWaitForSingleObject(
    ctypes.byref(new_wait_object), # phNewWaitObject
    overlapped.hEvent, # hObject
    c_waitortimercallback, # Callback
    None, # Context
    timeout_mil, # dwMilliseconds
    0, # dwFlags
)
I'm not sure I've done everything (or anything) correctly. But I've had success previously with WaitForSingleObject. Unfortunately there are no examples on the internet for RegisterWaitForSingleObject and Python. I probably defined some types incorrectly.
When I run my code, RegisterWaitForSingleObject returns 1 (success), but as soon as data is written, I receive "Segmentation fault" error.
Would appreciate any input.
I'm having trouble with one of my scripts, where it erratically seems to have trouble writing to its own log, throwing the error "This file is being used by another process."
I know there are ways to handle this with try excepts, but I'd like to find out why this is happening rather than just papering over it. Nothing else should be accessing that file at all. So in order to confirm the source of the bug, I'd like to find out what service is using that file.
Is there a way in Python on Windows to check what process is using a given file?
You can use Microsoft's handle.exe command-line utility. For example:
import re
import subprocess

_handle_pat = re.compile(r'(.*?)\s+pid:\s+(\d+).*[0-9a-fA-F]+:\s+(.*)')

def open_files(name):
    """return a list of (process_name, pid, filename) tuples for
    open files matching the given name."""
    lines = subprocess.check_output('handle.exe "%s"' % name).splitlines()
    results = (_handle_pat.match(line.decode('mbcs')) for line in lines)
    return [m.groups() for m in results if m]
Note that this has limitations regarding Unicode filenames. In Python 2 subprocess passes name as an ANSI string because it calls CreateProcessA instead of CreateProcessW. In Python 3 the name gets passed as Unicode. In either case, handle.exe writes its output using a lossy ANSI encoding, so the matched filename in the result tuple may contain best-fit characters and "?" replacements.
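To see what the pattern in open_files() extracts, it can be applied to a single line without running handle.exe. The sample line below is made up for illustration and assumes the layout of process name, pid, handle type, hex handle value, and path; the process name, pid, and path are not real output.

```python
import re

# Same pattern as in open_files() above.
_handle_pat = re.compile(r'(.*?)\s+pid:\s+(\d+).*[0-9a-fA-F]+:\s+(.*)')

# Made-up sample line (assumed layout: name, pid, handle type, hex handle, path).
sample = r"python.exe         pid: 4321   type: File           1A4: C:\logs\script.log"

match = _handle_pat.match(sample)
process_name, pid, filename = match.groups()
print(process_name, int(pid), filename)
```

Note that the pid comes back as a string, so it needs an `int()` conversion before it is compared with values such as `os.getpid()`.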
Please don't delete this answer in case I did anything wrong but give me a chance to correct it by leaving a comment. Thanks!
There is a better way than iterating through all PIDs (as was suggested in the comments) that involves performing a Windows API call to determine all handles on a given file. Please find below a code example which I have already posted for another question (however, cannot flag as duplicate since it does not have any accepted answers). Note that this only works for Windows.
import ctypes
from ctypes import wintypes
path = r"C:\temp\stackoverflow39570207.txt"
# -----------------------------------------------------------------------------
# generic strings and constants
# -----------------------------------------------------------------------------
ntdll = ctypes.WinDLL('ntdll')
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
NTSTATUS = wintypes.LONG
INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value
FILE_READ_ATTRIBUTES = 0x80
FILE_SHARE_READ = 1
OPEN_EXISTING = 3
FILE_FLAG_BACKUP_SEMANTICS = 0x02000000
FILE_INFORMATION_CLASS = wintypes.ULONG
FileProcessIdsUsingFileInformation = 47
LPSECURITY_ATTRIBUTES = wintypes.LPVOID
ULONG_PTR = wintypes.WPARAM
# -----------------------------------------------------------------------------
# create handle on concerned file with dwDesiredAccess == FILE_READ_ATTRIBUTES
# -----------------------------------------------------------------------------
kernel32.CreateFileW.restype = wintypes.HANDLE
kernel32.CreateFileW.argtypes = (
wintypes.LPCWSTR, # In lpFileName
wintypes.DWORD, # In dwDesiredAccess
wintypes.DWORD, # In dwShareMode
LPSECURITY_ATTRIBUTES, # In_opt lpSecurityAttributes
wintypes.DWORD, # In dwCreationDisposition
wintypes.DWORD, # In dwFlagsAndAttributes
wintypes.HANDLE) # In_opt hTemplateFile
hFile = kernel32.CreateFileW(
path, FILE_READ_ATTRIBUTES, FILE_SHARE_READ, None, OPEN_EXISTING,
FILE_FLAG_BACKUP_SEMANTICS, None)
if hFile == INVALID_HANDLE_VALUE:
    raise ctypes.WinError(ctypes.get_last_error())
# -----------------------------------------------------------------------------
# prepare data types for system call
# -----------------------------------------------------------------------------
class IO_STATUS_BLOCK(ctypes.Structure):
    class _STATUS(ctypes.Union):
        _fields_ = (('Status', NTSTATUS),
                    ('Pointer', wintypes.LPVOID))
    _anonymous_ = '_Status',
    _fields_ = (('_Status', _STATUS),
                ('Information', ULONG_PTR))

iosb = IO_STATUS_BLOCK()

class FILE_PROCESS_IDS_USING_FILE_INFORMATION(ctypes.Structure):
    _fields_ = (('NumberOfProcessIdsInList', wintypes.LARGE_INTEGER),
                ('ProcessIdList', wintypes.LARGE_INTEGER * 64))

info = FILE_PROCESS_IDS_USING_FILE_INFORMATION()
PIO_STATUS_BLOCK = ctypes.POINTER(IO_STATUS_BLOCK)
ntdll.NtQueryInformationFile.restype = NTSTATUS
ntdll.NtQueryInformationFile.argtypes = (
wintypes.HANDLE, # In FileHandle
PIO_STATUS_BLOCK, # Out IoStatusBlock
wintypes.LPVOID, # Out FileInformation
wintypes.ULONG, # In Length
FILE_INFORMATION_CLASS) # In FileInformationClass
# -----------------------------------------------------------------------------
# system call to retrieve list of PIDs currently using the file
# -----------------------------------------------------------------------------
status = ntdll.NtQueryInformationFile(hFile, ctypes.byref(iosb),
ctypes.byref(info),
ctypes.sizeof(info),
FileProcessIdsUsingFileInformation)
pidList = info.ProcessIdList[0:info.NumberOfProcessIdsInList]
print(pidList)
I need to know how to monitor a memory address and its values in Python.
For example: I have a game that is written in C. I want to write a Python script that reads the memory address of my current HitPoints and takes actions based on its value.
I can already get the memory addresses with CheatEngine, but I don't know how to use them in Python.
Here's a read_process function. The result is either bytes (2.x str), or an array of ctypes structures. The default is to read 1 byte from the process. The optional dtype parameter must be a ctypes type, such as ctypes.c_int or a ctypes.Structure subclass. It reads an array of the given type and length.
Be careful to avoid dereferencing pointer values. For example, if you pass dtype=c_char_p, then simply indexing the result array will try to dereference a remote pointer in the current process, which will likely crash Python. In a previous answer I wrote a read-only RemotePointer class if you need to handle that case.
ctypes definitions
import ctypes
from ctypes import wintypes
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
PROCESS_VM_READ = 0x0010
SIZE_T = ctypes.c_size_t
PSIZE_T = ctypes.POINTER(SIZE_T)
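def _check_bool(result, func, args):
    # errcheck hook: raise OSError with the saved last-error code
    # when the API returns FALSE or NULL.
    if not result:
        raise ctypes.WinError(ctypes.get_last_error())
    return args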
kernel32.OpenProcess.errcheck = _check_bool
kernel32.OpenProcess.restype = wintypes.HANDLE
kernel32.OpenProcess.argtypes = (
wintypes.DWORD, # _In_ dwDesiredAccess
wintypes.BOOL, # _In_ bInheritHandle
wintypes.DWORD) # _In_ dwProcessId
kernel32.CloseHandle.errcheck = _check_bool
kernel32.CloseHandle.argtypes = (
wintypes.HANDLE,)
kernel32.ReadProcessMemory.errcheck = _check_bool
kernel32.ReadProcessMemory.argtypes = (
wintypes.HANDLE, # _In_ hProcess
wintypes.LPCVOID, # _In_ lpBaseAddress
wintypes.LPVOID, # _Out_ lpBuffer
SIZE_T, # _In_ nSize
PSIZE_T) # _Out_ lpNumberOfBytesRead
read_process definition
def read_process(pid, address, length=1, dtype=ctypes.c_char):
    result = (dtype * length)()
    nread = SIZE_T()
    hProcess = kernel32.OpenProcess(PROCESS_VM_READ, False, pid)
    try:
        kernel32.ReadProcessMemory(hProcess, address, result,
                                   ctypes.sizeof(result),
                                   ctypes.byref(nread))
    finally:
        kernel32.CloseHandle(hProcess)
    if issubclass(dtype, ctypes.c_char):
        return result.raw
    return result
example
if __name__ == '__main__':
    import os
    class DType(ctypes.Structure):
        _fields_ = (('x', ctypes.c_int),
                    ('y', ctypes.c_double))
    source = (DType * 2)(*[(42, 3.14),
                           (84, 2.72)])
    pid = os.getpid()
    address = ctypes.addressof(source)
    sink = read_process(pid, address, 2, DType)
    for din, dout in zip(source, sink):
        assert din.x == dout.x
        assert din.y == dout.y
    size = ctypes.sizeof(source)
    buf_source = ctypes.string_at(source, size)
    buf_sink = read_process(pid, address, size)
    assert buf_source == buf_sink
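For the monitoring use case in the question, one possible approach is to poll the address and decode the bytes on each pass. The sketch below is only an illustration under assumptions: watch_int32, its parameters, and local_reader are hypothetical names, and the value is assumed to be a signed 32-bit integer in native byte order. To keep it runnable anywhere, it reads the current process's own memory with ctypes.string_at; on Windows you would pass a reader that calls read_process(pid, address, length) instead.

```python
import ctypes
import time

def watch_int32(reader, address, on_change, polls=3, interval=0.0):
    """Poll a 4-byte integer and call on_change(old, new) when it changes.

    reader(address, length) must return `length` bytes; it stands in for
    something like read_process(pid, address, length).
    """
    last = None
    for _ in range(polls):
        data = reader(address, 4)
        value = ctypes.c_int32.from_buffer_copy(data).value
        if last is not None and value != last:
            on_change(last, value)
        last = value
        time.sleep(interval)
    return last

# Local demonstration with a value owned by this process.
hit_points = ctypes.c_int32(100)
events = []

def local_reader(address, length):
    data = ctypes.string_at(address, length)
    hit_points.value -= 10  # simulate the game changing the value
    return data

final = watch_int32(local_reader, ctypes.addressof(hit_points),
                    lambda old, new: events.append((old, new)))
print(final, events)
```

Opening and closing the process handle on every poll, as read_process does, is simple but not cheap; for a tight polling loop it may be worth keeping the handle open between reads.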
I'm trying to work with ctypes, and I can't get the call to FormatMessage() to work properly.
Here's the code I have so far. I think the only issue is passing in a mutable buffer: I'm getting an ArgumentError from ctypes about lpBuffer.
import ctypes
from ctypes.wintypes import DWORD
def main():
    fm = ctypes.windll.kernel32.FormatMessageA
    fm.argtypes = [DWORD,DWORD,DWORD,DWORD,ctypes.wintypes.LPWSTR(),DWORD]
    dwFlags = DWORD(0x1000) # FORMAT_MESSAGE_ALLOCATE_BUFFER |FORMAT_MESSAGE_FROM_SYSTEM
    lpSource = DWORD(0)
    dwMessageId = DWORD(0x05)
    dwLanguageId = DWORD(0)
    #buf = ctypes.wintypes.LPWSTR()
    #lpBuffer = ctypes.byref(buf)
    lpBuffer = ctypes.create_string_buffer(512)
    nSize = DWORD(512)
    res = fm(dwFlags,lpSource,dwMessageId,dwLanguageId,lpBuffer,nSize)
    print res
I'm getting an error on the lpBuffer argument saying it's the wrong type, but I've tried as many variations of passing in the buffer as I could think of. I've tried doing it similarly to https://gist.github.com/CBWhiz/6135237, setting FORMAT_MESSAGE_ALLOCATE_BUFFER and then passing in an LPWSTR() byref. I've also tried changing the argtype and casting to a variety of types (LPWSTR(), c_char_p, etc.), but no matter what I do it keeps complaining.
What's the proper syntax to get the function to execute properly? I know ctypes can be finicky, but I haven't found anything in the documentation that resolves the issue. (I know the documentation uses prototype(), but I'd like to do it this way for now.)
Thanks
Here's the argtypes definition for FormatMessageW (note "W" for Unicode):
import ctypes
from ctypes import wintypes
fm = ctypes.windll.kernel32.FormatMessageW
fm.argtypes = [
wintypes.DWORD, # dwFlags
wintypes.LPCVOID, # lpSource
wintypes.DWORD, # dwMessageId
wintypes.DWORD, # dwLanguageId
wintypes.LPWSTR, # lpBuffer
wintypes.DWORD, # nSize
wintypes.LPVOID, # Arguments (va_list *)
]
FORMAT_MESSAGE_ALLOCATE_BUFFER = 0x100
FORMAT_MESSAGE_FROM_SYSTEM = 0x1000
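Note that the two constants are separate bits that have to be combined with a bitwise OR; 0x1000 on its own is only FORMAT_MESSAGE_FROM_SYSTEM. As a small illustration, an enum.IntFlag makes the combination easy to check. The FormatMessageFlags class name is made up for this sketch; the values are the FORMAT_MESSAGE_* constants from the Windows headers.

```python
import enum

class FormatMessageFlags(enum.IntFlag):
    ALLOCATE_BUFFER = 0x0100
    IGNORE_INSERTS = 0x0200
    FROM_STRING = 0x0400
    FROM_HMODULE = 0x0800
    FROM_SYSTEM = 0x1000
    ARGUMENT_ARRAY = 0x2000

# Combine the two flags used in this answer.
flags = FormatMessageFlags.FROM_SYSTEM | FormatMessageFlags.ALLOCATE_BUFFER
print(hex(int(flags)))

# 0x1000 alone does not request an allocated buffer.
only_system = FormatMessageFlags(0x1000)
print(FormatMessageFlags.ALLOCATE_BUFFER in only_system)
```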
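If FormatMessage allocates the buffer, you instead have to pass a reference to lpBuffer, because with FORMAT_MESSAGE_ALLOCATE_BUFFER the function writes the address of the buffer it allocated into that location. Cast the reference to LPWSTR to get around the TypeError. Also remember to call kernel32.LocalFree to free the buffer: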
def main():
    dwFlags = FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER
    lpSource = None
    dwMessageId = 5
    dwLanguageId = 0
    lpBuffer = wintypes.LPWSTR()
    nSize = 0 # minimum size
    Arguments = None
    if not fm(dwFlags, lpSource, dwMessageId, dwLanguageId,
              ctypes.cast(ctypes.byref(lpBuffer), wintypes.LPWSTR),
              nSize, Arguments):
        raise ctypes.WinError()
    msg = lpBuffer.value.rstrip()
    ctypes.windll.kernel32.LocalFree(lpBuffer)
    return msg
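If you would rather not have FormatMessage allocate the buffer, an alternative is to pass a buffer you own, so neither the cast nor LocalFree is needed. The following is a sketch under assumptions: format_system_message is a hypothetical helper name, the 512-character buffer size is arbitrary, and the call is guarded so the code can still be imported on non-Windows platforms.

```python
import ctypes
import sys
from ctypes import wintypes

FORMAT_MESSAGE_IGNORE_INSERTS = 0x0200
FORMAT_MESSAGE_FROM_SYSTEM = 0x1000

def format_system_message(code, size=512):
    """Return the system message text for a Windows error code."""
    if sys.platform != 'win32':
        raise OSError('FormatMessageW is only available on Windows')
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.FormatMessageW.restype = wintypes.DWORD
    kernel32.FormatMessageW.argtypes = [
        wintypes.DWORD,    # dwFlags
        wintypes.LPCVOID,  # lpSource
        wintypes.DWORD,    # dwMessageId
        wintypes.DWORD,    # dwLanguageId
        wintypes.LPWSTR,   # lpBuffer
        wintypes.DWORD,    # nSize
        wintypes.LPVOID,   # Arguments (va_list *)
    ]
    buf = ctypes.create_unicode_buffer(size)  # caller-owned, mutable
    length = kernel32.FormatMessageW(
        FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
        None, code, 0, buf, size, None)
    if not length:
        raise ctypes.WinError(ctypes.get_last_error())
    return buf.value.rstrip()

if sys.platform == 'win32':
    print(format_system_message(5))
```

If all you need is the text for an error code, ctypes.FormatError(code) and ctypes.WinError(code) already wrap this on Windows.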