Python3.6 download 1.3G Big Video File generate MemoryError - python

self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
def download(self):
print("start thread:%s at %s" % (self.getName(), time.time()))
headers = {"Range": "bytes=%s-%s" % (self.startpos, self.endpos)}
res = requests.get(self.url, headers=headers, stream=True)
# res.text is the bytes fetched by get, automatically decoded into a str; res.content is the raw bytes
# so below we write(res.content) directly
with open(self.filename, "wb") as fp:
print("stop thread:%s at %s" % (self.getName(), time.time()))
# f.close()
def run(self):

I had the same problem on Windows 10. If your platform is NOT Windows NT or newer, this answer will not help.
The problem is: on Windows the socket input is always buffered. This can NOT be avoided. No way. And this causes MemoryError because of Windows (not by Python code) and by Python C internals - socket input is buffered to a C variable, these have limited size so we have to "help" that buffer - the solution is: creating a bytearray object, calling its join method (second parameter - sockobj.recv(1024)) in a while loop. To accomplish this, navigate to <Python Installdir>\Lib\ (can also be Lib\site-packages\requests\ - notice TWO underscores!) and fix all socket usage with the fix specified above. Also perform such fix for all files in the requests package and also urllib3 package.
Hope this helps.
No. Python 64bit will NOT fix your problem: Windows Socket Buffer will remain THE SAME! Only looped recv with explicit Python bytearray buffer will help!

You're using stream=True, which exists so that the data is buffered and you can empty the buffer incrementally.
You're not doing that. So simply don't use stream=True, and save directly to a file.
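For comparison, here is a minimal sketch of what emptying the buffer incrementally could look like. It assumes the MemoryError comes from reading res.content in one go (the bytes().join(...) line quoted in the question points that way). save_stream and _FakeResponse are hypothetical names; the stand-in class only mimics the iter_content method of a requests response so the sketch runs without a network.

```python
import io


def save_stream(response, fp, chunk_size=1024 * 1024):
    """Write a streamed response to fp chunk by chunk; return bytes written."""
    written = 0
    for chunk in response.iter_content(chunk_size=chunk_size):
        if chunk:  # skip empty keep-alive chunks
            fp.write(chunk)
            written += len(chunk)
    return written


class _FakeResponse:
    """Stand-in for a requests response (assumption: only iter_content is used)."""

    def iter_content(self, chunk_size=1):
        yield b"abc"
        yield b""
        yield b"defg"


buffer = io.BytesIO()
total = save_stream(_FakeResponse(), buffer)

# With the real library the call would look roughly like this:
#   res = requests.get(url, headers=headers, stream=True)
#   with open(filename, "wb") as fp:
#       save_stream(res, fp)
```

Only one chunk is held in memory at a time, so the size of the file no longer matters.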


How to check output of a sub process but also hide it? [duplicate]

NB. I have seen Log output of multiprocessing.Process - unfortunately, it doesn't answer this question.
I am creating a child process (on windows) via multiprocessing. I want all of the child process's stdout and stderr output to be redirected to a log file, rather than appearing at the console. The only suggestion I have seen is for the child process to set sys.stdout to a file. However, this does not effectively redirect all stdout output, due to the behaviour of stdout redirection on Windows.
To illustrate the problem, build a Windows DLL with the following code
#include <iostream>
extern "C"
__declspec(dllexport) void writeToStdOut()
{
    std::cout << "Writing to STDOUT from test DLL" << std::endl;
}
Then create and run a python script like the following, which imports this DLL and calls the function:
from ctypes import *
import sys
print "Writing to STDOUT from python, before redirect"
sys.stdout = open("stdout_redirect_log.txt", "w")
print "Writing to STDOUT from python, after redirect"
testdll = CDLL("Release/stdout_test.dll")
testdll.writeToStdOut()
In order to see the same behaviour as me, it is probably necessary for the DLL to be built against a different C runtime than the one Python uses. In my case, python is built with Visual Studio 2010, but my DLL is built with VS 2005.
The behaviour I see is that the console shows:
Writing to STDOUT from python, before redirect
Writing to STDOUT from test DLL
While the file stdout_redirect_log.txt ends up containing:
Writing to STDOUT from python, after redirect
In other words, setting sys.stdout failed to redirect the stdout output generated by the DLL. This is unsurprising given the nature of the underlying APIs for stdout redirection in Windows. I have encountered this problem at the native/C++ level before and never found a way to reliably redirect stdout from within a process. It has to be done externally.
This is actually the very reason I am launching a child process - it's so that I can connect externally to its pipes and thus guarantee that I am intercepting all of its output. I can definitely do this by launching the process manually with pywin32, but I would very much like to be able to use the facilities of multiprocessing, in particular the ability to communicate with the child process via a multiprocessing Pipe object, in order to get progress updates. The question is whether there is any way to both use multiprocessing for its IPC facilities and to reliably redirect all of the child's stdout and stderr output to a file.
UPDATE: Looking at the source code for multiprocessing.Process, it has a static member, _Popen, which looks like it can be used to override the class used to create the process. If it's set to None (default), it uses a multiprocessing.forking._Popen, but it looks like by saying
multiprocessing.Process._Popen = MyPopenClass
I could override the process creation. However, although I could derive this from multiprocessing.forking._Popen, it looks like I would have to copy a bunch of internal stuff into my implementation, which sounds flaky and not very future-proof. If that's the only choice I think I'd probably plump for doing the whole thing manually with pywin32 instead.
The solution you suggest is a good one: create your processes manually such that you have explicit access to their stdout/stderr file handles. You can then create a socket to communicate with the sub-process and use multiprocessing.connection over that socket (multiprocessing.Pipe creates the same type of connection object, so this should give you all the same IPC functionality).
Here's a two-file example.
import multiprocessing.connection
import subprocess
import socket
import sys, os
## Listen for connection from remote process (and find free port number)
port = 10000
while True:
    try:
        l = multiprocessing.connection.Listener(('localhost', int(port)), authkey="secret")
        break
    except socket.error as ex:
        if ex.errno != 98:
            raise
        port += 1 ## if errno==98, then port is not available.
proc = subprocess.Popen((sys.executable, "", str(port)), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
## open connection for remote process
conn = l.accept()
conn.send([1, "asd", None])
import multiprocessing.connection
import subprocess
import sys, os, time
port = int(sys.argv[1])
conn = multiprocessing.connection.Client(('localhost', port), authkey="secret")
while True:
    try:
        obj = conn.recv()
        print("received: %s\n" % str(obj))
    except EOFError: ## connection closed
        break
You may also want to see the first answer to this question to get non-blocking reads from the subprocess.
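The connection setup from the two-file example can also be tried out in a single file. The sketch below is a Python 3 variant under a few assumptions: a thread stands in for the second process, the authkey is given as bytes (which Python 3 requires), and port 0 is used so the operating system picks a free port instead of looping over port numbers. child_side and received are hypothetical names.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"secret"  # must be bytes on Python 3

# Port 0 asks the OS for any free port; listener.address reports the real one.
listener = Listener(("localhost", 0), authkey=AUTHKEY)
host, port = listener.address

received = []


def child_side():
    """Plays the role of the sub-process: connect and read one object."""
    with Client((host, port), authkey=AUTHKEY) as conn:
        received.append(conn.recv())


worker = threading.Thread(target=child_side)
worker.start()

# Parent side: accept the connection and send a picklable object.
with listener.accept() as conn:
    conn.send([1, "asd", None])

worker.join()
listener.close()
```

In the real setup the port number would be passed to the child on its command line, as in the example above, and the child's stdout and stderr stay available to the parent through subprocess.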
I don't think you have a better option than redirecting a subprocess to a file as you mentioned in your comment.
The way console stdin/stdout/stderr work on Windows is that each process has its standard handles defined when it is created. You can change them with SetStdHandle. When you modify Python's sys.stdout you only modify where Python prints, not where other DLLs print. Part of the CRT in your DLL uses GetStdHandle to find out where to print. If you want, you can do whatever piping you like with the Windows API, either in your DLL or in your Python script with pywin32. I do think it will be simpler with subprocess, though.
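A minimal sketch of the subprocess route, assuming Python 3: the log file is handed to the child as its standard output and error handles when the process is created, so output written below the level of sys.stdout is captured too. run_logged and the inline child code are hypothetical; the child writes once through print and once directly to file descriptor 1 to imitate a native library.

```python
import os
import subprocess
import sys
import tempfile


def run_logged(args, log_path):
    """Run args as a child process, sending its stdout and stderr to log_path."""
    with open(log_path, "wb") as log:
        return subprocess.call(args, stdout=log, stderr=subprocess.STDOUT)


log_path = os.path.join(tempfile.mkdtemp(), "child.log")

# Child code: one write through sys.stdout, one straight to descriptor 1.
code = (
    "import os, sys; print('from python'); sys.stdout.flush(); "
    "os.write(1, b'from fd 1\\n')"
)
exit_code = run_logged([sys.executable, "-c", code], log_path)

with open(log_path, "rb") as f:
    captured = f.read()
```

Because the redirection happens outside the child, nothing inside the child needs to cooperate.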
Alternatively (and I know this might be slightly off-topic, but it helped in my case with the same problem), this can be resolved with screen on Linux:
screen -L -Logfile './logfile_%Y-%m-%d.log' python
This way there is no need to implement all the master-child communication.
I assume I'm off base and missing something, but for what it's worth here is what came to mind when I read your question.
If you can intercept all of the stdout and stderr (I got that impression from your question), then why not add or wrap that capture functionality around each of your processes? Then send what is captured through a queue to a consumer that can do whatever you want with all of the outputs?
In my situation I changed sys.stdout.write to write to a PySide QTextEdit. I couldn't read from sys.stdout and I didn't know how to change sys.stdout to be readable. I created two Pipes. One for stdout and the other for stderr. In the separate process I redirect sys.stdout and sys.stderr to the child connection of the multiprocessing pipe. On the main process I created two threads to read the stdout and stderr parent pipe and redirect the pipe data to sys.stdout and sys.stderr.
import sys
import contextlib
import threading
import multiprocessing as mp
import multiprocessing.queues
from queue import Empty
import time
class PipeProcess(mp.Process):
"""Process to pipe the output of the sub process and redirect it to this sys.stdout and sys.stderr.
The use_queue = True argument will pass data between processes using Queues instead of Pipes. Queues will
give you the full output and read all of the data from the Queue. A pipe is more efficient, but may not
redirect all of the output back to the main process.
def __init__(self, group=None, target=None, name=None, args=tuple(), kwargs={}, *_, daemon=None,
use_pipe=None, use_queue=None):
self.read_out_th = None
self.read_err_th = None
self.pipe_target = target
self.pipe_alive = mp.Event()
if use_pipe or (use_pipe is None and not use_queue): # Default
self.parent_stdout, self.child_stdout = mp.Pipe(False)
self.parent_stderr, self.child_stderr = mp.Pipe(False)
self.parent_stdout = self.child_stdout = mp.Queue()
self.parent_stderr = self.child_stderr = mp.Queue()
args = (self.child_stdout, self.child_stderr, target) + tuple(args)
target = self.run_pipe_out_target
super(PipeProcess, self).__init__(group=group, target=target, name=name, args=args, kwargs=kwargs,
def start(self):
"""Start the multiprocess and reading thread."""
super(PipeProcess, self).start()
self.read_out_th = threading.Thread(target=self.read_pipe_out,
args=(self.pipe_alive, self.parent_stdout, sys.stdout))
self.read_err_th = threading.Thread(target=self.read_pipe_out,
args=(self.pipe_alive, self.parent_stderr, sys.stderr))
self.read_out_th.daemon = True
self.read_err_th.daemon = True
def run_pipe_out_target(cls, pipe_stdout, pipe_stderr, pipe_target, *args, **kwargs):
"""The real multiprocessing target to redirect stdout and stderr to a pipe or queue."""
sys.stdout.write = cls.redirect_write(pipe_stdout) # , sys.__stdout__) # Is redirected in main process
sys.stderr.write = cls.redirect_write(pipe_stderr) # , sys.__stderr__) # Is redirected in main process
pipe_target(*args, **kwargs)
def redirect_write(child, out=None):
"""Create a function to write out a pipe and write out an additional out."""
if isinstance(child, mp.queues.Queue):
send = child.put
send = child.send_bytes # No need to pickle with child_conn.send(data)
def write(data, *args):
if isinstance(data, str):
data = data.encode('utf-8')
if out is not None:
return write
def read_pipe_out(cls, pipe_alive, pipe_out, out):
if isinstance(pipe_out, mp.queues.Queue):
# Queue has better functionality to get all of the data
def recv():
return pipe_out.get(timeout=0.5)
def is_alive():
return pipe_alive.is_set() or pipe_out.qsize() > 0
# Pipe is more efficient
recv = pipe_out.recv_bytes # No need to unpickle with data = pipe_out.recv()
is_alive = pipe_alive.is_set
# Loop through reading and redirecting data
while is_alive():
data = recv()
if isinstance(data, bytes):
data = data.decode('utf-8')
except EOFError:
except Empty:
def join(self, *args):
# Wait for process to finish (unless a timeout was given)
super(PipeProcess, self).join(*args)
# Trigger to stop the threads
# Pipe must close to prevent blocking and waiting on recv forever
if not isinstance(self.parent_stdout, mp.queues.Queue):
with contextlib.suppress():
with contextlib.suppress():
# Close the pipes and threads
with contextlib.suppress():
with contextlib.suppress():
def run_long_print():
for i in range(1000):
print(i, file=sys.stderr)
if __name__ == '__main__':
# Example test write (My case was a QTextEdit)
out = open('stdout.log', 'w')
err = open('stderr.log', 'w')
# Overwrite the write function and not the actual stdout object to prove this works
sys.stdout.write = out.write
sys.stderr.write = err.write
# Create a process that uses pipes to read multiprocess output back into sys.stdout.write
proc = PipeProcess(target=run_long_print, use_queue=True) # If use_pipe=True Pipe may not write out all values
# proc.daemon = True # If daemon and use_queue Not all output may be redirected to stdout
# time.sleep(5) # Not needed unless use_pipe or daemon and all of stdout/stderr is desired
# Close the process
proc.join() # For some odd reason this blocks forever when use_queue=False
# Close the output files for this test
Here is the simple and straightforward way for capturing stdout for multiprocessing.Process:
import app
import io
import sys
from multiprocessing import Process
def run_app(some_param):
sys.stdout = io.TextIOWrapper(open(sys.stdout.fileno(), 'wb', 0), write_through=True)
app_process = Process(target=run_app, args=('some_param',))
# Use app_process.terminate() for python <= 3.7.

How to exit function with signal on Windows?

I have the following code written in Python 2.7 on Windows. I want to check for updates to the current Python script and, if there is one, replace it with the new version from an FTP server, preserving the filename, and then execute the new script after terminating the current one through os.kill with SIGTERM.
I went with the exit function approach but I read that in Windows this only works with the atexit library and default python exit methods. So I used a combination of the atexit.register() and the signal handler.
***necessary libraries***
filematch = ''
version = '0.0'
checkdir = os.path.abspath(".")
dircontent = os.listdir(checkdir)
r = StringIO()
def exithandler():
if filematch in dircontent:
os.remove(checkdir + '\\' + filematch)
except Exception as e:
print e
ftp = FTP(ip address)
ftp.login(username, password)
for filename in ftp.nlst(filematch):
fhandle = open(filename, 'wb')
ftp.retrbinary('RETR ' + filename, fhandle.write)
subprocess.Popen([sys.executable, ""])
print 'Test file successfully updated.'
except Exception as e:
print e
ftp = FTP(ip address)
ftp.login(username, password)
ftp.retrbinary('RETR version.txt', r.write)
if(r.getvalue() != version):
somepid = os.getpid()
signal.signal(SIGTERM, lambda signum, stack_frame: exit(1))
os.kill(somepid, signal.SIGTERM)
print 'Successfully replaced and started the file'
Using the:
signal.signal(SIGTERM, lambda signum, stack_frame: exit(1))
I get:
Traceback (most recent call last):
File "C:\Users\STiX\Desktop\Python Keylogger\", line 50, in <module>
signal.signal(SIGTERM, lambda signum, stack_frame: exit(1))
NameError: name 'SIGTERM' is not defined
Even so, the job gets done without a problem, except when I use the same code in a more complex script: there the script gives me the same error but terminates right away for some reason.
On the other hand though, if I use it the correct way, signal.SIGTERM, the process goes straight to termination and the exit function never executed. Why is that?
How can I make this work on Windows and get the outcome that I described above successfully?
What you are trying to do seems a bit complicated (and dangerous from an infosec perspective ;-). I would suggest handling the reload-file-when-updated part of the functionality by adding a controller class that imports the Python script you have now as a module, starts it, and then reloads it when it is updated (based on a function return or another technique) - look this way for inspiration -
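On why the handler never runs with signal.SIGTERM: according to the Python documentation for os.kill, on Windows any signal other than CTRL_C_EVENT and CTRL_BREAK_EVENT makes the process be killed unconditionally through the TerminateProcess API, so neither the signal handler nor atexit callbacks get a chance to run. A minimal sketch of a simpler route, assuming Python 3 and assuming the goal is only to have the exit handler run: leave through sys.exit instead of signalling yourself. The child script and its messages are placeholders.

```python
import subprocess
import sys

# Hypothetical stand-in for the updating script.
CHILD = r'''
import atexit, sys

def exithandler():
    # Placeholder for the real work (replace the file, start the new version).
    print("exit handler ran")

atexit.register(exithandler)
print("update found, exiting")
sys.exit(1)  # normal interpreter exit, so atexit handlers run
'''

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
```

The child is started through subprocess here only so that its output and exit code can be inspected from the same file.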
Edit - what about exe?
Another hacky technique for manipulating the file of the currently running program would be the shell ping trick. It can be used from all programming languages. The trick is to send a shell command that is not executed until after the calling process has terminated. Use ping to cause the delay and chain the other commands with &. For your use case it could be something like this:
import subprocess
subprocess.Popen("ping -n 2 -w 2000 > Nul & del & rename & ", shell=True)
Edit 2 - Alternative solution to original question
Since python does not block write access to the currently running script an alternative concept to solve the original question would be:
import subprocess
print "hello"
a = open(__file__,"r")
running_script_as_string = a.read()
b = open(__file__,"w")
b.write("\nprint 'updated version of hack'")

zlib.error: Error -5 while decompressing data: incomplete or truncated stream in Python

I have been pulling my hair out trying to get a proxy working. I need to decrypt the packets from a server and client ((this may be out of order..)), then decompress everything but the packet header.
The first 2 packets ((10101 and 20104)) are not compressed, and decrypt, destruct, and decompile properly.
Alas, but to no avail; FAIL!; zlib.error: Error -5 while decompressing data: incomplete or truncated stream
Same error while I am attempting to decompress the encrypted version of the packet.
When I include the packet header, I get a randomly chosen -3 error.
I have also tried changing -zlib.MAX_WBITS to zlib.MAX_WBITS, as well as a few others, but still get the same error.
Here's the code;
import socket, sys, os, struct, zlib
from Crypto.Cipher import ARC4 as rc4
cwd = os.getcwd()
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ss = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client, addr = s.accept()
key = "fhsd6f86f67rt8fw78fw789we78r9789wer6renonce"
cts =
stc =
skip = 'a'*len(key)
def io():
while True:
pack = client.recv(65536)
decpack = cts.decrypt(pack[7:])
msgid, paylen = dechead(pack)
if msgid != 10101:
decopack = zlib.decompress(decpack, -zlib.MAX_WBITS)
print "ID:",msgid
print "Payload Length",paylen
print "Payload:\n",decpack
dump(msgid, decpack)
except socket.timeout:
pack = ss.recv(65536)
msgid, paylen = dechead(pack)
decpack = stc.decrypt(pack[7:])
if msgid != 20104:
decopack = zlib.decompress(decpack, -zlib.MAX_WBITS)
print "ID:",msgid
print "Payload Length",paylen
print "Payload:\n",decpack
dump(msgid, decpack)
except socket.timeout:
def dump(msgid, decpack):
global cwd
pdf = open(cwd+"/"+str(msgid)+".bin",'wb')
def dechead(pack):
msgid = struct.unpack('>H', pack[0:2])[0]
print int(struct.unpack('>H', pack[5:7])[0])
payload_bytes = struct.unpack('BBB', pack[2:5])
payload_len = ((payload_bytes[0] & 255) << 16) | ((payload_bytes[1] & 255) << 8) | (payload_bytes[2] & 255)
return msgid, payload_len
I realize it's messy, disorganized and very bad, but it all works as intended minus the decompression.
Yes, I am sure the packets are zlib compressed.
What is going wrong here and why?
Full Traceback:
Traceback (most recent call last):
File "", line 68, in <module>
File "", line 33, in io
decopack = zlib.decompress(decpack, zlib.MAX_WBITS)
zlib.error: Error -5 while decompressing data: incomplete or truncated stream
I ran into the same problem while trying to decompress a file using zlib with Python 2.7. The issue had to do with the size of the stream (or file input) exceeding the size that could be stored in memory. (My PC has 16 GB of memory, so it was not exceeding the physical memory size, but the buffer default size is 16384.)
The easiest fix was to change the code from:
import zlib
f_in = open('my_data.zz', 'rb')
comp_data = f_in.read()
data = zlib.decompress(comp_data)
to:
import zlib
f_in = open('my_data.zz', 'rb')
comp_data = f_in.read()
zobj = zlib.decompressobj() # obj for decompressing data streams that won’t fit into memory at once.
data = zobj.decompress(comp_data)
It handles the stream by buffering it and feeding in into the decompressor in manageable chunks.
I hope this helps to save you time trying to figure out the problem. I had help from my friend Jordan! I was trying all kinds of different window sizes (wbits).
Edit: Even with the approach below working on partial gz files, for some files decompression returned an empty byte array, and everything I tried kept returning empty even though the call succeeded. Eventually I resorted to running a gunzip process, which always works:
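import subprocess
def gunzip_string(the_string):
    proc = subprocess.Popen('gunzip', stdout=subprocess.PIPE,
                            stdin=subprocess.PIPE, stderr=subprocess.DEVNULL)
    # communicate() feeds the input and reads all of the output without deadlocking
    body, _ = proc.communicate(the_string)
    return body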
Note that the above can return a non-zero error code indicating that the input string is incomplete but it still performs the decompression and hence the stderr being swallowed. You may wish to check errors to allow for this case.
I think the zlib decompression library is throwing an exception because you are not passing in a complete stream, just one chunk of up to 65536 bytes from ss.recv(65536). If you change from this:
decopack = zlib.decompress(decpack, -zlib.MAX_WBITS)
to this:
decompressor = zlib.decompressobj(-zlib.MAX_WBITS)
decopack = decompressor.decompress(decpack)
it should work, as that way can handle streaming.
As the docs say
zlib.decompressobj - Returns a decompression object, to be used for decompressing data streams that won’t fit into memory at once.
or even if it does fit into memory you might just want to do the beginning of the file
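A small self-contained sketch of the difference, assuming Python 3 and a raw deflate stream (negative wbits, as in the question): zlib.decompress rejects a stream that is cut short, while a decompression object accepts the same data piece by piece. The payload and the chunk size of 512 are arbitrary illustration values.

```python
import zlib

# Build a raw deflate stream (no zlib header), matching -zlib.MAX_WBITS.
original = b"".join(str(i).encode("ascii") for i in range(20000))
compressor = zlib.compressobj(
    zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
)
compressed = compressor.compress(original) + compressor.flush()

# One-shot decompression of only part of the stream raises zlib.error.
try:
    zlib.decompress(compressed[: len(compressed) // 2], -zlib.MAX_WBITS)
    truncated_failed = False
except zlib.error:
    truncated_failed = True

# A decompression object can be fed the stream in arbitrary pieces,
# for example one piece per recv() call.
decompressor = zlib.decompressobj(-zlib.MAX_WBITS)
parts = []
for start in range(0, len(compressed), 512):
    parts.append(decompressor.decompress(compressed[start:start + 512]))
parts.append(decompressor.flush())
restored = b"".join(parts)
```

In a proxy, one decompression object per compressed message would be kept alive across recv() calls until the payload length from the header has been consumed.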
Try this:
decopack = zlib.decompressobj(zlib.MAX_WBITS).decompress(decpack)

Unblock a file in windows from a python script

Could I unblock a file in windows(7), which is automatically blocked by windows (downloaded from Internet) from a python script? A WindowsError is raised when such a file is encountered. I thought of catching this exception, and running a powershell script that goes something like:
Parameter Set: ByPath
Unblock-File [-Path] <String[]> [-Confirm] [-WhatIf] [ <CommonParameters>]
Parameter Set: ByLiteralPath
Unblock-File -LiteralPath <String[]> [-Confirm] [-WhatIf] [ <CommonParameters>]
I don't know powershell scripting. But if I had one I could call it from python. Could you folks help?
Yes, all you have to do is call the following command line from Python:
powershell.exe -Command Unblock-File -Path "c:\path\to\blocked file.ps1"
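From Python, that command line could be issued through subprocess. This is a rough sketch, not a tested recipe: build_unblock_command and unblock_file are hypothetical helper names, -NoProfile is added only to skip loading the user profile, and -LiteralPath is used so that wildcard characters in the path are not interpreted. Actually running it requires Windows with PowerShell available.

```python
import subprocess


def build_unblock_command(path):
    """Return the argument list for a PowerShell Unblock-File call on path."""
    # Inside a PowerShell single-quoted string, a single quote is escaped by doubling it.
    quoted = "'" + path.replace("'", "''") + "'"
    return [
        "powershell.exe",
        "-NoProfile",
        "-Command",
        "Unblock-File -LiteralPath " + quoted,
    ]


def unblock_file(path):
    """Run Unblock-File on path; raises CalledProcessError if PowerShell fails."""
    subprocess.check_call(build_unblock_command(path))


command = build_unblock_command(r"c:\path\to\blocked file.ps1")
```

Passing a list rather than one long string leaves the outer quoting of the arguments to subprocess.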
From this page about the Unblock-File command:
Internally, the Unblock-File cmdlet removes the Zone.Identifier alternate data stream, which has a value of 3 to indicate that it was downloaded from the internet.
To remove an alternate data stream ads_name from a file path\to\file.ext, simply delete path\to\file.ext:ads_name:
import os
try:
    os.remove(your_file_path + ':Zone.Identifier')
except FileNotFoundError:
    # The ADS did not exist, it was already unblocked or
    # was never blocked in the first place
    pass
# No need to open up a PowerShell subprocess!
(And similarly, to check if a file is blocked you can use os.path.isfile(your_file_path + ':Zone.Identifier'))
In a PowerShell script, you can use Unblock-File for this, or simply Remove-Item -Path $your_file_path':Zone.Identifier'.
Remove-Item also has a specific flag for alternate data streams: Remove-Item -Stream Zone.Identifier (which you can pipe in multiple files to, or a single -Path)
Late to the party . . . .
I have found that the Block status is simply an extra 'file' (stream) attached in NTFS and it can actually be accessed and somewhat manipulated by ordinary means. These are called Alternate Data Streams.
The ADS for file blocking (internet zone designation) is called ':Zone.Identifier' and contains, I think, some useful information:
All the other info I have found says to just delete this extra stream.... But, personally, I want to keep this info.... So I tried changing the ZoneId to 0, but it still shows as Blocked in Windows File Properties.
I settled on moving it to another stream name so I can still find it later.
The below script originated from a more generic script called pyADS. I only care about deleting / changing the Zone.Identifier attached stream -- which can all be done with simple Python commands. So this is a stripped-down version. It has several really nice background references listed. I am currently running the latest Windows 10 and Python 3.8+; I make no guarantees this works on older versions.
import os
"""
Accessing alternative data-streams of files on an NTFS volume
Original ADS class (pyADS)
SysInternal streams applet
Windows: killing the Zone.Identifier NTFS alternate data stream
About URL Security Zones
GREAT info: How Windows Determines That the File....
Dixin's Blog: Understanding File Blocking and Unblocking
"""
class ADS2():
    def __init__(self, filename):
        self.filename = filename
    def full_filename(self, stream):
        return "%s:%s" % (self.filename, stream)
    def add_stream_from_file(self, filename):
        if os.path.exists(filename):
            with open(filename, "rb") as f:
                content = f.read()
            return self.add_stream_from_string(filename, content)
        print("Could not find file: {0}".format(filename))
        return False
    def add_stream_from_string(self, stream_name, bytes):
        fullname = self.full_filename(os.path.basename(stream_name))
        if os.path.exists(fullname):
            print("Stream name already exists")
            return False
        with open(fullname, "wb") as fd:
            fd.write(bytes)
        return True
    def delete_stream(self, stream):
        try:
            os.remove(self.full_filename(stream))
            return True
        except OSError:
            return False
    def get_stream_content(self, stream):
        with open(self.full_filename(stream), "rb") as fd:
            content = fd.read()
        return content
def UnBlockFile(file, retainInfo=True):
    ads = ADS2(file)
    if zi := ads.get_stream_content("Zone.Identifier"):
        # Keep a copy of the zone info under another stream name, then drop the original
        if retainInfo: ads.add_stream_from_string("Download.Info", zi)
        ads.delete_stream("Zone.Identifier")
### Usage:
from unblock_files import UnBlockFile
D:\downloads>dir /r
Volume in drive D is foo
Directory of D:\downloads
11/09/2021 10:05 AM 8 some-pic.jpg
126 some-pic.jpg:Zone.Identifier:$DATA
1 File(s) 8 bytes
D:\downloads>more <some-pic.jpg:Zone.Identifier:$DATA
D:\downloads>dir /r
Volume in drive D is foo
Directory of D:\downloads
11/09/2021 10:08 AM 8 some-pic.jpg
126 some-pic.jpg:Download.Info:$DATA
1 File(s) 8 bytes

python - print output

I have created the script below and it works fine, but the output is not friendly (see below). I want the first line to display only the hostname and IP, without the ( , ' [ ] characters. Please suggest.
('testhostname', [], [''])
cannot resolve hostname:
import socket
pfile = open ('C:\\Python27\\scripts\\test.txt')
while True:
    IP = pfile.readline()
    if not IP:
        break
    try:
        host = socket.gethostbyaddr(IP.rstrip())
        print host
    except socket.herror, err:
        print "cannot resolve hostname: ", IP
Rather than printing all of the host tuple that is returned by gethostbyaddr, I suggest unpacking into separate variables that you can then print as you see fit:
hostname, alias_list, ip_addr_list = socket.gethostbyaddr(IP.rstrip())
print hostname, ip_addr_list # or ip_addr_list[0] if you only want the one IP
If you want more control over the formatting, I suggest using the str.format method:
print "hostname: {}, IP(s): {}".format(hostname, ", ".join(ip_addr_list))
Also, a few other code suggestions (not directly related to your main question):
Use a with statement rather than manually opening and closing your file.
Iterate on the file object directly (with for IP in pfile:), rather than using while True: and calling pfile.readline() each time through.
Use the syntax except socket.herror as err rather than the older form with commas (which is deprecated in Python 2 and no longer exists in Python 3).
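Putting these suggestions together, a sketch in Python 3 syntax (the question uses Python 2) might look like the following. describe_host and describe_hosts are hypothetical helper names, and the resolver parameter is an assumption added so the lookup can be swapped out when no network is available.

```python
import socket


def describe_host(ip, resolver=socket.gethostbyaddr):
    """Return one formatted line for ip, or an error line if lookup fails."""
    try:
        # Unpack the tuple instead of printing it whole.
        hostname, alias_list, ip_addr_list = resolver(ip)
    except socket.herror:
        return "cannot resolve hostname: {}".format(ip)
    return "hostname: {}, IP(s): {}".format(hostname, ", ".join(ip_addr_list))


def describe_hosts(lines, resolver=socket.gethostbyaddr):
    """Format every non-empty line of an iterable such as an open file."""
    return [describe_host(line.strip(), resolver) for line in lines if line.strip()]


# Typical use, iterating over the file object directly inside a with statement:
#   with open('C:\\Python27\\scripts\\test.txt') as pfile:
#       for text in describe_hosts(pfile):
#           print(text)
```

Returning strings rather than printing inside the loop keeps the formatting in one place.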

