In Perl, the interpreter stops reading source when it encounters a line containing
__END__
This is often used to embed arbitrary data at the end of a Perl script. That way the script can fetch and store data that lives 'inside itself', which opens up some nice possibilities.
In my case I have a pickled object that I want to store somewhere. While a separate file.pickle works just fine, I was looking for a more compact, self-contained approach (to distribute the script more easily).
Is there a mechanism that allows for embedding arbitrary data inside a python script somehow?
With pickle you can also work directly on strings.
s = pickle.dumps(obj)
pickle.loads(s)
If you combine that with """ (triple-quoted strings) you can easily store any pickled data in your file.
If the data is not particularly large (many KB), I would just base64-encode it (.encode('base64') in Python 2, or base64.b64encode in general) and include the result in a triple-quoted string, then decode it to get the binary data back and wrap it in a pickle.loads() call.
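A minimal sketch of that round trip (the object and names here are made up; in a real script you would paste the encoded text into the source as a triple-quoted constant rather than producing it at runtime):

import base64
import pickle

obj = {"name": "example", "values": [1, 2, 3]}   # whatever you want to embed

# Encoding step: normally done once; the resulting text is what you paste
# into your script between triple quotes.
payload = base64.b64encode(pickle.dumps(obj)).decode("ascii")

# Decoding step: what the distributed script does with the embedded text.
restored = pickle.loads(base64.b64decode(payload))
assert restored == obj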
In Python, you can use """ (triple-quoted strings) to embed long runs of text data in your program.
In your case, however, don't waste time on this.
If you have an object you've pickled, you'd be much, much happier dumping that object as Python source and simply including the source.
The repr function, applied to most built-in objects, will emit a Python source-code version of the object. If you implement __repr__ for all of your custom classes (so that it returns a valid constructor expression), you can trivially dump your structure as Python source.
If, on the other hand, your pickled structure started out as Python code, just leave it as Python code.
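A rough sketch of that idea, assuming the structure consists of built-in types (the data and file name are made up); custom classes would need a __repr__ that emits a valid constructor expression:

data = {"names": ["bob", "sue", "jayne"], "threshold": 7}

# Dump the structure as importable Python source.
with open("saved_data.py", "w") as f:
    f.write("data = %r\n" % (data,))

# Later, or in another script, the dump is just ordinary Python:
#     from saved_data import data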
I made this code. You run something like python comp.py foofile.tar.gz, and it creates decomp.py with foofile.tar.gz's contents embedded in it. I don't think this is really portable to Windows, though, because of the Popen.
import base64
import sys
import subprocess
# Read the file to embed and base64-encode its contents
inf = open(sys.argv[1], "rb").read()
outs = base64.b64encode(inf)
# Template for the generated script; the embedded file name is substituted in
decomppy = '''#!/usr/bin/python
import base64

def decomp(data):
    fname = "%s"
    outf = open(fname, "w+b")
    outf.write(base64.b64decode(data))
    outf.close()

# You can put the rest of your code here.
# Like this, to unzip an archive:
#import subprocess
#subprocess.Popen("tar xzf " + fname, shell=True)
#subprocess.Popen("rm " + fname, shell=True)
''' % (sys.argv[1])
# Append the encoded data and the call that unpacks it
taildata = '''uudata = """%s"""
decomp(uudata)
''' % (outs)

outpy = open("decomp.py", "w+b")
outpy.write(decomppy)
outpy.write(taildata)
outpy.close()
subprocess.Popen("chmod +x decomp.py", shell=True)
Suppose we have two Python programs: calculate.py and show_results.py.
When calculate.py runs in the terminal, it produces a variable (let's say a list called result) in memory. Then, when we run show_results.py in the terminal, it should print the result from the program that ran before.
Suppose the result of calculate.py is the list A = [83, 22]. On the terminal it would look like this:
$:~ python3 calculate.py
-------Calculation Done--------
$:~ python3 show_results.py
83, 22
Any suggestions? Any response would be appreciated.
As #blue_note suggests, this is not possible at the kernel level.
1) You can store the first script's result in the filesystem or a database and retrieve it later.
2) You can write a single Python script that handles all of this functionality, so everything runs together.
I think you can store that data in a .json file. For that, you can use the json library:
import json

with open('data.json', 'w') as outfile:
    json.dump(data, outfile)
And read it back in the show_results.py file:
import json

with open('data.json') as f:
    data = json.load(f)
Here is the documentation for Python's json library:
https://docs.python.org/3/library/json.html
You can't do that, and there's no way around it: it's not in Python's hands, the operating system decides. When you run python my_script.py, you create a process. The process has its own memory space for as long as it runs. When the program terminates, that memory is cleared. When you run the second script, as far as the OS is concerned, the execution of the first script never happened.
You could get around it by keeping the first process running and using some interprocess communication method, but that's difficult and has no real benefit here. If you just want one-off results, create a script that gets the results from the first script and passes them to the second, as sketched below. Or store them to a file or database if you care about the result long-term.
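A sketch of that one-off approach, assuming calculate.py and show_results.py are refactored to expose functions; the function names here are made up for illustration:

# run_both.py -- everything happens in one process, so the result
# simply stays in memory between the two steps.
import calculate      # assumed to define a function that returns the list
import show_results   # assumed to define a function that prints it

result = calculate.run()        # hypothetical name, e.g. returns [83, 22]
show_results.display(result)    # hypothetical name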
I want to merge two PDF documents with Python (prepend a pre-made cover sheet to an existing document) and present the result to a browser. I'm currently using the PyPDF2 library which can perform the merge easily enough, but the PdfFileWriter class write() method only seems to support writing to a file object (must support write() and tell() methods). In this case, there is no reason to touch the filesystem; the merged PDF is already in memory and I just want to send a Content-type header and then the document to STDOUT (the browser via CGI). Is there a Python library better suited to writing a document to STDOUT than PyPDF2? Alternately, is there a way to pass STDIO as an argument to PdfFileWriter's write() method in such a way that it appears to write() as though it were a file handle?
Letting write() write the document to the filesystem and then opening the resulting file and sending it to the browser works, but is not an option in this case (aside from being terribly inelegant).
Solution
Using mgilson's advice, this is how I got it to work in Python 2.7:
#!/usr/bin/python
import cStringIO
import sys
from PyPDF2 import PdfFileMerger
merger = PdfFileMerger()
###
# Actual PDF open/merge code goes here
###
output = cStringIO.StringIO()
merger.write(output)
print("Content-type: application/pdf\n")
sys.stdout.write(output.getvalue())
output.close()
Python supports an "in-memory" file type via cStringIO.StringIO (or io.BytesIO, ..., depending on the Python version). In your case, you can create an instance of one of those classes, pass it to the method that expects a file, and then use the .getvalue() method to get the contents back as a string (or bytes, depending on the Python version). Once you have the contents as a string, you can simply print them or use sys.stdout.write to write the string to standard output.
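For reference, a rough (untested) Python 3 sketch of the same idea: the in-memory file becomes io.BytesIO, and the PDF bytes go to the binary stdout buffer.

#!/usr/bin/python3
import io
import sys

from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
###
# Actual PDF open/merge code goes here
###
output = io.BytesIO()
merger.write(output)

# Headers are text, the PDF itself is bytes, so write them separately.
sys.stdout.write("Content-type: application/pdf\r\n\r\n")
sys.stdout.flush()
sys.stdout.buffer.write(output.getvalue())
output.close()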
We're using a Python-based application which reads a configuration file containing a couple of arrays.
Example layout of config file:
array1 = [
    'bob',
    'sue',
    'jayne'
]
Currently changes to the configuration are done by hand, but I've written a little interface to streamline the process (mainly to avoid errors).
It currently reads in the existing configuration using a simple import. What I'm not sure how to do, however, is get my script to write its output as valid Python, so that the main application can read it again.
How can I dump the array back into the file, but as valid Python?
Cheers!
I'd suggest JSON or YAML (less verbose than JSON) for configuration files. That way, the configuration file becomes more readable for the less Pythonate ;) It's also easier to raise adequate errors, e.g. if the configuration is incomplete.
To save Python objects you can always use pickle.
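A minimal sketch of the JSON route (the config file name is made up):

import json

# Writing the configuration, e.g. from the editing interface:
config = {"array1": ["bob", "sue", "jayne"]}
with open("config.json", "w") as f:
    json.dump(config, f, indent=4)

# Reading it back in the main application:
with open("config.json") as f:
    config = json.load(f)
array1 = config["array1"]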
Generally, using repr() will create a string that can be re-evaluated, but pprint produces slightly nicer output:
from pprint import pprint

# Write the assignment prefix, then pretty-print the list into the open file
outf.write("array1 = ")
pprint(array1, outf)
Writing repr(array1) into the file would be a very simple solution, but it should work here.
I need to generate a tar file, but as a string in memory rather than as an actual file. What I have as input is a single filename and a string containing the associated contents. I'm looking for a Python lib I can use to avoid having to roll my own.
A little more work turned up these functions, but using a memory stream object seems a little... inelegant. And making it accept input from strings looks even more... inelegant. OTOH it works, I assume, as most of it is new to me. Anyone see any bugs in it?
Use tarfile in conjunction with cStringIO:
import tarfile
import cStringIO

c = cStringIO.StringIO()
t = tarfile.open(mode='w', fileobj=c)
# here: do your work on t, then...:
s = c.getvalue()  # extract the bytestring you need
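For the specific case in the question (one filename plus its contents as a string), the "work on t" step could look roughly like this; the name and contents below are placeholders:

import tarfile
import cStringIO

name = "hello.txt"                 # placeholder filename
contents = "some file contents"    # placeholder contents

c = cStringIO.StringIO()
t = tarfile.open(mode='w', fileobj=c)

# Describe the in-memory "file" and hand tarfile its contents as a stream.
info = tarfile.TarInfo(name=name)
info.size = len(contents)
t.addfile(info, cStringIO.StringIO(contents))
t.close()

tar_bytes = c.getvalue()           # the whole archive as a bytestring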
I need to write some methods for loading/saving some classes to and from a binary file. However I also want to be able to accept the binary data from other places, such as a binary string.
In C++ I could do this by simply making my class methods use std::istream and std::ostream, which could be a file, a stringstream, the console, whatever.
Does python have a similar input/output class which can be made to represent almost any form of i/o, or at least files and memory?
The Python way to do this is to accept an object that implements read() or write(). If you have a string, you can make this happen with StringIO:
from cStringIO import StringIO
s = "My very long string I want to read like a file"
file_like_string = StringIO(s)
data = file_like_string.read(10)
Remember that Python uses duck-typing: you don't have to involve a common base class. So long as your object implements read(), it can be read like a file.
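As a rough sketch of that pattern for the load/save case in the question (the class and its binary layout are made up): the methods only care about read()/write(), so they work with real files and in-memory buffers alike.

import struct
from cStringIO import StringIO

class Point(object):                      # made-up example class
    def __init__(self, x, y):
        self.x, self.y = x, y

    def save(self, stream):               # any object with write()
        stream.write(struct.pack("<ii", self.x, self.y))

    @classmethod
    def load(cls, stream):                # any object with read()
        x, y = struct.unpack("<ii", stream.read(8))
        return cls(x, y)

# With a real file:
#     with open("point.bin", "wb") as f:
#         Point(3, 4).save(f)
# With an in-memory buffer:
buf = StringIO()
Point(3, 4).save(buf)
buf.seek(0)
p = Point.load(buf)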
The pickle and cPickle modules may also be helpful to you.