How do I hide a file inside an image with Python?

How do I hide a file inside an image with Python? - python

I know it's possible in Batch using the 'copy' command with the '/B' switch, i.e.:
copy /B imagefile+hiddenfile newfile
My question is this; Is it possible to do this in Python, and if so, how?
This question is very similar, and it's answer is acceptable, but I am still curious;
Is there a way to do this without the stepic module?

You don't need stepic for that.
>>> out = file("out.jpg", "wb")
>>> out.write(file("someimage.jpg", "rb").read())
>>> out.write(file("somehiddenfile.pdf", "rb").read())
>>> out.close()
stepic is something completely different it is for putting "really" hidden data into an image, whereas your copy approach (and also my snippet above) just appends the file after the image's data. It is quite easy to extract the "somehiddenfile.pdf" from the generated file, whereas extracting steganographic information out of a real image is a lot more difficult.

stepic is a python package written to perform this operation - why not simply look at the source?

Related

tostring vs. tofile for creating raw binary files

I need to save a numpy array to a raw binary file, and based on advice from colleagues, I understand that tostring and tofile should be doing roughly the same thing. However, when I run
x=np.load('foo.npy')
(open('foo_1.dat', 'w')).write(x.T.tostring())
x.T.tofile('foo_2.dat')
np.all(np.fromfile('foo_1.dat') == np.fromfile('foo_2.dat'))
False is returned. Can anyone explain to me why this is the case, or if I am doing something wrong, where I can correct the code to make my end products equivalent?
EDIT:
Using this method, foo_1.dat and foo_2.dat have the same data type (float64), HOWEVER, the shape is different (tostring = 38497230L, tofile =38407680L).
I need to transpose the files for a program-specific application, and am not using np.save because I specifically need .dat files.

I don't know if this is the issue, but the file should be opened in binary >>mode: open('foo_1.dat', 'wb')
That was it! Thank you!

Saving and loading simple data in Python convenient way

I'm currently working on a simple Python 3.4.3 and Tkinter game.
I struggle with saving/reading data now, because I'm a beginner at coding.
What I do now is use .txt files to store my data, but I find this extremely counter-intuitive, as saving/reading more than one line of data requires of me to have additional code to catch any newlines.
Skipping a line would be terrible too.
I've googled it, but I either find .txt save/file options or way too complex ones for saving large-scale data.
I only need to save some strings right now and be able to access them (if possible) by key like in a dictionary key:value .
Do you know of any file format/method to help me accomplish that?
Also: If possible, should work on Win/iOS/Linux.

It sounds like using json would be best for this, which comes as part of the Python Standard library in Python-2.6+
import json
data = {'username':'John', 'health':98, 'weapon':'warhammer'}
# serialize the data to user-data.txt
with open('user-data.txt', 'w') as fobj:
json.dump(data, fobj)
# read the data back in
with open('user-data.txt', 'r') as fobj:
data = json.load(fobj)
print(data)
# outputs:
# {u'username': u'John', u'weapon': u'warhammer', u'health': 98}
A popular alternative is yaml, which is actually a superset of json and produces slightly more human readable results.

You might want to try Redis.
http://redis.io/
I'm not totally sure it'll meet all your needs, but it would probably be better than a flat file.

How to read file to import matrix?

I need to use some matrices in Python programs, like
Q = np.matrix([[1,0,1,1,0],
[0,2,0,1,1],
[1,0,2,0,1],
[1,1,0,1,0],
[0,1,1,0,1]])
and I want to import the matrix (use numpy) from a file, so what should I do to realize it? what code should I write and what file should I use (.txt?). I am quite new to python, anyone can help me? Thank you in advance.

I'm assuming that you're not only importing the matrices, but also exporting them to files in the first place.
If that's true, there are multiple easy options, with different tradeoffs.
np.save saves the array in a binary format that's only usable by NumPy. But it's very fast, and generates reasonably small files.
np.save('matrix.npy', Q)
Q = np.load('matrix.npy')
np.savetxt saves the array in a text file, using a dialect of CSV (with whitespace separators, by default). It's slower, and generates bigger files, but if you want to be able to read or edit the files (or send them through an ASCII-only channel, like email without attachments), it's the best option.
np.savetxt('matrix.txt', Q)
Q = np.loadtxt('matrix.txt')
np.savetxt can also save the array in a compressed text file. This gives you small files, but they're slower to save and load. They're not directly human-readable, but it's very easy to un-gzip a file, and then you've got a text file you can read and edit. So, sometimes this is worth doing.
np.savetxt('matrix.txt.gz', Q)
Q = np.loadtxt('matrix.txt.gz')
Finally, you can just use standard Python saving and loading mechanisms, like pickle:
with open('matrix.pickle', 'wb') as f:
pickle.dump(Q, f)
with open('matrix.pickle', 'rb') as f:
Q = pickle.load(f)
This is really only useful if you need to store NumPy arrays together with non-NumPy objects.
If you have to save multiple matrices, instead of saving one per file, you might want to look at savez and savez_compressed. Or, if you need multiple objects, only some of which are NumPy, pickle may be the best option.

python: edit ISO file directly

Is it possible to take an ISO file and edit a file in it directly, i.e. not by unpacking it, changing the file, and repacking it?
It is possible to do 1. from Python? How would I do it?

You can use for listing and extracting, I tested the first.
https://github.com/barneygale/iso9660/blob/master/iso9660.py
import iso9660
cd = iso9660.ISO9660("/Users/murat/Downloads/VisualStudio6Enterprise.ISO")
for path in cd.tree():
print path
https://github.com/barneygale/isoparser
import isoparser
iso = isoparser.parse("http://www.microsoft.com/linux.iso")
print iso.record("boot", "grub").children
print iso.record("boot", "grub", "grub.cfg").content

Have you seen Hachoir, a Python library to "view and edit a binary stream field by field"? I haven't had a need to try it myself, but ISO 9660 is listed as a supported parser format.

PyCdlib can read and edit ISO-9660 files, as well as Joliet, Rock Ridge, UDF, and El Torito extensions. It has a bunch of detailed examples in its documentation, including one showing how to edit a file in-place. At the time of writing, it cannot add or remove files, or edit directories. However, it is still actively maintained, in contrast to the libraries linked in older answers.

Of course, as with any file.
It can be done with open/read/write/seek/tell/close operations on a file. Pack/unpack the data with struct/ctypes. It would require serious knowledge of the contents of ISO, but I presume you already know what to do. If you're lucky you can try using mmap - the interface to file contents string-like.

abstracting the conversion between id3 tags, m4a tags, flac tags

I'm looking for a resource in python or bash that will make it easy to take, for example, mp3 file X and m4a file Y and say "copy X's tags to Y".
Python's "mutagen" module is great for manupulating tags in general, but there's no abstract concept of "artist field" that spans different types of tag; I want a library that handles all the fiddly bits and knows fieldname equivalences. For things not all tag systems can express, I'm okay with information being lost or best-guessed.
(Use case: I encode lossless files to mp3, then go use the mp3s for listening. Every month or so, I want to be able to update the 'master' lossless files with whatever tag changes I've made to the mp3s. I'm tired of stubbing my toes on implementation differences among formats.)

I needed this exact thing, and I, too, realized quickly that mutagen is not a distant enough abstraction to do this kind of thing. Fortunately, the authors of mutagen needed it for their media player QuodLibet.
I had to dig through the QuodLibet source to find out how to use it, but once I understood it, I wrote a utility called sequitur which is intended to be a command line equivalent to ExFalso (QuodLibet's tagging component). It uses this abstraction mechanism and provides some added abstraction and functionality.
If you want to check out the source, here's a link to the latest tarball. The package is actually a set of three command line scripts and a module for interfacing with QL. If you want to install the whole thing, you can use:
easy_install QLCLI
One thing to keep in mind about exfalso/quodlibet (and consequently sequitur) is that they actually implement audio metadata properly, which means that all tags support multiple values (unless the file type prohibits it, which there aren't many that do). So, doing something like:
print qllib.AudioFile('foo.mp3')['artist']
Will not output a single string, but will output a list of strings like:
[u'The First Artist', u'The Second Artist']
The way you might use it to copy tags would be something like:
import os.path
import qllib # this is the module that comes with QLCLI
def update_tags(mp3_fn, flac_fn):
mp3 = qllib.AudioFile(mp3_fn)
flac = qllib.AudioFile(flac_fn)
# you can iterate over the tag names
# they will be the same for all file types
for tag_name in mp3:
flac[tag_name] = mp3[tag_name]
flac.write()
mp3_filenames = ['foo.mp3', 'bar.mp3', 'baz.mp3']
for mp3_fn in mp3_filenames:
flac_fn = os.path.splitext(mp3_fn)[0] + '.flac'
if os.path.getmtime(mp3_fn) != os.path.getmtime(flac_fn):
update_tags(mp3_fn, flac_fn)

I have a bash script that does exactly that, atwat-tagger. It supports flac, mp3, ogg and mp4 files.
usage: `atwat-tagger.sh inputfile.mp3 outputfile.ogg`
I know your project is already finished, but somebody who finds this page through a search engine might find it useful.

Here's some example code, a script that I wrote to copy tags between
files using Quod Libet's music format classes (not mutagen's!). To run
it, just do copytags.py src1 dest1 src2 dest2 src3 dest3, and it
will copy the tags in sec1 to dest1 (after deleting any existing tags
on dest1!), and so on. Note the blacklist, which you should tweak to
your own preference. The blacklist will not only prevent certain tags
from being copied, it will also prevent them from being clobbered in
the destination file.
To be clear, Quod Libet's format-agnostic tagging is not a feature of mutagen; it is implemented on top of mutagen. So if you want format-agnostic tagging, you need to use quodlibet.formats.MusicFile to open your files instead of mutagen.File.
Code can now be found here: https://github.com/DarwinAwardWinner/copytags
If you also want to do transcoding at the same time, use this: https://github.com/DarwinAwardWinner/transfercoder
One critical detail for me was that Quod Libet's music format classes
expect QL's configuration to be loaded, hence the config.init line in my
script. Without that, I get all sorts of errors when loading or saving
files.
I have tested this script for copying between flac, ogg, and mp3, with "standard" tags, as well as arbitrary tags. It has worked perfectly so far.
As for the reason that I didn't use QLLib, it didn't work for me. I suspect it was getting the same config-related errors as I was, but was silently ignoring them and simply failing to write tags.

You can just write a simple app with a mapping of each tag name in each format to an "abstract tag" type, and then its easy to convert from one to the other. You don't even have to know all available types - just those that you are interested in.
Seems to me like a weekend-project type of time investment, possibly less. Have fun, and I won't mind taking a peek at your implementation and even using it - if you won't mind releasing it of course :-) .

There's also tagpy, which seems to work well.

Since the other solutions have mostly fallen off the net, here is what I came up, based on the python mediafile library (python3-mediafile in Debian GNU/Linux).
#!/usr/bin/python3
import sys
from mediafile import MediaFile
src = MediaFile (sys.argv [1])
dst = MediaFile (sys.argv [2])
for field in src.fields ():
try:
setattr (dst, field, getattr (src, field))
except:
pass
dst.save ()
Usage: mediafile-mergetags srcfile dstfile
It copies (merges) all tags from srcfile into dstfile, and seems to work properly with flac, opus, mp3 and so on, including copying album art.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.