I am working on a project where I need to interface with another program. This other program offers no interface, so I need to pull values out of its memory. I have already found the addresses where these values are stored, relative to the MZ start address listed in the program's PE header. I simply need to look at that address and get its value. Is there a way to do this in Python, or will I need a different language?
I used Cheat Engine to find the memory address relative to the MZ start address listed in the program's PE header. However, I have no way to interface with Cheat Engine in order to do something with the value it is looking at. Then I had the idea to look at that address myself, while the program is running, with a Python script. However, I am unsure of where to begin.
Here's what I know:
First line of memory starts at address: 0x00CC0000
It always starts here.
Hexadecimal address: 0x00CC0000 (StartOfMem) + 0x841984 (Offset) = 0x01501984
This is where the pointer is stored in memory. I have verified that it is always in this location.
This pointer points to the memory address of a UI class object in the program I am trying to interface with; this object contains the data I want to read.
If I dereference the pointer it will give me another memory address. Let's call this value AddressAtPointer.
I know the two values I am looking for are at offsets 0x43C and 0x434 from AddressAtPointer and are 4-byte integers.
Is there a way for me to read the data at these specific memory addresses?
Yes, this is possible. But I will warn you that reading and writing specific memory addresses is the wrong tool to solve this problem. The right tool is probably ctypes or SWIG. In particular, that would save you from needing to figure out what the right offsets are.
I figure you're going to ignore that advice, so here's how to write to arbitrary memory addresses.
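import ctypes
foo = ctypes.c_ubyte.from_address(0x00000000)  # placeholder address; substitute a valid one
foo.value = 1
This writes a byte of 0x01 to address zero. Address zero is only a placeholder: on any mainstream OS it is not mapped, so running this unchanged crashes the interpreter. You can change the address by changing 0x00000000. You can change the value written by changing the 1. You can change the size of the write by changing c_ubyte to something else, such as c_int32. (c_char would also be one byte wide, but its value must be a length-1 bytes object rather than an integer.)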
Reading a memory address is the same, except instead of foo.value = 1, you have variable = foo.value.
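Since address zero cannot actually be used, here is a minimal sketch of the same read and write against memory that the script itself owns. The variable names are made up for illustration; the only ctypes calls used are addressof and from_address.

```python
import ctypes

# Allocate a 4-byte integer owned by this process, so the address is valid.
target = ctypes.c_int32(9)
address = ctypes.addressof(target)

# Reinterpret the raw address as a 32-bit integer and read it back.
view = ctypes.c_int32.from_address(address)
value_before = view.value

# Writing through the view changes the original object: both share memory.
view.value = 1234
value_after = target.value
```

Keep a reference to target for as long as view is in use; from_address does not keep the underlying memory alive.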
All of the above assumes you're in the same address space as your target.
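For the pointer chain described in the question, the arithmetic could look roughly like the sketch below. It assumes a 32-bit little-endian target (4-byte pointers) and signed 4-byte fields, neither of which the question states. read_bytes is a hypothetical callable (address, size) -> bytes that you would have to supply; on Windows it would typically wrap the Win32 OpenProcess and ReadProcessMemory functions through ctypes, because the target lives in a different address space.

```python
import struct

BASE = 0x00CC0000               # start of the module, from the question
POINTER_OFFSET = 0x841984       # where the pointer is stored, from the question
FIELD_OFFSETS = (0x43C, 0x434)  # the two 4-byte integers, from the question


def read_value(read_bytes, address, fmt):
    """Read one 4-byte value at `address` using the supplied reader."""
    return struct.unpack(fmt, read_bytes(address, 4))[0]


def read_fields(read_bytes, base=BASE):
    """Follow the stored pointer, then read both fields relative to it."""
    pointer_location = base + POINTER_OFFSET
    # Assumption: pointers are 4 bytes, little-endian, unsigned.
    address_at_pointer = read_value(read_bytes, pointer_location, "<I")
    # Assumption: the fields are signed 32-bit integers.
    return [
        read_value(read_bytes, address_at_pointer + offset, "<i")
        for offset in FIELD_OFFSETS
    ]
```

Passing the reader in as a parameter keeps the offset logic testable without attaching to the real process.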
No, not through Python directly. Python is a memory-safe language and therefore doesn't allow for interaction directly with memory.
Your best bet might be using CPython to call a C function which does the memory-trickery that you want.
This is also an extremely fragile way of getting data: memory addresses may not be the same between different machines, different operating systems, or even different executions of the same program (ASLR is a feature that randomizes memory addresses every time a program starts up, and it may be enabled).
Related
I have a set of files (compiled software) that I want to give a unique fingerprint before distribution. The idea is to write a script that:
Randomly generates a character sequence
Appends the character sequence to a file in the project
Stores the fingerprint in a database with the addressee
Distributes the software to the addressee
The requirements for the fingerprint process are that:
The fingerprint is difficult to detect (i.e. not stored in the file metadata or easily accessible areas)
The fingerprint does not corrupt the data of the file the sequence is added to
The fingerprint can be added to an executable or dll file
It's easy to read the fingerprint if you know where to look
Are there any open source solutions that are built for the purpose of fingerprinting files?
Storing information in the file without corrupting it, and in a way that is not easily detectable, is an exercise in steganography, and quite a hard one. Such a tool would need to parse the executable's structure and modify it properly: edit offsets if needed, detect padding areas, and basically do some of the work that the compiler does. I doubt that it exists or is reliable.
However, there are quite a few steganography tools that can store information in pictures by subtly changing the colors of the pixels. Perhaps you can store your information in the icon of the exe file or in any included asset.
Another way is to hide the data at compilation time by varying the optimization level of the non-performance-critical parts of the executable, so that the compiler generates slightly different code while the behavior is guaranteed to stay consistent. You can then use file hashes as your fingerprint.
Yet another way is to create an unused string inside some random function, mark it as volatile (or the analog in your language of choice) to prevent the compiler from optimizing it out of your program, and put something recognizable in it, like REPLACE_ME. You can then open the file, search for this string, and replace it with the identifier that you have generated. If the identifier and the string are the same length, you can’t damage your software.
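The search-and-replace step could be sketched as follows. The function names are made up for illustration, and the check that the placeholder occurs exactly once is an extra safety assumption rather than part of the idea above.

```python
import secrets


def new_fingerprint(length=10):
    """Return `length` random hex characters as bytes (length must be even)."""
    if length % 2:
        raise ValueError("length must be even")
    # token_hex(n) returns 2 * n hexadecimal characters.
    return secrets.token_hex(length // 2).encode("ascii")


def stamp_fingerprint(data, fingerprint, placeholder=b"REPLACE_ME"):
    """Swap the placeholder for a same-length fingerprint in a binary blob."""
    if len(fingerprint) != len(placeholder):
        raise ValueError("fingerprint and placeholder must be the same length")
    if data.count(placeholder) != 1:
        raise ValueError("placeholder must occur exactly once")
    return data.replace(placeholder, fingerprint)
```

In practice you would read the executable with open(path, "rb"), pass its contents through stamp_fingerprint, and write the result to the copy you distribute. If the binary is code-signed or checksummed, it would need to be signed again after stamping.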
Another, more subtle way is to create multiple different rephrasings of the same messages in your app and swap them in and out as a way to differentiate versions. If your programming language stores null-terminated strings, this is very easy: just make each string in the code as long as its longest rephrasing. If your language stores the length of the string, you have to recalculate that length too.
Alternatively, if you are working with Unicode strings in your code, you can use similar-looking glyphs in some strings as a lower-effort version of the previous idea. Basically, you are performing a homograph attack on your own strings. You can also use Unicode control characters (ZWJ, ZWNJ, etc.) that do not affect most languages and are invisible.
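As a rough illustration of the control-character variant, the sketch below hides a numeric identifier in a visible string, using ZWNJ for a 0 bit and ZWJ for a 1 bit. The 16-bit width and the placement after the first character are arbitrary choices for the example.

```python
ZERO = "\u200c"  # ZERO WIDTH NON-JOINER encodes a 0 bit
ONE = "\u200d"   # ZERO WIDTH JOINER encodes a 1 bit


def embed_id(text, ident, bits=16):
    """Hide `ident` as invisible characters after the first visible character."""
    if not 0 <= ident < (1 << bits):
        raise ValueError("identifier does not fit in the given number of bits")
    payload = "".join(
        ONE if (ident >> i) & 1 else ZERO for i in reversed(range(bits))
    )
    return text[:1] + payload + text[1:]


def extract_id(text):
    """Recover the identifier by reading the invisible characters in order."""
    value = 0
    for ch in text:
        if ch in (ZERO, ONE):
            value = (value << 1) | (ch == ONE)
    return value
```

Note that the marked string is longer than the original when encoded, so this fits best where strings live in resource files rather than at fixed offsets in the binary.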
All of these schemes are easily discovered by diffing two different distributions of the software. The one with different optimization levels could plausibly be written off as just different builds of the software, but a persistent attacker could still figure it out.
Since you are talking about compiled software, an alternative solution could be to use an executable-encrypting tool. When the file is executed, it asks for a password; if the password is correct, the tool uses it to generate a key and then uses that key to decrypt the program directly in memory. That way recipients won't be able to analyze the binary, and even with the key it would be a lot more difficult to do so, much less modify it. You can put as many fingerprints as you like into the code as regular text strings, and they will most likely stay there.
Say I store
a=9
in a program. The value of a is stored in the computer's memory.
Without making any changes in memory, I want to re-access that memory location and print its contents.
How do I do that in Python?
I want to take the memory address from the first program and, in a new one, use that memory address to print its value.
I tried using ctypes to access the memory location, but I end up getting a segmentation fault.
If you want to share the value between two Python processes have a look at the multiprocessing package and its tools for sharing values.
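One of those tools is multiprocessing.shared_memory (Python 3.8 and later), which lets a second process attach to a block by name instead of by raw address. A minimal sketch, with made-up helper names and both ends shown in a single script for brevity:

```python
import struct
from multiprocessing import shared_memory


def publish_int(value):
    """Create a shared block holding one 32-bit integer; the caller keeps it open."""
    shm = shared_memory.SharedMemory(create=True, size=4)
    struct.pack_into("<i", shm.buf, 0, value)
    return shm


def read_int(name):
    """Attach to an existing block by name and read the integer back."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        (value,) = struct.unpack_from("<i", shm.buf, 0)
    finally:
        shm.close()
    return value
```

The first program would call publish_int(9) and pass shm.name to the second program (on the command line, for example), which then calls read_int(name). The creator should call close() and unlink() on the block once it is no longer needed.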
First, I would like to thank whoever helps me.
- Environment
I am using Python 2.7 on Windows 8.
I am using COM4 to talk to a robot by sending it commands from Python code.
I send the command getversion to the robot and am supposed to get a bunch of data in the following format (I have omitted some because it is too long):
Component,Major,Minor,Build,Aux
APPassword,956FC721
BaseID,1.2,1.0,18000,2000,
BatteryType,4,LIION_4CELL_SMART,
Beehive URL, beehive.cloud.com
BlowerType,1,BLOWER_ORIG,
Bootloader Version,27828,,
BrushMotorType,1,BRUSH_MOTOR_ORIG,
BrushSpeed,1400,,
BrushSpeedEco,800,,
ChassisRev,1,,
Cloud Selector, 2
DropSensorType,1,DROP_SENSOR_ORIG,
LCD Panel,137,240,124,
LDS CPU,F2802x/c001,,
LDS Serial,KSH13315AA-0000153,,
My question is:
Based on this format, how can I get the size (in bytes) of the above result?
The reason for this question is, I can get the full/complete result as long as I know how large it is.
To be specific, my code is:
ser.write('getver \n') # send 'getversion' cmd to robot
ser.read(1305)
The response size of getver is 1305 bytes. Yes, I counted it manually, which is why I would like Python to tell me how large it is automatically.
In order to be able to communicate with a device, you have to know what the protocol is for that communication. Whoever designed the protocol had to define a way for you to know how many bytes to read. If you have a specification, it probably covers that question.
So, there is either a way to determine the number of bytes beforehand or a way to detect the end of transmission, e.g. by the existence of a special end character.
Without some sort of specification, we can only guess what the protocol is.
The response message size is apparently not fixed. Maybe there is a way to ask the device "what would be the length of the answer to getversion"?
Some protocols prefix each message with its length. Here there is no such prefix. Perhaps you can put the device in a different mode where it does something like that by sending it some special command?
Your message does not look like it has an end marker, but perhaps the marker is just not visible; for example, might there be a null character ('\0') at the end? If there is one, you could read character by character until it appears.
Failing to find any other solution, you can try setting a reasonable timeout on reads (ser = serial.Serial(..., timeout=2, ...)). Then try to read everything. When there is nothing more to read, the read function will block indefinitely unless there is a timeout. If you set a reasonably long timeout and no data is received in that time, you can assume the transmission is over.
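The timeout approach could be sketched like this. read_until_idle is a made-up helper; it only assumes that port.read(n) returns an empty byte string when the timeout expires with nothing received, which is how a pyserial Serial object opened with a timeout behaves.

```python
def read_until_idle(port, chunk_size=256):
    """Collect data until a read times out without returning any bytes."""
    chunks = []
    while True:
        chunk = port.read(chunk_size)  # returns b"" once the timeout expires
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

With this, len(read_until_idle(ser)) gives the response size without counting by hand, at the cost of waiting one full timeout period after the last byte arrives.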
Let's imagine a situation: I have two Python programs. The first one will write some data (str) to computer memory, and then exit. I will then start the second program which will read the in-memory data saved by the first program.
Is this possible?
Sort of.
python p1.py | python p2.py
If p1 writes to stdout, the data goes to memory. If p2 reads from stdin, it reads from memory.
The issue is that there's no "I will then start the second program". You must start both programs so that they share the appropriate memory (in this case, the buffer between stdout and stdin.)
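To make the pipe concrete, here is a small sketch that starts both programs from Python rather than from the shell. The two one-line programs stand in for p1.py and p2.py and are invented for the example.

```python
import subprocess
import sys

# Stand-ins for p1.py and p2.py: one writes to stdout, the other reads stdin.
PRODUCER = "import sys; sys.stdout.write('hello from p1')"
CONSUMER = "import sys; print(sys.stdin.read().upper())"


def run_pipeline():
    """Run the equivalent of `python p1.py | python p2.py`; return p2's output."""
    p1 = subprocess.Popen([sys.executable, "-c", PRODUCER],
                          stdout=subprocess.PIPE)
    p2 = subprocess.Popen([sys.executable, "-c", CONSUMER],
                          stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()  # only p2 should hold the read end of the pipe
    output, _ = p2.communicate()
    p1.wait()
    return output.decode().strip()
```

Both processes exist at the same time here, which is exactly the limitation described above: the data lives in the pipe buffer only while the pipe is open.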
What are all these nonsense answers? Of course you can share memory the way you asked, there's no technical reason you shouldn't be able to persist memory other than lack of usermode API.
In Linux you can use shared memory segments which persist even after the program that made them is gone. You can view/edit them with ipcs(1). To create them, see shmget(2) and the related syscalls.
Alternatively you can use POSIX shared memory, which is probably more portable. See shm_overview(7)
I suppose you can do it on Windows like this.
Store your data into "memory" using things like databases, e.g. dbm, sqlite, shelve, pickle, etc., where your second program can pick it up later.
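With pickle, for instance, the hand-off might look like the sketch below. The function names and the file path are placeholders.

```python
import pickle


def save_state(path, value):
    """First program: serialize the value to a file before exiting."""
    with open(path, "wb") as fh:
        pickle.dump(value, fh)


def load_state(path):
    """Second program: read back whatever the first program saved."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Only unpickle files you wrote yourself, since loading a pickle from an untrusted source can execute arbitrary code.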
No.
Once the first program exits, its memory is completely gone.
You need to write to disk.
"The first one will write some data (str) to computer memory, and then exit."
The OS will then ensure all that memory is zeroed before any other program can see it. (This is an important security measure, as the first program may have been processing your bank statement or may have had your password).
You need to write to persistent storage - probably disk. (Or you could use a ramdisk, but that's unlikely to make any difference to real-world performance).
Alternatively, why do you have 2 programs? Why not one program that does both tasks?
Yes.
Define a RAM file-system.
http://www.vanemery.com/Linux/Ramdisk/ramdisk.html
http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/
You can also set up a persistent shared memory area and have one program write to it and the other read from it. However, setting up such things is somewhat dependent on the underlying OS.
Maybe the poster is talking about something like shared memory? Have a look at this: http://poshmodule.sourceforge.net/
I am currently developing a Python app that carves JPEG files out of a block device. Let's just say that it sometimes works and sometimes doesn't. I read the block device until I find an ffd8, then keep the stream open and loop, searching for the closing ffd9. However, I always need to take into account every ffd9 after the first as well, so it tends to be a really intensive operation. Given a device with, say, 25 JPEGs as well as lots of other data, the looping is pretty dramatic and it runs through a lot.
The program is not the slowest thing in the world, but I think it could be much faster and much more efficient. I am looking for a better way to search the block device and extract the data in a more efficient manner. I also don't want to kill the HDD or the drive holding the image of the block device.
So does anybody know of a better way to systematically handle the searching and extraction of the data?
The trouble with reading the block device directly is that there is no guarantee that the blocks of any given file are contiguous. That means that even if you find your magic marker bytes 0xFFD8 in block 13, say, there is no guarantee that block 14 belongs to the same file, whether or not it contains the 0xFFD9 end marker. (Most files will start on a block boundary; the end of the file may be anywhere, possibly even across block boundaries.)
What's the better way to deal with it? Well, it depends what you're after, but if you're looking only at currently allocated blocks, then scan the file system using the Python analog of the POSIX C function ftw (nftw), such as os.walk, and read each file in turn. This won't find evidence of deleted JPEG files in the free list. If that's what you are after, then you'll need to do as you are doing, more or less, but correlate that information with what you find in the file system proper. Mapping those blocks will (at best) be hard.
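For the allocated-files case, a walk over the mounted file system could look roughly like this. find_jpegs is a made-up name, and the check only looks at the two-byte 0xFFD8 start marker mentioned above, so it can report false positives.

```python
import os

JPEG_START = b"\xff\xd8"  # start-of-image marker discussed above


def find_jpegs(root):
    """Return paths under `root` whose first two bytes are the JPEG start marker."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if fh.read(2) == JPEG_START:
                        matches.append(path)
            except OSError:
                continue  # unreadable file: skip it and keep walking
    return sorted(matches)
```

Because only the first two bytes of each file are read, this puts far less load on the drive than scanning every block for markers.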