Emacs collaborative buffers open in the wrong mode - python

I am using Emacs and Rudel to collaborate with a remote programmer. Rudel has a concept of published buffers. When my partner publishes a buffer, I can subscribe to it and then we can both edit it simultaneously.
My problem is that when he publishes a Python file with a *.py extension and I subscribe to it, my buffer is not set to python-mode automatically (it is in fundamental mode). How can I get it so that the buffer opens with the correct language mode?

I don't know Rudel well enough to give a 100% solution, but what you want to do is something like this:
(add-hook 'rudel-document-attach-hook 'my-rudel-set-mode-appropriately)
(defun my-rudel-set-mode-appropriately (document buffer)
  "try to set the mode appropriately"
  (set-buffer buffer)
  (let ((buffer-file-name ...get-name-from-document...))
    (set-auto-mode)))
You need to replace the ...get-name-from-document... portion of the code with something that evaluates to the file name you want. For example, if the buffer is named myfile.py, you can use (buffer-name). If the buffers get odd names, you may need to extract the name from the document object instead (Rudel internally uses a document object to represent the thing you are sharing). So if (buffer-name) doesn't work, try (rudel-suggested-buffer-name document).
That is, try the above code with one of these lines:
(let ((buffer-file-name (buffer-name)))
and
(let ((buffer-file-name (rudel-suggested-buffer-name document)))
set-auto-mode uses the value of buffer-file-name to determine the major mode through the general Emacs mechanisms.

I know absolutely nothing about how Rudel works. However, have you tried explicitly setting the mode in the file itself? Try adding something like this as the first line of the file:
# -*- mode: python; fill-column: 75; comment-column: 50; -*-
A line like this at the top of the file causes Emacs to ignore the file's extension and open the file in the given mode.

Related

How do you use Python Ghostscript's high-level interface to convert a .pdf file into multiple .png files?

I am trying to convert a .pdf file into several .png files using Ghostscript in Python. The other answers on here were pretty old hence this new thread.
The following code was given as an example on pypi.org of the 'high level' interface, and I am trying to model my code after the example code below.
import sys
import locale
import ghostscript

args = [
    "ps2pdf",  # actual value doesn't matter
    "-dNOPAUSE", "-dBATCH", "-dSAFER",
    "-sDEVICE=pdfwrite",
    "-sOutputFile=" + sys.argv[1],
    "-c", ".setpdfwrite",
    "-f", sys.argv[2]
]

# arguments have to be bytes, encode them
encoding = locale.getpreferredencoding()
args = [a.encode(encoding) for a in args]

ghostscript.Ghostscript(*args)
Can someone explain what this code is doing? And can it be used somehow to convert a .pdf into .png files?
I am new to this and am truly confused. Thanks so much!
That's calling Ghostscript, obviously. Judging by the arguments, it isn't spawning a process; it is linked (either dynamically or statically) to the Ghostscript library.
The args are Ghostscript arguments. These are described in the Ghostscript documentation, which you can find online. Because the interface mimics the command line, where the first argument is the calling program, the first argument here is meaningless and can be anything you want (as the comment says).
The next three arguments set NOPAUSE, so the entire input is processed without pausing between pages; BATCH, so that on completion Ghostscript exits instead of returning to the interactive prompt; and SAFER, which prevents some potentially dangerous operations (and is now the default anyway).
Then it selects a device. In Ghostscript (due to the PostScript language) devices are what actually output stuff. In this case the device selected is the pdfwrite device, which outputs PDF.
Then there's the OutputFile, you can probably guess that this is the name (and path) of the file where the output is to be written.
The next three arguments, -c .setpdfwrite -f, are frankly archaic and pointless. They were once recommended when using the pdfwrite device (and only the pdfwrite device), but they have no useful effect these days.
The very last argument is, of course, the input file.
Certainly you can use Ghostscript to render PDF files to PNG. You want one of the PNG devices; there are several, depending on what colour depth you want to support. Unless you have some unusual requirement, just use png16m. If your input file contains more than one page, you'll want to put %d in the OutputFile name so that it writes one file per page.
More details on all of this can, of course, be found in the documentation.
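As a rough sketch of the advice above, the question's example could be adapted to the png16m device along these lines. The helper names, the -r resolution value and the page-%03d.png pattern are choices made for this illustration, not part of the PyPI example. Encoding the arguments to bytes simply follows the question's snippet and may not be required by every version of the module.

```python
import locale


def build_png_args(pdf_path, out_pattern="page-%03d.png", dpi=150):
    """Build a Ghostscript argument list that renders each PDF page to a PNG."""
    return [
        "pdf2png",  # placeholder program name; the actual value doesn't matter
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=png16m",              # 24-bit colour PNG device
        "-r%d" % dpi,                   # output resolution in dpi (illustrative value)
        "-sOutputFile=" + out_pattern,  # %03d is replaced by the page number
        pdf_path,
    ]


def pdf_to_pngs(pdf_path, out_pattern="page-%03d.png", dpi=150):
    """Run Ghostscript through the Python module, as in the question's example."""
    import ghostscript  # imported here so build_png_args works without the module

    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in build_png_args(pdf_path, out_pattern, dpi)]
    ghostscript.Ghostscript(*args)


if __name__ == "__main__":
    # Only prints the arguments; call pdf_to_pngs("input.pdf") to actually render.
    print(build_png_args("input.pdf"))
```

Keeping the argument list in its own function makes it easy to inspect or print before handing it to Ghostscript.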

Why is Django FieldFile readline() returning the hex version of a text file?

Having an odd problem.
I have a Django app that opens a file (represented as a Django FieldFile) and reads each row using readline() as below:
with file.open(mode='r') as f:
    row = f.readline()
    # do something with row...
The file is text, utf-8 encoded and lines are terminated with \r\n.
The problem is each row is being read as the hex representation of the string, so instead of "Hello" I get "48656c6c6f".
A few other strange things:
- It previously worked properly, but at some point an update broke it (I've tried rolling back to previous commits and it is still wonky, so possibly a dependency has updated rather than something in my requirements.txt). I missed it in my testing because it is in a very rarely used part of the app.
- If I read the same file using readlines() instead of readline(), I see the correct string representation of the file wrapped in [b'...']
- The file reads normally if I use plain Python open() and readline() from an interpreter
- Forcing text mode with mode='rt' doesn't change the behaviour, and neither does mode='rb'
- The file is stored in a Minio bucket, so the default storage is storages.backends.s3boto3.S3Boto3Storage from django-storages and not the default Django storage class. This means that boto3, botocore and s3fs are also in the mix, making it more confusing for me to debug.
Scratching my head at why this worked before and what I'm doing wrong.
Environment is Python 3.8, Django 2.2.8 and 3.0 (same result) running in Docker containers.
EDIT
Let me point out that the fix for this is simply using
row = f.readline().decode()
but I would still like to figure out what's happening.
EDIT 2
Further to this, FieldFile.open() is reading the file as a binary file, whereas plain Python open() is reading the file as a text file.
This seems very weird.
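What the two edits describe can be reproduced without Django or S3. The sketch below uses an in-memory io.BytesIO as a stand-in for the binary handle (an assumption made for illustration; it says nothing about what django-storages does internally). It shows that a binary handle returns bytes, that .decode() turns them back into text, and that the odd value in the question is the hex form of the same UTF-8 bytes.

```python
import io

# Stand-in for the binary file handle; the content mirrors the question's
# description (UTF-8 text with \r\n line endings).
raw = io.BytesIO("Hello\r\nWorld\r\n".encode("utf-8"))

line = raw.readline()        # a binary handle returns bytes
text = line.decode("utf-8")  # the workaround from the first edit

# The value from the question is the hex representation of b"Hello".
decoded_hex = bytes.fromhex("48656c6c6f").decode("utf-8")

# Wrapping the binary handle yields text-mode behaviour, including
# newline translation, without decoding every row by hand.
raw.seek(0)
wrapped = io.TextIOWrapper(raw, encoding="utf-8")
first = wrapped.readline()

print(repr(line), repr(text), repr(decoded_hex), repr(first))
```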
I think you will see the solution immediately after trying the following (I will update or delete my answer if it really doesn't help, but I'm quite confident).
I assume that some code is monkeypatching file.open or the Django view function.
What I suggest is:
Start your code with manage.py runserver
Add the following code to manage.py (as the very first lines):
import file
print("ID of file.open at manage startup is", id(file.open))
Then add this code to your view, one line above the file.open call:
print("ID of file.open before opening is", id(file.open))
If the two ids are different, then something monkeypatched your open function.
If both are the same, then the problem must be somewhere else.
If you do not see the output of these two prints, something might have monkeypatched your view.
If this doesn't work, then try using open() instead of file.open().
Is there any particular reason you use file.open()?
Addendum 1:
So what you said is that file is an instance of a class. Is it a FileField?
In any case, can you obtain the name of the file and open it with a normal open(), to see whether only file.open() does funny things or whether open() also reads it in this strange way?
Did you try simply printing the file from the command line with cat filename (or, under Windows, with type filename)?
If that doesn't work we could add traces to follow each line of the source code that is being executed.
Addendum 2:
Well if you can't try this in a manage.py runserver, what happens if you try to read the file with a manage.py shell?
Just open the shell and type something like:
from <your_application>.models import <YourModel>
entry = <YourModel>.objects.get(id=<idofentry>)
line1 = entry.<filefieldname>.open("r").read().split("\n")[0]
print("line1 = %r" % line1)
If this is still not conclusive (and only if you can reproduce the issue in the management shell), create a small file containing these lines:
from <your_application>.models import <YourModel>
entry = <YourModel>.objects.get(id=<idofentry>)
import pdb; pdb.set_trace()
line1 = entry.<filefieldname>.open("r").read().split("\n")[0]
print("line1 = %r" % line1)
And import it from the management shell.
The code should enter the debugger, and you can then single-step through the open function and see whether you end up in some weird function from a monkeypatch.

How do I edit an executable with python by address/offset/bytes, like in a hex editor?

I've used Hex-Rays IDA to find the bytes of code I need changed in a Windows executable. I would like to write a Python script that will programmatically edit those bytes.
I know the address (as given in Hex-Rays IDA) and I know the hexadecimal value I wish to overwrite it with. How do I do this in Python? I'm sure there is a simple answer, but I can't find it.
(For example: address = 0x00436411, and new hexadecimal = 0xFA)
You just need to open the executable as a file, for writing, in binary mode; then seek to the position you want to write; then write. So:
with open(path, 'r+b') as f:
    f.seek(position)
    f.write(new_bytes)
If you're going to be changing a lot of bytes, you may find it simpler to use mmap, which lets you treat the file as a giant list:
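import contextlib
import mmap

with open(path, 'r+b') as f:
    # a length of 0 maps the whole file
    with contextlib.closing(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)) as m:
        m[first_position] = first_new_byte  # an int from 0 to 255 in Python 3
        m[other_position] = other_new_byte
        # ...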
If you're trying to write multi-byte values (e.g., a 32-bit int), you probably want to use the struct module.
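For instance, a 32-bit value can be turned into the bytes to write with struct.pack. The value below is arbitrary, and little-endian byte order is assumed because that is what x86 Windows executables use.

```python
import struct

# "<I" means little-endian, unsigned 32-bit integer.
new_bytes = struct.pack("<I", 0x12345678)
print(new_bytes.hex())  # 78563412

# struct.unpack reverses the conversion, which is handy for checking a patch.
(value,) = struct.unpack("<I", new_bytes)
print(hex(value))  # 0x12345678
```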
If what you know is an address in memory at runtime, rather than a file position, you have to be able to map that to the right place in the executable file. That may not even be possible (e.g., a memory-mapped region). But if it is, you should be able to find out from the debugger where it's mapped. From inside a debugger, this is easy; from outside, you need to parse the PE header structures and do a lot of complicated logic, and there is no reason to do that.
I believe that when you use Hex-Rays IDA as a static disassembler, with all the default settings, the addresses it gives you are the addresses where the code and data segments will be mapped into memory if they aren't forced to relocate. Those are, obviously, not offsets into the file.
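If the image base and the section's virtual address and raw file offset are already known (for example, read from the disassembler or a PE viewer), the mapping is simple arithmetic. The sketch below assumes exactly that. The function names and the section values are hypothetical, chosen only to illustrate the calculation with the address from the question, and the patch is demonstrated on a scratch file rather than a real executable.

```python
import os
import tempfile


def va_to_file_offset(va, image_base, section_va, section_raw_offset):
    """Map a virtual address to a file offset within a single PE section."""
    rva = va - image_base                        # address relative to the image base
    return rva - section_va + section_raw_offset


def patch_file(path, offset, new_bytes):
    """Overwrite len(new_bytes) bytes at the given file offset, in place."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)


# Hypothetical values: image base 0x400000, section mapped at RVA 0x1000,
# section data starting at file offset 0x400.
offset = va_to_file_offset(0x00436411, 0x00400000, 0x1000, 0x400)
print(hex(offset))  # 0x35811

# Demonstrate the patch on a 16-byte scratch file.
fd, scratch = tempfile.mkstemp()
os.close(fd)
with open(scratch, "wb") as f:
    f.write(b"\x00" * 16)
patch_file(scratch, 4, b"\xfa")
with open(scratch, "rb") as f:
    patched = f.read()
os.remove(scratch)
print(patched[4])  # 250, i.e. 0xFA
```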

is there a built-in python analog to unix 'wc' for sniffing a file?

Everyone's done this--from the shell, you need some details about a text file (more than just ls -l gives you), in particular, that file's line count, so:
# > wc -l iris.txt
149 iris.txt
I know that I can access shell utilities from Python, but I am looking for a Python built-in, if there is one.
The crux of my question is getting this information without opening the file (hence my reference to the Unix utility wc -l).
(Is 'sniffing' the correct term for this, i.e., peeking at a file without opening it?)
You can always scan through it quickly, right?
with open('iris.txt') as f:
    lc = sum(1 for _ in f)
No, I would not call this "sniffing". Sniffing typically refers to looking at data as it passes through, like Ethernet packet capture.
You cannot get the number of lines in a file without opening it. This is because the number of lines is actually the number of newline characters ("\n" on Linux) in the file, which you can only count by reading the file after open()ing it.
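Counting newline characters, as described above, can be sketched as follows. The function name and chunk size are arbitrary choices for this example. Reading in binary chunks avoids decoding the text and keeps memory use constant for large files.

```python
import os
import tempfile


def count_newlines(path, chunk_size=1 << 16):
    """Count newline bytes in a file, reading it in fixed-size chunks."""
    count = 0
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            count += chunk.count(b"\n")
    return count


# Small demonstration on a scratch file: three lines, the last one unterminated.
fd, scratch = tempfile.mkstemp()
os.close(fd)
with open(scratch, "wb") as f:
    f.write(b"a\nb\nc")
demo_count = count_newlines(scratch)
os.remove(scratch)
print(demo_count)  # 2
```

Because only newline characters are counted, a final line without a trailing newline is not included.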

How to append EOF to file using Perl or Python?

I’m trying to bulk insert data into a SQL Server Express database. When running bcp from the Windows XP command prompt, I get the following error:
C:\temp>bcp in -T -f -S
Starting copy...
SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Native Client]Unexpected EOF encountered in BCP data-file
0 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 4391
So, there is a problem with EOF. How to append a correct EOF character to this file using Perl or Python?
EOF is End Of File. What probably occurred is that the file is not complete; the software expects data, but there is none to be had anymore.
These kinds of things happen when:
- the export is interrupted (the dump software is quit while dumping)
- the copy of the dump file is aborted partway through
- the disk fills up during the dump
and in similar situations.
By the way, though EOF is usually just an end of file, there does exist an EOF character. This is used because terminal (command line) input doesn't really end like a file does, but it sometimes is necessary to pass an EOF to such a utility. I don't think it's used in real files, at least not to indicate an end of file. The file system knows perfectly well when the file has ended, it doesn't need an indicator to find that out.
EDIT shamelessly copied from a comment provided by John Machin
It can happen (unintentionally) in real files. All it needs is (1) a data-entry user to type Ctrl-Z by mistake, see nothing on the screen, type the intended Shift-Z, and keep going, and (2) validation software (written by e.g. the company president's nephew) which happily accepts Ctrl-anykey in text fields, and your database has a little bomb in it, just waiting for someone to produce a query to a flat file.
Unexpected EOF means that the bcp reader found an EOF when it was expecting more data. This EOF can be:
(1) the actual physical end-of-file (no more bytes to be read). This means that you have mis-formatted data. Check near the end of your file for an incomplete record.
OR
(2) on Windows, where you are, programs reading a file in text mode honour the ancient convention, inherited via MS-DOS from CP/M, of regarding Ctrl-Z (aka ^Z aka '\x1A' aka SUB aka SUBSTITUTE) as an end-of-file marker when reading from ANY file, not just a terminal. This includes Python -- the behaviour is determined by the C stdlib. Check for '\x1A' in your data.
Update responding to comments in a legible fashion:
In Notepad++, you can make it display unusual characters by doing View / Show Symbol / Show All Characters. You can search by doing Ctrl-F, typing \x1a in the Find What box, and selecting the Extended radio button in the Search panel.
Or, with a little bit of Python, you can get the line number of the first Ctrl-Z:
with open('bcp.dat', 'rb') as f:
    data = f.read()
zpos = data.find(b'\x1a')
if zpos == -1:
    print('no Ctrl-Z in file')
else:
    print(1 + data[:zpos].count(b'\r\n'))
Where your .dat was created doesn't matter. An unintentional Ctrl-Z can happen anywhere in a file created on any operating system. It is where it is being read as a text file that matters -- Windows? Bang!
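For case (1), an incomplete record near the end of the file, a quick check along these lines may help. It is only a sketch: the tab field separator and \r\n row terminator are assumptions about the data file (the question's format file is not shown), and the function name is made up for this example.

```python
import os
import tempfile


def find_bad_records(path, field_sep=b"\t", row_sep=b"\r\n"):
    """Return (ends_with_row_sep, [(line_number, field_count), ...]).

    A row is reported when its field count differs from the first row's.
    """
    with open(path, "rb") as f:
        data = f.read()
    ends_ok = data.endswith(row_sep)
    rows = data.split(row_sep)
    if rows and rows[-1] == b"":
        rows.pop()  # drop the empty piece after the final terminator
    if not rows:
        return ends_ok, []
    expected = rows[0].count(field_sep) + 1
    bad = [(n, row.count(field_sep) + 1)
           for n, row in enumerate(rows, start=1)
           if row.count(field_sep) + 1 != expected]
    return ends_ok, bad


# Demonstration on a scratch file whose last record is cut short.
fd, scratch = tempfile.mkstemp()
os.close(fd)
with open(scratch, "wb") as f:
    f.write(b"1\ta\r\n2\tb\r\n3")
result = find_bad_records(scratch)
os.remove(scratch)
print(result)  # (False, [(3, 1)])
```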
This is not a problem with a missing EOF, but with an EOF that is present where bcp does not expect it.
I am not a bcp expert, but it looks like there is some problem with the format of your data files.
