Map object has no len() in Python 3

I have a Python tool, written by someone else, for flashing a certain microcontroller. It was written for Python 2.6, and I am using Python 3.3.
I have ported most of it, but this line is causing problems:
data = map(lambda c: ord(c), file(args[0], 'rb').read())
The file function does not exist in Python 3 and has to be replaced with open. But then, a function which gets data as an argument causes an exception:
TypeError: object of type 'map' has no len()
From what I have read in the documentation so far, map combines iterables into one big iterable. Am I missing something?
What do I have to do to port this to Python 3?

In Python 3, map returns an iterator. If your function expects a list, the iterator has to be explicitly converted, like this:
data = list(map(...))
But here it can be done more simply, like this:
with open(args[0], "rb") as input_file:
    data = list(input_file.read())
"rb" opens the file for reading in binary mode, so read() returns a bytes object. Iterating over bytes in Python 3 yields integers, so list() already produces the values that ord() was computing in the Python 2 version.
Quoting from the docs for open:
Python distinguishes between binary and text I/O. Files opened in
binary mode (including 'b' in the mode argument) return contents as
bytes objects without any decoding.
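A minimal sketch of both points, assuming Python 3. The helper name read_byte_values and the temporary sample file are made up for illustration and are not part of the original tool:

```python
import os
import tempfile


def read_byte_values(path):
    """Return the file's contents as a list of integer byte values."""
    with open(path, "rb") as input_file:
        # Iterating over a bytes object yields ints in Python 3,
        # so no ord() call is needed.
        return list(input_file.read())


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x01\x02ABC")
    sample_path = tmp.name

try:
    lazy = map(ord, "ABC")   # a map iterator; calling len(lazy) raises TypeError
    as_list = list(lazy)     # materialised as a list, so len() works
    data = read_byte_values(sample_path)
    print(len(as_list), data)
finally:
    os.remove(sample_path)
```

Anything that needs len(), indexing, or a second pass over the values should receive the list, not the map iterator.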

Related

Is there documentation for file object?

This is probably a really dumb question, but I honestly can't find documentation for the file object's API in Python 3.
The Python docs for things that use or return file objects, like open or sys.stdin, link to a glossary entry with a high-level introduction. It doesn't list the functions exposed by such objects, so I don't know what I can do with them. I've tried googling for file object docs, but search engines don't seem to understand what I am looking for.
I'm new to Python, but not to programming in general. Until now, my approach to using objects has been to find the complete API reference, see what it can do, and then pick the methods to use in my code. Is this the wrong mindset in the Python world? What are the alternatives?
open returns a file object that differs depending on the mode. From the open docs:
The type of file object returned by the open() function depends on the mode. When open() is used to open a file in a text mode ('w', 'r', 'wt', 'rt', etc.), it returns a subclass of io.TextIOBase (specifically io.TextIOWrapper). When used to open a file in a binary mode with buffering, the returned class is a subclass of io.BufferedIOBase. The exact class varies: in read binary mode, it returns an io.BufferedReader; in write binary and append binary modes, it returns an io.BufferedWriter, and in read/write mode, it returns an io.BufferedRandom. When buffering is disabled, the raw stream, a subclass of io.RawIOBase, io.FileIO, is returned.
Since it varies, open a file object with the mode you want help for and ask it for help:
>>> f = open('xx','w')
>>> help(f)
Help on TextIOWrapper object:
class TextIOWrapper(_TextIOBase)
| Character and line based layer over a BufferedIOBase object, buffer.
|
: etc...
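The mode-to-class mapping quoted above can also be checked directly. This is a small sketch, assuming Python 3; the helper name file_object_types and the temporary file are made up for the example:

```python
import os
import tempfile


def file_object_types(path):
    """Map a description of each open() call to the class name it returns."""
    calls = {
        "w": {"mode": "w"},
        "rb": {"mode": "rb"},
        "wb": {"mode": "wb"},
        "r+b": {"mode": "r+b"},
        "rb, buffering=0": {"mode": "rb", "buffering": 0},
    }
    result = {}
    for label, kwargs in calls.items():
        with open(path, **kwargs) as f:
            result[label] = type(f).__name__
    return result


# Create an empty file first so that the read modes have something to open.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
try:
    types = file_object_types(demo_path)
    for label, name in types.items():
        print(label, "->", "io." + name)
finally:
    os.remove(demo_path)
```

Once the class name is known, the matching section of the io module documentation (or help() on that class) lists the available methods.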

python 2 to 3 migration error "readinto" method

I have a huge file that I wrote in Python 2.7.3, and I now want to upgrade it to Python 3 (I have 3.5).
What I have done so far:
installed the Python 3.5 interpreter
updated the PATH environment variable to point at the Python 3 folder
upgraded numpy and pandas
ran python 2to3.py -w viterbi.py to convert the file to Python 3
The section where I get the error:
import sys
import numpy as np
import pandas as pd
# Counting number of lines in the text file
lines = 0
buffer = bytearray(2048)
with open(inputFilePatheName) as f:
    while f.readinto(buffer) > 0:
        lines += buffer.count('\n')
My error is:
AttributeError: '_io.TextIOWrapper' object has no attribute 'readinto'
This is the first error, so I cannot proceed to see whether there are any others. I don't know what the equivalent of readinto is in Python 3.
In 3.x, the readinto method is only available on binary I/O streams. Thus: with open(inputFilePatheName, 'rb') as f:.
Separately, buffer.count('\n') will not work any more, because Python 3.x handles text properly, as something distinct from a raw sequence of bytes. buffer, being a bytearray, stores bytes; it still has a .count method, but it has to be given either an integer (representing the numeric value of a byte to look for) or a "bytes-like object" (representing a subsequence of bytes to look for). So we also have to update that, as buffer.count(b'\n') (using a bytes literal).
Finally, we need to be aware that processing the file this way means we don't get universal newline translation by default any more.
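The two points about count and newline translation can be seen in a short sketch, assuming Python 3. The sample data and the temporary file are invented for the demonstration:

```python
import os
import tempfile

buffer = bytearray(b"one\ntwo\n")

by_bytes = buffer.count(b"\n")   # bytes literal: a subsequence to look for
by_int = buffer.count(10)        # integer: the byte value of "\n"

try:
    buffer.count("\n")           # a str is neither of those, so this fails
    str_accepted = True
except TypeError:
    str_accepted = False

# Universal newlines: the same file can give different counts per mode.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a\r\nb\rc\n")    # three line endings: \r\n, \r and \n
    mixed_path = tmp.name

try:
    with open(mixed_path, "rb") as f:
        binary_count = f.read().count(b"\n")   # only literal \n bytes
    with open(mixed_path, "r", encoding="ascii") as f:
        text_count = f.read().count("\n")      # \r\n and \r are translated to \n
finally:
    os.remove(mixed_path)

print(by_bytes, by_int, str_accepted, binary_count, text_count)
```

If the input may contain bare \r line endings, the binary count will be lower than the number of lines that text mode would report.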
Open the file as binary.
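As long as you can guarantee it's UTF-8 or CP encoded, all \ns will necessarily be newlines:
with open(inputFilePatheName, "rb") as f:
    n = f.readinto(buffer)
    while n > 0:
        # Count only the n bytes just read; the rest of buffer is stale.
        lines += buffer.count(b'\n', 0, n)
        n = f.readinto(buffer)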
That way you also save the time of decoding the file, and use your buffer in the most efficient way possible.
A better approach to what you're trying to achieve is using memory mapped files.
On Windows:
import mmap
import os

# os.O_BINARY and os.O_SEQUENTIAL exist only on Windows.
file_handle = os.open(r"yourpath", os.O_RDONLY|os.O_BINARY|os.O_SEQUENTIAL)
try:
    with mmap.mmap(file_handle, 0, access=mmap.ACCESS_READ) as f:
        pos = -1
        total = 0
        # The := operator needs Python 3.8 or later.
        while (pos := f.find(b"\n", pos+1)) != -1:
            total +=1
finally:
    os.close(file_handle)
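Again, make sure the text is not encoded as UTF-16, which some Windows tools write by default: in UTF-16 the byte 0x0A can also occur inside other characters, so counting b"\n" would give wrong results.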
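For comparison, here is a sketch of the same memory-mapped count that is not tied to Windows. It uses a plain loop instead of :=, so it should also run on Python versions before 3.8. The function name count_newlines_mmap and the sample file are made up for this example:

```python
import mmap
import os
import tempfile


def count_newlines_mmap(path):
    """Count newline bytes in a file through a read-only memory map."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            # An empty file cannot be memory-mapped and has no newlines.
            return 0
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            total = 0
            pos = mapped.find(b"\n")
            while pos != -1:
                total += 1
                pos = mapped.find(b"\n", pos + 1)
            return total


# Hypothetical sample file: two newlines, no trailing one.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a\nb\nc")
    demo_path = tmp.name

try:
    newline_total = count_newlines_mmap(demo_path)
    print(newline_total)
finally:
    os.remove(demo_path)
```

Note that this counts newline bytes, so a last line without a trailing \n is not included in the total.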

Convert and save string to binary file in Python

I'm using PyOBEX to exchange binary files (e.g. images etc.) between my computer (Windows 7) and my phone (Android). However, when I use get() to get a file from my phone, it arrives on my computer as a str. I tried using the chardet module to find out what encoding to use to decode it and eventually turn it into a binary file, but it returned None. type() says that it's a str.
The code is the following:
import bluetooth
import BTDeviceFinder
import PyOBEX.client
name = "myDevice"
address = BTDeviceFinder.find_by_name(name)
port = BTDeviceFinder.find_port(address)
client = PyOBEX.client.BrowserClient(address, port)
client.connect()
a, b = client.get("pic.jpg")
where a is the header (that comes with a file sent via OBEX) and b is the actual file object. b looks something like this: https://drive.google.com/file/d/0By0ywTLTjb3LaFJaM2hWVEdBakE/view?usp=sharing
The PyOBEX documentation or Python forums say nothing about what encoding is used with get().
Do you know how to turn this string into binary data that can be used with write() and then saved in the original file format (i.e. .jpg)?
In Python 2.7, strings represent raw bytes (this changes in Python 3).
You simply need to save the data to a file opened in binary mode:
with open('file.jpg', 'wb') as handle:
    handle.write(data_string)
Here is a link to the python doc on open:
https://docs.python.org/2/library/functions.html#open
Note that the "b" in the mode stands for binary.
Again, this assumes Python 2.7.
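For readers on Python 3, where raw data is a bytes object and not a str, the same idea might look like the sketch below. The helper name save_binary and the payload are invented for illustration; whether PyOBEX returns bytes under Python 3 is an assumption that was not checked here:

```python
import os
import tempfile


def save_binary(path, payload):
    """Write a bytes-like payload to path without any text translation."""
    if isinstance(payload, str):
        # In Python 3 a str is text, not raw bytes; refuse to guess an encoding.
        raise TypeError("expected bytes, got str")
    with open(path, "wb") as handle:
        handle.write(payload)


# Made-up payload standing in for the received file contents.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\r\n\x00\x1a" + b"\xff\xd9"

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "file.jpg")
    save_binary(target, fake_jpeg)
    with open(target, "rb") as handle:
        round_trip = handle.read()

print(round_trip == fake_jpeg)
```

Binary mode matters most on Windows, where text mode would rewrite the \r\n and \n bytes inside the image data.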

Converting the output of a module from print to write mode

I have been trying to use the twobitreader package (http://pythonhosted.org//twobitreader/) to extract DNA sequence information, but I have run into a problem. Whenever I use the twobitreader.twobit_reader() function, I am only able to obtain printed output. What I would like to do is write the output to a new file.
This is the information on this function from http://pythonhosted.org//twobitreader/:
twobit_reader takes a twobit_file (of class TwoBitFile) and an “input_stream”, which can be any iterable (incl. file-like objects); writes output (FASTA format) using write (print if write=None); logs errors/warning to stderr
Most likely, my limited knowledge of Python programming is keeping me from accomplishing this task.
For example, here is some code that I wrote:
def get_a(n):
    """get sequences from genome"""
    genome = twobitreader.TwoBitFile('hg19.2bit')
    bedfile = open(n+'.bed', 'r')
    o_f = open(n+'_FASTA.txt', 'w')
    twobitreader.twobit_reader(genome, bedfile)
    bedfile.close()
    o_f.close()
This ends up printing my sequences.
If I try to alter the twobitreader line to: twobitreader.twobit_reader(genome, bedfile, o_f) in the attempt to write the data to the file o_f, I get the error 'file' object is not callable.
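The third argument has to be a callable that receives the text to write, so pass the file's write method and not the file object itself. OP confirmed this worked: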
twobitreader.twobit_reader(genome, bedfile, o_f.write)
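The difference between passing the file object and passing its write method can be shown with a stand-in function. The name emit_records and its body are hypothetical; this is not twobitreader's implementation, only a sketch of the "write callable, print if None" convention described in its docs:

```python
import io


def emit_records(records, write=None):
    """Stand-in using a 'write callable, else print' convention.

    This is NOT twobitreader's code, only an illustration of the pattern.
    """
    for name, sequence in records:
        text = ">{}\n{}\n".format(name, sequence)
        if write is None:
            print(text, end="")
        else:
            write(text)      # write must be callable


out = io.StringIO()          # any object with a write method would do
emit_records([("chr1:0-4", "ACGT")], out.write)

try:
    emit_records([("chr1:0-4", "ACGT")], out)   # the object itself
    passing_object_failed = False
except TypeError:
    passing_object_failed = True                # "... object is not callable"

print(repr(out.getvalue()), passing_object_failed)
```

A bound method such as o_f.write keeps a reference to its file, so the function can call it without knowing anything about files.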

Access contents of PyBuffer from C

I have created a buffer object in python like so:
f = io.open('some_file', 'rb')
byte_stream = buffer(f.read(4096))
I'm now passing byte_stream as a parameter to a C function, through SWIG. I have a typemap for converting the data which looks like this:
%typemap(in) unsigned char * byte_stream {
    PyObject *buf = $input;
    //some code to read the contents of buf
}
I have tried a few different things but can't get to the actual content/value of my byte_stream. How do I convert or access the content of my byte_stream using the C API? There are many different methods for converting C data to a buffer, but none that I can find for going the other way around.
I have tried looking at this object in gdb, but neither it nor the values it points to contain my data.
(I'm using buffers because I want to avoid the overhead of converting the data to a string when reading it from the file)
I'm using python 2.6 on Linux.
--
Thanks Pavel
"I'm using buffers because I want to avoid the overhead of converting the data to a string when reading it from the file"
You are not avoiding anything. The string is already built by the read() method. Calling buffer() just builds an additional buffer object pointing to that string.
As for getting at the memory pointed to by the buffer object, try PyObject_AsReadBuffer(). See also http://docs.python.org/c-api/objbuffer.html.
As soon as you call read() on your file object, the data is converted to a str object; calling buffer() does not turn it into a stream of any kind. If you want to avoid the overhead of creating the string object, you could simply pass the file object to your C code and then use it via its C API.
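The same point can be observed from the Python side. The sketch below assumes Python 3, where the buffer() builtin no longer exists and memoryview is the closest analogue; the sample file is created only for the demonstration:

```python
import os
import tempfile

# Hypothetical sample file standing in for 'some_file'.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00\x01\x02\x03" * 4)
    some_file = tmp.name

try:
    with open(some_file, "rb") as f:
        data = f.read(4096)              # the bytes object is built here
        view = memoryview(data)          # a view onto data, not a copy
        view_wraps_data = view.obj is data

    with open(some_file, "rb") as f:
        storage = bytearray(4096)        # preallocated, reusable storage
        filled = f.readinto(storage)     # bytes land directly in storage
        first_bytes = bytes(storage[:4])
finally:
    os.remove(some_file)

print(view_wraps_data, filled, first_bytes)
```

Wrapping the result of read() only adds a second object on top of the one already created, while readinto() fills storage that the caller already owns.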
