Python regexp \n issue

This search works:
>>> re.search(r'(.*?)\r\n(.+?)\r\n', 'aaa\r\r\nbbb\r\n').groups()
('aaa\r', 'bbb')
But when I replace one of the three b characters with \n, the search fails:
>>> re.search(r'(.*?)\r\n(.+?)\r\n', 'aaa\r\r\nb\nc\r\n').groups()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'groups'
But I want the second case to parse as:
('aaa\r', 'b\nc')

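You need the DOTALL flag. By default, . matches any character except a newline, so (.+?) cannot match across the \n in 'b\nc':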
import re
re.search(r'(.*?)\r\n(.+?)\r\n', 'aaa\r\r\nb\nc\r\n', flags=re.DOTALL).groups()
result:
('aaa\r', 'b\nc')
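To make the difference visible, here is a small sketch that runs the same pattern with and without the flag. The variable names are only for illustration.

```python
import re

pattern = r'(.*?)\r\n(.+?)\r\n'
text = 'aaa\r\r\nb\nc\r\n'

# Without DOTALL, '.' stops at '\n', so the second group can never span 'b\nc'.
no_flag = re.search(pattern, text)

# With DOTALL, '.' also matches '\n' and the lazy group grows until '\r\n'.
with_flag = re.search(pattern, text, flags=re.DOTALL)

print(no_flag)             # None
print(with_flag.groups())  # ('aaa\r', 'b\nc')
```

The inline form (?s) at the start of the pattern should behave the same way as passing flags=re.DOTALL.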

Related

Tuple index out of range error with .format(list)

I have a strange problem that I don't understand. I have a format string with a lot of fields, and I want to supply the content for the fields using a list. The simple demo below shows the issue:
>>> formatstr = "Hello {}, you are my {} friend since {}"
>>> list = ["John", "best", 2020]
>>> print formatstr.format(list)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
>>>
The format string has 3 fields and the list also has 3 elements.
So I don't understand the error message.
The same happens even when I address the indexes explicitly within the format string:
>>>
>>> formatstr = "Hello {0:}, you are my {1:} friend since {2:}"
>>>
>>> print formatstr.format(list)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
>>>
Can you please help me? I think I'm stuck somewhere in my thinking.
Thanks.
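One likely explanation, sketched below: format(list) passes the whole list as a single positional argument, so only field 0 can be filled and fields 1 and 2 raise the IndexError. Unpacking the list with * passes each element as its own argument. The sketch uses Python 3 print syntax and the name values instead of list (to avoid shadowing the built-in); both are choices made for this example.

```python
formatstr = "Hello {}, you are my {} friend since {}"
values = ["John", "best", 2020]

# format(values) would supply one argument (the list itself) for three fields.
# The * operator unpacks the list into three separate positional arguments.
result = formatstr.format(*values)
print(result)  # Hello John, you are my best friend since 2020

# Alternatively, index into the single list argument from the format string.
indexed = "Hello {0[0]}, you are my {0[1]} friend since {0[2]}".format(values)
print(indexed)
```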

string manipulation col_schema="col1 string,col2 int". I need to retrieve => "col1,col2"

I have a string col_schema="col1 string,col2 int"
Now I need to retrieve the column names alone, something like this => output="col1,col2"
I tried the following:
>>> name, value = col_schema.split(' ')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
>>> col_list_split = col_schema.split(',')
>>> print col_list_split
['col1 string', 'col2 int']
>>> name, value = col_list_split.split(' ')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'split'

You can split each element on whitespace after splitting on a comma:
col_list_split = (x.split() for x in col_schema.split(','))
This gives you a generator of lists, where the first element of each list is the column name. You can then join those names on a comma:
result = ','.join(x[0] for x in col_list_split)
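Put together as one runnable piece, the two steps might look like the sketch below. It assumes every entry has the form "name type"; the variable names are just for illustration.

```python
col_schema = "col1 string,col2 int"

# Split on commas to get one "name type" entry per column,
# then split each entry on whitespace and keep only the name.
names = [entry.split()[0] for entry in col_schema.split(',')]

result = ','.join(names)
print(result)  # col1,col2
```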

Type error in Python: need a single Unicode character as parameter

When I try to convert a unicode variable to float using unicodedata.numeric(variable_name), I get this error "need a single Unicode character as parameter". Does anyone know how to resolve this?
Thanks!
Here is the code snippet I'm using:
f = urllib.urlopen("http://compling.org/cgi-bin/DAL_sentence_xml.cgi?sentence=good")
s = f.read()
f.close()
doc = libxml2dom.parseString(s)
measure = doc.getElementsByTagName("measure")
valence = unicodedata.numeric(measure[0].getAttribute("valence"))
activation = unicodedata.numeric(measure[0].getAttribute("activation"))
This is the error I'm getting when I run the code above
Traceback (most recent call last):
File "sentiment.py", line 61, in <module>
valence = unicodedata.numeric(measure[0].getAttribute("valence"))
TypeError: need a single Unicode character as parameter

Summary: Use float() instead.
The numeric function takes a single character. It does not do general conversions:
>>> import unicodedata
>>> unicodedata.numeric('½')
0.5
>>> unicodedata.numeric('12')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: need a single Unicode character as parameter
If you want to convert a numeric string to a float, use the float() function:
>>> float('12')
12.0
It won't do that Unicode magic, however:
>>> float('½')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: '½'
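If you need both behaviours, one possible approach is to try float() first and fall back to unicodedata.numeric() only for single characters. This is a sketch for Python 3; the helper name to_float is made up for this example and is not part of any library.

```python
import unicodedata


def to_float(text):
    """Convert a numeric string to float (hypothetical helper).

    Falls back to unicodedata.numeric() for a single character like '½'.
    """
    try:
        return float(text)
    except ValueError:
        if len(text) == 1:
            return unicodedata.numeric(text)
        raise  # multi-character strings that float() rejects stay an error


print(to_float("12"))    # 12.0
print(to_float("1.85"))  # 1.85
print(to_float("½"))     # 0.5
```

For attribute values such as valence and activation in the question, plain float() is probably all that is needed.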

Python: Problems with a list comprehension using module laspy

I recently came to appreciate the great advantage of list comprehensions. I am working with several million points (x,y,z) stored in a special file format, *.las. In Python there are two ways to work with this format:
liblas module, http://www.liblas.org/tutorial/python.html (C++/Python)
laspy module, http://laspy.readthedocs.org/en/latest/tut_part_1.html (pure Python)
I had several problems with liblas, so I want to try laspy.
In liblas I can use a list comprehension like this:
from liblas import file as lasfile
f = lasfile.File(inFile,None,'r') # open LAS
points = [(p.x,p.y) for p in f] # read in list comprehension
In laspy I cannot figure out how to do the same:
from laspy.file import File
f = File(inFile, mode='r')
f
<laspy.file.File object at 0x0000000013939080>
(f[0].X,f[0].Y)
(30839973, 696447860)
I tried several combinations such as:
points = [(p.X,p.Y) for p in f]
but I get this message:
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: Point instance has no attribute 'x'
I tried both uppercase and lowercase, because Python is case-sensitive:
>>> [(p.x,p.y) for p in f]
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: Point instance has no attribute 'x'
>>> [(p.X,p.Y) for p in f]
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: Point instance has no attribute 'X'
This is at the interactive prompt:
C:\Python27>python.exe
Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from laspy.file import File
>>> inFile="C:\\04-las_clip_inside_area\\Ku_018_class.las"
>>> f = File(inFile, None, 'r')
>>> f
<laspy.file.File object at 0x00000000024D5E10>
>>> points = [(p.X,p.Y) for p in f]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: Point instance has no attribute 'X'
>>>
Printing dir(p) after the list comprehension gives:
print dir(p)
['__doc__', '__init__', '__module__', 'make_nice', 'pack', 'packer', 'reader', 'unpacked']
In a for loop I always get the same error:
>>> for p in f:
...     print dir(p)
...     print p.X,p.Y
...
['__doc__', '__init__', '__module__', 'make_nice', 'pack', 'packer', 'reader', 'unpacked']
Traceback (most recent call last):
File "<interactive input>", line 3, in <module>
AttributeError: Point instance has no attribute 'X'
Using this code suggested by nneonneo:
import numpy as np
for p in f:
... points = np.array([f.X, f.Y]).T
I can store the coordinates in an array:
points
array([[ 30839973, 696447860],
[ 30839937, 696447890],
[ 30839842, 696447832],
...,
[ 30943795, 695999984],
[ 30943695, 695999922],
[ 30943960, 695999995]])
but I am still missing a way to write it as a list comprehension:
points = [np.array(p.X,p.Y).T for p in f]
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: Point instance has no attribute 'X'
Thanks in advance for your help.
Gianni

Python is case-sensitive. To me it looks like you are asking for attribute x, but it should be an uppercase X.

Try
import numpy as np
...
points = np.array([f.X, f.Y]).T

It looks like Point has a make_nice() method that makes more attributes show up.
for p in f: p.make_nice()
Now your list comp should work (with uppercase X and Y--see comments below).
[(p.X,p.Y) for p in f]
note: This answer is not tested. It is based on reading the source of laspy.util.Point.
Relevant source:
def make_nice(self):
    '''Turn a point instance with the bare essentials (an unpacked list of data)
    into a fully populated point. Add all the named attributes it possesses,
    including binary fields.
    '''
    i = 0
    for dim in self.reader.point_format.specs:
        self.__dict__[dim.name] = self.unpacked[i]
        i += 1
    # rest of method snipped
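To illustrate the idea without a .las file, here is a minimal sketch using a hypothetical stand-in class. FakePoint and its dim_names argument are invented for this example and are not part of laspy; only the pattern of copying unpacked values into named attributes follows the excerpt above. The sample coordinates are taken from the question.

```python
class FakePoint(object):
    """Hypothetical stand-in for laspy.util.Point; not the real class."""

    def __init__(self, dim_names, unpacked):
        self.dim_names = dim_names  # assumed dimension names, e.g. ("X", "Y")
        self.unpacked = unpacked    # raw values in the same order

    def make_nice(self):
        # Same idea as the excerpt: copy each raw value into a named attribute.
        for name, value in zip(self.dim_names, self.unpacked):
            self.__dict__[name] = value


raw_records = [(30839973, 696447860), (30839937, 696447890)]
fake_points = [FakePoint(("X", "Y"), rec) for rec in raw_records]

# Before make_nice() the named attributes do not exist yet.
had_x_before = hasattr(fake_points[0], "X")

for p in fake_points:
    p.make_nice()

points = [(p.X, p.Y) for p in fake_points]
print(had_x_before)  # False
print(points)        # [(30839973, 696447860), (30839937, 696447890)]
```

Here the points live in a plain list, so the attributes set in the loop are still there for the comprehension. Whether iterating a real laspy File a second time yields the same objects is not something this sketch checks; if it does not, calling make_nice() inside a single pass over f would be the safer pattern.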

The right and elegant way to split and join a string in Python

I have the following variables:
>>> poly
'C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa.shp'
>>> record
1373155
and I wish to create:
'C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa_1373155.txt'
I want to split the path in order to get the part "C:\04-las_clip_inside_area\16x16grids_1pp_fsa".
I have tried this two-line solution:
mylist = [poly.split(".")[0], "_", record, ".txt"]
>>> mylist
['C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa', '_', 1373155, '.txt']
From here, after reading the example in "Python join: why is it string.join(list) instead of list.join(string)?", I tried the following to join the parts, but I get this error message:
>>> mylist.join("")
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'list' object has no attribute 'join'
Also if I use:
>>> "".join(mylist)
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
TypeError: sequence item 2: expected string, int found

See "Python join: why is it string.join(list) instead of list.join(string)?". join is a method of the separator string, not of the list.
So the call is
"".join(mylist)
instead of
mylist.join("")
There's your error.
To solve your int/string problem, convert the int to string:
mylist= [poly.split(".")[0],"_",str(record),".txt"]
or write directly:
"{}_{}.txt".format(poly.split(".")[0], record)

>>> from os import path
>>>
>>> path.splitext(poly)
('C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa', '.shp')
>>>
>>> filename, ext = path.splitext(poly)
>>> "{0}_{1}.txt".format(filename, record)
'C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa_1373155.txt'

>>> poly = 'C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa.shp'
>>> record = 1373155
>>> "{}_{}.txt".format(poly.rpartition('.')[0], record)
'C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa_1373155.txt'
or, if you insist on using join():
>>> "".join([poly.rpartition('.')[0], "_", str(record), ".txt"])
'C:\\04-las_clip_inside_area\\16x16grids_1pp_fsa_1373155.txt'
It's important to use rpartition() (or rsplit()); otherwise it won't work properly if the path contains any other '.' characters.
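A quick sketch of why this matters. The path below is a made-up example with an extra dot in a directory name; it is not from the question.

```python
import os.path

# Hypothetical path with a second '.' in a directory name.
poly = 'C:\\data.v2\\grid.shp'
record = 1373155

# split('.')[0] cuts at the FIRST dot and loses most of the path.
naive_stem = poly.split('.')[0]

# rpartition('.') cuts at the LAST dot, which is the extension separator here.
safe_stem = poly.rpartition('.')[0]

# os.path.splitext() is written for exactly this job.
splitext_stem = os.path.splitext(poly)[0]

print(naive_stem)     # C:\data
print(safe_stem)      # C:\data.v2\grid
print(splitext_stem)  # C:\data.v2\grid
print("{}_{}.txt".format(splitext_stem, record))  # C:\data.v2\grid_1373155.txt
```

One difference worth knowing: for a file name with no extension at all, rpartition('.')[0] returns an empty string, while os.path.splitext() returns the name unchanged, so splitext is probably the more robust choice.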

You need to convert record into a string.
mylist= [poly.split(".")[0],"_",str(record),".txt"]
