In ArcGIS I have intersected a large number of zonal polygons with another set and recorded the original zone IDs and the data they are connected with. However, the strings that are created are one long run of digits, between 11 and 77 characters long (each ID is 11 characters long). I am looking to add a "," between each ID, making it easier to read and to export later as a .csv file. To do this I wrote this code:
def StringSplit(StrO,X):
    StrN = StrO #Recording original string
    StrLen = len(StrN)
    BStr = StrLen/X #How many segments are inside of one string
    StrC = BStr - 1 #How many times it should loop
    if StrC > 0:
        while StrC > 1:
            StrN = StrN[ :((X * StrC) + 1)] + "," + StrN[(X * StrC): ]
            StrC = StrC - 1
        while StrC == 1:
            StrN = StrN[:X+1] + "," + StrN[(X*StrC):]
            StrC = 0
        while StrC == 0:
            return StrN
    else:
        return StrN
The main issue is how it has to step through multiple rows (76) with various lengths (11 -> 77). I got the last parts to work, just not the internal loop as it returns an error or incorrect outputs for strings longer than 22 characters.
Thus right now:
1. 01234567890 returns 01234567890
2. 0123456789001234567890 returns 01234567890,01234567890
3. 012345678900123456789001234567890 returns either: Error or ,, or even ,,01234567890
I know it is probably something pretty simple I am missing, but I can't seem to remember what it is...
It can be done easily with a regex.
The ........... in the pattern is 11 dots, so each match is the next run of 11 characters.
You can use pandas to create a CSV file from the resulting list.
Code:
import re
x = re.findall('...........', '01234567890012345678900123456789001234567890')
print(x)
myString = ",".join(x)
print(myString)
output:
['01234567890', '01234567890', '01234567890', '01234567890']
01234567890,01234567890,01234567890,01234567890
For the sake of simplicity, you can do it in one line:
code:
x = ",".join(re.findall('...........', '01234567890012345678900123456789001234567890'))
print(x)
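The eleven dots can also be written with a repetition count. The following is a minimal sketch, assuming Python 3; split_every is a hypothetical helper name, not part of the answer above. It also shows one thing to be aware of: re.findall only returns complete matches, so a trailing fragment shorter than the chunk width is dropped unless the pattern allows shorter matches.

```python
import re

def split_every(text, width):
    """Return the complete fixed-width chunks of text (hypothetical helper)."""
    return re.findall('.{%d}' % width, text)

ids = '01234567890' * 3
chunks = split_every(ids, 11)
joined = ','.join(chunks)
print(joined)

# A trailing fragment shorter than the width is silently dropped by '.{11}' ...
dropped = split_every(ids + 'XYZ', 11)
# ... whereas '.{1,11}' keeps it as a final, shorter chunk.
kept = re.findall('.{1,11}', ids + 'XYZ')
print(dropped)
print(kept)
```

If every string is guaranteed to be a multiple of 11 characters long, both patterns give the same result.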
Don't write the loops yourself; use Python libraries or built-ins, it will be easier. For example:
def StringSplit(StrO,X):
    substring_starts = range(0, len(StrO), X)
    substrings = (StrO[start:start + X] for start in substring_starts)
    return ','.join(substrings)
string = '1234567890ABCDE'
print(StringSplit(string, 5))
# '12345,67890,ABCDE'
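Since the goal is a .csv file, the chunks can also be handed straight to the standard csv module instead of being joined by hand. This is only a sketch under assumptions: split_ids and the sample rows are made up for illustration, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

def split_ids(text, width=11):
    """Slice text into consecutive chunks of `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]

# Stand-in rows; in practice these would come from the attribute table.
rows = ['01234567890', '0123456789001234567890']

# io.StringIO keeps the example self-contained; for a real file one would
# use open('ids.csv', 'w', newline='') instead (file name is hypothetical).
buffer = io.StringIO()
writer = csv.writer(buffer)
for row in rows:
    writer.writerow(split_ids(row))

print(buffer.getvalue())
```

Letting csv.writer place the commas also takes care of quoting, should a field ever contain a comma itself.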
I am trying to convert a string to a list of complex numbers. (If you were to read it without quotes, it would be a list of complex numbers.) I've written a function to do this, but I'm getting this error:
Traceback (most recent call last):
File "complex.py", line 26, in <module>
print(listCmplx('[1.111 + 2.222j, 3.333 + 4.444j]'))
File "complex.py", line 10, in listCmplx
while (not isDigit(listIn[count])) and (listIn[count] != '.'):
IndexError: string index out of range
What am I doing wrong here?
def isDigit(char):
    return char in '0123456789'
def listCmplx(listIn):
    listOut = []
    count = 0
    real = '0'
    imag = '0'
    while count < len(listIn):
        while (not isDigit(listIn[count])) and (listIn[count] != '.'):
            count += 1
        start = count
        while (isDigit(listIn[count])) or (listIn[count] == '.'):
            count += 1
        end = count
        if listIn[count] == 'j':
            imag = listIn[start:end]
        else:
            real = listIn[start:end]
        if listIn[count] == ',':
            listOut += [float(real) + float(imag) * 1j]
            real = '0'
            imag = '0'
    return listOut
print(listCmplx('[1.111 + 2.222j, 3.333 + 4.444j]'))
Thank you in advance.
Amazingly, this is something Python can do without needing any functions written, with its inbuilt complex number class.
listIn = '1.111 + 2.222j, 3.333 + 4.444j'
listOut = eval(listIn)
print(listOut[0])
print(listOut[0].imag,listOut[0].real)
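Because eval executes whatever expression it is given, a more cautious variant may be preferable when the string comes from outside the program. The sketch below assumes Python 3 and uses ast.literal_eval from the standard library, which only accepts literals (including complex numbers written as a real part plus or minus an imaginary part).

```python
import ast

text = '[1.111 + 2.222j, 3.333 + 4.444j]'

# literal_eval parses Python literals only; it does not run arbitrary code.
numbers = ast.literal_eval(text)

print(numbers)
print(numbers[0].real, numbers[0].imag)
```

Unlike the eval example, this keeps the surrounding brackets and returns a list rather than a tuple.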
Your original parsing problem is a good example because it highlights the importance, whenever possible, of using the simplest, highest-level parsing tools available. Simple, high-level tools include basic things like splitting, stripping, and string indexing. Regex might be considered a mid-level tool, and it's certainly a more complex one. The lowest-level tool -- and the one you chose -- was character by character analysis. Never do that unless you are absolutely forced to by the problem at hand.
Here's one way to parse your example input with simple tools:
# Helper function to take a string and return a complex number.
def s2complex(s):
    r, _, i = s.split()
    return complex(float(r), float(i[:-1]))
# Parse the input.
raw = '[1.111 + 2.222j, 3.333 + 4.444j]'
xs = raw[1:-1].split(', ')
nums = [s2complex(x) for x in xs]
# Check.
for n in nums:
    print(n)
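A variation on the same idea is to let the complex() constructor do the number parsing. This is a sketch, assuming every item has the form "a + bj"; parse_complex_list is a hypothetical name. The constructor does not accept spaces around the sign, so they are removed first.

```python
def parse_complex_list(text):
    """Parse a string like '[a + bj, c + dj]' into a list of complex numbers."""
    items = text.strip('[]').split(',')
    # complex() rejects inner spaces such as '1.111 + 2.222j', so remove them.
    return [complex(item.replace(' ', '')) for item in items]

nums = parse_complex_list('[1.111 + 2.222j, 3.333 + 4.444j]')
for n in nums:
    print(n)
```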
I'm trying to format any number by inserting ',' every 3 digits from the end, without using format()
123456789 becomes 123,456,789
1000000 becomes 1,000,000
What I have so far only seems to go from the start, I've tried different ideas to get it to reverse but they seem to not work as I hoped.
def format_number(number):
    s = [x for x in str(number)]
    for a in s[::3]:
        if s.index(a) is not 0:
            s.insert(s.index(a), ',')
    return ''.join(s)
print(format_number(1123456789))
>> 112,345,678,9
But obviously what I want is 1,123,456,789
I tried reversing the range [:-1:3] but I get 112,345,6789
Clarification: I don't want to use format to structure the number, I'd prefer to understand how to do it myself just for self-study's sake.
Here is a solution for you, without using format():
def format_number(number):
    s = list(str(number))[::-1]
    o = ''
    for a in range(len(s)):
        if a and a % 3 == 0:
            o += ','
        o += s[a]
    return o[::-1]
print(format_number(1123456789))
And here is the same result using the built-in str.format:
def format_number(number):
    return '{:,}'.format(number)
print(format_number(1123456789))
I hope this helps. :D
One way to do it without built-in functions at all...
def format_number(number):
    i = 0
    r = ""
    while True:
        r = "0123456789"[number % 10] + r
        number //= 10
        if number == 0:
            return r
        i += 1
        if i % 3 == 0:
            r = "," + r
Here's a version that's almost free of built-in functions or methods (it does still have to use str)
def format_number(number):
    i = 0
    r = ""
    for character in str(number)[::-1]:
        if i > 0 and i % 3 == 0:
            r = "," + r
        r = character + r
        i += 1
    return r
Another way to do it without format but with other built-ins is to reverse the number, split it into chunks of 3, join them with a comma, and reverse it again.
def format_number(number):
    backward = str(number)[::-1]
    r = ",".join(backward[i:i+3] for i in range(0, len(backward), 3))
    return r[::-1]
Your current approach has the following drawbacks:
checking for equality/inequality in most cases (especially for int) should be done with the ==/!= operators, not the is/is not ones,
list.index returns the first occurrence from the left end (so s.index('1') will always be 0 in your example); we can iterate over a range of indices instead (using the range built-in).
We can have something like
def format_number(number):
    s = [x for x in str(number)]
    for index in range(len(s) - 3, 0, -3):
        s.insert(index, ',')
    return ''.join(s)
Test
>>> format_number(1123456789)
'1,123,456,789'
>>> format_number(6789)
'6,789'
>>> format_number(135)
'135'
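One case the tests above do not cover is a negative number: with the function as written, the minus sign is counted as a digit, so -123456 comes out as '-,123,456'. A possible way around that, sketched here for integers only and with format_signed as a hypothetical name, is to set the sign aside before grouping.

```python
def format_signed(number):
    """Group the digits of an integer in threes, keeping any leading '-'."""
    text = str(number)
    sign = ''
    if text.startswith('-'):
        sign, text = '-', text[1:]
    digits = list(text)
    # Walk from the right in steps of three, exactly as in the answer above.
    for index in range(len(digits) - 3, 0, -3):
        digits.insert(index, ',')
    return sign + ''.join(digits)

print(format_signed(1123456789))
print(format_signed(-123456))
print(format_signed(-135))
```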
If range, list.insert and str.join are not allowed
We can replace
range with while loop,
list.insert using slicing and concatenation,
str.join with concatenation,
like
def format_number(number):
    s = [x for x in str(number)]
    index = len(s) - 3
    while index > 0:
        s = s[:index] + [','] + s[index:]
        index -= 3
    result = ''
    for character in s:
        result += character
    return result
Using str.format
Finally, according to the docs:
The ',' option signals the use of a comma for a thousands separator. For a locale aware separator, use the 'n' integer presentation type instead.
your function can be simplified to
def format_number(number):
    return '{:,}'.format(number)
and it will even work for floats.
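A few illustrative calls, assuming a Python version where the ',' option is available (2.7 or 3.1 and later); the variable names are only for the example. The same format specification also works with the built-in format() function and can be combined with a precision.

```python
grouped_int = '{:,}'.format(1123456789)
grouped_float = '{:,.2f}'.format(1234567.891)   # separator plus two decimals
grouped_negative = format(-6789, ',')           # same spec via format()

print(grouped_int)
print(grouped_float)
print(grouped_negative)
```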
I have three lists in text files and I am trying to generate a four-word message randomly, choosing from the prefix and suprafix lists for the first three words and from the suffix file for the fourth word.
However, I want to prevent it from picking a word that was already chosen by the random.choice function.
import random
a= random.random
prefix = open('prefix.txt','r').readlines()
suprafix = open('suprafix.txt','r').readlines()
suffix = open('suffix.txt','r').readlines()
print (random.choice(prefix + suprafix), random.choice(prefix + suprafix), random.choice(prefix + suprafix), random.choice(suffix))
As you can see it chooses randomly from those two lists for three words.
random.sample(pop, k) selects k items from pop without replacement. Hence:
prefix1, prefix2, prefix3 = random.sample(prefix, 3)
suprafix1, suprafix2, suprafix3 = random.sample(suprafix, 3)
suffix = random.choice(suffix)
print (prefix1 + suprafix1, prefix2 + suprafix2, prefix3 + suprafix3, suffix)
Thank you xnx, that helped me sort out the problem: I used random.sample first and then printed either of the two afterwards. I might have done it the long way round, but this is how I did it:
import random
a= random.random
prefix = open('prefix.txt','r').readlines()
suprafix = open('suprafix.txt','r').readlines()
suffix = open('suffix.txt','r').readlines()
prefix1, prefix2, prefix3 = random.sample(prefix, 3)
suprafix1, suprafix2, suprafix3 = random.sample(suprafix, 3)
suffix = random.choice(suffix)
one = prefix1, suprafix1
two = prefix2, suprafix2
three = prefix3, suprafix3
print (random.choice(one), random.choice(two), random.choice(three), suffix)
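If the aim is simply three different words drawn from the two lists combined, the sampling can be done on the merged pool in one step. The sketch below is an assumption-laden illustration: the word lists are made-up stand-ins for the file contents, and it presumes no word appears in both files (otherwise the pool could be passed through set() first).

```python
import random

# Stand-ins for the lines read from prefix.txt, suprafix.txt and suffix.txt.
prefix = ['alpha\n', 'beta\n', 'gamma\n']
suprafix = ['delta\n', 'epsilon\n', 'zeta\n']
suffix = ['omega\n', 'sigma\n']

# readlines() keeps the trailing newline, so strip it before use.
pool = [word.strip() for word in prefix + suprafix]
endings = [word.strip() for word in suffix]

# sample() never returns the same list position twice.
first, second, third = random.sample(pool, 3)
message = ' '.join([first, second, third, random.choice(endings)])
print(message)
```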
I would like to create a binary puzzle with python.
At the moment I already made a 6x6, 8x8 and 10x10 layout which is shown based on the difficulty that the players wishes to play. The purpose of the puzzle can be compared with a game of sudoku, you want to input either 0 or 1 on a given location by the player. Below you will find what I currently have for the layout.
if graad == 1:
    easy = [['A', 'B', 'C', 'D', 'E'],
            ['_','_','_','_','_','_','_'],
            [0,1,0,1,0,1,' |1'],
            [1,0,1,0,1,0,' |2'],
            [0,1,0,1,0,1,' |3'],
            [1,0,1,0,1,0,' |4'],
            [0,1,0,1,0,1,' |5'],
            [1,0,1,0,1,0,' |6']]
    i = 0
    while i < len(easy):
        j = 0
        s = ""
        while j < len(easy[i]):
            s = s + str(easy[i][j]) + " "
            j = j + 1
        print (s)
        i = i + 1
Now the problem that I am facing is: how can I let Python know that a player has filled in, for example, the cell in column C (position 3) and row 5 with a 0?
I was thinking of an IF statement that checks the input on either a A, B, C D, E... Row 1,2,3,4,5.. but that is going to be a lot of if statements.
Edit1: OK, so to clarify. I wanted to post a picture but need more posts.
For example, I have a game board of 6x6 cells. Some of them are filled with a 1 and some of them are filled with 0 and most of them are empty because the goal is to have it look in the end like my layout in the python code.(That's the solution). So you want the user to fill in those empty cells.
Now, let's say that the player wants to fill in A-1 with a 1, how will python know that input A-1 is linked to index [0][0] in the list?
A simple way to convert your letter indices to numbers is to use the ord() function, which returns the numerical code of a single character. Since you are using upper-case letters, with 'A' being the label for the column with index 0, you can do
column = ord(letter) - ord('A')
That will convert 'A' to 0, 'B' to 1, etc.
Here's a short example program vaguely based on the code on your question.
It accepts moves in the form A10 to set location A1 to '1', 'B30' to set location B3 to '0'. It accepts lower case letters, too, so 'd11' is the same as 'D11'. Hit Ctrl-C to exit.
Tested on Python 2.6.6, but it should work correctly on Python 3. (To run it on Python 2, change input() to raw_input()).
#! /usr/bin/env python
def print_grid(g):
    gsize = len(g)
    base = ord('A')
    print(' '.join([chr(base + i) for i in range(gsize)]))
    print((gsize * 2) * '-')
    for i, row in enumerate(g, 1):
        print(' '.join(row) + ' | ' + str(i))
    print('\n')
def main():
    gsize = 9
    rowstr = gsize * '_'
    grid = [list(rowstr) for i in range(gsize)]
    print_grid(grid)
    while True:
        move = input('Enter move: ')
        letter, number, bit = move.strip()
        col = ord(letter.upper()) - ord('A')
        row = int(number) - 1
        grid[row][col] = bit
        print_grid(grid)
if __name__ == "__main__":
    main()
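The program above unpacks the move into exactly three characters, so it stops with an exception on malformed input and cannot address row 10 of a 10x10 board. A possible refinement, sketched here with a hypothetical parse_move helper, is to validate the move and allow multi-digit row numbers.

```python
def parse_move(move, size):
    """Turn a move such as 'C50' into (row, col, bit) with zero-based indices."""
    move = move.strip().upper()
    if len(move) < 3:
        raise ValueError('expected a letter, a row number and a bit')
    # First character: column letter; last character: bit; the rest: row.
    letter, number, bit = move[0], move[1:-1], move[-1]
    col = ord(letter) - ord('A')
    if not number.isdigit() or bit not in '01':
        raise ValueError('row must be a number and bit must be 0 or 1')
    row = int(number) - 1
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError('move is outside the grid')
    return row, col, bit

print(parse_move('C50', 6))     # column C, row 5, value 0
print(parse_move('a101', 10))   # column A, row 10, value 1
```

The main loop could then wrap the call in try/except ValueError and ask the player again instead of exiting.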
If you use a pandas DataFrame to hold the correct answer of the game, you can easily check things. The pandas package has good documentation (and a lot of Q&A here on Stack Overflow).
The setup of your correct answer:
import pandas as pd
data = [[0,1,0,1,0,1],
[1,0,1,0,1,0],
[0,1,0,1,0,1],
[1,0,1,0,1,0],
[0,1,0,1,0,1],
[1,0,1,0,1,0]]
easy = pd.DataFrame(data)
easy.columns = ['A','B','C','D','E','F']
print(easy)
The item at position 'A',0 (python starts to number from 0) is given by easy['A'][0]. For more information about indexing a pandas DataFrame object visit the documentation.
Another useful thing: a DataFrame object is printable, making it unnecessary to write a print routine yourself.
If using DataFrames is overkill for you, another option is to work with a 'translation' dictionary. This dictionary will use the letters for keys and the corresponding column number as a value.
>>> column = {'A':0, 'B':1, 'C':2, 'D':3, 'E':4, 'F':5}
>>> print(column['A'])
0
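The translation dictionary does not have to be typed out by hand. As a small sketch (the size and grid variables are assumptions for illustration), it can be built from string.ascii_uppercase for any board size, and a move such as A-1 then maps to index [0][0].

```python
import string

size = 6

# Map 'A'..'F' to 0..5 for a 6x6 board.
column = {letter: index
          for index, letter in enumerate(string.ascii_uppercase[:size])}

# An empty board; None marks a cell the player has not filled in yet.
grid = [[None] * size for _ in range(size)]

# Player fills in A-1 with a 1: row numbers start at 1, list indices at 0.
row, col = 1 - 1, column['A']
grid[row][col] = 1

print(column)
print(grid[0])
```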
I have a parser that reads in a long octet string, and I want it to print out smaller strings based on the parsing details.
It reads in a hex string in a format like so:
01046574683001000004677265300000000266010000
The format of the interface contained in the hex is like so:
version:length_of_name:name:op_status:priority:reserved_byte
==
01:04:65746830:01:00:00
== (when converted from hex)
01:04:eth0:01:00:00
^ This is one segment of the string; it represents eth0 (I inserted the : to make it easier to read). At the moment, however, my code returns a blank list, and I don't know why. Can somebody help me, please?
def octetChop(long_hexstring, from_ssh_):
    startpoint_of_interface_def=0
    # As of 14/8/13 , the network operator has not been implemented
    network_operator_implemented=False
    version_has_been_read = False
    position_of_interface=0
    chopped_octet_list = []
    #This while loop moves through the string of the interface, based on the full length of the container
    try:
        while startpoint_of_interface_def < len(long_hexstring):
            if version_has_been_read == True:
                pass
            else:
                if startpoint_of_interface_def == 0:
                    startpoint_of_interface_def = startpoint_of_interface_def + 2
                    version_has_been_read = True
            endpoint_of_interface_def = startpoint_of_interface_def+2
            length_of_interface_name = long_hexstring[startpoint_of_interface_def:endpoint_of_interface_def]
            length_of_interface_name_in_bytes = int(length_of_interface_name) * 2 #multiply by 2 because its calculating bytes
            end_of_interface_name_point = endpoint_of_interface_def + length_of_interface_name_in_bytes
            hex_name = long_hexstring[endpoint_of_interface_def:end_of_interface_name_point]
            text_name = hex_name.decode("hex")
            print "the text_name is " + text_name
            operational_status_hex = long_hexstring[end_of_interface_name_point:end_of_interface_name_point+2]
            startpoint_of_priority = end_of_interface_name_point+2
            priority_hex = long_hexstring[startpoint_of_priority:startpoint_of_priority+2]
            #Skip the reserved byte
            network_operator_length_startpoint = startpoint_of_priority+4
            single_interface_string = long_hexstring[startpoint_of_interface_def:startpoint_of_priority+4]
            print single_interface_string + " is chopped from the octet string"# - keep for possible debugging
            startpoint_of_interface_def = startpoint_of_priority+4
            if network_operator_implemented == True:
                network_operator_length = long_hexstring[network_operator_length_startpoint:network_operator_length_startpoint+2]
                network_operator_length = int(network_operator_length) * 2
                network_operator_start_point = network_operator_length_startpoint+2
                network_operator_end_point = network_operator_start_point + network_operator_length
                network_operator = long_hexstring[network_operator_start_point:network_operator_end_point]
                #
                single_interface_string = long_hexstring[startpoint_of_interface_def:network_operator_end_point]
                #set the next startpoint if there is one
                startpoint_of_interface_def = network_operator_end_point+1
            else:
                self.network_operator = None
            print single_interface_string + " is chopped from the octet string"# - keep for possible debugging
            #This is where each individual interface is stored, in a list for comparison.
            chopped_octet_list.append(single_interface_string)
    finally:
        return chopped_octet_list
The reason your code is returning a blank list is the following. In these lines:
            else:
                self.network_operator = None
self is not defined, so you get a NameError exception. This means that the try jumps directly to the finally clause without ever executing the part where you do:
            chopped_octet_list.append(single_interface_string)
As a consequence the list remains empty, and because the finally clause returns, the exception itself is silently discarded. In any case the code is overly complicated for such a task; I would follow one of the other answers.
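The behaviour is easy to reproduce in isolation. The sketch below uses made-up function names and an undefined name in place of self; it shows that a return inside finally discards the pending exception, and that catching the exception explicitly makes the problem visible. Note that recent Python versions may emit a SyntaxWarning for a return inside finally.

```python
def collect():
    results = []
    try:
        results.append('first')
        undefined_name.attribute = None   # raises NameError, like self above
        results.append('second')          # never reached
    finally:
        return results                    # discards the pending NameError

def collect_visible():
    results = []
    try:
        results.append('first')
        undefined_name.attribute = None
    except NameError as error:
        print('caught:', error)           # the error is now reported
    return results

print(collect())
print(collect_visible())
```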
I hope I got you right. You got a hex-string which contains various interface definition. Inside each interface definition the second octet describes the length of the name of the interface.
Let's say the string contains the interfaces eth0 and eth01 and looks like this (length 4 for eth0 and length 5 for eth01):
01046574683001000001056574683031010000
Then you can split it like this:
def splitIt (s):
    tokens = []
    while s:
        length = int (s [2:4], 16) * 2 + 10 #name length * 2 + 10 digits for rest
        tokens.append (s [:length] )
        s = s [length:]
    return tokens
This yields:
['010465746830010000', '01056574683031010000']
To add onto Hyperboreus's answer, here's a simple way to parse the interface strings once you split them:
def parse(s):
    version = int(s[:2], 16)
    name_len = int(s[2:4], 16)
    name_end = 4 + name_len * 2
    name = s[4:name_end].decode('hex')
    op_status = int(s[name_end:name_end+2], 16)
    priority = int(s[name_end+2:name_end+4], 16)
    reserved = s[name_end+4:name_end+6]
    return version, name_len, name, op_status, priority, reserved
Here's the output:
>>> parse('010465746830010000')
(1, 4, 'eth0', 1, 0, '00')
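The str.decode('hex') call above only exists in Python 2. For Python 3, roughly the same parsing can be sketched with bytes.fromhex; parse_interface is a hypothetical name, and the name field is assumed to be ASCII.

```python
def parse_interface(record):
    """Decode one interface record given as a hex string (Python 3)."""
    data = bytes.fromhex(record)
    # Indexing a bytes object yields integers, so no int(..., 16) is needed.
    version, name_len = data[0], data[1]
    name_end = 2 + name_len
    name = data[2:name_end].decode('ascii')
    op_status, priority = data[name_end], data[name_end + 1]
    # Keep the reserved byte as a two-digit hex string, as in the answer above.
    reserved = data[name_end + 2:name_end + 3].hex()
    return version, name_len, name, op_status, priority, reserved

print(parse_interface('010465746830010000'))
```

Working on bytes rather than on the hex text means every offset counts octets, which avoids the repeated multiplication by two.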
Check if the following helps. Call the parse function below with the hex string, then iterate over the result to get the card infos (I hope I understood you correctly :)). parse yields one tuple of the desired info per card.
>>> def getbytes(hs):
        """Returns a generator of bytes from a hex string"""
        return (int(hs[i:i+2],16) for i in range(0,len(hs)-1,2))
>>> def get_single_card_info(g):
        """Fetches a single card info from a byte generator"""
        v = g.next()
        l = g.next()
        name = "".join(chr(x) for x in map(lambda y: y.next(),[g]*l))
        return (str(v),name,g.next(),g.next(),g.next())
>>> def parse(hs):
        """Parses a hex string stream and returns a generator of card infos"""
        bs = getbytes(hs)
        while True:
            yield get_single_card_info(bs)
>>> c = 1
>>> for card in parse("01046574683001000001056574683031010000"):
        print "Card:{0} -> Version:{1}, Id:{2}, Op_stat:{3}, priority:{4}, reserved:{5} bytes".format(c,*card)
        c = c + 1
Card:1 -> Version:1, Id:eth0, Op_stat:1, priority:0, reserved:0 bytes
Card:2 -> Version:1, Id:eth01, Op_stat:1, priority:0, reserved:0 bytes
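The session above is Python 2: it relies on g.next() and on StopIteration leaking out of the generator to end the loop, which Python 3.7 and later turn into a RuntimeError. A comparable Python 3 sketch, with iter_records as a hypothetical name and the record layout taken from the question, walks the decoded bytes with an explicit position instead.

```python
def iter_records(hex_stream):
    """Yield (version, name, op_status, priority, reserved) tuples."""
    data = bytes.fromhex(hex_stream)
    pos = 0
    while pos < len(data):
        version, name_len = data[pos], data[pos + 1]
        name_end = pos + 2 + name_len
        name = data[pos + 2:name_end].decode('ascii')
        # Three trailing octets: op_status, priority, reserved.
        op_status, priority, reserved = data[name_end:name_end + 3]
        yield version, name, op_status, priority, reserved
        pos = name_end + 3

records = list(iter_records('01046574683001000001056574683031010000'))
for number, record in enumerate(records, 1):
    print('Card:{0} -> {1}'.format(number, record))
```

A truncated input would raise an exception here rather than stop quietly, which is usually what one wants when parsing a binary format.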
Pyparsing includes a built-in expression for parsing a counted array of elements, so this would take care of your 'name' field nicely. Here's the whole parser:
from pyparsing import Word,hexnums,countedArray
# read in 2 hex digits, convert to integer at parse time
octet = Word(hexnums,exact=2).setParseAction(lambda t:int(t[0],16))
# read in a counted array of octets, convert to string
nameExpr = countedArray(octet, intExpr=octet)
nameExpr.setParseAction(lambda t: ''.join(map(chr,t[0])))
# define record expression, with named results
recordExpr = (octet('version') + nameExpr('name') + octet('op_status') +
              octet('priority'))  # + octet('reserved')
Parsing your sample:
sample = "01046574683001000004677265300000000266010000"
for rec in recordExpr.searchString(sample):
    print(rec.dump())
Gives:
[1, 'eth0', 1, 0]
- name: eth0
- op_status: 1
- priority: 0
- version: 1
[0, 'gre0', 0, 0]
- name: gre0
- op_status: 0
- priority: 0
- version: 0
[0, 'f\x01', 0, 0]
- name: f
- op_status: 0
- priority: 0
- version: 0
The dump() method shows results names that you can use to access the individually parsed bits, like rec.name or rec.version.
(I commented out the reserved byte, else the second entry wouldn't parse correctly. Also, the third entry contains a name with a \x01 byte.)