Parsing a file without BeautifulSoup - python
I've been working on this small piece for hours now and couldn't find a solution, and it should be simple. This time, I'll post the actual code, and not simple examples, as somehow I can't get the examples to work with the real code.
I'm trying to do this with built-in modules (though if you have the answer using bs4 I'd like to know it as well). It should be a simple thing.
I have two files, an HTML file that goes like this.
<b>Match #139</b></font></td></tr><tr bgcolor="#EEEEEE"><td align="CENTER" width="10%"><font color="Green" face="Tahoma,Arial" size="2"><b>Yes</b></font></td><td nowrap=""> <font face="Tahoma,Arial" size="2">3822pb01 </font></td><td><font face="Tahoma,Arial" size="2"><b>Door 1 x 3 x 1 Left with 'POLICE' Pattern</b></font><font class="fv"><br>Catalog: Parts: Door, Decorated</font></td><td nowrap=""><font class="fv"> </font></td></tr><tr bgcolor="#FFFFFF"><td align="CENTER" width="10%"><font color="Green" face="Tahoma,Arial" size="2"><b>Yes</b></font></td><td nowrap=""> <font face="Tahoma,Arial" size="2">3821pb01 </font></td><td><font face="Tahoma,Arial" size="2"><b>Door 1 x 3 x 1 Right with 'POLICE' Pattern</b></font><font class="fv"><br>Catalog: Parts: Door, Decorated</font></td><td nowrap=""><font class="fv"> </font></td></tr><tr bgcolor="#5E5A80"><td colspan="4"><font face="Tahoma,Arial" size="2" color="#FFFFFF"> <b>Match #140</b></font></td></tr><tr bgcolor="#EEEEEE"><td align="CENTER" width="10%"><font color="Green" face="Tahoma,Arial" size="2"><b>Yes</b></font></td><td nowrap=""> <font face="Tahoma,Arial" size="2">3822pb02 </font></td><td><font face="Tahoma,Arial" size="2"><b>Door 1 x 3 x 1 Left with Classic Fire Logo Pattern</b></font><font class="fv"><br>Catalog: Parts: Door, Decorated</font></td><td nowrap=""><font class="fv"> </font></td></tr><tr bgcolor="#FFFFFF"><td align="CENTER" width="10%"><font color="Green" face="Tahoma,Arial" size="2"><b>Yes</b></font></td><td nowrap=""> <font face="Tahoma,Arial" size="2">3821pb02 </font></td><td><font face="Tahoma,Arial" size="2"><b>Door 1 x 3 x 1 Right with Classic Fire Logo Pattern</b></font><font class="fv"><br>Catalog: Parts: Door, Decorated</font></td><td nowrap=""><font class="fv"> </font></td></tr><tr bgcolor="#5E5A80"><td colspan="4"><font face="Tahoma,Arial" size="2" color="#FFFFFF"> <b>
Please don't kill me, yes, it's only a line. You can paste it into some code editor to see it in multiple lines. The file continues with more "Matches".
I want to do two things.
First, I want to create a dictionary that uses the match number as its key. So, for example, it would be
matches = {'139' : 'etc', '140' : 'etc'}
And then, if you look at the HTML, after the first link after the Match there is a part number; for example, the first one is 3822pb01. There are usually 2 part numbers inside a match, and I want to store those 2 part numbers as a list inside the dict.
matches = {'139' : ['3822pb01', '3821pb01'], '140' : ['3822pb02', '3821pb02']}
So far, I have been able to strip out the part numbers, or the Match #'s, but not correlate the part #'s and the Match #'s.
Could someone help me approach this? - it runs a little away from my current knowledge.
Here's the full HTML file - http://pastebin.com/raw.php?i=eWWh4XfM - HTML doesn't have the best formatting
Using BeautifulSoup:
import re
from bs4 import BeautifulSoup

matches = {}
_catalog_link = re.compile(r'^http://www\.bricklink\.com/catalogItem\.asp\?P=')

soup = BeautifulSoup(htmlpage)

for match in soup.find_all(text=re.compile(r'Match #\d+')):
    match_number = match.string.split('#', 1)[-1]
    matches[match_number] = matched_links = []

    # Find the parent table row
    row = next(p for p in match.parents if p.name == 'tr')

    # next rows hold the links
    for sibling in row.next_siblings:
        if sibling.name != 'tr':
            continue
        links = sibling.find_all('a', href=_catalog_link)
        if not links:
            break
        matched_links.extend(l.string for l in links)
This produces:
{u'139': [u'3822pb01', u'3821pb01'],
u'140': [u'3822pb02', u'3821pb02'],
u'141': [u'3822pb06', u'3821pb06'],
u'142': [u'3822p03', u'3821p03'],
u'143': [u'3822p24', u'3821p24'],
u'144': [u'3822pb05', u'3821pb05'],
u'145': [u'3822pb04', u'3821pb04'],
u'146': [u'3822px1', u'3821px1'],
u'147': [u'3822', u'3821'],
u'148': [u'3189', u'3188'],
u'149': [u'801a', u'802a'],
u'150': [u'801', u'802'],
u'151': [u'445', u'446'],
u'152': [u'825', u'826'],
u'153': [u'825p01', u'826p01'],
u'154': [u'825p02', u'826p02'],
u'155': [u'3195', u'3194'],
u'156': [u'30231pb02', u'30231pb01'],
u'158': [u'30230px1', u'30230px2'],
u'159': [u'3936', u'3935'],
u'160': [u'30355', u'30356'],
u'161': [u'3586', u'3585'],
u'162': [u'3933', u'3934'],
u'164': [u'981', u'982'],
u'165': [u'43369', u'43368'],
u'166': [u'972', u'971'],
u'167': [u'972pa2', u'971pa2'],
u'168': [u'972p4f', u'971p4f'],
u'169': [u'972p63', u'971p63'],
u'170': [u'30073', u'30074'],
u'171': [u'6128', u'6127'],
u'172': [u'4466', u'4467'],
u'173': [u'fabah1', u'fabah2'],
u'174': [u'x46', u'x48'],
u'175': [u'4181', u'4182'],
u'176': [u'4181p05', u'4182p05'],
u'177': [u'4181pb01', u'4182pb01'],
u'178': [u'4181p02', u'4182p02'],
u'179': [u'4181p06', u'4182p06'],
u'180': [u'4181p04', u'4182p04'],
u'181': [u'4181px1', u'4182px1'],
u'182': [u'4181p03', u'4182p03'],
u'183': [u'4181p01', u'4182p01'],
u'184': [u'4181p07', u'4182p07'],
u'185': [u'3195px1', u'3194px1'],
u'186': [u'32190', u'32191'],
u'187': [u'32188', u'32189'],
u'188': [u'32527', u'32528'],
u'189': [u'32534', u'32535'],
u'190': [u'44350', u'44351'],
u'191': [u'44352', u'44353'],
u'192': [u'47712', u'47713'],
u'193': [u'42061', u'42060'],
u'194': [u'43710', u'43711'],
u'195': [u'41765', u'41764'],
u'196': [u'41748', u'41747'],
u'197': [u'41750', u'41749'],
u'198': [u'6565', u'6564'],
u'199': [u'41770', u'41769'],
u'200': [u'43723', u'43722'],
u'201': [u'43721', u'43720'],
u'202': [u'41768', u'41767'],
u'203': [u'3069bps5', u'3069bps4'],
u'204': [u'42061pb03', u'42060pb03'],
u'205': [u'42061pb05', u'42060pb05'],
u'206': [u'3005pb001', u'3005pb002'],
u'207': [u'48288pb02', u'48288pb01'],
u'208': [u'2582pb03', u'2582pb04'],
u'209': [u'712', u'713'],
u'211': [u'3039px17', u'3039px18'],
u'212': [u'3037px5', u'3037px6'],
u'213': [u'3037px3', u'3037px4'],
u'214': [u'30249pb02', u'30249pb01'],
u'215': [u'42022pb09', u'42022pb08'],
u'216': [u'42022pb05', u'42022pb06'],
u'217': [u'30647pb05', u'30647pb04'],
u'218': [u'30647pb01', u'30647pb02'],
u'219': [u'30647pb07', u'30647pb06'],
u'220': [u'30647px1', u'30647px2'],
u'221': [u'2744pb02', u'2744pb01'],
u'222': [u'42061px5', u'42060px5'],
u'223': [u'42061pb01', u'42060pb01'],
u'224': [u'42061px1', u'42060px1'],
u'225': [u'41748pb05', u'41747pb05'],
u'226': [u'41748pb16', u'41747pb16'],
u'227': [u'41748pb12', u'41747pb12'],
u'228': [u'41748pb15', u'41747pb15'],
u'229': [u'41748pb07', u'41747pb07'],
u'230': [u'41748px1', u'41747px1'],
u'231': [u'41748pb06', u'41747pb06'],
u'232': [u'41748pb14', u'41747pb14'],
u'233': [u'41748pb02', u'41747pb02'],
u'234': [u'41748pb04', u'41747pb04'],
u'235': [u'41748pb09', u'41747pb09'],
u'236': [u'41748pb08', u'41747pb08'],
u'237': [u'41748pb11', u'41747pb11'],
u'238': [u'41748pb03', u'41747pb03'],
u'239': [u'41748pb13', u'41747pb13'],
u'240': [u'41748pb10', u'41747pb10'],
u'241': [u'41750px2', u'41749px2'],
u'242': [u'41750pb01', u'41749pb01'],
u'243': [u'6565pb01', u'6564pb01'],
u'244': [u'4864bp10', u'4864bp11'],
u'245': [u'4864pb006L', u'4864pb006R'],
u'246': [u'2362pb04', u'2362pb05'],
u'247': [u'4215ap06', u'4215ap04'],
u'248': [u'4215ap24', u'4215ap25'],
u'249': [u'4215pb021', u'4215pb022'],
u'250': [u'4215ap07', u'4215ap05'],
u'251': [u'30117pb02L', u'30117pb02R'],
u'252': [u'30117pb03L', u'30117pb03R'],
u'253': [u'30117pb04L', u'30117pb04R'],
u'254': [u'30117pb01', u'30117pb05'],
u'255': [u'30116pb01', u'30116pb02'],
u'256': [u'2468pb02', u'2468pb03'],
u'257': [u'3245apx2', u'3245apx1'],
u'258': [u'4070pb02', u'4070pb01'],
u'259': [u'41855pb09', u'41855pb10'],
u'401': [u'47847pb001L', u'47847pb001R'],
u'418': [u'4460pb01', u'4460pb02'],
u'419': [u'3010pb027', u'3010pb026'],
u'420': [u'3010pb025', u'3010pb024'],
u'421': [u'2341pb02', u'2341pb01'],
u'439': [u'4286pb03', u'4286pb02'],
u'440': [u'41748pb17', u'41747pb17'],
u'472': [u'43710pb01', u'43711pb01'],
u'473': [u'30363pb08', u'30363pb09'],
u'474': [u'50305', u'50304'],
u'475': [u'50955', u'50956'],
u'512': [u'4286pb04', u'4286pb01'],
u'546': [u'47397', u'47398'],
u'572': [u'3193', u'3192'],
u'598': [u'3933a', u'3934a'],
u'606': [u'3822pb07', u'3821pb07'],
u'620': [u'3939px1', u'3939px2'],
u'621': [u'2431px18', u'2431px19'],
u'622': [u'3069bpx57', u'3069bpx56'],
u'643': [u'4215pb015', u'4215pb016'],
u'678': [u'54384', u'54383'],
u'680': [u'42061pb06', u'42060pb06'],
u'681': [u'42061pb02', u'42060pb02'],
u'682': [u'41748pb18', u'41747pb18'],
u'683': [u'41768pb01', u'41767pb01'],
u'684': [u'42061pb07', u'42060pb07'],
u'685': [u'48933pb02', u'48933pb03'],
u'686': [u'3622pb011', u'3622pb012'],
u'687': [u'3010pb055L', u'3010pb055R'],
u'688': [u'3008pb038', u'3008pb039'],
u'689': [u'3822pb08', u'3821pb08'],
u'690': [u'3822pb09', u'3821pb09'],
u'691': [u'3822pb10', u'3821pb10'],
u'692': [u'3189pb01', u'3188pb01'],
u'693': [u'3193pb01', u'3192pb01'],
u'694': [u'3193pb02', u'3192pb02'],
u'695': [u'3195pb01', u'3194pb01'],
u'696': [u'4864apx10', u'4864apx11'],
u'697': [u'4215pb029', u'4215pb030'],
u'700': [u'2362pb10', u'2362pb11'],
u'701': [u'4286pb06', u'4286pb05'],
u'702': [u'3678apb05', u'3678apb06'],
u'703': [u'3678apb07', u'3678apb08'],
u'704': [u'4460pb04', u'4460pb03'],
u'705': [u'2340pb17L', u'2340pb17R'],
u'706': [u'2340pb21L', u'2340pb21R'],
u'707': [u'2340pb03', u'2340pb02'],
u'708': [u'2340pb11', u'2340pb10'],
u'709': [u'2340pb04', u'2340pb05'],
u'710': [u'2340pb16', u'2340pb15'],
u'711': [u'2340pb07', u'2340pb06'],
u'712': [u'2340pb09', u'2340pb08'],
u'714': [u'2431pb039', u'2431pb040'],
u'727': [u'2431pb025', u'2431pb026'],
u'728': [u'791pb01L', u'791pb01R'],
u'766': [u'3004pb031L', u'3004pb031R'],
u'768': [u'3010pb057L', u'3010pb057R'],
u'769': [u'3009pb071L', u'3009pb071R'],
u'770': [u'3009pb072L', u'3009pb072R'],
u'771': [u'2873pb08L', u'2873pb08R'],
u'772': [u'4286pb07L', u'4286pb07R'],
u'773': [u'4286pb08L', u'4286pb08R'],
u'774': [u'2340pb25L', u'2340pb25R'],
u'775': [u'2340pb23L', u'2340pb23R'],
u'776': [u'3004pb021L', u'3004pb021R'],
u'777': [u'3004pb017L', u'3004pb017R']}
Related
Pytmx rendering animated tiles from Tiled map
So I have been working on a project for some time now and I realy wanted to get animated tiles in to the game. Im creating a 2d pixel art styled game with pygame and Im using the an editor called Tiled to create the map. Tiled generates a .tmx file as well as a .tsx file to be used to render the map. I have gotten the map to render without any problems. The problems comes with rendering animated tiles. They just dont get animated. I understand the basics of how the animation works. I just need to get the first image of the animation, wait the duration between frames and then render the next frame. But I just cant figure out how to get it working. There is minimal documentation of pytmx and how it reads animations from Tiled files. This is the .tmx file: <?xml version="1.0" encoding="UTF-8"?> <map version="1.2" tiledversion="1.3.4" orientation="orthogonal" renderorder="right-down" width="32" height="32" tilewidth="96" tileheight="96" infinite="0" nextlayerid="5" nextobjectid="5"> <tileset firstgid="1" source="Bigger-Textures(96x96).tsx"/> <layer id="1" name="Tile Layer 1" width="32" height="32"> <data encoding="csv"> 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,4,6,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,13,15,1,1,1,1,1,1,12,12,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,13,15,1,1,1,1,1,1,11,12,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,13,25,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,22,23,23,23,23,23,23,23,9,7,23,23,23,23,23,23,24,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,13,15,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,13,15,1,1,1,1,12,3,1,1,1,1,1,1,1,1,1, 
1,1,1,1,1,1,1,1,1,12,12,1,4,5,5,27,25,5,5,6,1,1,12,12,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,11,12,1,13,7,8,9,7,8,9,15,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,12,1,1,13,16,17,18,16,11,18,15,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,13,25,26,27,25,26,27,15,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,13,7,23,23,23,23,23,24,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,13,15,1,1,1,1,1,10,12,12,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,13,15,1,1,1,1,1,12,1,12,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,13,25,5,5,5,5,5,5,5,5,5,6,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,22,23,23,23,23,23,23,23,23,23,23,24,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 </data> </layer> <objectgroup id="4" name="Obstacles"> <object id="1" name="wall" x="0" y="-96" width="3168" height="96"/> <object id="2" name="wall" x="3072" y="0" width="96" height="3168"/> <object id="3" name="wall" x="-96" y="3072" width="3168" height="96"/> <object id="4" name="wall" x="-96" y="-96" width="96" height="3168"/> </objectgroup> </map> And this is the .tsx file: <?xml version="1.0" encoding="UTF-8"?> <tileset version="1.2" tiledversion="1.3.4" name="Bigger-Textures (96x96)" tilewidth="96" tileheight="96" tilecount="81" columns="9"> <image source="../gfx/tiles/tilesheets/Textures-sprite-sheet-4X.png" width="864" 
height="864"/> <tile id="1"> <animation> <frame tileid="1" duration="500"/> <frame tileid="2" duration="500"/> </animation> </tile> <tile id="2"> <animation> <frame tileid="2" duration="500"/> <frame tileid="1" duration="500"/> </animation> </tile> <tile id="9"> <animation> <frame tileid="9" duration="500"/> <frame tileid="10" duration="500"/> </animation> </tile> <tile id="10"> <animation> <frame tileid="10" duration="500"/> <frame tileid="9" duration="500"/> </animation> </tile> </tileset> And this is how I currently render the tiles: def render(self): self.ti = self.handler.currentMap.get_tile_image_by_gid xStart = max(0, self.handler.camera.xOffset / self.handler.currentMap.tilewidth) xEnd = min(self.handler.currentMap.width, (self.handler.camera.xOffset + self.handler.displayWidth) / self.handler.currentMap.tilewidth + 1) yStart = max(0, self.handler.camera.yOffset / self.handler.currentMap.tileheight) yEnd = min(self.handler.currentMap.height, (self.handler.camera.yOffset + self.handler.displayHeight) / self.handler.currentMap.tileheight + 1) for i in range(len(self.handler.currentMap.layers) - 1): for x in range(int(xStart), int(xEnd)): for y in range(int(yStart), int(yEnd)): tile = self.handler.currentMap.get_tile_image(x, y, i) if (tile): self.display.blit(tile, (x * self.handler.currentMap.tilewidth - self.handler.camera.xOffset, y * self.handler.currentMap.tileheight - self.handler.camera.yOffset)) This is what it says on the Pytmx github : # just iterate over animated tiles and demo them # tmx_map is a TiledMap object # tile_properties is a dictionary of all tile properties # iterate over the tile properties for gid, props in tmx_map.tile_properties.items(): # iterate over the frames of the animation # if there is no animation, this list will be empty for animation_frame in props['frames']: # do something with the gid and duration of the frame # this may change in the future, as it is a little awkward now image = tmx_map.get_tile_image_by_gid(gid) 
duration = animation_frame.duration ... Any help is greatly appreciated! Here is the project on GitHub if it is to any use :D
I had fun looking for this today. And it's not that complicated after all. I made something like this in Tiled. Note the animated water tiles. Now look at the code below:

    def update(self, frame):
        if frame in getFrequencyList(6):
            self.current_anim_index += 1
            if self.current_anim_index == 4:
                self.current_anim_index = 0

    def getSurface(self):
        for layer in self.tmx_data.visible_layers:
            for x, y, image in layer.tiles():
                for gid, props in self.tmx_data.tile_properties.items():
                    if image == self.tmx_data.get_tile_image_by_gid(props['frames'][0].gid):
                        image = self.tmx_data.get_tile_image_by_gid(props['frames'][self.current_anim_index].gid)
                        self.surface.blit(image, (x * 16, y * 16))
                    else:
                        self.surface.blit(image, ((x * 16) + layer.offsetx, (y * 16) + layer.offsety))
        return super().getSurface()

When you browse the list of tiles by layer, you get each image by position (x and y). The principle is that, before displaying the current image (the tile), we check whether it is in fact an animation. For that you have to get all the animated tiles. So we do

    for gid, props in self.tmx_data.tile_properties.items():

All animated tiles are in props['frames']. If you look at your tsx file, you could see something like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <tileset version="1.5" tiledversion="1.6.0" name="Overworld (Light)" tilewidth="16" tileheight="16" tilecount="1664" columns="52">
     <image source="Overworld (Light).png" trans="ff00ff" width="832" height="512"/>
     <tile id="30">
      <animation>
       <frame tileid="30" duration="250"/>
       <frame tileid="82" duration="250"/>
       <frame tileid="134" duration="250"/>
       <frame tileid="186" duration="250"/>
      </animation>
     </tile>
    ...

So each entry of props['frames'] represents one frame node of the animation. So

    if image == self.tmx_data.get_tile_image_by_gid(props['frames'][0].gid):

means that the current image is an animated tile. In that case all you need to do is blit one of the tiles in your props['frames'] list.

As you can see, I created a current_anim_index attribute. I vary it in the update method, and I make sure that this is called in my game loop. The frame argument varies from 0 to 60 (yes, 60 FPS), and getFrequencyList(6) returns a list like [0, 10, 20, 30, 40, 50].
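The timing part can be sketched without any pytmx calls. The following is only an illustration: get_frequency_list is a guess at what the unseen getFrequencyList helper does, based on the list quoted in the answer, and frame_index is a hypothetical alternative that picks a frame from elapsed milliseconds and the per-frame durations stored in the .tsx file, instead of counting game-loop frames.

```python
def get_frequency_list(steps, fps=60):
    """Frame numbers at which the animation advances (guessed helper)."""
    return list(range(0, fps, fps // steps))


def frame_index(durations_ms, elapsed_ms):
    """Pick the animation frame that should be visible after elapsed_ms."""
    # Wrap around so the animation loops forever.
    t = elapsed_ms % sum(durations_ms)
    for index, duration in enumerate(durations_ms):
        if t < duration:
            return index
        t -= duration
    return len(durations_ms) - 1


print(get_frequency_list(6))
print(frame_index([250, 250, 250, 250], 260))
```

In a pygame loop the elapsed time could come from pygame.time.get_ticks(), and the returned index would select which entry of props['frames'] to blit. Driving the animation from durations keeps its speed independent of the frame rate.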
Moving and optimizing svg coordinates to a grid in python
I am working with .svg files exported from GIMP. I need the nodes on each path moved to the nearest vertex (on a 70px grid) and remove duplicates. A sample .svg looks like this: <?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"> <svg xmlns="http://www.w3.org/2000/svg" width="1283.95mm" height="1037.04mm" viewBox="0 0 3640 2940"> <path id="Selection" fill="none" stroke="black" stroke-width="1" d="M 1475.00,205.00 C 1475.00,225.21 1471.02,215.59 1471.00,231.00 1471.00,231.00 1471.00,329.00 1471.00,329.00 1471.00,331.56 1470.89,334.55 1471.65,337.00 1472.95,341.22 1476.88,346.28 1481.01,347.99 1484.25,349.34 1489.44,349.00 1493.00,349.00 1493.00,349.00 1515.00,349.00 1515.00,349.00 1515.00,349.00 1660.00,349.00 1660.00,349.00 1662.50,349.00 1665.62,349.11 1668.00,348.35 1682.12,343.87 1679.00,324.35 1679.00,313.00 1679.00,313.00 1679.00,231.00 1679.00,231.00 1678.98,215.59 1675.00,225.21 1675.00,205.25 1675.00,205.25 1685.00,205.25 1685.00,205.25 1685.00,205.25 1698.00,209.00 1698.00,209.00 1698.00,209.00 1731.00,209.00 1731.00,209.00 1731.00,209.00 1755.00,205.00 1755.00,205.00 1755.00,205.00 1755.00,209.00 1755.00,209.00 1754.97,222.24 1751.16,217.55 1751.00,229.00 1751.00,229.00 1751.00,314.00 1751.00,314.00 1751.00,314.00 1751.00,469.00 1751.00,469.00 1751.02,484.41 1755.00,474.79 1755.00,494.75 1755.00,494.75 1745.00,494.75 1745.00,494.75 1745.00,494.75 1732.00,491.00 1732.00,491.00 1732.00,491.00 1630.00,491.00 1630.00,491.00 1630.00,491.00 1605.00,495.00 1605.00,495.00 1605.00,474.79 1608.98,484.41 1609.00,469.00 1609.00,469.00 1609.00,441.00 1609.00,441.00 1608.99,433.51 1607.29,426.29 1599.96,422.45 1596.23,420.50 1590.22,421.00 1586.00,421.00 1586.00,421.00 1559.00,421.00 1559.00,421.00 1547.59,421.06 1541.16,427.47 1541.00,439.00 1541.00,439.00 1541.00,470.00 1541.00,470.00 1541.05,480.13 1542.88,477.97 1544.75,484.00 1545.21,486.62 
1545.00,492.08 1544.75,494.75 1544.75,494.75 1535.00,494.75 1535.00,494.75 1535.00,494.75 1522.00,491.00 1522.00,491.00 1522.00,491.00 1420.00,491.00 1420.00,491.00 1420.00,491.00 1395.00,495.00 1395.00,495.00 1395.00,495.00 1395.00,491.00 1395.00,491.00 1395.03,477.76 1398.84,482.45 1399.00,471.00 1399.00,471.00 1399.00,386.00 1399.00,386.00 1399.00,386.00 1399.00,231.00 1399.00,231.00 1398.98,215.59 1395.00,225.21 1395.00,205.25 1395.00,205.25 1405.00,205.25 1405.00,205.25 1405.00,205.25 1418.00,209.00 1418.00,209.00 1418.00,209.00 1451.00,209.00 1451.00,209.00 1451.00,209.00 1475.00,205.00 1475.00,205.00 Z M 2035.00,1255.00 C 2035.00,1255.00 2035.00,1260.00 2035.00,1260.00 2034.92,1272.31 2031.02,1266.58 2031.00,1281.00 2031.00,1281.00 2031.00,1449.00 2031.00,1449.00 2031.02,1464.41 2035.00,1454.79 2035.00,1474.75 2035.00,1474.75 2025.00,1474.75 2025.00,1474.75 2025.00,1474.75 2012.00,1471.00 2012.00,1471.00 2012.00,1471.00 1910.00,1471.00 1910.00,1471.00 1910.00,1471.00 1885.00,1475.00 1885.00,1475.00 1885.00,1475.00 1885.00,1465.00 1885.00,1465.00 1903.68,1465.00 1894.73,1468.98 1910.00,1469.00 1910.00,1469.00 2010.00,1469.00 2010.00,1469.00 2012.98,1469.00 2016.18,1469.16 2018.99,1467.99 2023.12,1466.28 2027.05,1461.22 2028.35,1457.00 2028.35,1457.00 2029.00,1425.00 2029.00,1425.00 2029.00,1425.00 2029.00,1281.00 2029.00,1281.00 2028.98,1267.66 2023.60,1261.02 2010.00,1261.00 2010.00,1261.00 1916.00,1261.00 1916.00,1261.00 1916.00,1261.00 1770.00,1261.00 1770.00,1261.00 1767.02,1261.00 1763.82,1260.84 1761.01,1262.01 1756.88,1263.72 1752.95,1268.78 1751.65,1273.00 1751.65,1273.00 1751.00,1305.00 1751.00,1305.00 1751.00,1305.00 1751.00,1449.00 1751.00,1449.00 1751.01,1456.49 1752.71,1463.71 1760.04,1467.55 1763.77,1469.50 1769.78,1469.00 1774.00,1469.00 1774.00,1469.00 1800.00,1469.00 1800.00,1469.00 1815.27,1468.98 1806.32,1465.00 1825.00,1465.00 1825.00,1465.00 1825.00,1474.75 1825.00,1474.75 1825.00,1474.75 1815.00,1474.75 1815.00,1474.75 1815.00,1474.75 
1802.00,1471.00 1802.00,1471.00 1802.00,1471.00 1769.00,1471.00 1769.00,1471.00 1769.00,1471.00 1745.00,1475.00 1745.00,1475.00 1745.00,1475.00 1745.00,1470.00 1745.00,1470.00 1745.08,1457.69 1748.98,1463.42 1749.00,1449.00 1749.00,1449.00 1749.00,1281.00 1749.00,1281.00 1748.98,1265.59 1745.00,1275.21 1745.00,1255.00 1745.00,1255.00 1749.00,1255.00 1749.00,1255.00 1749.00,1255.00 1770.00,1259.00 1770.00,1259.00 1770.00,1259.00 2010.00,1259.00 2010.00,1259.00 2010.00,1259.00 2035.00,1255.00 2035.00,1255.00 Z M 1054.75,2095.00 C 1055.00,2097.92 1055.21,2103.38 1054.75,2106.00 1053.06,2111.36 1051.12,2110.34 1051.00,2119.00 1051.00,2119.00 1051.00,2219.00 1051.00,2219.00 1051.02,2234.41 1055.00,2224.79 1055.00,2244.75 1055.00,2244.75 1045.00,2244.75 1045.00,2244.75 1045.00,2244.75 1032.00,2241.00 1032.00,2241.00 1032.00,2241.00 999.00,2241.00 999.00,2241.00 999.00,2241.00 975.25,2245.00 975.25,2245.00 975.00,2242.08 974.79,2236.62 975.25,2234.00 976.94,2228.64 978.88,2229.66 979.00,2221.00 979.00,2221.00 979.00,2121.00 979.00,2121.00 978.98,2105.59 975.00,2115.21 975.00,2095.25 975.00,2095.25 985.00,2095.25 985.00,2095.25 985.00,2095.25 998.00,2099.00 998.00,2099.00 998.00,2099.00 1031.00,2099.00 1031.00,2099.00 1031.00,2099.00 1054.75,2095.00 1054.75,2095.00 Z M 3447.00,2899.00 C 3440.66,2897.20 3422.63,2898.43 3415.00,2898.18 3415.00,2898.18 3379.00,2898.18 3379.00,2898.18 3379.00,2898.18 3366.99,2898.18 3366.99,2898.18 3366.99,2898.18 3361.42,2898.87 3361.42,2898.87 3361.42,2898.87 3355.00,2898.04 3355.00,2898.04 3344.76,2897.54 3340.14,2901.77 3340.00,2912.00 3339.93,2917.84 3339.11,2925.22 3345.10,2928.55 3349.34,2930.90 3353.58,2929.57 3358.00,2929.77 3358.00,2929.77 3366.00,2929.77 3366.00,2929.77 3366.00,2929.77 3413.00,2929.77 3413.00,2929.77 3421.78,2929.42 3439.43,2931.15 3447.00,2929.00 3447.01,2931.77 3446.67,2936.29 3448.74,2938.40 3451.05,2940.76 3464.95,2940.76 3467.26,2938.40 3469.33,2936.29 3468.99,2931.77 3469.00,2929.00 3474.26,2930.50 
3487.74,2929.97 3494.00,2930.00 3504.82,2930.06 3502.45,2932.75 3511.00,2932.98 3523.14,2933.32 3518.81,2930.02 3532.00,2930.00 3532.00,2930.00 3619.00,2930.00 3619.00,2930.00 3630.80,2929.94 3632.98,2926.21 3633.00,2915.00 3633.01,2907.54 3633.86,2898.49 3624.00,2897.23 3618.29,2896.49 3615.45,2899.85 3611.00,2901.09 3611.00,2901.09 3599.00,2901.09 3599.00,2901.09 3599.00,2901.09 3576.00,2901.09 3576.00,2901.09 3569.17,2899.98 3573.24,2897.81 3559.00,2898.00 3550.73,2898.11 3551.68,2899.93 3545.00,2901.12 3545.00,2901.12 3530.00,2901.12 3530.00,2901.12 3530.00,2901.12 3502.00,2901.12 3502.00,2901.12 3502.00,2901.12 3492.00,2901.88 3492.00,2901.88 3484.85,2901.36 3481.54,2895.44 3469.00,2899.00 3468.97,2895.56 3469.08,2892.17 3466.99,2889.21 3463.30,2883.98 3453.87,2883.66 3449.65,2888.39 3446.93,2891.45 3447.04,2895.19 3447.00,2899.00 Z" /> </svg> I would need to replace each number inside the string to the closest number divisible by 70. Once that's done it's very likely that there will be duplicate paths so they should be removed until only one remains. Problem is I have absolutely no idea how to best achieve that in python. For the first part perhaps using regex to find all numbers and for each replace with something like: round(n / 70) * 70 For the second, I'm not sure yet if each line of 6 numbers would be equal to another full line, or only some of the numbers.
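The first step proposed in the question (regex plus round(n / 70) * 70) could look roughly like the sketch below. The names snap and snap_path are made up for illustration, and the sketch only handles plain decimal numbers as they appear in the sample path. It does not attempt the duplicate-removal step, since dropping points from a C (cubic Bezier) segment changes how the remaining numbers are grouped.

```python
import re

# Plain decimal numbers such as 1475.00 or -12 (no exponents).
NUMBER_RE = re.compile(r'-?\d+(?:\.\d+)?')


def snap(value, grid=70):
    """Round a coordinate to the nearest multiple of grid."""
    return int(round(value / grid)) * grid


def snap_path(d, grid=70):
    """Snap every number in an SVG path string; command letters are kept."""
    return NUMBER_RE.sub(lambda m: str(snap(float(m.group()), grid)), d)


print(snap_path('M 1475.00,205.00 C 1475.00,225.21 1471.02,215.59'))
```

Note that Python 3's round() rounds exact halves to the nearest even integer, so a coordinate exactly halfway between two grid lines (for example 35 on a 70px grid) snaps down to 0 rather than up to 70.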
How do I manually strip XML tags?
I'm trying to remove the tags and create a new file but I can't see at how to accomplish this. I'm giving a file that has XML tags and I want to use strip and split to make into a list/string. I can't use an XML parser, or any other libraries. here is the text file: <team> <name>Denver Broncos</name> <players> <player> <jno>50</jno> <fname>Zaire</fname> <lname>Anderson</lname> <height>5-11</height> <weight>220</weight> <age>24</age> <position>ILB</position> <school>Nebraska</school> </player> <player> <jno>48</jno> <fname>Shaquil</fname> <lname>Barrett</lname> <height>6-2</height> <weight>250</weight> <age>23</age> <position>OLB</position> <school>Colorado State</school> </player> <player> <jno>35</jno> <fname>Kapri</fname> <lname>Bibbs</lname> <height>5-11</height> <weight>203</weight> <age>23</age> <position>RB</position> <school>Colorado State</school> </player> </players> </team> I want to use the string/list to produce a sentence like the following below: Here is the roster for the Denver Broncos. There are 3 players on the team. Zaire Anderson, ILB, wears #50. He is 5 foot 11 inches tall, and weighs 220 pounds. He is 24 years old. He went to Nebraska. Shaquil Barrett, OLB, wears #48. He is 6 foot 2 inches tall, and weighs 250 pounds. He is 23 years old. He went to Colorado State. Kapri Bibbs, RB, wears #48. He is 5 foot 11 inches tall, and weighs 203 pounds. He is 23 years old. He went to Colorado State. def test(filename): f=open(filename,"r") line = f.readline() f2 = open("BearsRoster.txt", "w") print line myList = [] stringl = "" for i in line: if i == ("<"): while i != ">": line.remove(i) else: stringl = stringl + i myList.append(stringl) stringl = "" else: stringl = stringl + i print myList for i in myList: print i print myList if i[0] == "<" or " ": myList.remove(i) obviously this code is incorrect. My idea was to go through the string and try to strip <xxxxx> that code. I just don't know how to approach it. 
After that I want to put that into the sentence I posted.
To remove tags, use a variable skip=True/False to control when to copy a char to the new string. When you find <, set skip=True; when you find >, set skip=False.

    data = '''<team>
    <name>Denver Broncos</name>
    <players>
    <player>
    <jno>50</jno>
    <fname>Zaire</fname>
    <lname>Anderson</lname>
    <height>5-11</height>
    <weight>220</weight>
    <age>24</age>
    <position>ILB</position>
    <school>Nebraska</school>
    </player>
    <player>
    <jno>48</jno>
    <fname>Shaquil</fname>
    <lname>Barrett</lname>
    <height>6-2</height>
    <weight>250</weight>
    <age>23</age>
    <position>OLB</position>
    <school>Colorado State</school>
    </player>
    <player>
    <jno>35</jno>
    <fname>Kapri</fname>
    <lname>Bibbs</lname>
    <height>5-11</height>
    <weight>203</weight>
    <age>23</age>
    <position>RB</position>
    <school>Colorado State</school>
    </player>
    </players>
    </team>'''

    skip = False
    result = ''

    for char in data:
        if char == '<':
            skip = True
        elif char == '>':
            skip = False
        elif not skip:
            result += char

    print(result)

If you need the data from specific tags, then you will have to build a parser: recognize opening and closing tags, remember the tag names, and probably build a tree of tags. So you need a lot more work.
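For a file this regular, a full tree is not strictly needed. As a rough sketch that uses only string methods (no parser, no imports), one can pull out the text between a given opening and closing tag and build the sentences from that. The helpers tag_values, first and roster_sentences are hypothetical names, and the approach assumes tags have no attributes and are never nested inside a tag of the same name.

```python
def tag_values(text, tag):
    """Return the text between every <tag>...</tag> pair, in order."""
    open_tag, close_tag = '<' + tag + '>', '</' + tag + '>'
    values, pos = [], 0
    while True:
        start = text.find(open_tag, pos)
        if start == -1:
            break
        start += len(open_tag)
        end = text.find(close_tag, start)
        if end == -1:
            break
        values.append(text[start:end].strip())
        pos = end + len(close_tag)
    return values


def first(text, tag):
    """Convenience wrapper: the first value of a tag."""
    return tag_values(text, tag)[0]


def roster_sentences(data):
    players = tag_values(data, 'player')
    lines = [
        'Here is the roster for the %s.' % first(data, 'name'),
        'There are %d players on the team.' % len(players),
    ]
    for player in players:
        feet, inches = first(player, 'height').split('-')
        lines.append(
            '%s %s, %s, wears #%s. He is %s foot %s inches tall, and weighs '
            '%s pounds. He is %s years old. He went to %s.' % (
                first(player, 'fname'), first(player, 'lname'),
                first(player, 'position'), first(player, 'jno'),
                feet, inches, first(player, 'weight'),
                first(player, 'age'), first(player, 'school'),
            )
        )
    return lines


# Shortened two-player sample in the same shape as the question's file.
sample = (
    '<team><name>Denver Broncos</name><players>'
    '<player><jno>50</jno><fname>Zaire</fname><lname>Anderson</lname>'
    '<height>5-11</height><weight>220</weight><age>24</age>'
    '<position>ILB</position><school>Nebraska</school></player>'
    '<player><jno>35</jno><fname>Kapri</fname><lname>Bibbs</lname>'
    '<height>5-11</height><weight>203</weight><age>23</age>'
    '<position>RB</position><school>Colorado State</school></player>'
    '</players></team>'
)
for line in roster_sentences(sample):
    print(line)
```

Searching for the full '<name>' string, including the angle brackets, is what keeps it from matching inside '<fname>' or '<lname>'; the same holds for '<player>' versus '<players>'.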
Python XML iterate over multiple blocks
I have an python XML parsing problem that I can't seem to figure out. I have the following XML: <data> <data_in base="base64"> </data_in> <log_sense_data> <ds base="bool">1</ds> <spf base="bool">0</spf> <page_code base="hex">15</page_code> <background_scan_results_log_page> <parameter> <parameter_code base="hex">0000</parameter_code> <du base="bool">0</du> <tsd base="bool">0</tsd> <etc base="bool">0</etc> <tmc base="hex">00</tmc> <format_linking base="hex">03</format_linking> <parameter_length base="dec">12</parameter_length> <description base="string">background scanning status parameter</description> <accumulated_power_on_minutes base="dec">579578</accumulated_power_on_minutes> <background_scanning_status base="hex">01</background_scanning_status> <number_of_background_scans_performed base="dec">112</number_of_background_scans_performed> <background_scan_progress base="hex">00000036</background_scan_progress> <number_of_background_medium_scans_performed base="dec">112</number_of_background_medium_scans_performed> </parameter> <parameter> <parameter_code base="hex">0001</parameter_code> <du base="bool">0</du> <tsd base="bool">0</tsd> <etc base="bool">0</etc> <tmc base="hex">00</tmc> <format_linking base="hex">03</format_linking> <parameter_length base="dec">20</parameter_length> <description base="string">background medium scan parameter</description> <accumulated_power_on_minutes base="dec">82932</accumulated_power_on_minutes> <reassign_status base="hex">05</reassign_status> <sense_key base="hex">01</sense_key> <additional_sense_code base="hex">17</additional_sense_code> <additional_sense_code_qualifier base="hex">01</additional_sense_code_qualifier> <vendor_specific base="hex">20e2570187</vendor_specific> <logical_block_address base="hex">00000000478994d8</logical_block_address> </parameter> <parameter> <parameter_code base="hex">0002</parameter_code> <du base="bool">0</du> <tsd base="bool">0</tsd> <etc base="bool">0</etc> <tmc base="hex">00</tmc> 
<format_linking base="hex">03</format_linking> <parameter_length base="dec">20</parameter_length> <description base="string">background medium scan parameter</description> <accumulated_power_on_minutes base="dec">104467</accumulated_power_on_minutes> <reassign_status base="hex">05</reassign_status> <sense_key base="hex">01</sense_key> <additional_sense_code base="hex">18</additional_sense_code> <additional_sense_code_qualifier base="hex">07</additional_sense_code_qualifier> <vendor_specific base="hex">203ab846ea</vendor_specific> <logical_block_address base="hex">00000000133d5046</logical_block_address> </parameter> </background_scan_results_log_page> </log_sense_data> </data> Where Parameter_code 0000 will always exist, and there could be any number of parameter_codes after that. Esentially I want to pull 2 values (power on minutes, background scans) from parameter_code 0000, as well as most values from parameter_code 0001 and greater, to be later put into a database. The code I have so far is this: import xml.etree.ElementTree as et log_page_tree = et.fromstring(results['Data']['RawData']) if log_page_tree.find('log_sense_data') == None: continue else: for element in log_page_tree.find('log_sense_data'): for pagecode in element.iter('page_code'): if pagecode.text == '15': for param in log_page_tree.find('log_sense_data').find('background_scan_results_log_page'): for derp in param.iter(): print derp.tag, derp.text #for totalpoweron in param.iter('accumulated_power_on_minutes'): #print totalpoweron.text I want to be able to keep the 2 values from parameter_code 0000, while iterating through the rest of the parameter_codes to be put into a database. Can anyone give me a push in the right direction here? If I specify param.iter('somevalue') to grab each value, the code doesn't seem to iterate.
OK, although there are ways you could simplify/improve your code, it sounds like you're happy up to here:

    for param in log_page_tree.find('log_sense_data').find('background_scan_results_log_page'):

This will in fact iterate over each parameter. But now you want to switch on whether parameter_code is 0000, doing different things in each case. So, inside that loop:

    converters = {
        'hex': lambda s: int(s, 16),
        'dec': int,
        # bool('0') is True, so go through int first
        'bool': lambda s: bool(int(s)),
    }

    if param.find('parameter_code').text == '0000':
        accumulated_power_on_minutes = int(param.find('accumulated_power_on_minutes').text)
        number_of_background_scans_performed = int(param.find('number_of_background_scans_performed').text)
    else:
        obj = {}
        # iterate over the child elements directly (getchildren() is gone in Python 3.9+)
        for elem in param:
            name = elem.tag
            base = elem.attrib['base']
            # unknown bases (e.g. 'string') are kept as text
            converter = converters.get(base, lambda x: x)
            value = converter(elem.text)
            obj[name] = value
        # do something with obj
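Put together as something that can be run on its own, the idea might look like the sketch below. It is not the asker's actual pipeline: parse_log_page, CONVERTERS and the trimmed SAMPLE document are made up for illustration, and it only uses xml.etree.ElementTree from the standard library.

```python
import xml.etree.ElementTree as ET

# One converter per value of the "base" attribute; anything else stays text.
CONVERTERS = {
    'hex': lambda s: int(s, 16),
    'dec': int,
    'bool': lambda s: bool(int(s)),
}

# Trimmed-down sample in the same shape as the XML in the question.
SAMPLE = """<data>
  <log_sense_data>
    <page_code base="hex">15</page_code>
    <background_scan_results_log_page>
      <parameter>
        <parameter_code base="hex">0000</parameter_code>
        <du base="bool">0</du>
        <accumulated_power_on_minutes base="dec">579578</accumulated_power_on_minutes>
        <number_of_background_scans_performed base="dec">112</number_of_background_scans_performed>
      </parameter>
      <parameter>
        <parameter_code base="hex">0001</parameter_code>
        <du base="bool">0</du>
        <description base="string">background medium scan parameter</description>
        <sense_key base="hex">01</sense_key>
        <additional_sense_code base="hex">17</additional_sense_code>
      </parameter>
    </background_scan_results_log_page>
  </log_sense_data>
</data>"""


def parse_log_page(xml_text):
    """Return (status values from parameter 0000, list of all other parameters)."""
    root = ET.fromstring(xml_text)
    page = root.find('log_sense_data/background_scan_results_log_page')
    status, scans = {}, []
    if page is None:
        return status, scans
    for param in page.findall('parameter'):
        values = {}
        for elem in param:
            convert = CONVERTERS.get(elem.get('base'), lambda s: s)
            values[elem.tag] = convert(elem.text) if elem.text is not None else None
        if values.get('parameter_code') == 0:
            status = {
                key: values[key]
                for key in ('accumulated_power_on_minutes',
                            'number_of_background_scans_performed')
            }
        else:
            scans.append(values)
    return status, scans


status, scans = parse_log_page(SAMPLE)
print(status)
print(scans)
```

Converting every child first and then branching on parameter_code keeps the two cases symmetric; each dict in scans is then ready to be written out as one database row.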
Setting default units in svg (Python svgwrite)
I'm generating SVG drawings using Python's svgwrite. Every time I want to draw something, I find myself doing this ugly, awkward thing:

    line = drawing.line(start = "%dmm" % start, end = "%dmm" % end)

I wish I could just do:

    line = drawing.line(start = start, end = end)

Is there a way to set the default units to 'mm' for the entire SVG drawing?
A possible way is to set the viewBox attribute along with the document size:

    import svgwrite

    dwg = svgwrite.Drawing('myDrawing.svg', size=('170mm', '130mm'), viewBox=('0 0 170 130'))
    dwg.add(dwg.line(start=(30, 30), end=(50,50)))
    dwg.save()

which produces, for me:

    <?xml version="1.0" encoding="utf-8" ?>
    <svg baseProfile="full" height="130mm" version="1.1" viewBox="0 0 170 130" width="170mm" xmlns="http://www.w3.org/2000/svg" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:xlink="http://www.w3.org/1999/xlink"><defs /><line x1="30" x2="50" y1="30" y2="50" />
    </svg>
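The trick works because the viewBox maps user units onto the physical size: with a 170mm wide drawing and a viewBox 170 units wide, one unitless coordinate equals one millimetre. A small helper can keep the two in sync; mm_canvas is a hypothetical name, not part of svgwrite, and the sketch only builds the keyword arguments.

```python
def mm_canvas(width_mm, height_mm):
    """Build size/viewBox values so that 1 user unit equals 1 mm."""
    return {
        'size': ('%gmm' % width_mm, '%gmm' % height_mm),
        'viewBox': '0 0 %g %g' % (width_mm, height_mm),
    }


print(mm_canvas(170, 130))
```

The result could then be passed along as svgwrite.Drawing('myDrawing.svg', **mm_canvas(170, 130)), which uses the same size and viewBox keyword arguments as the example above.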
I just found that you can do this:

    import svgwrite
    from svgwrite import cm, mm

    dwg = svgwrite.Drawing('my_drawing.svg', height='10cm', width='10cm')
    dwg.add(dwg.line((0*cm, 0*cm), (10*cm, 10*cm)))
    dwg.save()