I am having trouble stripping '[' from my string (read from a file).
Code:
data = open(Koorpath1, 'r')
for x in data:
    print(x)
    print(x.strip('['))
Result:
[["0.9986130595207214","26.41608428955078"],["39.44521713256836","250.2412109375"],["112.84327697753906","120.34269714355469"],["260.63800048828125","15.424667358398438"],["273.6199645996094","249.74160766601562"]]
"0.9986130595207214","26.41608428955078"],["39.44521713256836","250.2412109375"],["112.84327697753906","120.34269714355469"],["260.63800048828125","15.424667358398438"],["273.6199645996094","249.74160766601562"]]
Desired output :
"0.9986130595207214","26.41608428955078","39.44521713256836","250.2412109375","112.84327697753906","120.34269714355469","260.63800048828125","15.424667358398438","273.6199645996094","249.74160766601562"
Thanks
It strips only the first two '[' because strip removes characters from the beginning and end of the string. It seems you have one long string, so you have to split it first.
datalist = data.split(',')
for x in datalist:
    pass  # code here
If you don't want to split it and would rather keep it all in one string, you need replace, not strip (strip only works at the beginning and the end).
data = data.replace('[','')
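To make the difference concrete, here is a minimal sketch on a shortened, made-up version of the line from the question. The variable names are only for illustration.

```python
# Shortened sample of the line read from the file (values are made up)
line = '[["0.99","26.41"],["39.44","250.24"]]'

# strip() only removes matching characters from the two ends of the string
stripped = line.strip('[')
print(stripped)  # "0.99","26.41"],["39.44","250.24"]]

# replace() removes every occurrence, wherever it appears
flat = line.replace('[', '').replace(']', '')
print(flat)      # "0.99","26.41","39.44","250.24"
```

The trailing ']]' survives strip('[') because ']' is not in the set of characters passed to it.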
If the data is JSON, then parse it into a Python list and treat it from there:
from itertools import chain
import json
nums = json.loads(x)
print(','.join('"%s"' % num for num in chain.from_iterable(nums)))
chain.from_iterable helps you "flatten" the list of lists and join concatenates everything into one long output.
Related
My first string:
xxx.xxx.com-bonding_err_bond0-if_eth2-d.rrd.csv
But I want a result like this:
bonding_err_bond0-if_eth2
I tried some code, but it does not seem to work correctly:
csv = "xxx.xxx.com-bonding_err_bond0-if_eth2-d.rrd.csv"
x = csv.rsplit('.', 4)[2]
print(x)
But the result I get is com-bonding_err_bond0-if_eth2-d, whereas what I want is bonding_err_bond0-if_eth2.
If you are allowed to use a solution other than regex, you can do it with split and join.
You can break the solution into smaller parts to understand it better, and learn about join if you are not aware of it. It will come in handy.
solution = '-'.join(csv.split('.', 4)[2].split('-')[1:3])
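Broken into smaller parts, the one-liner above could look like the sketch below. The intermediate names (dotted, middle, pieces) are introduced only for illustration.

```python
csv = "xxx.xxx.com-bonding_err_bond0-if_eth2-d.rrd.csv"

# Step 1: split on '.'; the third piece holds the part of interest
dotted = csv.split('.', 4)
middle = dotted[2]            # 'com-bonding_err_bond0-if_eth2-d'

# Step 2: split that piece on '-'
pieces = middle.split('-')    # ['com', 'bonding_err_bond0', 'if_eth2', 'd']

# Step 3: keep the two middle pieces and glue them back together with '-'
solution = '-'.join(pieces[1:3])
print(solution)               # bonding_err_bond0-if_eth2
```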
Thanks,
Shashank
You have probably got your answer already, but if you want a more generic method that works for any such string, you can do the following.
This way you won't be restricted to one string, and you can loop over the data as well.
csv = "xxx.xxx.com-bonding_err_bond0-if_eth2-d.rrd.csv"
first_index = csv.find("-")
second_index = csv.find("-d")
result = csv[first_index+1:second_index]
print(result)
# OUTPUT:
# bonding_err_bond0-if_eth2
You can just separate the string with -, remove the beginning and end, and then join them back into a string.
csv = "xxx.xxx.com-bonding_err_bond0-if_eth2-d.rrd.csv"
x = '-'.join(csv.split('-')[1:-1])
Output
>>> x
'bonding_err_bond0-if_eth2'
I want to split this python list (originalList):
['"car_type":"STANDARD","price":725842',
'"car_type":"LUXURY","price":565853',
'"car_type":"PEOPLE_CARRIER","price":239081',
'"car_type":"LUXURY_PEOPLE_CARRIER","price":661624',
'"car_type":"MINIBUS","price":654172']
to give me this list (pricesList):
[725842, 565853, 239081, 661624, 654172]
I tried this line of code below to split the list named originalList:
pricesList = [i.split("price:")[0] for i in originalList]
The outcome is a list with the same number of elements, but each element still contains the car_type part, so the split has not given me the value I need. How can I change or replace the code above so that each element of the new list holds only the value to the right of the delimiter, with everything to the left removed?
You forgot the double quotes " that are part of your delimiter, then picked the wrong index (0), which is the part before the split, and finally, you did not cast to int. You can do the following to get the desired output:
>>> [int(i.split('"price":')[-1]) for i in originalList]
[725842, 565853, 239081, 661624, 654172]
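A short sketch of what each split actually returns may help here. It uses a two-element sample of the list from the question; the name original_list is only for this example.

```python
original_list = [
    '"car_type":"STANDARD","price":725842',
    '"car_type":"LUXURY","price":565853',
]

item = original_list[0]

# Without the quotes the delimiter never occurs, so split() returns the whole string
print(item.split('price:'))    # ['"car_type":"STANDARD","price":725842']

# With the quotes included, the string is cut in two around the delimiter
print(item.split('"price":'))  # ['"car_type":"STANDARD",', '725842']

# Take the last piece and convert it to int
prices = [int(s.split('"price":')[-1]) for s in original_list]
print(prices)                  # [725842, 565853]
```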
schwobaseggl's answer is good; here is a possible alternative using the json library (I guess the original list comes from JSON processing):
import json
list(map(lambda x:json.loads('{'+x+'}')['price'],originalList))
You can try:
import json
n = ['"car_type":"STANDARD","price":725842',
'"car_type":"LUXURY","price":565853',
'"car_type":"PEOPLE_CARRIER","price":239081',
'"car_type":"LUXURY_PEOPLE_CARRIER","price":661624',
'"car_type":"MINIBUS","price":654172']
print([json.loads("{" + i + "}")["price"] for i in n])
Another way of doing it:
pricesList = [int(originalList[i].split(",")[1].split(":")[1]) for i in range(len(originalList))]
Solution
If you change to .split(':'), you can just take the [-1] item, which is the number at the end. Note that the values are still strings; wrap them in int() if you need numbers.
lista = [
    '"car_type":"STANDARD","price":725842',
    '"car_type":"LUXURY","price":565853',
    '"car_type":"PEOPLE_CARRIER","price":239081',
    '"car_type":"LUXURY_PEOPLE_CARRIER","price":661624',
    '"car_type":"MINIBUS","price":654172'
]
new_lista = []
for i in range(len(lista)):
    lista[i] = lista[i].split(':')
    new_lista.append(lista[i][-1])
print(new_lista)
Output
(xenial)vash@localhost:~/python$ python3.7 split.py
['725842', '565853', '239081', '661624', '654172']
I have a long string variable full of hex values:
hexValues = 'AA08E3020202AA08E302AA1AA08E3020101' etc..
The first 2 bytes (AA08) are a signature for the start of a frame and the rest of the data up to the next AA08 are the contents of the signature.
I want to slice the string into a list based on the reoccurring start of frame sign, e.g:
list = [AA08, E3020202, AA08, F25S1212, AA08, 42ABC82] etc...
I'm not sure how I can split the string up like this. Some of the frames are also corrupted, where the start of the frame won't have AA08, but maybe AA01, so I'd need some kind of regex to spot these.
If I do list = hexValues.split('AA08'), the list just removes all the starts of the frame...
So I'm a bit stuck.
Newbie to python.
Thanks
For the case when you don't have "corrupted" data the following should do:
hex_values = 'AA08E3020202AA08E302AA1AA08E3020101'
delimiter = hex_values[:4]
hex_values = hex_values.replace(delimiter, ',' + delimiter + ',')
hex_list = hex_values.split(',')[1:]
print(hex_list)
['AA08', 'E3020202', 'AA08', 'E302AA1', 'AA08', 'E3020101']
Without considering corruptions, you may try this.
l = []
for s in hexValues.split('AA08'):
    if s:
        l += ['AA08', s]
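For the corrupted-frame case mentioned in the question, one possible sketch uses re.split with a capturing group, which keeps the matched markers in the result. The split_frames helper, the second sample string, and the pattern AA0[0-9A-F] (any marker starting with AA0) are assumptions made for illustration. A loose pattern like this can also match bytes inside a frame's payload, so it would need tightening for real data.

```python
import re

def split_frames(hex_string, marker_pattern=r'AA08'):
    """Split hex_string on the marker while keeping the markers themselves."""
    # Wrapping the pattern in a group makes re.split return the delimiters too
    parts = re.split('(' + marker_pattern + ')', hex_string)
    # Drop the empty strings produced when the string starts with a marker
    return [p for p in parts if p]

clean = split_frames('AA08E3020202AA08E302AA1AA08E3020101')
print(clean)
# ['AA08', 'E3020202', 'AA08', 'E302AA1', 'AA08', 'E3020101']

# Hypothetical sample in which the second marker is corrupted to AA01
corrupted = split_frames('AA08E3020202AA01F2551212AA0842ABC82',
                         marker_pattern=r'AA0[0-9A-F]')
print(corrupted)
# ['AA08', 'E3020202', 'AA01', 'F2551212', 'AA08', '42ABC82']
```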
I have a list of data similar to that below:
a = ['"105', '424"', '"102', '629"', '"104', '307"']
I want this data to be in a form similar to that of below:
a = ['105424', '102629', '104307']
I am unsure of how to proceed. I thought perhaps removing all the commas then inserting commas only where they should be and then removing the quotations. I am finding this to be quite challenging.
I'm assuming this data was originally in a csv file where data that contains commas is quoted ("105,424","102,629","104,307") and then you are splitting on comma:
>>> '"105,424","102,629","104,307"'.split(',')
['"105', '424"', '"102', '629"', '"104', '307"']
Rather you should let the csv module do the work as it will handle the double quotes:
import csv
# newline='' is what the csv module expects when opening files in Python 3
with open('u:\\foobar.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print([x.replace(',', '') for x in row])
This prints: ['105424', '102629', '104307']
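The same idea can be tried without a file on disk. The sketch below feeds the sample line through io.StringIO, which stands in for the real CSV file; the variable names are only for illustration.

```python
import csv
import io

# In-memory stand-in for the CSV file
raw = '"105,424","102,629","104,307"\n'

reader = csv.reader(io.StringIO(raw))
# csv.reader keeps each quoted field whole, so only the thousands commas remain
rows = [[field.replace(',', '') for field in row] for row in reader]
print(rows)  # [['105424', '102629', '104307']]
```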
Does your data look something like:
"123", "123,456", "123,456,789"
If so then try this
import re
text = '"123", "123,456", "123,456,789"'
reg = re.compile(r'"(\d{1,3}(,\d{3})*)"')
stringValues = [wholematch.replace(',', '') for wholematch, _endmatch
                in reg.findall(text)]
This regex should also work on thousands with decimal places (note that it adds a third group, so findall returns 3-tuples and the unpacking above needs one more name):
re.compile(r'"(\d{1,3}(,\d{3})*(\.\d*)?)"')
If the source data is CSV, you should use @steven's answer.
Regardless, here's how you could process what you pasted.
As @troutwine stated, this will only work if the number parts are always in pairs.
a = ['"105', '424"', '"102', '629"', '"104', '307"']
def pairwise(iterable):
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    it = iter(iterable)
    # In Python 3 the built-in zip is lazy, so itertools.izip is not needed
    return zip(it, it)
result = []
for x, y in pairwise(a):
    result.append(''.join([x, y]).strip('"'))
print(result)
Gives:
['105424', '102629', '104307']
Pairwise snippet from here: Iterating over every two elements in a list
If you'll never have an unmatched pair, loop over a range 1/2 the size of the input list, mash the current index plus the next together, do a string substitution and skip to the current index plus two.
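A minimal sketch of that index-based loop, assuming the list always holds complete pairs:

```python
a = ['"105', '424"', '"102', '629"', '"104', '307"']

result = []
# Step through the list two items at a time: 0, 2, 4, ...
for i in range(0, len(a) - 1, 2):
    joined = a[i] + a[i + 1]                 # '"105' + '424"' -> '"105424"'
    result.append(joined.replace('"', ''))  # drop the surrounding quotes

print(result)  # ['105424', '102629', '104307']
```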
Reduce to the rescue:
from functools import reduce  # reduce is no longer a built-in in Python 3
l = ['"105', '424"', '"102', '629"', '"104', '307"', '"123', '456', '789"', '"123"']
# Concatenate everything and split by ", get non-empties
l2 = [num for num in reduce(lambda x, y: x+y, l).split('"') if num != '']
print(l2)
# Output:
# ['105424', '102629', '104307', '123456789', '123']
A few caveats, though: this code handles numbers beyond thousands (e.g. 1,457,664), but it assumes that every whole number was double-quoted.
As others have said though, you should revisit your data retrieval as there are most likely ways to get the values correctly without dealing with the double-quotes. This was a fun little challenge nonetheless.
I am reading a file line by line and doing some text processing in order to get output in a certain format.
My string processing code goes as follows:
file1=open('/myfolder/testfile.txt')
scanlines=file1.readlines()
string = ''
for line in scanlines:
    if line.startswith('>from'):
        continue
    if line.startswith('*'):
        continue
    string.join(line.rstrip('\n'))
The output of this code is as follows:
abc
def
ghi
Is there a way to join these physical lines into one logical line, e.g:
abcdefghi
Basically, how can I concatenate multiple strings into one large string?
If I was reading from a file with very long strings is there the risk of an overflow by concatenating multiple physical lines into one logical line?
There are several ways to do this. For example, just using + should do the trick.
"abc" + "def" # produces "abcdef"
If you try to concatenate multiple strings you can do this with the join method:
', '.join(('abc', 'def', 'ghi')) # produces 'abc, def, ghi'
If you want no delimiter, call join on the empty string: ''.join(...).
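As a small illustration of the empty-string join, and of why the string.join(...) call in the question appears to do nothing: join returns a new string rather than modifying string in place, so its result has to be assigned. The sample lines below are made up for the example.

```python
# Made-up sample standing in for the lines read from the file
lines = ['abc\n', '>from somewhere\n', 'def\n', '* a comment\n', 'ghi\n']

# Keep only the wanted lines and drop the trailing newline from each
kept = [line.rstrip('\n') for line in lines
        if not line.startswith(('>from', '*'))]

# join() returns a new string; assign the result to use it
one_line = ''.join(kept)
print(one_line)  # abcdefghi
```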
Cleaning things up a bit, it would be easiest to append to a list and then return the joined result:
def joinfile(filename):
    sarray = []
    with open(filename) as fd:
        for line in fd:
            if line.startswith('>from') or line.startswith('*'):
                continue
            sarray.append(line.rstrip('\n'))
    return ''.join(sarray)
If you wanted to get really cute you could also do the following:
with open(filename) as fd:
    joined = ''.join([line.rstrip('\n') for line in fd if not (line.startswith('>from') or line.startswith('*'))])
Yes of course you could read a file big enough to overflow memory.
Use string addition
>>> s = 'a'
>>> s += 'b'
>>> s
'ab'
I would prefer:
from functools import reduce  # needed in Python 3
oneLine = reduce(lambda x, y: x + y,
                 [line[:-1] for line in open('/myfolder/testfile.txt')
                  if not line.startswith('>from') and
                  not line.startswith('*')])
line[:-1] removes the trailing \n from each line (this assumes every line, including the last one, ends with \n).
The second argument of reduce is a list comprehension that extracts all the lines you are interested in and removes the \n from them.
reduce (if you actually need it) then builds one string from the list of strings.