How to cut a string with duplicates? - python

I have this string:
'fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw
fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw'
I want to use this code to cut it into pieces of length 22:
from textwrap import wrap
w_str = wrap(str, 22)
And then I get this:
fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw
The next step should take the last four letters of each string and paste them at the beginning of the next one, and so on.
Just like this, with an ID:
e_1
fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfgaavza
e_2
avzaeafeaeaddacbytt!tw
fhsdkfhskdslshsdkhlghslghs
e_3
lghsbksjvsfgsdnfsfbjfgzfga
zfgaavzaeafeaeaddacbytt!tw

Once you have your string as such:
_str = """fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw"""
You can do the following:
>>> _str = _str.split()
>>> new = [_str[i-1][-4:] + _str[i] if i > 0 else _str[i] for i in range(len(_str))]
>>> print('\n'.join(new))
fhsdkfhskdslshsdkhlghs
lghsbksjvsfgsdnfsfbjfgzfga
zfgaavzaeafeaeaddacbytt!tw
>>>
Edit
To add the IDs, zip the labels and the strings together in a list comprehension, like this:
'\n'.join(['\n'.join(item) for item in zip(['e_'+str(num) for num in range(1, len(new)+1)], new)])
>>> _str = _str.split()
>>> new = [_str[i-1][-4:] + _str[i] if i > 0 else _str[i] for i in range(len(_str))]
>>> print('\n'.join(['\n'.join(item) for item in zip(['e_'+str(num) for num in range(1, len(new)+1)], new)]))
e_1
fhsdkfhskdslshsdkhlghs
e_2
lghsbksjvsfgsdnfsfbjfgzfga
e_3
zfgaavzaeafeaeaddacbytt!tw
>>>

In some ways, strings are like lists in Python: you can reference their contents by index, slice them, and so on.
So you could use slicing to pull out the last 4 characters of each wrapped string:
from textwrap import wrap
input_string = 'fhsdkfhskdslshsdkhlghsbksjvsfgsdnfsfbjfgzfgaavzaeafeaeaddacbytt!tw'
split_strings = wrap(input_string, 22)
add_string = ''  # nothing there at first, but will be updated as we process each wrapped string
for output_string in split_strings:
    print(add_string + output_string)
    add_string = output_string[-4:]  # "[-4:]" means: "from the fourth-last char of the string, to the end"
outputs:
fhsdkfhskdslshsdkhlghs
lghsbksjvsfgsdnfsfbjfgzfga
zfgaavzaeafeaeaddacbytt!tw
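The steps described in the question (cut into fixed-width pieces, carry the last four characters forward, label each piece) can also be sketched as one small function. This is only a minimal sketch: the name chunk_with_overlap and its width and overlap parameters are made up for illustration, it uses plain slicing instead of textwrap.wrap, and it assumes the newlines have already been removed from the input.

```python
def chunk_with_overlap(text, width=22, overlap=4):
    """Split text into width-sized pieces, each prefixed with the tail of the previous piece."""
    pieces = []
    previous_tail = ""  # nothing to carry over before the first piece
    for start in range(0, len(text), width):
        chunk = text[start:start + width]
        pieces.append(previous_tail + chunk)
        # Guard against overlap == 0, because chunk[-0:] would be the whole chunk.
        previous_tail = chunk[-overlap:] if overlap else ""
    return pieces


input_string = "fhsdkfhskdslshsdkhlghsbksjvsfgsdnfsfbjfgzfgaavzaeafeaeaddacbytt!tw"
pieces = chunk_with_overlap(input_string)
for number, piece in enumerate(pieces, 1):
    print(f"e_{number}")  # hypothetical label format, taken from the question
    print(piece)
```

Because the width and the overlap are parameters, the same function should cover other piece lengths without touching the loop.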

Related

specific characters printing with Python

Given a string as shown below,
"[xyx],[abc].[cfd],[abc].[dgr],[abc]"
how can I print it as shown below?
1.[xyx]
2.[cfd]
3.[dgr]
The original string will always maintain the above-mentioned format.
I did not realize you had periods and commas... that adds a bit of trickery. You have to split on the periods too.
I would use something like this:
list_to_parse = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
count = 0
for i in list_to_parse.split('.'):
    for position, j in enumerate(i.split(',')):
        # keep only the first bracketed item of each period-separated group
        if position == 0 and j:
            count += 1
            print(str(count) + "." + j)
Another option is to split on the left bracket and then re-add it while numbering with enumerate, stripping the commas and periods. This method is also probably a tiny bit faster, as it's not a loop inside a loop:
list_to_parse = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
# split('[') yields an empty first element; [1::2] skips it and keeps every other item
for index, i in enumerate(list_to_parse.split('[')[1::2], 1):
    print(str(index) + ".[" + i.rstrip(',.'))
Also, strip is really "which characters to remove", not a specific pattern. You can pass any characters you want removed from the right, and it works back from the end of the string until it hits a character it can't remove. There are also lstrip() and strip().
String manipulation can always get tricky, so pay attention: the split here produces a blank first element, which has to be skipped. Always practice and learn your needs :D
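The point about rstrip taking a set of characters rather than a pattern is easy to check in isolation. The sample strings below are made up for illustration; only str.rstrip and str.strip from the standard library are used.

```python
samples = ["xyx],", "abc].", "cfd],.,"]

# Every trailing comma or period is removed, in whatever order they appear.
cleaned = [s.rstrip(",.") for s in samples]
print(cleaned)

# Stripping stops at the first character that is not in the set,
# so the period in the middle of this string is left alone.
middle_kept = "a.b..".rstrip(".")
print(middle_kept)

# strip() applies the same rule to both ends.
both_ends = ",.x,.".strip(",.")
print(both_ends)
```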
You can use the split() function:
a = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
desired_strings = [i.split(',')[0] for i in a.split('.')]
for i, string in enumerate(desired_strings):
    print(f"{i+1}.{string}")
This is just a fun way to solve it:
lst = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
count = 1
var = 1
for char in range(0, len(lst), 6):
    if var % 2:
        print(f"{count}.{lst[char:char + 5]}")
        count += 1
    var += 1
output:
1.[xyx]
2.[cfd]
3.[dgr]
Explanation: "[" appears at these indexes: 0, 6, 12, etc. var is used to skip the second item of each pair, and count is the running number that gets printed.
Here we can squeeze the above code using list comprehension and slicing instead of those flag variables. It's now more Pythonic:
lst = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
lst = [lst[i:i+5] for i in range(0, len(lst), 6)][::2]
res = (f"{i}.{item}" for i, item in enumerate(lst, 1))
print("\n".join(res))
You can use a regular expression:
import re
pattern=r"(\[[a-zA-Z]*\])\,\[[a-zA-Z]*\]\.?"
results=re.findall(pattern, '[xyx],[abc].[cfd],[abc].[dgr],[abc]')
print(results)
Using re.findall:
import re
s = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
print('\n'.join(f'{i+1}.{x}' for i, x in
                enumerate(re.findall(r'(\[[^]]+\])(?=,)', s))))
Output:
1.[xyx]
2.[cfd]
3.[dgr]

I want to print the letters in each word in matrix form given the number of columns

I have the string thestackoverflow
and a number of columns, 4.
I want the output to be:
thes
tack
over
flow
You can do it using Python slice notation.
Refer to this thread for a nice explanation of slice notation: Understanding slice notation
Example code for your question:
>>> input_string = "thestackoverflow"
>>> chop_size = 4
>>> while input_string:
...     print(input_string[:chop_size])
...     input_string = input_string[chop_size:]
...
thes
tack
over
flow
You can have a look at textwrap
import textwrap
string = 'thestackoverflow'
max_width = 4
result = textwrap.fill(string,max_width)
print(result)
thes
tack
over
flow
If you don't want to use any module:
string = 'thestackoverflow'
max_width = 4
row = 0
result = ''
while row*max_width < len(string):
    result += '\n' + string[row*max_width:(row+1)*max_width]
    row += 1
result = result.strip()
print(result)
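Both loops above come down to taking slices at multiples of the column count, which a list comprehension can express directly. This is a minimal sketch, with to_rows as a made-up helper name. Unlike textwrap.fill, plain slicing does not treat spaces specially, so it cuts at exact positions even when the input contains whitespace.

```python
def to_rows(text, columns):
    """Return the text cut into rows of at most `columns` characters."""
    return [text[i:i + columns] for i in range(0, len(text), columns)]


rows = to_rows("thestackoverflow", 4)
print("\n".join(rows))

# The last row is simply shorter when the length is not a multiple of `columns`.
uneven = to_rows("abcde", 2)
print(uneven)
```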

Replace a substring in a string according to a list

According to tutorialspoint:
The method replace() returns a copy of the string in which the occurrences of old have been replaced with new. https://www.tutorialspoint.com/python/string_replace.htm
Therefore one can use:
>>> text = 'fhihihi'
>>> text.replace('hi', 'o')
'fooo'
With this idea, given a list [1,2,3], and a string 'fhihihi' is there a method to replace a substring hi with 1, 2, and 3 in order? For example, this theoretical solution would yield:
'f123'
You can create a format string out of your initial string:
>>> text = 'fhihihi'
>>> replacement = [1,2,3]
>>> text.replace('hi', '{}').format(*replacement)
'f123'
Use re.sub:
import re
text = 'fhihihi'
counter = 0
def replacer(match):
    # called once per match; returns the next number as a string
    global counter
    counter += 1
    return str(counter)
re.sub(r'hi', replacer, text)  # returns 'f123'
This is going to be way faster than any alternative using str.replace
One solution with re.sub:
text = 'fhihihi'
lst = [1,2,3]
import re
print(re.sub(r'hi', lambda g, l=iter(lst): str(next(l)), text))
Prints:
f123
Other answers gave good solutions. If you want to re-invent the wheel, here is one way.
text = "fhihihi"
target = "hi"
replacements = [1, 2, 3]  # the list from the question
l = len(target)
i = 0
c = 0  # index of the next replacement to use
new_string_list = []
while i < len(text):
    if text[i:i + l] == target:
        new_string_list.append(str(replacements[c]))
        i += l
        c += 1
        continue
    new_string_list.append(text[i])
    i += 1
print("".join(new_string_list))
A list is used to avoid creating a new string on every concatenation.
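The answers above assume there are at least as many replacements as matches, and the format-based one can also fail when the text already contains braces. A hedged sketch that sidesteps both, using only str.split and str.join, is shown here. The name replace_in_order is made up, and leaving surplus matches unchanged is an assumption about the desired behavior, not something the question specifies.

```python
def replace_in_order(text, target, replacements):
    """Replace successive occurrences of target with successive items of replacements."""
    parts = text.split(target)  # the pieces of text between the matches
    values = iter(replacements)
    result = [parts[0]]
    for part in parts[1:]:
        # Assumption: if the replacements run out, keep the original match.
        result.append(str(next(values, target)))
        result.append(part)
    return "".join(result)


print(replace_in_order("fhihihi", "hi", [1, 2, 3]))
print(replace_in_order("fhihihi", "hi", [1]))
print(replace_in_order("{hi}", "hi", [1]))
```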

Delete a certain number of zeros from right of a string

I'm trying to delete a certain number of zeros from right. For example:
"10101000000"
I want to remove 4 zeros... And get:
"1010100"
I tried string.rstrip("0") and string.strip("0"), but these remove all of the zeros from the right. How can I do that?
The question is not a duplicate because I can't use imports.
You can use a regex
>>> import re
>>> mystr = "10101000000"
>>> numzeros = 4
>>> mystr = re.sub("0{{{}}}$".format(numzeros), "", mystr)
>>> mystr
'1010100'
This will leave the string as is if it doesn't end in four zeros.
You could also check and then slice:
if mystr.endswith("0" * numzeros):
    mystr = mystr[:-numzeros]
For a known number of zeros you can use slicing:
s = "10101000000"
zeros = 4
if s.endswith("0" * zeros):
    s = s[:-zeros]
rstrip removes every trailing character that is in the set of characters you pass. To delete exactly four trailing zeros, you can do this instead:
s = s[:-4] if s[-4:] == "0"*4 else s
Here's my solution:
number = "10101000000"
def my_rstrip(number, char, count=4):
    # remove up to `count` trailing copies of `char`, one at a time
    for x in range(count):
        if number.endswith(char):
            number = number[0:-1]
        else:
            break
    return number
print(my_rstrip(number, '0', 4))
>>> s[:-4]+s[-4:].replace('0000','')
Don't forget to convert to str
import re
a = 10101000000
re.sub("0000$","", str(a))
You can slice off the last 4 characters of the string this way (this removes them whether or not they are zeros):
string[:-4]
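The difference between rstrip, which removes every trailing zero, and a bounded removal can be shown side by side without any imports. This is a sketch under assumptions: remove_trailing is a made-up name, char is assumed to be a single character, and removing fewer than count zeros when fewer are present is one possible reading of the question.

```python
def remove_trailing(s, char, count):
    """Remove at most `count` copies of `char` from the right end of `s`."""
    # Assumes `char` is a single character; rstrip tells us how many copies end the string.
    trailing = len(s) - len(s.rstrip(char))
    removable = min(count, trailing)
    return s[:len(s) - removable]


s = "10101000000"
print(s.rstrip("0"))                   # all six trailing zeros are removed
print(remove_trailing(s, "0", 4))      # only four are removed
print(remove_trailing("100", "0", 4))  # fewer than four available, so both go
```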

Python Append Microseconds to date element of the list

Hello I have a list of data generated as below
l_ele = line.split()
['2014-02-10T15:57:00.400733+00:00', 'coccus1','info="processing"]
['2014-02-10T15:57:02.734042+00:00', 'coccus1' , info="processing"]
['2014-02-10T15:57:02+00:00','coccus1','info="processing"']
['2014-02-10T15:57:03+00:00', 'coccus1','info="looking for match"']
['2014-02-10T15:57:04+00:00', 'coccus1', info="sampling"
['2014-02-10T15:57:06.771501+00:00','coccus1','info="sampling"']
I would like to add the microseconds part (.000000) to the date element of each list if it does not already have one. How can I achieve this?
Expected Output:
['2014-02-10T15:57:00.400733+00:00', 'coccus1','info="processing"]
['2014-02-10T15:57:02.734042+00:00', 'coccus1' , info="processing"]
['2014-02-10T15:57:02.000000+00:00','coccus1','info="processing"']
['2014-02-10T15:57:03.000000+00:00', 'coccus1','info="looking for match"']
['2014-02-10T15:57:04.000000+00:00', 'coccus1', info="sampling"
['2014-02-10T15:57:06.771501+00:00','coccus1','info="sampling"']
It is not clear what format your data is in, but assuming two strings:
s1 = "2014-02-10T15:57:02+00:00"
s2 = "2014-02-10T15:57:02.734042+00:00"
you can ensure they both match formats by doing:
def process_string(s):
    return s if len(s) == 32 else "".join((s[:-6], ".000000", s[-6:]))
Or, in Python pre-2.5:
def process_string(s):
    if len(s) == 32:
        return s
    return "".join((s[:-6], ".000000", s[-6:]))
Examples:
>>> process_string(s1)
'2014-02-10T15:57:02.000000+00:00'
>>> process_string(s2)
'2014-02-10T15:57:02.734042+00:00'
Ordinarily, I would recommend using datetime to do this, but your timezone offset (+00:00, with a colon) is not in a format that strptime's %z accepts before Python 3.7.
To apply this to the first item in each list, simply access it by index, for example:
>>> l = ['2014-02-10T15:57:02+00:00', 'coccus1','info="processing"']
>>> l[0] = process_string(l[0])
>>> l
['2014-02-10T15:57:02.000000+00:00', 'coccus1', 'info="processing"']
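On Python 3.7 or later, datetime.fromisoformat can parse these timestamps directly, so the datetime route mentioned above becomes an option. The following is a sketch under that version assumption; add_microseconds is a made-up name, and the rows are sample data modeled on the question.

```python
from datetime import datetime


def add_microseconds(timestamp):
    """Return the ISO 8601 timestamp with a six-digit microseconds field."""
    parsed = datetime.fromisoformat(timestamp)  # assumes Python 3.7+
    return parsed.isoformat(timespec="microseconds")


rows = [
    ["2014-02-10T15:57:00.400733+00:00", "coccus1", 'info="processing"'],
    ["2014-02-10T15:57:02+00:00", "coccus1", 'info="processing"'],
]
for row in rows:
    row[0] = add_microseconds(row[0])  # only the date element changes
print(rows)
```

Parsing has the side effect of rejecting malformed timestamps with a ValueError, whereas the length check would pass them through untouched.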
