Python: Split a mixed string

I read some lines from a file in the following form:
line = a b c d,e,f g h i,j,k,l m n
What I want is one line for every combination of the ","-separated elements, e.g.:
a b c d g h i m n
a b c d g h j m n
a b c d g h k m n
a b c d g h l m n
a b c e g h i m n
a b c e g h j m n
a b c e g h k m n
a b c e g h l m n
. . . . . . . . .
. . . . . . . . .
First, I would split the line:
sline = line.split()
Then I would iterate over sline and look for elements that can be split on "," as separator. The problem is that I never know in advance how many of those elements to expect.
Any ideas?

Using regex, itertools.product and some string formatting:
This solution preserves the initial spacing as well.
>>> import re
>>> from itertools import product
>>> line = 'a b c d,e,f g h i,j,k,l m n'
>>> items = [x[0].split(',') for x in re.findall(r'((\w+,)+\w+)',line)]
>>> strs = re.sub(r'((\w+,)+\w+)','{}',line)
>>> for prod in product(*items):
...     print(strs.format(*prod))
...
a b c d g h i m n
a b c d g h j m n
a b c d g h k m n
a b c d g h l m n
a b c e g h i m n
a b c e g h j m n
a b c e g h k m n
a b c e g h l m n
a b c f g h i m n
a b c f g h j m n
a b c f g h k m n
a b c f g h l m n
Another example:
>>> line = 'a b c d,e,f g h i,j,k,l m n q,w,e,r f o o'
>>> items = [x[0].split(',') for x in re.findall(r'((\w+,)+\w+)',line)]
>>> strs = re.sub(r'((\w+,)+\w+)','{}',line)
>>> for prod in product(*items):
...     print(strs.format(*prod))
...
a b c d g h i m n q f o o
a b c d g h i m n w f o o
a b c d g h i m n e f o o
a b c d g h i m n r f o o
a b c d g h j m n q f o o
a b c d g h j m n w f o o
a b c d g h j m n e f o o
a b c d g h j m n r f o o
a b c d g h k m n q f o o
a b c d g h k m n w f o o
a b c d g h k m n e f o o
a b c d g h k m n r f o o
a b c d g h l m n q f o o
a b c d g h l m n w f o o
a b c d g h l m n e f o o
a b c d g h l m n r f o o
a b c e g h i m n q f o o
a b c e g h i m n w f o o
a b c e g h i m n e f o o
a b c e g h i m n r f o o
a b c e g h j m n q f o o
a b c e g h j m n w f o o
a b c e g h j m n e f o o
a b c e g h j m n r f o o
a b c e g h k m n q f o o
a b c e g h k m n w f o o
a b c e g h k m n e f o o
a b c e g h k m n r f o o
a b c e g h l m n q f o o
a b c e g h l m n w f o o
a b c e g h l m n e f o o
a b c e g h l m n r f o o
a b c f g h i m n q f o o
a b c f g h i m n w f o o
a b c f g h i m n e f o o
a b c f g h i m n r f o o
a b c f g h j m n q f o o
a b c f g h j m n w f o o
a b c f g h j m n e f o o
a b c f g h j m n r f o o
a b c f g h k m n q f o o
a b c f g h k m n w f o o
a b c f g h k m n e f o o
a b c f g h k m n r f o o
a b c f g h l m n q f o o
a b c f g h l m n w f o o
a b c f g h l m n e f o o
a b c f g h l m n r f o o

Your question is not really clear. If you want to strip off any part after commas (as your text suggests), then a fairly readable one-liner should do:
cleaned_line = " ".join([field.split(",")[0] for field in line.split()])
If you want to expand lines containing comma-separated fields into multiple lines (as your example suggests), then you should use the itertools.product function:
import itertools
line = "a b c d,e,f g h i,j,k,l m n"
line_fields = [field.split(",") for field in line.split()]
for expanded_line_fields in itertools.product(*line_fields):
    print(" ".join(expanded_line_fields))
This is the output:
a b c d g h i m n
a b c d g h j m n
a b c d g h k m n
a b c d g h l m n
a b c e g h i m n
a b c e g h j m n
a b c e g h k m n
a b c e g h l m n
a b c f g h i m n
a b c f g h j m n
a b c f g h k m n
a b c f g h l m n
If it's important to keep the original spacing, for some reason, then you can replace line.split() by re.findall("([^ ]+| +)", line):
import re
import itertools
line = "a b c d,e,f g h i,j,k,l m n"
line_fields = [field.split(",") for field in re.findall("([^ ]+| +)", line)]
for expanded_line_fields in itertools.product(*line_fields):
    print("".join(expanded_line_fields))
This is the output:
a b c d g h i m n
a b c d g h j m n
a b c d g h k m n
a b c d g h l m n
a b c e g h i m n
a b c e g h j m n
a b c e g h k m n
a b c e g h l m n
a b c f g h i m n
a b c f g h j m n
a b c f g h k m n
a b c f g h l m n

If I have understood your example correctly, you need the following:
import itertools

sss = "a b c d,e,f g h i,j,k,l m n d,e,f "
comma_separated = [i for i in sss.split() if ',' in i]
split_comma_separated = [i.split(',') for i in comma_separated]
# use a generator expression to save memory
symbols = (i for i in itertools.product(*split_comma_separated))
for s in symbols:
    st = sss
    for part, symb in zip(comma_separated, s):
        # replace only the first occurrence, so the same
        # comma-separated group is not substituted twice
        st = st.replace(part, symb, 1)
    print(st.split())  # print() for Python 3 compatibility

Most other answers only produce one line instead of the multiple lines you seem to want.
To achieve what you want, you can work in several ways.
The recursive solution seems the most intuitive to me:
def dothestuff(l):
    for n, i in enumerate(l):
        if ',' in i:
            # found a "," entry
            items = i.split(',')
            for j in items:
                for rest in dothestuff(l[n+1:]):
                    yield l[:n] + [j] + rest
            return
    yield l

line = "a b c d,e,f g h i,j,k,l m n"
for i in dothestuff(line.split()):
    print(i)

# remove each "," together with the single character that follows it
while ',' in line:
    i = line.index(',')
    line = line[:i] + line[i + 2:]

import itertools
line_data = 'a b c d,e,f g h i,j,k,l m n'
comma_fields_indices = [i for i,val in enumerate(line_data.split()) if "," in val]
comma_fields = [i.split(",") for i in line_data.split() if "," in i]
all_comb = []
for val in itertools.product(*comma_fields):
    sline_data = line_data.split()
    for index, word in enumerate(val):
        sline_data[comma_fields_indices[index]] = word
    all_comb.append(" ".join(sline_data))
print(all_comb)


How to append a row containing the columns' sums to a pandas dataframe

Here is what my dataframe looks like:
0 M M W B k a D G 247.719248 39.935064 12.983612 177.537373 214.337385 70.248041 78.162404 215.383443
1 n a Y j A N Q m 39.014265 64.053771 13.677425 169.164911 153.225780 31.095511 198.805600 179.653853
2 j z v I n N I X 152.177940 50.524997 79.063318 181.993409 51.367824 19.294708 217.844628 166.896151
3 n w a Y G B y O 243.468930 92.694170 200.305038 249.760627 156.588164 200.031428 146.933709 202.202242
4 R i h L J a q S 122.006004 34.979958 151.963992 116.795194 74.713682 252.979874 34.272430 45.334396
5 m Y n r u t t b 86.097651 229.911157 75.242197 214.069558 246.390175 235.507510 125.431980 90.467756
6 d i u d f Q a q 135.740363 13.388095 107.297373 10.520204 118.578496 101.770257 177.253815 78.800327
7 n F A x H u b y 55.497867 210.402998 191.356683 6.438180 85.967328 64.461602 157.265270 213.673103
8 q h w i S B h i 253.696469 168.964278 31.592088 160.404929 241.434909 232.280512 116.353252 11.540209
9 a z s d Y z l B 50.440346 80.492069 64.991017 88.663195 155.993675 85.967207 120.467390 71.219658
10 A U W m y R k K 156.153985 15.862058 95.013242 48.339397 235.440190 160.565380 236.421396 59.981690
11 z K K w o c n l 56.310181 210.101571 173.887020 181.040997 193.653296 250.875304 81.096499 234.868844
I want to append a row containing the sum of each column, but the dataframe also contains string values.
I have tried this solution
df.loc['Total'] = df.select_dtypes(include=['float64', 'int64']).sum(axis=0)
But I am getting a sum in the string columns as well, like this:
0 M M W B k a D G 247.719248 39.935064 12.983612 177.537373 214.337385 70.248041 78.162404 215.383443
1 n a Y j A N Q m 39.014265 64.053771 13.677425 169.164911 153.225780 31.095511 198.805600 179.653853
2 j z v I n N I X 152.177940 50.524997 79.063318 181.993409 51.367824 19.294708 217.844628 166.896151
3 n w a Y G B y O 243.468930 92.694170 200.305038 249.760627 156.588164 200.031428 146.933709 202.202242
4 R i h L J a q S 122.006004 34.979958 151.963992 116.795194 74.713682 252.979874 34.272430 45.334396
5 m Y n r u t t b 86.097651 229.911157 75.242197 214.069558 246.390175 235.507510 125.431980 90.467756
6 d i u d f Q a q 135.740363 13.388095 107.297373 10.520204 118.578496 101.770257 177.253815 78.800327
7 n F A x H u b y 55.497867 210.402998 191.356683 6.438180 85.967328 64.461602 157.265270 213.673103
8 q h w i S B h i 253.696469 168.964278 31.592088 160.404929 241.434909 232.280512 116.353252 11.540209
9 a z s d Y z l B 50.440346 80.492069 64.991017 88.663195 155.993675 85.967207 120.467390 71.219658
10 A U W m y R k K 156.153985 15.862058 95.013242 48.339397 235.440190 160.565380 236.421396 59.981690
11 z K K w o c n l 56.310181 210.101571 173.887020 181.040997 193.653296 250.875304 81.096499 234.868844
Total 1598.32 1211.31 1197.37 1604.73 1927.69 1705.08 1690.31 1570.02 1598.323248 1211.310187 1197.373003 1604.727974 1927.690905 1705.077334 1690.308374 1570.021673
Can I keep some placeholder value in the string columns instead? How should it be done?
Any help would be appreciated; I am a newbie to pandas.
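One possible approach (a sketch, assuming an empty string is an acceptable placeholder for the non-numeric columns; the tiny dataframe below is illustrative, not the question's data): sum only the numeric columns, then reindex the result over all columns with a fill value before assigning the row.

```python
import pandas as pd

df = pd.DataFrame({'letters': ['M', 'n', 'j'],
                   'x': [247.7, 39.0, 152.2],
                   'y': [39.9, 64.1, 50.5]})

# sum only the numeric columns, as in the question's attempt
totals = df.select_dtypes(include=['float64', 'int64']).sum(axis=0)
# reindex over every column, so string columns get '' instead of a sum
df.loc['Total'] = totals.reindex(df.columns, fill_value='')
print(df.loc['Total'].tolist())
```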

Different slices give different inequalities for same elements

import numpy as np
a = np.array([.4], dtype='float32')
b = np.array([.4, .6])
print(a > b)
print(a > b[0], a > b[1])
print(a[0] > b[0], a[0] > b[1])
[ True False]
[False] [False]
True False
What's the deal? Yes, b.dtype == 'float64', but so are its slices b[0] & b[1], and a remains 'float32'.
Note: I'm asking why this occurs, not how to circumvent it, which I know (e.g. cast both to 'float64').
As I've noted in another answer, type casting in numpy is pretty complicated, and this is the root cause of the behaviour you are seeing. The documents linked in that answer make it clear that scalars (and 0d arrays) and 1d arrays differ in type conversions, since only the former are considered value by value.
The first half of the problem you already know: type conversion happens differently across your cases:
>>> (a + b).dtype
dtype('float64')
>>> (a + b[0]).dtype
dtype('float32')
>>> (a[0] + b[0]).dtype
dtype('float64')
There's also a helper called numpy.result_type() that can tell you the same information without having to perform the binary operation:
>>> np.result_type(a, b)
dtype('float64')
>>> np.result_type(a, b[0])
dtype('float32')
>>> np.result_type(a[0], b[0])
dtype('float64')
I believe we can understand what's happening in your example if we consider the type conversion tables:
>>> from numpy.testing import print_coercion_tables
can cast
[...]
In these tables, ValueError is '!', OverflowError is '@', TypeError is '#'
scalar + scalar
+ ? b h i l q p B H I L Q P e f d g F D G S U V O M m
? ? b h i l q l B H I L Q L e f d g F D G # # # O ! m
b b b h i l q l h i l d d d e f d g F D G # # # O ! m
h h h h i l q l h i l d d d f f d g F D G # # # O ! m
i i i i i l q l i i l d d d d d d g D D G # # # O ! m
l l l l l l q l l l l d d d d d d g D D G # # # O ! m
q q q q q q q q q q q d d d d d d g D D G # # # O ! m
p l l l l l q l l l l d d d d d d g D D G # # # O ! m
B B h h i l q l B H I L Q L e f d g F D G # # # O ! m
H H i i i l q l H H I L Q L f f d g F D G # # # O ! m
I I l l l l q l I I I L Q L d d d g D D G # # # O ! m
L L d d d d d d L L L L Q L d d d g D D G # # # O ! m
Q Q d d d d d d Q Q Q Q Q Q d d d g D D G # # # O ! m
P L d d d d d d L L L L Q L d d d g D D G # # # O ! m
e e e f d d d d e f d d d d e f d g F D G # # # O ! #
f f f f d d d d f f d d d d f f d g F D G # # # O ! #
d d d d d d d d d d d d d d d d d g D D G # # # O ! #
g g g g g g g g g g g g g g g g g g G G G # # # O ! #
F F F F D D D D F F D D D D F F D G F D G # # # O ! #
D D D D D D D D D D D D D D D D D G D D G # # # O ! #
G G G G G G G G G G G G G G G G G G G G G # # # O ! #
S # # # # # # # # # # # # # # # # # # # # # # # O ! #
U # # # # # # # # # # # # # # # # # # # # # # # O ! #
V # # # # # # # # # # # # # # # # # # # # # # # O ! #
O O O O O O O O O O O O O O O O O O O O O O O O O ! #
M ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
m m m m m m m m m m m m m m # # # # # # # # # # # ! m
scalar + neg scalar
[...]
array + scalar
+ ? b h i l q p B H I L Q P e f d g F D G S U V O M m
? ? b h i l q l B H I L Q L e f d g F D G # # # O ! m
b b b b b b b b b b b b b b e f d g F D G # # # O ! m
h h h h h h h h h h h h h h f f d g F D G # # # O ! m
i i i i i i i i i i i i i i d d d g D D G # # # O ! m
l l l l l l l l l l l l l l d d d g D D G # # # O ! m
q q q q q q q q q q q q q q d d d g D D G # # # O ! m
p l l l l l l l l l l l l l d d d g D D G # # # O ! m
B B B B B B B B B B B B B B e f d g F D G # # # O ! m
H H H H H H H H H H H H H H f f d g F D G # # # O ! m
I I I I I I I I I I I I I I d d d g D D G # # # O ! m
L L L L L L L L L L L L L L d d d g D D G # # # O ! m
Q Q Q Q Q Q Q Q Q Q Q Q Q Q d d d g D D G # # # O ! m
P L L L L L L L L L L L L L d d d g D D G # # # O ! m
e e e e e e e e e e e e e e e e e e F F F # # # O ! #
f f f f f f f f f f f f f f f f f f F F F # # # O ! #
d d d d d d d d d d d d d d d d d d D D D # # # O ! #
g g g g g g g g g g g g g g g g g g G G G # # # O ! #
F F F F F F F F F F F F F F F F F F F F F # # # O ! #
D D D D D D D D D D D D D D D D D D D D D # # # O ! #
G G G G G G G G G G G G G G G G G G G G G # # # O ! #
S # # # # # # # # # # # # # # # # # # # # # # # O ! #
U # # # # # # # # # # # # # # # # # # # # # # # O ! #
V # # # # # # # # # # # # # # # # # # # # # # # O ! #
O O O O O O O O O O O O O O O O O O O O O O O O O ! #
M ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
m m m m m m m m m m m m m m # # # # # # # # # # # ! m
[...]
The above is part of the current promotion tables for value-based promotion. It denotes how differing types contribute to a result type when pairing two numpy objects of a given kind (see the first column and first row for the specific types). The types are to be understood according to the single-character dtype specifications (below "One-character strings"), in particular np.dtype('f') corresponds to np.float32 (f for C-style float) and np.dtype('d') (d for C-style double) to np.float64 (see also np.typename('f') and the same for 'd').
I have noted two items in boldface in the above tables:
scalar f + scalar d --> d
array f + scalar d --> f
Now let's look at your cases. The premise is that you have an 'f' array a and a 'd' array b. The fact that a only has a single element doesn't matter: it's a 1d array with length 1 rather than a 0d array.
When you do a > b you are comparing two arrays, this is not denoted in the above tables. I'm not sure what the behaviour is here; my guess is that a gets broadcast to b's shape and then its type is cast to 'd'. The reason I think this is that np.can_cast(a, np.float64) is True and np.can_cast(b, np.float32) is False. But this is just a guess, a lot of this machinery in numpy is not intuitive to me.
When you do a > b[0] you are comparing a 'f' array to a 'd' scalar, so according to the above you get a 'f' array. That's what (a + b[0]).dtype told us. (When you use a > b[0] you don't see the conversion step, because the result is always a bool.)
When you do a[0] > b[0] you are comparing a 'f' scalar to a 'd' scalar, so according to the above you get a 'd' scalar. That's what (a[0] + b[0]).dtype told us.
So I believe this is all consistent with the quirks of type conversion in numpy. While it might seem like an unfortunate corner case with the value of 0.4 in double and single precision, this feature goes deeper and the problem serves as a big red warning that you should be very careful when mixing different dtypes.
The safest course of action is to convert your types yourself in order to control what happens in your code. Especially since there's discussion about reconsidering some aspects of type promotion.
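Applying that advice to the question's arrays might look like this (a minimal sketch: cast a up to float64 so every comparison happens at a single precision):

```python
import numpy as np

a = np.array([.4], dtype='float32')
b = np.array([.4, .6])

# explicit cast: no value-based promotion surprises
a64 = a.astype(np.float64)
print(a64 > b)                       # elementwise, both operands float64
print(a64[0] > b[0], a64[0] > b[1])  # scalar comparisons now agree
```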
As a side note (for now), there's a work-in-progress NEP 50 created in May 2021 that explains how confusing type promotion can be when scalars are involved, and plans to simplify some of the rules eventually. Since this also involves breaking changes, its implementation in NumPy proper won't happen overnight.

Reformat Beautiful Soup Output to include CSS

I am trying to parse the text of emails to expedite my workflow using Python. I first save the email as a .htm file on my local drive. Then I try pulling certain pieces of information out of a table within the email using a Jupyter Notebook. Whenever I create the soup, the result is a spaced-out text field. I am unable to use this soup to make proper HTML calls to pull data. How can I reformat the soup?
The .htm file is already text, but I would still like to use Beautiful Soup to help me parse through the text field. Should I be trying a different parse method?
from bs4 import BeautifulSoup
raw_file = open(r"C:\Users\Desktop\Example.htm").read()
soup = BeautifulSoup(raw_file, 'lxml')
print(soup)
I expected a nicely formatted soup file, instead, this is what the print statement returns:
<html><body>
<p>ÿþh t m l x m l n s : v = " u r n : s c h e m a s - m i c r o s o f t - c o m : v m l "
x m l n s : o = " u r n : s c h e m a s - m i c r o s o f t - c o m : o f f i c e : o f f i c e "
x m l n s : w = " u r n : s c h e m a s - m i c r o s o f t - c o m : o f f i c e : w o r d "
x m l n s : m = " h t t p : / / s c h e m a s . m i c r o s o f t . c o m / o f f i c e / 2 0 0 4 / 1 2 / o m m l "
x m l n s = " h t t p : / / w w w . w 3 . o r g / T R / R E C - h t m l 4 0 " >
h e a d >
m e t a h t t p - e q u i v = C o n t e n t - T y p e c o n t e n t = " t e x t / h t m l ; c h a r s e t = u n i c o d e " >
m e t a n a m e = P r o g I d c o n t e n t = W o r d . D o c u m e n t >
m e t a n a m e = G e n e r a t o r c o n t e n t = " M i c r o s o f t W o r d 1 5 " >
m e t a n a m e = O r i g i n a t o r c o n t e n t = " M i c r o s o f t W o r d 1 5 " >
b a s e
When I call -
print(raw_file)
the following returns:
ÿþ<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=unicode">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 15">
<meta name=Originator content="Microsoft Word 15">
<base
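The ÿþ prefix and the space-separated letters suggest the .htm file is UTF-16-encoded (ÿþ is a byte-order mark) and is being read with the wrong encoding, so every other byte shows up as a stray character. A sketch of one way around this (the sample markup below is a stand-in for the real file; in practice you would pass encoding='utf-16' to open):

```python
from bs4 import BeautifulSoup

# stand-in for the saved email: UTF-16 bytes with a BOM (the "ÿþ" prefix)
raw_bytes = '<html><body><p>Example table cell</p></body></html>'.encode('utf-16')

# in practice: text = open(r"C:\Users\Desktop\Example.htm", encoding="utf-16").read()
text = raw_bytes.decode('utf-16')

soup = BeautifulSoup(text, 'html.parser')
print(soup.p.get_text())  # Example table cell
```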

How to leave only one defined sub-string in a string in Python

Say I have one of the strings:
"a b c d e f f g" || "a b c f d e f g"
And I want there to be only one occurrence of a substring (f in this instance) throughout the string so that it is somewhat sanitized.
The result of each string would be:
"a b c d e f g" || "a b c d e f g"
An example of the use would be:
str = "a b c d e f g g g g g h i j k l"
str.leaveOne("g")
#// a b c d e f g h i j k l
If it doesn't matter which instance you leave, you can use str.replace, which takes a parameter signifying the number of replacements you want to perform:
def leave_one_last(source, to_remove):
    return source.replace(to_remove, '', source.count(to_remove) - 1)
This will leave the last occurrence.
We can modify it to leave the first occurrence by reversing the string twice:
def leave_one_first(source, to_remove):
    return source[::-1].replace(to_remove, '', source.count(to_remove) - 1)[::-1]
However, that is ugly, not to mention inefficient. A more elegant way might be to take the substring that ends with the first occurrence of the character to find, replace occurrences of it in the rest, and finally concatenate them together:
def leave_one_first_v2(source, to_remove):
    first_index = source.index(to_remove) + 1
    return source[:first_index] + source[first_index:].replace(to_remove, '')
If we try this:
string = "a b c d e f g g g g g h i j k l g"
print(leave_one_last(string, 'g'))
print(leave_one_first(string, 'g'))
print(leave_one_first_v2(string, 'g'))
Output:
a b c d e f h i j k l g
a b c d e f g h i j k l
a b c d e f g h i j k l
If you don't want to keep spaces, then you should use a version based on split:
def leave_one_split(source, to_remove):
    chars = source.split()
    first_index = chars.index(to_remove) + 1
    return ' '.join(chars[:first_index] + [char for char in chars[first_index:] if char != to_remove])
string = "a b c d e f g g g g g h i j k l g"
print(leave_one_split(string, 'g'))
Output:
'a b c d e f g h i j k l'
If I understand correctly, you can just use a regex and re.sub to look for groups of two or more of your letter with or without a space and replace it by a single instance:
import re
def leaveOne(s, char):
    return re.sub(r'((%s\s?)){2,}' % char, r'\1', s)
leaveOne("a b c d e f g g g h i j k l", 'g')
# 'a b c d e f g h i j k l'
leaveOne("a b c d e f ggg h i j k l", 'g')
# 'a b c d e f g h i j k l'
leaveOne("a b c d e f g h i j k l", 'g')
# 'a b c d e f g h i j k l'
EDIT
If the goal is to get rid of all occurrences of the letter except one, you can still use a regex with a lookahead to select all letters followed by the same:
import re
def leaveOne(s, char):
    return re.sub(r'(%s)\s?(?=.*?\1)' % char, '', s)
print(leaveOne("a b c d e f g g g h i j k l g", 'g'))
# 'a b c d e f h i j k l g'
print(leaveOne("a b c d e f ggg h i j k l gg g", 'g'))
# 'a b c d e f h i j k l g'
print(leaveOne("a b c d e f g h i j k l", 'g'))
# 'a b c d e f g h i j k l'
This should even work with more complicated patterns like:
leaveOne("a b c ffff d e ff g", 'ff')
# 'a b c d e ff g'
Given the string:
mystr = 'defghhabbbczasdvakfafj'

cache = {}
seq = 0
for i in mystr:
    if i not in cache:
        cache[i] = seq
        print(cache[i])
        seq += 1

mylist = []
Here I order the dictionary by its values:
for key, value in sorted(cache.items(), key=lambda x: x[1]):
    mylist.append(key)
print("".join(mylist))

regex pattern won't return in python script

Why does the first snippet return digits, but the latter does not? I have tried more complicated expressions without success. The expressions I use are valid according to pythex.org, but do not work in the script.
(\d{6}-){7}\d{6} is one such expression. I've tested it against this string: 123138-507716-007469-173316-534644-033330-675057-093280
import re
pattern = re.compile('(\d{1})')
load_file = open('demo.txt', 'r')
search_file = load_file.read()
result = pattern.findall(search_file)
print(result)
==============
import re
pattern = re.compile('(\d{6})')
load_file = open('demo.txt', 'r')
search_file = load_file.read()
result = pattern.findall(search_file)
print(result)
When I put the string into a variable and then search the variable it works just fine. This should work as is. But it doesn't help if I want to read a text file. I've tried to read each line of the file and that seems to be where the script breaks down.
import re
pattern = re.compile('((\d{6}-){7})')
#pattern = re.compile('(\d{6})')
#load_file = open('demo.txt', 'r')
#search_file = load_file.read()
test_string = '123138-507716-007469-173316-534644-033330-675057-093280'
result = pattern.findall(test_string)
print(result)
=========
printout,
Search File:
ÿþB i t L o c k e r D r i v e E n c r y p t i o n R e c o v e r y K e y
T h e r e c o v e r y k e y i s u s e d t o r e c o v e r t h e d a t a o n a B i t L o c k e r p r o t e c t e d d r i v e .
T o v e r i f y t h a t t h i s i s t h e c o r r e c t r e c o v e r y k e y c o m p a r e t h e i d e n t i f i c a t i o n w i t h w h a t i s p r e s e n t e d o n t h e r e c o v e r y s c r e e n .
R e c o v e r y k e y i d e n t i f i c a t i o n : f f s d f a - f s d f - s f
F u l l r e c o v e r y k e y i d e n t i f i c a t i o n : 8 8 8 8 8 8 8 8 - 8 8 8 8 - 8 8 8 8 - 8 8 8 8 - 8 8 8 8 8 8 8 8 8 8 8
B i t L o c k e r R e c o v e r y K e y :
1 1 1 1 1 1 - 1 1 1 1 1 1 - 1 1 1 1 1 1 - 1 1 1 1 1 1 - 1 1 1 1 1 1 - 1 1 1 1 1 1 - 1 1 1 1 1 1 - 1 1 1 1 1 1
6 6 6 6 6 6
Search Results:
[]
Process finished with exit code 0
================
This is where I ended up. It finds the string just fine and without the commas.
import re

pattern = re.compile(r'(\w{6}-\w{6}-\w{6}-\w{6}-\w{6}-\w{6}-\w{6}-\w{6})')
load_file = open('demo3.txt', 'r')
for line in load_file:
    print(pattern.findall(line))
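The spaced-out letters and the leading ÿþ in the printout suggest demo.txt is UTF-16-encoded (ÿþ is a byte-order mark), so a single-byte read never lines the digits up for the original pattern. A sketch of how decoding the file first could make that pattern work (the inline bytes below stand in for the file; in practice you would pass encoding='utf-16' to open):

```python
import re

# stand-in for demo.txt: UTF-16 bytes with a BOM, as the "ÿþ" prefix suggests
raw_bytes = ('BitLocker Recovery Key:\n'
             '123138-507716-007469-173316-534644-033330-675057-093280\n').encode('utf-16')

# in practice: text = open('demo.txt', encoding='utf-16').read()
text = raw_bytes.decode('utf-16')

pattern = re.compile(r'(?:\d{6}-){7}\d{6}')
print(pattern.findall(text))  # ['123138-507716-007469-173316-534644-033330-675057-093280']
```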
