Get list from string with exec in python - python

I have:
"[15765,22832,15289,15016,15017]"
I want:
[15765,22832,15289,15016,15017]
What should I do to convert this string to a list?
P.S. The post was edited without my permission and lost an important part: the line that looks like a list has type 'bytes', not 'str'.
P.S. 2. My initial code was:
import urllib.request, re
f = urllib.request.urlopen("http://www.finam.ru/cache/icharts/icharts.js")
lines = f.readlines()
for line in lines:
    m = re.match(r'var\s+(\w+)\s*=\s*\[\s*(.+)\s*\];', line.decode('windows-1251'))
    if m is not None:
        varname = m.group(1)
        if varname == "aEmitentIds":
            aEmitentIds = line  # its type is 'bytes', not 'str'
I need to get a list from this line.
The line from the web page looks like:
[15765, 22832, 15289, 15016, 15017]

Assuming s is your string, you can use split and then convert each number to an integer:
s = [int(number) for number in s[1:-1].split(',')]
For detailed information about the split function, see:
Python3 split documentation
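Since the question mentions that the value is actually bytes, here is a minimal sketch of the same split approach that decodes first. The helper name parse_int_list is made up for illustration, and it assumes the value holds only the bracketed list and uses the windows-1251 encoding from the question's code.

```python
def parse_int_list(raw, encoding="windows-1251"):
    """Turn '[1, 2, 3]' (str or bytes) into [1, 2, 3]."""
    # Decode first if the value arrived as bytes, as in the question.
    text = raw.decode(encoding) if isinstance(raw, bytes) else raw
    # Drop surrounding whitespace and the square brackets.
    inner = text.strip().strip("[]")
    # int() tolerates spaces around each number; skip empty pieces.
    return [int(part) for part in inner.split(",") if part.strip()]


print(parse_int_list(b"[15765, 22832, 15289, 15016, 15017]"))
```

The empty-piece check means an empty list "[]" comes back as [] instead of raising ValueError.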

What you have is a stringified list. You could use a JSON parser to parse it into the corresponding list:
import json
test_str = "[15765,22832,15289,15016,15017]"
l = json.loads(test_str) # List that you need.
Another way to do this is to use ast.literal_eval:
import ast
test_str = "[15765,22832,15289,15016,15017]"
data = ast.literal_eval(test_str)
The result is
[15765, 22832, 15289, 15016, 15017]
To understand why using eval() is bad practice you could refer to this answer
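To tie this back to the code in the question, here is a rough sketch that pulls the bracketed part out of a JavaScript line and hands it to json.loads. The function name extract_js_array and the sample line are assumptions for illustration; it also assumes the array contains only numbers, since JSON parsing would fail on things like single-quoted JavaScript strings.

```python
import json
import re

# Captures the variable name and the bracketed array of a line such as
# "var aEmitentIds = [1, 2, 3];"
PATTERN = re.compile(r"var\s+(\w+)\s*=\s*(\[.*\])\s*;")


def extract_js_array(line, name, encoding="windows-1251"):
    """Return the array assigned to `name` in a bytes line, or None."""
    text = line.decode(encoding)
    match = PATTERN.match(text)
    if match is None or match.group(1) != name:
        return None
    # The captured group is plain text like "[1, 2, 3]", which is valid JSON.
    return json.loads(match.group(2))


# Hypothetical sample line in the shape the question describes.
sample = b"var aEmitentIds = [15765, 22832, 15289, 15016, 15017];\n"
print(extract_js_array(sample, "aEmitentIds"))
```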

You can also use regex to pull out numeric values from the string as follows:
import re
lst = "[15765,22832,15289,15016,15017]"
lst = [int(number) for number in re.findall(r'\d+', lst)]
The output of the above code is:
[15765, 22832, 15289, 15016, 15017]

Related

how to print after the keyword from python?

I have the following bytes string in Python:
b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
I want to print the text that follows the key "name", so that my output is
waqas
Note that waqas can change to any other value, so I want to print whatever name follows the key name, using a string operation or regex.
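First you need to decode the value, since it is a bytes object (note the b prefix). Then use ast.literal_eval to build the dictionary, after which you can access the value by key. This works here because the JSON contains no true, false, or null, which literal_eval cannot parse.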
>>> s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
>>> import ast
>>> ast.literal_eval(s.decode())['name']
'waqas'
It is likely you should be reading your data into your program in a different manner than you are doing now.
If I assume your data is inside a JSON file, try something like the following, using the built-in json module:
import json
with open(filename) as fp:
    data = json.load(fp)
print(data['name'])
If you want a more algorithmic way to extract the value of name:
s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a",\
"persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],\
"name":"waqas"}'
s = s.decode("utf-8")
key = '"name":"'
start = s.find(key) + len(key)
stop = s.find('"', start)  # search from start so an empty name is handled too
extracted_string = s[start : stop]
print(extracted_string)
Output:
waqas
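Since the question also asks about regex, here is a small sketch of that route, working directly on the bytes. The name name_from_bytes is made up for this example, and the pattern assumes the name value contains no escaped double quotes.

```python
import re

# Bytes pattern: the key "name", a colon, then the quoted value.
NAME_RE = re.compile(rb'"name"\s*:\s*"([^"]*)"')


def name_from_bytes(raw):
    """Return the value of "name" as str, or None if the key is absent."""
    match = NAME_RE.search(raw)
    return match.group(1).decode("utf-8") if match else None


s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
print(name_from_bytes(s))
```

For anything beyond a quick extraction, parsing the JSON properly is usually the safer choice.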
You can convert the bytes into a dictionary with json.loads(), which accepts bytes directly on Python 3.6 and later:
import json
mystring = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
mydict = json.loads(mystring)
print(mydict["name"])
# output: waqas
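If the payload might be malformed or might lack the key, a slightly more defensive version of the same idea could look like this. The function name get_name and its default parameter are assumptions for illustration, not part of the json module.

```python
import json


def get_name(payload, default=None):
    """Parse a JSON object (str or bytes) and return its "name" value."""
    try:
        data = json.loads(payload)
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueErrors.
        return default
    if not isinstance(data, dict):
        return default
    return data.get("name", default)


print(get_name(b'{"personId":"65a83de6","name":"waqas"}'))
```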
The b prefix only marks a bytes literal; it is not part of the data, so slicing with x[1:] would drop the opening brace instead. Decode the bytes to a string and then parse it as JSON. Suppose you have a variable x:
import json
x = x.decode("utf-8")
data = json.loads(x)  # convert the JSON string into a dictionary
print(data["name"])

How to delete everything from string up to the specific character in Python

I want to extract only the date from the following string. Here is the variable:
file = '62-201809.csv'
I used rsplit to get rid of the .csv file extension like this:
splitf = file.rsplit('.', 1)[0]
I got 62-201809, which is fine, but now I need to get rid of everything up to the '-' and store only 201809 in a variable. How can I do it?
Try using:
>>> file = '62-201809.csv'
>>> file.split('-', 1)[1].split('.')[0]
'201809'
>>>
Or use regex:
>>> import re
>>> file = '62-201809.csv'
>>> re.search(r'-(\d+)', file).group(1)
'201809'
>>>
If you want to use only split, you can do this:
filen = '62-201809.csv'
number = filen.split('.')[0]
number2 = number.split('-')[1]
print(number2)
This first strips the extension to get 62-201809, then keeps only the part after the hyphen, 201809.
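Another option, sketched here with pathlib and str.partition, avoids indexing into split results, so a name without a hyphen does not raise IndexError. The helper name date_part is made up for this example.

```python
from pathlib import Path


def date_part(filename):
    """Return the part of the file stem after the first hyphen."""
    stem = Path(filename).stem  # '62-201809.csv' -> '62-201809'
    _, sep, tail = stem.partition("-")
    # If there was no hyphen at all, fall back to the whole stem.
    return tail if sep else stem


print(date_part("62-201809.csv"))
```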

Extract list of words from filenames

I need to get a list of words contained in some file names. Here are the files:
sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
I need the part that comes after task- and before the next _, so my list should look like:
['FmriPictures','FmriVernike','FmriWgWords','RestingState']
How can I implement this in Python 3?
Here's a Python solution that uses regex.
>>> import re
>>> test_str = 'sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii'
>>> re.search('task-(.*?)_', test_str).group(1)
'FmriPictures'
I think you can do the same for every string.
l=["sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii"]
k = []
for i in l:
    k.append(i.split('-')[2].replace("_space", ""))
print(k)
That's just one approach.
You can loop over your list and use regex to get the names from the strings like this example:
import re
a = ['sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
'sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
'sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
'sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii']
out = []
for elm in a:
    condition = re.search(r'_task-(.*?)_', elm)
    if condition:
        out.append(condition.group(1))
print(out)
Output:
['FmriPictures', 'FmriVernike', 'FmriWgWords', 'RestingState']
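The same loop can be condensed into a list comprehension with a precompiled pattern. This is only a sketch: the function name task_names is hypothetical, and the pattern assumes task names never contain an underscore.

```python
import re

# The task name is whatever sits between "_task-" and the next underscore.
TASK_RE = re.compile(r"_task-([^_]+)_")


def task_names(filenames):
    """Collect the task name from each filename that has one."""
    matches = (TASK_RE.search(name) for name in filenames)
    # Filenames without a "_task-..._" part are silently skipped.
    return [m.group(1) for m in matches if m]


files = [
    "sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
]
print(task_names(files))
```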
I would just simply replace
sub-Dzh_task-
and
_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
with an empty string. Strip those two parts out and you are left with the task names.

Python split rows in txt file

I have a .txt list with many rows in the following format:
3AD544532F-272|5SD32332S-F72|5FD2124L-Y21|4WA32332P-A26|6DW3224C-I72
(...)
How can I split these numbers on the vertical bar | so that I get an output .txt file with the following items:
3AD544532F-272-14
5SD32332S-F72-12
5FD2124L-Y21-41
4WA32332P-A26-17
6DW3224C-I72-41
I tried using this script, but it did not give the correct result.
import sys
output = open("output_list.txt","w")
print("read line and put it into list or array as you like to call it")
list = open("list.txt").read().splitlines()
for i in list:
    re.compiler()
input()
If you have a string like 3AD544532F-272|5SD32332S-F72|5FD2124L-Y21|4WA32332P-A26|6DW3224C-I72, you can split it on | using the built-in split method:
line = '3AD544532F-272|5SD32332S-F72|5FD2124L-Y21|4WA32332P-A26|6DW3224C-I72'
your_splitted_line = line.split("|")
print(your_splitted_line)
# ['3AD544532F-272', '5SD32332S-F72', '5FD2124L-Y21', '4WA32332P-A26', '6DW3224C-I72']
split() is indeed the best solution. The alternative using regex is:
>>> import re
>>> text = "3AD544532F-272|5SD32332S-F72|5FD2124L-Y21|4WA32332P-A26|6DW3224C-I72"
>>> for i in re.split(r'\|', text):
...     print(i)
...
3AD544532F-272
5SD32332S-F72
5FD2124L-Y21
4WA32332P-A26
6DW3224C-I72
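Putting the split together with the file handling the question asks for could look roughly like this. The function names split_rows and convert_file are made up for illustration; list.txt and output_list.txt are the file names from the question's script.

```python
from pathlib import Path


def split_rows(text, sep="|"):
    """Split every row on `sep` and return all items in one flat list."""
    items = []
    for row in text.splitlines():
        # Ignore empty pieces, e.g. from blank rows or a trailing separator.
        items.extend(part.strip() for part in row.split(sep) if part.strip())
    return items


def convert_file(src, dst):
    """Read `src`, split its rows, and write one item per line to `dst`."""
    items = split_rows(Path(src).read_text())
    Path(dst).write_text("\n".join(items) + "\n")
    return items


print(split_rows("3AD544532F-272|5SD32332S-F72|5FD2124L-Y21"))
# With the question's files: convert_file("list.txt", "output_list.txt")
```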

appending regex matches to a dictionary

I have a file in which there is the following info:
dogs_3351.txt:34.13559322033898
cats_1875.txt:23.25581395348837
cats_2231.txt:22.087912087912088
elephants_3535.txt:37.092592592592595
fish_1407.txt:24.132530120481928
fish_2078.txt:23.470588235294116
fish_2041.txt:23.564705882352943
fish_666.txt:23.17241379310345
fish_840.txt:21.77173913043478
I'm looking for a way to match the colon and append whatever appears after it (the numbers) to a dictionary whose keys are the names of the animals at the beginning of each line.
Actually, regular expressions are unnecessary, provided that your data is well formatted and contains no surprises.
Assuming that data is a variable containing the string that you listed above:
dict(item.split(":") for item in data.split())
t = """
dogs_3351.txt:34.13559322033898
cats_1875.txt:23.25581395348837
cats_2231.txt:22.087912087912088
elephants_3535.txt:37.092592592592595
fish_1407.txt:24.132530120481928
fish_2078.txt:23.470588235294116
fish_2041.txt:23.564705882352943
fish_666.txt:23.17241379310345
fish_840.txt:21.77173913043478
"""
import re
d = {}
for p, q in re.findall(r'^(.+?)_.+?:(.+)', t, re.M):
    d.setdefault(p, []).append(q)
print(d)
Why not use the string find method to locate the index of the colon, which you can then use to slice the string?
>>> x='dogs_3351.txt:34.13559322033898'
>>> key_index = x.find(':')
>>> key = x[:key_index]
>>> key
'dogs_3351.txt'
>>> value = x[key_index+1:]
>>> value
'34.13559322033898'
>>>
Read each line of the file as text and process the lines individually as above.
Without regex and using defaultdict:
from collections import defaultdict
data = """dogs_3351.txt:34.13559322033898
cats_1875.txt:23.25581395348837
cats_2231.txt:22.087912087912088
elephants_3535.txt:37.092592592592595
fish_1407.txt:24.132530120481928
fish_2078.txt:23.470588235294116
fish_2041.txt:23.564705882352943
fish_666.txt:23.17241379310345
fish_840.txt:21.77173913043478"""
dictionary = defaultdict(list)
for l in data.splitlines():
    animal = l.split('_')[0]
    number = l.split(':')[-1]
    dictionary[animal].append(number)
Just make sure your data is well formatted.
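If the data is not guaranteed to be clean, a variant that skips blank or malformed lines and stores the numbers as floats might look like this. The function name group_scores is hypothetical, and converting to float is an assumption about how the values will be used.

```python
from collections import defaultdict


def group_scores(lines):
    """Map each animal name to the list of numbers found after the colon."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        # rpartition splits on the last colon; sep is '' if there is none.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # blank line or no colon: skip it
        animal = name.split("_", 1)[0]
        groups[animal].append(float(value))
    return dict(groups)


sample = [
    "dogs_3351.txt:34.13559322033898",
    "cats_1875.txt:23.25581395348837",
    "cats_2231.txt:22.087912087912088",
]
print(group_scores(sample))
```

Because a file object iterates over its lines, the same function can be called as group_scores(fp) inside a with open(...) block.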
