Python: modify strings in nested lists

I have strings in a nested list structure. Can someone give me a tip on how to modify the strings in a for loop?
For example, I am trying to delete the last few characters of each string, i.e. this value: /CLG-MAXFLOW
If I do
example = 'slipstream_internal/slipstream_hq/36/CLG-MAXFLOW'
print(example[0:36])
This is what I am looking for:
'slipstream_internal/slipstream_hq/36'
But how can I apply this to strings inside nested lists?
devices = [['slipstream_internal/slipstream_hq/36/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/38/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/31/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/21/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/29/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/25/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/9/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/6/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/13/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/14/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/30/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/19/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/8/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/26/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/24/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/34/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/11/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/27/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/20/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/23/CLG-MAXFLOW'],
['slipstream_internal/slipstream_hq/15/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/37/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/39/CLG-MAXFLOW',
'slipstream_internal/slipstream_hq/10/CLG-MAXFLOW']]

Not exactly the best solution but worth giving a try:
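def check_about(lists: list):
    suffix = '/CLG-MAXFLOW'
    for i, j in enumerate(lists):
        if isinstance(j, list):
            check_about(j)  # recurse into nested lists
        elif j.endswith(suffix):
            # slice the suffix off (str.strip would remove characters, not a suffix)
            lists[i] = j[:-len(suffix)]
    return lists
print(check_about(devices))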
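A caveat on `str.strip` for this kind of task: its argument is treated as a set of characters to remove from both ends, not as a literal suffix. It happens to give the right result for the sample data, but it can eat too much on other inputs. The sketch below shows the difference; the helper name `remove_suffix` and the sample `path` are made up for illustration (on Python 3.9+, the built-in `str.removesuffix` does the same job as the helper).

```python
def remove_suffix(text, suffix):
    # Cut the suffix off only when the string really ends with it
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

# Hypothetical path whose middle segment consists only of characters from the suffix
path = 'hq/FLOW/CLG-MAXFLOW'

stripped = path.strip('/CLG-MAXFLOW')          # removes characters, so it over-strips
trimmed = remove_suffix(path, '/CLG-MAXFLOW')  # removes the exact suffix only

print(stripped)  # hq
print(trimmed)   # hq/FLOW
```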

If your output must be the same structure given by devices variable, but with the string changed, then you can do this:
for row in devices:
    for index, string in enumerate(row):
        row[index] = '/'.join(string.split('/')[:-1])
Output:
[['slipstream_internal/slipstream_hq/36'],
['slipstream_internal/slipstream_hq/38',
'slipstream_internal/slipstream_hq/31'],
['slipstream_internal/slipstream_hq/21',
'slipstream_internal/slipstream_hq/29'],
['slipstream_internal/slipstream_hq/25',
'slipstream_internal/slipstream_hq/9',
'slipstream_internal/slipstream_hq/6'],
['slipstream_internal/slipstream_hq/13',
'slipstream_internal/slipstream_hq/14',
'slipstream_internal/slipstream_hq/30'],
['slipstream_internal/slipstream_hq/19',
'slipstream_internal/slipstream_hq/8',
'slipstream_internal/slipstream_hq/26',
'slipstream_internal/slipstream_hq/24'],
['slipstream_internal/slipstream_hq/34',
'slipstream_internal/slipstream_hq/11',
'slipstream_internal/slipstream_hq/27',
'slipstream_internal/slipstream_hq/20',
'slipstream_internal/slipstream_hq/23'],
['slipstream_internal/slipstream_hq/15',
'slipstream_internal/slipstream_hq/37',
'slipstream_internal/slipstream_hq/39',
'slipstream_internal/slipstream_hq/10']]
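If changing `devices` in place is not required, a nested list comprehension can build a new list with the same structure instead. This is only a sketch: the sample is shortened to two rows, and the name `trimmed` is made up. A string without any `/` would be left unchanged by `rsplit`.

```python
# Shortened sample of the nested list from the question
devices = [
    ['slipstream_internal/slipstream_hq/36/CLG-MAXFLOW'],
    ['slipstream_internal/slipstream_hq/38/CLG-MAXFLOW',
     'slipstream_internal/slipstream_hq/31/CLG-MAXFLOW'],
]

# rsplit('/', 1) splits once from the right; [0] keeps everything before the last '/'
trimmed = [[s.rsplit('/', 1)[0] for s in row] for row in devices]

print(trimmed)
```

The original `devices` list is left untouched, which can be handy if the full paths are still needed later.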

Related

Given two lists of words, return them combined as a dictionary

Hey (sorry for my bad English), I am going to try to make my question clearer. Say I have a function create_username_dict(name_list, username_list) that takes two lists: name_list, with the names of people, and username_list, with usernames made from those names. What I want to do is take those two lists and combine them into a dictionary.
Like this:
>>> name_list = ["Ola Nordmann", "Kari Olsen", "Roger Jensen"]
>>> username_list = ["alejon", "carli", "hanri"]
>>> create_username_dict(name_list, username_list)
{
"Ola Nordmann": "alejon",
"Kari Olsen": "carli",
"Roger Jensen": "hanri"
}
I have tried looking around for how to combine two different lists into one dictionary, but can't seem to find the right solution.
If both lists are in matching order, i.e. the i-th element of one list corresponds to the i-th element of the other, then you can use this
D = dict(zip(name_list, username_list))
Use zip to pair the lists.
d = {key: value for key,value in zip(name_list, username_list)}
print(d)
Output:
{'Ola Nordmann': 'alejon', 'Kari Olsen': 'carli', 'Roger Jensen': 'hanri'}
Assuming both lists have the same length and a one-to-one mapping:
name_list = ["Ola Nordmann", "Kari Olsen", "Roger Jensen"]
username_list = ["alejon", "carli", "hanri"]
result_stackoverflow = dict()
for index, name in enumerate(name_list):
    result_stackoverflow[name] = username_list[index]
print(result_stackoverflow)
>>> {'Ola Nordmann': 'alejon', 'Kari Olsen': 'carli', 'Roger Jensen': 'hanri'}
The answer by @alex does the same but may be too compact for a beginner, so this is the verbose version.
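One thing to keep in mind with `zip`: if the two lists differ in length, it silently stops at the shorter one. A guarded version might look like the sketch below; the length check and its error message are additions for illustration and are not part of the answers above.

```python
def create_username_dict(name_list, username_list):
    # zip() would silently drop the extra items, so fail loudly instead
    if len(name_list) != len(username_list):
        raise ValueError("name_list and username_list must have the same length")
    return dict(zip(name_list, username_list))

name_list = ["Ola Nordmann", "Kari Olsen", "Roger Jensen"]
username_list = ["alejon", "carli", "hanri"]

result = create_username_dict(name_list, username_list)
print(result)
```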

Python: Index slicing from a list for each index in for loop

I got stuck in slicing from a list of data inside a for loop.
list = ['[init.svc.logd]: [running]', '[init.svc.logd-reinit]: [stopped]']
What I am looking for is to print only the key, without its value (running/stopped).
Overall code:
for each in list:
    print(each[:]) #not really sure what may work here
Expected result:
init.svc.logd
Does anyone have a quick solution?
If you want to print only the key, you could use the split function to take whatever is before the : and then replace [ and ] with nothing if you don't want them:
list = ['[init.svc.logd]: [running]', '[init.svc.logd-reinit]: [stopped]']
for each in list:
    # keep the part before ':' and drop the square brackets
    print(each.split(":")[0].replace('[','').replace(']',''))
which gives :
init.svc.logd
init.svc.logd-reinit
You should probably be using a regular expression. The concept of 'key' in the question is ambiguous as there are no data constructs shown that have keys - it's merely a list of strings. So...
import re
list_ = ['[init.svc.logd]: [running]', '[init.svc.logd-reinit]: [stopped]']
for e in list_:
    # raw string, so the backslashes reach the regex engine unchanged
    if r := re.findall(r'\[(.*?)\]', e):
        print(r[0])
Output:
init.svc.logd
init.svc.logd-reinit
Note:
This is more robust than string splitting solutions for cases where data are unexpectedly malformed
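To illustrate the robustness point, here is a sketch that anchors the pattern at the start of each line and skips entries that do not match. The third, malformed entry and the names `KEY_PATTERN`, `lines` and `keys` are made up for this example.

```python
import re

# A '[' at the start of the line, then everything up to the first ']'
KEY_PATTERN = re.compile(r'^\[([^\]]*)\]')

lines = [
    '[init.svc.logd]: [running]',
    '[init.svc.logd-reinit]: [stopped]',
    'garbage without brackets',  # hypothetical malformed entry
]

keys = []
for line in lines:
    match = KEY_PATTERN.match(line)
    if match:  # malformed lines give None and are skipped
        keys.append(match.group(1))

print(keys)
```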

Python reading comma separated list from JSON

I have the following JSON structure given in a python script:
print("Producers: ", metadata['plist']['dict']['array'][2]['dict']['string'])
The problem is that I don't have a single entry in that field; instead I have multiple ones.
Please also see the RAW JSON here: https://pastebin.com/rtTgmwvn
How can I pull out these entries as a comma separated string for [2] which is the producers field?
Thanks in advance
You're almost there. You can do something like this:
print("Producers: ", ", ".join(i["string"] for i in metadata['plist']['dict']['array'][2]['dict']))
To break down the solution: your "dict" element in the JSON is actually a list of "dict", and therefore you can simply iterate over this list:
metadata['plist']['dict']['array'][2]['dict']
where each element is an actual dict with a "string" key.
Update
The format of the JSON is such that in some cases it is a list and in some cases a single element. In that case, I would suggest writing a small function, or using an if statement, that handles each situation:
def get_csv(element):
    if isinstance(element, dict):
        return element["string"]
    return ", ".join(i["string"] for i in element)
# and you would use it like so:
print("Producers: ", get_csv(metadata['plist']['dict']['array'][2]['dict']))
The following should do the trick:
def get_producer_csv(data):
    producers = []
    # named entries rather than dict, to avoid shadowing the built-in
    entries = data["plist"]["dict"]["array"][2]["dict"]
    for entry in entries:
        producers.append(entry["string"])
    return ",".join(producers)
For your example, it returns the following: "David Heyman,David Barron,Tim Lewis"
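To try this without the original file, a stand-in structure can be built by hand. The `metadata` below is only a guess at the shape described in the answers (a list of dicts with a `string` key under `array[2]['dict']`), not the actual pastebin content; the producer names are the ones quoted above.

```python
def get_csv(element):
    # A single entry arrives as a dict, several entries as a list of dicts
    if isinstance(element, dict):
        return element["string"]
    return ", ".join(entry["string"] for entry in element)

# Hypothetical stand-in for the parsed JSON; only index 2 is filled in
metadata = {"plist": {"dict": {"array": [
    {},
    {},
    {"dict": [
        {"string": "David Heyman"},
        {"string": "David Barron"},
        {"string": "Tim Lewis"},
    ]},
]}}}

producers = get_csv(metadata["plist"]["dict"]["array"][2]["dict"])
single = get_csv({"string": "David Heyman"})

print("Producers:", producers)
```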

Lists and indexes gives error because list out of range

I get an error when I try to run my function.
I know the reason, but I am looking for a way to fix it.
list2=['name','position','salary','bonus']
list3=['name','position','salary']
def funtionNew(list):
    print(len(list))
    po= '{} {} {} {}'.format(list[0],list[1],list[2],list[3])
    print(po)
funtionNew(list3)
So that I get this for list2:
po='{}{}{}{}'.format(list[0],list[1],list[2],list[3])
and this for list3:
po='{}{}{}'.format(list[0],list[1],list[2])
From the function implementation, it seems you are trying to concatenate the list items with spaces in between, so you can try this instead:
po=' '.join(list)
This is independent of the list length; however, you have to make sure that all the items in the list are strings. To guarantee that, you can do the following:
po = ' '.join(str(s) for s in list)
Try the following:
def funtionNew(list):
    print(len(list))
    string_for_formatting = '{} ' * len(list)
    po = string_for_formatting.format(*list)
    print(po)
This creates a string with a variable number of {} terms according to the length of your list, and then uses format on the list. The asterisk * unpacks the elements of your list as inputs for the function.
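Note that `'{} ' * len(list)` leaves a trailing space at the end of the result. A possible variation, sketched below, joins the placeholders with single spaces instead; the function name `format_row` is made up, and the last line shows the plain `join` approach for comparison.

```python
def format_row(items):
    # One '{}' placeholder per item, separated by single spaces
    template = ' '.join(['{}'] * len(items))
    # *items unpacks the list into separate positional arguments
    return template.format(*items)

list2 = ['name', 'position', 'salary', 'bonus']
list3 = ['name', 'position', 'salary']

print(format_row(list2))  # name position salary bonus
print(format_row(list3))  # name position salary
print(' '.join(str(s) for s in list3))  # same result without format()
```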

Regular expressions matching words which contain the pattern but also the pattern plus something else

I have the following problem:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
I need to find which elements of list2 are in list1. In actual fact the elements of list1 correspond to a numerical value which I need to obtain then change. The problem is that 'xyz2' contains 'xyz' and therefore matches also with a regular expression.
My code so far (where 'data' is a python dictionary and 'specie_name_and_initial_values' is a list of lists where each sublist contains two elements, the first being specie name and the second being a numerical value that goes with it):
all_keys = list(data.keys())
for i in range(len(all_keys)):
    if all_keys[i]!='Time':
        #print all_keys[i]
        pattern = re.compile(all_keys[i])
        for j in range(len(specie_name_and_initial_values)):
            print re.findall(pattern,specie_name_and_initial_values[j][0])
Variations of the regular expression I have tried include:
pattern = re.compile('^'+all_keys[i]+'$')
pattern = re.compile('^'+all_keys[i])
pattern = re.compile(all_keys[i]+'$')
And I've also tried using 'in' as a qualifier (i.e. within a for loop)
Any help would be greatly appreciated. Thanks
Ciaran
----------EDIT------------
To clarify, my current code is below. It is used within a class/method-like structure.
def calculate_relative_data_based_on_initial_values(self,copasi_file,xlsx_data_file,data_type='fold_change',time='seconds'):
    copasi_tool = MineParamEstTools()
    data=pandas.io.excel.read_excel(xlsx_data_file,header=0)
    #uses custom class and method to get the list of lists from a file
    specie_name_and_initial_values = copasi_tool.get_copasi_initial_values(copasi_file)
    if time=='minutes':
        data['Time']=data['Time']*60
    elif time=='hour':
        data['Time']=data['Time']*3600
    elif time=='seconds':
        print 'Time is already in seconds.'
    else:
        print 'Not a valid time unit'
    all_keys = list(data.keys())
    species=[]
    for i in range(len(specie_name_and_initial_values)):
        species.append(specie_name_and_initial_values[i][0])
    for i in range(len(all_keys)):
        for j in range(len(specie_name_and_initial_values)):
            if all_keys[i] in species[j]:
                print all_keys[i]
The table returned from pandas is accessed like a dictionary. I need to go to my data table, extract the headers (i.e. the all_keys bit), then look up the name of the header in the specie_name_and_initial_values variable and obtain the corresponding value (the second element within the specie_name_and_initial_value variable). After this, I multiply all values of my data table by the value obtained for each of the matched elements.
I'm most likely over complicating this. Do you have a better solution?
thanks
----------edit 2 ---------------
Okay, below are my variables
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
species = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
You don't need a regex to find common elements, set.intersection will find all elements in list2 that are also in list1:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
print(set(list2).intersection(list1))
set(['xyz'])
Also, if you wanted to compare 'xyz' to 'xyz2', you would use == rather than in, and then it would correctly return False.
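The difference between substring matching and exact matching can be seen side by side in the following sketch. The variable names are made up; `re.fullmatch` together with `re.escape` is shown as one way to get an exact match if a regular expression is still wanted.

```python
import re

key = 'xyz'
candidates = ['xyz', 'xyz2', 'other_randoms']

# 'in' on strings is a substring test, so 'xyz2' matches as well
substring_hits = [c for c in candidates if key in c]

# '==' only accepts the exact string
exact_hits = [c for c in candidates if key == c]

# fullmatch requires the whole string to match; escape() guards special characters
regex_hits = [c for c in candidates if re.fullmatch(re.escape(key), c)]

print(substring_hits)  # ['xyz', 'xyz2']
print(exact_hits)      # ['xyz']
print(regex_hits)      # ['xyz']
```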
You can also rewrite your own code a lot more succinctly:
for key in data:
    if key != 'Time':
        pattern = re.compile(key)
        for name, _ in specie_name_and_initial_values:
            print re.findall(pattern, name)
Based on your edit, you have somehow managed to turn lists into strings. One option is to strip the []:
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
specie_name_and_initial_values = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
specie_name_and_initial_values = set(s.strip("[]") for s in specie_name_and_initial_values)
print(all_keys.intersection(specie_name_and_initial_values))
Which outputs:
set([u'Cyp26_G_R1', u'Cyp26_G_rep1'])
FYI, if you had lists inside the set you would have gotten an error, as lists are mutable and therefore not hashable.
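For the lookup step described in the question (going from a header name to its initial value), a dictionary built from the list of pairs avoids the nested loops. The sketch below assumes each sublist is `[name, value]` with the name wrapped in square brackets; the species names come from the question, but the numeric values are invented for illustration.

```python
# Hypothetical initial values; only the names come from the question
specie_name_and_initial_values = [
    ['[Cyp26_G_R1]', 2.0],
    ['[Cyp26_G_rep1]', 0.5],
    ['[Cyp26_G_R1R2]', 3.0],
]
all_keys = ['Cyp26_G_R1', 'Cyp26_G_rep1', 'Time']

# Map the bare species name (brackets stripped) to its initial value
initial_values = {name.strip('[]'): value
                  for name, value in specie_name_and_initial_values}

# Exact dictionary lookup: 'Cyp26_G_R1' does not pick up 'Cyp26_G_R1R2'
matched = {key: initial_values[key] for key in all_keys if key in initial_values}

print(matched)
```

Each matched value could then be used to scale the corresponding column of the data table.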
