Python: add a static string to all items in a set

I have the following set in Python (it's actually a set with a single item):
product_set = {'Product, Product_Source_System, Product_Number'}
I want to add a static prefix (source.) to all the comma-separated values in the set, so I get the following output:
{'source.Product, source.Product_Source_System, source.Product_Number'}
I tried with a set comprehension, but it doesn't do the trick or I'm doing something wrong. It only prefixes the first value in the set.
{"source." + x for x in set}
I know sets are immutable. I don't need a new set, just output the new values.
Anyone that can help?
Thanks in advance

Edit: Splitting the initial long string into a list of short strings and then (only if required) making a set out of the list:
s1 = set('Product, Product_Source_System, Product_Number'.split(', '))
Constructing a new set:
s1 = {'Product', 'Product_Source_System', 'Product_Number'}
s2 = {"source." + x for x in s1}
Only printing the new strings:
for x in s1:
    print("source." + x)

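Note: The desired result you show is a new set with updated comma-separated values, but further down you say: "I don't need a new set, just output the new values". Which one is it? Below is an option that reproduces your desired result:
import re
# Named product_set rather than set, so the built-in set type is not shadowed
product_set = {'Product, Product_Source_System, Product_Number'}
# Insert 'source.' at the start of the string and after every comma
# (needs Python 3.5+, where an unmatched group in the replacement is treated as empty)
product_set = {re.sub(r'^|(,\s*)', r'\1source.', list(product_set)[0])}
# product_set = {'source.' + list(product_set)[0].replace(', ', ', source.')}
print(product_set)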
Prints:
{'source.Product, source.Product_Source_System, source.Product_Number'}
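If you would rather avoid the regex, here is a minimal sketch of the same idea using split and join. The helper name add_prefix is made up for illustration, and it assumes the fields are separated by commas and never contain a comma themselves:

```python
# Hypothetical helper: prefix every comma-separated field inside one string.
def add_prefix(fields, prefix="source."):
    # Split on commas, drop surrounding whitespace, prefix, then re-join.
    return ", ".join(prefix + name.strip() for name in fields.split(","))

product_set = {'Product, Product_Source_System, Product_Number'}

# Apply the helper to each item of the set (there is only one here).
prefixed = {add_prefix(item) for item in product_set}
print(prefixed)
```

This builds a new set and leaves product_set untouched, so you can print the result or keep it, whichever you need.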

Related

Removing duplicates based on a partial match Python

Say I have a list
var1 = ["VenueA/2003", "VenueA/2006", "VenueA/2009","VenueB/2009"]
What I want to do is remove all duplicate elements in the list based on the VenueX and keep the first occurrence
In the above example, there are three similar VenueA which are VenueA/2003, VenueA/2006 and VenueA/2009. As VenueA/2003 is the first occurrence, I want to keep that and remove the rest of VenueA
The result that I want is
var1 = ["VenueA/2003","VenueB/2009"]
How do I go about doing that?
You can build a map keyed by the prefix, with one of the strings that has this prefix as its value. In JavaScript, the Map constructor can be used for this.
As the Map constructor retains the last occurrence of the string for a given prefix, reverse both the input and the output to get the first match instead:
const arr = ["VenueA/2003", "VenueA/2006", "VenueA/2009","VenueB/2009"];
const map = new Map(arr.map(s => [s.split("/")[0], s]).reverse());
const result = [...map.values()].reverse();
console.log(result);
k = []
for x in var1:
    if x.startswith('VenueA') and len(k) == 0:
        k.append(x)
    if x.startswith('VenueB') and len(k) == 1:
        k.append(x)
print(k)
# output
['VenueA/2003', 'VenueB/2009']
Similarly, if you have more venues, you can add one check per venue, compare len(k) against the next count, and append the match to k.
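For an arbitrary number of venues, a Python sketch that does not hard-code the prefixes could look like this. The function name dedupe_by_prefix and the sep parameter are my own, and it relies on dicts keeping insertion order (Python 3.7+):

```python
def dedupe_by_prefix(items, sep="/"):
    """Keep the first item seen for each prefix (the text before sep)."""
    first_seen = {}
    for item in items:
        prefix = item.split(sep, 1)[0]
        # setdefault stores the item only if the prefix is not present yet,
        # so later duplicates of the same venue are ignored.
        first_seen.setdefault(prefix, item)
    return list(first_seen.values())

var1 = ["VenueA/2003", "VenueA/2006", "VenueA/2009", "VenueB/2009"]
result = dedupe_by_prefix(var1)
print(result)
```

Because only the first item per prefix is ever stored, no reversing is needed here.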

Get selected node names into a list or tuple in Nuke with Python

I am trying to obtain a list of the names of selected nodes with Python in Nuke.
I have tried:
for s in nuke.selectedNodes():
    n = s['name'].value()
    print n
This gives me the names of the selected nodes, but as separate strings.
There is nothing I can do to them that will combine each string. If I
have three Merges selected, in the Nuke script editor I get:
Result: Merge3
Merge2
Merge1
If I wrap the last variable n in brackets, I get:
Result: ['Merge3']
['Merge2']
['Merge1']
That's how I know they are separate strings. I found one other way to
return selected nodes. I used:
s = nuke.tcl("selected_nodes")
print s
I get odd names back like node3a7c000, but these names work in anything
that calls a node, like nuke.toNode() and they are all on one line. I
tried to force these results into a list or a tuple, like so:
s = nuke.tcl("selected_nodes")
print s
Result: node3a7c000 node3a7c400 node3a7c800
s = nuke.tcl("selected_nodes")
s2 = s.replace(" ","', '")
s3 = "(" + "'" + s2 + "'" + ")"
print s3
Result: ('node3a7c000', 'node3a7c400', 'node3a7c800')
My result looks to have the standard construct of a tuple, but if I try
to call the first value from the tuple, I get a parenthesis back. It
seems my created tuple is still a string.
Is there anything I can do to gather a list or tuple of selected nodes
names? I'm not sure what I am doing wrong and it seems that my last
solution should have worked.
As you iterate over each node, you'll want to add its name to a list ([]), and then return that. For instance:
names = []
for s in nuke.selectedNodes():
    n = s['name'].value()
    names.append(n)
print names
This will give you:
# Result: ['Merge3', 'Merge2', 'Merge1']
If you're familiar with list comprehensions, you can also use one to make names in one line:
names = [s['name'].value() for s in nuke.selectedNodes()]
nodename = list()
for node in nuke.selectedNodes():
    nodename.append(node.name())
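As for why the hand-built "tuple" in the question returned a parenthesis: concatenating quotes and brackets only produces a string that looks like a tuple. A small sketch, using the sample text from the question in place of a live nuke.tcl("selected_nodes") call so that it runs outside Nuke:

```python
# Sample text copied from the question; inside Nuke this would be the
# string returned by nuke.tcl("selected_nodes").
raw = "node3a7c000 node3a7c400 node3a7c800"

# What the question built: still a str, so index 0 is its first character.
fake_tuple = "('" + raw.replace(" ", "', '") + "')"
print(fake_tuple[0])   # prints: (

# Splitting on whitespace gives a real list, which tuple() can convert.
node_ids = raw.split()
as_tuple = tuple(node_ids)
print(as_tuple[0])     # prints: node3a7c000
```

Indexing a string gives you characters, while indexing the result of split() gives you whole names.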

Regular expressions matching words which contain the pattern but also the pattern plus something else

I have the following problem:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
I need to find which elements of list2 are in list1. In actual fact the elements of list1 correspond to a numerical value which I need to obtain then change. The problem is that 'xyz2' contains 'xyz' and therefore matches also with a regular expression.
My code so far (where 'data' is a python dictionary and 'specie_name_and_initial_values' is a list of lists where each sublist contains two elements, the first being specie name and the second being a numerical value that goes with it):
all_keys = list(data.keys())
for i in range(len(all_keys)):
    if all_keys[i]!='Time':
        #print all_keys[i]
        pattern = re.compile(all_keys[i])
        for j in range(len(specie_name_and_initial_values)):
            print re.findall(pattern,specie_name_and_initial_values[j][0])
Variations of the regular expression I have tried include:
pattern = re.compile('^'+all_keys[i]+'$')
pattern = re.compile('^'+all_keys[i])
pattern = re.compile(all_keys[i]+'$')
And I've also tried using 'in' as a qualifier (i.e. within a for loop)
Any help would be greatly appreciated. Thanks
Ciaran
----------EDIT------------
To clarify: my current code is below. It's used within a class/method-like structure.
def calculate_relative_data_based_on_initial_values(self,copasi_file,xlsx_data_file,data_type='fold_change',time='seconds'):
    copasi_tool = MineParamEstTools()
    data=pandas.io.excel.read_excel(xlsx_data_file,header=0)
    #uses custom class and method to get the list of lists from a file
    specie_name_and_initial_values = copasi_tool.get_copasi_initial_values(copasi_file)
    if time=='minutes':
        data['Time']=data['Time']*60
    elif time=='hour':
        data['Time']=data['Time']*3600
    elif time=='seconds':
        print 'Time is already in seconds.'
    else:
        print 'Not a valid time unit'
    all_keys = list(data.keys())
    species=[]
    for i in range(len(specie_name_and_initial_values)):
        species.append(specie_name_and_initial_values[i][0])
    for i in range(len(all_keys)):
        for j in range(len(specie_name_and_initial_values)):
            if all_keys[i] in species[j]:
                print all_keys[i]
The table returned from pandas is accessed like a dictionary. I need to go to my data table, extract the headers (i.e. the all_keys bit), then look up the name of the header in the specie_name_and_initial_values variable and obtain the corresponding value (the second element within the specie_name_and_initial_value variable). After this, I multiply all values of my data table by the value obtained for each of the matched elements.
I'm most likely over complicating this. Do you have a better solution?
thanks
----------edit 2 ---------------
Okay, below are my variables
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
species = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
You don't need a regex to find common elements, set.intersection will find all elements in list2 that are also in list1:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
print(set(list2).intersection(list1))
set(['xyz'])
Also, if you want to compare 'xyz' to 'xyz2', use == rather than in; the comparison then correctly returns False.
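To make the difference concrete, here is a short sketch with the lists from the question (the variable names are mine). If you do want to stay with a regex, re.fullmatch combined with re.escape is one way to require a whole-string match:

```python
import re

candidates = ['xyz', 'xyz2', 'other_randoms']
target = 'xyz'

# 'in' on strings is a substring test, so 'xyz2' matches as well.
substring_hits = [name for name in candidates if target in name]

# '==' only accepts the identical string.
exact_hits = [name for name in candidates if name == target]

# Regex variant: escape the key so it is taken literally, then require
# the pattern to cover the whole string.
pattern = re.compile(re.escape(target))
regex_hits = [name for name in candidates if pattern.fullmatch(name)]

print(substring_hits)
print(exact_hits)
print(regex_hits)
```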
You can also rewrite your own code a lot more succinctly:
for key in data:
    if key != 'Time':
        # compile the key itself as the pattern
        pattern = re.compile(key)
        for name, _ in specie_name_and_initial_values:
            print re.findall(pattern, name)
Based on your edit, you have somehow managed to turn lists into strings. One option is to strip the []:
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
specie_name_and_initial_values = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
specie_name_and_initial_values = set(s.strip("[]") for s in specie_name_and_initial_values)
print(all_keys.intersection(specie_name_and_initial_values))
Which outputs:
set([u'Cyp26_G_R1', u'Cyp26_G_rep1'])
FYI, if you had lists inside the set, you would have gotten an error, as lists are mutable and therefore not hashable.
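Since the end goal described in the question is to fetch the value that belongs to each matching header, a dictionary lookup may be simpler than either approach. This is only a sketch: the numeric values below are invented placeholders, and the [name, value] layout is assumed from the description in the question:

```python
# Hypothetical stand-in for the list of [name, value] pairs from the question.
specie_name_and_initial_values = [
    ['[Cyp26_G_R1]', 2.0],
    ['[Cyp26_G_rep1]', 0.5],
    ['[Cyp26_G_R1R2]', 10.0],
]
all_keys = ['Cyp26_G_R1', 'Cyp26_G_rep1', 'Time']

# Strip the surrounding brackets once and index the values by clean name.
initial_values = {name.strip('[]'): value
                  for name, value in specie_name_and_initial_values}

# Dictionary lookup is an exact match: 'Cyp26_G_R1' never hits 'Cyp26_G_R1R2'.
matched = {key: initial_values[key] for key in all_keys if key in initial_values}
print(matched)
```

From there, each data column can be multiplied by matched[key] for the keys that were found.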

wrong in creating string array in python?

all:
I want to create a string array and then pass it to a class in Python as follows:
from plottert import plotter
at[0]='./Re100/17/0.001/R/Vx-H'
at[1]='./Re100/33/0.001/R/Vx-H'
at[2]='./Re100/65/0.001/R/Vx-H'
b[0]='./U-0.001-H'
plotter (at,b)
But I get an error saying that name 'at' is not defined.
I know that at.append() will work. But what I really want is to assign the value to a SPECIFIC index of the array. Any help?
You could simply fill it with empty strings if you want
at = [''] * n #n = length of list
at[0]='./Re100/17/0.001/R/Vx-H'
at[1]=...
However as others have mentioned, you never initialized your list in the first place.
If you want to assign to indexes without having to know the final size of your data structure, use a dictionary instead:
at = {}
at[0] = 'zero'
at[4] = 'four' # look, it's sparse
As you can see, this also has the advantage (over append) that you can assign in any order.
If you want to convert this to an array later, you can do something like this:
at_arr = [at[i] if i in at else None
          for i in range(max(at.keys())+1)]
# at_arr now holds the array ['zero', None, None, None, 'four']
First, create the lists (Python has no arrays for non-basic types):
at = [''] * n # n = size of at
b = [''] * m # m = size of b
then execute your code.
You can't use lists you haven't defined.
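Note that defining an empty list is not enough on its own, because index assignment only works for positions that already exist. A small sketch of both cases, where the name paths and the length n = 3 are assumptions for illustration:

```python
# Assigning into an empty list fails: index 0 does not exist yet.
paths = []
try:
    paths[0] = './Re100/17/0.001/R/Vx-H'
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# Pre-size the list first; then any existing index can be set, in any order.
n = 3  # assumed final length
paths = [''] * n
paths[2] = './Re100/65/0.001/R/Vx-H'
paths[0] = './Re100/17/0.001/R/Vx-H'
print(paths)
```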

Python List to String Conversion

This seems like a simple task and I'm not sure if I've accomplished it already, or if I'm chasing my tail.
values = [value.replace('-','') for value in values] ## strips out hyphen (only 1)
print values ## outputs ['0160840020']
parcelID = str(values) ## convert to string
print parcelID ##outputs ['0160840020']
url = 'Detail.aspx?RE='+ parcelID ## outputs Detail.aspx?RE=['0160840020']
As you can see I'm trying to append the number attached to the end of the URL in order to change the page via a POST parameter. My question is how do I strip the [' prefix and '] suffix? I've already tried parcelID.strip("['") with no luck. Am I doing this correctly?
values is a list (of length 1), which is why it appears in brackets. If you want to get just the ID, do:
parcelID = values[0]
Instead of
parcelID = str(values)
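A short sketch of the difference, using the value printed in the question as sample data (the snake_case names are mine):

```python
values = ['0160840020']  # sample list, taken from the question's output

# str() on a list returns the list's printed form, brackets and quotes included.
as_text = str(values)
print(as_text)           # prints: ['0160840020']

# Indexing returns the element itself, which is already a plain string.
parcel_id = values[0]
url = 'Detail.aspx?RE=' + parcel_id
print(url)               # prints: Detail.aspx?RE=0160840020
```

This is also why strip("['") seemed to do nothing: strip returns a new string rather than changing parcelID in place, and that argument would only remove the leading [' anyway.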
Assuming you actually have a list of values when you run this (and not just one item), the following would solve your problem; it also works for a single item, as you have shown:
values = [value.replace('-','') for value in values] ## strips out hyphen (only 1)
# create a list of urls from the parcelIDs
urls = ['Detail.aspx?RE='+ str(parcelID) for parcelID in values]
# use each url one at a time
for url in urls:
    # do whatever you need to do with each URL; print is only a placeholder
    print(url)
