Remove specific characters from list python - python

I am fairly new to Python. I have a list as follows:
sorted_x = [('pvg-cu2', 50.349189), ('hkg-pccw', 135.14921), ('syd-ipc', 163.441705), ('sjc-inap', 165.722676)]
I am trying to write a regex that removes everything after the '-' and before the ',', i.e., I need the same list to look like this:
[('pvg', 50.349189), ('hkg', 135.14921), ('syd', 163.441705), ('sjc', 165.722676)]
I have written a regex as follows:
import re
for i in range(len(sorted_x)):
    title_search = re.search('^\(\'(.*)-(.*)\', (.*)\)$', str(sorted_x[i]), re.IGNORECASE)
    if title_search:
        title = title_search.group(1)
        time = title_search.group(3)
But this requires me to create two new lists and I don't want to change my original list.
Can you please suggest a simple way so that I can modify my original list without creating a new list?

result = [(a.split('-', 1)[0], b) for a, b in sorted_x]
Example:
>>> sorted_x = [('pvg-cu2', 50.349189), ('hkg-pccw', 135.14921), ('syd-ipc', 163.441705), ('sjc-inap', 165.722676)]
>>> [(a.split('-', 1)[0], b) for a, b in sorted_x]
[('pvg', 50.349189000000003), ('hkg', 135.14921000000001), ('syd', 163.44170500000001), ('sjc', 165.72267600000001)]
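Note that the comprehension above builds a new list. If the original list object itself has to change, as the question asks, one option is slice assignment, which replaces the contents in place. A minimal sketch reusing the question's data (the name alias is only there to show that the same object is updated):

```python
sorted_x = [('pvg-cu2', 50.349189), ('hkg-pccw', 135.14921),
            ('syd-ipc', 163.441705), ('sjc-inap', 165.722676)]
alias = sorted_x  # second reference to the same list object

# Slice assignment replaces the contents of the existing list
# instead of binding the name to a new list.
sorted_x[:] = [(name.split('-', 1)[0], value) for name, value in sorted_x]

print(sorted_x)
print(alias is sorted_x)
```

The comprehension still creates a temporary list, but every other reference to sorted_x sees the updated contents afterwards.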

Related

python add static string to all items in set

I have the following set in Python (it's actually one set item):
product_set = {'Product, Product_Source_System, Product_Number'}
I want to add a static prefix (source.) to all the comma-separated values in the set, so I get the following output:
{'source.Product, source.Product_Source_System, source.Product_Number'}
I tried with a set comprehension, but it doesn't do the trick or I'm doing something wrong. It only prefixes the first value in the set.
{"source." + x for x in set}
I know sets are immutable. I don't need a new set, just output the new values.
Anyone that can help?
Thanks in advance
Edit: Splitting the initial long string into a list of short strings and then (only if required) making a set out of the list:
s1 = set('Product, Product_Source_System, Product_Number'.split(', '))
Constructing a new set:
s1 = {'Product', 'Product_Source_System', 'Product_Number'}
s2 = {"source." + x for x in s1}
Only printing the new strings:
for x in s1:
    print("source." + x)
Note: The desired result you show is a new set with updated comma-separated values. Further down you mentioned: "I don't need a new set, just output the new values". Which one is it? Below is an option that mimics your desired result:
import re
set = {'Product, Product_Source_System, Product_Number'}
set = {re.sub(r'^|(,\s*)', r'\1source.', list(set)[0])}
# set = {'source.'+ list(set)[0].replace(', ', ', source.')}
print(set)
Prints:
{'source.Product, source.Product_Source_System, source.Product_Number'}
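The same result can probably be reached without a regex by splitting each item on the separator, prefixing every field, and joining the pieces again. This is only a sketch: the helper name prefix_fields is made up, and it assumes the fields are always separated by exactly ", ".

```python
def prefix_fields(text, prefix="source.", sep=", "):
    """Prefix every sep-separated field in text (hypothetical helper)."""
    return sep.join(prefix + field for field in text.split(sep))

product_set = {'Product, Product_Source_System, Product_Number'}

# Build a new set; the strings inside the set are immutable,
# so each item is replaced by a new prefixed string.
prefixed = {prefix_fields(item) for item in product_set}
print(prefixed)
```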

Python - Concating numerous lists with only one variable assigned to all

How do you concat numerous lists altogether into one single list, when there's a collection of lists assigned to only one variable already?
Most advice online shows how to concatenate two or more variables, but I have a single variable that is assigned many lists in turn. I tried a nested for loop, but it produced duplicates and incoherent lists. I also tried the extend and append functions with no success. Maybe I should approach this with a DataFrame?
Any help is much appreciated. If you have questions, feel free to ask.
Actual Code:
from bs4 import BeautifulSoup as bs
import requests
import re
from time import sleep
from random import randint
def price():
    baseURL='https://www.apartmentlist.com/ca/redwood-city'
    r=requests.get(baseURL)
    soup=bs(r.content,'html.parser')
    block=soup.find_all('div',class_='css-1u6cvl9 e1k7pw6k0')
    sleep(randint(2,10))
    for properties in block:
        priceBlock=properties.find_all('div',class_="css-q23zey e131nafx0")
        price=[price.text for price in priceBlock]
        strPrice=''.join(price) #Change from list to string type
        removed=r'[$]' #Select and remove $ characters
        removed2=r'Ask' #Select and remove Ask
        removed3=r'[,]' #Select and remove comma
        modPrice=re.sub(removed,' ',strPrice) #Substitute $ for '_'
        modPrice2=re.sub(removed2,' 0',modPrice) #Substitute Ask for _0
        modPrice3=re.sub(removed3,'',modPrice2) #Eliminate space within price
        segments=modPrice3.split() #Change string with updates into list, remain clustered
        for inserts in segments:
            newPrice=[inserts] #Returns values from string to list by brackets.
            print(newPrice)
price()
Actual Output:
#After executing the program
['2157']
['2805']
['0']
['1875']
['2800']
['2265']
['2735']
['3985']
...
...
Attempt for:
['2157', '2805', '0', '2800',...] # all the while assigned to a single variable.
Again, any help is appreciated.
The issue in your code is that the "for inserts in segments" loop takes each price, puts it in its own list, and then prints that one-element list. Instead, add all prices to the same list and print it once after the loop.
In your case you can use a list comprehension like this to achieve what you want:
from bs4 import BeautifulSoup as bs
import requests
import re
from time import sleep
from random import randint
def price():
    baseURL='https://www.apartmentlist.com/ca/redwood-city'
    r=requests.get(baseURL)
    soup=bs(r.content,'html.parser')
    block=soup.find_all('div',class_='css-1u6cvl9 e1k7pw6k0')
    sleep(randint(2,10))
    result = []
    for properties in block:
        priceBlock=properties.find_all('div',class_="css-q23zey e131nafx0")
        price=[price.text for price in priceBlock]
        strPrice=''.join(price) #Change from list to string type
        removed=r'[$]' #Select and remove $ characters
        removed2=r'Ask' #Select and remove Ask
        removed3=r'[,]' #Select and remove comma
        modPrice=re.sub(removed,' ',strPrice) #Substitute $ for '_'
        modPrice2=re.sub(removed2,' 0',modPrice) #Substitute Ask for _0
        modPrice3=re.sub(removed3,'',modPrice2) #Eliminate space within price
        segments=modPrice3.split() #Change string with updates into list, remain clustered
        result += [insert for insert in segments]
    print(result)
price()
(Hopefully I understood the problem)
If each of the sublists is a variable, you can do one of the following to convert them into a single list:
a = ['2157']
b = ['2805']
c = ['0']
d = ['1875']
e = ['2800']
f = ['2265']
g = ['2735']
h = ['3985']
#Pythonic Way
test = [i[0] for i in [a, b, c, d, e, f, g, h]]
print(test)
#Detailed Way
check = []
for i in a,b,c,d,e,f,g,h:
    check.append(i[0])
print(check)
If your function creates lists, then you would just modify the for loops to reference your function:
#Pythonic Way
test = [i[0] for i in YOUR_FUNCTION()]
print(test)
#Detailed Way
check = []
for i in YOUR_FUNCTION():
    check.append(i[0])
print(check)
How do you concat numerous lists altogether into one single list, when there's a collection of lists assigned to only one variable already?
In Python, it's common to flatten a list of lists either with a list comprehension or itertools.chain.
from itertools import chain
prices = [
['2157'],
['2805'],
['0'],
['1875'],
['2800'],
['2265'],
['2735'],
['3985'],
]
# list comprehension
[x for row in prices for x in row]
>>> ['2157', '2805', '0', '1875', '2800', '2265', '2735', '3985']
# itertools.chain will return a generator like object
chain.from_iterable(prices)
>>> <itertools.chain at 0x7f01573076a0>
# if you want a list back call list
list(chain.from_iterable(prices))
>>> ['2157', '2805', '0', '1875', '2800', '2265', '2735', '3985']
For your code above the price function is only printing output and not returning an object. You could have the function create an empty list and add to the list each time you loop through the properties. Then return the list.
def price():
    # web scrape code
    new_price = []
    for properties in block:
        # processing code
        new_price += segments  # segments is already a flat list of strings
    return new_price  # already flat, so no chain.from_iterable is needed
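The cleaning steps from the question can be tried without any network access by moving them into a small function that returns its result and collecting everything in one list. This is a sketch only: clean_prices and the sample strings are made up for illustration and stand in for the scraped text.

```python
import re

def clean_prices(raw_text):
    """Turn text such as '$2,157$2,805Ask' into ['2157', '2805', '0']."""
    text = re.sub(r'[$]', ' ', raw_text)  # '$' becomes a separator
    text = re.sub(r'Ask', ' 0', text)     # 'Ask' becomes a price of 0
    text = re.sub(r'[,]', '', text)       # drop thousands separators
    return text.split()

# Made-up sample text standing in for two scraped property blocks
blocks = ['$2,157$2,805Ask', '$1,875']

all_prices = []
for block in blocks:
    all_prices.extend(clean_prices(block))  # grow one shared list

print(all_prices)
```

Returning the list instead of printing it lets the caller decide what to do with the prices.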

Python: Unify multiple lists into one

Could you help me with the following challenge I am currently facing:
I have multiple lists, each of which contains multiple strings. Each string has the following format:
"ID-Type" - where ID is a number and type is a common Python type. One such example can be found here:
["1-int", "2-double", "1-string", "5-list", "5-int"],
["3-string", "1-int", "1-double", "5-double", "5-string"]
Before calculating further, I now want to preprocess these lists to unify them in the following way:
Count how often each type is appearing in each list
Generate a new list, combining both results
Create a mapping from initial list to that new list
As an example
In the above lists, we have the following types:
List 1: 2 int, 1 double, 1 string, 1 list
List 2: 2 string, 2 double, 1 int
The resulting table should now contain:
2 int, 2 double, 2 string, 1 list (in order to be able to contain both lists), like this:
[
"int_1-int",
"int_2-int",
"double_1-double",
"double_2-double",
"string_1-string",
"string_2-string",
"list_1-list"
]
And lastly, in order to map input to output, the idea is to have a corresponding dictionary to map this transformation, e.g., for list_1:
{
"1-int": "int_1-int",
"2-double": "double_1-double",
"1-string": "string_1-string",
"5-list": "list_1-list",
"5-int": "int_2-int"
}
I want to avoid doing this with a nested loop and multiple iterations. Are there any libraries, or is there maybe a smart vectorized solution to address this challenge?
Just add them:
Example :
['it'] + ['was'] + ['annoying']
You should read the Python tutorial to learn basic info like this.
Just another method....
import itertools
ab = itertools.chain(['it'], ['was'], ['annoying'])
list(ab)
In general, this approach doesn't really make sense unless you specifically need to have the items in the resulting list and dict in this exact format. But here's how you can do it:
def process_type_list(type_list):
    mapping = dict()
    for i in type_list:
        i_type = i.split('-')[1]
        n_occur = 1
        map_val = f'{i_type}_{n_occur}-{i_type}'
        while map_val in mapping.values():
            n_occur += 1
            map_val = f'{i_type}_{n_occur}-{i_type}'
        mapping[i] = map_val
    return mapping
l1 = ["1-int", "2-double", "1-string", "5-list", "5-int"]
l2 = ["3-string", "1-int", "1-double", "5-double", "5-string"]
l1_mapping = process_type_list(l1)
l2_mapping = process_type_list(l2)
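The function above produces one mapping per list but not the combined list the question asks for. One possible way to get both is collections.Counter, whose union operator keeps the maximum count per key. A sketch under the assumption that every item has the form "ID-Type" and that items are unique within a list (the names type_of and unify are made up):

```python
from collections import Counter

def type_of(item):
    # "5-int" -> "int"
    return item.split('-', 1)[1]

def unify(*type_lists):
    """Return (unified list, one mapping per input list)."""
    # In-place Counter union keeps the largest count seen for each type
    needed = Counter()
    for type_list in type_lists:
        needed |= Counter(type_of(item) for item in type_list)
    unified = [f'{t}_{n}-{t}'
               for t, count in needed.items()
               for n in range(1, count + 1)]

    mappings = []
    for type_list in type_lists:
        seen = Counter()  # running count of each type within this list
        mapping = {}
        for item in type_list:
            t = type_of(item)
            seen[t] += 1
            mapping[item] = f'{t}_{seen[t]}-{t}'
        mappings.append(mapping)
    return unified, mappings

l1 = ["1-int", "2-double", "1-string", "5-list", "5-int"]
l2 = ["3-string", "1-int", "1-double", "5-double", "5-string"]
unified, (l1_map, l2_map) = unify(l1, l2)
print(unified)
print(l1_map)
```

Each list is traversed a fixed number of times, so there is no nested scan over previously assigned values.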
Additionally, Python does not have a double type. Python's float is implemented as a C double; use decimal.Decimal if you need fine control over the precision.
I am pretty sure that this is what you want to do:
To make a joint list:
['any item'] + ['any item 2']
If you want to turn the list into a dictionary:
dict(zip(['key 1', 'key 2'], ['value 1', 'value 2']))
Another method of joining 2 lists:
a = ['list item', 'another list item']
a.extend(['another list item', 'another list item'])

How to convert for results into a list-Python

I am trying to put my results into a list.
Here is my code:
from ise import ERS
l = ise.get_endpoints(groupID=my_group_id)['response']
Here is my output:
[('AA:BB:CD', 'cvr5667'), ('AA:BB:CC', '8888')]
Here is my desired output, which is a list of just the first elements inside the parentheses:
['AA:BB:CD','AA:BB:CC']
I am new to Python and to working with lists/dicts, so any suggestions would help. All I am trying to do is put the first elements inside the parentheses into one list, as I showed.
Using list comprehension (as suggested in comments too):
lst_1 = [('AA:BB:CD', 'cvr5667'), ('AA:BB:CC', '8888')]
lst_result = [i[0] for i in lst_1]
Something like this?
result = [('AA:BB:CD', 'cvr5667'), ('AA:BB:CC', '8888')]
first_elements_to_list = [tmp[0] for tmp in result]
print(first_elements_to_list)
Prints:
['AA:BB:CD', 'AA:BB:CC']
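Two variations on the same idea, in case they read more clearly: tuple unpacking inside the comprehension, and zip to transpose the pairs. This is a sketch with the question's sample data; the variable names are made up.

```python
endpoints = [('AA:BB:CD', 'cvr5667'), ('AA:BB:CC', '8888')]

# Unpack each pair and keep only the first element
macs = [mac for mac, _name in endpoints]

# zip(*pairs) transposes the pairs; the first row holds all first elements.
# The default () covers an empty input list.
macs_via_zip = list(next(zip(*endpoints), ()))

print(macs)
print(macs_via_zip)
```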

Break one line python code to multiple lines

How can I break this one-line code into multiple descriptive lines? I am unable to understand it as a single line.
data = formatted_data + "|" + '|'.join(["{}".format(a) for b, a in sorted(values.items()) if a and b not in ['SecureHash']])
Is this correct or not? Can anyone help me?
for b, a in sorted(values.items()):
    if a and b not in ['SecureHash']:
        c = ["{}".format(a)]
        data = formatted_data + "|" + "|".join(c)
This code is collecting a string representation of a, and then building another string with it.
You need to define an external list, to account for the list comprehension expression.
c = ["{}".format(a) for b, a in sorted(values.items()) if a and b not in ['SecureHash']]
Further, to break down how c is being assembled, you can expand the list comprehension:
c = []
for b, a in sorted(values.items()):
    if a and b not in ['SecureHash']:
        c.append('{}'.format(a))
Finally, just combine the three parts:
data = formatted_data + "|" + "|".join(c)
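To check that the expanded form matches the one-liner, both can be run on the same input. The values dict and formatted_data below are invented sample data, since the question does not show where they come from.

```python
formatted_data = "HEADER"  # hypothetical sample value
values = {"Amount": 100, "SecureHash": "abc123", "Currency": "", "OrderId": "A-7"}

# Original one-liner
one_liner = formatted_data + "|" + '|'.join(
    ["{}".format(a) for b, a in sorted(values.items())
     if a and b not in ['SecureHash']])

# Expanded equivalent with descriptive names
parts = []
for key, value in sorted(values.items()):      # items sorted by key
    if value and key not in ['SecureHash']:    # skip empty values and the hash
        parts.append("{}".format(value))       # convert the value to a string
expanded = formatted_data + "|" + "|".join(parts)

print(one_liner)
print(expanded)
```

Note that the condition is parsed as value and (key not in [...]), so empty values are dropped as well as the SecureHash entry.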
Well, generally you can see any opening bracket and the plus operators in the string as a "breaking point". Working with your example:
data = formatted_data
data += "|"
data += '|'.join(["{}".format(a) for b, a in sorted(values.items()) if a and b
not in ['SecureHash']])
OK so now we need to unpack what's happening in that join:
data = formatted_data
data += "|"
jointmp = ["{}".format(a) for b, a in sorted(values.items()) if a and b not in ['SecureHash']]
data += '|'.join(jointmp)
OK so we've got some string formatting and a bunch of list comprehensions:
data = formatted_data
data += "|"
jointmp = []
for b, a in sorted(values.items()):
    if a and b not in ['SecureHash']:
        jointmp += ["{}".format(a)] # Equivalent to str(a) ?
data += '|'.join(jointmp)
To do the last stage there was a lot of going back and forth as things were expanded. Those list comprehensions are quite terse...
There are some questions here though:
Where did values come from?
What's the "{}".format(a) for?
etc.
Your "expanded" code is not quite equivalent because you don't handle the case where there are no matches / values is empty and you are replacing data each time rather than growing it.
You might want to read up on list comprehensions.
Basically, it's making a list of things like this:
[item for item in iterable_thing]
so this one is making a list of strings ("{}".format(a)). I assume a is a hash, but let's pretend it's a number in a range:
["{}".format(a) for a in range(5)]
will make:
>>>['0', '1', '2', '3', '4']
Comprehensions can become quite complicated with the addition of if statements, and whoever wrote this code is one of those a in b in i in j kind of people, it seems, so their code is hard to follow. Good variable names are SO important.
