Creating Dictionary Value Based on Startswith


How can I separate lists into one dictionary based on whether each item starts with a letter or a number?
webscrape1= ['Owner1','Owner2', 'Owner3', '555 Address Street',]
webscrape2 = ['Owner1','555 Address Street',]
webscrape3 = ['Owner1','Owner2', 'Owner3', 'Owner4', 'Owner5', '555 Address Street',]
An AttributeError occurs if I try:
address = address[1:].startswith(('0', '1', '2', '3', '4', '5', '6', '7', '8', '9'))

This should give you the expected result:
d = {"Owner" : [], "Address" : []}
for el in webscrape:
    if el.startswith(('0', '1', '2', '3', '4', '5', '6', '7', '8', '9')):
        d["Address"].append(el)
    else:
        d["Owner"].append(el)
print(d)

Judging from your lists, the address is always the last element. So you can fetch it directly with webscrape[-1], and fetch all the owners by slicing from the beginning up to, but not including, the last element: webscrape[:-1].
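A minimal sketch of the prefix test wrapped in a function, assuming every address begins with a digit and no owner name does. The names `DIGITS` and `split_owners_and_address` are made up for illustration. Note that `startswith` is a string method, so calling it on a list slice (which `address[1:]` presumably was) raises an AttributeError.

```python
DIGITS = tuple("0123456789")

def split_owners_and_address(items):
    """Group scraped strings into owners and addresses by first character."""
    result = {"Owner": [], "Address": []}
    for item in items:
        # str.startswith accepts a tuple of prefixes to test against
        key = "Address" if item.startswith(DIGITS) else "Owner"
        result[key].append(item)
    return result

webscrape1 = ['Owner1', 'Owner2', 'Owner3', '555 Address Street']
grouped = split_owners_and_address(webscrape1)
print(grouped)
```

The same function can be called once per scraped list, so each page gets its own dictionary.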

Related

Sort list of strings numerically and filter duplicates?

Given a list of strings in the following format:
[
"464782,-100,4,3,1,100,0,0",
"465042,-166.666666666667,4,3,1,100,0,0",
"465825,-250.000000000001,4,3,1,100,0,0",
"466868,-166.666666666667,4,3,1,100,0,0",
"467390,-200.000000000001,4,3,1,100,0,0",
"469999,-100,4,3,1,100,0,0",
"470260,-166.666666666667,4,3,1,100,0,0",
"474173,-100,4,3,1,100,0,0",
"474434,-166.666666666667,4,3,1,100,0,0",
"481477,-100,4,3,1,100,0,1",
"531564,259.011439671919,4,3,1,60,1,0",
"24369,-333.333333333335,4,3,1,100,0,0",
"21082,410.958904109589,4,3,1,60,1,0",
"21082,-250,4,3,1,100,0,0",
"22725,-142.857142857143,4,3,1,100,0,0",
"23547,-166.666666666667,4,3,1,100,0,0",
"24369,-333.333333333335,4,3,1,100,0,0",
"27657,-200.000000000001,4,3,1,100,0,0",
"29301,-142.857142857143,4,3,1,100,0,0",
"30123,-166.666666666667,4,3,1,100,0,0",
"30945,-250,4,3,1,100,0,0",
"32588,-166.666666666667,4,3,1,100,0,0",
"34232,-250,4,3,1,100,0,0",
"35876,-142.857142857143,4,3,1,100,0,0",
"36698,-166.666666666667,4,3,1,100,0,0",
"37520,-250,4,3,1,100,0,0",
"42451,-142.857142857143,4,3,1,100,0,0",
"43273,-166.666666666667,4,3,1,100,0,0",
]
How can I sort the list based on the first number in each line with Python?
And then, once sorted, remove all duplicates, if there are any?
The sorting criterion for the list is the number before the first comma in each line, which is always an integer.
I tried using list.sort(); however, this sorts the items in lexical order, not numerically.

You could use a dictionary for this. The key will be number before the first comma and the value the entire string. Duplicates will be eliminated, but only the last occurrence of a particular number's string is stored.
l = ['464782,-100,4,3,1,100,0,0',
'465042,-166.666666666667,4,3,1,100,0,0',
'465825,-250.000000000001,4,3,1,100,0,0',
'466868,-166.666666666667,4,3,1,100,0,0',
'467390,-200.000000000001,4,3,1,100,0,0',
...]
d = {int(s.split(',')[0]) : s for s in l}
result = [d[key] for key in sorted(d.keys())]
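The dictionary keeps only one line per leading number, so two different lines that both start with 21082 collapse into one. If only exact duplicate lines should be dropped, a sketch along these lines may fit better. The helper name `sort_key` is hypothetical, and the full line is used as a tie-breaker only to make the order deterministic.

```python
def sort_key(line):
    """Sort by the integer before the first comma, then by the whole line."""
    return (int(line.split(",", 1)[0]), line)

lines = [
    "464782,-100,4,3,1,100,0,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
    "21082,410.958904109589,4,3,1,60,1,0",
    "21082,-250,4,3,1,100,0,0",
    "24369,-333.333333333335,4,3,1,100,0,0",
]

# set() drops exact duplicates; the key makes the comparison numeric
result = sorted(set(lines), key=sort_key)
for line in result:
    print(line)
```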

I would try one of these two methods:
def sort_list(lis):
    nums = [int(num) if num.isdigit() else float(num) for num in lis]
    nums = list(set(nums))
    nums.sort()
    return [str(i) for i in nums] # I assumed you wanted them to be strings.
The first raises a ValueError if any item in lis is not a string representation of a number. The second method splits the integers from the floats before converting, but it's a bit wonkier and fails the same way on non-numeric strings.
def sort_list(lis):
    ints = [int(num) for num in lis if num.isdigit()]
    floats = [float(num) for num in lis if not num.isdigit()]
    nums = ints.copy()
    nums.extend(floats)
    nums = list(set(nums))
    nums.sort()
    return [str(i) for i in nums] # I assumed you wanted them to be strings.
Hope this helps.

You can try this.
First we remove the duplicates from the list using set():
removed_duplicates_list = list(set(listr))
Then we convert the list of strings into a list of tuples:
list_of_tuples = [tuple(i.split(",")) for i in removed_duplicates_list]
Then we sort it with sort():
list_of_tuples.sort()
The complete code sample is below:
listr = [
"464782,-100,4,3,1,100,0,0",
"465042,-166.666666666667,4,3,1,100,0,0",
"465825,-250.000000000001,4,3,1,100,0,0",
"466868,-166.666666666667,4,3,1,100,0,0",
"467390,-200.000000000001,4,3,1,100,0,0",
"469999,-100,4,3,1,100,0,0",
"470260,-166.666666666667,4,3,1,100,0,0",
"474173,-100,4,3,1,100,0,0",
"474434,-166.666666666667,4,3,1,100,0,0",
"481477,-100,4,3,1,100,0,1",
"531564,259.011439671919,4,3,1,60,1,0",
"24369,-333.333333333335,4,3,1,100,0,0",
"21082,410.958904109589,4,3,1,60,1,0",
"21082,-250,4,3,1,100,0,0",
"22725,-142.857142857143,4,3,1,100,0,0",
"23547,-166.666666666667,4,3,1,100,0,0",
"24369,-333.333333333335,4,3,1,100,0,0",
"27657,-200.000000000001,4,3,1,100,0,0",
"29301,-142.857142857143,4,3,1,100,0,0",
"30123,-166.666666666667,4,3,1,100,0,0",
"30945,-250,4,3,1,100,0,0",
"32588,-166.666666666667,4,3,1,100,0,0",
"34232,-250,4,3,1,100,0,0",
"35876,-142.857142857143,4,3,1,100,0,0",
"36698,-166.666666666667,4,3,1,100,0,0",
"37520,-250,4,3,1,100,0,0",
"42451,-142.857142857143,4,3,1,100,0,0",
"43273,-166.666666666667,4,3,1,100,0,0",
]
removed_duplicates_list = list(set(listr))
list_of_tuples = [tuple(i.split(",")) for i in removed_duplicates_list]
list_of_tuples.sort()
print(list_of_tuples) # the output is a list of tuples
OUTPUT:
[('21082', '-250', '4', '3', '1', '100', '0', '0'),
('21082', '410.958904109589', '4', '3', '1', '60', '1', '0'),
('22725', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
('23547', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('24369', '-333.333333333335', '4', '3', '1', '100', '0', '0'),
('27657', '-200.000000000001', '4', '3', '1', '100', '0', '0'),
('29301', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
('30123', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('30945', '-250', '4', '3', '1', '100', '0', '0'),
('32588', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('34232', '-250', '4', '3', '1', '100', '0', '0'),
('35876', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
('36698', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('37520', '-250', '4', '3', '1', '100', '0', '0'),
('42451', '-142.857142857143', '4', '3', '1', '100', '0', '0'),
('43273', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('464782', '-100', '4', '3', '1', '100', '0', '0'),
('465042', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('465825', '-250.000000000001', '4', '3', '1', '100', '0', '0'),
('466868', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('467390', '-200.000000000001', '4', '3', '1', '100', '0', '0'),
('469999', '-100', '4', '3', '1', '100', '0', '0'),
('470260', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('474173', '-100', '4', '3', '1', '100', '0', '0'),
('474434', '-166.666666666667', '4', '3', '1', '100', '0', '0'),
('481477', '-100', '4', '3', '1', '100', '0', '1'),
('531564', '259.011439671919', '4', '3', '1', '60', '1', '0')]
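Because every field is still a string, sorting these tuples compares them character by character. That happens to match numeric order for this data, but it would put '10' before '9'. A hedged sketch that converts the fields to numbers before sorting, assuming the first field is always an integer and the rest parse as floats (`parse_line` is a hypothetical helper, and the sample lines are invented):

```python
def parse_line(line):
    """Turn 'id,value,...' into a tuple: int id followed by float fields."""
    head, *rest = line.split(",")
    return (int(head), *(float(field) for field in rest))

lines = ["9,1.5,0", "10,-2,1", "9,1.5,0", "100,0,0"]

# String tuples sort lexically, numeric tuples sort by value
as_strings = sorted(set(tuple(s.split(",")) for s in lines))
as_numbers = sorted(set(parse_line(s) for s in lines))

print(as_strings[0][0])
print(as_numbers[0][0])
```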

I hope this helps too.
I placed all your list elements in a separate file named lista.txt.
In this example I read your list from that file. I like to stay organized and keep the data in a separate file, though you could keep it in the Python script as well. The idea is to go through the elements one by one (with a while or for loop) and add each one to a temporary list only if it is not already there, skipping it otherwise. After that you can simply use .sort(), which does the trick for this data.
# Global variables
file = "lista.txt"
tempList = []
# Logic: get items from file
def GetListFromFile(fileName):
    # Local variables
    showDoneMsg = True
    # Try to run this code
    try:
        # Open file and try to read it
        with open(fileName, mode="r") as f:
            # Read the first line
            line = f.readline()
            # For every line in the file
            while line:
                # Strip trailing whitespace (\n, \r)
                item = line.rstrip()
                # Check that this item is not already in the list
                if item not in tempList:
                    # Append the item to the temporary list
                    tempList.append(item)
                # Report items that already exist
                else:
                    print("Dublicate >>", item)
                # Go to the next line
                line = f.readline()
            # This is optional because the with block closes the file
            # automatically, but I like to be sure
            f.close()
    # Exceptions
    except FileNotFoundError:
        print("ERROR >> File do not exist!")
        showDoneMsg = False
    # Sort the list
    tempList.sort()
    # Report when done, if the file exists
    if showDoneMsg == True:
        print("\n>>> DONE <<<\n")
# Logic: show list items
def ShowListItems(thisList):
    if len(thisList) == 0:
        print("Temporary list is empty...")
    else:
        print("This is new items list:")
        for i in thisList:
            print(i)
# Execute function
GetListFromFile(file)
# Check that the items were sorted
ShowListItems(tempList)
Output:
========================= RESTART: D:\Python\StackOverflow\help.py =========================
Dublicate >> 43273,-166.666666666667,4,3,1,100,0,0
>>> DONE <<<
21082,-250,4,3,1,100,0,0
21082,410.958904109589,4,3,1,60,1,0
22725,-142.857142857143,4,3,1,100,0,0
...
474434,-166.666666666667,4,3,1,100,0,0
481477,-100,4,3,1,100,0,1
531564,259.011439671919,4,3,1,60,1,0
>>>

Python - List of unique sequences

I have a dictionary whose values are lists, each holding a sequence:
a = {'seq1':['5', '4', '3', '2', '1', '6', '7', '8', '9'],
'seq2':['9', '8', '7', '6', '5', '4', '3', '2', '1'],
'seq3':['5', '4', '3', '2', '1', '11', '12', '13', '14'],
'seq4':['15', '16', '17'],
'seq5':['18', '19', '20', '21', '22', '23'],
'seq6':['18', '19', '20', '24', '25', '26']}
So there are 6 sequences.
What I need to do is:
Find only the unique lists. If two lists contain the same elements (regardless of their order), they are not unique; here I need to get rid of the second list (the first unique list found stays).
In the unique lists, find the unique subsequences of elements and print them.
The bounds of the unique subsequences are determined by where the order of elements stops matching: in the 1st and the 3rd lists the shared run ends exactly after element '1', so we get the subsequence ['5','4','3','2','1'].
As the result I would like to see the elements in exactly the same order as they were in the beginning (if that is possible at all). So I expect this:
[['5', '4', '3', '2', '1'], ['6', '7', '8', '9'], ['11', '12', '13', '14'], ['15', '16', '17'], ['18', '19', '20'], ['21', '22', '23'], ['24', '25', '26']]
Tried to do it this way:
import itertools
unique_sets = []
a = {'seq1':["5","4","3","2","1","6","7","8","9"], 'seq2':["9","8","7","6","5","4","3","2","1"], 'seq3':["5","4","3","2","1","11","12","13","14"], 'seq4':["15","16","17"], 'seq5':["18","19","20","21","22","23"], 'seq6':["18","19","20","24","25","26"]}
b = []
for seq in a.values():
    b.append(seq)
for seq1, seq2 in itertools.combinations(b,2): #searching for intersections
    if set(seq1).intersection(set(seq2)) not in unique_sets:
        #if set(seq1).intersection(set(seq2)) == set(seq1):
            #continue
        unique_sets.append(set(seq1).intersection(set(seq2)))
    if set(seq1).difference(set(seq2)) not in unique_sets:
        unique_sets.append(set(seq1).difference(set(seq2)))
for it in unique_sets:
    print(it)
I got this which is a little bit different from my expectations:
{'9', '5', '2', '3', '7', '1', '4', '8', '6'}
set()
{'5', '2', '3', '1', '4'}
{'9', '8', '6', '7'}
{'5', '2', '14', '3', '1', '11', '12', '4', '13'}
{'17', '16', '15'}
{'19', '20', '18'}
{'23', '21', '22'}
If I uncomment those lines in the code above, the result is even worse.
Plus, I have a problem with the unordered elements in the sets I get as the result. I tried this with two separate lists:
seq1 = set([1,2,3,4,5,6,7,8,9])
seq2 = set([1,2,3,4,5,10,11,12])
and it worked fine: the elements never changed their position in the sets. Where is my mistake?
Thanks.
Updated: OK, now I have a slightly more complicated task, where the suggested algorithm won't work.
I have this dictionary:
precond = {
'seq1': ["1","2"],
'seq2': ["3","4","2"],
'seq3': ["5","4","2"],
'seq4': ["6","7","4","2"],
'seq5': ["6","4","7","2"],
'seq6': ["6","1","8","9","10"],
'seq7': ["6","1","8","11","9","12","13","14"],
'seq8': ["6","1","8","11","4","15","13"],
'seq9': ["6","1","8","16","9","11","4","17","18","2"],
'seq10': ["6","1","8","19","9","4","16","2"],
}
I expect these sequences, containing at least 2 elements:
[1, 2],
[4, 2],
[6, 7],
[6, 4, 7, 2],
[6, 1, 8],
[9, 10],
[6, 1, 8, 11],
[9, 12, 13, 14],
[4, 15, 13],
[16, 9, 11, 4, 17, 18, 2],
[19, 9, 4, 16, 2]
Right now I wrote this code:
precond = {
'seq1': ["1","2"],
'seq2': ["3","4","2"],
'seq3': ["5","4","2"],
'seq4': ["6","7","4","2"],
'seq5': ["6","4","7","2"],
'seq6': ["6","1","8","9","10"],
'seq7': ["6","1","8","11","9","12","13","14"],
'seq8': ["6","1","8","11","4","15","13"],
'seq9': ["6","1","8","16","9","11","4","17","18","2"],
'seq10': ["6","1","8","19","9","4","16","2"],
}
seq_list = []
result_seq = []
#d = []
for seq in precond.values():
    seq_list.append(seq)
#print(b)
contseq_ind = 0
control_seq = seq_list[contseq_ind]
mainseq_ind = 1
el_ind = 0
#index2 = 0
def compar():
    if control_seq[contseq_ind] != seq_list[mainseq_ind][el_ind]:
        mainseq_ind += 1
        compar()
    else:
        result_seq.append(control_seq[contseq_ind])
        contseq_ind += 1
        el_ind += 1
        if contseq_ind > len(control_seq):
            control_seq = seq_list[contseq_ind + 1]
            compar()
        else:
            compar()
compar()
This code is not complete anyway: so far it only looks for matching elements from the beginning, so I still need to write code that searches for a common sequence at the end of the two compared lists.
Right now I have a problem with the recursion. Immediately after the first recursive call I get this error:
if control_seq[contseq_ind] != b[mainseq_ind][el_ind]:
UnboundLocalError: local variable 'control_seq' referenced before assignment
How can I fix this? Or maybe you have a better idea than using recursion? Thank you in advance.

Not sure if this is what you wanted, but it gets the same result:
from collections import OrderedDict
a = {'seq1':["5","4","3","2","1","6","7","8","9"],
'seq2':["9","8","7","6","5","4","3","2","1"],
'seq3':["5","4","3","2","1","11","12","13","14"],
'seq4':["15","16","17"],
'seq5':["18","19","20","21","22","23"],
'seq6':["18","19","20","24","25","26"]}
level = 0
counts = OrderedDict()
# go through each value in the list of values to count the number
# of times it is used and indicate which list it belongs to
for elements in a.values():
    for element in elements:
        if element in counts:
            # separate names so the dictionary a is not overwritten
            lvl, n = counts[element]
            counts[element] = lvl, n+1
        else:
            counts[element] = (level,1)
    level+=1
last = 0
result = []
# now break up the dictionary of unique values into lists according
# to the count of each value and the level that they existed in
for k,v in counts.items():
    if v == last:
        result[-1].append(k)
    else:
        result.append([k])
    last = v
print(result)
Result:
[['5', '4', '3', '2', '1'],
['6', '7', '8', '9'],
['11', '12', '13', '14'],
['15', '16', '17'],
['18', '19', '20'],
['21', '22', '23'],
['24', '25', '26']]
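On the UnboundLocalError from the question: assigning to a name anywhere inside a function body (including with `+=`) makes that name local to the whole function, so reading it before the assignment fails. The sketch below uses made-up names (`counter`, `broken`, `with_global`, `common_prefix`) to show the failure, the `global` fix, and a loop-based alternative to recursion for finding a shared leading run. It is an illustration, not a full solution to the updated task.

```python
counter = 0

def broken():
    # `counter += 1` is an assignment, so `counter` is local here and
    # reading it before it has a value raises UnboundLocalError
    counter += 1

def with_global():
    global counter  # rebind the module-level name instead
    counter += 1

def common_prefix(first, second):
    """Iterative alternative: state lives in locals, no recursion needed."""
    prefix = []
    for x, y in zip(first, second):
        if x != y:
            break
        prefix.append(x)
    return prefix

try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

with_global()
prefix = common_prefix(["6", "1", "8", "9", "10"], ["6", "1", "8", "11", "9"])
print(error_name, counter, prefix)
```

Passing the indices as function arguments, or using a loop as in `common_prefix`, avoids module-level state altogether.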

Remove duplicate items from list

I tried following this post, but it doesn't seem to be working for me.
I tried this code:
for bresult in response.css(LIST_SELECTOR):
    NAME_SELECTOR = 'h2 a ::attr(href)'
    yield {
        'name': bresult.css(NAME_SELECTOR).extract_first(),
    }
    b_result_list.append(bresult.css(NAME_SELECTOR).extract_first())
#set b_result_list to SET to remove dups, then change back to LIST
set(b_result_list)
list(set(b_result_list))
for brl in b_result_list:
    print("brl: {}".format(brl))
This prints out:
brl: https://facebook.site.com/users/login
brl: https://facebook.site.com/users
brl: https://facebook.site.com/users/login
When I just need:
brl: https://facebook.site.com/users/login
brl: https://facebook.site.com/users
What am I doing wrong here?
Thank you!

You are discarding the result when you need to save it. b_result_list never actually changes, so you are just iterating over the original list. Instead, save the result of the set operation:
b_result_list = list(set(b_result_list))
(Note that sets do not preserve order.)

If you want to keep the original order while removing duplicates, you can do:
>>> li
['1', '1', '2', '2', '3', '3', '3', '3', '1', '1', '4', '5', '4', '6', '6']
>>> seen=set()
>>> [e for e in li if not (e in seen or seen.add(e))]
['1', '2', '3', '4', '5', '6']
Or, you can use the keys of an OrderedDict:
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(li))
['1', '2', '3', '4', '5', '6']
But a set alone may substantially change the order of the original list:
>>> list(set(li))
['1', '3', '2', '5', '4', '6']
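On Python 3.7 and later, plain dictionaries preserve insertion order as part of the language, so the same idea works without OrderedDict. A short sketch, assuming the items are hashable:

```python
li = ['1', '1', '2', '2', '3', '3', '3', '3', '1', '1', '4', '5', '4', '6', '6']

# dict keys are unique and keep first-seen order
unique = list(dict.fromkeys(li))
print(unique)
```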

Python 3: how to create list out of float numbers?

Does anyone know how I can solve this issue?
I have the following code.
result=[]
for i in range(len(response_i['objcontent'][0]['rowvalues'])):
    lat = response_i['objcontent'][0]['rowvalues'][i][0]
    print(lat)
    for i in lat:
        result.append(i)
print (result)
Following is the output of print(lat):
92.213725
191.586143
228.981615
240.353291
and following is the output of print(result):
['9', '2', '.', '2', '1', '3', '7', '2', '5', '1', '9', '1', '.', '5', '8',
'6', '1', '4', '3', '2', '2', '8', '.', '9', '8', '1', '6', '1', '5', '2',
'4', '0', '.', '3', '5', '3', '2', '9', '1']
I expected to get the output in following format:
[92.213725, 191.586143, 228.981615, 240.353291]
Does anyone know how to fix this issue?
Thanks

So, your error is that instead of simply adding your latitude to the list, you are iterating over each character of the latitude, as a string, and adding that character to the list.
result=[]
for value in response_i['objcontent'][0]['rowvalues']:
    lat = value[0]
    print(lat)
    result.append(float(lat))
print (result)
Besides that, using range(len(...)) is the way things have to be done in many other languages, because they either don't implement a "for each" construct or do it in an incomplete or faulty way.
In Python, it has been a given from the beginning that a for loop iterates over the items of a sequence, not its indices (to be used for retrieving the items afterwards). Some auxiliary built-ins come into play so that you can iterate over the sequence directly: zip to combine two or more sequences, and enumerate to yield the indices as well if you need them.
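A small illustration of the two built-ins mentioned above. The sample values echo the question's output, while the `names` list is invented for the example:

```python
lats = ["92.213725", "191.586143", "228.981615"]
names = ["a", "b", "c"]  # hypothetical labels, one per latitude

# enumerate yields (index, item) pairs when the index is really needed
indexed = [(i, float(lat)) for i, lat in enumerate(lats)]

# zip walks several sequences in step
paired = [(name, float(lat)) for name, lat in zip(names, lats)]

print(indexed)
print(paired)
```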

How could I refresh a list once an item has been removed from a list within a list in Python

This is quite complicated, but I would like to be able to refresh a larger list once an item has been taken out of a mini list within the bigger list.
listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
I used the code below to split it into lists of three:
split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
print(split)
# [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
split = [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
If I delete '3' from the first list, split will now be:
del split[0][-1]
split = [['1','2'],['4','5','6'],['6','8','9'],['5','3','7']]
After '3' has been deleted, I would like to be able to refresh the list so that it looks like:
split = [['1','2','4'],['5','6','6'],['8','9','5'],['3','7']]
Thanks in advance.

Not sure how big this list is getting, but you would need to flatten it and recalculate it:
>>> listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
>>> split
[['1', '2', '3'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> del split[0][-1]
>>> split
[['1', '2'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> listA = sum(split, []) # <- flatten split list back to 1 level
>>> listA
['1', '2', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
>>> split
[['1', '2', '4'], ['5', '6', '6'], ['8', '9', '5'], ['3', '7']]

Just recreate the single list from your nested lists, then re-split.
You can join the lists, assuming they are only one level deep, with something like:
rejoined = [element for sublist in split for element in sublist]
There are no doubt fancier ways, or one-liners that use itertools or some other library, but don't overthink it. If you're only talking about a few hundred or even a few thousand items, this solution is quite good enough.
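The flatten-then-split step can be sketched as two small helpers (`chunk` and `rechunk` are hypothetical names). One detail worth noting: the range in the question stops at len(listA) - 1, which would drop the last item whenever the length is one more than a multiple of three, so this sketch uses the full length.

```python
def chunk(items, size=3):
    """Split a flat list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def rechunk(groups, size=3):
    """Flatten one level of nesting, then split again."""
    flat = [element for group in groups for element in group]
    return chunk(flat, size)

split = chunk(['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7'])
del split[0][-1]        # remove '3' from the first group
split = rechunk(split)  # refresh the grouping
print(split)
```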

I need this for turning over cards in the deck in a solitaire game.
You can deal your cards using itertools.groupby() with a good key function:
import itertools
def group_key(x, n=3, flag=[0], counter=itertools.count(0)):
    if next(counter) % n == 0:
        flag[0] = flag[0] ^ 1
    return flag[0]
^ is the bitwise XOR operator; here it flips the flag between 0 and 1. The flag is stored in a list, a mutable default argument, so that its value persists between calls.
Example:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
...     print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '8', '9']
['5', '3', '7']
Now let's say you've used cards '9' and '8', so your new deck looks like:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
...     print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '5', '3']
['7']
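Because the key function above keeps its counter in a default argument, the count carries over from one call to the next; after a deck whose length is not a multiple of n, the next grouping would start mid-pile. A stateless variant, sketched here with the hypothetical name `deal`, derives the pile number from each card's position instead:

```python
import itertools

def deal(deck, n=3):
    """Group cards into piles of n using each card's position as the key."""
    piles = itertools.groupby(enumerate(deck), key=lambda pair: pair[0] // n)
    # each pile is consumed before groupby advances to the next one
    return [[card for _, card in pile] for _, pile in piles]

deck = ['1', '2', '3', '4', '5', '6', '6', '5', '3', '7']
piles = deal(deck)
print(piles)
```

Calling `deal` again on the same or a different deck gives the same grouping every time, since nothing is remembered between calls.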

Build an object that contains a list and tracks when the list is altered (probably by controlling writes to it), then have the object do its own split every time the data is altered and save the split list to a member of the object.
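One way such an object might look, as a rough sketch: the class name `ChunkedList`, its `chunks` attribute and the `remove_at` method are all invented for illustration, and only removal is covered.

```python
class ChunkedList:
    """Hypothetical wrapper: owns a flat list and re-splits after each change."""

    def __init__(self, items, size=3):
        self._items = list(items)
        self._size = size
        self._resplit()

    def _resplit(self):
        # Rebuild the grouped view from the flat list
        n = self._size
        self.chunks = [self._items[i:i + n] for i in range(0, len(self._items), n)]

    def remove_at(self, chunk_index, position):
        """Remove one item addressed by chunk and position, then refresh."""
        removed = self._items.pop(chunk_index * self._size + position)
        self._resplit()
        return removed

cards = ChunkedList(['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7'])
removed = cards.remove_at(0, 2)  # third card of the first group
print(removed, cards.chunks)
```

Keeping the flat list as the single source of truth means the grouped view can never drift out of step with the data.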
