park = "a park.shp"
road = "the roads.shp"
school = "a school.shp"
train = "the train"
bus = "the bus.shp"
mall = "a mall"
ferry = "the ferry"
viaduct = "a viaduct"
dataList = [park, road, school, train, bus, mall, ferry, viaduct]
print dataList
for a in dataList:
    print a
    #if a.endswith(".shp"):
    #    dataList.remove(a)
print dataList
gives the following output (so the loop is working and reading everything correctly):
['a park.shp', 'the roads.shp', 'a school.shp', 'the train', 'the bus.shp', 'a mall', 'the ferry', 'a viaduct']
a park.shp
the roads.shp
a school.shp
the train
the bus.shp
a mall
the ferry
a viaduct
['a park.shp', 'the roads.shp', 'a school.shp', 'the train', 'the bus.shp', 'a mall', 'the ferry', 'a viaduct']
but when I remove the # marks to run the if statement, where it should remove the strings ending in .shp, the string road remains in the list?
['a park.shp', 'the roads.shp', 'a school.shp', 'the train', 'the bus.shp', 'a mall', 'the ferry', 'a viaduct']
a park.shp
a school.shp
the bus.shp
the ferry
a viaduct
['the roads.shp', 'the train', 'a mall', 'the ferry', 'a viaduct']
Something else I noticed: it doesn't print all the strings, even though the for loop should go through each one. Can someone please explain what's going wrong? Why does the loop keep the string road but find and correctly remove the other strings ending with .shp?
Thanks,
C
(FYI, this is on Python 2.6.6, because of Arc 10.0)
You are mutating the list while iterating over it, which causes the loop's internal index to skip elements.
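To make the skipping concrete, here is a small sketch (Python 3 syntax, with a hypothetical `visited` list added purely for illustration) that records which items the loop actually sees while the list shrinks underneath it. Each removal shifts the remaining items one position to the left, so the item that slides into the freed slot is never visited.

```python
data = ['a park.shp', 'the roads.shp', 'a school.shp', 'the train',
        'the bus.shp', 'a mall', 'the ferry', 'a viaduct']
visited = []  # hypothetical helper: records what the loop actually sees

for item in data:
    visited.append(item)
    if item.endswith('.shp'):
        data.remove(item)  # shifts every later item one slot to the left

print(visited)  # ['a park.shp', 'a school.shp', 'the bus.shp', 'the ferry', 'a viaduct']
print(data)     # ['the roads.shp', 'the train', 'a mall', 'the ferry', 'a viaduct']
```

This matches the output in the question: 'the roads.shp' moves into index 0 right after 'a park.shp' is removed, and the loop has already moved on to index 1.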
Use a list comprehension like this:
[d for d in dataList if not d.endswith('.shp')]
and then get:
>>> ['the train', 'a mall', 'the ferry', 'a viaduct']
Removing items from the same list you're iterating over almost always causes problems. Make a copy of the original list and iterate over that instead; that way you don't skip anything.
for a in dataList[:]:  # Iterate over a copy of the list
    print a
    if a.endswith(".shp"):
        dataList.remove(a)  # Remove items from the original, not the copy
Of course, if this loop has no purpose other than creating a list with no .shp files, you can just use one list comprehension and skip the whole mess.
no_shp_files = [a for a in dataList if not a.endswith('.shp')]
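One caveat with the comprehension: it builds a new list and leaves the original untouched. If other code holds a reference to the original list and should see the filtered result, slice assignment is one option. A minimal sketch, where `alias` is a hypothetical second reference used only for illustration:

```python
dataList = ['a park.shp', 'the roads.shp', 'the train', 'a mall']
alias = dataList  # hypothetical second reference to the same list object

# Replace the contents in place. The comprehension is fully built
# before the assignment happens, so nothing is skipped.
dataList[:] = [a for a in dataList if not a.endswith('.shp')]

print(dataList)           # ['the train', 'a mall']
print(alias is dataList)  # True
```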
I am following this semantic clustering example:
!pip install sentence_transformers
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
embedder = SentenceTransformer('all-MiniLM-L6-v2')
# Corpus with example sentences
corpus = ['A man is eating food.',
          'A man is eating a piece of bread.',
          'A man is eating pasta.',
          'The girl is carrying a baby.',
          'The baby is carried by the woman',
          'A man is riding a horse.',
          'A man is riding a white horse on an enclosed ground.',
          'A monkey is playing drums.',
          'Someone in a gorilla costume is playing a set of drums.',
          'A cheetah is running behind its prey.',
          'A cheetah chases prey on across a field.'
          ]
corpus_embeddings = embedder.encode(corpus)
# Perform kmean clustering
num_clusters = 5
clustering_model = KMeans(n_clusters=num_clusters)
clustering_model.fit(corpus_embeddings)
cluster_assignment = clustering_model.labels_
clustered_sentences = [[] for i in range(num_clusters)]
for sentence_id, cluster_id in enumerate(cluster_assignment):
    clustered_sentences[cluster_id].append(corpus[sentence_id])
for i, cluster in enumerate(clustered_sentences):
    print("Cluster", i+1)
    print(cluster)
    print(len(cluster))
    print("")
Which results in the following lists:
Cluster 1
['The girl is carrying a baby.', 'The baby is carried by the woman']
2
Cluster 2
['A man is riding a horse.', 'A man is riding a white horse on an enclosed ground.']
2
Cluster 3
['A man is eating food.', 'A man is eating a piece of bread.', 'A man is eating pasta.']
3
Cluster 4
['A cheetah is running behind its prey.', 'A cheetah chases prey on across a field.']
2
Cluster 5
['A monkey is playing drums.', 'Someone in a gorilla costume is playing a set of drums.']
2
How can I combine these separate lists into one?
Expected outcome:
list2 = [['The girl is carrying a baby.', 'The baby is carried by the woman'], ..., ['A monkey is playing drums.', 'Someone in a gorilla costume is playing a set of drums.']]
I tried the following:
list2=[]
for i in cluster:
    list2.append(i)
list2
But it returns only the last one:
['A monkey is playing drums.',
'Someone in a gorilla costume is playing a set of drums.']
Any ideas?
Following that example, you don't need to do anything to get a list of lists; that's already been done for you.
Try printing clustered_sentences.
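The attempt in the question returns only the last cluster because, after the printing loop finishes, the name `cluster` still refers to the final element of `clustered_sentences`. A small sketch with made-up two-cluster data (no model or clustering involved, so the lists here are illustrative only):

```python
clustered_sentences = [
    ['The girl is carrying a baby.', 'The baby is carried by the woman'],
    ['A monkey is playing drums.'],
]

for i, cluster in enumerate(clustered_sentences):
    pass  # the original loop printed each cluster here

# After the loop, `cluster` is just the last inner list.
print(cluster is clustered_sentences[-1])  # True

# The list of lists already exists; copy it if a separate object is wanted.
list2 = [list(c) for c in clustered_sentences]
print(len(list2))  # 2
```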
Basically, you need to get a "flat" list from a list of lists; you can achieve that with a Python list comprehension:
flat = [item for sub in clustered_sentences for item in sub]
I have a list of strings which needs to be transformed into a smaller list of strings, depending on whether two consecutive elements belong to the same phrase.
This happens, at the moment, if the last character of the i-th string is lowercase and the first character of the (i+1)-th string is also lowercase, but more complex conditions should be checked in the future.
For example this very profound text:
['I am a boy',
'and like to play'
'My friends also'
'like to play'
'Cats and dogs are '
'nice pets, and'
'we like to play with them'
]
should become:
['I am a boy and like to play',
'My friends also like to play',
'Cats and dogs are nice pets, and we like to play with them'
]
My python solution
I assume the data you posted is meant to be comma-separated. If so, please find below a simple loop solution.
data=['I am a boy',
      'and like to play',
      'My friends also',
      'like to play',
      'Cats and dogs are ',
      'nice pets, and',
      'we like to play with them'
      ]
required_list=[]
for j,i in enumerate(data):
    print(i,j)
    if j==0:
        req=i
    else:
        if i[0].isupper():
            required_list.append(req)
            req=i
        else:
            req=req+" "+i
required_list.append(req)
print(required_list)
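Here is a regex-based version you can check:
import re
data = ['I am a boy',
        'and like to play',
        'My friends also',
        'like to play',
        'Cats and dogs are ',
        'nice pets, and',
        'we like to play with them'
        ]
# Join with single spaces; joining on ',' and then replacing ',' with ' '
# would also strip the comma inside 'nice pets, and'
joined_string = " ".join(s.strip() for s in data)
# Each phrase starts at an uppercase letter and runs until the next one
values = [v.strip() for v in re.findall('[A-Z][^A-Z]*', joined_string)]
print(values)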
Since you want to do it recursively, you can try something like this:
def join_text(text, new_text):
    if not text:
        return
    if not new_text:
        new_text.append(text.pop(0))
        return join_text(text, new_text)
    phrase = text.pop(0)
    if phrase[0].islower():  # you can add more complicated logic here
        new_text[-1] += ' ' + phrase
    else:
        new_text.append(phrase)
    return join_text(text, new_text)
phrases = [
    'I am a boy',
    'and like to play',
    'My friends also',
    'like to play',
    'Cats and dogs are ',
    'nice pets, and',
    'we like to play with them'
]
joined_phrases = []
join_text(phrases, joined_phrases)
print(joined_phrases)
My solution has some problems with whitespace, but I hope you get the idea.
Hope it helps!
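Since the question expects more complex conditions later, one option is to keep the merge loop separate from the rule that decides whether two lines belong together. A minimal iterative sketch, assuming hypothetical helper names `same_phrase` and `merge_phrases`, and assuming trailing whitespace should be ignored when checking the last character (the question does not say so explicitly):

```python
def same_phrase(previous, current):
    """Rule from the question: previous ends and current starts in lowercase."""
    prev = previous.rstrip()  # assumption: ignore trailing whitespace
    return bool(prev) and bool(current) and prev[-1].islower() and current[0].islower()

def merge_phrases(lines, belongs_together=same_phrase):
    """Merge consecutive lines for which belongs_together(previous, current) is true."""
    merged = []
    for line in lines:
        if merged and belongs_together(merged[-1], line):
            merged[-1] = merged[-1].rstrip() + " " + line.lstrip()
        else:
            merged.append(line)
    return merged

data = ['I am a boy', 'and like to play', 'My friends also', 'like to play',
        'Cats and dogs are ', 'nice pets, and', 'we like to play with them']
result = merge_phrases(data)
print(result)
```

Swapping in a different rule later only means passing another function as `belongs_together`; the loop itself stays unchanged, and the input list is not modified.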
I'm trying to create entries (1000), and I'm starting with the name. I came up with some names, and I planned to copy each entry with a number 0-9 added to create more unique names. So I used a for loop inside of a for loop. Is it not possible to convert the index into a string and add it to the end of each item in the list?
I thought about incrementing, because I have been coding a lot in C++ lately, but that didn't work because you don't need to increment when you use Python's range function. I also thought about changing the order of the loops.
name = ['event', 'thing going on', 'happening', 'what everyones talkin', 'that thing', 'the game', 'the play', 'outside time', 'social time', 'going out', 'having fun']
for index in range(10):
    for item in name:
        name.append(item+str(index))
return name
I want it to print out ['event0', 'thing going on1', ... 'having fun10']
Thank you!
Using list comprehension
enumerate() adds a counter to an iterable and returns it as an enumerate object.
Ex.
name = ['event', 'thing going on', 'happening', 'what everyones talkin', 'that thing', 'the game', 'the play', 'outside time', 'social time', 'going out', 'having fun']
new_list = [x+str(index) for index,x in enumerate(name)]
print(new_list)
O/P:
['event0', 'thing going on1', 'happening2', 'what everyones talkin3', 'that thing4', 'the game5', 'the play6', 'outside time7', 'social time8', 'going out9', 'having fun10']
Use a new list and append there:
newName = []
for index in range(10):
    for item in name:
        newName.append(item+str(index))
return newName
You're adding more elements to the list every time you run through the loop. Make a new empty list and append the elements to that list. That way your initial list stays the same.
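If the goal is every name combined with every digit 0-9 (which is how the question describes growing the list), a nested comprehension that writes into a new list avoids iterating over a list that keeps growing. A sketch with a shortened, illustrative name list; `numbered` is a hypothetical variable name:

```python
name = ['event', 'happening', 'the game']  # shortened for illustration

# Outer loop over the digits, inner loop over the original names.
numbered = [item + str(index) for index in range(10) for item in name]

print(len(numbered))  # 30
print(numbered[:4])   # ['event0', 'happening0', 'the game0', 'event1']
```

With the eleven names from the question this would give 110 entries, and `name` itself stays unchanged.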
The inner for loop is appending new elements to the list while you iterate over it.
You simply need
for index in range(len(name)):
    name[index] = name[index] + str(index)
Now your list contains your expected output. Note that this changes your original list. If you want to keep it unchanged, do the following instead.
newArray = []
for index in range(len(name)):
    newArray.append(name[index] + str(index))
It will work as you expect:
name = ['event', 'thing going on', 'happening', 'what everyones talkin', 'that thing',
        'the game', 'the play', 'outside time', 'social time', 'going out', 'having fun']
for index in range(len(name)):
    name[index] = name[index]+str(index)
print (name)
Output:
['event0', 'thing going on1', 'happening2', 'what everyones talkin3', 'that thing4', 'the game5', 'the play6', 'outside time7', 'social time8', 'going out9', 'having fun10']
I have been having problems trying to find a way to replace tags in my strings in Python.
What I have at the moment is the text:
you should buy a {{cat_breed + dog_breed}} or a {{cat_breed + dog_breed}}
Where cat_breed and dog_breed are lists of cat and dog breeds.
What I want to end up with is:
you should buy a Scottish short hair or a golden retriever
I want the tag to be replaced by a random entry in one of the two lists.
I have been looking at re.sub() but I do not know how to fix the problem and not just end up with the same result in both tags.
Use random.sample to get two unique elements from the population.
import random
cats = 'lazy cat', 'cuddly cat', 'angry cat'
dogs = 'dirty dog', 'happy dog', 'shaggy dog'
print("you should buy a {} or a {}".format(*random.sample(dogs + cats, 2)))
There's no reason to use regular expressions here. Just use str.format instead.
I hope the code below gives you some idea of how to complete your task:
import random
list1 = ['cat_breed1', 'cat_breed2']
list2 = ['dog_breed1', 'dog_breed2']
a = random.choice(list1)
b = random.choice(list2)
sentence = "you should buy a %s or a %s" %(a, b)
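If the text has to keep its {{...}} tags as a template, `re.sub` also accepts a function as the replacement, which is called once per match. A minimal sketch, assuming every tag should draw from the combined list and that the breed lists below are placeholders; `fill_tags` and `TAG` are hypothetical names:

```python
import random
import re

cat_breed = ['Scottish short hair', 'Siamese']     # placeholder data
dog_breed = ['golden retriever', 'border collie']  # placeholder data
TAG = re.compile(r"\{\{.*?\}\}")  # matches one {{...}} tag, non-greedy

def fill_tags(text, choices):
    """Replace each tag with a different random entry from choices."""
    n_tags = len(TAG.findall(text))
    # random.sample returns distinct picks; it needs n_tags <= len(choices)
    picks = iter(random.sample(choices, n_tags))
    return TAG.sub(lambda match: next(picks), text)

template = "you should buy a {{cat_breed + dog_breed}} or a {{cat_breed + dog_breed}}"
result = fill_tags(template, cat_breed + dog_breed)
print(result)
```

Drawing all picks up front with `random.sample` is what keeps the two tags from ending up with the same entry.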
My goal is to split the string below only on double white spaces. See example string below and an attempt using the regular split function.
My attempt
>>> _str='The lorry ran  into the mad man  before turning over'
>>> _str.split()
['The', 'lorry', 'ran', 'into', 'the', 'mad', 'man', 'before', 'turning', 'over']
Ideal result:
['the lorry ran', 'into the mad man', 'before turning over']
Any suggestions on how to arrive at the ideal result? thanks.
split accepts an optional separator argument to split on:
>>> _str='The lorry ran  into the mad man  before turning over'
>>> _str.split('  ')
['The lorry ran', 'into the mad man', 'before turning over']
From the doc
str.split([sep[, maxsplit]])
Return a list of the words in the string, using sep as the delimiter string.
If maxsplit is given, at most maxsplit splits are
done (thus, the list will have at most maxsplit+1 elements).
If sep is given, consecutive delimiters are not grouped together and are deemed
to delimit empty strings (for example,
'1,,2'.split(',') returns ['1', '', '2']). The sep argument may
consist of multiple characters (for example, '1<>2<>3'.split('<>')
returns ['1', '2', '3']).
split takes a separator argument. Just pass '  ' (two spaces) to it:
>>> _str='The lorry ran  into the mad man  before turning over'
>>> _str.split('  ')
['The lorry ran', 'into the mad man', 'before turning over']
>>>
Give your split() a double space as an argument.
>>> _str='The lorry ran  into the mad man  before turning over'
>>> _str.split("  ")
['The lorry ran', 'into the mad man', 'before turning over']
>>>
Use the re module:
>>> import re
>>> example = 'The lorry ran  into the mad man  before turning over'
>>> re.split(r'\s{2}', example)
['The lorry ran', 'into the mad man', 'before turning over']
Since you need to split on 2 or more spaces, you can do this:
>>> import re
>>> _str = 'The lorry ran  into the mad man  before turning over'
>>> re.split(r"\s{2,}", _str)
['The lorry ran', 'into the mad man', 'before turning over']
>>> _str = 'The lorry ran     into the mad man    before turning over'
>>> re.split(r"\s{2,}", _str)
['The lorry ran', 'into the mad man', 'before turning over']
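The answers above differ when a gap is longer than two spaces. A short sketch comparing them on an illustrative string with a three-space gap followed by a two-space gap:

```python
import re

text = "The lorry ran   into the mad man  before turning over"  # 3 spaces, then 2

exact = text.split("  ")                # splits on exactly two spaces
regex_two = re.split(r"\s{2}", text)    # exactly two whitespace characters
regex_many = re.split(r"\s{2,}", text)  # two or more whitespace characters

print(exact)       # ['The lorry ran', ' into the mad man', 'before turning over']
print(regex_two)   # ['The lorry ran', ' into the mad man', 'before turning over']
print(regex_many)  # ['The lorry ran', 'into the mad man', 'before turning over']
```

Only the `{2,}` pattern absorbs the extra space; the other two leave it at the start of the second piece.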