Script Slicing in Python

I am new to programming and Python.
I have a list and I want to truncate each word so that it consists of at most its first 6 letters.
A sample list is below.
new_list = ['partiye', 'oy', 'vermeyecegimi', 'bilerek', 'sandiga', 'gidecegim']
I used the code below to cut the words.
s = []
a = []
for i in range(len(new_list)):
    s = new_list[i]
    for j in new_list:
        a = s[:6]
    print a
Each word now has at most 6 letters. The output of the code is:
partiy
oy
vermey
bilere
sandig
gidece
But I could not store the truncated words in a new list. Can someone advise me on how to do that?

This is where list comprehensions become really handy:
>>> new_list = ['partiye', 'oy', 'vermeyecegimi', 'bilerek', 'sandiga', 'gidecegim']
>>> print [i[:6] for i in new_list]
['partiy', 'oy', 'vermey', 'bilere', 'sandig', 'gidece']
If you wanted to expand it:
s = []
for i in new_list:
    s.append(i[:6])
It's pretty much the same approach as what you're doing, but it's a little neater.
In your code, a is re-assigned on every iteration, so only the last value survives and the inner loop does nothing useful. I think you wanted a.append(s[:6]), which adds a value to a list, but even then the nested loop would append each word once per element of new_list, so a single loop is what you need.
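As a minimal Python 3 sketch of the same idea, the slicing can be wrapped in a small function. The helper name truncate_words and its limit parameter are made up for illustration and are not part of the original answer.

```python
def truncate_words(words, limit=6):
    """Return a new list with each word cut to at most `limit` characters."""
    # Slicing never raises for short strings: 'oy'[:6] is simply 'oy'.
    return [word[:limit] for word in words]


new_list = ['partiye', 'oy', 'vermeyecegimi', 'bilerek', 'sandiga', 'gidecegim']
short_list = truncate_words(new_list)
print(short_list)
```

The original list is left untouched, so both the full and the truncated words remain available.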

Related

Quicker way to filter lists based on a check to external variable?

I have a variable = 'P13804'
I also have a list like this:
['1T9G\tA\t2.9\tP11310\t241279.81', '1T9G\tS\t2.9\tP38117\t241279.81', '1T9G\tD\t2.9\tP11310\t241279.81', '1T9G\tB\t2.9\tP11310\t241279.81', '1T9G\tR\t2.9\tP13804\t241279.81', '1T9G\tC\t2.9\tP11310\t241279.81']
You can see, if you split each item in this list by tab, that the fourth field (index 3) of each item is sometimes 'P11310' and sometimes 'P13804'.
I want to remove the items where that field does not match my variable of interest (in this case, P13804).
I know one way to do this:
var = 'P13804'
new_list = []
for each_item in list1:
    split_each_item = each_item.split('\t')
    if split_each_item[3] == var:  # keep only the items that match
        new_list.append(each_item)
print(new_list)
In reality, the lists are really long, and I have a lot of variables to check. So I'm wondering, does someone have a faster way of doing this?

It is generally more efficient in Python to build a list with a comprehension than to append to it repeatedly. So I would use:
var = 'P13804'
new_list = [i for i in list1 if i.split('\t')[3] == var]
According to timeit, it saves roughly 20% of the elapsed time.
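Since the question mentions many variables to check, one possible refinement is to split each row only once and group the rows by the field at index 3. This is a sketch under that assumption, not something the answer above proposes; the name rows_by_id is hypothetical and the sample data is shortened.

```python
from collections import defaultdict

list1 = [
    '1T9G\tA\t2.9\tP11310\t241279.81',
    '1T9G\tS\t2.9\tP38117\t241279.81',
    '1T9G\tR\t2.9\tP13804\t241279.81',
]

# Index the rows once by the field at index 3.
rows_by_id = defaultdict(list)
for row in list1:
    rows_by_id[row.split('\t')[3]].append(row)

# Each lookup is now a dictionary access instead of a full scan of list1.
matches = rows_by_id.get('P13804', [])
print(matches)
```

Building the index costs one pass over the list; after that, checking another variable does not require splitting the rows again.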

How to form a list from queried data?

I have the following code:
s = (f'{item["Num"]}')
my_list = []
my_list.append(s)
print(my_list)
As you can see, I want to build a list that I can then store under my_list. The output from my code looks like this (a sample from around 2000 different values):
['01849']
['01852']
['01866']
['01883']
etc...
This is not what I had in mind; I want it to look like this:
['01849', '01852', '01866', '01883']
Does anyone have any suggestions about what I am doing wrong when I create the list? Thanks

You can fix your problem and represent this compactly with a list comprehension. Assuming your collection is called items, it can be represented as such, without the loop:
my_list = [f'{item["Num"]}' for item in items]

You should initialize the list once, before the loop, and then populate it inside a for-loop; your output suggests the list is being re-created on every iteration. So, assuming your collection is called items:
my_list = []
for item in items:
    s = f'{item["Num"]}'
    my_list.append(s)
print(my_list)
Even better, you can use a list comprehension for this:
my_list = [f'{item["Num"]}' for item in items]
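To make the difference concrete, here is a small runnable sketch. The items list is a hypothetical stand-in for the queried data, whose real source is not shown in the question.

```python
# Hypothetical stand-in for the queried data; the real source is not shown.
items = [{"Num": "01849"}, {"Num": "01852"}, {"Num": "01866"}, {"Num": "01883"}]

my_list = []                      # created once, before the loop
for item in items:
    my_list.append(f'{item["Num"]}')

print(my_list)                    # printed once, after the loop
```

Creating my_list and printing it outside the loop is what turns many one-element lists into a single list.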

Relationship between elements of two list: how to exploit it in Python?

So here is my minimal working example:
import numpy as np
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# Then I want to have these items all shuffled around, so for instance
list2 = np.random.permutation(list2)
#Now here is my problem: I want to understand which element of the new list2 came from which element of list1. I am looking for something like this:
list1.index(something)
# Basically, given an element of list2, I want to know where it came from in list1. I really can't think of a simple way of doing this, but there must be an easy way!
Can you please suggest an easy solution? This is a minimal working example; the main point is that I have a list, I perform some operation on its elements and assign the results to a new list. Then the items get shuffled around and I need to work out where they came from.

enumerate, like everyone said, is the best option, but there is an alternative if you know the mapping relation. You can write a function that does the opposite of the mapping (e.g. one that decodes if the original function encodes).
Then you use decoded_list = list(map(decode_function, encoded_list)) to get a new list. By comparing this list with the original list, you can achieve your goal.
Enumerate is better if you are certain that the same list was modified using the encode_function from within the code to get the encoded list.
However, if you are importing this new list from elsewhere, eg. from a table on a website, my approach is the way to go.

You could use a permutation list/index :
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# Then I want to have these items all shuffled around, so for instance
index_list = range(len(list2))
index_list = np.random.permutation(index_list)
list3 = [list2[i] for i in index_list]
Then, given input_element:
answer = index_list[list3.index(input_element)]
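The same permutation-index idea can be sketched with only the standard library, using random.shuffle in place of np.random.permutation. The helper origin_index is a hypothetical name introduced for illustration; it looks positions up directly rather than searching by value, which is one way to stay correct if list2 contains duplicates.

```python
import random

list1 = [1, 2, 3, 4]
list2 = [2 ** j for j in list1]

# Shuffle the positions rather than the values, so the mapping is kept.
index_list = list(range(len(list2)))
random.shuffle(index_list)
list3 = [list2[i] for i in index_list]


def origin_index(position_in_list3):
    """Return the index in list1 that produced list3[position_in_list3]."""
    return index_list[position_in_list3]


for pos, value in enumerate(list3):
    src = origin_index(pos)
    print(value, "came from list1[%d] = %d" % (src, list1[src]))
```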

Based on your code:
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# make a record of each index and value
index_list2 = list(enumerate(list2))
# Then I want to have these items all shuffled around, so for instance
index_list3 = np.random.permutation(index_list2)
idx, list3 = zip(*index_list3)
# Get the index of element_input in list3, then look up the value at that index in idx; that is the answer you want.
answer = idx[list3.index(element_input)]

def index3_to_1(index):
    y = list3[index]
    x = int(round(np.log2(y)))  # inverse of y = 2**x; rounding guards against floating-point error
    return list1.index(x)
This assumes that the operation you apply to produce list2 is invertible. It also assumes that each element in list1 is unique.

Python loop through list and shorten by one

I have a list:
mylist = ['apple', 'orange', 'dragon', 'panda']
I want to be able to loop over the list, do something with each element and then remove the element. I tried this:
for l in mylist:
    print l
    mylist.remove(l)
but my output is:
apple
dragon
EDIT
I actually want to be able to do some comparisons in the loop. So basically I want to take each element, one by one, remove that element from the list and compare it against all the other elements in the list. The comparison is a little complex, so I don't want to use a list comprehension. And I want to reduce the list by one each time until the list is empty and all elements have been compared with each other.
What is the best way to get each element, work with it and remove it without skipping elements in the list?
Any help, much appreciated.
REDIT
Just to make clear - the real point of this is to go through each element, which is a string fragment and match it with other fragments which have overlapping sequences on either end, thereby building up a complete sequence. The element being processed should be removed from the list prior to looping so that it isn't compared with itself, and the list should shrink by 1 element each processing loop.
In that case, a better list example would be:
mylist = ['apples and or', 'oranges have', 'in common', 'e nothing in c']
to give:
'apples and oranges have nothing in common'
Apologies for not being clear from the outset, but it was a specific part of this larger problem that I was stuck on.

You can just reverse the list order (if you want to process the items in the original order), then use pop() to get the items and remove them in turn:
my_list = ['apple', 'orange', 'dragon', 'panda']
my_list.reverse()
while my_list:
    print(my_list.pop())

Based on your requirement that you want to "be able to take each element, one-by-one, . . . for the list and compare it against all the other elements in the list", I believe you're best suited to use itertools. Here, without the inefficiency of removing elements from your list, you gain the fool-proof ability to compare every pair of elements once and only once. Since your spec doesn't seem to provide any use for the deletion (other than achieving the goal of combinations), I feel this works quite nicely.
That said, a list comprehension would be the most Pythonic way to approach this, in my opinion, as it does not compromise any capability to do complex comparisons.
import itertools
l = ['apple', 'orange', 'dragon', 'panda']
def yourfunc(a,b):
    pass
# combinations() never pairs an element with itself
for a, b in itertools.combinations(l, 2):
    yourfunc(a,b)
A list comprehension approach would have this code instead:
[yourfunc(a, b) for a,b in itertools.combinations(l, 2)]
EDIT: Based on your additional information, I believe you should reconsider itertools.
import itertools
l = ['apples and or', 'oranges have', 'in common', 'e nothing in c', 'on, dont you know?']
def find_overlap(a,b):
    for i in xrange(len(a)):
        if a[-i:] == b[0:i]:
            return a + b[i:]
    return ''
def reduce_combinations(fragments):
    matches = []
    for c in itertools.combinations(fragments, 2):
        f = reduce(find_overlap, c[1:], c[0])
        if f: matches.append(f)
    return matches
copy = l
while len(copy) > 1:
    copy = reduce_combinations(copy)
print copy
returns
['apples and oranges have nothing in common, dont you know?']
EDIT (again): This variation is a practical solution. It performs more computations than the solution above, but it has the added benefit of providing all technically possible matches. The problem with the solution above is that it expects exactly one answer, as evidenced by the while loop. It is therefore much more efficient, but it may return nothing if more than one answer exists.
import itertools
l = ['apples and or', 'oranges have', 'in common', 'e nothing in c', 'on, dont you know?']
def find_overlap(a,b):
    for i in xrange(len(a)):
        if a[-i:] == b[0:i]:
            return a + b[i:]
    return ''
matches = []
for c in itertools.combinations(l, 2):
    f = reduce(find_overlap, c[1:], c[0])
    if f: matches.append(f)
for c in itertools.combinations(matches, len(matches)):
    f = reduce(find_overlap, c[1:], c[0])
    if f: print f
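The overlap-matching idea can also be sketched in Python 3 as a greedy merge: repeatedly join the two fragments with the longest suffix/prefix overlap. This is a swapped-in heuristic, not the code from the answer above; the function names overlap and assemble are invented for illustration, and greedy merging is not guaranteed to find the best assembly for every input.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments):
    """Greedily merge the pair with the largest overlap until none is left."""
    pieces = list(fragments)          # work on a copy, leave the input alone
    while len(pieces) > 1:
        best = (0, None, None)        # (overlap length, left index, right index)
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:                    # nothing overlaps any more
            break
        merged = pieces[i] + pieces[j][k:]
        pieces = [p for n, p in enumerate(pieces) if n not in (i, j)]
        pieces.append(merged)
    return pieces


fragments = ['apples and or', 'oranges have', 'in common', 'e nothing in c']
result = assemble(fragments)
print(result)
```

Because each round rebuilds the list without the two merged fragments, the list shrinks by one per round, which matches the behaviour the question asks for.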

Is there any reason you can't simply loop through all of the elements, do something to them and then reset the list to an empty list afterwards? Something like:
for l in my_list:
    print l
my_list = []
# or, if you want to mutate the actual list object, and not just re-assign
# a blank list to my_list
my_list[:] = []
EDIT
Based on your update, what you need to do is use the popping approach that has been mentioned:
while len(my_list):
    item = my_list.pop()
    do_some_complicated_comparisons(item)
If you do care about order, then just pop from the front:
my_list.pop(0)
or reverse the list before looping:
my_list.reverse()

You shouldn't remove elements from a list while iterating over it. Process the elements first and then take care of the list. Modifying a collection during iteration is a pitfall in many programming languages, not just Python, because it causes these skipping issues.
As an alternative, you can do mylist = [] afterwards when you're done with the elements.

By making a copy:
for l in original[:]:
    print l
    original.remove(l)

You could use the stack operations to achieve that:
while len(mylist):
    myitem = mylist.pop(0)
    # Do something with myitem
    # ...

#! C:\python27
my_list = ['apple', 'orange', 'dragon', 'panda']
print my_list
myLength = len(my_list) - 1
print myLength
del my_list[myLength]
print my_list
[EDIT]
Here's the code to find a word that the user enters and remove it.
#! C:\python27
whattofind = raw_input("What shall we look for and delete?")
myList = ['apple', 'orange', 'dragon', 'panda']
print myList
# remove every occurrence without iterating over the list being changed
while whattofind in myList:
    myList.remove(whattofind)
print myList

If forward order doesn't matter, I might do something like this:
l = ['apple', 'orange', 'dragon', 'panda']
while l:
    print l.pop()
Given your edit, an excellent alternative is to use a deque instead of a list.
>>> import collections
>>> l = ['apple', 'orange', 'dragon', 'panda']
>>> d = collections.deque(l)
>>> while d:
...     i = d.popleft()
...     for j in d:
...         if i > j: print (i, j)
...
('orange', 'dragon')
Using a deque is nice because popping from either end is O(1). Best not to use a deque for random access though, because that's slower than for a list.
On the other hand, since you're iterating over the whole list every time anyway, the asymptotic performance of your code will be O(n ** 2) anyway. So using a list and popping from the beginning with pop(0) is justifiable from an asymptotic point of view (though it will be slower than using a deque by some constant multiple).
But in fact, since your goal seems to be generating combinations, you should consider hexparrot's answer, which is quite elegant -- though performance-wise, it shouldn't be too different from the above deque-based solution, since removing items from a deque is cheap.
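For reference, a Python 3 sketch of the deque-based loop looks like this. The string comparison is only a placeholder for the real fragment comparison, and the names fragments and pairs are chosen here for illustration.

```python
from collections import deque

fragments = deque(['apple', 'orange', 'dragon', 'panda'])
pairs = []

while fragments:
    current = fragments.popleft()    # O(1) removal from the front
    for other in fragments:          # only the elements not yet processed
        if current > other:          # placeholder for the real comparison
            pairs.append((current, other))

print(pairs)  # [('orange', 'dragon')]
```

The inner loop only reads the deque, so nothing is modified while it is being iterated.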

Hmm.. Seeing as you are not able to remove all items this way, even though you iterate through all of them.. try this:
#! C:\python27
list1 = ["apple","pear","falcon","bear"] #define list 1
list2 = [] #define list 2
item2 ="" #define temp item
for item in list1[:]:
    item2 = item+"2" #take current item from list1 and do something. (Add 2 in my case)
    list2.append(item2) #add modified item to list2
    list1.remove(item) #remove the un-needed item from list1
print list1 #becomes empty
print list2 #full and with modified items.
I'm assuming that if you are running a comparison, you can put an "if" clause after the "for" to run it. But that seems to be the way to do it.

Difference between two pieces of Python code

I'm doing an exercise as follows:
# B. front_x
# Given a list of strings, return a list with the strings
# in sorted order, except group all the strings that begin with 'x' first.
# e.g. ['mix', 'xyz', 'apple', 'xanadu', 'aardvark'] yields
# ['xanadu', 'xyz', 'aardvark', 'apple', 'mix']
# Hint: this can be done by making 2 lists and sorting each of them
# before combining them.
Sample solution:
def front_x(words):
    listX = []
    listO = []
    for w in words:
        if w.startswith('x'):
            listX.append(w)
        else:
            listO.append(w)
    listX.sort()
    listO.sort()
    return listX + listO
My solution:
def front_x(words):
    listX = []
    for w in words:
        if w.startswith('x'):
            listX.append(w)
            words.remove(w)
    listX.sort()
    words.sort()
    return listX + words
When I tested my solution, the result was a little weird. Here is the source code with my solution: http://dl.dropbox.com/u/559353/list1.py. You might want to try it out.

The problem is that you loop over the list and remove elements from it (modifying it):
for w in words:
    if w.startswith('x'):
        listX.append(w)
        words.remove(w)
Example:
>>> a = range(5)
>>> for i in a:
...     a.remove(i)
...
>>> a
[1, 3]
This code works as follows:
Get the first element (0) and remove it.
Move to the next element. But it is not 1 anymore, because we removed 0 previously and thus 1 became the new first element. The next element is therefore 2, and 1 is skipped.
The same happens again: 2 is removed, 3 is skipped, and 4 is removed, leaving [1, 3].
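The skipping can be made visible in Python 3 by recording which elements the loop actually visits. This is a small illustrative sketch; the visited list is added here only for demonstration.

```python
a = list(range(5))
visited = []
for i in a:
    visited.append(i)
    a.remove(i)        # shifts the remaining elements left by one

print(visited)  # [0, 2, 4]
print(a)        # [1, 3]

# Iterating over a copy visits every element, so the original empties out.
b = list(range(5))
for i in b[:]:
    b.remove(i)
print(b)        # []
```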

Two main differences:
Removing an element from a list inside a loop that iterates over that list doesn't work as you might expect in Python. If you were using Java, you would get an exception saying that you are modifying a collection that is being iterated. Python does not raise an error in this case. @Felix Kling explains it quite well in his answer.
Also, you are modifying the input parameter words, so the caller of your function front_x will see words modified after the function returns. Unless this behaviour is explicitly expected, it is better avoided. Imagine that your program is doing something else with words. Keeping two lists as in the sample solution is a better approach.

Altering the list you're iterating over results in unexpected behaviour. That's why the sample solution creates two new lists instead of deleting from the source list.
for w in words:
    if w.startswith('x'):
        listX.append(w)
        words.remove(w) # Problem here!
See this question for a discussion on this matter. It basically boils down to list iterators iterating through the indexes of the list without going back and checking for modifications (which would be expensive!).
If you want to avoid creating a second list, you will have to perform two iterations. One to iterate over words to create listX and another to iterate over listX deleting from words.

That hint is misleading and unnecessary; you can do this without sorting two lists independently and combining them:
>>> items = ['mix', 'xyz', 'apple', 'xanadu', 'aardvark']
>>> sorted(items, key=lambda item: (item[0]!='x', item))
['xanadu', 'xyz', 'aardvark', 'apple', 'mix']
The built-in sorted() function takes an optional key argument that tells it what to sort by. In this case, you want to create tuples like (False, 'xanadu') or (True, 'apple') for each element of the original list, which you can do with a lambda.
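Wrapped up as the exercise's front_x function, the key-based approach might look like the sketch below. It uses startswith instead of item[0] so that an empty string does not raise an IndexError; that substitution is an assumption about the desired behaviour, not part of the answer above.

```python
def front_x(words):
    """Return words sorted, with those starting with 'x' grouped first."""
    # False sorts before True, so words starting with 'x' come first;
    # ties are broken by the word itself. sorted() leaves `words` untouched.
    return sorted(words, key=lambda w: (not w.startswith('x'), w))


items = ['mix', 'xyz', 'apple', 'xanadu', 'aardvark']
result = front_x(items)
print(result)
```

Because sorted() returns a new list, the caller's list is not modified, which avoids the side effect discussed in the earlier answers.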
