Iterating over multiple lists in Python


I have a list within a list, and I am trying to iterate through one list, and then in the inner list I want to search for a value, and if this value is present, place that list in a variable.
Here's what I have, which doesn't seem to be doing the job:
for z, g in range(len(tablerows), len(andrewlist)):
    tablerowslist = tablerows[z]
    if "Andrew Alexander" in tablerowslist:
        andrewlist[g] = tablerowslist
Any ideas?
This is the list structure:
[['Kyle Bazzy', 'FUP dropbox message', '8/18/2011', 'Swing Trade Stocks</a>', ' ', 'Affiliate blog'], ['Kyle Bazzy', 'FUP dropbox message', '8/18/2011', 'Swing Trade Software</a>', ' ', 'FUP from dropbox message. Affiliate blog'], ['Kyle Bazzy', 'FUP dropbox message', '8/18/2011', 'Start Day Trading (Blog)</a>', ' ', 'FUP from dropbox message'], ['Kyle Bazzy', 'Call, be VERY NICE', '8/18/2011', ' ', 'r24867</a>', 'We have been very nice to him, but he wants to cancel, we need to keep being nice and seeing what is wrong now.'], ['Jason Raznick', 'Reach out', '8/18/2011', 'Lexis Nexis</a>', ' ', '-'], ['Andrew Alexander', 'Check on account in one week', '8/18/2011', ' ', 'r46876</a>', '-'], ['Andrew Alexander', 'Cancel him from 5 dollar feed', '8/18/2011', ' ', 'r37693</a>', '-'], ['Aaron Wise', 'FUP with contract', '8/18/2011', 'YouTradeFX</a>', ' ', "Zisa is on vacation...FUP next week and then try again if she's still gone."], ['Aaron Wise', 'Email--JASON', '8/18/2011', 'Lexis Nexis</a>', ' ', 'email by today'], ['Sarah Knapp', '3rd FUP', '8/18/2011', 'Steven L. Pomeranz</a>', ' ', '-'], ['Sarah Knapp', 'Are we really interested in partnering?', '8/18/2011', 'Reverse Spins</a>', ' ', "V. political, doesn't seem like high quality content. Do we really want a partnership?"], ['Sarah Knapp', '2nd follow up', '8/18/2011', 'Business World</a>', ' ', '-'], ['Sarah Knapp', 'Determine whether we are actually interested in partnership', '8/18/2011', 'Fayrouz In Dallas</a>', ' ', "Hasn't updated since September 2010."], ['Sarah Knapp', 'See email exchange w/Autumn; what should happen', '8/18/2011', 'Graham and Doddsville</a>', ' ', "Wasn't sure if we could partner bc of regulations, but could do something meant simply to increase traffic both ways."], ['Sarah Knapp', '3rd follow up', '8/18/2011', 'Fund Action</a>', ' ', '-']]
For any inner list that contains a particular value, say, Andrew Alexander, I want to make a separate list of those rows.
For example:
[['Andrew Alexander', 'Check on account in one week', '8/18/2011', ' ', 'r46876</a>', '-'], ['Andrew Alexander', 'Cancel him from 5 dollar feed', '8/18/2011', ' ', 'r37693</a>', '-']]

Assuming you have a list whose elements are lists, this is what I'd do:
andrewlist = [row for row in tablerows if "Andrew Alexander" in row]

>>> # I have a list within a list,
>>> lol = [[1, 2, 42, 3], [4, 5, 6], [7, 42, 8]]
>>> found = []
>>> # iterate through one list,
>>> for i in lol:
...     # in the inner list I want to search for a value
...     if 42 in i:
...         # if this value is present, place that list in a variable
...         found.append(i)
...
>>> found
[[1, 2, 42, 3], [7, 42, 8]]

for z, g in range(len(tablerows), len(andrewlist)):
This means "make a list of the numbers which are between the length of tablerows and the length of andrewlist, and then look at each of those numbers in turn, and treat those numbers as a list of two values, and assign the two values to z and g each time through the loop".
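A number cannot be unpacked into two values, so this raises a TypeError on the first iteration (or, if the range happens to be empty, the loop body never runs at all).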
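To make the failure described above concrete, here is a minimal sketch. The sample `tablerows` data is made up for illustration, and the range bounds are simplified so that the loop actually executes; it shows the unpacking error and then a plain loop that seems to match what the question is after.

```python
# Hypothetical sample data, not the asker's real table.
tablerows = [
    ["Kyle Bazzy", "FUP dropbox message"],
    ["Andrew Alexander", "Check on account in one week"],
    ["Andrew Alexander", "Cancel him from 5 dollar feed"],
]

# range() yields plain integers, so "for z, g in range(...)" tries to
# unpack an int into two names and fails on the first iteration.
error_message = None
try:
    for z, g in range(0, len(tablerows)):
        pass
except TypeError as exc:
    error_message = str(exc)

# Iterating over the rows directly avoids index bookkeeping entirely.
andrewlist = []
for row in tablerows:
    if "Andrew Alexander" in row:
        andrewlist.append(row)

print(error_message)
print(andrewlist)
```

Appending to a fresh list also sidesteps the question of how long `andrewlist` needs to be before the loop starts.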
You need to be much, much clearer about what you are doing. Show an example of the contents of tablerows before the loop, and the contents of andrewlist before the loop, and what it should look like afterwards. Your description is muddled: I can only guess that when you say "and then I want to iterate through one list" you mean one of the lists in your list-of-lists; but I can't tell whether you want one specific one, or each one in turn. And then when you next say "and then in the inner list I want to...", I have no idea what you're referring to.

Related

Extract text with multiple regex patterns in Python

I have a list with address information
The placement of words in the list can be random.
address = [' South region', ' district KTS', ' 4', ' app. 106', ' ent. 1', ' st. 15']
I want to extract each item of the list into its own string.
r = re.compile(".region")
region = list(filter(r.match, address))
It works, but there is more than one spelling of "region". For example, it can be "South reg." or "South r-n".
How can I combine multiple patterns?
The digit 4 in the list is the building number. It can be only digits, or something like 4k1.
How can I extract building number?

Hopefully I understood the requirement correctly.
For extracting the region, I chose to get it by the first word, but if you can be sure of the regions which are accepted, it would be better to construct the regex based on the valid values, not first word.
Also, for the building extraction, I am not sure of which are the characters you want to keep, versus the ones which you may want to remove. In this case I chose to keep only alphanumeric, meaning that everything else would be stripped.
CODE
import re
list1 = [' South region', ' district KTS', ' -4k-1.', ' app. 106', ' ent. 1', ' st. 15']
def GetFirstWord(list2, column):
    # First run of word characters in the chosen item
    return re.search(r'\w+', list2[column].strip()).group()
def KeepAlpha(list2, column):
    # Drop everything except letters, digits, and spaces
    return re.sub(r'[^A-Za-z0-9 ]+', '', list2[column].strip())
print(GetFirstWord(list1, 0))
print(KeepAlpha(list1, 2))
OUTPUT
South
4k1
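The question also asks how to combine several patterns, which the code above does not show. One possible sketch uses regex alternation (`|`). The exact spellings accepted for "region" and the shape of a building number (digits, optionally followed by `k` and more digits) are assumptions based on the examples in the question, not a confirmed specification.

```python
import re

# Sample data modeled on the question; the spellings are assumptions.
address = [" South reg.", " district KTS", " 4k1", " app. 106", " ent. 1", " st. 15"]

# One pattern with alternation covers "region", "reg." and "r-n".
region_re = re.compile(r"\b(?:region|reg\.|r-n)")
# A building number: digits, optionally followed by "k" and more digits,
# with nothing else in the item.
building_re = re.compile(r"\d+(?:k\d+)?")

# search() looks anywhere in the item; fullmatch() requires the whole item to fit.
regions = [item.strip() for item in address if region_re.search(item)]
buildings = [item.strip() for item in address if building_re.fullmatch(item.strip())]

print(regions)
print(buildings)
```

Because the building pattern must match the whole item, entries such as ' app. 106' are not mistaken for building numbers.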

Why is this code acting as if it is stuck in an infinite loop?

I'm trying to create 1000 entries, and I'm starting with the name. I came up with some names, and I planned to copy each entry with a number 0-9 added to it to create more unique names. So I used a for loop inside of a for loop. Is it not possible to change the index into a string and add it to the end of the item in the list?
I thought about incrementing, because I have been coding a lot in C++ lately, but that didn't work because you don't need to increment when you use Python's range function. I also thought about changing the order of the loops.
name = ['event', 'thing going on', 'happening', 'what everyones talkin', 'that thing', 'the game', 'the play', 'outside time', 'social time', 'going out', 'having fun']
for index in range(10):
    for item in name:
        name.append(item+str(index))
return name
I want it to print out ['event0', 'thing going on1', ... 'having fun10']
Thank you!

Using list comprehension
enumerate() adds a counter to an iterable and returns it as an enumerate object.
Ex.
name = ['event', 'thing going on', 'happening', 'what everyones talkin', 'that thing', 'the game', 'the play', 'outside time', 'social time', 'going out', 'having fun']
new_list = [x+str(index) for index,x in enumerate(name)]
print(new_list)
O/P:
['event0', 'thing going on1', 'happening2', 'what everyones talkin3', 'that thing4', 'the game5', 'the play6', 'outside time7', 'social time8', 'going out9', 'having fun10']

Use a new list and append to it:
newName = []
for index in range(10):
    for item in name:
        newName.append(item+str(index))
return newName  # assumes this runs inside a function, as in the question

You're adding more elements to the list every time you run through the loop. Make a new empty list and append the elements to that list. That way your initial list stays the same.
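As a rough illustration of the point above, the sketch below uses a shortened, made-up name list. Appending to `name` while looping over it would never finish, because every pass makes the list longer; collecting results in a separate list terminates after a fixed number of steps.

```python
name = ["event", "thing going on", "happening"]  # shortened sample list

# The inner loop only reads from "name"; all new strings go to "new_names",
# so the list being iterated never grows.
new_names = []
for index in range(10):
    for item in name:
        new_names.append(item + str(index))

print(len(new_names))
print(new_names[:4])
```

With 3 names and 10 suffixes this yields 30 entries, and `name` itself is left untouched.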

The inner for loop is appending new elements to the list it is iterating over.
You simply need
for index in range(len(name)):
    name[index] = name[index] + str(index)
Now your list contains your expected output. This changes your original list, by the way. If you want to keep it unchanged, do the following instead.
newArray = []
for index in range(len(name)):
    newArray.append(name[index] + str(index))

It will work as you expect:
name = ['event', 'thing going on', 'happening', 'what everyones talkin', 'that thing',
        'the game', 'the play', 'outside time', 'social time', 'going out', 'having fun']
# range(len(name)) covers all 11 names; range(10) would skip the last one
for index in range(len(name)):
    name[index] = name[index]+str(index)
print(name)
Output:
['event0', 'thing going on1', 'happening2', 'what everyones talkin3', 'that thing4', 'the game5', 'the play6', 'outside time7', 'social time8', 'going out9', 'having fun10']

Is there a way to remove specific combinations of punctuation in a string?

I have been iterating over soup that I have scraped, and part of the data I need is so close to being right, but I just can't get the last part clean. Is there a simple way to do the following?
I've tried to use re and join, but neither works, because the punctuation shows up in varied ways.
I want to turn this:
"['Coming To ', America]", "['Captain ', America, ': The Winter...']",
"[America, 'n Pie']", "[America, 'n Made']"
Into this:
'Coming To America', 'Captain America: The Winter...', 'American Pie',
'American Made'

As you are probably reading Python code from a file, you could use eval, as this is the most generic method to compute what you want.
This avoids adding a new replace call each time a new character appears (such as tabs or parentheses), but it is also a security risk if the strings come from a source you do not trust.
The eval function lets a Python program run Python code within itself.
You need to define the variable America to make each string a valid Python expression; then you can eval it to a list and join the parts.
s = ["['Coming To ', America]", "['Captain ', America, ': The Winter...']", "[America, 'n Pie']", "[America, 'n Made']"]
America = 'America'
for x in s:
    print(''.join(eval(x)))
Output :
Coming To America
Captain America: The Winter...
American Pie
American Made

Use map() on the list and filter() on each string in the list:
lst = ["['Coming To ', America]", "['Captain ', America, ': The Winter...']",
       "[America, 'n Pie']", "[America, 'n Made']"]
punct = set("[],'\n")
print(list(
    map(lambda s: ''.join(filter(lambda c: c not in punct, s)), lst)
))
Outputs:
['Coming To  America', 'Captain  America : The Winter...', 'America n Pie', 'America n Made']
If you want to remove other characters, just add them to punct. Note that the space after each comma is kept, so the pieces are not joined seamlessly.

Using ast for this might be overdoing it, but anyway here is a way:
import ast
# AST visitor that transforms names into string constants
class NamesAsStrings(ast.NodeTransformer):
    def visit_Name(self, node):
        # ast.Constant replaces the deprecated ast.Str node type
        return ast.copy_location(ast.Constant(value=node.id), node)
ss = ("['Coming To ', America]",
      "['Captain ', America, ': The Winter...']",
      "[America, 'n Pie']",
      "[America, 'n Made']")
visitor = NamesAsStrings()
strs = [''.join(ast.literal_eval(visitor.visit(ast.parse(s)).body[0].value)) for s in ss]
print(*strs, sep='\n')
Output:
Coming To America
Captain America: The Winter...
American Pie
American Made
This works only if non-string elements (here America) are valid Python names. However, it has the advantage that it will deal correctly with escaped characters in the strings.

The function that you want is the replace method of the strings.
Its syntax is like this:
newString = oldString.replace("oldSubstring", "newSubstring")
So, using it to resolve your problem would look like this:
a = ["['Coming To ', America]", "['Captain ', America, ': The Winter...']", "[America, 'n Pie']", "[America, 'n Made']"]
result = []
# Order matters: the multi-character separators must be removed before the lone quote
toRemove = ["', ", ", '", "'", "[", "]"]
for element in a:
    b = element
    for punct in toRemove:
        b = b.replace(punct, "")
    result.append(b)
print("\n".join(result))
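Another option, sketched here as an assumption rather than something taken from the answers above, is to tokenize each string with a single regular expression: a quoted chunk or a bare word. It assumes the quoted pieces never contain an escaped single quote, and the names `token_re` and `join_tokens` are made up for this example.

```python
import re

titles = ["['Coming To ', America]", "['Captain ', America, ': The Winter...']",
          "[America, 'n Pie']", "[America, 'n Made']"]

# Either a single-quoted string (group 1) or a bare word (group 2).
token_re = re.compile(r"'([^']*)'|(\w+)")

def join_tokens(text):
    # findall returns (quoted, bare) pairs; at most one of each pair is non-empty
    return "".join(quoted or bare for quoted, bare in token_re.findall(text))

cleaned = [join_tokens(t) for t in titles]
print(cleaned)
```

Since the brackets, commas, and separating spaces fall outside both groups, they are dropped without listing each punctuation combination.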

Regex - Splitting Strings at full-stops unless it's part of an honorific [closed]

I have a list containing all possible titles:
['Mr.', 'Mrs.', 'Ms.', 'Dr.', 'Prof.', 'Rev.', 'Capt.', 'Lt.-Col.', 'Col.', 'Lt.-Cmdr.', 'The Hon.', 'Cmdr.', 'Flt. Lt.', 'Brgdr.', 'Wng. Cmdr.', 'Group Capt.' ,'Rt.', 'Maj.-Gen.', 'Rear Admrl.', 'Esq.', 'Mx', 'Adv', 'Jr.']
I need Python 2.7 code that inserts a newline \n after every full stop \. unless the full stop is part of one of the above titles.
Splitting it into a list of strings would be fine as well.
Sample Input:
Modi is waiting in line to Thank Dr. Manmohan Singh for preparing a road map for introduction of GST in India. The bill is set to pass.
Sample Output:
Modi is waiting in line to Thank Dr. Manmohan Singh for preparing a road map for introduction of GST in India.
The bill is set to pass.

This should do the trick. We loop over the words and append a \n to any word that contains a full stop and is not in the set of keywords; otherwise we append a space.
Finally the pieces are joined using join(), and we use rstrip() to eliminate any newline remaining at the end of the string.
l = set(['Mr.', 'Mrs.', 'Ms.', 'Dr.', 'Prof.', 'Rev.', 'Capt.', 'Lt.-Col.',
         'Col.', 'Lt.-Cmdr.', 'The Hon.', 'Cmdr.', 'Flt. Lt.', 'Brgdr.', 'Wng. Cmdr.',
         'Group Capt.', 'Rt.', 'Maj.-Gen.', 'Rear Admrl.', 'Esq.', 'Mx', 'Adv', 'Jr.'])
s = 'Modi is waiting in line to Thank Dr. Manmohan Singh for preparing a road map for introduction of GST in India. The bill is set to pass.'
def split_at_period(input_str, keywords):
    final = []
    split_l = input_str.split(' ')
    for word in split_l:
        if '.' in word and word not in keywords:
            final.append(word + '\n')
            continue
        final.append(word + ' ')
    return ''.join(final).rstrip()
print(split_at_period(s, l))
or a one liner :D
print(''.join([w + '\n' if '.' in w and w not in l else w + ' ' for w in s.split(' ')]).rstrip())
Sample output:
Modi is waiting in line to Thank Dr. Manmohan Singh for preparing a road map for introduction of GST in India.
The bill is set to pass.
How it works?
Firstly we split up our string with a space ' ' delimiter using the split() string function, thus returning the following list:
>>> ['Modi', 'is', 'waiting', 'in', 'line', 'to', 'Thank', 'Dr.',
'Manmohan', 'Singh', 'for', 'preparing', 'a', 'road', 'map', 'for',
'introduction', 'of', 'GST', 'in', 'India.', 'The', 'bill', 'is',
'set', 'to', 'pass.']
We then start to build up a new list by iterating through the split-up list. If we see a word that contains a period, but is not a keyword, (Ex: India. and pass. in this case) then we have to concatenate a newline \n to the word to begin the new sentence. We can then append() to our final list, and continue out of the current iteration.
If the word does not end off a sentence with a period, we can just concatenate a space to rebuild the original string.
This is what final looks like before it is built as a string using join().
>>> ['Modi ', 'is ', 'waiting ', 'in ', 'line ', 'to ', 'Thank ', 'Dr. ',
'Manmohan ', 'Singh ', 'for ', 'preparing ', 'a ', 'road ', 'map ',
'for ', 'introduction ', 'of ', 'GST ', 'in ', 'India.\n', 'The ', 'bill ',
'is ', 'set ', 'to ', 'pass.\n']
Excellent, we have spaces and newlines where they need to be! Now we can rebuild the string. Notice, however, that the last element in the list also happens to contain a \n; we can clean that up by calling rstrip() on our new string.
The initial solution did not support spaces in the keywords; I've included a new, more robust solution below:
import re
def format_string(input_string, keywords):
    # Combine all keywords into one regex, escaping the literal dots.
    regexes = '|'.join(re.escape(k) for k in keywords)
    split_list = re.split(regexes, input_string)  # Split on keys.
    removed = re.findall(regexes, input_string)  # Find removed keys.
    newly_joined = split_list + removed  # Interleave removed and split.
    newly_joined[::2] = split_list
    newly_joined[1::2] = removed
    space_regex = r'\.\s*'
    for index, section in enumerate(newly_joined):
        if '.' in section and section not in removed:
            newly_joined[index] = re.sub(space_regex, '.\n', section)
    return ''.join(newly_joined).strip()

Convert all titles (and the sole dot) into a regular expression.
Use a replacement callback.
code:
import re
l = "|".join(map(re.escape,['.','Mr.', 'Mrs.', 'Ms.', 'Dr.', 'Prof.', 'Rev.', 'Capt.', 'Lt.-Col.', 'Col.', 'Lt.-Cmdr.', 'The Hon.', 'Cmdr.', 'Flt. Lt.', 'Brgdr.', 'Wng. Cmdr.', 'Group Capt.' ,'Rt.', 'Maj.-Gen.', 'Rear Admrl.', 'Esq.', 'Mx', 'Adv', 'Jr.']))
e = "Dear Mr. Foo, I would like to thank you. Because Lt.-Col. Collins told me blah blah. Bye."
def do_repl(m):
    s = m.group(1)
    if s == ".":
        rval = ".\n"
    else:
        rval = s
    return rval
z = re.sub("(" + l + ")", do_repl, e)
# bonus: leading blanks are stripped, even though that's not part of the question
# (the fourth positional argument of re.sub is count, so no flag is passed here)
z = re.sub(r"\s*\n\s*", "\n", z)
print(z)
output:
Dear Mr. Foo, I would like to thank you.
Because Lt.-Col. Collins told me blah blah.
Bye.

Matching partial cases to Python dictionary

At first glance I thought this to be a simple issue yet I cannot find an answer that accurately fits...
I have a dictionary of state names and abbreviations like so:
{(' ak', ',ak', ', ak', 'juneau', ',alaska', ', alaska'): 'alaska',
(' al', ',al', ', al', 'montgomery', ',alabama', ', alabama'): 'alabama',
(' ar', ',ar', ', ar', 'little rock', ',arkansas', ', arkansas'): 'arkansas',
(' az', ',az', ', az', 'phoenix', ',arizona', ', arizona'): 'arizona',
I am attempting to map this dictionary over various cases of self reported Twitter location that I have in a pandas dataframe to look for partial matches. For example, if one case read 'anchorage,ak' it would change the value to Alaska. I could see this being quite simple if it were a list, yet there must be another way to do this without looping. Any help is greatly appreciated!

I think timgeb has the right idea above. I would add two things:
1) You can also remove all whitespace from the given case before processing, so there is no need to include ' ak', ',ak', and ', ak' all as keys; a simple 'ak' key would suffice.
2) Instead of repeating the state values in the dictionary, I would create an extra hash from integers to states, i.e. {0: 'alaska', 1: 'alabama', ...}, and store the corresponding integer key in your original dictionary.
Thus your resulting dictionary should look something like this:
A = {'ak': 0, 'juneau': 0, 'alaska': 0, 'al': 1, 'montgomery': 1, 'alabama': 1, ...}
And to access state names from integer values, you should have another dictionary like this for all 50 states:
B = {0: 'alaska', 1: 'alabama', ...}
so given a case...
case = 'anchorage,ak'
case_list = case.replace(' ', '').split(',')  # remove all whitespace and split case by comma
for elem in case_list:
    if elem in A:
        case = B[A[elem]]  # replace case with the matching state name
        break
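The lookup above can be wrapped in a small function. The sketch below is only one way to do it: it maps keys directly to state names instead of going through integers, strips whitespace around each comma-separated part rather than removing all of it (so a key like 'little rock' would still match), and uses a made-up table with a handful of entries. The names `STATE_LOOKUP` and `normalize_location` are hypothetical.

```python
# Flat lookup table; only a few hypothetical entries are shown.
STATE_LOOKUP = {
    "ak": "alaska", "juneau": "alaska", "alaska": "alaska",
    "al": "alabama", "montgomery": "alabama", "alabama": "alabama",
}

def normalize_location(location):
    """Return the state name for a free-text location, or the input unchanged."""
    for part in location.lower().split(","):
        state = STATE_LOOKUP.get(part.strip())
        if state is not None:
            return state
    return location

print(normalize_location("anchorage,ak"))
print(normalize_location("Montgomery, AL"))
print(normalize_location("paris"))
```

A function like this should be usable with pandas through `Series.map`, which keeps the explicit loop out of the calling code.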
