What does this line mean in this context?
stuff = [i.split() for i in row]
import csv
with open('AB.csv', 'r') as ABfile:
    AB = csv.reader(ABfile, csv.excel)
    for row in AB:
        print(row)
        stuff = [i.split() for i in row]
        print(stuff)
This is the output:
['qqq', 'qqq', 'sd3 3ds', '12/12/2012']
[['qqq'], ['qqq'], ['sd3', '3ds'], ['12/12/2012']]
This is a list comprehension. It is building the same list as
stuff = []
for i in row:
    stuff.append(i.split())
It's just a convenient and pythonic way to build a list.
Called with no arguments, the split method splits a string into a list on runs of whitespace. Examples:
>>> 'qqq'.split()
['qqq']
>>> 'sd3 3ds'.split()
['sd3', '3ds']
For each element in row, split is called and the resulting list is added to stuff. That's why you end up with a list of lists for stuff.
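As a small sketch of the point above, the following reuses the row from the question to show that the comprehension and the explicit loop build the same list, and how split() with no argument differs from split(' '). The padded sample string is made up for illustration.

```python
row = ['qqq', 'qqq', 'sd3 3ds', '12/12/2012']

# Comprehension form, as in the question.
stuff = [field.split() for field in row]

# Equivalent explicit loop.
stuff_loop = []
for field in row:
    stuff_loop.append(field.split())

print(stuff == stuff_loop)  # True

# With no argument, split() collapses runs of whitespace and ignores
# leading/trailing whitespace; split(' ') treats every single space as a separator.
padded = '  sd3   3ds '
print(padded.split())     # ['sd3', '3ds']
print(padded.split(' '))  # ['', '', 'sd3', '', '', '3ds', '']
```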
Related
I tried a few workarounds but none of them worked, so here is my question: how can I split a value in a list on a separator and store the pieces back in the same list?
Here is my code:
result_list = ['48608541\ncsm_radar_main_dev-7319-userdevsigned\nLogd\nG2A0P3027145002X\nRadar\ncompleted 2022-10-25T10:43:01\nPASS: 12FAIL: 1SKIP: 1\n2:25:36']
I want to remove the '\n' characters and end up with something like this:
result_list = ['48608541', 'csm_radar_main_dev-7319-userdevsigned', 'Logd', 'G2A0P3027145002X', .....]
You need to split each string on \n, which gives you a list per string, and then flatten those lists. You can do both with a list comprehension:
>>> [x for item in result_list for x in item.split('\n') ]
# output:
['48608541', 'csm_radar_main_dev-7319-userdevsigned', 'Logd', 'G2A0P3027145002X', 'Radar', 'completed 2022-10-25T10:43:01', 'PASS: 12FAIL: 1SKIP: 1', '2:25:36']
This splits each element of your list at \n and assigns the flattened result back to the same name:
result_list = [item for i in result_list for item in i.split('\n') ]
Solution using regex (re):
import re
result_list = re.split('\n', result_list[0])
#output:
['48608541', 'csm_radar_main_dev-7319-userdevsigned', 'Logd', 'G2A0P3027145002X', 'Radar', 'completed 2022-10-25T10:43:01', 'PASS: 12FAIL: 1SKIP: 1', '2:25:36']
The split() method of the str object does this:
Return a list of the words in the string, using sep as the delimiter string.
>>> '1,2,3'.split(',')
['1', '2', '3']
so here we have the answer as follows:
string_object = "48608541\ncsm_radar_main_dev-7319-userdevsigned\nLogd\nG2A0P3027145002X\nRadar\ncompleted 2022-10-25T10:43:01\nPASS: 12FAIL: 1SKIP: 1\n2:25:36"
result_list = string_object.split(sep='\n')
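A related option, not mentioned in the answers above, is str.splitlines(). The sketch below uses a shortened, made-up sample with a trailing newline to show where the two methods differ: split('\n') leaves an empty string at the end, while splitlines() does not.

```python
# Shortened sample (assumption): a few fields plus a trailing newline.
string_object = "48608541\nLogd\nRadar\n2:25:36\n"

print(string_object.split('\n'))   # ['48608541', 'Logd', 'Radar', '2:25:36', '']
print(string_object.splitlines())  # ['48608541', 'Logd', 'Radar', '2:25:36']

# The same flattening comprehension as above, using splitlines().
result_list = [string_object]
flat = [line for item in result_list for line in item.splitlines()]
print(flat)  # ['48608541', 'Logd', 'Radar', '2:25:36']
```

If the strings never end with a newline, as in the question, both approaches give the same list.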
I have a list of strings with a few unclean entries and I want to replace the unclean entries with clean entries
list = ['created_DATE', 'column1(case', 'timestamp', 'location(case']
I want to get a list that is like this
cleanList = ['created_DATE', 'column1', 'timestamp', 'location']
I tried the following:
str_match = [s for s in list if "(case" in s]  # find the elements containing the substring
print(str_match)
new = []
for k in str_match:
    a = k.replace("(case", "")
    new.append(a)  # make a list of the words without the substring
print(new)
I am not sure how to put the entries from the new list back into the original list. Can someone please help?
Thank you
If you want to remove all occurrences of "(case" from your list's elements, then you could write it like this:
list = ['created_DATE', 'column1(case', 'timestamp', 'location(case']
clean = []
for n in list:
    clean.append(n.replace("(case", ""))
print(clean)
You can either create a new list clean, as suggested by @alani:
import re
myList = ['created_DATE', 'column1(case', 'timestamp', 'location(case']
clean = [re.sub(r"\(.*", "", s) for s in myList]  # raw string avoids the invalid "\(" escape warning
print(clean)
or iterate over the elements of myList and update them in place:
for i in range(len(myList)):
    if "(case" in myList[i]:
        myList[i] = myList[i].replace("(case", "")
print(myList)
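Another way to update the original list object, sketched here as an alternative to the index loop, is slice assignment. It combines the comprehension with an in-place update, so any other name bound to the same list sees the cleaned values. The names my_list and alias are illustrative only.

```python
my_list = ['created_DATE', 'column1(case', 'timestamp', 'location(case']
alias = my_list  # a second reference to the same list object

# Slice assignment replaces the contents without creating a new list object.
my_list[:] = [s.replace("(case", "") for s in my_list]

print(my_list)           # ['created_DATE', 'column1', 'timestamp', 'location']
print(alias is my_list)  # True
print(alias)             # ['created_DATE', 'column1', 'timestamp', 'location']
```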
I'm trying to filter list1 based on another list, list2, with the following code:
import csv
with open('screen.csv') as f:  # a file with a list of all the article titles
    reader = csv.reader(f)
    list1 = list(reader)
print(list1)
list2 = ["Knowledge Management", "modeling language"] #key words that article title should have (at least one of them)
list2 = [str(x) for x in list2]
occur = [i for i in list1 for j in list2 if str(j) in i]
print(occur)
but the output is empty.
My list1 looks like this:
list_1 is actually a list of lists, not a list of strings, so you need to flatten it (for example with a nested list comprehension) before trying to compare elements:
list_1 = [['foo bar'], ['baz beep bop']]
list_2 = ['foo', 'bub']
flattened_list_1 = [
    element
    for sublist in list_1
    for element in sublist
]
occurrences = [
    phrase
    for phrase in flattened_list_1
    if any(word in phrase for word in list_2)
]
print(occurrences)
# output:
# ['foo bar']
import pandas as pd
import numpy as np
df = pd.DataFrame(data)
print(df[df.column_of_list.map(lambda x: np.isin(x, another_list).all())])
#OR
print(df[df[0].map(lambda x: np.isin(x, another_list).all())])
Try with real data:
import numpy as np
import pandas as pd
data = ["Knowledge Management", "modeling language"]
another_list=["modeling language","natural language"]
df = pd.DataFrame(data)
a = df[df[0].map(lambda x: np.isin(x, another_list).all())]
print(a)
Your list1 is a list of lists, because the csv.reader that you're using to create it always returns lists for each row, even if there's only a single item. (If you're expecting a single name from each row, I'm not sure why you're using csv here, it's only going to be a hindrance.)
Later when you check if str(j) in i as part of your filtering list comprehension, you're testing if the string j is present in the list i. Since the values in list2 are not full titles but key-phrases, you aren't going to find any matches. If you were checking in the inner strings, you'd get substring checks, but when you test list membership it must be an exact match.
Probably the best way to fix the problem is to do away with the nested lists in list1. Try creating it with:
with open('screen.csv') as f:
    list1 = [line.strip() for line in f]
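To make the difference between list membership and substring checks concrete, here is a minimal sketch. The article titles are invented for illustration; the keywords are the ones from the question.

```python
# Hypothetical rows, shaped the way csv.reader returns them (one list per row).
list1 = [
    ['Knowledge Management in Practice'],
    ['Cooking for Beginners'],
    ['A modeling language survey'],
]
list2 = ['Knowledge Management', 'modeling language']

row = list1[0]
print('Knowledge Management' in row)     # False: no list element equals the keyword
print('Knowledge Management' in row[0])  # True: substring check on the string itself

# Filter on the inner string; 'if row' skips empty rows from blank lines.
occur = [row[0] for row in list1 if row and any(key in row[0] for key in list2)]
print(occur)  # ['Knowledge Management in Practice', 'A modeling language survey']
```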
I have the below list of lists
[['Afghanistan,2.66171813,7.460143566,0.490880072,52.33952713,0.427010864,-0.106340349,0.261178523'], ['Albania,4.639548302,9.373718262,0.637698293,69.05165863,0.74961102,-0.035140377,0.457737535']]
I want to create a new list with only the country names.
So
[Afghanistan, Albania]
Currently using this code.
with open(fileName, "r") as f:
    _ = next(f)
    row_lst = f.read().split()
    countryLst = [[i] for i in row_lst]
Try this. It uses split(',') because the first element of each inner list is a single comma-separated string.
>>> lst = [['Afghanistan,2.66171813,7.460143566,0.490880072,52.33952713,0.427010864,-0.106340349,0.261178523'], ['Albania,4.639548302,9.373718262,0.637698293,69.05165863,0.74961102,-0.035140377,0.457737535']]
Output:
>>> [el[0].split(',')[0] for el in lst]
['Afghanistan', 'Albania']
Explanation:
# el[0] gives the first element of each inner list, which is a string.
# .split(',') returns a list of the fields after splitting on `,`
# [0] finally selects the first field, as required.
Edit-1:
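Using regex (note that this pattern matches every run of letters, so a name containing a space would yield two entries):
import re
pattern = r'([a-zA-Z]+)'
new_lst = []
for el in lst:
    new_lst += re.findall(pattern, el[0])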
>>> new_lst # output
['Afghanistan', 'Albania']
Looks like a CSV file. Use the csv module
Ex:
import csv
with open(fileName, "r") as f:
    reader = csv.reader(f)
    next(reader)  # skip header
    country = [row[0] for row in reader]
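The csv approach can be tried without a file on disk by feeding csv.reader an in-memory buffer. This is only a sketch: the header names and the shortened rows are assumptions, since the question does not show the real header.

```python
import csv
import io

# Hypothetical file contents: a header row plus two shortened data rows.
sample = io.StringIO(
    "country,score,gdp\n"
    "Afghanistan,2.66171813,7.460143566\n"
    "Albania,4.639548302,9.373718262\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row
countries = [row[0] for row in reader if row]  # 'if row' skips blank lines

print(countries)  # ['Afghanistan', 'Albania']
```

Unlike a plain split(','), csv.reader also copes with quoted fields that contain commas.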
I want to write a script that reads lines from a file, takes slices from each line, combines all slices from line 1 with all slices from line 2, and then combines all results of the previous step with the slices from line 3.
For example, we have
Stackoverflow (4)
python (3)
question (3)
I get a first list with slices of (number) letters:
lst = ['Stac', 'tack', 'acko', 'ckov', 'kove', 'over', 'verf', 'erfl', 'rflo', 'flow']
Then I need to combine it with the second list:
lst = ['pyt', 'yth', 'tho', 'hon']
Desired output:
finallist = ['Stacpyt', 'tackpyt', 'ackopyt', 'ckovpyt', 'kovepyt', 'overpyt', 'verfpyt', 'erflpyt', 'rflopyt', 'flowpyt', 'Stacyth', 'tackyth', 'ackoyth', 'ckovyth', 'koveyth', 'overyth', 'verfyth', 'erflyth', 'rfloyth', 'flowyth', ..... , 'erflhon', 'rflohon', 'flowhon']
then with 3rd list:
lst = ['que', 'ues', 'est', 'sti', 'tio', 'ion']
finallist = ['Stacpytque', 'tackpytque', 'ackopytque', 'ckovpytque', 'kovepytque', 'overpytque', 'verfpytque', 'erflpytque', 'rflopytque', .... 'erflhonion', 'rflohonion', 'flowhonion']
I am stuck at the point where I need to build finallist from the combined results.
I am trying pieces of code like this, but it's wrong:
for i in lst:
    for y in finallist:
        finallist.append(i + y)
So if finallist is empty - it should copy lst in first loop iteration, and if finallist is not empty it should combine each element with lst and so on.
I used re.match() in order to get the word and the integer value from your file.
Then, I compute all the sliced subwords and add them to a list, which is then added to a global list.
Finally, I compute all the possibilities you are looking for thanks to itertools.product(), which behaves like a nested for-loop.
Then, .join() the tuples obtained and you get the final list you wanted.
from itertools import product
from re import match
the_lists = []
with open("filename.txt", "r") as file:
    for line in file:
        m = match(r'(.*) \((\d+)\)', line)
        word = m.group(1)
        num = int(m.group(2))
        the_list = [word[i:i+num] for i in range(len(word) - num + 1)]
        the_lists.append(the_list)
combinaisons = product(*the_lists)
final_list = ["".join(c) for c in combinaisons]
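The parsing and slicing step can be checked on its own, without a file. The helper below is a sketch: the name slices and the guard for lines that do not match the pattern are additions for illustration, not part of the answer above.

```python
from re import match

def slices(line):
    """Return every substring of length n from a line like 'python (3)'."""
    m = match(r'(.*) \((\d+)\)', line)
    if m is None:  # blank or malformed line (guard added for this sketch)
        return []
    word, num = m.group(1), int(m.group(2))
    # A window of width num starting at each index that still fits in the word.
    return [word[i:i + num] for i in range(len(word) - num + 1)]

print(slices("python (3)"))    # ['pyt', 'yth', 'tho', 'hon']
print(slices("question (3)"))  # ['que', 'ues', 'est', 'sti', 'tio', 'ion']
print(slices(""))              # []
```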
Use itertools:
import itertools
list1 = ['Stac', 'tack', 'acko', 'ckov', 'kove', 'over', 'verf', 'erfl', 'rflo', 'flow']
list2 = ['pyt', 'yth', 'tho', 'hon']
list3 = ['que', 'ues', 'est', 'sti', 'tio', 'ion']
final_list = list(itertools.product(list(itertools.product(list1,list2)),list3))
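This will give you all combinations as nested tuples of the form ((a, b), c); join the pieces of each one to get your string. Calling itertools.product(list1, list2, list3) directly gives flat tuples (a, b, c) instead.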
import itertools
def combine(lst):
    result = list(itertools.product(*lst))
    result = [''.join(item) for item in result]
    return result
list1 = ['Stac', 'tack', 'acko', 'ckov', 'kove', 'over', 'verf', 'erfl', 'rflo', 'flow']
list2 = ['pyt', 'yth', 'tho', 'hon']
list3 = ['que', 'ues', 'est', 'sti', 'tio', 'ion']
lst = [list1, list2, list3]  # append more lists to lst, then pass lst to combine()
print(combine(lst))
Append all of the candidate lists to lst, and the combine() function will generate every combination and return the result as a list.
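One detail worth noting: itertools.product varies the last list fastest, whereas the desired output in the question varies the first list fastest ('Stacpyt', 'tackpyt', ...). If that ordering matters, an incremental loop along the lines the asker described can be sketched as follows. The lists are shortened versions of the ones in the question.

```python
from itertools import product

lists = [
    ['Stac', 'tack'],  # shortened versions of the question's lists
    ['pyt', 'yth'],
    ['que'],
]

finallist = ['']  # one empty prefix, so the first pass just copies the first list
for lst in lists:
    # Outer loop over the new pieces, inner loop over what has been built so far.
    finallist = [prefix + piece for piece in lst for prefix in finallist]

print(finallist)
# ['Stacpytque', 'tackpytque', 'Stacythque', 'tackythque']

# Same elements as itertools.product, only in a different order.
via_product = [''.join(c) for c in product(*lists)]
print(sorted(finallist) == sorted(via_product))  # True
```

Rebuilding finallist on each pass, instead of appending to it while iterating over it, also avoids the never-ending loop in the asker's attempt.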