Creating two separate lists read from a file in Python

Can I please know how the data from a file can be split into two separate lists? For example, the file contains the data 1,2,3,4;5,6,7
My code:
for num in open('filename','r'):
    list1 = num.strip(';').split()
Here, I want one new list from before the semicolon, i.e. [1,2,3,4], and another new list from after the semicolon, i.e. [5,6,7].

If you are certain that your file contains exactly two lists, you can unpack a list comprehension (here data is the file's contents as a string):
l1, l2 = [sub.split(',') for sub in data.split(';')]
# l1 = ['1', '2', '3', '4']
# l2 = ['5', '6', '7']
More generally,
lists = [sub.split(',') for sub in data.split(';')]
# lists[0] = ['1', '2', '3', '4']
# lists[1] = ['5', '6', '7']
If integers are needed, you can use a second list comprehension:
lists = [[int(item) for item in sub.split(',')] for sub in data.split(';')]

To get the final lists you need to split on "," as well (and probably map() the result to int()):
with open("filename") as f:
    for line in f:
        list1, list2 = [x.split(",") for x in line.rstrip().split(";")]
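The int() conversion mentioned above could look like the following minimal sketch. The helper name read_int_lists and the temporary file are assumptions made for illustration only; the sample data is the one from the question.

```python
import os
import tempfile


def read_int_lists(path):
    """Return one list of ints per ';'-separated group in the file."""
    with open(path) as f:
        data = f.read().strip()  # drop the trailing newline
    # Split into groups on ';', then split each group on ',' and map to int.
    return [list(map(int, group.split(","))) for group in data.split(";")]


# Demo: write the sample data from the question to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1,2,3,4;5,6,7\n")

list1, list2 = read_int_lists(tmp.name)
os.remove(tmp.name)  # clean up the temporary file
print(list1, list2)  # [1, 2, 3, 4] [5, 6, 7]
```

The two-name unpacking only works when the file holds exactly two groups; with any other count it raises ValueError, so keep the full list of lists if the count can vary.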

Depending on the size of your file, you could simply read the whole file into a string at once and then split first by semicolon, then by comma:
with open('filename', 'r') as f:  # open file
    s = f.read()  # read entire contents into string
lists = [[int(x) for x in l.split(',')]  # separate each part by commas and convert to integers
         for l in s.split(';')]  # split separate lists by semicolon delimiters

Related

How to compare each element of multiple lists and return name of lists which are different

I have multiple lists, and I need to compare each list with the others and return the names of the lists that are different. The comparison should consider the values of the elements in each list, irrespective of their position.
For example:
Lis1=['1','2','3']
Lis2=['1','2']
Lis3=['0','1','3']
Lis4=[]
Lis5=['1','2']
Output:
['Lis1','Lis2','Lis3','Lis4']
Thanks in advance.

Try this:
input_lists = {"Lis1": ['1', '2', '3'], "Lis2": ['1', '2'],
               "Lis3": ['0', '1', '3'], "Lis4": [], "Lis5": ['1', '2']}
output_lists = {}
for k, v in input_lists.items():
    if sorted(v) not in output_lists.values():
        output_lists[k] = sorted(v)
unique_keys = list(output_lists.keys())
print(unique_keys)  # ['Lis1', 'Lis2', 'Lis3', 'Lis4']

import itertools
Lis1=['1','2','3']
Lis2=['1','2']
Lis3=['0','1','3']
Lis4=[]
Lis5=['1','2']
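lists = [Lis1, Lis2, Lis3, Lis4, Lis5]
lists.sort()  # groupby only merges adjacent duplicates, so sort first
print([key for key, _ in itertools.groupby(lists)])
Output (the distinct lists themselves, not their names):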
[[], ['0', '1', '3'], ['1', '2'], ['1', '2', '3']]

A simple way to implement this:
Lis1=['1','2','3']
Lis2=['1','2']
Lis3=['0','1','3']
Lis4=[]
Lis5=['1','2']
lis=[Lis1,Lis2,Lis3,Lis4,Lis5]
final=[]
for ele in lis:
    if ele not in final:
        final.append(ele)
print(final)

With your given data you can use:
Lis1=['1','2','3']
Lis2=['1','2']
Lis3=['0','1','3']
Lis4=[]
Lis5=['1','2']
name_lis = {'Lis1': Lis1, 'Lis2': Lis2, 'Lis3': Lis3, 'Lis4': Lis4, 'Lis5': Lis5}
tmp = set()
response = []
for k, v in name_lis.items():
    s = ''.join(sorted(v))
    if s not in tmp:
        tmp.add(s)
        response.append(k)
print(response)
Output:
['Lis1', 'Lis2', 'Lis3', 'Lis4']
The name_lis dictionary maps the name of each list to the actual list. The loop iterates over every list, sorts its elements and joins them into a single string. If that string was encountered before, the list is a duplicate; if not, its name is added to the response.
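One caveat with a joined-string key is that different lists can collapse to the same string, for example ['1', '2'] and ['12']. The sketch below is a variation, not the answer's own code: it uses a tuple of the sorted elements as the key instead. The function name distinct_list_names is hypothetical, and the result order relies on dictionaries preserving insertion order (Python 3.7+).

```python
def distinct_list_names(named_lists):
    """Return names of lists whose contents (ignoring order) were not seen before."""
    seen = set()
    names = []
    for name, values in named_lists.items():
        key = tuple(sorted(values))  # hashable and order-independent
        if key not in seen:
            seen.add(key)
            names.append(name)
    return names


named = {
    "Lis1": ['1', '2', '3'],
    "Lis2": ['1', '2'],
    "Lis3": ['0', '1', '3'],
    "Lis4": [],
    "Lis5": ['1', '2'],
}
result = distinct_list_names(named)
print(result)  # ['Lis1', 'Lis2', 'Lis3', 'Lis4']

# A joined string would treat these two as equal; the tuple key does not.
tricky = distinct_list_names({"A": ['1', '2'], "B": ['12']})
print(tricky)  # ['A', 'B']
```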

Remove wildcard string from list

I have a list which is a large recurring dataset with headers of the form:
array = ['header = 1','0','1','2',...,'header = 1','1','2','3',...,'header = 2','1','2','3']
The header string can vary between the individual datasets, but the size of the individual datasets does not.
I would like to remove all of the headers so that I am left with:
array = ['0','1','2',...,'1','2','3',...,'1','2','3']
If the header string does not vary, then I can remove them with:
lookup = array[0]
while True:
    try:
        array.remove(lookup)
    except ValueError:
        break
However, if the header strings do change, then they are not caught, and I am left with:
array = ['0','1','2',...,'1','2','3',...,'header = 2','1','2','3']
Is there a way in which the sub-string "header" can be removed, regardless of what else is in the string?

Best use a list comprehension with a condition instead of repeatedly removing elements. Also, use startswith instead of using a fixed lookup to compare to.
>>> array = ['header = 1','0','1','2','header = 1','1','2','3','header = 2','1','2','3']
>>> [x for x in array if not x.startswith("header")]
['0', '1', '2', '1', '2', '3', '1', '2', '3']
Note that this does not modify the existing list but creates a new one. It should still be considerably faster, as each single remove has O(n) complexity.
If you do not know what the header string is, you can still determine it from the first element:
>>> lookup = array[0].split()[0] # use first part before space
>>> [x for x in array if not x.startswith(lookup)]
['0', '1', '2', '1', '2', '3', '1', '2', '3']
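If other names refer to the same list and the change has to be visible through them, the filtered result can be written back with slice assignment. This is a small sketch of that option; the alias variable exists only to demonstrate the effect.

```python
array = ['header = 1', '0', '1', '2', 'header = 1', '1', '2', '3',
         'header = 2', '1', '2', '3']
alias = array  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list object
# instead of binding the name to a new list.
array[:] = [x for x in array if not x.startswith("header")]

print(alias)           # ['0', '1', '2', '1', '2', '3', '1', '2', '3']
print(alias is array)  # True
```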

Using the str.find() method you can determine whether or not the word "header" is contained in a list item (find() returns -1 when it is not) and use that to decide whether or not to remove the item.
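A minimal sketch of the find() idea, applied to every item rather than only the first one. The shortened sample list is an assumption for the demo.

```python
array = ['header = 1', '0', '1', '2', 'header = 2', '1', '2', '3']

# str.find returns the index of the substring, or -1 when it is absent,
# so keeping the items where it returns -1 drops every header line.
cleaned = [x for x in array if x.find("header") == -1]

print(cleaned)  # ['0', '1', '2', '1', '2', '3']
```

When only a yes/no answer is needed, the expression "header" not in x performs the same test and reads more naturally.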

max method of a list consisting of strings

I am wondering why the result is '4' if I write the following code:
lists = ['1','2','3','4']
print(max(lists))
lists.append(5)
print(max(lists))
I suppose that the max method of lists converts from str to int first and then gives me the max of ints in the first couple of lines, but this seems untrue if I try the next lines. Can anyone explain this?

Your list contains strings and you are appending an integer.
lists = ['1', '2', '3', '4', 5]
In Python 3, max(lists) then raises:
TypeError: '>' not supported between instances of 'str' and 'int'
If you had only strings or only ints, max would do the comparison, as the '>' operator would work. You need to convert the list to all strings or all ints.
>>> lists = [int(x) for x in lists]  # by list comprehension
>>> max(lists)  # lists = [1, 2, 3, 4, 5]
5
>>> lists = [str(x) for x in lists]
>>> max(lists)  # lists = ['1', '2', '3', '4', '5']
'5'
If you have not yet seen list comprehensions, the first one does the same as the following loop, but in a single line of code:
new_list = []
for x in lists:
    x = int(x)  # convert each individual item to an integer object
    new_list.append(x)
lists = new_list

Actually, in Python 2 it doesn't give you a TypeError; it returns '4' in both of the cases below:
list = ['1', '2', '3', '4']
list = ['1', '2', '3', '4', 5]
This is because when Python 2 compares objects of different types, a str always compares greater than an int. That's why you get '4' as the max value: it is the highest value among the strings.
Here is an example (Python 2) to show this:
>>> print '1' > 5
True
>>> print '1' > '5'
False

You can try this:
lists = ['1','2','3','4']
print(max(list(map(int, lists))))
lists.append(5)
print(max(list(map(int, lists))))
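Another option, shown here as a sketch rather than as part of the answers above, is to pass key=int to max. The elements are then compared by their integer value but returned unchanged, which also avoids the lexicographic ordering of strings.

```python
lists = ['1', '2', '3', '4']
lists.append(5)

# key=int compares by numeric value; the original element is returned.
largest = max(lists, key=int)
print(largest)  # 5

# Plain string comparison is lexicographic, so '2' sorts after '10'.
digits = ['1', '2', '10']
print(max(digits))           # 2
print(max(digits, key=int))  # 10
```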

How to remove specific strings from a list

From the following list, how can I remove the elements ending with Text?
My list is a=['1,2,3,4,5Text,6Text']
My expected result is a=['1,2,3,4']
Should I use endswith to go about this problem?

Split on commas, then filter on strings that are only digits:
a = [','.join(v for v in a[0].split(',') if v.isdigit())]
Demo:
>>> a=['1,2,3,4,5Text,6Text']
>>> [','.join(v for v in a[0].split(',') if v.isdigit())]
['1,2,3,4']
It looks as if you really wanted to work with lists of more than one element though, at which point you could just filter:
a = ['1', '2', '3', '4', '5Text', '6Text']
a = filter(str.isdigit, a)  # a list in Python 2, a lazy filter object in Python 3
or, using a list comprehension (which gives a list in Python 3 too):
a = ['1', '2', '3', '4', '5Text', '6Text']
a = [v for v in a if v.isdigit()]

Use str.endswith to filter out such items:
>>> a = ['1,2,3,4,5Text,6Text']
>>> [','.join(x for x in a[0].split(',') if not x.endswith('Text'))]
['1,2,3,4']
Here str.split splits the string at ',' and returns a list:
>>> a[0].split(',')
['1', '2', '3', '4', '5Text', '6Text']
Now filter out items from this list and then join them back using str.join.

Try this. It works whatever text follows the numbers, because it keeps only the items that convert to int:
a=['1,2,3,4,5Text,6Text']
a = a[0].split(',')
li = []
for v in a:
    try:
        li.append(int(v))
    except ValueError:  # not a plain integer, e.g. '5Text'
        pass
print(li)

How do I remove hyphens from a nested list?

In the nested list:
x = [['0', '-', '3', '2'], ['-', '0', '-', '1', '3']]
how do I remove the hyphens?
x = x.replace("-", "")
gives me AttributeError: 'list' object has no attribute 'replace', and
print x.remove("-")
gives me ValueError: list.remove(x): x not in list.

x is a list of lists. replace() will substitute a pattern string for another within a string. What you want is to remove an item from a list. remove() will remove the first occurrence of an item. A simple approach:
for l in x:
    while "-" in l:
        l.remove("-")
For more advanced solutions, see the following: Remove all occurrences of a value from a Python list
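As one alternative to repeated remove() calls, a nested list comprehension builds new inner lists without the hyphens. This is a minimal sketch using the list from the question; the name cleaned is chosen only for the demo.

```python
x = [['0', '-', '3', '2'], ['-', '0', '-', '1', '3']]

# Rebuild each inner list, keeping every item that is not a hyphen.
cleaned = [[item for item in inner if item != "-"] for inner in x]

print(cleaned)  # [['0', '3', '2'], ['0', '1', '3']]
```

Unlike the loop above, this leaves the original x untouched and makes a single pass over each inner list.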
