How to turn a file into a list - python

I am currently trying to turn a text file of numbers into a list.
The text file is:
1.89
1.99
2.14
2.51
5.03
3.81
1.97
2.31
2.91
3.97
2.68
2.44
Right now I only know how to read the file. How can I make this into a list?
Afterwards, how can I assign the values in the list to names?
For example:
jan = 1.89
feb = 1.99
etc
Code from comments:
inFile = open('program9.txt', 'r')
lineRead = inFile.readline()
while lineRead != '':
    words = lineRead.split()
    annualRainfall = float(words[0])
    print(format(annualRainfall, '.2f'))
    lineRead = inFile.readline()
inFile.close()

months = ('jan', 'feb', ...)
with open('filename') as f:
    my_list = [float(x) for x in f]
res = dict(zip(months, my_list))
Note that zip stops at the shorter of its inputs, so this gives a complete mapping ONLY if the file has the same number of lines as there are months!
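Since a mismatch between the number of months and the number of lines would otherwise go unnoticed, an explicit length check can help. This is only a minimal sketch: the helper name read_monthly is made up for illustration, and an in-memory io.StringIO stands in for the real file.

```python
import io

MONTHS = ('jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec')

def read_monthly(lines, months=MONTHS):
    """Map month names to floats; raise if the counts differ.

    read_monthly is a hypothetical helper, not part of any library.
    """
    # Skip blank lines so a trailing empty line does not break float().
    values = [float(line) for line in lines if line.strip()]
    if len(values) != len(months):
        raise ValueError(
            f"expected {len(months)} values, got {len(values)}")
    return dict(zip(months, values))

# Stand-in for open('program9.txt'); the numbers are from the question.
sample = io.StringIO(
    "1.89\n1.99\n2.14\n2.51\n5.03\n3.81\n"
    "1.97\n2.31\n2.91\n3.97\n2.68\n2.44\n")
rainfall = read_monthly(sample)
print(rainfall['jan'], rainfall['dec'])
```

With a real file, pass the open file object instead of the StringIO.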

A file is already an iterable of lines, so you don't have to do anything to make it into an iterable of lines.
If you want to make it specifically into a list of lines, you can do the same thing as with any other iterable: call list on it:
with open(filename) as f:
    lines = list(f)
But if you want to convert this into a list of floats, it doesn't matter what kind of iterable you start with, so you might as well just use the file as-is:
with open(filename) as f:
    floats = [float(line) for line in f]
(Note that float ignores trailing whitespace, so it doesn't matter whether you use a method that strips off the newlines or leaves them in place.)
From a comment:
now i just need to find out how to assign the list to another list like jan = 1.89, feb = 1.99 and so on
If you know you have exactly 12 values (and it will be an error if you don't), you can write whichever of these looks nicer to you:
jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec = (float(line) for line in f)
jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec = map(float, f)
However, it's often a bad idea to have 12 separate variables like this. Without knowing how you're planning to use them, it's hard to say for sure (see Keep data out of your variable names for some relevant background on making the decision yourself), but it might be better to have, say, a single dictionary, using the month names as keys:
floats = dict(zip('jan feb mar apr may jun jul aug sep oct nov dec'.split(),
                  map(float, f)))
Or to just leave the values in a list, and use the month names as just symbolic names for indices into that list:
>>> jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec = range(12)
>>> print(floats[mar])
2.14
That might be even nicer with an IntEnum, or maybe a namedtuple. Again, it really depends on how you plan to use the data after importing them.
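As a rough sketch of the IntEnum idea (the enum name Month is my own choice, and the values are the ones from the question), the month names become symbolic indices without creating 12 loose variables:

```python
from enum import IntEnum

# Functional API; start=0 so the members line up with list indices.
Month = IntEnum('Month',
                'jan feb mar apr may jun jul aug sep oct nov dec',
                start=0)

floats = [1.89, 1.99, 2.14, 2.51, 5.03, 3.81,
          1.97, 2.31, 2.91, 3.97, 2.68, 2.44]

# IntEnum members are ints, so they can index a list directly.
print(floats[Month.mar])

# Iterating the enum walks the months in definition order.
for m in Month:
    print(m.name, floats[m])
```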

Related

How to re-sort a list/text with Python?

My bot reads another bot's message, then temporarily saves that message, makes a few changes with .replace and then the bot is supposed to change the format of the entries it finds.
I have tried quite a few things, but have not figured it out.
The text looks like this:
06 6 452872995438985XXX
09 22 160462182344032XXX
11 17 302885091519234XXX
And I want to get the following format:
6/06 452872995438985XXX
22/09 160462182344032XXX
17/11 302885091519234XXX
I have already tried the following things:
splitsprint = test.split(' ')  # test is in this case the string we use, e.g. the text shown above
for x in splitsprint:
    month, day, misc = x
    print(f"{day}/{month} {misc}")
---
newline = test.split('\n')
for line in newline:
    month, day, misc = line.split(' ')
    print(f"{day}/{month} {misc}")
But I always get a ValueError: too many values to unpack (expected 3) error or similar.
Does anyone here see my error?
I'm guessing it's because of the leading and trailing whitespace in the input. Use strip:
s = '''
06 6 452872995438985XXX
09 22 160462182344032XXX
11 17 302885091519234XXX
'''
lines = s.strip().split('\n')
tokens = [l.split(' ') for l in lines]
final = [f'{day}/{month} {misc}' for month, day, misc in tokens]
for f in final:
    print(f)
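A slightly more defensive variant, as a sketch only: calling split() with no argument splits on any run of whitespace, and lines that do not have exactly three fields are skipped rather than raising. The function name reformat is made up for illustration.

```python
def reformat(text):
    """Turn 'MM D ID' lines into 'D/MM ID' lines (hypothetical helper)."""
    out = []
    for line in text.splitlines():
        parts = line.split()      # splits on any run of whitespace
        if len(parts) != 3:
            continue              # ignore blank or malformed lines
        month, day, misc = parts
        out.append(f"{day}/{month} {misc}")
    return out

test = """
06 6 452872995438985XXX
09 22 160462182344032XXX
11 17 302885091519234XXX
"""
for entry in reformat(test):
    print(entry)
```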

python's json: AttributeError: 'str' object has no attribute 'keys'

I am trying to load a string formatted as a dictionary (the actual program reads this line from a file, and it is a very large file that I cannot modify manually).
I need to convert the string into a JSON object so I can check the value of a specific key, e.g. myJson['Date'].
This is the script:
import json
mystring = "{'Date': 'Fri, 19 Apr 2019 03:58:04 GMT', 'Server': 'Apache/2.4.39', 'Accept-Ranges': 'bytes'}"
mystring = json.dumps(mystring)
myJson = json.loads(mystring)
print(str(myJson.keys()))
print(str(myJson))
I am getting this error:
AttributeError: 'str' object has no attribute 'keys'
I suspect that the mystring format is not conforming and that the single quotes should be double quotes. I have a lot of data, and I cannot simply replace single quotes with double quotes using search/replace, because single quotes may also appear inside the values, which I should not modify. If this is the cause of the problem, is there any way to replace only the quotes around the keys and values without touching the quotes inside the values? I am hoping that this is not the problem.
Rather than dealing with the single-quoted string and struggling to convert it into JSON, just use the ast module to convert it into a valid dict.
import ast
mystring = "{'Date': 'Fri, 19 Apr 2019 03:58:04 GMT', 'Server': 'Apache/2.4.39', 'Accept-Ranges': 'bytes'}"
my_dict = ast.literal_eval(mystring)
The result is:
>>> print(my_dict["Date"])
Fri, 19 Apr 2019 03:58:04 GMT
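To see why the original script raises the AttributeError, here is a small sketch (using a shortened copy of the string from the question): json.dumps on a str produces a JSON string literal, so json.loads simply hands the same str back.

```python
import ast
import json

mystring = "{'Date': 'Fri, 19 Apr 2019 03:58:04 GMT', 'Server': 'Apache/2.4.39'}"

# dumps() on a str produces a JSON *string* literal, not a JSON object ...
encoded = json.dumps(mystring)
# ... so loads() returns the very same str, which has no .keys().
round_trip = json.loads(encoded)
print(type(round_trip).__name__)

# literal_eval parses Python-literal syntax and returns a real dict.
my_dict = ast.literal_eval(mystring)
print(type(my_dict).__name__, sorted(my_dict))
```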
This code stores the string as a dictionary in a variable called "tempvar".
From that variable you can just use the keys like a regular dictionary.
import json
mystring = "{'Date': 'Fri, 19 Apr 2019 03:58:04 GMT', 'Server': 'Apache/2.4.39', 'Accept-Ranges': 'bytes'}"
exec("tempvar = " + mystring)
mystring = json.dumps(mystring)
myJson = json.loads(mystring)
print(str(tempvar['Date']))
print(str(myJson))
Hope this helps
Yes, the JSON decoder requires double quotes around keys and string values. You can use Python to do the replacement; try this and see whether it works for your data:
mystring = "{'Date': 'Fri, 19 Apr 2019 03:58:04 GMT', 'Server': 'Apache/2.4.39', 'Accept-Ranges': 'bytes'}"
json_string = mystring.replace("'", "\"")
d = json.loads(json_string)
dstring = json.dumps(d)
myJson = json.loads(dstring)
print(str(myJson.keys()))
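The concern in the question about quotes inside values is real for the replace approach. The sketch below uses a made-up header value containing an apostrophe (it is not from the question's data) to show the blanket replacement producing invalid JSON, while ast.literal_eval still parses the line.

```python
import ast
import json

# Hypothetical line: the Server value contains an apostrophe.
line = '''{'Server': "Bob's Apache", 'Accept-Ranges': 'bytes'}'''

try:
    json.loads(line.replace("'", '"'))
    replace_worked = True
except json.JSONDecodeError:
    # The apostrophe became a stray double quote inside the value.
    replace_worked = False

parsed = ast.literal_eval(line)
print(replace_worked, parsed['Server'])
```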

Python: how to extract string from file - only once

I have the below output from router stored in a file
-#- --length-- -----date/time------ path
3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
4 1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config
5 1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg
I want to search for substring 'Golden_Image' from the file & get the complete path. So here, the required output would be this string:
taas/NN41_R11_Golden_Image
First attempt:
import re
with open("outlog.out") as f:
    for line in f:
        if "Golden_Image" in line:
            print(line)
Output:
3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
Second attempt
import re
hand = open('outlog.out')
for line in hand:
    line = line.rstrip()
    x = re.findall('.*?Golden_Image.*?',line)
    if len(x) > 0:
        print x
Output:
['3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image']
Neither of these gives the required output. How can I fix this?
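This is actually surprisingly fiddly to do if the path can contain spaces.
You need to use the maxsplit argument to split so that everything after the first seven fields is kept together as the path field.
with open("outlog.out") as f:
    for line in f:
        fields = line.split(None, 7)
        # The header line has fewer than 8 fields and is skipped.
        if len(fields) == 8 and "Golden_Image" in fields[7]:
            print(fields[7].rstrip())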
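To illustrate what maxsplit buys you, here is a sketch with a made-up listing line whose path contains spaces (the helper name path_field and that sample line are assumptions for illustration only):

```python
def path_field(line, n_leading=7):
    """Return the path column of a listing line, or None for other lines.

    Hypothetical helper: n_leading is the number of whitespace-separated
    fields before the path (index, length, and five date/time fields).
    """
    fields = line.split(None, n_leading)
    if len(fields) <= n_leading:
        return None               # header or short line: no path column
    # The remainder keeps its trailing newline, so strip it.
    return fields[n_leading].rstrip()

# Made-up line: same layout as the question, but with spaces in the path.
row = "3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41 R11 Golden_Image\n"
header = "-#- --length-- -----date/time------ path\n"

print(path_field(row))
print(path_field(header))
```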
Split the line and check whether the "Golden_Image" string exists in any of the parts, or pull the matching token out with a regex:
import re
with open("outlog.out") as f:
    for line in f:
        if "Golden_Image" not in line:
            continue
        print(re.search(r'\S*Golden_Image\S*', line).group())
or
images = re.findall(r'\S*Golden_Image\S*', open("outlog.out").read())
Example:
>>> s = '''
-#- --length-- -----date/time------ path
3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
4 1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config
5 1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg'''.splitlines()
>>> for line in s:
...     for i in line.split():
...         if "Golden_Image" in i:
...             print(i)
...
taas/NN41_R11_Golden_Image
>>>
Reading the full content at once and then searching it is not efficient for a large file. Instead, the file can be read line by line, and when a line matches the criteria the path can be extracted with a regex, without any further splitting.
Use the following regex to get the path:
\s+(?=\S*$).*
Link: https://regex101.com/r/zuH0Zv/1
Here is working code:
import re
regex = r"\s+(?=\S*$).*"
test_str = "3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image"
matches = re.search(regex, test_str)
print(matches.group().strip())
Following your code, if you just want the right output, it can be done more simply:
with open("outlog.out") as f:
    for line in f:
        if "Golden_Image" in line:
            # split() with no argument also drops the trailing newline
            print(line.split()[-1])
The output is:
taas/NN41_R11_Golden_Image
PS: if you need more complex operations, you may want to try the re module, as in @Avinash Raj's answer.
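Since the title asks for the string "only once", one option is to return as soon as the first match is found. A minimal sketch, assuming paths contain no spaces; the function name first_path is made up, and io.StringIO stands in for outlog.out:

```python
import io

def first_path(lines, needle="Golden_Image"):
    """Return the last field of the first line containing needle, else None."""
    for line in lines:
        fields = line.split()
        if fields and needle in fields[-1]:
            return fields[-1]     # stop at the first hit
    return None

# Stand-in for open("outlog.out"); the contents are from the question.
listing = io.StringIO(
    "-#- --length-- -----date/time------ path\n"
    "3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image\n"
    "4 1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config\n"
    "5 1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg\n")

result = first_path(listing)
print(result)
```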

Python print both the matching groups in regex

I want to find two fixed patterns in a log file. Here is what a line in the log file looks like:
passed dangerb.xavier64.423181.k000.drmanhattan_resources.log Aug 23 04:19:37 84526 362
From this log, I want to extract drmanhattan and 362 which is a number just before the line ends.
Here is what I have tried so far.
import sys
import re
with open("Xavier.txt") as f:
    for line in f:
        match1 = re.search(r'((\w+_\w+)|(\d+$))',line)
        if match1:
            print match1.groups()
However, every time I run this script, I only get drmanhattan as output and not drmanhattan 362.
Is it because of the | sign?
How do I tell the regex to capture both this group and that group?
I have already consulted this and this link; however, they did not solve my problem.
import re
line = 'Passed dangerb.xavier64.423181.r000.drmanhattan_resources.log Aug 23 04:19:37 84526 362'
match1 = re.search(r'(\w+_\w+).*?(\d+$)', line)
if match1:
    print(match1.groups())
# ('drmanhattan_resources', '362')
If you have a test.txt file that contains the following lines:
Passed dangerb.xavier64.423181.r000.drmanhattan_resources.log Aug 23 04:19:37 84526 362
Passed dangerb.xavier64.423181.r000.drmanhattan_resources.log Aug 23 04:19:37 84526 363
Passed dangerb.xavier64.423181.r000.drmanhattan_resources.log Aug 23 04:19:37 84526 361
you can do:
import re
with open('test.txt', 'r') as fil:
    for line in fil:
        # \s*$ tolerates the newline at the end of each line
        match1 = re.search(r'(\w+_\w+).*?(\d+)\s*$', line)
        if match1:
            print(match1.groups())
# ('drmanhattan_resources', '362')
# ('drmanhattan_resources', '363')
# ('drmanhattan_resources', '361')
| means OR, so your regex matches (\w+_\w+) OR (\d+$), whichever comes first.
Maybe you want something like this:
((\w+_\w+).*?(\d+$))
With re.search you only get the first match, if any, and with | you tell re to look for either this or that pattern. As suggested in other answers, you could replace the | with .* to match "anything in between" those two patterns. Alternatively, you could use re.findall to get all matches:
>>> line = "passed dangerb.xavier64.423181.k000.drmanhattan_resources.log Aug 23 04:19:37 84526 362"
>>> re.findall(r'\w+_\w+|\d+$', line)
['drmanhattan_resources', '362']
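The question asks for drmanhattan on its own rather than drmanhattan_resources. One possible sketch, assuming the wanted name is always the alphabetic part before an underscore in a "name_something.log" token; the pattern and the group names are my own guess at the log format, not something confirmed by the question:

```python
import re

line = ("passed dangerb.xavier64.423181.k000.drmanhattan_resources.log "
        "Aug 23 04:19:37 84526 362")

# (?P<name>...) captures the letters between a dot and the underscore;
# (?P<count>...) captures the digits right before the end of the line.
pattern = re.compile(
    r'\.(?P<name>[A-Za-z]+)_\w+\.log\b.*?(?P<count>\d+)\s*$')

match = pattern.search(line)
if match:
    print(match.group('name'), match.group('count'))
```

Both values come back from a single search because the two groups are joined by .*? instead of being alternatives separated by |.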

To output 1 on 1 relationship, using Python 2.76

Using Python 2.76, I want output like:
jan is 01
feb is 02
mar is 03
so I wrote the following:
Dict = {'jan':'01', 'feb':'02', 'mar':'03'}
for month in Dict.keys():
    for num in Dict.values():
        print month + " is: " + num
However, the output is not what I wanted. Should I use a dictionary in this case, or am I heading in the wrong direction?
Thanks.
You can do this more straightforwardly using the iteritems() method on a dictionary:
for month, num in Dict.iteritems():
    print month, "is", num
What your code does is print every value for every key (the nested loops pair each of the three keys with each of the three values), leading to nine outputs.
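For readers on Python 3, where iteritems() no longer exists, the same idea can be sketched with items() (the variable name months is my own; the data is from the question):

```python
months = {'jan': '01', 'feb': '02', 'mar': '03'}

# items() yields each key together with its own value, one pair per entry.
lines = [f"{month} is {num}" for month, num in months.items()]
print("\n".join(lines))
```

On Python 3.7 and later, dictionaries keep insertion order, so the months print in the order they were written.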
