Get a string after specific word in line - python

I'm searching JIRA tickets that have a specific subject, and I put the results in a JSON file (whole file: https://1drv.ms/f/s!AizscpxS0QM4attoSBbMLkmKp1s).
I wrote some Python code to get each ticket's description:
#!/usr/bin/python
import sys
import json
if sys.version[0] == '2':
    reload(sys)
    sys.setdefaultencoding("utf-8")
sys.stdout = open('output.txt','wt')
datapath = sys.argv[1]
data = json.load(open(datapath))
for issue in data['issues']:
    if len(issue['fields']['subtasks']) == 0 or 'description' in issue['fields']:
        custom_field = issue['fields']['description']
        my_string=custom_field
        #print custom_field
        print my_string.split("name:",1)[1]
Some tickets have this value in the description:
"description": "name:some name\r\n\r\ncount:5\r\n\r\nregion:some region\r\n\r\n\u00a0",
I need to get the values after name, count and region for all tickets.
Desired output (for this example JSON file):
some name 5 some region
some name 5 some region
With the code above I get everything after name:
some name^M
^M
count:5^M
^M
region:some region
Also, how can I skip tickets that do not have these values in the description? In that case I get:
print custom_field.split("name",1)[2]
IndexError: list index out of range

This looks like a job for a regular expression:
>>> import re
>>> x = r"(\w+):(.+)\r\n\r"
>>> regexp = re.compile(x)
>>> s = "name:some name\r\n\r\ncount:5\r\n\r\nregion:some region\r\n\r\n\u00a0"
>>> regexp.findall(s)
[('name', 'some name'), ('count', '5'), ('region', 'some region')]
Or, if you want a dictionary back,
>>> dict(regexp.findall(s))
{'count': '5', 'region': 'some region', 'name': 'some name'}
You can drop the keys from the dict like this:
>>> mydict = dict(regexp.findall(s))
>>> mydict.values()
['5', 'some region', 'some name']
But be careful, because they may not be in the order you expect. To match your desired output:
>>> mydict = dict(regexp.findall(s))
>>> print("{name} {count} {region}".format(**mydict))
some name 5 some region
If you don't have the expected values, the findall() call will return an empty or incomplete list. In that case you must check the returned dict before printing it, otherwise the format() call will fail.
One way to ensure that the dict always has the expected values is to set it up beforehand with defaults.
>>> mydict = {'count': 'n/a', 'region': 'n/a', 'name': 'n/a'}
>>> mydict.update(dict(regexp.findall(s)))
Then the format() call will always work, even if one of the fields is missing from the data.
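Putting the pieces together, here is a minimal Python 3 sketch of the skip-or-print logic. The helper name extract_fields and the sample descriptions are made up for illustration, and it assumes a ticket should be skipped only when none of the three fields is present (and that a description may be missing entirely):

```python
import re

# Same pattern as in the answer above: "key:value" followed by a blank line.
FIELD_RE = re.compile(r"(\w+):(.+)\r\n\r")


def extract_fields(description, wanted=("name", "count", "region")):
    """Return the wanted values as a list, or None if none of them is present."""
    # Assumption: the description can be None, so fall back to an empty string.
    found = dict(FIELD_RE.findall(description or ""))
    if not any(key in found for key in wanted):
        return None  # nothing to report for this ticket
    # Missing fields get a placeholder instead of raising KeyError.
    return [found.get(key, "n/a") for key in wanted]


# Hypothetical sample data standing in for issue['fields']['description'].
descriptions = [
    "name:some name\r\n\r\ncount:5\r\n\r\nregion:some region\r\n\r\n\u00a0",
    "a ticket without any of the fields",
    None,
]

lines = []
for description in descriptions:
    values = extract_fields(description)
    if values is not None:
        lines.append(" ".join(values))

print("\n".join(lines))
```

With the sample data only the first description produces a line; the other two are skipped silently.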

You can use a try/except block. Note that split("name:",1) returns at most two items, so the value is at index 1, not 2:
try:
    print(custom_field.split("name:",1)[1])
except IndexError:
    print("Skipping ..")

Related

Trying to parse list in loops

I have a basic list as follows
data = "ffff,999,John Doe, Sam Adams"
mydata = data.split(',')
I want to check whether the 4th field is not null: if it has a value, set a variable to the 4th field; otherwise set the variable to the 3rd field.
I have the following code
if mydata[3] is not None:
    name = mydata[3]
elif mydata[2] is not None:
    name = mydata[2]
The first part works, but if I set data to
data = "ffff,999,John Doe,"
The code doesn't do anything. What am I doing wrong?
Thanks
Since you split on ',' and data has a trailing comma, the last item in lst is an empty string, and an empty string is not None:
>>> data = "ffff,999,John Doe,"
>>>
>>> data.split(',')
['ffff', '999', 'John Doe', '']
>>> lst = data.split(',')
>>>
>>> lst[3] is not None
True
This is how split() behaves; see the Python 2.7 documentation for str.split().
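One possible fix, sketched in Python 3, is to test the field for truthiness instead of comparing it with None, because an empty string is falsy. The function name pick_name is made up for this example, and it assumes the input always has at least three fields:

```python
def pick_name(data):
    """Return the 4th comma-separated field, or the 3rd if the 4th is blank."""
    fields = [field.strip() for field in data.split(",")]
    # An empty string is falsy, so a blank 4th field fails this test
    # and the 3rd field is used instead.
    if len(fields) > 3 and fields[3]:
        return fields[3]
    return fields[2]


print(pick_name("ffff,999,John Doe, Sam Adams"))  # Sam Adams
print(pick_name("ffff,999,John Doe,"))            # John Doe
```

The strip() call also treats a field that contains only spaces as blank, which may or may not be what you want.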

Python store line by line in List from Text File

I have a text file like the one below and I would like to process it in Python.
info.txt
firstname1
surname1
email#email.com1
student1
-------------------
firstname2
surname2
email#email.com2
student2
-----------------
I want to write Python code that iterates over the file and stores each line at its own index, for example [firstname, surname, email#email.com, student], and ignores the "-----" lines.
Python code:
with open('log.txt') as f:
    lines = f.read().splitlines()
    x = x + 1
    for i in lines:
        print i
But I believe this is wrong. I am very new to Python; can someone please point me in the right direction?
I want the output to be something like this:
Output:
index 1 :first name: firstname1
Surname: surname1
Email: email#email.com1
Student student1
index 2 :first name: firstname2
Surname: surname2
Email: email#email.com2
student: student2
I know it'd be better form to explain the general guidelines of how to do something like this, but for a simple task like this, the code speaks for itself, really...
I'd implement it like this.
from pprint import pprint # For nicer formatting of the output.
# For the sake of a self-contained example,
# the data is inlined here.
#
# `f` could be replaced with `open('log.txt')`.
f = """
firstname1
surname1
email#email.com1
student1
-------------------
firstname2
surname2
email#email.com2
student2
-----------------
""".splitlines()
data = []
current = None
for line in f:
    line = line.strip()  # Remove leading and trailing spaces
    if not line:  # Ignore empty lines
        continue  # Skip the rest of this iteration.
    if line.startswith('-----'):  # New record.
        current = None  # Clear the `current` variable
        continue  # Skip the rest of the iteration
    if current is None:  # No current entry?
        # This can happen either after a ----- line, or
        # when we're dealing with the very first line of the file.
        current = []  # Create an empty list,
        data.append(current)  # and push it to the list of data.
    current.append(line)
pprint(data)
The output is a list of lists:
[['firstname1', 'surname1', 'email#email.com1', 'student1'],
['firstname2', 'surname2', 'email#email.com2', 'student2']]
Here's a solution that might be a bit more elegant. (As long as your file strictly keeps the format from your example, that is four lines of data followed by a dashed line.)
from itertools import izip  # Python 2 only; in Python 3 drop this import and use the built-in zip
with open('info.txt') as f:
    result = [{'first name': first.strip(), 'Surname': sur.strip(),
               'Email': mail.strip(), 'student': stud.strip()}
              for first, sur, mail, stud, _ in izip(*[f]*5)]
This gives you a list of dictionaries as follows:
[{'first name': 'firstname1', 'Surname': 'surname1', 'Email': 'email#email.com1', 'student': 'student1'}, {'first name': 'firstname2', 'Surname': 'surname2', 'Email': 'email#email.com2', 'student': 'student2'}]
Where your "index 1" corresponds to the first element of the list (i.e. result[0]), "index 2" corresponds to the second element of the list, and so on.
For example, you can get the surname of your index == 2 with:
index = 2
result[index - 1]['Surname']
If you are really bothered that the index is shifted, you could build a dictionary from the result. Demo:
>>> result = dict(enumerate(result, 1))
>>> result
{1: {'first name': 'firstname1', 'Surname': 'surname1', 'Email': 'email#email.com1', 'student': 'student1'}, 2: {'first name': 'firstname2', 'Surname': 'surname2', 'Email': 'email#email.com2', 'student': 'student2'}}
>>>
>>> result[2]['Surname']
'surname2'
>>>
>>> for index, info in result.items():
...     print index, info['first name'], info['Surname'], info['Email'], info['student']
...
1 firstname1 surname1 email#email.com1 student1
2 firstname2 surname2 email#email.com2 student2
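For Python 3, the same grouping idea might look like the sketch below. It uses io.StringIO as an inline stand-in for open('info.txt') so that it runs on its own, and it assumes the file strictly follows the four-lines-plus-separator layout; an incomplete trailing group would be dropped silently.

```python
import io

# Inline stand-in for open('info.txt'); the contents are the sample from the question.
f = io.StringIO(
    "firstname1\nsurname1\nemail#email.com1\nstudent1\n-------------------\n"
    "firstname2\nsurname2\nemail#email.com2\nstudent2\n-----------------\n"
)

keys = ("first name", "Surname", "Email", "student")

# [f] * 5 holds five references to one iterator, so each tuple that zip
# builds consumes five consecutive lines: four fields plus the dashed line.
records = [dict(zip(keys, (field.strip() for field in group[:4])))
           for group in zip(*[f] * 5)]

for index, record in enumerate(records, 1):
    print("index %d:" % index)
    for key in keys:
        print("  %s: %s" % (key, record[key]))
```

The trick works because a file object (and StringIO) is its own iterator, so every position in the tuple pulls the next line from the same stream.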

convert file to python dict

Here is my file that I want to convert to a python dict:
#
# DATABASE
#
Database name   FooFileName
Database file   FooDBFile
Info file       FooInfoFile
Database ID     3
Total entries   8888
I have tried several things and I can't get it to convert to a dict. I ultimately want to be able to pick off the 'Database file' as a string. Thanks in advance.
Here is what I have tried already and the errors:
# ValueError: need more than 1 value to unpack
#d = {}
#for line in json_dump:
#for k,v in [line.strip().split('\n')]:
# for k,v in [line.strip().split(None, 1)]:
# d[k] = v.strip()
#print d
#print d['Database file']
# IndexError: list index out of range
#d = {}
#for line in json_dump:
# line = line.strip()
# parts = [p.strip() for p in line.split('/n')]
# d[parts[0]] = (parts[1], parts[2])
#print d
First you need to extract the text after the last #. You can do this with a regular expression; re.search will do it:
>>> import re
>>> s="""#
... # DATABASE
... #
... Database name   FooFileName
... Database file   FooDBFile
... Info file       FooInfoFile
... Database ID     3
... Total entries   8888"""
>>> re.search(r'#\n([^#]+)',s).group(1)
'Database name   FooFileName\nDatabase file   FooDBFile\nInfo file       FooInfoFile\nDatabase ID     3\nTotal entries   8888'
In this case you can also just use split: split the text on # and take the last element:
>>> s2=s.split('#')[-1]
Then you can combine a dictionary comprehension with a list comprehension. Note that re.split is a good choice here: the pattern r' {2,}' matches two or more spaces, so keys that contain a single space stay intact:
>>> {k:v for k,v in [re.split(r' {2,}',i) for i in s2.split('\n') if i]}
{'Database name': 'FooFileName', 'Total entries': '8888', 'Database ID': '3', 'Database file': 'FooDBFile', 'Info file': 'FooInfoFile'}
When we split a line on whitespace, we get a list of three values, so we need three variables to hold them. We then join the first and second values with a space to form the key, and use the third value as its value. This may be the simplest approach, and it is easy to understand, but note that it only works because every key in this file consists of exactly two words.
d = {}
for line in json_dump:
    if line.startswith('#'): continue
    u, k, v = line.strip().split()
    d[u + " " + k] = v
print(d)
print(d['Database file'])
EDITED to reflect a line-wise regular expression approach.
Since it appears your file is not tab-delimited, you could use a regular expression to isolate the columns:
import re
#
# The rest of your code that loads up json_dump
#
d = {}
for line in json_dump:
    if line.startswith('#'): continue  ## For filtering out comment lines
    line = line.strip()
    #parts = [p.strip() for p in line.split('/n')]
    try:
        (key, value) = re.split(r'\s\s+', line)  ## Split the line of input using 2 or more consecutive white spaces as the delimiter
    except ValueError: continue  ## Skip malformed lines
    #d[parts[0]] = (parts[1], parts[2])
    d[key] = value
print(d)
This yields this dictionary:
{'Database name': 'FooFileName', 'Total entries': '8888', 'Database ID': '3', 'Database file': 'FooDBFile', 'Info file': 'FooInfoFile'}
Which should allow you to isolate the individual values.
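For reference, a self-contained Python 3 sketch of the same line-wise idea follows. The function name parse_table is made up, the sample text is inlined rather than read from a file, and the columns are assumed to be separated by two or more whitespace characters:

```python
import re

# Sample text from the question; the column spacing is an assumption
# (two or more spaces between the key and the value).
text = """#
# DATABASE
#
Database name   FooFileName
Database file   FooDBFile
Info file       FooInfoFile
Database ID     3
Total entries   8888
"""


def parse_table(text):
    """Map each key column to its value column, skipping comments and blanks."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = re.split(r"\s{2,}", line, maxsplit=1)
        if len(parts) == 2:  # ignore lines without a value column
            key, value = parts
            result[key] = value
    return result


d = parse_table(text)
print(d["Database file"])  # FooDBFile
```

Checking the length of parts instead of catching ValueError keeps malformed lines from stopping the loop and makes the skip condition explicit.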

Most elegant way to format multi-line strings in Python

I have a multiline string, where I want to change certain parts of it with my own variables. I don't really like piecing together the same text using + operator. Is there a better alternative to this?
For example (internal quotes are necessary):
line = """Hi my name is "{0}".
I am from "{1}".
You must be "{2}"."""
I want to be able to use this multiple times to form a larger string, which will look like this:
Hi my name is "Joan".
I am from "USA".
You must be "Victor".
Hi my name is "Victor".
I am from "Russia".
You must be "Joan".
Is there a way to do something like:
txt = ""
for ...:
    txt += line.format(name, country, otherName)
>>> info = [['ian','NYC','dan'],['dan','NYC','ian']]
>>> for each in info:
...     line.format(*each)
...
'Hi my name is "ian".\nI am from "NYC".\nYou must be "dan".'
'Hi my name is "dan".\nI am from "NYC".\nYou must be "ian".'
The star operator will unpack the list into the format method.
In addition to a list, you can also use a dictionary. This is useful if you have many variables to keep track of at once.
text = """\
Hi my name is "{person_name}"
I am from "{location}"
You must be "{person_met}"\
"""
person = {'person_name': 'Joan', 'location': 'USA', 'person_met': 'Victor'}
print(text.format(**person))
Note that I laid out the text differently because it makes the lines easier to align. To do that, you have to add a '\' right after the opening """ and right before the closing """.
Now if you have several dictionaries in a list you can easily do
people = [{'person_name': 'Joan', 'location': 'USA', 'person_met': 'Victor'},
          {'person_name': 'Victor', 'location': 'Russia', 'person_met': 'Joan'}]
alltext = ""
for person in people:
    alltext += text.format(**person)
or, using a list comprehension (which gives a list of strings rather than one string),
alltext = [text.format(**person) for person in people]
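If you want a single string at the end, one option is to feed the formatted pieces to str.join. The sketch below is Python 3 and reuses the names from this answer; joining with a single newline is my assumption about the desired layout:

```python
text = """\
Hi my name is "{person_name}".
I am from "{location}".
You must be "{person_met}"."""

people = [
    {"person_name": "Joan", "location": "USA", "person_met": "Victor"},
    {"person_name": "Victor", "location": "Russia", "person_met": "Joan"},
]

# Each format() call fills one copy of the template; join glues the
# copies together with a newline between them.
alltext = "\n".join(text.format(**person) for person in people)
print(alltext)
```

Compared with repeated +=, join builds the result in one pass and makes the separator explicit.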
line = """Hi my name is "{0}".
I am from "{1}".
You must be "{2}"."""
tus = (("Joan","USA","Victor"),
       ("Victor","Russia","Joan"))
lf = line.format # <=== direct access to the right method
print('\n\n'.join(lf(*tu) for tu in tus))
Result:
Hi my name is "Joan".
I am from "USA".
You must be "Victor".
Hi my name is "Victor".
I am from "Russia".
You must be "Joan".

can a list be converted to an integer

I am trying to write a program to convert a message into a secret code. I'm trying to create a basic version to build up from. Here is the problem.
data = input('statement')
for line in data:
    code = ('l' == '1',
            'a' == '2',
            'r' == '3',
            'y' == '4')
    line = line.replace(data, code, [data])
print(line)
The point of the above program is that when I input my name:
larry
the output should be
12334
but I keep receiving this message:
TypeError: 'list' object cannot be interpreted as an integer
So I assumed this meant that my code variable must be an integer to be used in replace().
Is there a way to convert that string into an integer, or is there another way to fix this?
Your original code raised the error because of line.replace(data, code, [data]). The str.replace method can take 3 arguments. The first is the string you want to replace, the second is the replacement string, and the third, optional argument is how many instances of the string you want to replace, which must be an integer. You were passing a list as the third argument.
However, there are other problems to your code as well.
code is currently (False, False, False, False). What you need is a dictionary. You might also want to assign it outside of the loop, so you don't evaluate it every iteration.
code = {'l': '1', 'a': '2', 'r': '3', 'y': '4'}
Then, change your loop to this:
data = ''.join(code[i] for i in data)
print(data) gives you the desired output.
Note however that if a letter in the input isn't in the dictionary, you'll get an error. You can use the dict.get method to supply a default value if the key isn't in the dictionary.
data = ''.join(code.get(i, ' ') for i in data)
Where the second argument to code.get specifies the default value.
So your code should look like this:
code = {'l': '1', 'a': '2', 'r': '3', 'y': '4'}
data = input()
data = ''.join(code.get(i, ' ') for i in data)
print(data)
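As a small variation, the mapping can be wrapped in a function so that it is easy to reuse. This is only a sketch: the function name encode, the lower-casing of the input, and the "?" placeholder for unmapped characters are my choices, not part of the original code.

```python
CODE = {"l": "1", "a": "2", "r": "3", "y": "4"}


def encode(message, default="?"):
    """Replace each mapped letter with its digit; anything else becomes `default`."""
    return "".join(CODE.get(ch, default) for ch in message.lower())


print(encode("larry"))   # 12334
print(encode("Larry!"))  # 12334?
```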
Just to sum up:
% cat ./test.py
#!/usr/bin/env python
data = raw_input()
code = {'l': '1', 'a': '2',
'r': '3', 'y': '4'}
out = ''.join(code[i] for i in data)
print (out)
% python ./test.py
larry
12334
You can use translate:
>>> print("Larry".lower().translate(str.maketrans('lary', '1234')))
12334
(assuming Python 3)
The previous comments should give you a good explanation on your error message,
so I will just give you another way to make the translation from data to code.
We can make use of Python's translate method.
# In Python 2, maketrans is not a built-in, so we import it from the string module.
from string import maketrans
# raw_input returns the typed text as a string, so the user
# does not need to put quotes around it.
data = raw_input('statement')
# Now, we create a translation table
# (it defines the mapping between letters and digits, much like the dict).
trans_table = maketrans('lary', '1234')
# And we translate the input based on trans_table.
secret_data = data.translate(trans_table)
# secret_data is now a string, but according to the post title you want an integer,
# so we convert it. Note that int() raises ValueError if any character was left untranslated.
secret_data = int(secret_data)
print secret_data
Just for the record, if you are interested in obscuring data, you may want to look at hashing.
Hashing is widely used to turn data into a fixed-length digest. Note that it is one-way: the original message cannot be recovered from the digest.
A simple example of hashing in Python 2 (using the SHA-256 algorithm):
>>> import hashlib
>>> data = raw_input('statement: ')
statement: larry
>>> secret_data = hashlib.sha256(data)
>>> print secret_data.hexdigest()
0d098b1c0162939e05719f059f0f844ed989472e9e6a53283a00fe92127ac27f
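In Python 3 the same idea needs one extra step, because hashlib works on bytes rather than text. A minimal sketch, where the helper name sha256_hex and the choice of UTF-8 are assumptions:

```python
import hashlib


def sha256_hex(message):
    """Return the SHA-256 digest of a text string as 64 hex characters."""
    # hashlib works on bytes in Python 3, so encode the text first.
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


digest = sha256_hex("larry")
print(digest)
print(len(digest))  # 64
```

The same input always gives the same digest, while even a change of case in the input gives a completely different one.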
