I've got an old Informix database that was written for COBOL. All the field names are codes, so my SQL queries look like this:
SELECT uu00012 FROM uu0001;
This is pretty hard to read.
I have a text file with the field definitions like
uu00012 client
uu00013 date
uu00014 f_name
uu00015 l_name
I would like to swap out each code for the more readable English name, perhaps by running a Python script over the query and saving a file with the English names.
What's the best way to do this?
If each piece is definitely a separate word, re.sub is the way to go here:
import re
# Create a mapping of old names to new names.
with open('definitions') as f:
    d = dict(x.split() for x in f)
def my_replace(match):
    # If the match is in the dictionary, replace it; otherwise return it unchanged.
    return d.get(match.group(), match.group())
with open('inquiry') as f:
    for line in f:
        print(re.sub(r'\w+', my_replace, line))
Conceptually,
I would probably first build a mapping of codings -> english (in memory or o.
Then, for each code in your map, scan your file and replace it with the code's mapped English equivalent.
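A minimal sketch of that two-step idea, assuming the definitions hold one whitespace-separated `code name` pair per line as in the question. The function names `load_mapping` and `translate` are made up for illustration, and the inputs are in-memory strings rather than real files.

```python
import re

def load_mapping(definitions_text):
    """Build {code: english_name} from lines like 'uu00012 client'."""
    mapping = {}
    for line in definitions_text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            code, name = parts
            mapping[code] = name
    return mapping

def translate(sql, mapping):
    """Replace every whole-word code in sql with its English name."""
    # \w+ matches whole identifiers, so 'uu0001' is left alone even
    # though it is a prefix of 'uu00012'.
    return re.sub(r"\w+", lambda m: mapping.get(m.group(), m.group()), sql)

definitions = """uu00012 client
uu00013 date
uu00014 f_name
uu00015 l_name"""

mapping = load_mapping(definitions)
translated = translate("SELECT uu00012, uu00014 FROM uu0001;", mapping)
print(translated)
```

Matching whole words rather than calling str.replace avoids rewriting a code that happens to be a substring of a longer one.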
namelist = []
with open('filename.txt') as infile:
    for each in infile:
        parts = each.split()
        # Keep the code and its name as a (key, value) pair.
        namelist.append((parts[0], parts[1]))
This will give you a list of (key, value) pairs.
I don't know what you want to do with the results from there, though; you need to be more explicit.
dictionary = '''uu00012 client
uu00013 date
uu00014 f_name
uu00015 l_name'''
dictionary = dict(map(lambda x: (x[1], x[0]), [x.split() for x in dictionary.split('\n')]))
def process_sql(sql, d):
    for k, v in d.items():
        sql = sql.replace(k, v)
    return sql
sql = process_sql('SELECT f_name FROM client;', dictionary)
This builds the dictionary:
{'date': 'uu00013', 'l_name': 'uu00015', 'f_name': 'uu00014', 'client': 'uu00012'}
and then runs through your SQL, replacing the human-readable names with their codes. The result is:
SELECT uu00014 FROM uu00012;
import re
d = {}
with open("dictfile.txt") as f:
    for mapping in f:
        l, r = mapping.split()
        d[re.compile(l)] = r
# Translate the query line by line and write the result to a new file.
with open("orig.sql") as sql, open("translated.sql", "w") as out:
    for line in sql:
        for r in d.keys():
            line = r.sub(d[r], line)
        out.write(line)
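The loop above runs one regular expression per definition over every line. One possible alternative, sketched here under the assumption that the codes are plain identifiers, is to compile a single alternation with word boundaries and look each match up in the dictionary. `build_pattern` and `translate_line` are hypothetical helper names, not part of the answer above.

```python
import re

def build_pattern(mapping):
    """Compile one regex that matches any code as a whole word."""
    # Longest codes first, so a longer code wins over a shorter prefix.
    codes = sorted(mapping, key=len, reverse=True)
    alternation = "|".join(re.escape(code) for code in codes)
    return re.compile(r"\b(?:" + alternation + r")\b")

def translate_line(line, mapping, pattern):
    """Replace each matched code with its mapped name."""
    return pattern.sub(lambda m: mapping[m.group()], line)

mapping = {"uu00012": "client", "uu00013": "date"}
pattern = build_pattern(mapping)
result = translate_line("SELECT uu00012, uu000123 FROM uu0001;", mapping, pattern)
print(result)
```

The word boundaries keep `uu000123` and `uu0001` untouched, which a bare pattern per code would not guarantee.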
Related
I want to ask about translating a string using Python. I have a CSV file that contains an abbreviation dictionary like this:
before, after
ROFL, Rolling on floor laughing
STFU, Shut the freak up
LMK, Let me know
...
I want to replace each word in the string that appears in the "before" column with the corresponding text in the "after" column. I tried this code, but it doesn't change anything.
def replace_abbreviation(tweet):
    dictionary = pd.read_csv("dict.csv", encoding='latin1')
    dictionary['before'] = dictionary['before'].apply(lambda val: unicodedata.normalize('NFKD', val).encode('ascii', 'ignore').decode())
    tmp = dictionary.set_index('before').to_dict('split')
    tweet = tweet.translate(tmp)
    return tweet
For example:
Input = "lmk your test result please"
Output = "let me know your test result please"
You can read the contents to a dict and then use the following code.
res = {}
with open('dict.csv') as file:
    next(file)  # skip the first line "before, after"
    for line in file:
        # Split only on the first ", " so commas inside the expansion survive.
        k, v = line.strip().split(', ', 1)
        res[k] = v
def replace(tweet):
    return ' '.join(res.get(x.upper(), x) for x in tweet.split())
print(replace('stfu and lmk your test result please'))
Output
Shut the freak up and Let me know your test result please
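Splitting on whitespace misses abbreviations that touch punctuation (for example "lmk,"). A minimal sketch of one way around that, assuming the same two-column CSV layout: read the file with the csv module and substitute word by word with re.sub. The names `load_abbreviations` and `expand` are invented for this example, and an in-memory string stands in for dict.csv.

```python
import csv
import io
import re

# Stand-in for dict.csv from the question.
CSV_TEXT = """before, after
ROFL, Rolling on floor laughing
STFU, Shut the freak up
LMK, Let me know
"""

def load_abbreviations(handle):
    """Return {ABBREVIATION: expansion} from a two-column CSV."""
    reader = csv.reader(handle, skipinitialspace=True)
    next(reader)  # skip the header row
    return {row[0].upper(): row[1] for row in reader if len(row) >= 2}

def expand(tweet, abbreviations):
    """Replace each known abbreviation, ignoring case."""
    # \w+ picks out words only, so surrounding punctuation is preserved.
    return re.sub(
        r"\w+",
        lambda m: abbreviations.get(m.group().upper(), m.group()),
        tweet,
    )

abbreviations = load_abbreviations(io.StringIO(CSV_TEXT))
expanded = expand("lmk, your test result please!", abbreviations)
print(expanded)
```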
I have a few lines in a text file. Using a function, I want to turn the first word of each line into a key (the words are separated by spaces) and the remaining words into its values.
This is what the text contains:
FFFB10 11290 Charlie
1A9345 37659 Delta
221002 93323 Omega
The idea is to use the first word as the key and also lay the result out visually, one row beneath another. So the first word (FFFB10) is the key and the rest are values, meaning:
Entered: FFFB10
Location: 11290
Name: Charlie
I tried this as a starting point:
def code(codeenter, file):
    for line in file.splitlines():
        if name in line:
            parts = line.split(' ')
But I don't know how to continue (I erased most of the code). Any suggestions?
Assuming you managed to extract a list of lines without the newline character at the end:
def MakeDict(lines):
    return {key: (location, name) for key, location, name in (line.split() for line in lines)}
This is an ordinary dictionary comprehension fed by a generator expression. The former is everything inside the curly braces, and the latter is inside the last pair of parentheses. line.split() splits a line on whitespace.
Example run:
>>> data = '''FFFB10 11290 Charlie
... 1A9345 37659 Delta
... 221002 93323 Omega'''
>>> lines = data.split('\n')
>>> lines
['FFFB10 11290 Charlie', '1A9345 37659 Delta', '221002 93323 Omega']
>>> def MakeDict(lines):
...     return {key: (location, name) for key, location, name in (line.split() for line in lines)}
...
>>>
>>> MakeDict(lines)
{'FFFB10': ('11290', 'Charlie'), '1A9345': ('37659', 'Delta'), '221002': ('93323', 'Omega')}
How to format the output:
for key, values in MakeDict(lines).items():
    print("Key: {}\nLocation: {}\nName: {}".format(key, *values))
See ForceBru's answer on how to construct the dictionary. Here's the printing part:
for k, (v1, v2) in your_dict.items():
    print("Entered: {}\nLocation: {}\nName: {}\n".format(k, v1, v2))
You can try this:
f = [i.strip('\n').split() for i in open('filename.txt')]
final_dict = {i[0]:i[1:] for i in f}
Assuming the data is structured like this:
FFFB10 11290 Charlie
1A9345 37659 Delta
221002 93323 Omega
Your output will be:
{'FFFB10': ['11290', 'Charlie'], '221002': ['93323', 'Omega'], '1A9345': ['37659', 'Delta']}
You may want to consider using namedtuple.
from collections import namedtuple
code = {}
Code = namedtuple('Code', 'Entered Location Name')
filename = '/Users/ca_mini/Downloads/junk.txt'
with open(filename, 'r') as f:
    for row in f:
        row = row.split()
        code[row[0]] = Code(*row)
>>> code
{'1A9345': Code(Entered='1A9345', Location='37659', Name='Delta'),
'221002': Code(Entered='221002', Location='93323', Name='Omega'),
'FFFB10': Code(Entered='FFFB10', Location='11290', Name='Charlie')}
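With a namedtuple the field names travel with the data, so the "row underneath a row" layout from the question can be produced without hard-coding the labels. A small sketch, assuming the same three-field `Code` type as above; `format_record` is a hypothetical helper.

```python
from collections import namedtuple

Code = namedtuple("Code", "Entered Location Name")

def format_record(record):
    """Return 'Field: value' lines, one per namedtuple field."""
    return "\n".join(
        "{}: {}".format(field, value)
        for field, value in zip(record._fields, record)
    )

row = "FFFB10 11290 Charlie".split()
record = Code(*row)
text = format_record(record)
print(text)
```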
I have a file with 2 columns:
Anzegem Anzegem
Gijzelbrechtegem Anzegem
Ingooigem Anzegem
Aalst Sint-Truiden
Aalter Aalter
The first column is a town and the second column is the district of that town.
I made a dictionary of that file like this:
def readTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict = {}
    verzameling = set()
    for line in file:
        tmp = line.split()
        dict[tmp[0]] = tmp[1]
    return dict
If I set a variable writeTown equal to readTowns(text) and look up writeTown['Anzegem'], I want to get the collection {'Anzegem', 'Gijzelbrechtegem', 'Ingooigem'}.
Does anybody know how to do this?
I think you can just write another function that builds the data structure you need. Otherwise you will end up writing code that manipulates the dictionary returned by readTowns to generate the data you want, so why not keep the code clean and put that in its own function? Just build a dictionary that maps each district name to a list of towns and you are all set.
def writeTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict = {}
    for line in file:
        tmp = line.split()
        dict[tmp[1]] = dict.get(tmp[1]) or []
        dict.get(tmp[1]).append(tmp[0])
    return dict
writeTown = writeTowns('file.txt')
print(writeTown['Anzegem'])
And if you are concerned about reading the same file twice, you can do something like this as well:
def readTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict2town = {}  # town -> district
    town2dict = {}  # district -> list of towns
    for line in file:
        tmp = line.split()
        dict2town[tmp[0]] = tmp[1]
        town2dict[tmp[1]] = town2dict.get(tmp[1]) or []
        town2dict.get(tmp[1]).append(tmp[0])
    return dict2town, town2dict
dict2town, town2dict = readTowns('file.txt')
print(town2dict['Anzegem'])
You could do something like this, although please have a look at @ubadub's answer; there are better ways to organise your data.
[town for town, region in dic.items() if region == 'Anzegem']
It sounds like you want to make a dictionary where the keys are the districts and the values are a list of towns.
A basic way to do this is:
def readTowns(text):
    with open(text, 'r') as f:
        lines = f.readlines()
    my_dict = {}
    for line in lines:
        tmp = line.split()
        if tmp[1] in my_dict:
            my_dict[tmp[1]].append(tmp[0])
        else:
            my_dict[tmp[1]] = [tmp[0]]
    return my_dict
The if/else block can also be replaced by Python's defaultdict subclass (docs here), but I've used if/else here for readability.
Some other points: dict and file are Python built-in types (file in Python 2 only), so it is bad practice to overwrite them with your own local variables (notice I've changed dict to my_dict and file to lines in the code above).
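For comparison, here is a sketch of the defaultdict variant mentioned above. It assumes the same two-column "town district" layout; `read_districts` is a name chosen for this example, and an in-memory file object stands in for the real text file.

```python
from collections import defaultdict
import io

def read_districts(handle):
    """Return {district: [towns]} from 'town district' lines."""
    districts = defaultdict(list)
    for line in handle:
        parts = line.split()
        if len(parts) != 2:  # skip blank or malformed lines
            continue
        town, district = parts
        # A missing district starts out as an empty list automatically.
        districts[district].append(town)
    return districts

sample = io.StringIO(
    "Anzegem Anzegem\n"
    "Gijzelbrechtegem Anzegem\n"
    "Ingooigem Anzegem\n"
    "Aalst Sint-Truiden\n"
    "Aalter Aalter\n"
)
districts = read_districts(sample)
print(districts["Anzegem"])
```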
If you build your dictionary as {town: district}, so the town is the key and the district is the value, you can't do this easily*, because a dictionary is not meant to be used in that way. Dictionaries allow you to easily find the values associated with a given key. So if you want to find all the towns in a district, you are better off building your dictionary as:
{district: [list_of_towns]}
So for example the district Anzegem would appear as {'Anzegem': ['Anzegem', 'Gijzelbrechtegem', 'Ingooigem']}
And of course the value is your collection.
*you could probably do it by iterating through the entire dict and checking where your matches occur, but this isn't very efficient.
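If a {town: district} dictionary already exists, it can be inverted once instead of being scanned on every lookup. A minimal sketch, assuming sets are wanted as in the question; the `invert` helper and the sample data are illustrative only.

```python
def invert(town_to_district):
    """Turn {town: district} into {district: set_of_towns}."""
    district_to_towns = {}
    for town, district in town_to_district.items():
        # setdefault creates the empty set the first time a district is seen.
        district_to_towns.setdefault(district, set()).add(town)
    return district_to_towns

towns = {
    "Anzegem": "Anzegem",
    "Gijzelbrechtegem": "Anzegem",
    "Ingooigem": "Anzegem",
    "Aalst": "Sint-Truiden",
    "Aalter": "Aalter",
}
by_district = invert(towns)
print(sorted(by_district["Anzegem"]))
```

The inversion visits each entry once, after which every district lookup is a single dictionary access.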
I am trying to create a list named "userlist" containing all the usernames listed beside "List:".
My idea is to parse the line with "List:", split it on ",", and put the pieces in a list.
However, I am not able to capture the line. Any input on how this can be achieved?
output=""" alias: tech.sw.host
name: tech.sw.host
email: tech.sw.host
email2: tech.sw.amss
type: email list
look_elsewhere: /usr/local/mailing-lists/tech.sw.host
text: List tech SW team
list_supervisor: <username>
List: username1,username2,username3,username4,
: username5
Members: User1,User2,
: User3,User4,
: User5 """
#print output
userlist = []
for line in output:
    if "List" in line:
        print line
If it were me, I'd parse the entire input so as to have easy access to every field:
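import collections
import StringIO  # Python 2; use io.StringIO in Python 3
inFile = StringIO.StringIO(ph)  # ph holds the text from the question (named output there)
d = collections.defaultdict(list)
for line in inFile:
    line = line.partition(':')
    key = line[0].strip() or key
    # Skip the empty pieces left behind by trailing commas.
    d[key] += [part.strip() for part in line[2].split(',') if part.strip()]
print d['List']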
Using regex, str.translate and str.split (Python 2, where str.translate accepts None as the table):
>>> import re
>>> from string import whitespace
>>> strs = re.search(r'List:(.*)(\s\S*\w+):', ph, re.DOTALL).group(1)
>>> strs.translate(None, ':'+whitespace).split(',')
['username1', 'username2', 'username3', 'username4', 'username5']
You can also create a dict here, which will allow you to access any attribute:
def func(lis):
    return ''.join(lis).translate(None, ':'+whitespace)
# Pass re.DOTALL as flags; the third positional argument of re.split is maxsplit.
lis = [x.split() for x in re.split(r'(?<=\w):', ph.strip(), flags=re.DOTALL)]
dic = {}
for x, y in zip(lis[:-1], lis[1:-1]):
    dic[x[-1]] = func(y[:-1]).split(',')
dic[lis[-2][-1]] = func(lis[-1]).split(',')
print dic['List']
print dic['Members']
print dic['alias']
Output:
['username1', 'username2', 'username3', 'username4', 'username5']
['User1', 'User2', 'User3', 'User4', 'User5']
['tech.sw.host']
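The str.translate(None, chars) form used above only exists in Python 2. In Python 3 the same deletion can be expressed with str.maketrans, as in this small sketch. The variable `raw` is a hypothetical stand-in for the text already isolated after "List:".

```python
from string import whitespace

# The third argument of str.maketrans lists the characters to delete.
delete_table = str.maketrans("", "", ":" + whitespace)

raw = " username1,username2,username3,username4,\n  : username5\n"
# Drop colons and whitespace, then split; the filter guards against empty pieces.
users = [name for name in raw.translate(delete_table).split(",") if name]
print(users)
```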
Try this:
for line in output.split("\n"):
    if "List" in line:
        print line
When Python is asked to treat a string like a collection, it'll treat each character in that string as a member of that collection (as opposed to each line, which is what you're trying to accomplish).
You can see this by iterating over the string and printing each item:
>>> for line in ph:
...     print line
...
a
l
i
a
s
:
t
e
...
By the way, there are far better ways of handling this. I'd recommend taking a look at Python's built-in RegEx library: http://docs.python.org/2/library/re.html
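One way the regex route could look, as a sketch rather than a definitive parser: match the labelled line together with any continuation lines that begin with a colon, then split the captured text. It assumes the values themselves never contain ":" or ",". `field_values` is a made-up helper and the sample is a shortened copy of the question's text.

```python
import re

output = """ alias: tech.sw.host
 list_supervisor: <username>
 List: username1,username2,username3,username4,
 : username5
 Members: User1,User2,
 : User3,User4,
 : User5 """

def field_values(text, label):
    """Return the comma-separated values that follow 'label:'."""
    # The labelled line plus any continuation lines starting with ':'.
    pattern = r"^\s*" + re.escape(label) + r":(.*(?:\n\s*:.*)*)"
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        return []
    # Split on commas and continuation colons, dropping empty pieces.
    pieces = re.split(r"[,:]", match.group(1))
    return [piece.strip() for piece in pieces if piece.strip()]

userlist = field_values(output, "List")
print(userlist)
```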
Try using strip() to remove the whitespace and line breaks before doing the check:
if line.strip().startswith('List:'):
This should capture the line you need. You can then extract the usernames using split(','):
usernames = [name.strip() for name in line.strip()[5:].split(',') if name.strip()]
Here are my two solutions, which are essentially the same, but the first is easier to understand.
import re
output = """ ... """
# First solution: join continuation lines, then look for List
# Join lines such as username5 with the previous line
# List: username1,username2,username3,username4,
# : username5
# becomes
# List: username1,username2,username3,username4,username5
lines = re.sub(r',\s*:\s*', ',', output)
for line in lines.splitlines():
    # Split on the first colon only, so values that contain ':' stay intact.
    label, values = [token.strip() for token in line.split(':', 1)]
    if label == 'List':
        userlist = [user.strip() for user in values.split(',')]
        print('Users: ' + ', '.join(userlist))
# Second solution: same logic as above, by different means
tokens, = [line for line in re.sub(r',\s*:\s*', ',', output).splitlines()
           if 'List:' in line]
label, values = [token.strip() for token in tokens.split(':', 1)]
userlist = [user.strip() for user in values.split(',')]
print('Users: ' + ', '.join(userlist))
I have a list in the following format in a txt file :
Shoes, Nike, Addias, Puma,...other brand names
Pants, Dockers, Levis,...other brand names
Watches, Timex, Tiesto,...other brand names
How can I put these into a dictionary in this format:
dictionary={Shoes: [Nike, Addias, Puma,.....]
Pants: [Dockers, Levis.....]
Watches:[Timex, Tiesto,.....]
}
How can I do this in a for loop rather than by manual input?
I have tried:
clothes=open('clothes.txt').readlines()
clothing=[]
stuff=[]
for line in clothes:
    items=line.replace("\n","").split(',')
    clothing.append(items[0])
    stuff.append(items[1:])
Clothing:{}
for d in clothing:
    Clothing[d]= [f for f in stuff]
Here's a more concise way to do things, though you'll probably want to split it up a bit for readability
wordlines = [line.split(', ') for line in open('clothes.txt').read().split('\n')]
d = {w[0]:w[1:] for w in wordlines}
How about:
file = open('clothes.txt')
clothing = {}
for line in file:
    items = [item.strip() for item in line.split(",")]
    clothing[items[0]] = items[1:]
Try this; it strips the line breaks as it reads and is quite simple, but effective:
clothes = {}
with open('clothes.txt', 'r') as clothesfile:
    for line in clothesfile:
        # strip() drops the trailing line break before splitting.
        fields = line.strip().split(',')
        key = fields[0]
        value = [field.strip() for field in fields[1:]]
        clothes[key] = value
The 'with' statement makes sure the file is closed once the block that builds the dictionary has finished. From there you can use the dictionary to your heart's content!
Using a list comprehension, you could do:
clothes = [line.strip() for line in open('clothes.txt')]
clothingDict = {}
for line in clothes:
    arr = line.split(",")
    clothingDict[arr[0]] = [arr[i].strip() for i in range(1, len(arr))]
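Since the file is comma-separated, the csv module is another option. This is a sketch under the assumption that no brand name contains a comma; `read_brands` is an illustrative name and the in-memory sample stands in for clothes.txt.

```python
import csv
import io

sample = io.StringIO(
    "Shoes, Nike, Puma\n"
    "Pants, Dockers, Levis\n"
    "Watches, Timex\n"
)

def read_brands(handle):
    """Return {category: [brands]} from comma-separated lines."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.reader(handle, skipinitialspace=True)
    return {row[0]: [brand.strip() for brand in row[1:]] for row in reader if row}

brands = read_brands(sample)
print(brands["Shoes"])
```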