Display text from a key/value pair in a nested dictionary

I'm a novice.
I am trying to print the elements from the Periodic Table to the screen arranged like the table itself. I'm using (' - ') to separate the symbols that I haven't written in the dictionary yet. I'm only using a nested dictionary with two entries total to minimize confusion.
Training Source last exercise.
I asked this question elsewhere and someone (correctly) suggested using str.join(list), but it wasn't part of the tutorial.
I'm trying to teach myself and I want to understand. No schooling, no work, no instructor.
The hints at the bottom of the linked tutorial say:
1. "Use a for loop to loop through each element. Pick out the elements' row numbers and column numbers."
2. "Use two nested for loops to print either an element's symbol or a series of spaces, depending on how full that row is."
I'd like to solve it this way. Thanks in advance.
Note: No pre-intermediate, intermediate or advanced code please. The tutorial has only covered variables, strings, numbers, lists, tuples, functions (beginner level), if statements, while loops, basic terminal apps and dictionaries.
Lastly, I'd like the table itself printed in the shape of the real Periodic Table. If you could throw in a bit of novice-level code for that, it would really help, thanks.
My attempt (wrong):
ptable = {
    'mercury': {'symbol': 'hg', 'atomic number': '80', 'row': '6', 'column': '12', 'weight': '200.59'},
    'tungsten': {'symbol': 'w', 'atomic number': '74', 'row': '6', 'column': '6', 'weight': '183.84'},
}
for line in range(1,7):
    for key in ptable:
        row = int(ptable[key]['row'])
        column = int(ptable[key]['column'])
        if line != row:
            print('-'*18)
        else:
            space = 18 - column
            print('-'*(column-1)+ptable[key]['symbol']+'-'*space)
outputs:
------------------
------------------
------------------
------------------
------------------
------------------
------------------
------------------
------------------
------------------
-----------hg------
-----w------------
The output should have 7 lines, as in the Periodic Table, and display the symbol of each element in its correct place. Since I only have two elements in the dictionary, it should show Hg and W in their correct places.
The experienced programmers' solution:
for line in range(1, 8):  # line will count from 1 to 7
    # display represents the line of elements we will print later
    # hyphens show that the space is empty
    # so we fill the list with hyphens at first
    display = ['-'] * 18
    for key in ptable:
        if int(ptable[key]['row']) == line:
            # if our element is on this line
            # add that element to our list
            display[int(ptable[key]['column']) - 1] = ptable[key]['symbol']
    # str.join(list) will "join" the elements of your list into a new string
    # the string you call it on will go in between all of your elements
    print(''.join(display))

I honestly think this code isn't that hard to understand, and I think trying to change it would only make it more complicated. I'll put some links at the end so you can read up on the ''.join() method and the range() function, which you don't seem to understand yet either. You said you wanted to learn Python by yourself and that's a great thing! (I'm doing it too.) But that doesn't mean you have to stick to one tutorial ;). You can go beyond it, skip the parts you don't care about, and come back later when you need them. If you need explanations about methods (like ''.join) or anything else, let me know. Sorry if that doesn't help you ;(.
Links:
The .join() method
The range() function
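
For what it's worth, the two hints quoted in the question can also be followed literally, without str.join. Below is a minimal sketch, assuming the same ptable layout as in the question (string-valued 'row' and 'column' keys) and ' - ' as the placeholder for elements that are not in the dictionary yet. The function name build_rows and the three-character cell width are my own choices, not something from the tutorial, and the sketch does not reproduce the gaps of the real table: every empty cell gets the placeholder.

```python
# Two elements only, as in the question (symbols capitalised here).
ptable = {
    'mercury': {'symbol': 'Hg', 'row': '6', 'column': '12'},
    'tungsten': {'symbol': 'W', 'row': '6', 'column': '6'},
}

def build_rows(table, rows=7, columns=18):
    """Return one string per table row, built with nested for loops."""
    lines = []
    for row in range(1, rows + 1):
        text = ''
        for column in range(1, columns + 1):
            cell = ' - '  # placeholder for "no element known here"
            for name in table:
                element = table[name]
                if int(element['row']) == row and int(element['column']) == column:
                    # pad the symbol so every cell is three characters wide
                    cell = element['symbol'].ljust(3)
            text = text + cell
        lines.append(text)
    return lines

for text in build_rows(ptable):
    print(text)
```

Each row is assembled by plain string concatenation, so only loops, if statements and dictionaries are needed.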

Related

Python convert list into split lists

So I have been given the task of using an API to pull student records and learner IDs to put into an in-house application. The JSON formatting is dreadful and the only way I have managed to split students individually is by the last value.
Now I am at the next stumbling block: I need to split these student lists into smaller sections, so I implemented a for loop like so:
student = request.text.split('"SENMajorNeedsDetails"')
for students in student:
    r = str(student).split(',')
    print (student[0], student[1])
    print (r[0], r[1])
This works perfectly, except that it puts everything into a single list again, and each student record isn't a set length (some have more values/fields than others).
So what I am looking to do is have a list for each student, split on the comma, so student1 would equal [learnerID,personID,name,etc...].
This way, when I want to reference the learnerID, I can call student1[0].
It is also very possible that I am going about this the wrong way and should be using some form of list comprehension instead.
The step-by-step process I am aiming towards is:
1. pull data from system - DONE
2. split data into individual students - DONE
3. take learnerID, name, group of each student and add database entry
I have split step 3 into two stages: the first involves my issue above and the second is creating the database records.
Below is a shortened example of the list item student[0], followed by student[1]. If more is needed, just say.
:null},{"LearnerId":XXXXXX,"PersonId":XXXXXX,"LearnerCode":"XXXX-XXXXXX","UPN":"XXXXXXXXXXX","ULN":"XXXXXXXXXX","Surname":"XXXXX","Forename":"XXXXX","LegalSurname":"XXXXX","LegalForename":"XXXXXX","DateOfBirth":"XX/XX/XXXX 00:00:00","Year":"XX","Course":"KS5","DateOfEntry":"XX/XX/XXXX 00:00:00","Gender":"X","RegGroup":"1XX",],
:null},{"LearnerId":YYYYYYY,"PersonId":YYYYYYYY,"LearnerCode":"XXXX-YYYYYYYY","UPN":"YYYYYYYYYY","ULN":"YYYYYYYYYY","Surname":"YYYYYYYY","Forename":"YYYYYY","LegalSurname":"YYYYYY","LegalForename":"YYYYYYY","DateOfBirth":"XX/XX/XXXX 00:00:00","Year":"XX","Course":"KS5","DateOfEntry":"XX/XX/XXXX 00:00:00","Gender":"X","RegGroup":"1YY",],
Sorry, it doesn't like being put on separate lines.
EDIT: changed the wording at the end and added a redacted student record.

Just to clarify, the resolution to my issue was to learn how to parse JSON properly. This was pointed out by @Patrick Haugh and all credit should go to him for pointing me in the right direction. The second most helpful person was @ArndtJonasson.
The problem was that I was manually trying to do the job of the JSON library, and I am nowhere near that level of competency yet. As stated originally, it was entirely likely that I was going about it in completely the wrong way.
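
To illustrate what "parsing the JSON properly" can look like, here is a minimal sketch using the standard json module. It assumes the response body is a valid JSON array with one object per student; the sample records, the function name extract_students and the choice of fields are made up for illustration (the field names are taken from the redacted record in the question).

```python
import json

# Hypothetical response body: a JSON array with one object per student.
sample_text = """[
  {"LearnerId": 100001, "PersonId": 200001, "Surname": "Smith",
   "Forename": "Ann", "RegGroup": "1AA", "SENMajorNeedsDetails": null},
  {"LearnerId": 100002, "PersonId": 200002, "Surname": "Jones",
   "Forename": "Bob", "RegGroup": "1BB", "SENMajorNeedsDetails": null}
]"""

def extract_students(text):
    """Return (learner id, full name, group) tuples, one per student."""
    records = json.loads(text)  # list of dicts, one per student
    students = []
    for record in records:
        name = record["Forename"] + " " + record["Surname"]
        # .get() returns None instead of failing when a field is missing
        students.append((record["LearnerId"], name, record.get("RegGroup")))
    return students

students = extract_students(sample_text)
for learner_id, name, group in students:
    print(learner_id, name, group)
```

Because each record becomes a dictionary, fields are looked up by name, so records with a different number of fields are no longer a problem. If the response object comes from the requests library, its .json() method should perform the same parsing.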

How to implement a binary search on a list created from a file

This is my first post, please be gentle. I'm attempting to sort some files into ascending and descending order. Once I have sorted a file, I store it in a list which is assigned to a variable. The user then chooses a file and searches for an item. I get this error message:
TypeError: unorderable types: int() < list()
It occurs whenever I try to search for an item using the variable of my sorted list, on line 27 of my code. From research, I know that an int and a list cannot be compared, but I can't for the life of me think how else to search a large (600) list for an item.
At the moment I'm just playing around with binary search to get used to it.
Any suggestions would be appreciated.
year = []
with open("Year_1.txt") as file:
    for line in file:
        line = line.strip()
        year.append(line)

def selectionSort(alist):
    for fillslot in range(len(alist)-1,0,-1):
        positionOfMax=0
        for location in range(1,fillslot+1):
            if alist[location]>alist[positionOfMax]:
                positionOfMax = location
        temp = alist[fillslot]
        alist[fillslot] = alist[positionOfMax]
        alist[positionOfMax] = temp

def binarySearch(alist, item):
    first = 0
    last = len(alist)-1
    found = False
    while first<=last and not found:
        midpoint = (first + last)//2
        if alist[midpoint] == item:
            found = True
        else:
            if item < alist[midpoint]:
                last = midpoint-1
            else:
                first = midpoint+1
    return found

selectionSort(year)
testlist = []
testlist.append(year)
print(binarySearch(testlist, 2014))
Year_1.txt file consists of 600 items, all years in the format of 2016.
They are listed in descending order and start at 2017, down to 2013. Hope that makes sense.

Is there some reason you're not using Python's bisect module?
Something like:
import bisect
sorted_year = list()
for each in year:
    bisect.insort(sorted_year, each)
... is sufficient to create the sorted list. Then you can search it using functions such as those in the documentation.
(Actually you could just use year.sort() to sort the list in place ... bisect.insort() might be marginally more efficient for building the list from the input stream in lieu of your call to year.append() ... but my point about using the bisect module remains.)
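
As a rough sketch of the searching side, bisect.bisect_left returns the position where an item would be inserted, which is enough to test for membership in a sorted list. The helper name contains and the sample data are mine; the years are converted to int first so that they can be compared with the integer being searched for.

```python
import bisect

def contains(sorted_items, item):
    """Binary search: True if item is present in the sorted list."""
    position = bisect.bisect_left(sorted_items, item)
    return position < len(sorted_items) and sorted_items[position] == item

# Stand-in for the lines read from the file (strings, descending order).
raw_lines = ["2017", "2016", "2016", "2014", "2013"]
years = sorted(int(line) for line in raw_lines)

print(contains(years, 2014))
print(contains(years, 2015))
```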
Also note that 600 items is trivial for modern computing platforms. Even 6,000 won't take but a few milliseconds. On my laptop sorting 600,000 random integers takes about 180ms and similar sized strings still takes under 200ms.
So you're probably not gaining anything by sorting this list in this application at that data scale.
On the other hand, Python also includes a number of modules in its standard library for managing structured data and data files. For example, you could use Python's sqlite3 module.
Using this you'd use standard SQL DDL (data definition language) to describe your data structure and schema, SQL DML (data manipulation language: INSERT, UPDATE, and DELETE statements) to manage the contents of the data and SQL queries to fetch data from it. Your data can be returned sorted on any column and any mixture of ascending and descending on any number of columns with the standard SQL ORDER BY clauses and you can add indexes to your schema to ensure that the data is stored in a manner to enable efficient querying and traversal (table scans) in any order on any key(s) you choose.
Because Python includes SQLite in its standard library, and because SQLite provides SQL semantics over simple local files, there's almost no downside to using it for structured data. You don't have to install and maintain additional software or servers, or handle network connections to a remote database server.
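
A minimal sketch of that idea with the sqlite3 module follows. It uses an in-memory database rather than a file, and the table name years, the column name value and the helper load_years are assumptions made up for the example.

```python
import sqlite3

def load_years(values):
    """Create an in-memory database holding one row per year."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE years (value INTEGER)")  # DDL
    conn.executemany("INSERT INTO years (value) VALUES (?)",
                     [(v,) for v in values])            # DML
    conn.commit()
    return conn

conn = load_years([2016, 2013, 2017, 2014, 2016])

# Sorting is done by the database via ORDER BY.
descending = [row[0] for row in
              conn.execute("SELECT value FROM years ORDER BY value DESC")]

# Searching is an ordinary query.
count = conn.execute("SELECT COUNT(*) FROM years WHERE value = ?",
                     (2014,)).fetchone()[0]

print(descending)
print(count > 0)
```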

I'm going to walk through some steps before getting to the answer.
You need to post a minimal, complete, and verifiable example (MCVE). Instead of telling us to read from "Year_1.txt", which we don't have, you need to put the list itself in the code. Do you NEED 600 entries to get the error in your code? No. This is sufficient:
year = ["2001", "2002", "2003"]
If you really need 600 entries, then provide them. Either post the actual data, or
year = [str(x) for x in range(2017-600, 2017)]
The code you post needs to be Cut, Paste, Boom - reproduces the error on my computer just like that.
selectionSort is completely irrelevant to the question, so delete it from the question entirely. In fact, since you say the input was already sorted, I'm not sure what selectionSort is actually supposed to do in your code, either. :)
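Next you do testlist = [] followed by testlist.append(year). USE YOUR DEBUGGER before you ask here. Simply looking at the value in your variable would have made the problem obvious: testlist ends up as a list with a single item, and that item is the whole year list.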
How to append list to second list (concatenate lists)
Fixing that means you now have a list of things to search. Before you were searching a list to see if 2014 matched the one thing in there, which was a complete list of all the years.
Now we get into binarySearch. If you look at the variables, you see you are comparing the integer 2014 with some string, maybe "1716", and the answer to that is useless, if it even lets you do that (I have Python 2.7, which allows such comparisons; Python 3 raises a TypeError for int < str, just as it did for int < list in your error message). But the point is you can't find the integer 2014 in a list of strings, so it will always return False.
If you don't have a debugger, then you can place strategic print statements like
print ("debug info: binarySearch comparing ", item, alist[midpoint])
Now here, what VBB said in comments worked for me, after I fixed the other problems. If you are searching for something that isn't even in the list, and expecting True, that's wrong. Searching for "2014" returns True, if you provide the correct list to search. Alternatively, you could force it to string and then search for it. You could force all the years to int during the input phase. But the int 2014 is not the same as the string "2014".
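
Putting those fixes together, a minimal sketch could look like this. It assumes the years are converted to int while the input is read; the name binary_search and the short raw_lines sample standing in for the file are mine.

```python
def binary_search(sorted_items, item):
    """Return True if item is in sorted_items (which must be ascending)."""
    first = 0
    last = len(sorted_items) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if sorted_items[midpoint] == item:
            return True
        if item < sorted_items[midpoint]:
            last = midpoint - 1   # keep searching the lower half
        else:
            first = midpoint + 1  # keep searching the upper half
    return False

# Stand-in for the file contents: strings in descending order.
raw_lines = ["2017", "2016", "2015", "2014", "2013"]

# Convert to int during input and sort ascending; search this list
# directly instead of wrapping it inside another list.
years = sorted(int(line) for line in raw_lines)

print(binary_search(years, 2014))
print(binary_search(years, 1999))
```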

How to find and replace 6 digit numbers within HREF links from map of values across site files, ideally using SED/Python

I need to create a BASH script, ideally using SED, to find and replace lists of values in href URL links within HTML site files, looking up each value in a map (old to new values). Only links with a given URL construct are affected. There are around 25K site files to look through, and the map has around 6,000 entries to search.
All old and new values have 6 digits.
The URL construct is:
One value:
HREF=".*jsp\?.*N=[0-9]{1,}.*"
List of values:
HREF=".*\.jsp\?.*N=[0-9]{1,}+N=[0-9]{1,}+N=[0-9]{1,}...*"
The list of values are delimited by + PLUS symbol, and the list can be 1 to n values in length.
I want to ignore a construct such as this:
HREF=".*\.jsp\?.*N=0.*"
IE the list is only N=0
Effectively I'm only interested in URLs that include one or more values that are in the map file and that are not flagged with CHANGED, i.e. the list requires updating.
PLEASE NOTE: in the above construct examples, .* means any character that isn't a digit. I'm just interested in any 6-digit values in the list of values after N=, so I've been trying to isolate the N= list from the rest of the URL construct. Note that this N= list can appear anywhere within the URL construct.
Initially, I want to create a script that produces a report of all links that fulfil the above criteria and that contain a 6-digit OLD value from the map file, together with the file path, to get an understanding of the links impacted. E.g.:
Filename link
filea.jsp /jsp/search/results.jsp?N=204200+731&Ntx=mode+matchallpartial&Ntk=gensearch&Ntt=
filea.jsp /jsp/search/BROWSE.jsp?Ntx=mode+matchallpartial&N=213890+217867+731&
fileb.jsp /jsp/search/results.jsp?N=0+450+207827+213767&Ntx=mode+matchallpartial&Ntk=gensearch&Ntt=
Lastly, I'd like to find and replace all 6 digit numbers, within the URL construct lists, as outlined above, as efficiently as possible (I'd like it to be reasonably fast as there could be around 25K files, with 6K values to look up, with potentially multiple values in the list).
**PLEASE NOTE:** There is an additional issue I have, when finding and replacing, is that an old value could have been assigned a new value, that's already been used, that may also have to be replaced.
E.G. If the map file is as below:
MAP-FILE.txt
OLD NEW
214865 218494
214866 217854
214867 214868
214868 218633
... ...
and there is a HREF link such as:
/jsp/search/results.jsp?Ntx=mode+matchallpartial&Ntk=gensearch&N=0+450+214867+214868
214867 changes to 214868 - this would need to be prepended to flag that this value has been changed, and should not be replaced, otherwise what was 214867 would become 218633 as all 214868 would be changed to 218633. Hope this makes sense - I would then need to run through file and remove all 6 digit numbers that had been marked with the prepended flag, such that link would become:
/jsp/search/results.jsp?Ntx=mode+matchallpartial&Ntk=gensearch&N=0+450+214868CHANGED+218633CHANGED
Unless there's a better way to manage these infile changes.
Could someone please help me with this? I'm not an expert with these kinds of changes, so help would be massively appreciated.
Many thanks in advance,
Alex

I will write the outline for the code in a kind of pseudocode, since I don't remember Python well enough to write it quickly.
First find what type the link is (if it contains N=0 then type 3, if it contains "+" then type 2, else type 1) and get a list of the "N=..." values by exploding (the name of a PHP function; split in Python) on the "+" sign.
The first loop is over the links. The second loop is over each N= number. The third loop looks in the map and finds the replacement value. Load the data of the map file into a variable before all the loops: reading files is among the slowest operations in a program, so do it only once.
You replace the value in the third loop, then implode (PHP function; join in Python) the list of new strings into a new link when returning to the first loop.
You probably have several files containing links, so you need another loop over the files.
When dealing with repeated codes you need a while loop that runs until a spare number is found. And you need to save the numbers that are already used in a list.
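
Here is a rough Python sketch of the replacement step for a single link. Note that it swaps in a different technique from the CHANGED flag described in the question: every value is looked up in a dictionary exactly once, in a single pass, so a freshly written new value is never replaced a second time. The names N_LIST, replace_ids and id_map are hypothetical, and the regular expression assumes that the value list is written as N= followed by digits separated by "+".

```python
import re

# N= followed by digits separated by "+", e.g. N=0+450+214867.
# The lookbehind keeps it from matching inside a longer parameter name.
N_LIST = re.compile(r'(?<![A-Za-z0-9])N=(\d+(?:\+\d+)*)')

def replace_ids(link, id_map):
    """Replace every value of the N= list that has an entry in id_map."""
    def swap(match):
        values = match.group(1).split('+')
        # Each original value is looked up once; unknown values are kept.
        new_values = [id_map.get(value, value) for value in values]
        return 'N=' + '+'.join(new_values)
    return N_LIST.sub(swap, link)

# The map from the question, loaded once before any file is processed.
id_map = {
    '214865': '218494',
    '214866': '217854',
    '214867': '214868',
    '214868': '218633',
}

link = ('/jsp/search/results.jsp?Ntx=mode+matchallpartial'
        '&Ntk=gensearch&N=0+450+214867+214868')
print(replace_ids(link, id_map))
```

The same function could be applied to each HREF found in each file; a report-only run would compare the returned link with the original instead of writing it back.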

How to 'flatten' lines from text file if they meet certain criteria using Python?

To start, I am a complete newcomer to Python and to programming in anything other than web languages.
So, I have developed a script using Python as an interface between a piece of software called Spendmap and an online app called Freeagent. This script works perfectly. It imports and parses the text file and pushes it through the API to the web app.
What I am struggling with is that Spendmap exports multiple lines per order, whereas Freeagent wants one line per order. So I need to add up the cost values from any orders spread across multiple lines and then 'flatten' the lines into one so it can be sent through the API. The 'key' field is the 'PO' field. So if the script sees any matching PO numbers, I want it to flatten them as described above.
This is a 'dummy' example of the text file produced by Spendmap:
5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP
COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,42.000,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000002,1133919,359.400,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
The above has been formatted for easier reading; normally it is just one line after the next with no text formatting.
The 'key' or PO field is the fourth field (e.g. P000002) and the cost to be totalled is the sixth field (e.g. 42.000). So if this example were passed through the script, I'd expect the first row to be left alone, the costs of the second and third rows to be added as they're both from the same PO number, and the fourth row to be left alone.
Expected result:
5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP
COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,401.400,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
Any help with this would be greatly appreciated and if you need any further details just say.
Thanks in advance for looking!

I won't give you the solution. But you should:
Write and test a regular expression that breaks the line down into its parts, or use the CSV library.
Parse the numbers out so they're decimal numbers rather than strings
Collect the lines up by ID. Perhaps you could use a dict that maps IDs to lists of orders?
When all the input is finished, iterate over that dict and add up all orders stored in that list.
Make a string format function that outputs the line in the expected format.
Maybe feed the output back into the input to test that you get the same result. Second time round there should be no changes, if I understood the problem.
Good luck!
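
As one possible reading of the steps above, here is a minimal sketch using the csv and decimal modules. It assumes each order is a single input line with the COMMENT part on the same line, the PO number in the fourth field and the cost in the sixth; the name flatten_orders and the shortened sample are made up for illustration. It keeps a running total per PO rather than a list of orders.

```python
import csv
import io
from decimal import Decimal

sample = """5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,42.000,20,2013-10-31,103,xxxxxx,AP COMMENT,002143
301067,2013-09-06,2013-09-11,P000002,1133919,359.400,20,2013-10-31,103,xxxxxx,AP COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP COMMENT,002143"""

def flatten_orders(text):
    """Merge lines sharing a PO number, summing their cost field."""
    merged = {}  # PO number -> [fields of the first line seen, running total]
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        po = fields[3]
        cost = Decimal(fields[5])  # exact decimal arithmetic, no float rounding
        if po in merged:
            merged[po][1] += cost
        else:
            merged[po] = [fields, cost]
    lines = []
    for fields, total in merged.values():  # dicts keep insertion order (3.7+)
        lines.append(",".join(fields[:5] + [f"{total:.3f}"] + fields[6:]))
    return lines

for line in flatten_orders(sample):
    print(line)
```

Feeding the output back into flatten_orders should leave it unchanged, which is the self-check suggested in the last step.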

I would use a dictionary to compile the lines, using get(key,0.0) to sum values if they exist already, or start with zero if not:
InputData = """5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,42.000,20,2013-10-31,103,xxxxxx,AP COMMENT,002143
301067,2013-09-06,2013-09-11,P000002,1133919,359.400,20,2013-10-31,103,xxxxxx,AP COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP COMMENT,002143"""
OutD = {}
ValueD = {}
for Line in InputData.split('\n'):
    # commas in comments won't matter because we are joining after anyway
    Fields = Line.split(',')
    PO = Fields[3]
    Value = float(Fields[5])
    # set up the output string with a placeholder for .format()
    OutD[PO] = ",".join(Fields[:5] + ["{0:.3f}"] + Fields[6:])
    # add the value to the old value or to zero if it is not found
    ValueD[PO] = ValueD.get(PO,0.0) + Value
# the output is unsorted by default, but you could sort or preserve original order
for POKey in ValueD:
    # print() with parentheses works in both Python 2 and Python 3
    print(OutD[POKey].format(ValueD[POKey]))
P.S. Yes, I know Capitals are for Classes, but this makes it easier to tell what variables I have defined...

Reading text and assigning a class to data in Python

I've been searching around, and had no luck finding anything answering my question.
Essentially I have a file with the following data:
Title - 19
Artist - Adele
Year released - 2008
1 - Daydreamer, 3:41, 1
2 - Best for Last, 4:19, 5
3 - Chasing Pavements, 3:31, 7
4 - Cold Shoulder, 3:12, 3

Title - El Camino
Artist - The Black Keys
Year released - 2011
1 - Lonely Boy, 3:13, 1
2 - Run Right Back, 3:17, 10
EOF
I know how to create classes, and how to assign an object to a class and values to that object, but I am just about ready to tear my hair out over how I'm supposed to process the text. From the text, I need to create a title for the album and assign the album's information to it. There's more besides that needs to be done, and there are more lines to be read, and I just don't know where to start on this. I've found two "album.py" files via Google, and I've been unable to make heads or tails of how to apply the solution to my case.
And yes, this is for a school assignment. I've done some digging around and found some things relevant, but I'm just not understanding it. I'm new to programming in general, and I've made progress but this seems too far over my head.
I know I could reduce this to lists using split("\n\n") and operating on a series of progressively smaller lists, but I am trying to avoid this method at all costs.
EDIT:
For the time being, it's best to assume I know nothing. Though, to answer below question: I can open the file and read it. If its a consistent CSV formatted file, I can write code to process the enclosed data, and create a class structure that uses that data. Right now I'm just having trouble with the first three lines, and the digits immediately below.
APRIL 4 2012:
Okay, I have some code, I've left the comments with respect to it underneath.
def getInput():
    global albums
    raw = open("album.txt","r")
    infile = raw
    raw.close
    text=""
    line = infile.readline()
    while (line != "EOF\n" ):
        text += line
        line=infile.readline()
    text=text.rstrip("\n\n")
    albums=[str(n) for n in text.split("\n\n")]
    return albums

class Album():
    def __init__(self, title, artist, date):
        self.title=title
        self.artist=artist
        self.date=date
        self.track={}
    def addSong(self, TrackID, title, time, ranking):
        self.track+={self}
    def getAlbumLength(self):
        asdf=0
    def getRanking(self):
        asdf=0

def labels(x): #establishes labels per item to be used for Album Classifier
    title=""
    artist=""
    date=""
    for i in range(0,len(albums),1):
        sublist=[str(n) for n in albums[i].split("\n")]
        RANDUMB=len(albums[i])
        title=sublist[0]
        artist=sublist[1]
        date=sublist[2]
        for j in range(0,len(sublist),1):
            song_info = [str(k) for k in sublist[3:].split("," and " - ")]
            TrackID=song_info[0]
            title=song_info[1]
            time=song_info[2]
            ranking=song_info[3]

getInput()
labels(albums)
Personal comments on code:
I was trying to avoid getting it into lists because I anticipated this problem. As far as the functions are concerned, I have to use every single bloody one, because it's in the assignment requirements... I am displeased because I could probably get around using them. The code works well enough, except for the last part, where I am trying to extract the song information. I want to split the song information into lists, which are nested inside the album information list. Something like:
[Album title, Artist, Date released,[01,Song,3:44,2],[02,Song,0:01,9]....]
The current code gives me index out of range error as of right now... I am using python3.
TLDR: The substance of my problem has thus changed from one of trying to solve how to go about starting the solution to how to take items in a list and convert them into nested lists.

If you end up editing your question to contain some more specific examples of what is giving you trouble, I will edit this answer. But to address your general question, there are some steps involved to achieving your goal.
Like you said, you need to write a class that reflects the structure you intend to have from this data.
You will need to parse this file, probably line by line. So you have to determine whether the file format is consistent. If it is, then you need to determine:
What is the delimiter between each set of data, which will be converted into a class instance?
What is the delimiter between each field of each line?
When you are looping over each line, you will know that you need to start a new album object whenever you encounter a blank line.
When you know you are starting a new album, you can assume that the first line will be a title, the second an artist, the third, the year, etc.
For each of these lines you will also have to have rules of how to split each one into the data you want. At a basic level it can be a simple set of splits. At a more advanced level you might define regular expressions for each type of line.
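
The steps above can be sketched roughly as follows. This is only one way to do it, assuming albums are separated by a blank line, the file ends with an EOF line, the first three lines of each album are title, artist and year, and song titles contain no commas. The names Album, parse_albums and SAMPLE are illustrative and do not follow the assignment's required functions.

```python
SAMPLE = """Title - 19
Artist - Adele
Year released - 2008
1 - Daydreamer, 3:41, 1
2 - Best for Last, 4:19, 5

Title - El Camino
Artist - The Black Keys
Year released - 2011
1 - Lonely Boy, 3:13, 1
2 - Run Right Back, 3:17, 10
EOF
"""

class Album:
    def __init__(self, title, artist, year):
        self.title = title
        self.artist = artist
        self.year = year
        self.tracks = []  # each entry: [track id, name, length, ranking]

def parse_albums(text):
    albums = []
    header = []     # header values collected for the album being read
    current = None  # album that song lines are currently added to
    for line in text.splitlines():
        line = line.strip()
        if line == "EOF":
            break
        if not line:
            # a blank line means the next line starts a new album
            header = []
            current = None
            continue
        label, value = line.split(" - ", 1)
        if current is None:
            header.append(value)
            if len(header) == 3:  # title, artist and year are all known
                current = Album(header[0], header[1], int(header[2]))
                albums.append(current)
        else:
            name, length, ranking = [part.strip() for part in value.split(",")]
            current.tracks.append([int(label), name, length, int(ranking)])
    return albums

for album in parse_albums(SAMPLE):
    print(album.title, album.artist, album.year, album.tracks)
```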
