Issue with concatenating and replace in Pandas dataframe - python

I am stymied by something: whenever I fix it one way, it breaks something else.
I have a set of data that lists file status by country. For each country in the Country column, I want to print all missing files for each status in the VisitStatus column. So for all rows where country=France, and for every visit that is "Complete", list the number of missing files.
There are two dataframes, df_s and df_ins, that I concatenate into one combined set, df_combined, to work with and deliver the final output.
When I grab the unique values of the Country and VisitStatus columns to loop over and then try to write the results per country to an Excel workbook, quirks in the data trigger a 'duplicate sheetname' error. In one of the source dataframes the VisitStatus column contains a status of "Do Not Review", but in the other it is "Do not review", with the last two words in lowercase. After concatenation, unique() returns both "Do Not Review" and "Do not review". When the xlsx writer tries to create the worksheet for the second one, it checks the name against the existing worksheets DISREGARDING CASE, finds the first one, decides they are the same, and raises an error saying that the 'Do not review' worksheet already exists.
If I run replace() to change all the "Do not review" values in the VisitStatus column into "Do Not Review", so that they all match and unique() no longer returns two results, it breaks and gives me a KeyError on VisitStatus.
I have read thread after thread about this and haven't been able to solve it. I just tried running replace() on the source dataframe instead, and then it throws an error saying that "status" is a float and can't be handled like a string.
I'm at a loss. Thanks in advance!
# COMBO
# Merge the screening and in study datasets
df_combined = pd.concat([df_s, df_ins], axis=0, ignore_index=True)
df_combined = df_combined.query('VisitStatus != "Hand Off Information"')
print(df_combined.columns.values)
print("---------------------------------------------------------------------------------")
# Display and save out country and missing file status
statuses = df_combined['VisitStatus'].unique()
countries = df_combined['Country'].unique()
for status in statuses:
    print("X" + status + "X")
print('\n')
print(statuses)
for country in countries:
    for status in statuses:
        print('\n')
        print("---> Missing Files for " + country + " all visits with status of: " + str(status))
        df_cmb = df_combined[(df_combined.Country==country) & (df_combined.VisitStatus==status)]
        print('\n')
        numRows = df_cmb.shape[0]
        if numRows > 0:
            print("----> Number of visits in " + str(status) + " subset: " + str(numRows))
            print("DRF Forms Missing: " + str(df_cmb['DRF-Form-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['DRF-Form-Uploaded'].sum()) + " collected")
            print("CSSRS Forms Missing: " + str(df_cmb['CSSRS-Form-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['CSSRS-Form-Uploaded'].sum()) + " collected")
            print("CDR Forms Missing: " + str(df_cmb['CDR-Form-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['CDR-Form-Uploaded'].sum()) + " collected")
            print("CDR Audio Missing: " + str(df_cmb['CDR-Audio-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['CDR-Audio-Uploaded'].sum()) + " collected")
            print("MMSE Forms Missing: " + str(df_cmb['MMSE-Form-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['MMSE-Form-Uploaded'].sum()) + " collected")
            print("MMSE Audio Missing: " + str(df_cmb['MMSE-Audio-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['MMSE-Audio-Uploaded'].sum()) + " collected")
            print("RBANS Forms Missing: " + str(df_cmb['RBANS-Form-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['RBANS-Form-Uploaded'].sum()) + " collected")
            print("RBANS Audio Missing: " + str(df_cmb['RBANS-Audio-Uploaded'].sum()) + " vs. " + str(numRows - df_cmb['RBANS-Audio-Uploaded'].sum()) + " collected")
            print("--------------------------------------")
            print('\n')
        else:
            print("No " + status + " files/visits for " + country)
        if country == "United States":
            country = "USA"
        # something is borked in the next line - somehow there are two "Do Not Review" status types in the combined file, triggers an "already in use" for sheetname
        df_cmb.to_excel(combo_writer, header=True, index=False, sheet_name=str(country)[:3] + "-by-" + str(status))

Oh lord. I'm answering my own question.
So I tinkered some more and nothing else made sense, so I started wondering if I was putting in the arguments for replace() correctly, and I had them backwards. I assumed the "Do not review" needed to be changed to "Do Not Review", but it was the other way around...I assumed incorrectly as to which source file data needed to be modified. Once I flipped them, it works.
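
For anyone hitting the same thing, here is a minimal sketch of a way to avoid guessing which spelling to replace: collapse labels that differ only by case onto the first spelling seen. It uses plain Python lists rather than the real dataframes, and `canonical_status` is a hypothetical helper name, not something from pandas. Missing values show up as float NaN in a string column, which is probably where the "float can't be handled like a string" error came from, so the sketch passes them through untouched.

```python
import math

def canonical_status(value, seen):
    """Return one canonical spelling per case-insensitive status label."""
    # Missing values typically arrive as float NaN; leave them alone.
    if isinstance(value, float) and math.isnan(value):
        return value
    text = str(value).strip()
    # The first spelling encountered wins; later case variants map onto it.
    return seen.setdefault(text.casefold(), text)

seen = {}
raw = ["Complete", "Do Not Review", "Do not review", float("nan"), "complete"]
cleaned = [canonical_status(v, seen) for v in raw]

# Only real labels are kept, so each status yields exactly one sheet name.
unique_labels = sorted(set(v for v in cleaned if isinstance(v, str)))
print(unique_labels)
```

On a dataframe, the same function could presumably be applied to the column with `df_combined['VisitStatus'].map(...)` before calling unique().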

Related

Why do I have different api responses from Openweathermap JSON and PyOWM library?

I am using two different ways to get current weather and I have got different data from two API.
I suspect PyOWM doesn't work properly, because if I change the city and run the script several times, it sticks with the same data and shows the same figures no matter which city I type in the script. At least PyOWM shows weather pretty close to reality the first time it is launched. The web API from https://openweathermap.org/ works accurately and I have no problems with its JSON response, but PyOWM's response seems to show random data. I could simply forget about PyOWM and never use it, but I am new to this sort of discrepancy between API responses and I would like to know whether I am doing something wrong or misunderstanding something.
web API https://openweathermap.org/current
import requests
place = "London"
apikey = "e4784f34c74efe649018567223752b21"
lang = "en"
r = requests.get("http://api.openweathermap.org/data/2.5/weather?q=" + place + "&appid=" + apikey + "&lang=" + lang + "&units=metric", timeout=20)
# r.json() already returns a dict, so no dumps/loads round trip is needed
api_answer = r.json()
weather_is = "Now in " + place + ": " + api_answer["weather"][0]["description"] + ".\n"
t_txt = "Temperature:\n"
t_now = "now: " + str(api_answer["main"]["temp"]) + "\n"
t_max = "maximum: " + str(api_answer["main"]["temp_max"]) + "\n"
t_min = "minimum: " + str(api_answer["main"]["temp_min"])
final_txt = weather_is + t_txt + t_now + t_max + t_min
print(final_txt)
PyOWM API https://pyowm.readthedocs.io/en/latest/usage-examples-v2/weather-api-usage-examples.html
import pyowm
owm = pyowm.OWM('e4784f34c74efe649018567223752b21', language = "en")
place = "London"
observation = owm.weather_at_place('place')
w = observation.get_weather()
print("Now in " + place + ": " + w.get_detailed_status() + ".")
temperature_at_place_now = w.get_temperature('celsius')["temp"]
temperature_at_place_max = w.get_temperature('celsius')["temp_max"]
temperature_at_place_min = w.get_temperature('celsius')["temp_min"]
print ("Temperature:")
print ("now: " + str(temperature_at_place_now))
print ("maximum: " + str(temperature_at_place_max))
print ("minimum: " + str(temperature_at_place_min))
[Images omitted: web API output; PyOWM output]
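
One likely cause, judging only from the code above: `owm.weather_at_place('place')` passes the literal string 'place', not the variable `place`, so the lookup never changes when the city is edited. The sketch below shows the difference without calling any weather service; `weather_query` is a hypothetical helper for illustration and is not part of PyOWM.

```python
from urllib.parse import urlencode

def weather_query(city):
    """Build the query string a current-weather lookup for `city` would send."""
    return urlencode({"q": city, "units": "metric"})

place = "Paris"
literal_query = weather_query('place')   # always asks about a city called "place"
variable_query = weather_query(place)    # asks about whatever `place` holds

print(literal_query)
print(variable_query)
```

If that is the problem, dropping the quotes in the PyOWM call, so that it reads `owm.weather_at_place(place)`, should make the two outputs comparable.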

Arcmap script will not print messages in arcmap console

I have a Python script for Arcmap that I wrote. I'm trying to create a tool that reprojects all the feature classes within the workspace to a specified feature class.
The problem that I'm having is that I cannot get Arcmap to print the "completed" messages. The messages that I want to have appear will print when I hard-code the variables and run it as a script, but they will not print in Arcmap. You can see in the code below that I have specific printed messages that I want printed, but they just won't appear.
Code:
#Import modules
import arcpy, os
#Set workspace directory
from arcpy import env
#Define workspace
inWorkspace = arcpy.GetParameterAsText(0)
env.workspace = inWorkspace
env.overwriteOutput = True
try:
    #Define local feature class to reproject to:
    targetFeature = arcpy.GetParameterAsText(1)
    #Describe the input feature class
    inFc = arcpy.Describe(targetFeature)
    sRef = inFc.spatialReference
    #List the feature classes in the workspace
    fcList = arcpy.ListFeatureClasses()
    #Loop to re-define the feature classes and print the messages:
    for fc in fcList:
        desc = arcpy.Describe(fc)
        if desc.spatialReference.name != sRef.name:
            print "Projection of " + str(fc) + " is " + desc.spatialReference.name + ", so re-defining projection now:\n"
            newFc = arcpy.Project_management(fc, "projected_" + fc, sRef)
            newFeat = arcpy.Describe(newFc)
            count = arcpy.GetMessageCount()
            print "The reprojection of " + str(newFeat.baseName) + " " + arcpy.GetMessage(count-1) + "\n"
    #Find out which feature classes have been reprojected
    outFc = arcpy.ListFeatureClasses("projected_*")
    #Print a custom message describing which feature classes were reprojected
    for fc in outFc:
        desc = arcpy.Describe(fc)
        name = desc.name
        name = name[:name.find(".")]
        name = name.split("_")
        name = name[1] + "_" + name[0]
        print "The new file that has been reprojected is named " + name + "\n"
except arcpy.ExecuteError:
    pass
severity = arcpy.GetMaxSeverity()
if severity == 2:
    print "Error occurred:\n{0}".format(arcpy.GetMessage(2))
elif severity == 1:
    # {0}, not {1}: format() receives a single argument here
    print "Warning raised:\n{0}".format(arcpy.GetMessage(1))
else:
    print "Script complete"
When I upload a script into an Arcmap toolbox, the following lines (From the above code) will NOT print:
print "Projection of " + str(fc) + " is " + desc.spatialReference.name + ", so re-defining projection now:\n"
print "The reprojection of " + str(newFeat.baseName) + " " + arcpy.GetMessage(count-1) + "\n"
print "The new file that has been reprojected is named " + name + "\n"
How can I fix this?
print only shows its output when the script runs in a Python interpreter. To show messages while the script runs as a tool in an ArcGIS toolbox, you need to use arcpy.AddMessage():
arcpy.AddMessage("Projection of {0} is {1}, so re-defining projection now: ".format(str(fc), desc.spatialReference.name))
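
A small sketch of how the same script could report messages both ways, so it still works when run outside ArcMap with hard-coded variables. `report` is a hypothetical helper name; the only arcpy call it relies on is `arcpy.AddMessage`, and the fallback to print when arcpy cannot be imported is an assumption made for illustration.

```python
try:
    import arcpy  # only available inside an ArcGIS Python installation
except ImportError:
    arcpy = None

def report(message):
    """Send a message to the tool's results window, or to stdout without arcpy."""
    if arcpy is not None:
        arcpy.AddMessage(message)
    else:
        print(message)
    return message

report("The new file that has been reprojected is named example_projected")
```

Each of the three print lines in the question could then be swapped for a call to `report(...)` with the same string.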

Getting input from user and writing it to a text file

f=open("test.txt","w")
f.write("PERSONALINFO"+"\n")
f.write("\n")
f.close()
f=open("test.txt","a")
f.write("Customer 1 Info:""\n")
print()
print("Customer 1 input:")
user1_title=input("Enter Mr, Mrs, Miss, Ms:")
user1_name=input("Enter fullname:")
user1_town=input("Enter town and country you live in:")
user1_age=input("Enter birth MM/DD/YY with numbers:""\n")
print()
print("Name:",user1_title + "", user1_name,"\n""Hometown:",user1_town,"\n" "Age:", user1_age, file=f)
print("1.Student")
print("2.Not working")
User1_working_status=input("Enter working status:")
if user1_name=="1":
    print("student")
elif user1_name=="2":
    print("Not working")
    input("Please explain why:")
I can't get my elif statement "Explain why" to print to my text file. Can anyone help me? I've tried everything but nothing works so I'm stuck.
From what I understand, you want to create and append all the information to a text file. For the Explain why part, you should store the information to a variable and then write to file. If you use the with context manager for file I/O, you would not have to close the file explicitly.
user1_title=input("Enter Mr, Mrs, Miss, Ms: ")
user1_name=input("Enter fullname: ")
user1_town=input("Enter town and country you live in: ")
user1_age=input("Enter birth MM/DD/YY with numbers:""\n")
print("1.Student")
print("2.Not working")
User1_working_status=input("Enter working status: ")
with open("test.txt", 'a+') as f:
    if User1_working_status=="1":
        f.write("{}\n{}\n{}\n{}\n".format("Name: " + user1_title + " " + user1_name, "Town: " + user1_town, "Age: " + user1_age, "Working status: student"))
    elif User1_working_status=="2":
        explain = input("Please explain why: ")
        f.write("{}\n{}\n{}\n{}\n{}\n".format("Name: " + user1_title + " " + user1_name, "Town: " + user1_town, "Age: " + user1_age, "Working status: Not Working", "Reason: " + explain))
print("Information written to test.txt")
Hope this helps.
To write to your text file, you should use f.write rather than a plain print, which only shows on the console. And as stated in the comments, remember to close the file when the program finishes.
Also, the working status is stored in the User1_working_status variable, but the condition in your if statement reads user1_name.

Usage of subprocess class in Python

final="cacls " + "E:/" + "\"" + list1[2] + " " + list1[3] + "\"" + " /p " + str
pro = subprocess.Popen(final,shell=True, stdin=subprocess.PIPE)
pro.communicate(bytes("Y\r\n",'utf-8'))
I am trying to set permissions on a folder using Python, but the command needs user input while it runs:
it asks ARE YOU SURE(Y/N) and the user needs to enter "Y" or "N".
The above code is not setting the permission.
This is the question I asked before:
Python code to send command through command line
As a smart programmer, use PBS
Then, the code is:
from pbs import type as echo# Isn't it echo for Windows? If not, use the correct one
script = Command("/path/to/cacls ")
print script(echo("Y"), ("E:/" + "\"" + list1[2] + " " + list1[3] + "\"" + " /p " + str).split())
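
For comparison, a minimal sketch of answering a confirmation prompt with the standard subprocess module alone. It does not run cacls: a small Python child process stands in for the interactive tool, which is an assumption made so the example runs anywhere. The command is passed as an argument list, which avoids the manual quoting in the question.

```python
import subprocess
import sys

# Stand-in for an interactive tool: reads one line of confirmation and echoes it.
child_code = "import sys; answer = sys.stdin.readline().strip(); print('got ' + answer)"

proc = subprocess.run(
    [sys.executable, "-c", child_code],  # argument list, so no shell quoting is needed
    input="Y\n",                         # what the user would have typed at the prompt
    capture_output=True,
    text=True,
)
reply = proc.stdout.strip()
print(reply)
```

Note also that the question's code uses a variable named `str`, which shadows the built-in; a different name would be safer.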

IRC Bot - flood protection (python)

if data.find('PRIVMSG') != -1:
    nick = data.split('!')[ 0 ].replace(':','')
    text = ''
    if data.count(text) >= 200:
        sck.send('KICK ' + " " + chan + " :" 'flooding' + '\r\n')
I'm trying to code flood protection for the bot. I want it to kick a user who enters more than 200 characters. How can I make it read the other lines instead of just the first line? Also, the code above doesn't work: it doesn't kick the user, but if I change the sck.send() to sck.send('PRIVMSG ' + chan + " :" 'flooding' + '\r\n') it works.
I fixed the kicking problem and the code works now, but it only reads the first line. I'm not sure how to make it read the other lines if the user keeps flooding the channel.
if data.find('PRIVMSG') != -1:
    nick = data.split('!')[ 0 ].replace(':','')
    text = ''
    if data.count(text) >= 200:
        sck.send('KICK ' + " " + chan + " " + nick + " :" 'flooding' + '\r\n')
As far as I remember, the colon is a reserved character in the IRC protocol. That is, apart from the colon that opens the sender prefix at the very start of a server message, the first colon denotes the start of user-supplied data (that's also why ":" is not allowed in nicks/channel names). Hence, it suffices to search for that colon and calculate the length of the remaining string.
Furthermore, data.find('PRIVMSG') is pretty unreliable. What if a user types the word "PRIVMSG" in regular channel conversation? Go look up the IRC RFC, it specifies the format of PRIVMSGs in detail.
Besides, you should be a little more specific. What exactly is the problem you're facing? Extracting the nick? Calculating the message length? Connecting to IRC?
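
A minimal sketch of the approach described above, assuming each received chunk may hold several messages separated by "\r\n" and that messages look like ":nick!user@host PRIVMSG #channel :text". The function names and the 200-character limit parameter are illustrative, and no socket is involved: the function just returns the KICK lines that would be sent.

```python
def parse_privmsg(line):
    """Return (nick, target, text) for a PRIVMSG line, or None for anything else."""
    line = line.rstrip("\r\n")
    if not line.startswith(":"):
        return None
    prefix, _, rest = line[1:].partition(" ")
    command, _, params = rest.partition(" ")
    # Check the command field itself, not whether "PRIVMSG" appears anywhere.
    if command != "PRIVMSG":
        return None
    target, _, text = params.partition(" :")
    nick = prefix.split("!", 1)[0]
    return nick, target, text

def flood_kicks(data, limit=200):
    """Build a KICK command for every message in `data` with at least `limit` characters."""
    kicks = []
    for line in data.split("\r\n"):  # handle every line, not only the first
        parsed = parse_privmsg(line)
        if parsed and len(parsed[2]) >= limit:
            nick, target, _ = parsed
            kicks.append("KICK {0} {1} :flooding\r\n".format(target, nick))
    return kicks

sample = (":bob!b@host PRIVMSG #chan :hi\r\n"
          ":eve!e@host PRIVMSG #chan :" + "x" * 250 + "\r\n"
          ":al!a@host PRIVMSG #chan :I typed PRIVMSG\r\n")
kicks = flood_kicks(sample)
print(kicks)
```

Each returned string could then be passed to sck.send() in the bot's receive loop.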
