Need Suggestion. - python

How to debug "NameError: global name 'X' is not defined" in Python? I am fairly new to Python. I am using Jupyter Notebook with Python 2.7 to execute code, and I am getting the following error.
My code:
logFile = "NASAlog.txt"
def parseLogs():
    parsed_logs=(sc
                 .textFile(logFile)
                 .map(parseApacheLogLine)
                 .cache())
    access_logs = (parsed_logs
                   .filter(lambda s: s[1] == 1)
                   .map(lambda s: s[0])
                   .cache())
    failed_logs = (parsed_logs
                   .filter(lambda s: s[1] == 0)
                   .map(lambda s: s[0]))
    failed_logs_count = failed_logs.count()
    if failed_logs_count > 0:
        print 'Number of invalid logline: %d' % failed_logs.count()
        for line in failed_logs.take(20):
            print 'Invalid logline: %s' % line
    print 'Read %d lines, successfully parsed %d lines, failed to parse %d lines' % (parsed_logs.count(), access_logs.count(), failed_logs.count())
    return parsed_logs, access_logs, failed_logs
parsed_logs, access_logs, failed_logs = parseLogs()
ERROR
> NameError Traceback (most recent call last)
> <ipython-input-18-b365aa793252> in <module>()
> 24 return parsed_logs, access_logs, failed_logs
> 25
> ---> 26 parsed_logs, access_logs, failed_logs = parseLogs()
>
> <ipython-input-18-b365aa793252> in parseLogs()
> 2
> 3 def parseLogs():
> ----> 4 parsed_logs=(sc
> 5 .textFile(logFile)
> 6 .map(parseApacheLogLine)
>
> NameError: global name 'sc' is not defined

The problem is that you never defined sc, so Python can't find it.
There are several possible reasons:
- Python is case-sensitive. Did you define SC or Sc somewhere instead of sc?
- You defined sc inside another function (one other than parseLogs()). A variable assigned inside a function is local to it and is only available to the code in that function. Add the line global sc at the top of the function that assigns sc, so that the name is created at module level and can be used everywhere in your code.
- You simply did not define sc at all. The PySpark shell creates sc (the SparkContext) for you, but a plain Jupyter kernel does not, so you have to create it yourself before calling parseLogs().
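A minimal sketch of the scoping point above, using a plain string as a stand-in for the real SparkContext. The function names and the names ctx and shared_ctx are made up for illustration and are not part of the question's code.

```python
def make_context_locally():
    # 'ctx' only exists while this function runs; it is a local name.
    ctx = "stand-in for a SparkContext"


def make_context_globally():
    # 'global' makes the assignment create a module-level name.
    global shared_ctx
    shared_ctx = "stand-in for a SparkContext"


def use_local():
    # Looks up 'ctx' at module level, where it was never defined.
    return ctx


def use_global():
    return shared_ctx


make_context_locally()
try:
    use_local()
    local_visible = True
except NameError as err:
    local_visible = False
    print("NameError:", err)

make_context_globally()
global_value = use_global()
print(global_value)
```

Returning the object from the function that creates it and passing it to parseLogs() as an argument would avoid the global statement altogether, which is usually easier to follow.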

Related

Why "NameError: name 'product_id_list' is not defined"?

I wrote this and I don't know why product_id_list is not defined when I defined it just a few lines earlier.
Any suggestions? I think the indentation is all right, so I'm out of ideas, and I also searched around without luck.
Thank you!
def make_dataSet_rowWise(reorder_product):
print('unique Product in dataset = ', len(reorder_product.product_id.unique()))
print('unique order_id in dataset = ', len(reorder_product.order_id.unique()))
product_id_list = reorder_product.product_id.unique().tolist()
product_id_list.append("order_id")
product_id_dict = {}
i = 0
for prod_id in product_id_list:
product_id_dict[prod_id] = i
i = i+1
product_id_df = pd.Dataframe(columns = product_id_list)
row_list_all = []
order_id_list = reorder_product.order_id.unique()
i = 1
for id in order_id_list:
#print(i)
i = i+1
np_zeros = np.zeros(shape = [len(product_id_list)-1])
ordered_product_list = reorder_product.loc[reorder_product.order_id == id]["product_id"].tolist()
for order_prod in ordered_product_list:
np_zeros[product_id_dict.get(order_prod)] = 1
row_list = np_zeros.tolist()
row_list.append(id)
row_list_all.append(row_list)
return (row_list_all, product_id_list)
df_row_wise = make_dataSet_rowWise(reorder_product_99Pct)
product_id_df = pd.DataFrame(df_row_wise[0], columns = df_row_wise[1])
product_id_df.head()
The error I have is this one:
NameError Traceback (most recent call last)
<ipython-input-343-07bcac1b3b48> in <module>
7 i = 0
8
----> 9 for prod_id in product_id_list:
10 product_id_dict[prod_id] = i
11 i = i+1
NameError: name 'product_id_list' is not defined
As already mentioned in the other answers, your indentation is wrong.
My recommendation is to use an IDE like VSCode; there is also a free web version at https://vscode.dev/
With this kind of IDE you can see where the indentation is wrong (see line 27 in the screenshot).
The three for loops are also indented incorrectly. The correct indentation is shown below.
I think your indentation may be wrong: with your indentation, the for loops and the return statement are outside the function, so I indented them so that they are part of the function again.
import numpy as np
import pandas as pd

def make_dataSet_rowWise(reorder_product):
    print('unique Product in dataset = ', len(reorder_product.product_id.unique()))
    print('unique order_id in dataset = ', len(reorder_product.order_id.unique()))
    product_id_list = reorder_product.product_id.unique().tolist()
    product_id_list.append("order_id")
    # Map every column name to its position in a row
    product_id_dict = {}
    i = 0
    for prod_id in product_id_list:
        product_id_dict[prod_id] = i
        i = i+1
    product_id_df = pd.DataFrame(columns = product_id_list)  # DataFrame, not Dataframe
    row_list_all = []
    order_id_list = reorder_product.order_id.unique()
    i = 1
    for id in order_id_list:
        #print(i)
        i = i+1
        # One slot per product; the order id is appended afterwards
        np_zeros = np.zeros(shape = [len(product_id_list)-1])
        ordered_product_list = reorder_product.loc[reorder_product.order_id == id]["product_id"].tolist()
        for order_prod in ordered_product_list:
            np_zeros[product_id_dict.get(order_prod)] = 1
        row_list = np_zeros.tolist()
        row_list.append(id)
        row_list_all.append(row_list)
    return (row_list_all, product_id_list)
I'm new here, but I think you either need to define the variable outside the scope of
def make_dataSet_rowWise(reorder_product):
or indent the for loops so that they are inside
make_dataSet_rowWise
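To see why a dedented loop produces this NameError, here is a small sketch that runs two tiny source snippets with exec. The function name build and the variables items and total are hypothetical stand-ins for the code in the question.

```python
import textwrap

# The loop is dedented, so it sits at module level, where the
# function's local 'items' does not exist.
broken_source = textwrap.dedent('''
    def build():
        items = [1, 2, 3]
    total = 0
    for item in items:
        total += item
''')

# Everything that uses 'items' is indented into the function body.
fixed_source = textwrap.dedent('''
    def build():
        items = [1, 2, 3]
        total = 0
        for item in items:
            total += item
        return total
''')

try:
    exec(broken_source, {})
    error_message = ""
except NameError as err:
    error_message = str(err)
print("broken version:", error_message)

namespace = {}
exec(fixed_source, namespace)
fixed_total = namespace["build"]()
print("fixed version returns:", fixed_total)
```

In the broken snippet the def statement only defines the function; its body never runs, and even if it did, items would still be local to it.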

Pyspark - name 'when' is not defined

Could someone please help me with the below.
joinDf = join_df2(df_tgt_device_dim.withColumn("hashvalue", F.sha2(F.concat_ws(",", *valColumns), 256)).alias("target"),
df_final.withColumn("hashvalue", F.sha2(F.concat_ws(",", *valColumns), 256)).alias("source"),
conditions,
"full_outer",
keyColumns)
deltaDf = get_active_records(joinDf, common_cols, "Type2")
wind_spc = Window.partitionBy(*keyColumns).orderBy(col("effective_start_ts").desc())
df_device_new = deltaDf.withColumn("Rank", F.row_number().over(wind_spc))
deltaDf_final = df_device_new .filter( col("diff") != 'unchanged_act_records').withColumn("crnt_ind",when(df_device_new .Rank == 1 ,lit('Y'))\
.when(df_device_new.Rank != 1 ,lit('N'))).drop("Rank")
deltaDf_final.union(deltaDf.filter(col("diff") == 'unchanged_act_records').withColumn("crnt_ind",lit('N'))).createOrReplaceTempView(f"device_delta")
Below is the error.
NameError: name 'when' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<command-104590> in <module>
17 wind_spc = Window.partitionBy(*keyColumns).orderBy(col("effective_start_ts").desc())
18 df_device_new = deltaDf.withColumn("Rank", F.row_number().over(wind_spc))
---> 19 deltaDf_final = df_device_new .filter( col("diff") != 'unchanged_act_records').withColumn("crnt_ind",when(df_device_new .Rank == 1 ,lit('Y'))\
.when(df_device_new.Rank != 1
21 deltaDf_final.union(deltaDf.filter(col("diff") == 'unchanged_act_records').withColumn("crnt_ind",lit('N'))).createOrReplaceTempView(f"device_delta")
NameError: name 'when' is not defined
I have tried F.when, but it did not work.
Could someone please assist thank you.
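The NameError means that when was never imported. Either import it with from pyspark.sql.functions import when, or call it as F.when, since F is already used for the functions module in your code. In addition, use .otherwise for the fallback branch instead of a second .when; .otherwise takes a single value, not a condition.
df_device_new = deltaDf.withColumn("Rank", F.row_number().over(wind_spc))
deltaDf_final = (df_device_new
                 .filter(col("diff") != 'unchanged_act_records')
                 .withColumn("crnt_ind", F.when(df_device_new.Rank == 1, lit('Y')).otherwise(lit('N')))
                 .drop("Rank"))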

matplotlib xlim TypeError: '>' not supported between instances of 'int' and 'list'

This is the original repo I'm trying to run on my computer: https://github.com/kreamkorokke/cs244-final-project
import os
import matplotlib.pyplot as plt
import argparse
from attacker import check_attack_type
IMG_DIR = "./plots"
def read_lines(f, d):
    lines = f.readlines()[:-1]
    for line in lines:
        typ, time, num = line.split(',')
        if typ == 'seq':
            d['seq']['time'].append(float(time))
            d['seq']['num'].append(float(num))
        elif typ == 'ack':
            d['ack']['time'].append(float(time))
            d['ack']['num'].append(float(num))
        else:
            raise "Unknown type read while parsing log file: %s" % typ
def main():
    parser = argparse.ArgumentParser(description="Plot script for plotting sequence numbers.")
    parser.add_argument('--save', dest='save_imgs', action='store_true',
                        help="Set this to true to save images under specified output directory.")
    parser.add_argument('--attack', dest='attack',
                        nargs='?', const="", type=check_attack_type,
                        help="Attack name (used in plot names).")
    parser.add_argument('--output', dest='output_dir', default=IMG_DIR,
                        help="Directory to store plots.")
    args = parser.parse_args()
    save_imgs = args.save_imgs
    output_dir = args.output_dir
    attack_name = args.attack
    if save_imgs and attack_name not in ['div', 'dup', 'opt'] :
        print("Attack name needed for saving plot figures.")
        return
    normal_log = {'seq':{'time':[], 'num':[]}, 'ack':{'time':[], 'num':[]}}
    attack_log = {'seq':{'time':[], 'num':[]}, 'ack':{'time':[], 'num':[]}}
    normal_f = open('log.txt', 'r')
    attack_f = open('%s_attack_log.txt' % attack_name, 'r')
    read_lines(normal_f, normal_log)
    read_lines(attack_f, attack_log)
    if attack_name == 'div':
        attack_desc = 'ACK Division'
    elif attack_name == 'dup':
        attack_desc = 'DupACK Spoofing'
    elif attack_name == 'opt':
        attack_desc = 'Optimistic ACKing'
    else:
        raise 'Unknown attack type: %s' % attack_name
    norm_seq_time, norm_seq_num = normal_log['seq']['time'], normal_log['seq']['num']
    norm_ack_time, norm_ack_num = normal_log['ack']['time'], normal_log['ack']['num']
    atck_seq_time, atck_seq_num = attack_log['seq']['time'], attack_log['seq']['num']
    atck_ack_time, atck_ack_num = attack_log['ack']['time'], attack_log['ack']['num']
    plt.plot(norm_seq_time, norm_seq_num, 'b^', label='Regular TCP Data Segments')
    plt.plot(norm_ack_time, norm_ack_num, 'bx', label='Regular TCP ACKs')
    plt.plot(atck_seq_time, atck_seq_num, 'rs', label='%s Attack Data Segments' % attack_desc)
    plt.plot(atck_ack_time, atck_ack_num, 'r+', label='%s Attack ACKs' % attack_desc)
    plt.legend(loc='upper left')
    x = max(max(norm_seq_time, norm_ack_time),max(atck_seq_time, atck_ack_time))
    y = max(max(norm_seq_num, norm_ack_num),max(atck_seq_num, atck_ack_num))
    plt.xlim(0, x)
    plt.ylim(0,y)
    plt.xlabel('Time (s)')
    plt.ylabel('Sequence Number (Bytes)')
    if save_imgs:
        # Save images to figure/
        if not os.path.exists(output_dir):
            os.makedirs(output_dir)
        plt.savefig(output_dir + "/" + attack_name)
    else:
        plt.show()
    normal_f.close()
    attack_f.close()
if __name__ == "__main__":
    main()
After running this I get this error:
Traceback (most recent call last):
File "plot.py", line 85, in <module>
main()
File "plot.py", line 66, in main
plt.xlim(0, a)
File "/usr/lib/python3/dist-packages/matplotlib/pyplot.py", line 1427, in xlim
ret = ax.set_xlim(*args, **kwargs)
File "/usr/lib/python3/dist-packages/matplotlib/axes/_base.py", line 3267, in set_xlim
reverse = left > right
TypeError: '>' not supported between instances of 'int' and 'list'
Done! Please check ./plots for all generated plots.
How can I solve this problem? Or, better yet, is there another way of running this project? I installed matplotlib with the pip3 install matplotlib command (same with scapy). My main Python version is Python 2 right now, but I run the project with Python 3. Could that be the issue? What am I missing? Or is it about mininet itself?
The problem is in this line:
x = max(max(norm_seq_time, norm_ack_time),max(atck_seq_time, atck_ack_time))
IIUC, you want to assign to x the maximum value among all four lists. However, when you pass two lists to max, as in max(norm_seq_time, norm_ack_time), Python compares the lists element by element (lexicographically) and returns whichever list compares greater, not the highest value found in either list. So x ends up being a list, and plt.xlim(0, x) fails when matplotlib evaluates left > right.
Instead, you can do something like:
x = max(norm_seq_time + norm_ack_time + atck_seq_time + atck_ack_time)
This concatenates the four lists into a single one, and max then returns the highest value among all of them. You will want to do the same for the calculation of y.
If this is not what you wanted, or if you have any further issues, please let us know.
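A small sketch of the difference, with made-up sample values in place of the parsed log data. It uses itertools.chain plus the default argument of max (available in Python 3.4 and later) so that empty lists do not raise an error; that combination is one possible variation, not what the original script does.

```python
from itertools import chain

# Made-up sample data standing in for the values read from the log files
norm_seq_time = [0.1, 0.5, 2.0]
norm_ack_time = [0.2, 3.5]
atck_seq_time = []
atck_ack_time = [0.3, 1.0]

# max() on two lists compares them lexicographically and returns a list
list_result = max(norm_seq_time, norm_ack_time)
print(list_result)

# Flatten all lists first, then take the largest single value;
# default=0 covers the case where every list is empty
x_max = max(chain(norm_seq_time, norm_ack_time, atck_seq_time, atck_ack_time), default=0)
print(x_max)

empty_max = max(chain([], []), default=0)
print(empty_max)
```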
With the help of a friend, we solved this problem by changing that part of the code to this:
max_norm_seq_time = max(norm_seq_time) if len(norm_seq_time) > 0 else 0
max_norm_ack_time = max(norm_ack_time) if len(norm_ack_time) > 0 else 0
max_atck_seq_time = max(atck_seq_time) if len(atck_seq_time) > 0 else 0
max_atck_ack_time = max(atck_ack_time) if len(atck_ack_time) > 0 else 0
x = max((max_norm_seq_time, max_norm_ack_time,
         max_atck_seq_time, max_atck_ack_time))
plt.xlim([0,x])
max_norm_seq_num = max(norm_seq_num) if len(norm_seq_num) > 0 else 0
max_norm_ack_num = max(norm_ack_num) if len(norm_ack_num) > 0 else 0
max_atck_seq_num = max(atck_seq_num) if len(atck_seq_num) > 0 else 0
max_atck_ack_num = max(atck_ack_num) if len(atck_ack_num) > 0 else 0
plt.ylim([0, max((max_norm_seq_num, max_norm_ack_num,
                  max_atck_seq_num, max_atck_ack_num))])
Posting this here in case anyone else needs it.

Gurobi Python: Unsupported type (<class 'tuple'>) for LinExpr addition argument Error

I am trying to write the constraint shown in the image, but I am getting the error below:
> --------------------------------------------------------------------------- GurobiError Traceback (most recent call
> last) <ipython-input-112-d0e0b7b1cb5e> in <module>()
> ----> 1 Boiler_capacity = m.addConstrs((boiler_produced_thermal[t] <= boiler_thermal_max for t in time_slots), name = "Boiler_capacity")
>
> model.pxi in gurobipy.Model.addConstrs
> (../../src/python/gurobipy.c:89458)()
>
> model.pxi in gurobipy.Model.addConstr
> (../../src/python/gurobipy.c:87963)()
>
> linexpr.pxi in gurobipy.LinExpr.__sub__
> (../../src/python/gurobipy.c:34728)()
>
> linexpr.pxi in gurobipy.LinExpr.__add__
> (../../src/python/gurobipy.c:34333)()
>
> linexpr.pxi in gurobipy.LinExpr.add
> (../../src/python/gurobipy.c:31162)()
>
> GurobiError: Unsupported type (<class 'tuple'>) for LinExpr addition
> argument
What I have tried so far:
Boiler_capacity = m.addConstrs((boiler_produced_thermal[t] <= boiler_thermal_max for t in time_slots), name = "Boiler_capacity")
where:
boiler_produced_thermal is a variable indexed by time slot
boiler_thermal_max = 21000 is assigned an integer value
time_slots = ['k1', 'k2','k3', 'k4', 'k5']
This code works:
If I use the value itself instead of the variable name, it works, but I don't understand the actual reason behind this.
Boiler_capacity = m.addConstrs((boiler_produced_thermal[t] <= 21000 for t in time_slots), name = "Boiler_capacity")
Can somebody help me understand the problem?
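The error message says that a tuple reached the constraint expression, so boiler_thermal_max was probably a tuple rather than an integer when the constraint was built. One common way this happens, and it is only a guess because the actual assignment is not shown, is a stray trailing comma. The sketch below uses plain Python only (no gurobipy), and the _tuple and _int suffixes and the name limit are made up for illustration.

```python
# A trailing comma after the value creates a one-element tuple
boiler_thermal_max_tuple = 21000,
boiler_thermal_max_int = 21000

print(type(boiler_thermal_max_tuple))
print(type(boiler_thermal_max_int))

# Checking the type before building constraints catches the problem early
limit = boiler_thermal_max_tuple
if isinstance(limit, tuple):
    limit = limit[0]
print(limit)
```

Printing type(boiler_thermal_max) just before the addConstrs call would confirm or rule out this explanation.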

Python "local variable 'pc' referenced before assignment" issue in basic decompiler

A friend and I are working on a basic proof-of-concept decompiler that takes a string of hex values and returns a more readable version. Our code is listed below:
testProgram = "00 00 FF 55 47 00"
# should look like this
# NOP
# NOP
# MOV 55 47
# NOP
pc = 0
output = ""
def byte(int):
    return testProgram[3 * int:3 * int + 2]
def interpret():
    currentByte = byte(pc)
    if currentByte == "00":
        pc += 1
        return "NOP"
    if currentByte == "FF":
        returner = "MOV " + byte(pc + 1) + " " + byte(pc + 2)
        pc += 3
        return returner
while(byte(pc) != ""):
    output += interpret() + "\n"
print(output)
However, running the code gives us this:
Traceback (most recent call last):
File "BasicTest.py", line 62, in <module>
output += interpret() + "\n"
File "BasicTest.py", line 50, in interpret
currentByte = byte(pc)
UnboundLocalError: local variable 'pc' referenced before assignment
Because pc is a global variable, shouldn't it be usable from anywhere? Any and all help is appreciated. If you spot other errors, feel free to leave a comment pointing them out!
Been seeing this a lot lately. When you do
if currentByte == "00":
    pc += 1 # <----------
    return "NOP"
you're assigning to pc inside the function, so Python treats pc as a local variable for the whole body of interpret(). The earlier read in byte(pc) then fails because that local has no value yet. If you want to modify the global pc, you need to declare that explicitly at the top of the function:
global pc
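Here is a sketch of the whole program with that declaration in place. The names TEST_PROGRAM, byte_at and decompile, and the ValueError for unknown opcodes, are additions for this example and are not in the original code.

```python
TEST_PROGRAM = "00 00 FF 55 47 00"
pc = 0


def byte_at(index):
    # Each byte takes two hex digits plus a separating space
    return TEST_PROGRAM[3 * index:3 * index + 2]


def interpret():
    global pc  # without this line, 'pc += 1' makes pc a local name
    current = byte_at(pc)
    if current == "00":
        pc += 1
        return "NOP"
    if current == "FF":
        text = "MOV " + byte_at(pc + 1) + " " + byte_at(pc + 2)
        pc += 3
        return text
    # Assumption: fail loudly instead of silently returning None
    raise ValueError("unknown opcode: %r" % current)


def decompile():
    global pc
    pc = 0
    lines = []
    # Slicing past the end of the string yields "", which ends the loop
    while byte_at(pc) != "":
        lines.append(interpret())
    return "\n".join(lines)


result = decompile()
print(result)
```

Passing pc into interpret() and returning the updated value alongside the text would avoid the global entirely, at the cost of a slightly longer call site.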
