Clicking a Flask button runs Python code and unpickles a file with a tokenizer - python

I am building a website with Flask, and when a button is clicked I try to run my machine learning code, which lives in a different .py file. But when I click that button I get this error:
AttributeError: Can't get attribute 'Tokenizer' on <module '__main__' from 'c:filepath'
I've been told it's because the Tokenizer class can't be found when the file is unpickled. But I'm not sure why, because when I run my machine learning code on its own it works fine; it's only when I trigger it through the Flask button that I get this error. Any help would be much appreciated.
The function I'm trying to run is start("no") from a file called Music_Generator_2.py.
app.py
@app.route('/generated')
def generated():
    print("start")
    Music_Generator_2.start("no")  # from Music_Generator_2
    print("success")
    return render_template('index.html', tested_generator="generated")
The error occurs on the second line of this code
Music_Generator_2.py
model = tf.keras.models.load_model("model_25epochs.h5", custom_objects=SeqSelfAttention.get_custom_objects())
tokenizer = pickle.load(open("tokenizer25.p", "rb"))
#generate from random
max_generate = 200
unique_notes = tokenizer.unique_word
seq_len = 200
generate = generate_from_random(unique_notes, seq_len)
generate = generate_notes(generate, model, unique_notes, max_generate, seq_len)
write_midi_file(generate, tokenizer, "rand test.mid", start=seq_len - 1, fs=7, max_generate=max_generate)
#generate from a note
max_generate = 300
unique_notes = tokenizer.unique_word # same as above
seq_len = 300
generate = generate_from_one_note(tokenizer, "72")
generate = generate_notes(generate, model, unique_notes, max_generate, seq_len)
This is the code that I'm trying to run in my machine learning program
Music_Generator_2.py
Tokenizer class
class Tokenizer:
    def __init__(self):
        self.notes_to_index = {}
        self.index_to_notes = {}
        self.num_word = 0
        self.unique_word = 0
        self.note_freq = {}

    '''transform a list of notes from strings to indexes
    list_array is a list of notes in string format'''
    def transform(self, list_array):
        transformed = []
        for i in list_array:
            transformed.append([self.notes_to_index[note] for note in i])
        return np.array(transformed, dtype=np.int32)

    '''partial fit on the dictionary of the tokenizer
    notes is a list of notes'''
    def partial_fit(self, notes):
        for note in notes:
            note_str = ",".join(str(n) for n in note)
            if note_str in self.note_freq:
                self.note_freq[note_str] += 1
                self.num_word += 1
            else:
                self.note_freq[note_str] = 1
                self.unique_word += 1
                self.num_word += 1
                self.notes_to_index[note_str] = self.unique_word
                self.index_to_notes[self.unique_word] = note_str

    '''add a new note to the dictionary
    note is the new note to be added as a string'''
    def add_new_note(self, note):
        assert note not in self.notes_to_index
        self.unique_word += 1
        self.notes_to_index[note] = self.unique_word
        self.index_to_notes[self.unique_word] = note
Solved: I moved my Tokenizer class into its own .py file and then I just imported that file in both app.py and Music_Generator_2.py. I found the solution from here
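For anyone hitting the same error, here is a minimal sketch of that fix. The module name tokenizer.py is an assumption, and the pickle may need to be regenerated after the move, since a pickle written while the class lived in __main__ records __main__.Tokenizer. The underlying rule: pickle stores the defining module path of a class and re-imports it on load, so the class must be importable under that same path in whichever process calls pickle.load.
# tokenizer.py (assumed module name)
import numpy as np

class Tokenizer:
    ...  # the class exactly as defined above

# Music_Generator_2.py
import pickle
from tokenizer import Tokenizer  # the class now resolves as tokenizer.Tokenizer

tokenizer = pickle.load(open("tokenizer25.p", "rb"))

# app.py
from tokenizer import Tokenizer  # same import, so the Flask process can unpickle too
import Music_Generator_2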

This could be an issue with how you are running Flask. Are you running it inside a virtualenv? If so, make sure that the correct pip packages are installed. I would make sure the environment in which you run Flask is identical to the one where your code works when run on its own.

Related

Get file location with win32gui and run a script while a specified program is running

I am trying to program something with win32gui, and I need to know how I can get the file location of the file I am running in Visual Studio Code, Word or Wordpad. I used win32gui.GetOpenFileName, but it doesn't work anymore after I checked the type of variable I get. My question is: is this the best way to get the file location, or is there a better way? If it is the best way, does someone know why it's not working anymore?
My second question is about running the script. I want to run the script automatically as soon as I start Word, Visual Studio Code or Wordpad. What is the best way to do this?
I tried finding solutions online, but at the moment I am lacking inspiration. Thank you for your help.
Update>
I got this far with the code. I want to change it so that I can get the file automatically, and I want it to start with Visual Studio Code instead of starting it manually every time.
import win32gui
import math

e = math.e
maxwidth = 1874
maxheight = 950
x = 450
y = 1500 + round(maxwidth/(1 + e**(-0.009*(0 - 57.45))))
#minwidth = 700
#minheight = 505
k = 0
l = 0

def main():
    win32gui.EnumWindows(callback, None)

def callback(hwnd, extra):
    windowopen = False
    name = win32gui.GetWindowText(hwnd)
    if name.endswith('Visual Studio Code') and name.startswith('_test.py'):
        windowopen = True
        print(name)
    while windowopen:
        j = 0
        for i in open(r'D:\_test.py', 'r'):  # raw string so the backslash stays literal
            j += 1
        k = round(maxwidth/(1 + e**(-0.009*(j - 57.45))))
        l = round(maxheight/(1 + e**(-0.009*(j - 57.45))))
        yuse = y - k
        win32gui.MoveWindow(hwnd, yuse, x, k, l, True)
        print(name)
        file = win32gui.GetOpenFileName  # note: without (), this is the function object itself, not a dialog result
        print(file)

if __name__ == '__main__':
    main()
Update>
I imported os and then used the os.path.dirname function to get my file location:
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
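For the second part (running the script automatically once Word, Visual Studio Code, or Wordpad starts), one possible approach — a sketch, not from the original post, assuming the psutil package and these Windows process names — is to poll the process list until a target executable appears:
import time
import psutil  # assumed dependency: pip install psutil

TARGETS = {'Code.exe', 'WINWORD.EXE', 'wordpad.exe'}  # assumed process names

def wait_for_target(poll_seconds=2.0):
    # Block until one of the target programs is running, then return its name(s).
    while True:
        running = {p.info['name'] for p in psutil.process_iter(['name'])}
        found = running & TARGETS
        if found:
            return found
        time.sleep(poll_seconds)

if __name__ == '__main__':
    print('detected:', wait_for_target())
    main()  # hand off to the window-resizing logic above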

Python 3: How to access the dictionary of a running module from another module?

I have a working Flask server that runs NLP models. While 5 devices were connected to it there were no problems; then we increased this number to 15, and the performance problems began. We increased the number of cores on this server from 12 to 24, but only got a performance increase of around 20%. We then tried another approach: we launched 3 servers via Gunicorn and taskset --cpu-list (giving 8 cores to each server) on different ports, with a limit of 5 devices per server. Here the performance almost doubled, becoming approximately what the 12-core server managed with 5 devices. But this created another problem: each server loads its models in isolation from the others, so RAM consumption tripled.
Models are loaded into the dictionary by this code
from flask import Flask
import torch
import torch.nn as nn
import torchvision
import os
import learning

app = Flask(__name__)
models_list_name = os.listdir('models')
global_dict_models = {}
for token in models_list_name:
    try:
        tok, n_class, model_type, typ, h_layer = token.split('.')
        t, model = learning.get_model(model_type, typ)
        n_class = int(n_class)
        h_layer = int(h_layer)
        model = learning.NewModel(model, n_class, typ, h_layer)
        model.load_state_dict(torch.load(os.path.join('models', token), map_location=torch.device('cpu')))
        print(model, tok)
        with open(tok, 'r') as f:
            label = f.read().split('\n')
        print(label)
        global_dict_models[tok] = (t, model, label)
    except ValueError:
        pass
and when a prediction is needed, the request handler simply accesses the dictionary:
t, model, labels = global_dict_models[token]
x = t.encode_plus(text, add_special_tokens=True, max_length=512, truncation=True, padding="max_length", return_tensors='pt')
output = torch.sigmoid(model(x['input_ids'].squeeze(1), x['attention_mask'])).detach().cpu().numpy()
The models on all servers are the same. I want to make a single model-loading module that builds the dictionary once, so that each server simply accesses it through a function.
I tried to create a separate module according to this principle
from flask import Flask, request, make_response
import torch
import torch.nn as nn
import torchvision
import os
import learning

app = Flask(__name__)
models_list_name = os.listdir('models')
global_dict_models = {}

def load_models():
    for token in models_list_name:
        try:
            tok, n_class, model_type, typ, h_layer = token.split('.')
            t, model = learning.get_model(model_type, typ)
            n_class = int(n_class)
            h_layer = int(h_layer)
            model = learning.NewModel(model, n_class, typ, h_layer)
            model.load_state_dict(torch.load(os.path.join('models', token), map_location=torch.device('cpu')))
            print(model, tok)
            with open(tok, 'r') as f:
                label = f.read().split('\n')
            print(label)
            global_dict_models[tok] = (t, model, label)
        except ValueError:
            pass

def dict_models(token):
    t, model, labels = global_dict_models[token]
    return t, model, labels

if __name__ == '__main__':
    load_models()
    print('Start')
    app.run(host="0.0.0.0", port="5000", threaded=True, processes=1)
And then the problem arose that the dictionary built in one running module could not be read from another.
How can I load the dictionary once and have access to it from other modules? Or how else can this problem be solved?
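One caveat worth stating up front (not from the original question): the three Gunicorn servers are separate OS processes, so importing a module that holds the dictionary gives each process its own copy; a plain Python dict can never be shared between processes that way. A common workaround, sketched here under the assumption that a single Gunicorn instance with many workers is acceptable, is to build the dictionary once before forking, so the workers inherit it copy-on-write:
# model_store.py -- hypothetical shared-loader module (a sketch)
global_dict_models = {}

def load_models():
    ...  # the same loading loop as above

load_models()  # runs once per process, at first import; every module that
               # does `import model_store` then sees the same dict
               # *within that process*

# To share across workers, run ONE Gunicorn instance and preload the app,
# so the dict is built in the master and inherited copy-on-write:
#   gunicorn --preload --workers 24 --bind 0.0.0.0:5000 app:app
Large model buffers loaded before the fork stay shared as long as they are only read, which should keep total RAM consumption close to a single copy.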

AttributeError: 'str' object has no attribute 'name' when using Streamlit

I've been trying to replicate the demo website from this repo using Streamlit.
But I'm stuck when processing the image with the model. The error message is AttributeError: 'str' object has no attribute 'name'. But in data.py (the code that reads the image) there is no 'name' attribute. Or am I missing something here?
This is the code snippet:
streamlitdemo.py
@st.cache()
def load_model():
    gpu_ids = []
    model = create_model(gpu_ids)
    model.eval()
    return model

a = 'wave.jpg'
b = 'building.jpg'
c = 'test_samples/madoka.jpg'

def anime2sketch(img_input, load_size=512):
    img, aus_resize = read_img_path(img_input.name, load_size)
    model = load_model()
    aus_tensor = model(img)
    aus_img = tensor_to_img(aus_tensor)
    image_pil = Image.fromarray(aus_img)
    image_pil = image_pil.resize(aus_resize, Image.BICUBIC)
    return image_pil
demo.py
.
.
.
def read_img_path(path, load_size):
    """read tensors from a given image path
    Parameters:
        path (str)     -- input image path
        load_size(int) -- the input size. If <= 0, don't resize
    """
    img = Image.open(path).convert('RGB')
    aus_resize = None
    if load_size > 0:
        aus_resize = img.size
    transform = get_transform(load_size=load_size)
    image = transform(img)
    return image.unsqueeze(0), aus_resize
model.py
.
.
.
def create_model(gpu_ids=[]):
    """Create a model for anime2sketch
    hardcoding the options for simplicity
    """
    norm_layer = functools.partial(nn.InstanceNorm2d, affine=False, track_running_stats=False)
    net = UnetGenerator(3, 1, 8, 64, norm_layer=norm_layer, use_dropout=False)
    ckpt = torch.load('weights/netG.pth')
    for key in list(ckpt.keys()):
        if 'module.' in key:
            ckpt[key.replace('module.', '')] = ckpt[key]
            del ckpt[key]
    net.load_state_dict(ckpt)
    if len(gpu_ids) > 0:
        assert(torch.cuda.is_available())
        net.to(gpu_ids[0])
        net = torch.nn.DataParallel(net, gpu_ids)  # multi-GPUs
    return net
But when I hardcode the path with the a/b/c variables, the model works properly. And I've already changed read_img_path(img_input.name, load_size) to read_img_path(img_input, load_size), and then I get a FileNotFoundError: [Errno 2] No such file or directory: 'wave' error message.
This is the output when I hardcode the path.
In that repo, the author already provides a demo website, but using Gradio. When I tried to run the demo code with Gradio, it worked properly. I'm using the same code from the author, but I tweaked it a little bit.
Thank you.
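A guess at the cause, offered as a sketch rather than a confirmed fix: the hardcoded a/b/c values are plain str paths, while Streamlit's st.file_uploader returns a file-like UploadedFile object, which has a .name attribute but is not a filesystem path. One way to accept both (to_path is a hypothetical helper; read_img_path keeps its signature from demo.py above):
import tempfile

def to_path(img_input):
    # Hardcoded demo paths (a, b, c) are already filesystem paths.
    if isinstance(img_input, str):
        return img_input
    # Streamlit's UploadedFile has .name and .read(); persist the bytes
    # to a temp file so read_img_path can open them by path.
    suffix = '.' + img_input.name.rsplit('.', 1)[-1]
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(img_input.read())
        return tmp.name

def anime2sketch(img_input, load_size=512):
    img, aus_resize = read_img_path(to_path(img_input), load_size)
    ...  # rest unchanged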

How to deploy a simple neural network from A-Z in MXNet

I am trying to build a simple neural network in MXNet and deploy it on a server using mxnet-model-server.
The biggest issue is deploying the model: the model server crashes after loading the .mar file, but I have no idea what the problem could be.
I used the following code to create a custom (but very simple) neural network for testing:
from __future__ import print_function
import numpy as np
import mxnet as mx
from mxnet import nd, autograd, gluon

data_ctx = mx.cpu()
model_ctx = mx.cpu()

# fix the seed
np.random.seed(42)
mx.random.seed(42)

num_examples = 1000
X = mx.random.uniform(shape=(num_examples, 49))
y = mx.random.uniform(shape=(num_examples, 1))
dataset_train = mx.gluon.data.dataset.ArrayDataset(X, y)
dataset_test = dataset_train

data_loader_train = mx.gluon.data.DataLoader(dataset_train, batch_size=25)
data_loader_test = mx.gluon.data.DataLoader(dataset_test, batch_size=25)

num_outputs = 2
net = gluon.nn.HybridSequential()
net.hybridize()
with net.name_scope():
    net.add(gluon.nn.Dense(49, activation="relu"))
    net.add(gluon.nn.Dense(64, activation="relu"))
    net.add(gluon.nn.Dense(num_outputs))
net.collect_params().initialize(mx.init.Normal(sigma=.1), ctx=model_ctx)

softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .01})

epochs = 1
smoothing_constant = .01
for e in range(epochs):
    cumulative_loss = 0
    for i, (data, label) in enumerate(data_loader_train):
        data = data.as_in_context(model_ctx).reshape((-1, 49))
        label = label.as_in_context(model_ctx)
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)
        loss.backward()
        trainer.step(data.shape[0])
        cumulative_loss += nd.sum(loss).asscalar()
Then I exported the model using:
net.export("model_files/my_project")
The result is a .json and a .params file.
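For reference (not stated in the original post), Gluon's export writes a fixed naming scheme, so with the default epoch argument of 0 the call above should produce:
net.export("model_files/my_project")
# expected output files:
#   model_files/my_project-symbol.json
#   model_files/my_project-0000.params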
I created a signature.json
{
    "inputs": [
        {
            "data_name": "data",
            "data_shape": [
                1,
                49
            ]
        }
    ]
}
The model handler is the same as in the mxnet tutorial:
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#     http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
ModelHandler defines a base model handler.
"""

import logging
import time

class ModelHandler(object):
    """
    A base Model handler implementation.
    """

    def __init__(self):
        self.error = None
        self._context = None
        self._batch_size = 0
        self.initialized = False

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time
        :param context: Initial context contains model server system properties.
        :return:
        """
        self._context = context
        self._batch_size = context.system_properties["batch_size"]
        self.initialized = True

    def preprocess(self, batch):
        """
        Transform raw input into model input data.
        :param batch: list of raw requests, should match batch size
        :return: list of preprocessed model input data
        """
        assert self._batch_size == len(batch), "Invalid input batch size: {}".format(len(batch))
        return None

    def inference(self, model_input):
        """
        Internal inference methods
        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        return None

    def postprocess(self, inference_output):
        """
        Return predict result in batch.
        :param inference_output: list of inference output
        :return: list of predict results
        """
        return ["OK"] * self._batch_size

    def handle(self, data, context):
        """
        Custom service entry point function.
        :param data: list of objects, raw input from request
        :param context: model server context
        :return: list of outputs to be send back to client
        """
        self.error = None  # reset earlier errors
        try:
            preprocess_start = time.time()
            data = self.preprocess(data)
            inference_start = time.time()
            data = self.inference(data)
            postprocess_start = time.time()
            data = self.postprocess(data)
            end_time = time.time()

            metrics = context.metrics
            metrics.add_time("PreprocessTime", round((inference_start - preprocess_start) * 1000, 2))
            metrics.add_time("InferenceTime", round((postprocess_start - inference_start) * 1000, 2))
            metrics.add_time("PostprocessTime", round((end_time - postprocess_start) * 1000, 2))
            return data
        except Exception as e:
            logging.error(e, exc_info=True)
            request_processor = context.request_processor
            request_processor.report_status(500, "Unknown inference error")
            return [str(e)] * self._batch_size
Then I created the .mar file using:
model-archiver --model-name my_project --model-path my_project --handler ssd_service:handle
Starting the model on the server:
mxnet-model-server --start --model_store my_project --models ssd=my_project.mar
I literally followed every tutorial on:
https://github.com/awslabs/mxnet-model-server
However, the server crashes: the worker dies, the backend worker dies, workers are disconnected, and loading fails with Load model failed: ssd, error: worker died.
I have absolutely no clue what to do so I would be very glad if you helped me out!
Best
I tried out your code and it works fine on my laptop. If I run: curl -X POST http://127.0.0.1:8080/predictions/ssd -F "data=[0 1 2 3 4]", I get: OK%
I can only guess why it doesn't work on your machine:
Notice that the model-store argument should be written with - and not with _ as in your example. My command to run mxnet-model-server looks like this: mxnet-model-server --start --model-store ./ --models ssd=my_project.mar
Which version of mxnet-model-server do you use? The latest is 1.0.2, but I have 1.0.1 installed, so maybe you want to downgrade and try it out: pip install mxnet-model-server==1.0.1.
The same question for the MXNet version. In my case I use a nightly build, which I get via pip install mxnet --pre. I see that your model is very basic, so it shouldn't depend much... Nevertheless, install 1.4.0 (the current one) just in case.
Not sure, but hope it will help you.

IPython cluster and PicklingError

My problem seems to be similar to this thread; however, while I think I am following the advised method, I still get a PicklingError. When I run my process locally, without sending it to an IPython cluster engine, the function works fine.
I am using zipline with IPython's notebook, so I first create a class based on zipline.TradingAlgorithm
Cell [ 1 ]
from IPython.parallel import Client
rc = Client()
lview = rc.load_balanced_view()
Cell [ 2 ]
%%px --local
# this ensures that the class and modules exist on each engine
import zipline as zpl
import numpy as np

class Agent(zpl.TradingAlgorithm):  # must define initialize and handle_data methods
    def initialize(self):
        self.valueHistory = None
        pass

    def handle_data(self, data):
        for security in data.keys():
            ## Just randomly buy/sell/hold for each security
            coinflip = np.random.random()
            if coinflip < .25:
                self.order(security, 100)
            elif coinflip > .75:
                self.order(security, -100)
        pass
Cell [ 3 ]
from zipline.utils.factory import load_from_yahoo
start = '2013-04-01'
end = '2013-06-01'
sidList = ['SPY', 'GOOG']
data = load_from_yahoo(stocks=sidList, start=start, end=end)

agentList = []
for i in range(3):
    agentList.append(Agent())

def testSystem(agent, data):
    results = agent.run(data)  #-- This is how the zipline based class is executed
    #-- next I'm just storing the final value of the test so I can plot later
    agent.valueHistory.append(results['portfolio_value'][len(results['portfolio_value'])-1])
    return agent

for i in range(10):
    tasks = []
    for agent in agentList:
        #agent = testSystem(agent, data) ## On its own, this works!
        #-- To test, uncomment the above line and comment out the next two
        tasks.append(lview.apply_async(testSystem, agent, data))
    agentList = [ar.get() for ar in tasks]

for agent in agentList:
    plot(agent.valueHistory)
Here is the Error produced:
PicklingError                             Traceback (most recent call last)
/Library/Python/2.7/site-packages/IPython/kernel/zmq/serialize.pyc in serialize_object(obj, buffer_threshold, item_threshold)
    100         buffers.extend(_extract_buffers(cobj, buffer_threshold))
    101
--> 102     buffers.insert(0, pickle.dumps(cobj, -1))
    103     return buffers
    104
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
If I override the run() method from zipline.TradingAlgorithm with something like:
def run(self, data):
    return 1
then the passing off to the engines works, but obviously the guts of the test are not performed. Trying something like this...
def run(self, data):
    return zpl.TradingAlgorithm.run(self, data)
...results in the same PicklingError. As run is a method internal to zipline.TradingAlgorithm and I don't know everything that it does, how would I make sure it is passed through?
It looks like the zipline TradingAlgorithm object is not pickleable after it has been run:
import pickle
import zipline as zpl

class Agent(zpl.TradingAlgorithm):  # must define initialize and handle_data methods
    def handle_data(self, data):
        pass

agent = Agent()
pickle.dumps(agent)[:32]  # ok
agent.run(data)
pickle.dumps(agent)[:32]  # fails
But this suggests to me that you should be creating the Agents on the engines, and only passing data / results back and forth (ideally, not passing data across at all, or at most once).
Minimizing data transfers might look something like this:
define the class:
%%px
import zipline as zpl
import numpy as np

class Agent(zpl.TradingAlgorithm):  # must define initialize and handle_data methods
    def initialize(self):
        self.valueHistory = []

    def handle_data(self, data):
        for security in data.keys():
            ## Just randomly buy/sell/hold for each security
            coinflip = np.random.random()
            if coinflip < .25:
                self.order(security, 100)
            elif coinflip > .75:
                self.order(security, -100)
load the data
%%px
from zipline.utils.factory import load_from_yahoo
start = '2013-04-01'
end = '2013-06-01'
sidList = ['SPY', 'GOOG']
data = load_from_yahoo(stocks=sidList, start=start, end=end)
agent = Agent()
and run the code:
from IPython import parallel  # for parallel.Reference

def testSystem(agent, data):
    results = agent.run(data)  #-- This is how the zipline based class is executed
    #-- next I'm just storing the final value of the test so I can plot later
    agent.valueHistory.append(results['portfolio_value'][len(results['portfolio_value'])-1])

# create references to the remote agent / data objects
agent_ref = parallel.Reference('agent')
data_ref = parallel.Reference('data')

tasks = []
for i in range(10):
    for j in range(len(rc)):
        tasks.append(lview.apply_async(testSystem, agent_ref, data_ref))

# wait for the tasks to complete
[t.get() for t in tasks]
And plot the results, never fetching the agents themselves
%matplotlib inline
import matplotlib.pyplot as plt

for history in rc[:].apply_async(lambda: agent.valueHistory):
    plt.plot(history)
This is not quite the same as the code you shared - yours bounces three agents back and forth across all your engines, whereas this has one agent per engine. I don't know enough about zipline to say whether that's useful to you or not.
