Question: Model downloading with Custom Vision & Python API - python

I am new to Microsoft Custom Vision and I am working on an integration of the Microsoft Azure Custom Vision API using Jupyter notebooks/Python. I was able to upload images, tag them automatically, and train the first iterations. However, when I tried to download a Dockerfile of the trained iteration/model, I got stuck on the export step. Using the function export_iteration I ended up with an msrest.pipeline.ClientRawResponse object. I think the export is currently only sitting in the exporting queue. How do I access this queue element to download it to my local system?
PS: I am working with a General (compact) model format so it should be exportable.
Example code:
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient

# Initialize the training client
training_key = "your-training-key"
ENDPOINT = "your-endpoint"
c_plat = CustomVisionTrainingClient(training_key, ENDPOINT)

# List all projects you have
projects = c_plat.get_projects()

# Always take the newest project and its newest iteration and export it
iterations = c_plat.get_iterations(projects[0].id)
c_plat.export_iteration(project_id=projects[0].id, iteration_id=iterations[0].id, platform="DockerFile", raw=True, flavor="ARM")

After some trial and error I found a solution:
#Always takes the newest project and its newest iteration
iterations = c_plat.get_iterations(projects[0].id)
response = c_plat.export_iteration(project_id=projects[0].id, iteration_id=iterations[0].id, platform = "DockerFile", raw=False, flavor="ARM")
# Opening the URI in the browser to trigger the download
import webbrowser
webbrowser.open(c_plat.get_exports(project_id=projects[0].id, iteration_id=iterations[0].id)[0].download_uri)
This opened the URI in a new tab and started the download automatically. Hope this helps somebody else.
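If you prefer to stay entirely in Python instead of opening a browser, something like the following should also work (a minimal sketch; it assumes the export object returned by get_exports exposes status and download_uri, and that requests is installed):
import time
import requests

# Poll until the export has left the "Exporting" state
export = c_plat.get_exports(project_id=projects[0].id, iteration_id=iterations[0].id)[0]
while export.status == "Exporting":
    time.sleep(1)
    export = c_plat.get_exports(project_id=projects[0].id, iteration_id=iterations[0].id)[0]

# Save the zipped Dockerfile package to the local file system
if export.status == "Done":
    with open("exported_model.zip", "wb") as f:
        f.write(requests.get(export.download_uri).content)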
Cheers!

Related

How To Download Google Pegasus Library Model

I am very new to this and currently working on my final project. I watched a YouTube video that taught me to code abstractive text summarization with Google's Pegasus library. It works fine, but I need it to be more efficient.
So here is the code
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")
model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
Every time I run that code, it downloads the google/pegasus-xsum model again, which is about 2.2 GB.
Here is a sample notebook with the code: https://github.com/nicknochnack/PegasusSummarization/blob/main/Pegasus%20Tutorial.ipynb
Every run triggers the download again.
Is there any way to download the model once, save it locally, and have the code load the local copy every time I run it?
Something like caching or saving the model locally, maybe?
Thanks.
Using inspect you can find and locate the modules easily.
import inspect
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")
model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
print(inspect.getfile(PegasusForConditionalGeneration))
print(inspect.getfile(PegasusTokenizer))
You will get paths something like this:
/usr/local/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py
/usr/local/lib/python3.9/site-packages/transformers/models/pegasus/tokenization_pegasus.py
Now, if you look at what is inside the tokenization_pegasus.py file, you will notice that the google/pegasus-xsum files are probably being fetched via lines like the following:
PRETRAINED_VOCAB_FILES_MAP = {
    "vocab_file": {"google/pegasus-xsum": "https://huggingface.co/google/pegasus-xsum/resolve/main/spiece.model"}
}
If you open this URL:
https://huggingface.co/google/pegasus-xsum/resolve/main/spiece.model
the file is downloaded directly to your machine (this particular one is the tokenizer's SentencePiece vocabulary).
UPDATE
After some searching on Google, I found something important: you can get the models you used, and all their related files, downloaded to your working directory with the following:
tokenizer.save_pretrained("local_pegasus-xsum_tokenizer")
model.save_pretrained("local_pegasus-xsum_tokenizer_model")
Ref:
https://github.com/huggingface/transformers/issues/14561
After running this, you will see the two directories saved automatically in your working directory.
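On subsequent runs you can then point from_pretrained at those local folders instead of the Hub (a minimal sketch, assuming the directory names used above), which avoids the 2.2 GB download:
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

# Load the locally saved copies instead of downloading from the Hugging Face Hub
tokenizer = PegasusTokenizer.from_pretrained("local_pegasus-xsum_tokenizer")
model = PegasusForConditionalGeneration.from_pretrained("local_pegasus-xsum_tokenizer_model")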
Also, the roughly 2.2 GB file whose local path you wanted to know is located here online:
https://huggingface.co/google/pegasus-xsum/tree/main
After downloading the model to your directory, the weights file is named pytorch_model.bin, just as it is online.

How to apply the Azure OCR API with the requests library on local images?

Actually, I'm using ocr.space as the OCR API for the OCR task in my project (it's a Python project).
I would like to try the Azure OCR API and check which of the two is better.
I followed this documentation https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/quickstarts-sdk/client-library?tabs=visual-studio&pivots=programming-language-python.
As you can see, the computervision_client.read(...) function needs an image URL to work correctly. However, I want to apply this API to a local image on my computer.
What do you suggest, mates?
Thank you
I had the same issue; it was discussed on GitHub here, and somebody put up a good list of examples for using all the Azure OCR functions with local images. Try using the read_in_stream() function, something like:
with open("path_to_image.png", "rb") as image_stream:
job = client.read_in_stream(
image=image_stream,
mode="Printed",
raw=True
)
operation_id = job.headers['Operation-Location'].split('/')[-1]
image_analysis = computervision_client.get_read_result(operation_id)
while image_analysis.status.lower() in ['notstarted', 'running']:
time.sleep(1)
image_analysis = computervision_client.get_read_result(
operation_id=operation_id)
print(image_analysis)
lines = [res.lines for res in image_analysis.analyze_result.read_results]
print(lines)
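To get at the recognized text itself (a small follow-up to the sketch above), you can iterate over the read results and print each line's text:
# Each read result corresponds to a page; each line carries the recognized text
for page in image_analysis.analyze_result.read_results:
    for line in page.lines:
        print(line.text)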

Remove Video From Plex's "Recently Added" Section

Is it possible to remove videos from Plex's "Recently Added" section? I'd like to add some old videos to my server for posterity but not have them appear as newly added.
When I initially sought an answer to this question, I looked on support forums, including Plex's own and Reddit. All answers said it was either not possible or suggested editing the metadata_items table of the SQLite database directly (yuck). I have since found a better way to do this, but those support threads are now locked, so I'm unable to share this solution there. Hopefully answering this question here helps others find it.
This actually is possible and quite easy via the REST API. First, pip install plexapi. Then use the following script to update the addedAt field of your video of choice.
import os
import sys
from plexapi.myplex import MyPlexAccount
USERNAME = os.environ.get("PLEX_USERNAME")
PASSWORD = os.environ.get("PLEX_PASSWORD")
account = MyPlexAccount(USERNAME, PASSWORD)
plex = account.resource("<YOUR_PLEX_HOSTNAME>").connect()
library = plex.library.section("<YOUR_PLEX_LIBRARY_NAME>")
video = library.get(title="<YOUR MOVIE TITLE>")
updates = {"addedAt.value": "2018-08-21 11:19:43"}
video.edit(**updates)
That's it! What we're doing here is changing the addedAt value to be something that is older, since "Recently Added" is sorted by this date, so we're moving the video to the back of the line.
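To confirm the change took effect, you can reload the item and check its addedAt timestamp (a small sketch using the same video object as above):
# Re-fetch metadata from the server and verify the new addedAt date
video.reload()
print(video.addedAt)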
alexdlaird's suggestion worked very well for me! I had to make one change, and I believe that's because I have 2FA on my Plex account:
from plexapi.server import PlexServer
baseurl = 'http://plexserver:32400'
token = 'YOUR PLEX TOKEN'
plex = PlexServer(baseurl, token)
library = plex.library.section("Movies")
video = library.get(title="MOVIE NAME")
updates = {"addedAt.value": "2018-08-21 11:19:43"}
video.edit(**updates)
I was faced with this situation, and while gathering documentation on modifying the SQL DB, I thought of another solution, and it did the trick for me. As previously said, all support threads regarding this issue are closed elsewhere, so I'm posting it here for others to eventually find:
Stopped Plex Server
Disabled the server (computer) internal clock automatic sync
Set the internal clock manually to a year before
Started Plex Server
Added the movies
Scanned the library
The movie got added to the library without showing in Recently Added, BUT the metadata from moviedb could not be loaded (probably caused by the difference in time and date between my server and the moviedb server)
Stopped Plex Server
Restored the internal clock automatic sync
Started Plex server
Updated all metadata of the library (actually I did it manually for each as I didn't know if I had some custom metadata somewhere else in the library)
Enjoyed!

Why can't I read a joblib file from my github repo?

I've built a simple app in Python, with a front-end UI in Dash.
It relies on three files:
a small dataframe, in pickle format, 95 KB
a large scipy sparse matrix, in NPZ format, 12 MB
a large scikit-learn KNN model, in joblib format, 65 MB
I have read in the first dataframe successfully by
link = 'https://github.com/user/project/raw/master/filteredDF.pkl'
df = pd.read_pickle(link)
But when I try this with the others, say, the model by:
mLink = 'https://github.com/user/project/raw/master/knnModel.pkl'
filehandler = open(mLink, 'rb')
model_knn = pickle.load(filehandler)
I just get an error
Invalid argument: 'https://github.com/user/project/raw/master/knnModel45Percent.pkl'
I also pushed these files using GitHub LFS, but the same error occurs.
I understand that hosting large static files on GitHub is bad practice, but I haven't been able to figure out how to use PyDrive or AWS S3 with my project. I just need these files to be read by my project, and I plan to host the app on something like Heroku. I don't really need a full-on DB to store files.
The best case would be if I could read these large files straight from my repo, but if there is a better approach, I'm open to it. I spent the past few days struggling through the Dropbox, Amazon, and Google Cloud APIs and am a bit lost.
Any help appreciated, thank you.
Could you try the following?
from io import BytesIO
import pickle
import requests
mLink = 'https://github.com/aaronwangy/Kankoku/blob/master/filteredAnimeList45PercentAll.pkl?raw=true'
mfile = BytesIO(requests.get(mLink).content)
model_knn = pickle.load(mfile)
Using BytesIO, you create a file object out of the response that you get from GitHub. That object can then be used in pickle.load. Note that I have added ?raw=true to the URL of the request.
For those getting KeyError: 10, try
model_knn = joblib.load(mfile)
instead of
model_knn = pickle.load(mfile)
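Putting that together for the joblib case, a minimal sketch (assuming the model file was written with joblib.dump and reusing the hypothetical URL from the question) looks like this:
from io import BytesIO
import joblib
import requests

# Hypothetical raw-content URL for the model file in the repo
mLink = 'https://github.com/user/project/raw/master/knnModel.pkl'
mfile = BytesIO(requests.get(mLink).content)

# joblib.load accepts a file-like object, so the in-memory buffer works here
model_knn = joblib.load(mfile)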

How to get image search results using the Bing Search API with Python?

I need some image samples for machine learning training. I don't have enough resources right now, so I need to crawl some using a search engine. Google's is not free now, so I chose Bing.
I have tried pybing. It doesn't seem to work anymore.
I don't know how to get the appid.
from py_bing_search import PyBingImageSearch
bing_image = PyBingImageSearch('Your-Api-Key-Here', "x-box console", image_filters='Size:medium+Color:Monochrome') #image_filters is optional
first_fifty_result= bing_image.search(limit=50, format='json') #1-50
print (first_fifty_result[0].media_url)
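pybing targets an older version of the API. If it no longer works for you, one option is to call the Bing Image Search v7 REST endpoint directly with requests (a sketch under the assumption that you have a key from an Azure Bing Search resource, which replaces the old appid):
import requests

# The key comes from an Azure "Bing Search" resource; there is no separate appid
subscription_key = "your-bing-search-key"
endpoint = "https://api.bing.microsoft.com/v7.0/images/search"

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
params = {"q": "x-box console", "count": 50, "imageType": "Photo"}

response = requests.get(endpoint, headers=headers, params=params)
response.raise_for_status()

# Each result's contentUrl points at the full-size image
for image in response.json()["value"]:
    print(image["contentUrl"])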
