Quote from the official web-site:
Customize and embed state-of-the-art computer vision for specific domains. Build frictionless customer experiences, optimize manufacturing processes, accelerate digital marketing campaigns—and more. No machine learning expertise is required.
There are a lot of well-written references explaining how to upload your images from local storage to an Azure Custom Vision workspace, using SDKs or simple HTTP requests. Most of them cover training classification or object detection models and managing iterations.
However, in some cases you may need to download all the images you used for training back to local storage.
When may this happen?
Imagine you uploaded your images and afterwards they mysteriously disappeared from your laptop. Or, more likely, you used all the predictions to retrain your model and now need to rework those images with OpenCV. Sounds reasonable, right?
So I decided to share a reusable code snippet that downloads multiple images in a single batch using the official API. Feel free to use it in your machine learning projects; I will be completely satisfied if someone finds it useful.
First of all we need the API reference. It took me a while to finally find this reference, so I find it useful to provide the link in the article.
According to this documentation you can get all the training images with their downloadable links.
Great, that's what we were looking for.
The plan is quite straightforward:
1. Get all the API keys
2. Get all the tagged images using an API call
3. For each tagged image, get a download URI
4. For each tag, create a folder with the same name
5. Save the image on your PC
6. Enjoy
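N.B. Usually, the endpoint has the following format: <region>.api.cognitive.microsoft.com
In our case it is westeurope.api.cognitive.microsoft.com. The code in this article uses other regions as examples, so substitute the host of your own resource.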
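A minimal sketch of that naming rule, assuming the region name is all that is needed to build the host. The helper name endpoint_host is made up for illustration, and it only covers the regional format mentioned above.

```python
def endpoint_host(region: str) -> str:
    """Build the regional host name, e.g. 'westeurope' -> 'westeurope.api.cognitive.microsoft.com'."""
    # Normalise stray spaces and capital letters in the region name.
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    return f"{region}.api.cognitive.microsoft.com"


print(endpoint_host("westeurope"))
```

The returned string can be passed directly to http.client.HTTPSConnection.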
Up we go
Firstly, we need to obtain the training key and the project id. Both can easily be found in your Custom Vision workspace, in the project settings menu.
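If you prefer not to paste the key into the script, here is one possible sketch that reads both values from environment variables and builds the request path and headers for the tagged-images call used in this article. The variable names CUSTOM_VISION_TRAINING_KEY and CUSTOM_VISION_PROJECT_ID and the helper build_request are my own assumptions, not something the service defines.

```python
import os
import urllib.parse


def build_request(take=256, skip=0, env=os.environ):
    """Return (path, headers) for the 'get tagged images' request."""
    # Hypothetical variable names: set them yourself before running the script.
    training_key = env["CUSTOM_VISION_TRAINING_KEY"]
    project_id = env["CUSTOM_VISION_PROJECT_ID"]
    query = urllib.parse.urlencode({"take": take, "skip": skip})
    path = f"/customvision/v3.0/training/projects/{project_id}/images/tagged?{query}"
    return path, {"Training-Key": training_key}


# Demo with placeholder values instead of real credentials.
path, headers = build_request(env={
    "CUSTOM_VISION_TRAINING_KEY": "<your-training-key>",
    "CUSTOM_VISION_PROJECT_ID": "<project-id>",
})
print(path)
```

Passing env explicitly keeps the helper easy to try out without touching the real environment.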
Here is the final code. I tried to comment every line to make it understandable, though there is nothing complicated in it at all.
# coding: utf-8
"""Download training photos from Azure Custom Vision workspace
"""
import http.client
import json
import os
import urllib.parse
import urllib.request

# create tag folders if they do not exist
tag_folders = ['ok', 'ko']
for folder in tag_folders:
    os.makedirs(folder, exist_ok=True)

# request headers
# only training key is needed
headers = {
    'Training-Key': '<your-training-key>'
}

# query parameters
# here only several are used
params = urllib.parse.urlencode({
    # Format - int32. Maximum number of images to return. Defaults to 50, limited to 256.
    'take': '256',
    # Format - int32. Number of images to skip before beginning the image batch. Defaults to 0.
    # 0 fetches the first batch; use 256, 512, ... for the following ones
    'skip': '0'
})

# base url: replace the region with the one your resource lives in
conn = http.client.HTTPSConnection(
    'southcentralus.api.cognitive.microsoft.com')
try:
    # a GET request carries no body, so only the headers are passed
    conn.request(
        "GET",
        "/customvision/v3.0/training/projects/<project-id>/images/tagged?%s" % params,
        headers=headers)
    response = conn.getresponse()
    # get response as raw bytes
    data = response.read()
    # parse the JSON payload into a list of image descriptions
    data_json = json.loads(data)
    for item in data_json:
        # uri for image download
        originalImageUri = item['originalImageUri']
        # image tag (the first one if the image has several)
        tag_name = item["tags"][0]["tagName"]
        # make sure the folder exists even for tags missing from tag_folders
        os.makedirs(tag_name, exist_ok=True)
        # to not erase previously saved photos, file_counter = number of files already in the folder
        file_counter = len([name for name in os.listdir(
            tag_name) if os.path.isfile(os.path.join(tag_name, name))])
        # the tag name corresponds to the folder name, so just save the photo to that folder
        output_file = os.path.join(tag_name, str(file_counter) + '.jpg')
        # download image from uri
        urllib.request.urlretrieve(originalImageUri, output_file)
except Exception as e:
    # report what went wrong instead of hiding it
    print("Error:", e)
finally:
    conn.close()
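The file-naming logic from the loop above can also be pulled out into a small pure function, which makes it easy to check without touching the network. This is only a sketch: plan_downloads and its already_saved argument are names I made up, and it assumes items shaped like the response used above (originalImageUri plus a tags list with tagName).

```python
import os
from collections import defaultdict


def plan_downloads(items, already_saved=None):
    """Map each image description to a (download_uri, output_path) pair.

    already_saved: optional dict {tag_name: number of files already in that folder}.
    """
    counters = defaultdict(int, already_saved or {})
    plan = []
    for item in items:
        tags = item.get("tags") or []
        if not tags:
            continue  # no tag means no target folder, so the image is skipped
        tag_name = tags[0]["tagName"]  # like the script above, only the first tag is used
        path = os.path.join(tag_name, f"{counters[tag_name]}.jpg")
        counters[tag_name] += 1
        plan.append((item["originalImageUri"], path))
    return plan


# Made-up sample data; the URIs are placeholders, not real download links.
sample = [
    {"originalImageUri": "https://example.invalid/a", "tags": [{"tagName": "ok"}]},
    {"originalImageUri": "https://example.invalid/b", "tags": [{"tagName": "ko"}]},
    {"originalImageUri": "https://example.invalid/c", "tags": [{"tagName": "ok"}]},
]
for uri, path in plan_downloads(sample, already_saved={"ok": 2}):
    print(uri, "->", path)
```

Each pair in the plan can then be handed to urllib.request.urlretrieve(uri, path) after creating the folder.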
Important thing to know: the maximum batch size is 256, so a single request can never return more images than that. Fortunately, there is the skip parameter: to download all the photos, repeat the request and adjust skip each time so that it equals:
skip_size = (iteration_n - 1) * 256
E.g. for the first iteration skip = 0, for the second 256, and so on.
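As a sketch of how that formula drives a full download, the snippet below pages through batches until an empty one comes back. The names skip_for_iteration, fetch_all and fake_fetch are hypothetical; fake_fetch only stands in for the real HTTP request so that the loop can be tried offline.

```python
def skip_for_iteration(iteration_n, batch_size=256):
    """Return the skip value for a 1-based iteration number."""
    if iteration_n < 1:
        raise ValueError("iteration_n starts at 1")
    return (iteration_n - 1) * batch_size


def fetch_all(fetch_batch, batch_size=256):
    """Call fetch_batch(take, skip) repeatedly until it returns an empty batch."""
    images = []
    iteration_n = 1
    while True:
        batch = fetch_batch(batch_size, skip_for_iteration(iteration_n, batch_size))
        if not batch:
            break  # nothing left to download
        images.extend(batch)
        iteration_n += 1
    return images


# Stand-in for the real API call: 600 fake "images" served in slices.
fake_images = list(range(600))


def fake_fetch(take, skip):
    return fake_images[skip:skip + take]


print([skip_for_iteration(n) for n in (1, 2, 3)])  # [0, 256, 512]
print(len(fetch_all(fake_fetch)))  # 600
```

In a real script, fetch_batch would wrap the GET request from the code above and return the parsed JSON list.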
Hope you found this useful. Code well, and let the force be with you!
Here's the same thing, but it keeps looping until it has fetched all the images:
# coding: utf-8
"""Download training photos from Azure Custom Vision workspace"""
import http.client
import urllib.request
import urllib.parse
import urllib.error
import base64
import json
import os

# create tag folders if they do not exist
tag_folders = ['<tag-1>', '<tag-2>']
for folder in tag_folders:
    if not os.path.exists(folder):
        os.makedirs(folder)

# request headers
# only training key is needed
headers = {
    # Request headers
    'Training-Key': '<your-training-key>'
}

skip = 0  # Resetting skip to 0 to start from the beginning
while True:  # This loop will keep running until we break out of it
    # query parameters
    params = urllib.parse.urlencode({
        'take': '256',
        'skip': str(skip)
    })
    try:
        # base url
        conn = http.client.HTTPSConnection('uksouth.api.cognitive.microsoft.com')
        conn.request("GET", "/customvision/v3.0/training/projects/<project-id>/images/tagged?%s" % params, "{b…