# Image Embedding Operator with Resnet50
Authors: Kyle, shiyu22
## Overview
This Operator generates feature vectors from the PyTorch pretrained **Resnet50** model, which is trained on the [ImageNet dataset](https://www.image-net.org/).
**ResNet** models were proposed in “[Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)”, and the architecture won the ImageNet challenge in 2015. The fundamental breakthrough of ResNet was that its residual (shortcut) connections made it possible to train extremely deep neural networks, with more than 150 layers, successfully. Prior to ResNet, training very deep neural networks was difficult because of the vanishing gradient problem.
## Interface
`class Resnet50ImageEmbedding(Operator)` [source](./resnet50_image_embedding.py)
`__init__(self, model_name: str)`
Args:
- model_name(str): the name of the model used for embedding, such as 'resnet50'.
`__call__(self, img_tensor: torch.Tensor)`
Args:
- img_tensor(torch.Tensor): the normalized image tensor.
Returns:
- cnn(numpy.ndarray): the embedding of the image.
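The interface above can be illustrated with a minimal sketch. This is not the code in `resnet50_image_embedding.py`. The class name `Resnet50ImageEmbeddingSketch` and the `pretrained` flag are hypothetical, and the sketch assumes torchvision 0.13 or newer (where weights are selected by name) and a batch of one image as input.

```python
class Resnet50ImageEmbeddingSketch:
    """Illustrative stand-in for the Operator interface; not the repo's code."""

    def __init__(self, model_name: str = "resnet50", pretrained: bool = True) -> None:
        # Imported lazily so this sketch can be defined even before
        # torch/torchvision are installed.
        import torch
        import torchvision

        # Assumption: torchvision >= 0.13, where weights are selected by name.
        # `pretrained` is a hypothetical convenience flag, not part of the
        # documented interface.
        weights = "IMAGENET1K_V1" if pretrained else None
        model = getattr(torchvision.models, model_name)(weights=weights)
        # Replace the classification head so the forward pass returns the
        # pooled features (2048 values for resnet50) instead of class scores.
        model.fc = torch.nn.Identity()
        self._model = model.eval()

    def __call__(self, img_tensor):
        import torch

        # img_tensor: normalized float tensor of shape (1, 3, H, W).
        with torch.no_grad():
            features = self._model(img_tensor)
        # Flatten to a 1-D numpy.ndarray, like the `cnn` output described above.
        return features.flatten().cpu().numpy()
```

Calling `Resnet50ImageEmbeddingSketch("resnet50", pretrained=False)` builds the network with random weights and avoids the weight download, which is convenient for checking shapes but not for producing meaningful embeddings.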
## How to use
### Requirements
The required Python packages are listed in [requirements.txt](./requirements.txt) and [pytorch/requirements.txt](./pytorch/requirements.txt). Towhee installs these packages automatically the first time you load the Operator Repo, so you do not need to install them manually; the list below is for reference only.
- towhee
- torch
- torchvision
- numpy
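If you want to confirm which of the listed packages are already available in your environment, a small check like the following may help. It is only a sketch: the helper name `missing_packages` is hypothetical, and it assumes each package is imported under the same name as it is listed.

```python
import importlib.util
from typing import Iterable, List

# Package names taken from the list above.
REQUIRED_PACKAGES = ("towhee", "torch", "torchvision", "numpy")


def missing_packages(names: Iterable[str] = REQUIRED_PACKAGES) -> List[str]:
    """Return the names that cannot be found on the current import path."""
    # find_spec() locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```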
### How it works
The `towhee/resnet50-image-embedding` Operator implements image embedding and can be added to a pipeline. For example, it is the key Operator, named embedding_model, within the [image_embedding_resnet50](https://hub.towhee.io/towhee/image-embedding-resnet50) pipeline, where it is marked by the red box in the pipeline picture.
When you use this Operator in a pipeline's YAML file, you need to declare the following content according to the interface of the Resnet50ImageEmbedding class:

    operators:
      -
        name: 'embedding_model'
        function: 'towhee/resnet50-image-embedding'
        tag: 'main'
        init_args:
          model_name: 'resnet50'
        inputs:
          -
            df: 'image_preproc'
            name: 'img_tensor'
            col: 0
        outputs:
          -
            df: 'embedding'
        iter_info:
          type: map
    dataframes:
      -
        name: 'image_preproc'
        columns:
          -
            name: 'img_transformed'
            vtype: 'torch.Tensor'
      -
        name: 'embedding'
        columns:
          -
            name: 'cnn'
            vtype: 'numpy.ndarray'

We can see that in the YAML, the **operator** part declares the `init_args` of the class and the `input` and `output` dataframes, and the **dataframe** part declares the `name` and `vtype` of each parameter.
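The relationship between the two parts can be sketched in Python: every dataframe that an operator reads from or writes to should also be declared in the dataframe part. The dictionary layout below (keys such as `operators`, `inputs`, `outputs`, `dataframes`, and `columns`) and the helper `undeclared_dataframes` are assumptions made for illustration; this is not Towhee's own validation code.

```python
from typing import Any, Dict, List

# Hypothetical in-memory form of the declaration; the key names are assumed.
PIPELINE_SKETCH: Dict[str, Any] = {
    "operators": [
        {
            "name": "embedding_model",
            "function": "towhee/resnet50-image-embedding",
            "tag": "main",
            "init_args": {"model_name": "resnet50"},
            "inputs": [{"df": "image_preproc", "name": "img_tensor", "col": 0}],
            "outputs": [{"df": "embedding"}],
        }
    ],
    "dataframes": [
        {
            "name": "image_preproc",
            "columns": [{"name": "img_transformed", "vtype": "torch.Tensor"}],
        },
        {
            "name": "embedding",
            "columns": [{"name": "cnn", "vtype": "numpy.ndarray"}],
        },
    ],
}


def undeclared_dataframes(pipeline: Dict[str, Any]) -> List[str]:
    """List dataframes used by operators but missing from the declarations."""
    declared = {df["name"] for df in pipeline["dataframes"]}
    missing: List[str] = []
    for op in pipeline["operators"]:
        # Inputs and outputs both refer to a dataframe through the 'df' key.
        for port in op.get("inputs", []) + op.get("outputs", []):
            if port["df"] not in declared and port["df"] not in missing:
                missing.append(port["df"])
    return missing
```

For the declaration above, `undeclared_dataframes(PIPELINE_SKETCH)` returns an empty list, because both `image_preproc` and `embedding` are declared.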
### File Structure
Here is the main file structure of the `resnet50-image-embedding` Operator. It is a useful starting point if you want to learn more about the source code or modify it yourself.

    ├── .gitattributes
    ├── .gitignore
    ├── README.md
    ├── __init__.py
    ├── requirements.txt                    # General Python dependency packages.
    ├── resnet50_image_embedding.py         # The Python file for Towhee; it defines the interface of the system and usually does not need to be modified.
    ├── resnet50_image_embedding.yaml       # The YAML file with Operator information, such as model framework, input, and output.
    ├── pytorch                             # The PyTorch directory.
    │   ├── __init__.py
    │   ├── model                           # The directory of the PyTorch model, which can store data such as weights.
    │   ├── requirements.txt                # The Python dependency packages for the PyTorch model.
    │   └── model.py                        # The code of the PyTorch model, including model initialization and prediction.
    ├── test_data/                          # The directory of test data, including test.jpg.
    └── test_resnet50_image_embedding.py    # The unit test file of this Operator.

## Reference
- https://pytorch.org/hub/pytorch_vision_resnet/
- https://arxiv.org/abs/1512.03385