
Image Embedding Operator with Resnet50

Authors: derekdqc, shiyu22

Overview

This Operator generates feature vectors from the PyTorch pretrained Resnet50 model[1], which is trained on the ImageNet dataset.

Resnet models were proposed in “Deep Residual Learning for Image Recognition”[2], and the architecture won the ImageNet challenge in 2015. "The fundamental breakthrough with ResNet was it allowed us to train extremely deep neural networks with 150+ layers successfully. Prior to ResNet training very deep neural networks were difficult due to the problem of vanishing gradients"[3].

Interface

Class Resnet50ImageEmbedding(Operator) [source]

__init__(self, model_name: str)

params:

  • model_name(str): the name of the model used for embedding, e.g. 'resnet50'.

__call__(self, img_tensor: torch.Tensor)

params:

  • img_tensor(torch.Tensor): the normalized image tensor.

return:

  • cnn(numpy.ndarray): the embedding of the image.

How to use

Requirements

The required Python packages are listed in requirements.txt and pytorch/requirements.txt. Towhee installs them automatically the first time you load the Operator Repo, so you do not need to install them manually; they are listed here for reference only.

  • towhee
  • torch
  • torchvision
  • numpy

How it works

The towhee/resnet50-image-embedding Operator implements image embedding and can be added to a pipeline. For example, it is the key Operator, named embedding_model, in the image_embedding_resnet50 pipeline, shown as the red box in the picture below.

[Image: the image_embedding_resnet50 pipeline, with the embedding_model Operator in the red box]

When you use this Operator in a pipeline's YAML file, declare the following content according to the interface of the Resnet50ImageEmbedding class:

operators:
    -
        name: 'embedding_model'
        function: 'towhee/resnet50-image-embedding'
        tag: 'main'
        init_args:
            model_name: 'resnet50'
        inputs:
            -
                df: 'image_preproc'
                name: 'img_tensor'
                col: 0
        outputs:
            -
                df: 'embedding'
        iter_info:
            type: map
dataframes:
    -
        name: 'image_preproc'
        columns:
            -
                name: 'img_transformed'
                vtype: 'torch.Tensor'
    -
        name: 'embedding'
        columns:
            -
                name: 'cnn'
                vtype: 'numpy.ndarray'

In this YAML, the operators section declares the init_args of the class and the input and output dataframes, and the dataframes section declares the name and vtype of each column.
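The relationship between the two sections can be sketched in Python. The dict below mirrors the YAML above as a YAML parser would load it, trimmed to the fields that are checked. The check_wiring helper is hypothetical and not part of Towhee, and it assumes that col is a zero-based index into the columns of the referenced dataframe.

```python
# The Operator and dataframe declarations from the YAML, written as a plain dict.
pipeline = {
    "operators": [
        {
            "name": "embedding_model",
            "function": "towhee/resnet50-image-embedding",
            "init_args": {"model_name": "resnet50"},
            "inputs": [{"df": "image_preproc", "name": "img_tensor", "col": 0}],
            "outputs": [{"df": "embedding"}],
        }
    ],
    "dataframes": [
        {
            "name": "image_preproc",
            "columns": [{"name": "img_transformed", "vtype": "torch.Tensor"}],
        },
        {
            "name": "embedding",
            "columns": [{"name": "cnn", "vtype": "numpy.ndarray"}],
        },
    ],
}


def check_wiring(config):
    """Return a list of wiring problems (hypothetical helper, not a Towhee API)."""
    columns = {df["name"]: df["columns"] for df in config["dataframes"]}
    problems = []
    for op in config["operators"]:
        for port in op.get("inputs", []) + op.get("outputs", []):
            if port["df"] not in columns:
                # The operator refers to a dataframe that is never declared.
                problems.append(f"{op['name']}: unknown dataframe {port['df']!r}")
            elif "col" in port and port["col"] >= len(columns[port["df"]]):
                # Assumption: col is a zero-based index into the dataframe's columns.
                problems.append(
                    f"{op['name']}: column {port['col']} out of range in {port['df']!r}"
                )
    return problems


print(check_wiring(pipeline))
```

For the declarations above the helper finds no problems, because every df named under inputs and outputs is declared under dataframes and col 0 exists in image_preproc.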

File Structure

Here is the main file structure of the resnet50-image-embedding Operator. It is a useful starting point if you want to read the source code or modify it yourself.

├── .gitattributes
├── .gitignore
├── README.md
├── __init__.py
├── requirements.txt              #General Python dependencies.
├── resnet50_image_embedding.py   #The Python file for Towhee; it defines the interface of the system and usually does not need to be modified.
├── resnet50_image_embedding.yaml #The YAML file with Operator information, such as model framework, input, and output.
├── pytorch               #The directory for the PyTorch implementation.
│   ├── __init__.py
│   ├── model             #The directory for the PyTorch model, which can store data such as weights.
│   ├── requirements.txt  #The Python dependencies for the PyTorch model.
│   └── model.py          #The code of the PyTorch model, including model initialization and prediction.
├── test_data/   #The directory of test data, including test.jpg
└── test_resnet50_image_embedding.py  #The unit test file for this Operator.

Reference

[1].https://pytorch.org/hub/pytorch_vision_resnet/

[2].https://arxiv.org/abs/1512.03385

[3].https://towardsdatascience.com/understanding-and-coding-a-resnet-in-keras-446d7ff84d33
