Update

Signed-off-by: shiyu22 <shiyu.chen@zilliz.com>
4 years ago · 761cb0a09a
7 changed files with 379 additions and 2 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,209 @@
 ### Linux ###
 *~
 # temporary files which can be created if a process still has a handle open of a deleted file
 .fuse_hidden*
 # KDE directory preferences
 .directory
 # Linux trash folder which might appear on any partition or disk
 .Trash-*
 # .nfs files are created when an open file is removed but is still being accessed
 .nfs*
 ### OSX ###
 # General
 .DS_Store
 .AppleDouble
 .LSOverride
 # Icon must end with two \r
 Icon
 # Thumbnails
 ._*
 # Files that might appear in the root of a volume
 .DocumentRevisions-V100
 .fseventsd
 .Spotlight-V100
 .TemporaryItems
 .Trashes
 .VolumeIcon.icns
 .com.apple.timemachine.donotpresent
 # Directories potentially created on remote AFP share
 .AppleDB
 .AppleDesktop
 Network Trash Folder
 Temporary Items
 .apdisk
 ### Python ###
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
 *$py.class
 # C extensions
 *.so
 # Distribution / packaging
 .Python
 build/
 develop-eggs/
 dist/
 downloads/
 eggs/
 .eggs/
 lib/
 lib64/
 parts/
 sdist/
 var/
 wheels/
 share/python-wheels/
 *.egg-info/
 .installed.cfg
 *.egg
 MANIFEST
 # PyInstaller
 #  Usually these files are written by a python script from a template
 #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 *.manifest
 *.spec
 # Installer logs
 pip-log.txt
 pip-delete-this-directory.txt
 # Unit test / coverage reports
 htmlcov/
 .tox/
 .nox/
 .coverage
 .coverage.*
 .cache
 nosetests.xml
 coverage.xml
 *.cover
 *.py,cover
 .hypothesis/
 .pytest_cache/
 cover/
 # Translations
 *.mo
 *.pot
 # Django stuff:
 *.log
 local_settings.py
 db.sqlite3
 db.sqlite3-journal
 # Flask stuff:
 instance/
 .webassets-cache
 # Scrapy stuff:
 .scrapy
 # Sphinx documentation
 docs/_build/
 # PyBuilder
 .pybuilder/
 target/
 # Jupyter Notebook
 .ipynb_checkpoints
 # IPython
 profile_default/
 ipython_config.py
 # pyenv
 #   For a library or package, you might want to ignore these files since the code is
 #   intended to run in multiple environments; otherwise, check them in:
 # .python-version
 # pipenv
 #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 #   install all needed dependencies.
 #Pipfile.lock
 # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 __pypackages__/
 # Celery stuff
 celerybeat-schedule
 celerybeat.pid
 # SageMath parsed files
 *.sage.py
 # Environments
 .env
 .venv
 env/
 venv/
 ENV/
 env.bak/
 venv.bak/
 # Spyder project settings
 .spyderproject
 .spyproject
 # Rope project settings
 .ropeproject
 # mkdocs documentation
 /site
 # mypy
 .mypy_cache/
 .dmypy.json
 dmypy.json
 # Pyre type checker
 .pyre/
 # pytype static type analyzer
 .pytype/
 # Cython debug symbols
 cython_debug/
 ### Windows ###
 # Windows thumbnail cache files
 Thumbs.db
 Thumbs.db:encryptable
 ehthumbs.db
 ehthumbs_vista.db
 # Dump file
 *.stackdump
 # Folder config file
 [Dd]esktop.ini
 # Recycle Bin used on file shares
 $RECYCLE.BIN/
 # Windows Installer files
 *.cab
 *.msi
 *.msix
 *.msm
 *.msp
 # Windows shortcuts
 *.lnk
--- a/README.md
+++ b/README.md
@@ -1,3 +1,56 @@
 # image-embedding-pipeline-template
 # **Template: Image Embedding Pipeline**
 This is another test repo
 Authors:
 ## Overview
 > <font color=red>**Note:** this is just a **template**, not a runnable pipeline.</font>
 This repository is a **template for image embedding pipelines** and **cannot be run as-is**. It defines the YAML template file for embedding images, along with the standard inputs and outputs. To complete the pipeline, fill in the parameters (`init_args`) in the Operator section of [image_embedding_pipeline_template.yaml](./image_embedding_pipeline_template.yaml) and update this README file. For reference, [image-embedding-resnet50](https://hub.towhee.io/towhee/image-embedding-resnet50) is based on this template.
 This pipeline **extracts a feature vector from an image**. It first normalizes the image and then uses a model to generate the vector.
 ## Interface
 **Input Arguments:**
 - img_tensor:
  - the input image to be encoded
  - supported types: `PIL.Image`
 **Pipeline Output:**
 The pipeline returns a tuple `Tuple[('cnn', numpy.ndarray)]` containing the following fields:
 - feature_vector:
  - the embedding of the input image
  - data type: `numpy.ndarray`
 ## How to use
 1. Install [Towhee](https://github.com/towhee-io/towhee)
 ```Bash
 $ pip3 install towhee
 ```
 > You can refer to [Getting Started with Towhee](https://towhee.io/) for more details. If you have any questions, you can [submit an issue to the towhee repository](https://github.com/towhee-io/towhee/issues).
 2. Run it with Towhee
 ```Python
 >>> from towhee import pipeline
 >>> from PIL import Image
 >>> img = Image.open('path/to/your/image')  # for example, './test.jpg'
 >>> embedding_pipeline = pipeline('user/repo_name')  # the pipeline repo, such as 'towhee/image-embedding-resnet50'
 >>> embedding = embedding_pipeline(img)
 ```
 ## **How it works**
 This pipeline includes two main operators: [transform image](https://hub.towhee.io/towhee/transform-image-operator-template) and [image embedding](https://hub.towhee.io/towhee/image-embedding-operator-template). The transform image operator first converts the original image into a normalized format, for example a fixed resolution such as 512x512. The image embedding operator then encodes the normalized image, producing a feature vector for the given image.
 > Refer to [Towhee architecture](https://github.com/towhee-io/towhee#towhee-architecture) for basic concepts in Towhee: pipeline, operator, dataframe.
 ![img](./readme_res/pipeline.png)
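
The two-stage flow described in "How it works" can be sketched in plain Python. This is only an illustration of the data flow, not Towhee code: `transform_image`, `embed_image`, and `run_pipeline` are hypothetical stand-ins, the "image" is a nested list of grayscale pixels, and the "model" is a toy row-mean rather than a CNN.

```python
from typing import List

# Hypothetical stand-in for an image: rows of grayscale pixel values in 0-255.
Image = List[List[float]]


def transform_image(img: Image, size: int) -> Image:
    """Nearest-neighbour resize to size x size, then scale pixels to [0, 1]."""
    height, width = len(img), len(img[0])
    return [
        [img[r * height // size][c * width // size] / 255.0 for c in range(size)]
        for r in range(size)
    ]


def embed_image(img: Image) -> List[float]:
    """Toy 'model': one feature per row (the row mean). A real operator would run a CNN."""
    return [sum(row) / len(row) for row in img]


def run_pipeline(img: Image, size: int = 4) -> List[float]:
    # Same order as the README: normalize first, then encode.
    return embed_image(transform_image(img, size))


image = [[0, 255], [255, 255]]
vector = run_pipeline(image, size=4)
print(len(vector), vector)
```

The point of the sketch is that the embedding step only ever sees the normalized image, so the output length depends on the chosen `size` and model, not on the resolution of the original input.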
--- a/config.py
+++ b/config.py
@@ -0,0 +1,3 @@
 TEST_IMG = './test_data/test.jpg'
 DIMENSION = 1000
 REPO_NAME = 'towhee/image-embedding-resnet'
--- a/image_embedding_pipeline_template.yaml
+++ b/image_embedding_pipeline_template.yaml
@@ -0,0 +1,93 @@
 name: 'image_embedding_resnet50'
 operators:
    -
        name: '_start_op'
        function: '_start_op'
        init_args:
        inputs:
            -
                df: '_start_df'
                name: 'img'
                col: 0
        outputs:
            -
                df: 'image'
        iter_info:
            type: map
    -
        name: 'preprocessing'
        function: 'towhee/image-transform-template' #your transform-image repo name
         tag: 'main' #repo tag, defaults to 'main'
        init_args:
            size:      #size of image, such as 256
        inputs:
            -
                df: 'image'
                name: 'img'
                col: 0
        outputs:
            -
                df: 'image_preproc'
        iter_info:
            type: map
    -
         name: 'embedding_model'
        function: 'towhee/image-embedding-operator-template' #your image-embedding repo name
         tag: 'main' #repo tag, defaults to 'main'
        init_args:
            model_name:  #model_name for image-embedding operator, such as 'resnet50'
        inputs:
            -
                df: 'image_preproc'
                name: 'img_tensor'
                col: 0
        outputs:
            -
                df: 'embedding'
        iter_info:
            type: map
    -
        name: '_end_op'
        function: '_end_op'
        init_args:
        inputs:
            -
                df: 'embedding'
                name: 'feature_vector'
                col: 0
        outputs:
            -
                df: '_end_df'
        iter_info:
            type: map
 dataframes:
    -
        name: '_start_df'
        columns:
            -
                name: 'img'
                vtype: 'PIL.Image'
    -
        name: 'image'
        columns:
            -
                name: 'img'
                vtype: 'PIL.Image'
    -
        name: 'image_preproc'
        columns:
            -
                name: 'img_transformed'
                vtype: 'torch.Tensor'
    -
        name: 'embedding'
        columns:
            -
                name: 'feature_vector'
                vtype: 'numpy.ndarray'
    -
        name: '_end_df'
        columns:
            -
                name: 'feature_vector'
                vtype: 'numpy.ndarray'
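
When filling in a template like this, it can help to check that each operator reads a dataframe that an earlier operator has already written. The sketch below does that on a hand-copied summary of the YAML above. `OPERATORS`, `DATAFRAMES`, and `check_wiring` are hypothetical names introduced for illustration, and this is not how Towhee itself validates a pipeline graph.

```python
from typing import Dict, List, Tuple

# Condensed copy of the template's wiring:
# (operator name, input dataframe, input column index, output dataframe).
OPERATORS: List[Tuple[str, str, int, str]] = [
    ("_start_op", "_start_df", 0, "image"),
    ("preprocessing", "image", 0, "image_preproc"),
    ("embedding_model", "image_preproc", 0, "embedding"),
    ("_end_op", "embedding", 0, "_end_df"),
]

# Dataframe name -> column names, as declared under `dataframes:`.
DATAFRAMES: Dict[str, List[str]] = {
    "_start_df": ["img"],
    "image": ["img"],
    "image_preproc": ["img_transformed"],
    "embedding": ["feature_vector"],
    "_end_df": ["feature_vector"],
}


def check_wiring(operators, dataframes) -> List[str]:
    """Return a list of human-readable wiring problems (empty if none)."""
    problems = []
    available = {"_start_df"}  # dataframes that already hold data
    for name, in_df, col, out_df in operators:
        for df in (in_df, out_df):
            if df not in dataframes:
                problems.append(f"{name}: dataframe '{df}' is not declared")
        if in_df not in available:
            problems.append(f"{name}: reads '{in_df}' before any operator writes it")
        elif in_df in dataframes and col >= len(dataframes[in_df]):
            problems.append(f"{name}: column {col} is out of range for '{in_df}'")
        available.add(out_df)
    return problems


print(check_wiring(OPERATORS, DATAFRAMES))
```

The check deliberately ignores the `name` field of each input, because in the template it names the operator argument (for example `img_tensor`) rather than the dataframe column (`img_transformed`).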
--- a/readme_res/pipeline.png
+++ b/readme_res/pipeline.png
--- a/test_data/test.jpg
+++ b/test_data/test.jpg
--- a/test_image_embedding_pipeline_yaml.py
+++ b/test_image_embedding_pipeline_yaml.py
@@ -0,0 +1,19 @@
 import unittest
 from towhee import pipeline
 from PIL import Image
 from config import DIMENSION, REPO_NAME, TEST_IMG
 class TestImageEmbeddingPipelineClass(unittest.TestCase):
    test_img = Image.open(TEST_IMG)
    def test_image_embedding_resnet50(self):
        self.dimension = DIMENSION  #the dimension of image embedding
        self.repo_name = REPO_NAME
        embedding_pipeline = pipeline(self.repo_name)
        embedding = embedding_pipeline(self.test_img)
         # Assumes the first pipeline output holds the embedding array.
         assert (1, self.dimension) == embedding[0].shape
 if __name__ == '__main__':
    unittest.main()
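
The shape check in this test can be exercised without downloading a model by swapping in a stub. The sketch below is a minimal illustration under stated assumptions: `fake_pipeline` and `FakeArray` are hypothetical stand-ins, and the return structure (a tuple whose first element has shape `(1, DIMENSION)`) is assumed from the assertion in the test, not confirmed against Towhee.

```python
import unittest

DIMENSION = 1000  # same value as config.py


class FakeArray:
    """Hypothetical stand-in for a numpy.ndarray; only exposes `shape`."""
    shape = (1, DIMENSION)


def fake_pipeline(img):
    """Hypothetical stand-in for the callable returned by towhee.pipeline()."""
    return (FakeArray(),)


class TestEmbeddingShape(unittest.TestCase):
    def test_shape(self):
        embedding = fake_pipeline(img=None)
        # Assumed structure: the first output holds the embedding array.
        self.assertEqual((1, DIMENSION), embedding[0].shape)


# Run the test programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestEmbeddingShape)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A stub like this only verifies the assertion logic. The real test is still needed to confirm that the configured repo produces vectors of the expected dimension.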