# Transform Image Operator

Authors: Kyle, shiyu22

## Overview

This operator uses PyTorch to transform images: cropping, conversion between PIL.Image and Tensor, normalization, and similar operations. In computer vision (CV), image transformations are an essential step, used both to preprocess images and to augment data. Individual transforms can be chained together using [`Compose`](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.Compose) in PyTorch.

## Interface

`class TransformImage(Operator)` [\[source\]](./transform_image.py)

`__init__(self, size: int)`

**params:**

- size(int): the size of the output image.

`__call__(self, img_tensor: Union[np.ndarray, Image.Image, torch.Tensor, str])`

**params:**

- img_tensor(np.ndarray/Image.Image/torch.Tensor/str): original image data; the type can be np.ndarray, PIL.Image, torch.Tensor, or a str path to the image.

**return:**

- img_transformed(torch.Tensor): the tensor of the transformed image.

## How to use

### Requirements

You can get the required Python packages from [requirements.txt](./requirements.txt). Towhee installs these packages automatically the first time you load the Operator repo, so you don't need to install them manually; they are listed here for reference only.

- towhee
- torch
- torchvision
- numpy
- pillow

### How it works

The `towhee/transform-image` Operator performs image transformation, an important part of data preprocessing. It can be added to a pipeline and is usually the first custom operator in it. For example, it is the first Operator, named `preprocessing`, in the [image_embedding_resnet50](https://hub.towhee.io/towhee/image-embedding-resnet50) pipeline, where it is marked by the red box in the picture below.
![img](./pic/operator.png)

When using this Operator in a pipeline's YAML file, declare the following content according to the interface of the TransformImage class:

```yaml
operators:
  - name: 'preprocessing'
    function: 'towhee/transform-image'
    tag: 'main'
    init_args:
      size: 256
    inputs:
      - df: 'image'
        name: 'img_tensor'
        col: 0
    outputs:
      - df: 'image_preproc'
    iter_info:
      type: map
dataframes:
  - name: 'image'
    columns:
      - name: 'img_tensor'
        vtype: 'PIL.Image'
  - name: 'image_preproc'
    columns:
      - name: 'img_transformed'
        vtype: 'torch.Tensor'
```

> The Interface section states that the Operator's input can be np.ndarray, PIL.Image, torch.Tensor, or a str path to the image. This example uses only PIL.Image, which is also what the [image_embedding_resnet50](https://hub.towhee.io/towhee/image-embedding-resnet50) pipeline uses; you can change the dataframe to the type you need.

### File Structure

Here is the main file structure of the `transform-image` Operator. Refer to it if you want to study the source code or modify it yourself.

```bash
├── .gitattributes
├── .gitignore
├── README.md
├── __init__.py
├── requirements.txt        #General python dependency package
├── transform_image.py      #The python file for Towhee, it defines the interface of the system.
├── transform_image.yaml    #The YAML file contains Operator information, such as frame, input, and output.
├── test_data/              #The directory of test data, including test.jpg
└── test_transform_image.py #The unittest file of this Operator.
```

## Reference

- https://pytorch.org/vision/stable/transforms.html