ViT Embedding Operator
Authors: kyle he
Overview
ViT (Vision Transformer) is an image classification model that applies a Transformer-like architecture to patches of the image. It uses Multi-Head Attention, Scaled Dot-Product Attention, and other architectural features of the Transformer traditionally used for NLP [1]. The model is trained on the ImageNet dataset.
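As a rough illustration of the two ideas above, the sketch below counts the patches implied by the default model name `vit_large_patch16_224` (assuming it denotes 16x16 patches on a 224x224 input) and implements scaled dot-product attention in plain Python. The function names are illustrative and are not part of this Operator; the real model adds learned projections, multiple heads, and many stacked layers.

```python
import math


def num_patches(image_size: int, patch_size: int) -> int:
    """Count the non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, one row per token."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output row is a weighted average of the value rows.
        outputs.append([
            sum(w * v[col] for w, v in zip(weights, values))
            for col in range(len(values[0]))
        ])
    return outputs


if __name__ == "__main__":
    print(num_patches(224, 16))
    tokens = [[1.0, 0.0], [0.0, 1.0]]
    print(scaled_dot_product_attention(tokens, tokens, tokens))
```

For a 224-pixel input with 16-pixel patches, `num_patches` gives 14 x 14 = 196 patches, each of which becomes one token in the attention computation.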
Interface
__init__(self, model_name: str = 'vit_large_patch16_224',
         framework: str = 'pytorch', weights_path: str = None)
Args:
- model_name:
  - the model name for embedding
  - supported types: str, for example 'vit_large_patch16_224'
- framework:
  - the framework of the model
  - supported types: str, default is 'pytorch'
- weights_path:
  - the weights path
  - supported types: str, default is None, using pretrained weights
 
__call__(self, img_path: str)
Args:
- img_path:
  - the input image path
  - supported types: str
 
Returns:
The Operator returns a tuple Tuple[('embedding', numpy.ndarray)] containing the following fields:
- feature_vector:
  - the embedding of the image
  - data type: numpy.ndarray
  - shape: (dim,)
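The calling convention described above can be sketched with a stand-in class. Everything here is an assumption made for illustration: `StubVitEmbedding` is a hypothetical name, it runs no model, it returns dummy values, and it uses a list of floats in place of the documented numpy.ndarray so that the sketch needs no third-party packages. Only the argument names, defaults, and the named-tuple return shape are taken from the interface above.

```python
from typing import List, NamedTuple, Optional

# Named output mirroring the documented return value. The field name follows
# the "feature_vector" entry listed above; a list of floats stands in for the
# numpy.ndarray of shape (dim,).
Outputs = NamedTuple("Outputs", [("feature_vector", List[float])])


class StubVitEmbedding:
    """Hypothetical stand-in with the documented signatures. It runs no model."""

    def __init__(self, model_name: str = "vit_large_patch16_224",
                 framework: str = "pytorch",
                 weights_path: Optional[str] = None) -> None:
        if framework != "pytorch":
            # Assumption: this sketch handles only the documented default.
            raise ValueError(f"unsupported framework: {framework}")
        self.model_name = model_name
        self.framework = framework
        # None means "use pretrained weights", as documented above.
        self.weights_path = weights_path
        self.dim = 4  # placeholder size, not the real embedding dimension

    def __call__(self, img_path: str) -> Outputs:
        # Placeholder: derive deterministic dummy values from the path
        # instead of reading the image and running ViT.
        seed = sum(img_path.encode("utf-8"))
        vector = [((seed + i) % 10) / 10.0 for i in range(self.dim)]
        return Outputs(feature_vector=vector)


if __name__ == "__main__":
    op = StubVitEmbedding()
    result = op("example.jpg")  # hypothetical image path
    print(len(result.feature_vector))
```

Because the result is a named tuple, a caller can read the embedding either by field name (`result.feature_vector`) or by position (`result[0]`).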
 
 
Requirements
The required Python packages are listed in requirements.txt.
How it works
The towhee/vit-embedding Operator computes image embeddings and can be added to a pipeline. For example, it is the key Operator, named embedding_model, in the image-embedding-vitlarage pipeline.
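The idea of an Operator acting as a named step in a pipeline can be sketched as follows. This is not Towhee's pipeline API: `run_pipeline`, `placeholder_embedding`, and the `normalize` step are hypothetical names introduced for illustration, and the embedding step returns dummy values instead of running ViT. Only the step name `embedding_model` comes from the text above.

```python
from typing import Any, Callable, List, Tuple

# A pipeline is modeled here as an ordered list of (name, callable) steps.
Step = Tuple[str, Callable[[Any], Any]]


def run_pipeline(steps: List[Step], value: Any) -> Any:
    """Feed the value through each named step in order."""
    for _name, step in steps:
        value = step(value)
    return value


def placeholder_embedding(img_path: str) -> List[float]:
    """Stand-in for the embedding Operator; returns dummy values, runs no model."""
    return [float(len(img_path)), 0.0, 0.0]


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale the vector to unit length; leave an all-zero vector unchanged."""
    norm = sum(x * x for x in vector) ** 0.5
    return [x / norm for x in vector] if norm else vector


# The embedding step carries the name used in the text; the second step is
# an assumed post-processing stage added only to show chaining.
STEPS: List[Step] = [
    ("embedding_model", placeholder_embedding),
    ("normalize", l2_normalize),
]

if __name__ == "__main__":
    print(run_pipeline(STEPS, "example.jpg"))  # hypothetical image path
```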
Reference