VGGish Embedding Operator (PyTorch)

Authors: Jael Gu

Overview

This operator reads the waveform of an audio file and applies VGGish to extract features. The original VGGish model is built on TensorFlow [1]; this operator ports it to PyTorch. Given an input, it generates a set of vectors, one per non-overlapping clip with a fixed length of 0.96 s. Each clip is composed of 64 mel bands and 96 frames. The model is pre-trained on AudioSet, a large-scale audio dataset. As its authors suggest, the model is suited to extracting high-level features or warming up a larger model.
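The clip arithmetic described above can be sketched in a few lines. This is only an illustration of the shapes involved, not the operator's own code: the function name `clip_layout` is hypothetical, and dropping trailing frames that do not fill a whole clip is an assumption. The constants come from this README (96 frames and 64 mel bands per 0.96 s clip, 128 values per embedding).

```python
# Constants taken from this README.
FRAMES_PER_CLIP = 96   # frames in one 0.96 s clip
NUM_MEL_BANDS = 64     # mel bands per frame
EMBEDDING_SIZE = 128   # length of each output vector


def clip_layout(num_frames: int) -> dict:
    """Describe how num_frames spectrogram frames split into non-overlapping clips.

    Hypothetical helper for illustration only. It assumes that frames left
    over after the last full clip are discarded.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    num_clips = num_frames // FRAMES_PER_CLIP
    return {
        "num_clips": num_clips,
        # One (96, 64) patch per clip; the exact tensor layout fed to the
        # model may differ, this only shows the sizes named in the text.
        "clip_batch_shape": (num_clips, FRAMES_PER_CLIP, NUM_MEL_BANDS),
        # Matches the (num_clips, 128) shape documented for `embs`.
        "embedding_shape": (num_clips, EMBEDDING_SIZE),
        "unused_frames": num_frames % FRAMES_PER_CLIP,
    }


layout = clip_layout(300)
print(layout)
```

With 300 frames, three full clips fit and 12 frames are left over under the stated assumption.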

Interface

__call__(self, filepath: str)

Args:

filepath:
- the input audio path
- supported types: str

Returns:

The Operator returns a tuple Tuple[('embs', numpy.ndarray)] containing the following fields:

embs:
- embeddings of the audio
- data type: numpy.ndarray
- shape: (num_clips,128)
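As a rough sketch of how the returned value might be consumed, the snippet below reads the `embs` field and averages the per-clip vectors into one vector for the whole file. The `Outputs` class is a hypothetical stand-in for the operator's return type, the input array is fake data rather than real embeddings, and mean pooling is an assumption about downstream use, not something the operator does.

```python
from typing import NamedTuple

import numpy as np

EMBEDDING_SIZE = 128  # from the documented shape (num_clips, 128)


class Outputs(NamedTuple):
    """Hypothetical stand-in for the tuple the operator returns."""
    embs: np.ndarray


def pool_clips(outputs: Outputs) -> np.ndarray:
    """Average the clip embeddings into a single (128,) vector.

    Mean pooling is an illustrative choice, not part of the operator.
    """
    embs = outputs.embs
    if embs.ndim != 2 or embs.shape[1] != EMBEDDING_SIZE:
        raise ValueError(f"expected shape (num_clips, {EMBEDDING_SIZE}), got {embs.shape}")
    if embs.shape[0] == 0:
        raise ValueError("no clips: the audio may be shorter than one clip")
    return embs.mean(axis=0)


# Fake data standing in for a real result with three clips.
fake_outputs = Outputs(embs=np.ones((3, EMBEDDING_SIZE), dtype=np.float32))
file_vector = pool_clips(fake_outputs)
print(file_vector.shape)
```

Checking the shape before pooling makes an empty result (zero clips) fail loudly instead of silently producing NaN values.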

Requirements

The required Python packages are listed in requirements.txt.

How it works

The towhee/torch-vggish Operator implements audio embedding and can be added to a towhee pipeline. For example, it is the key operator of the pipeline audio-embedding-vggish.

Reference

[1]. https://github.com/tensorflow/models/tree/master/research/audioset/vggish
[2]. https://tfhub.dev/google/vggish/1
