Custom Embeddings

LightlyOne lets you provide your own embeddings for your images (new in 2.3.15). This is useful if you have a special image type that requires a different embedding model than the one provided with the LightlyOne Worker.

Let's assume the following structure for the Input datasource:

input_datasource/
├── image_1.png
└── subdir/
    ├── image_2.png
    └── image_3.png

To provide the embeddings to the LightlyOne Worker, store them as a CSV file in the .lightly/embeddings/ directory of the LightlyOne datasource:

lightly_datasource/
└── .lightly/
    └── embeddings/
        └── custom_embeddings.csv

The embedding CSV file must be UTF-8 encoded and have the following format:

filenames,embedding_0,embedding_1,...,embedding_31,labels
image_1.png,-0.86,0.49,...,0
subdir/image_2.png,0.86,0.78,...,0
subdir/image_3.png,-1.09,-0.93,...,0

The entries in the filenames column must match the image filenames in the Input datasource, given relative to its root (for example, subdir/image_2.png). Every embedding dimension is stored as a separate column (embedding_0, embedding_1, ..., embedding_31). The last column of the embedding file must be named labels and contain only the value 0 for all entries.
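The format described above can be produced with the Python standard library alone. The following is a minimal sketch, not part of the LightlyOne API: the helper names `list_relative_filenames` and `write_embeddings_csv` are hypothetical, and the sketch assumes you have a local copy of the Input datasource and have already computed one embedding vector per image with your own model.

```python
import csv
from pathlib import Path


def list_relative_filenames(root, extensions=(".png", ".jpg", ".jpeg")):
    """Return image paths below `root`, relative to it, with forward slashes."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


def write_embeddings_csv(csv_path, filenames, embeddings):
    """Write one row per image: filename, embedding values, and the label 0."""
    if not embeddings or len(filenames) != len(embeddings):
        raise ValueError("Need exactly one embedding per filename.")
    num_dims = len(embeddings[0])
    header = ["filenames"] + [f"embedding_{i}" for i in range(num_dims)] + ["labels"]
    # newline="" lets the csv module control line endings; the file is UTF-8 encoded.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for name, vector in zip(filenames, embeddings):
            if len(vector) != num_dims:
                raise ValueError(f"{name}: expected {num_dims} dimensions.")
            # The labels column holds the value 0 for every entry.
            writer.writerow([name, *vector, 0])
```

Deriving the filenames column from the directory listing, rather than typing it by hand, helps keep the entries consistent with the image filenames in the Input datasource.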

❗️ Number of Embedding Dimensions

The number of embedding dimensions can be changed by adding or removing embedding columns in the CSV file. The number of dimensions must match the num_ftrs option in the lightly config (see below).
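One way to avoid a mismatch is to read the number of dimensions from the CSV header instead of hard-coding it. The sketch below is an illustration only: `count_embedding_dims` is a hypothetical helper, and the strict check that the columns are named embedding_0, embedding_1, ... in order is an assumption based on the format shown above.

```python
import csv


def count_embedding_dims(csv_path):
    """Return the number of embedding_<i> columns in the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    if len(header) < 3 or header[0] != "filenames" or header[-1] != "labels":
        raise ValueError("Header must start with 'filenames' and end with 'labels'.")
    dims = header[1:-1]
    # Assumption: columns are numbered consecutively starting at 0.
    expected = [f"embedding_{i}" for i in range(len(dims))]
    if dims != expected:
        raise ValueError("Embedding columns must be embedding_0, embedding_1, ... in order.")
    return len(dims)
```

The returned value can then be used for num_ftrs in the lightly config, so the two stay in sync if the embedding model changes.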

To schedule a run with custom embeddings, the embedding file's location must be passed in the worker config. The path of the embedding file must be relative to the .lightly/embeddings/ directory in the LightlyOne datasource.
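As a small illustration of the relative-path rule, the sketch below strips the .lightly/embeddings/ prefix from a path inside the LightlyOne datasource. The helper name `embeddings_config_value` is hypothetical and not part of the LightlyOne API.

```python
from pathlib import PurePosixPath

EMBEDDINGS_DIR = ".lightly/embeddings"


def embeddings_config_value(datasource_path):
    """Return the path of an embedding file relative to .lightly/embeddings/."""
    # relative_to raises ValueError if the file is not inside EMBEDDINGS_DIR.
    return PurePosixPath(datasource_path).relative_to(EMBEDDINGS_DIR).as_posix()
```

For the layout shown earlier, `embeddings_config_value(".lightly/embeddings/custom_embeddings.csv")` yields `custom_embeddings.csv`, which is the value to pass in the worker config.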

from lightly.api import ApiWorkflowClient

# Create the LightlyOne client to connect to the API.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN", dataset_id="MY_DATASET_ID")

client.schedule_compute_worker_run(
    worker_config={
        # Path relative to .lightly/embeddings/ in the LightlyOne datasource.
        "embeddings": "custom_embeddings.csv",
    },
    selection_config={
        "n_samples": 50,
        "strategies": [
            {
                "input": {"type": "EMBEDDINGS"},
                "strategy": {"type": "DIVERSITY"},
            }
        ],
    },
    lightly_config={
        "model": {
            "num_ftrs": 32,  # Must match number of embedding dimensions.
        }
    },
)
