Using Terraform or OpenTofu to Manage Your Cloud Resources

This provider lets you manage your Marqo Cloud resources as infrastructure as code with either Terraform or OpenTofu. Both tools let you create, change, and delete cloud resources safely and predictably.

OpenTofu is an open-source fork of Terraform, and this provider is compatible with both. For more information, see the OpenTofu and Terraform documentation.

Provider Registries

OpenTofu Provider: registry.opentofu.org/marqo-ai/marqo
Terraform Provider: Available on the Terraform Registry
Source Code: github.com/marqo-ai/terraform-provider-marqo

Platform Choice

The provider works identically on both platforms, with only minor differences in setup:

Terraform:

Use provider source marqo-ai/marqo
Use terraform commands

OpenTofu:

Use provider source registry.opentofu.org/marqo-ai/marqo
Use tofu commands
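
As a minimal sketch of the Terraform variant, the required_providers block might look like the following. The source address is the one listed above; the version pin is an assumption that mirrors the 1.0.1 used in the sample configurations on this page, so check the Terraform Registry for the release you actually want.

```hcl
terraform {
  required_providers {
    marqo = {
      # Terraform registry address (OpenTofu uses registry.opentofu.org/marqo-ai/marqo)
      source = "marqo-ai/marqo"
      # Assumed version pin; adjust to the release you intend to use
      version = "1.0.1"
    }
  }
}
```

The rest of the configuration (provider block, data sources, resources) is the same on both platforms.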

Getting Started

Installation

Install either Terraform or OpenTofu
Create a new directory for your Marqo configuration
Create a .tf file with your configuration (see examples below)
Initialize and apply your configuration

Basic Commands

Terraform	OpenTofu	Description
`terraform init`	`tofu init`	Initialize working directory
`terraform refresh`	`tofu refresh`	Update state with remote changes
`terraform plan`	`tofu plan`	Preview changes before applying
`terraform apply`	`tofu apply`	Apply the planned changes
`terraform destroy`	`tofu destroy`	Remove all resources

See the OpenTofu documentation or the Terraform documentation for more information on using each tool.

Overview of Features

The Marqo provider supports the following:

A data source called marqo_read_indices that reads all of the Marqo indexes in your account.
A resource called marqo_index that creates and manages a Marqo index.

Sample Configuration

For both of the examples below, create a file named terraform.tfvars in each configuration directory containing your API key as follows:

marqo_api_key = "<KEY>"

Note that the host must be set to "https://api.marqo.ai/api/v2".
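
Because terraform.tfvars holds a secret, one option is to mark the variable as sensitive so that Terraform and OpenTofu redact it in plan and apply output. This is a sketch of a variation on the variable declaration used in the examples below, not something the provider requires; sensitive is a standard variable argument in Terraform 0.14 and later and in OpenTofu.

```hcl
variable "marqo_api_key" {
  type        = string
  description = "Marqo API key"
  # Redacts the value in CLI output; it is still stored in plain text in the state file
  sensitive = true
}
```

It is also worth keeping terraform.tfvars out of version control.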

Reading All Indexes in Your Account (datasource)

terraform {
  required_providers {
    marqo = {
      # OpenTofu registry address; on Terraform, use "marqo-ai/marqo" instead
      source  = "registry.opentofu.org/marqo-ai/marqo"
      version = "1.0.1"
    }
  }
}

provider "marqo" {
  host    = "https://api.marqo.ai/api/v2"
  api_key = var.marqo_api_key
}

data "marqo_read_indices" "example" {
  id = 1
}

output "indices_in_marqo_cloud" {
  value = data.marqo_read_indices.example
}

variable "marqo_api_key" {
  type        = string
  description = "Marqo API key"
}
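
If you only need the index names rather than the full objects, a for expression over the items attribute can narrow the output. This is a sketch that assumes items is a list of objects with an index_name field, as the marqo_read_indices schema on this page describes; the output name index_names is illustrative.

```hcl
output "index_names" {
  # Collect just the name of every index returned by the data source
  value = [for item in data.marqo_read_indices.example.items : item.index_name]
}
```

This block would sit alongside the existing output in the same configuration.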

Creating and Managing a Structured Index (resource)

terraform {
  required_providers {
    marqo = {
      # OpenTofu registry address; on Terraform, use "marqo-ai/marqo" instead
      source  = "registry.opentofu.org/marqo-ai/marqo"
      version = "1.0.1"
    }
  }
}

provider "marqo" {
  host    = "https://api.marqo.ai/api/v2"
  api_key = var.marqo_api_key
}

resource "marqo_index" "example" {
  index_name = "example_index_dependent_2"
  settings = {
    type                = "structured"
    vector_numeric_type = "float"
    all_fields = [
      { "name" : "text_field", "type" : "text", "features" : ["lexical_search"] },
      { "name" : "image_field", "type" : "image_pointer" },
      {
        "name" : "multimodal_field",
        "type" : "multimodal_combination",
        "dependent_fields" : {
          # Keys must match the names of fields declared in all_fields
          "image_field" : 0.8,
          "text_field" : 0.1
        },
      },
    ],
    number_of_inferences = 1
    storage_class        = "marqo.basic"
    number_of_replicas   = 0
    number_of_shards     = 2
    tensor_fields        = ["multimodal_field"],
    model                = "open_clip/ViT-L-14/laion2b_s32b_b82k"
    normalize_embeddings = true
    inference_type       = "marqo.CPU.small"
    text_preprocessing = {
      split_length  = 2
      split_method  = "sentence"
      split_overlap = 0
    }
    image_preprocessing = {
      patch_method = null
    }
    ann_parameters = {
      space_type = "prenormalized-angular"
      parameters = {
        ef_construction = 512
        m               = 16
      }
    }
  }
}

output "created_index" {
  value = marqo_index.example
}

variable "marqo_api_key" {
  type        = string
  description = "Marqo API key"
}

Here are some additional configuration files that you may find useful:

Configuration for creating a LanguageBind index (video/audio/image/text)
Configuration for creating an unstructured index with custom models

Detailed Configuration Options

Required

Option	Type	Description
`api_key`	String	The Marqo API key. Can be set with MARQO_API_KEY environment variable.
`host`	String	The Marqo API host. Can be set with MARQO_HOST environment variable.
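
As the table notes, both options can come from environment variables. A minimal sketch of that setup follows; it assumes the provider falls back to MARQO_API_KEY and MARQO_HOST when the arguments are omitted from the provider block, which the table suggests but which is worth confirming against the provider version you use.

```hcl
# Assumes these are exported in the shell before running init/plan/apply:
#   export MARQO_API_KEY="<KEY>"
#   export MARQO_HOST="https://api.marqo.ai/api/v2"
provider "marqo" {
  # api_key and host intentionally omitted; assumed to be read from the environment
}
```

With this approach no terraform.tfvars file or marqo_api_key variable is needed.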

marqo_read_indices (Data Source)

Required

Option	Type	Description
`id`	String	The unique identifier for the resource

Read-Only

Option	Description
`items`	List of Indexes in your account
`last_updated`	The last time the resource was updated

Nested Schema for `items`

Field	Description
`all_fields`	The fields to make available in a structured index.
`tensor_fields`	An array of fields that will be vectorised.
`ann_parameters`	The hyperparameters for the ANN method
`created`	The creation date of the index
`docs_count`	The number of documents in the index
`docs_deleted`	The number of documents deleted from the index
`filter_string_max_length`	The filter string max length
`index_name`	The name of the index
`index_status`	The status of the index
`inference_type`	The type of inference used by the index
`marqo_endpoint`	The Marqo endpoint used by the index
`marqo_version`	The version of Marqo used by the index
`model`	The model used by the index
`model_properties`	The properties of the model used by the index
`normalize_embeddings` (Boolean)	Indicates if embeddings should be normalized
`number_of_inferences`	The number of inferences made by the index
`number_of_replicas`	The number of replicas for the index
`number_of_shards`	The number of shards for the index
`search_query_total`	The total number of search queries made on the index
`storage_class`	The storage class of the index
`store_size`	The size of the index storage
`text_preprocessing`	The text preprocessing settings for text
`treat_urls_and_pointers_as_images` (Boolean)	Indicates if URLs and pointers should be treated as images
`treat_urls_and_pointers_as_media` (Boolean)	Indicates if URLs and pointers should be treated as media
`type`	The type of the index
`vector_numeric_type`	The numeric type of the vector

Nested Schema for `items.model_properties`

Field	Description
`type`	The type of the model
`dimensions`	The dimensions of the model
`tokens`	The tokens of the model
`model_location`	The location of the model
`url`	The URL of the model
`trust_remote_code`	Indicates if the remote code should be trusted

Nested Schema for `items.all_fields`

Field	Type
`dependent_fields`	Map of Number
`features`	List of String
`name`	String
`type`	String

Nested Schema for `items.ann_parameters`

Field	Description
`parameters`	Hyperparameters for the ANN method
`space_type`	The space type for ANN parameters

Nested Schema for `items.ann_parameters.parameters`

Field	Description
`ef_construction`	The efConstruction parameter for ANN
`m`	The m parameter for ANN

Nested Schema for `items.text_preprocessing`

Field	Description
`split_length`	The split length for text preprocessing
`split_method`	The split method for text preprocessing
`split_overlap`	The split overlap for text preprocessing

marqo_index (Resource)

For default values in optional fields, please refer to the documentation pages on creating unstructured and structured indexes.

Required

Option	Type	Description
`index_name`	String	The name of the index.
`settings`	Attributes	The settings for the index.

Nested Schema for `settings`

Required

Field	Type
`inference_type`	String
`model`	String
`number_of_inferences`	Number
`number_of_replicas`	Number
`number_of_shards`	Number
`storage_class`	String
`type`	String

Optional

Field	Type
`model_properties`	Attributes
`all_fields`	Attributes List
`ann_parameters`	Attributes
`filter_string_max_length`	Number
`normalize_embeddings`	Boolean
`tensor_fields`	List of String
`text_preprocessing`	Attributes
`image_preprocessing`	Attributes
`video_preprocessing`	Attributes
`audio_preprocessing`	Attributes
`treat_urls_and_pointers_as_images`	Boolean
`treat_urls_and_pointers_as_media`	Boolean
`vector_numeric_type`	String
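
To illustrate the split between required and optional settings, here is a sketch of an unstructured index that sets only the required fields plus one optional field. The resource label unstructured_example and the index name are hypothetical, the model, inference type, and storage class are copied from the structured example above, and whether the provider accepts this exact minimal set without further optional fields is an assumption.

```hcl
resource "marqo_index" "unstructured_example" {
  index_name = "example_unstructured_index"
  settings = {
    # Required settings
    type                 = "unstructured"
    model                = "open_clip/ViT-L-14/laion2b_s32b_b82k"
    inference_type       = "marqo.CPU.small"
    number_of_inferences = 1
    number_of_replicas   = 0
    number_of_shards     = 1
    storage_class        = "marqo.basic"

    # Optional setting: treat URLs and pointers in documents as images
    treat_urls_and_pointers_as_images = true
  }
}
```

This block relies on the same terraform and provider blocks shown in the sample configurations.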

Nested Schema for `settings.model_properties`

Field	Type
`type`	String
`dimensions`	Number
`tokens`	Number
`model_location`	String
`url`	String
`trust_remote_code`	Boolean

Nested Schema for `settings.all_fields`

Field	Type
`dependent_fields`	Map of Number
`features`	List of String
`name`	String
`type`	String

Nested Schema for `settings.ann_parameters`

Field	Type
`parameters`	Attributes
`space_type`	String

Nested Schema for `settings.ann_parameters.parameters`

Field	Type
`ef_construction`	Number
`m`	Number

Nested Schema for `settings.image_preprocessing`

Field	Type
`patch_method`	String

Nested Schema for `settings.text_preprocessing`

Field	Type
`split_length`	Number
`split_method`	String
`split_overlap`	Number

Nested Schema for `settings.video_preprocessing`

Field	Type
`split_length`	Number
`split_overlap`	Number

Nested Schema for `settings.audio_preprocessing`

Field	Type
`split_length`	Number
`split_overlap`	Number
