Unstructured Vs Structured Indexes
Marqo supports both structured and unstructured indexes. Much of the functionality is shared between the two index types, but there are some key differences that influence which one to use.
Key Differences
- Searchable Attributes: Structured indexes allow you to specify which attributes to search at search time. Unstructured indexes cannot do this and always search all attributes.
- HNSW Behaviour: Structured indexes place each tensor field in its own HNSW graph, whereas unstructured indexes use a single HNSW graph for all tensor fields. As a result, restricting a search with searchable attributes can improve performance in addition to limiting the search to specific fields.
- Lexical Search: Structured indexes allow you to specify which fields are available for lexical search. Unstructured indexes treat all text fields as lexical search fields.
- Mutability: Structured indexes have a fixed schema that cannot be changed once the index is created. The schema must be a superset of the fields in each document, but a document does not have to contain every field in the schema. Unstructured indexes can have fields added at any time.
- Partial Updates: Partial updates are supported for both index types; however, they are significantly faster for structured indexes. Partial updates for unstructured indexes are identical to adding the document with `useExistingTensors` set to `true`.
- Filtering: Structured indexes allow you to specify which fields are filterable. Unstructured indexes automatically make fields filterable; if a field contains text, `filterStringMaxLength` is used to determine whether it is filterable, based on the length of the string.
- Performance: Structured indexes are faster in general. The largest performance difference is that structured indexes consume less memory. Partial updates to document metadata are also significantly faster for structured indexes.
- Error Handling: Structured indexes will throw an error if you try to add a document with a field that is not in the schema. Unstructured indexes will add the field to the schema and continue. The strictness of structured indexes can help catch errors early.
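As a rough illustration of the differences above, the two index types might be configured along the following lines. This is a minimal sketch rather than a definitive configuration: the field names (`title`, `description`, `price`) and the model name are hypothetical, and the setting keys (`allFields`, `tensorFields`, `features`, `filterStringMaxLength`) are assumed to match the Marqo index settings reference, which should be checked for the version in use.

```python
# Hypothetical field names and model; setting keys are assumed to follow
# Marqo's index settings format and should be verified against the reference.
structured_settings = {
    "type": "structured",
    "model": "hf/e5-base-v2",
    # The schema is fixed at creation time; every document field must appear here.
    "allFields": [
        {"name": "title", "type": "text", "features": ["lexical_search", "filter"]},
        {"name": "description", "type": "text", "features": ["lexical_search"]},
        {"name": "price", "type": "float", "features": ["filter"]},
    ],
    # In a structured index, each tensor field gets its own HNSW graph.
    "tensorFields": ["title", "description"],
}

unstructured_settings = {
    "type": "unstructured",
    "model": "hf/e5-base-v2",
    # Text values are filterable only up to this length (assumed semantics).
    "filterStringMaxLength": 20,
}


def lexical_fields(settings):
    """Return the fields explicitly declared for lexical search in the settings."""
    return [
        field["name"]
        for field in settings.get("allFields", [])
        if "lexical_search" in field.get("features", [])
    ]


print(lexical_fields(structured_settings))    # ['title', 'description']
# Nothing is declared for an unstructured index: all text fields are lexical.
print(lexical_fields(unstructured_settings))  # []
```

With the Python client, dictionaries like these would typically be passed when creating the index (for example through a `settings_dict` argument to `create_index`, assuming that argument is available in the client version in use).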
When to Use Unstructured Indexes
Unstructured indexes are recommended in the following situations:
- Getting Started: If you are new to Marqo and want to get started quickly, unstructured indexes are the best choice due to their ease of use.
- Dynamic Schema: If you have a dynamic schema where fields are added frequently, unstructured indexes are the best choice.
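When relying on an unstructured index for a dynamic schema, it can help to check up front which text values would end up non-filterable. The helper below is a hypothetical client-side sketch, not part of Marqo: it assumes that a string is filterable when its length is at most `filterStringMaxLength`, and the limit of 20 and the example documents are made up for illustration.

```python
def non_filterable_strings(documents, filter_string_max_length=20):
    """List (doc_id, field) pairs whose string value exceeds the length limit.

    Assumes a string is filterable when len(value) <= filter_string_max_length.
    """
    too_long = []
    for doc in documents:
        for field, value in doc.items():
            if field == "_id":
                continue  # the document ID is not a regular field
            if isinstance(value, str) and len(value) > filter_string_max_length:
                too_long.append((doc.get("_id"), field))
    return too_long


# Documents in an unstructured index may carry different fields.
docs = [
    {"_id": "1", "title": "Red shoes", "colour": "red"},
    {"_id": "2", "title": "A long product title that will not be filterable", "brand": "acme"},
]

print(non_filterable_strings(docs))  # [('2', 'title')]
```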
When to Use Structured Indexes
Structured indexes are recommended in the following situations:
- Performance: If you require maximum performance, structured indexes are the best choice, especially for large indexes with continuous updates to documents.
- Production/Enterprise: If you are using Marqo in a production or enterprise environment, structured indexes are often a better choice due to the strictness of the schema and the ability to catch malformed documents early.
- Advanced Usage: If you require better control over searchable attributes, lexical search, and other features, then structured indexes are the best choice.
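The schema strictness described above can also be mirrored on the client side before documents are sent. The function below is a hypothetical sketch of that check, not Marqo's own validation: it flags fields that are absent from the schema while allowing documents that only use a subset of it. The schema and documents are made-up examples.

```python
def unknown_fields(document, schema_fields):
    """Return the document fields that are missing from the schema, ignoring _id."""
    return sorted(set(document) - set(schema_fields) - {"_id"})


# Hypothetical schema: the names of all fields declared for a structured index.
schema = ["title", "description", "price"]

ok_doc = {"_id": "1", "title": "Red shoes"}               # subset of the schema
bad_doc = {"_id": "2", "title": "Hat", "colour": "blue"}  # "colour" is not declared

print(unknown_fields(ok_doc, schema))   # []
print(unknown_fields(bad_doc, schema))  # ['colour']
```

A structured index would reject a document like `bad_doc`, whereas an unstructured index would accept it and add the new field.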