API Reference

This reference documents the Python API of Deep Lake.

Core Components

  • Dataset Classes: Documentation for Dataset, ReadOnlyDataset, and DatasetView classes
  • Column Classes: Documentation for Column and ColumnView classes
  • Types: Available data types including basic numeric types and ML-optimized types
  • Schemas: Pre-built schema templates for common data structures
  • Query Language: Complete TQL syntax and operations
  • Metadata: Dataset and column metadata management
  • Version Control: Version control, history, branches, and tags
  • Miscellaneous: Additional auxiliary functionality

Index Management

Deep Lake supports several index types that speed up search and queries. They are grouped below by the kind of data they apply to:

Text Indexes

  • BM25: Full-text search with BM25 similarity scoring
  • Inverted: Keyword-based text search
  • Exact: Exact text matching
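
The three text index kinds answer different questions, and a small standalone sketch may make the contrast concrete. The code below is a generic, pure-Python illustration and does not use or reproduce Deep Lake's implementation; the sample documents, the function names (keyword_lookup, exact_match, bm25_scores), and the BM25 parameters k1 and b are all assumptions chosen for the example.

```python
import math
from collections import defaultdict

# Hypothetical sample corpus; each string stands in for one row of a text column.
docs = [
    "deep lake stores tensors",
    "lake water is deep",
    "vector search over embeddings",
]
tokenized = [d.split() for d in docs]

# Inverted index: maps each term to the set of document ids that contain it.
inverted = defaultdict(set)
for doc_id, tokens in enumerate(tokenized):
    for term in tokens:
        inverted[term].add(doc_id)


def keyword_lookup(term):
    """Keyword search: which documents contain this term?"""
    return sorted(inverted.get(term, set()))


def exact_match(text):
    """Exact matching: which documents equal this string in full?"""
    return [i for i, d in enumerate(docs) if d == text]


def bm25_scores(query, k1=1.5, b=0.75):
    """Relevance ranking: score every document against the query with BM25."""
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    scores = [0.0] * n
    for term in query.split():
        df = len(inverted.get(term, ()))
        if df == 0:
            continue  # term appears nowhere, so it contributes nothing
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
        for doc_id in inverted[term]:
            tf = tokenized[doc_id].count(term)
            dl = len(tokenized[doc_id])
            norm = tf + k1 * (1 - b + b * dl / avgdl)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return scores


print(keyword_lookup("lake"))             # ids of documents containing "lake"
print(exact_match("lake water is deep"))  # ids of documents equal to the string
print(bm25_scores("deep lake"))           # one relevance score per document
```

In this sketch, keyword lookup and exact matching return a set of matching rows, while BM25 returns a score per row that can be sorted to rank results.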

Embedding Indexes

  • Clustered: Default clustering-based embedding search
  • ClusteredQuantized: Memory-efficient quantized embedding search
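
To give a sense of why a quantized index can be more memory-efficient, the sketch below stores an embedding as 8-bit integer codes plus one scale factor instead of 32-bit floats. This is a generic scalar-quantization illustration, not a description of how ClusteredQuantized works internally; the quantize and dequantize helpers and the sample vector are assumptions made for the example.

```python
from array import array


def quantize(vec):
    """Map floats to signed 8-bit codes that share a single scale factor."""
    # Fall back to a scale of 1.0 when the vector is all zeros.
    scale = max(abs(x) for x in vec) / 127 or 1.0
    codes = array("b", (round(x / scale) for x in vec))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the 8-bit codes."""
    return [c * scale for c in codes]


# Hypothetical 4-dimensional embedding stored as 32-bit floats.
vec = array("f", [0.12, -0.5, 0.33, 0.9])
codes, scale = quantize(vec)
approx = dequantize(codes, scale)

print(vec.itemsize, codes.itemsize)  # bytes per element: float32 vs int8
print(list(codes))
print(approx)
```

Each element shrinks from 4 bytes to 1 byte, and the reconstruction error per element is bounded by half the scale factor. That trade of precision for memory is the general idea behind quantized embedding search.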

Numeric Indexes

  • Inverted: Numeric value lookup optimization

Create and remove indexes with the Column class methods create_index() and drop_index(). See Column Classes for detailed examples.

Getting Started

For implementation guidance and examples, please refer to: