The Universal Data Engine

What is a Universal Data Engine?

A database focused on universal storage and data management, enabling compute with any API or tool.

From a single solution for one type of data and compute…

Monolithic Database
  • Compute
  • Data Management: Access Control & Logging, Data Versioning
  • Storage: Data Format & IO
Snowflake

… to pluggable storage and data management …

Compute Engine
  • Compute
  • Pluggable Data Management: Access Control & Logging, Data Versioning
  • Pluggable Storage: Data Format & IO

… to pluggable compute at planet-scale

Universal Data Engine
  • Data Management: Access Control & Logging, Data Versioning
  • Universal Storage: Data Format & IO
  • Pluggable Compute
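
To make the "pluggable compute" idea concrete, here is a minimal conceptual sketch in Python. It is not TileDB code: `DataManager`, `register`, `read`, and `run` are hypothetical names invented for illustration. The assumption is simply that one layer owns storage, access control, and logging, while any function can be plugged in as the compute step.

```python
from typing import Callable, Dict, List, Set


class DataManager:
    """Hypothetical storage and data management layer (illustration only)."""

    def __init__(self) -> None:
        self._arrays: Dict[str, List[float]] = {}
        self._readers: Dict[str, Set[str]] = {}
        self.log: List[str] = []

    def register(self, name: str, data: List[float], readers: Set[str]) -> None:
        # Store a copy of the data together with the users allowed to read it.
        self._arrays[name] = list(data)
        self._readers[name] = set(readers)

    def read(self, user: str, name: str) -> List[float]:
        # Every access attempt is logged, whether or not it is allowed.
        allowed = user in self._readers.get(name, set())
        self.log.append(f"{'READ' if allowed else 'DENIED'} {user} {name}")
        if not allowed:
            raise PermissionError(f"{user} may not read {name}")
        return list(self._arrays[name])


def run(manager: DataManager, user: str, name: str,
        compute: Callable[[List[float]], float]) -> float:
    # The compute step is any callable; the data layer does not care which.
    return compute(manager.read(user, name))


manager = DataManager()
manager.register("prices", [3.0, 1.0, 2.0], readers={"alice"})
total = run(manager, "alice", "prices", sum)    # one compute "plugin"
highest = run(manager, "alice", "prices", max)  # another, same data layer
```

The point of the sketch is the separation: swapping `sum` for `max` (or any other function) requires no change to how data is stored, governed, or logged.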

The TileDB Universal Data Engine

Manage any complex data beyond tables, with any API or tool beyond SQL, at planet scale and beyond clusters

Python, R, Java, C, C++, Go, Dask, Spark, Presto, MariaDB, PDAL, GDAL
Pluggable Compute
Efficient APIs and tool integrations
Arrays
TileDB Embedded
Universal storage engine
Cloud-optimized
Data versioning & time traveling
Serverless
TileDB Cloud
Planet-scale sharing
Serverless SQL & UDFs
Hosted Jupyter notebooks
Geospatial, Genomics, Dataframes

The TileDB Benefits

Arrays
Any Complex Data

Columnar tabular formats are inefficient for storing and handling the immense amounts of metadata and multi-dimensional data generated across industries such as genomics, geospatial, finance, and retail.

Interoperability
Extreme Interoperability

Don’t constrain your computations to domain-specific tools. For modern applications, force-fitting data into database tables and accessing that data via SQL and ODBC/JDBC is inefficient.

Collaboration
Planet-scale Sharing

It is an intensive engineering feat to share data and code outside your organization, involving setting up cloud bucket policies, handling key management, and exchanging Jupyter notebooks.

Serverless
No Deployment Hassles

Deploying scalable computations requires spinning up and maintaining clusters. Overprovisioning becomes unaffordable over time, whereas underprovisioning leads to unacceptable performance.

The TileDB Secret Sauce

Multi-dimensional arrays
Dense and sparse multi-dimensional arrays are the most general and efficient way to model any type of data and to work with most data science tools: from tables to genomic variants, imaging, video, and key-values.
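
As a rough illustration of why arrays can model such different data, the following pure-Python sketch represents an image as a dense 2D array, a table as a sparse 2D array, and a key-value store as a sparse 1D array. This is a toy model and not the TileDB API; `dense_slice`, `sparse_slice`, and the sample data are assumptions made up for this example.

```python
from typing import Any, Dict, List, Sequence, Tuple


def dense_slice(grid: List[List[int]], rows: range, cols: range) -> List[List[int]]:
    """Dense array: every cell has a value, so slicing is plain index arithmetic."""
    return [[grid[r][c] for c in cols] for r in rows]


def sparse_slice(cells: Dict[Tuple[Any, ...], dict],
                 ranges: Sequence[Tuple[Any, Any]]) -> Dict[Tuple[Any, ...], dict]:
    """Sparse array: only non-empty cells are stored, keyed by their coordinates.

    Returns the cells whose coordinates fall inside the inclusive (low, high)
    range given for each dimension.
    """
    return {
        coord: attrs
        for coord, attrs in cells.items()
        if all(low <= x <= high for x, (low, high) in zip(coord, ranges))
    }


# An image as a dense 2D array of pixel values.
image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
patch = dense_slice(image, range(0, 2), range(1, 3))

# A table as a sparse 2D array: dimensions (store_id, day), attribute "sales".
table = {
    (1, 10): {"sales": 250.0},
    (1, 11): {"sales": 300.0},
    (2, 10): {"sales": 120.0},
    (7, 42): {"sales": 90.0},
}
store_1 = sparse_slice(table, [(1, 1), (10, 11)])  # slice on two "columns" at once

# A key-value store as a sparse 1D array with a string dimension.
kv = {("apple",): {"value": 1}, ("banana",): {"value": 2}}
lookup = sparse_slice(kv, [("apple", "apple")])
```

In this model, a multi-column filter on a table and a key lookup are the same operation: a range query over array dimensions.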
Serverless infrastructure
The ideal approach for scalable compute on globally shared data should be unconstrained by cluster management and deployment. Serverless is the right path towards truly planet-scale sharing and compute.
The TileDB Recipe
Is it for you?
Arrays: TileDB Embedded
Store any data and metadata, and access it with various APIs and tools
Serverless: TileDB Cloud
Share data and code with anyone and scale out compute without hassles
Dense / Sparse Arrays & Dataframes
Cloud-Optimized
APIs (Python, R, C, C++, Java, Go)
Integrations (MariaDB, PrestoDB, Spark, Dask, ...)
Data Versioning & Time Travelling
Serverless SQL
Serverless UDFs and Task Graphs
Sharing (within and beyond organizations)
Access Control
Pricing
Open-Source
Pay-as-you-go
ACID
Streaming

Use Cases

Multi-dimensional arrays make TileDB the perfect fit for any application domain and data type

  • Store tables in an efficient, cloud-optimized format
  • Perform fast multi-column slicing using TileDB’s array indexing
  • Experience data versioning and time-traveling
  • Associate arbitrary metadata with source data
  • Scale out your SQL and UDFs in a serverless manner
  • Get SQL and UDFs with direct slicing via multiple language APIs
  • Discover and use public datasets
  • Share dataframes globally within and outside your organization
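
The versioning, time-traveling, and metadata items in the list above can be sketched with a small in-memory model. This is a conceptual toy, not TileDB's implementation or API: `VersionedSparseArray`, its `write` and `read` methods, and the integer timestamps are all hypothetical. The assumption is that every write is kept as an immutable, timestamped batch, and a read replays only the batches written at or before the requested time.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


class VersionedSparseArray:
    """Toy append-only 2D sparse array with time travel (illustration only)."""

    def __init__(self) -> None:
        self._batches: List[Tuple[int, Dict[Coord, float]]] = []
        self.metadata: Dict[str, str] = {}  # arbitrary key-value metadata

    def write(self, timestamp: int, cells: Dict[Coord, float]) -> None:
        # Never overwrite in place: each write becomes a new timestamped batch.
        self._batches.append((timestamp, dict(cells)))

    def read(self, rows: Tuple[int, int], cols: Tuple[int, int],
             at: Optional[int] = None) -> Dict[Coord, float]:
        """Slice inclusive row/column ranges as of time `at` (latest if None)."""
        result: Dict[Coord, float] = {}
        # Replay batches oldest first so that newer values win.
        for timestamp, cells in sorted(self._batches, key=lambda b: b[0]):
            if at is not None and timestamp > at:
                break
            for (r, c), value in cells.items():
                if rows[0] <= r <= rows[1] and cols[0] <= c <= cols[1]:
                    result[(r, c)] = value
        return result


sales = VersionedSparseArray()
sales.metadata["units"] = "USD"
sales.write(1, {(1, 10): 250.0, (2, 10): 120.0})
sales.write(2, {(1, 10): 275.0, (3, 12): 80.0})  # corrects one cell, adds another

latest = sales.read(rows=(1, 3), cols=(10, 12))
as_of_1 = sales.read(rows=(1, 3), cols=(10, 12), at=1)
```

Because earlier batches are never modified, reading with `at=1` still returns the original value for cell `(1, 10)` even after it was corrected at time 2.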