A conversation with Brandon “Bubba” Brooks: Why governance and auditing are key to agentic AI

Originally published: Feb 24, 2026

Table Of Contents:

How cheaper sequencing leads to the problem of too much data

Why Tile’s governance and auditing features are key

What to expect from agentic AI in 2026

As generating genomic and transcriptomic data becomes easier and less costly, life sciences organizations must find a scalable way to analyze this multimodal data—especially if they’re going to unlock the potential of AI agents. For Tile to provide a scalable solution, we need the insight of versatile data management experts able to work in a variety of modalities and frameworks. This is where Bubba Brooks, our resident Data Wizard, has made a big impact. In this interview, Brooks shares how he got started in bioinformatics, the aspects of Tile’s solutions that he finds most impressive, and where he sees agentic AI going in 2026. 

How cheaper sequencing leads to the problem of too much data

Let’s start with you. Why did you study and make your career in bioinformatics? 

Brooks: During my time as a microbiology Ph.D. student at Berkeley, I had to become proficient in languages like R and Ruby and early microbiome data packages to analyze data for my dissertation. That got me hooked on programming. I noticed that practically all the small to medium-sized biotech startups really needed bioinformatics people to scale their questions. They had the wet lab workers to generate the data, but lacked the skilled bioinformaticians to actually process the data at scale. And even if those people knew how to work with Nextflow or how to write orchestrators, getting those tools running on cloud infrastructure was a bottleneck for lots of companies.

That’s where I found my opportunity. My evolution has been from data creation to data analyst to workflow orchestrator and now essentially a “data wizard” of all things data knowledge aggregation. When you run all these orchestrators, all these workflows, you need to organize the data in a way that not only enables your bioinformaticians to find it but also lets your experimentalists access it easily. For me, this often means creating a dashboard that makes analysis more approachable for people who don't have time to do all that data wrangling.

What led you to join Tile as a “Data Wizard”?

Brooks: As I said, my career evolution is being on the bleeding edge of the next big problem, and for most biotechs that problem is having too much data. With companies like Roche, Ultima, and Element launching sequencers that can sequence 30x genomes for less than $100, generating genomic data has become easier and cheaper than ever. However, this means all these pharma and biotech organizations now have tons of genomic data, but lack the means to actually analyze that data at scale. The result is organizations often put all their data in blob storage or throw it into S3 or other cloud storage, and then try to analyze it with traditional tooling like Nextflow. If you have a lot of money and time, that approach might work, but it was begging for a smarter, more efficient way.

I joined Tile because they’re taking on this challenge with Carrara. It's designed to handle the complex, multi-dimensional "frontier data" (like genomics, imaging, and clinical data) that traditional databases struggle with while unifying it all in one place. I was impressed by how Carrara combines the modern tools and UI convenience of a well-designed platform with a powerful multi-dimensional array engine on the back end. It makes me really wish my former bioinformatics teams had had this tool years ago. It would have saved a world of headaches.

Why Tile’s governance and auditing features are key

Can you go into more detail about how Tile’s solution addresses the multimodal data struggles you described?

Brooks: The multimodal columnar array format at the core of Tile’s platform helps life sciences organizations selectively query and process the data they want in a way that makes sense for that specific modality. And when you layer onto that the ability to add metadata and auditing, Tile becomes a really nice turnkey solution for a lot of problems biotech companies face. 

Governance is a great example. Other solutions support a small, compact columnar format for single-cell data, but lack the governance features that Tile offers. You’re going to need strong auditing and governance when you go to the FDA for approval of any treatments you develop. Too many bioinformatics teams don’t think about that until they’re further along in their pipeline. Add in the performance and compute efficiency that Tile brings and you get much smoother advancement through drug trials or diagnostic applications.

On a simpler level, Tile makes it really easy to find and query the data you need efficiently. In most ecosystems that I've worked in, if you want to work on reference files you usually put them in an EFS (Elastic File System) volume or blob storage. However, you typically have to keep making copies of this data to hand it to different stakeholders for governance and privacy reasons. This leads to a lot of duplicated data and slow processes.

What to expect from agentic AI in 2026

Agentic AI is probably the most talked about technology of 2026. What challenges do you see in implementing agentic AI that don’t get enough attention?

Brooks: I believe agentic AI is going to have a big year. Look at how AI coding assistants made a huge leap in the last year or so. In 2024 a lot of their output was half-baked and you couldn’t really trust anything, and now they help you get a lot done faster and you’re almost foolish if you’re not using them. I’ve coded some AI agents in 2025, and while I don’t use them for production coding yet, I know people who do. We just need to be clear-eyed about what the capabilities of agentic AI are now, and what their limitations are.

For example, some people want agents to set up infrastructure and run it for them. I think that's a place to be careful and it gets a little risky. Consider how some bioinformatics requests can cost $10,000 in compute, depending on how big the question is. Should a user really trust an AI agent to independently spin up something that costs that amount of money? Or take it to the compliance level. Are you really going to trust an agent to dig into personally identifiable information and not exfiltrate it somewhere else? It’s a huge liability for a life sciences or healthcare organization to leak that kind of protected data. I believe these are problems that will eventually be solved, but today they are big concerns.
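One way to picture the caution Brooks describes is a simple approval gate that sits between an agent and the infrastructure it wants to use. The sketch below is illustrative only: `JobRequest`, `review_request`, and the 100-dollar auto-approval limit are hypothetical names and values chosen for this example, not part of any Tile product or agent framework.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """A compute job an agent would like to launch (hypothetical structure)."""
    description: str
    estimated_cost_usd: float
    touches_pii: bool = False


def review_request(req: JobRequest, auto_approve_limit_usd: float = 100.0) -> str:
    """Decide whether an agent may proceed alone or must wait for a person.

    The limit is an arbitrary placeholder; a real policy would come from
    the organization's own budget and compliance rules.
    """
    # Anything touching personally identifiable information goes to a human.
    if req.touches_pii:
        return "needs_human_approval"
    # Expensive jobs also go to a human, whatever data they touch.
    if req.estimated_cost_usd > auto_approve_limit_usd:
        return "needs_human_approval"
    return "auto_approved"


if __name__ == "__main__":
    requests = [
        JobRequest("small alignment test", 20.0),
        JobRequest("cohort-wide joint calling", 10_000.0),
        JobRequest("patient record lookup", 5.0, touches_pii=True),
    ]
    for req in requests:
        print(f"{req.description}: {review_request(req)}")
```

The point of the design is that the agent never makes the final call on cost or protected data; it can only propose a job, and the policy decides who signs off.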

As more organizations explore using AI agents, how do you see Tile helping agentic AI work more effectively?

Brooks: One key thing gets back to Tile’s core platform being able to selectively slice and load the right data. One of the main pain points in AI use cases is that AI companies charge by the token. You get an API key, you go to Anthropic or OpenAI, and you have input and output costs. Just as you pay for data egress on AWS, you pay input and output costs for using these frontier models. And if you don't need 90% of your VCF file because you're only interested in a certain type of mutation, but you don't have a data format that lets you selectively slice out only the 10% you care about, you're going to pay token costs on the 90% you don't need in your API call. And that's huge. If you've ever used these frontier AI models and you're trying to build something complicated, you can hit your token limits pretty fast. This becomes a huge issue for organizations that don’t have some type of lazy-loadable core data product like Tile.

Tile’s governance and security features are important too. If we’re going to trust AI agents to work with sensitive data, we need strong auditing and oversight. When Tile provides secure integrations to platforms like Databricks and Snowflake, we’re trying to make sure our governance and compliance layer is bulletproof. That sets up our customers to innovate with AI agents and know their sensitive data will be safe.
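To make the token-cost point concrete, here is a minimal sketch in plain Python. It builds a toy VCF in which one record in ten is a deletion, keeps only those records, and compares a rough token estimate for the full text against the selected slice. Everything here is an assumption made for illustration: the helper names are hypothetical, the four-characters-per-token rule is a crude heuristic rather than a real tokenizer, and the filter scans the whole text in memory, whereas an array-backed store would aim to avoid reading the unwanted records at all.

```python
def make_toy_vcf(n_records: int = 100) -> str:
    """Build a small synthetic VCF; every tenth record is a deletion."""
    header = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    rows = []
    for i in range(n_records):
        # Deletions have a REF allele longer than ALT; the rest are SNVs.
        ref, alt = ("AT", "A") if i % 10 == 0 else ("A", "G")
        rows.append(f"chr1\t{100 + i}\t.\t{ref}\t{alt}\t50\tPASS\t.")
    return "\n".join(header + rows)


def is_deletion(ref: str, alt: str) -> bool:
    """True if any ALT allele is shorter than REF."""
    return any(len(ref) > len(a) for a in alt.split(","))


def select_records(vcf_text: str, keep) -> list[str]:
    """Return the data lines whose (REF, ALT) pair satisfies `keep`."""
    selected = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if keep(fields[3], fields[4]):  # REF and ALT columns
            selected.append(line)
    return selected


def rough_token_count(text: str) -> int:
    """Crude estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


if __name__ == "__main__":
    vcf = make_toy_vcf(100)
    deletions = select_records(vcf, is_deletion)
    print("records kept:", len(deletions))
    print("approx. tokens, full file:", rough_token_count(vcf))
    print("approx. tokens, selected slice:", rough_token_count("\n".join(deletions)))
```

Sending only the selected slice to a model keeps the prompt roughly proportional to the records that matter, which is the saving Brooks is describing.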

To learn more about how Tile helps empower organizations to develop AI agents, contact us.
