import React from 'react';
import CustomersTemplate from '@components/CustomersTemplate/index';
import metaImage from '@page-components/case-studies/assets/Rady-Childrens-case-study-page-thumb.jpg';

const RadyPage = () => {
  const sections = [
    {
      bodySubTitle: 'Clinical context',
      bodyText: (
        <>
          <p>
            For the past 20 years, NBS has used mass spectrometry (MS) for biochemical analysis of dried blood spot samples, which are
            collected by pricking a baby&apos;s heel within the first few days of life. While NBS-MS is a mature and well-understood
            process, it detects only 37 core genetic disorders (plus a secondary list of 26 as of February 2023), according to the&nbsp;
            <a href="https://www.hrsa.gov/advisory-committees/heritable-disorders/rusp" target="blank">
              <u>Recommended Uniform Screening Panel</u>
            </a>
            &nbsp;from the HRSA, a U.S. federal agency.
          </p>
          <br />
          <p>
            The team at RCIGM knew from clinical experience that rWGS can identify a much wider range of genetic disorders across a much
            wider population of children. They assembled a group of experts to identify 388 medically actionable genetic diseases to
            include in an expanded NBS-rWGS effort. With a wider range of genetic disorders and reasonable clinical testing costs on the
            horizon, the next step was to evaluate the approach using historical data.
          </p>
        </>
      ),
    },
    {
      bodySubTitle: 'Testing TileDB queries',
      bodyText: (
        <>
          <p>
            To gauge the accuracy of the expanded screening panel, the team assembled a test dataset of known genetic diagnoses. RCIGM
            collaborated with a range of clinical and biotechnology industry experts to evaluate the false-positive rate within a cohort of
            “4,376 critically ill children and their parents who received rWGS at RCIGM for diagnosis of suspected genetic disorders.” The
            cohort&apos;s Variant Call Format (VCF) samples were ingested into a 3-dimensional TileDB array on Amazon S3 using the&nbsp;
            <a href="https://github.com/TileDB-Inc/TileDB-VCF" target="blank">
              <u>TileDB-VCF</u>
            </a>
            &nbsp;open-source library, making this data analysis-ready on cloud storage. TileDB queries were refined until they reached an
            acceptable false-positive rate. To ensure statistical significance, the team evaluated false positives against 454,707 whole-exome
            sequences from the&nbsp;
            <a href="https://www.ukbiobank.ac.uk/" target="blank">
              <u>UK Biobank</u>
            </a>
            , bringing the rate to 0.27%.
          </p>
        </>
      ),
    },
    {
      bodySubTitle: 'Solving n + 1 with TileDB',
      bodyText: (
        <>
          <p>
            The team knew that as the number of NBS-rWGS disorders grows over time, so too will the computational complexity of population
            genomics itself. This is known as the n + 1 problem: when a new sample introduces a variant into the cohort, researchers
            would otherwise have to reexamine every Genomic VCF (gVCF) to determine whether the other subjects are in fact reference
            or simply have no read coverage at that position.
          </p>
          <br />
          <p>
            The typical data engineering solution to this problem would be periodic batch processing jobs; however, since each human genome
            closely overlaps with reference assemblies (99.8% similarity, or ~5 million unique base pairs among ~3 billion genomic
            positions), the sparse nature of this data makes batch processing tedious and expensive. Because TileDB natively represents
            sparse data on cloud object storage — neatly compressed and without bloated filler values — Dr. Kingsmore and his collaborators
            built the data management system for their NBS-rWGS program around sparse TileDB arrays and evaluated the cost reduction of
            n + 1 sample ingestion using a c6g.xlarge Amazon EC2 instance. Here are the highlights:
          </p>
        </>
      ),
      bodyList: [
        'TileDB ingestion from S3 cost $0.06 vs. $2.18 for traditional file-based approaches on the same EC2 instance.',
        'Reduced the time it took to add a new sample to an existing dataset and compute common variant statistics across the entire population to ~22 minutes.',
      ],
    },
    {
      bodySubTitle: 'Shortening the diagnostic odyssey',
      bodyText: (
        <>
          <p>
            Reducing the diagnostic burden on clinicians was the ultimate goal. A key component was incorporating a wide range of
            annotation data, particularly through TileDB&apos;s support for&nbsp;
            <a href="https://fabricgenomics.com/resource/fabric-gem/" target="blank">
              <u>Fabric GEM™</u>
            </a>
            , an AI-driven genetic diagnosis tool. Faster data access and more efficient n + 1 computations
            using TileDB and AWS were significant benefits that contributed to automated and accurate genetic diagnoses, critical to
            clinicians in the NICU who may not have genomics expertise.
          </p>
          <br />
          <p>
            In this dynamic environment, TileDB is positioned to allow clinicians to revisit the pediatric disease landscape as four key
            sources of information grow:
          </p>
        </>
      ),
      bodyList: [
        'Biobank-scale population frequencies and associated phenotypes',
        'Patient genomic variant databases',
        'Curated variant annotation databases',
        'Interventions',
      ],
      bodyText2: (
        <>
          <p>
            These improvements are currently supporting another effort led by RCIGM,&nbsp;
            <a href="https://www.nature.com/articles/s41467-022-31446-6" target="blank">
              <u>Genome-to-Treatment</u>
            </a>
            &nbsp;(GTRx), an automated system for genetic diagnosis and acute management guidance. By speeding up NBS-rWGS data analysis and
            feeding results into GTRx, RCIGM and its collaborators are working to improve clinical outcomes worldwide.
          </p>
        </>
      ),
    },
    {
      bodyTitle: '2023 Update',
      bodySubTitle: 'TileDB for population genomics',
      bodyText: `A year after publication in spring 2022, here's an update on TileDB's role:`,
      bodyList: [
        'Managing a group of arrays totaling ~13 TB.',
        'RCIGM achieved a 7-hour clinical turnaround time, within which TileDB loads new samples in minutes and returns query results in seconds.',
        'Optimized allele frequency calculations and other TileDB improvements drove query costs down further, to $0.03.',
        'Working to move UKBB WGS storage to TileDB for faster iterative analysis.',
      ],
    },
  ];

  return (
    <CustomersTemplate
      pageName="customers-quest"
      helmet={{
        title: 'Case Study: Rady Children’s | TileDB',
        description: 'Discover how TileDB supports an expanded newborn screening and genetic diagnosis program at Rady Children’s Hospital.',
        shareImage: {
          url: metaImage,
          width: 1200,
          height: 627,
        },
      }}
      header="Customer Paper"
      title="TileDB supports expanded newborn screening and genetic diagnosis program for Rady Children’s Hospital"
      description={
        <>
          <a href="https://www.rchsd.org/" target="blank">
            <u>Rady Children’s Hospital</u>
          </a>
          &nbsp;and&nbsp;
          <a href="https://radygenomics.org/" target="blank">
            <u>Rady Children’s Institute for Genomic Medicine</u>
          </a>
          &nbsp;(RCIGM) in San Diego are at the forefront of applying rapid whole-genome sequencing (rWGS) to newborn screening (NBS).
          In&nbsp;
          <a href="https://www.cell.com/ajhg/fulltext/S0002-9297(22)00355-X" target="blank">
            <u>a 2022 paper</u>
          </a>
          &nbsp;published in The American Journal of Human Genetics, lead author Dr. Stephen Kingsmore, President and CEO of RCIGM,
          describes the methods used to design a new, faster, and more comprehensive form of NBS built on rWGS technology (NBS-rWGS). The
          experiments involved retrospective analysis of large amounts of genomic variant data, which was managed by TileDB for fast and
          cost-efficient access.
        </>
      }
      gradient="blue"
      sections={sections}
      cardText={
        <>
          <a href="https://www.youtube.com/watch?v=YE28OrfcuGs" target="blank">
            <u>Watch the video:</u>
          </a>
          &nbsp;Dr. Stephen Kingsmore builds a compelling case for TileDB as the only solution that can handle petabytes of genomic data
          and nightly analysis to impact clinical analysis in the NICU.
        </>
      }
      cardAuthor="Dr. Stephen Kingsmore"
      cardAuthorDescription="President/CEO, Rady Children’s Institute for Genomic Medicine"
      domain="Genomics"
      datatypes={['Variant data']}
    />
  );
};

export default RadyPage;
