Author: Saibal Banerjee, Ph.D., Co-Founder & CTO, tomtA

  • Predicting rare instances for AI/ML in a GDPR world

    Success with AI/ML depends first and predominantly on your data. “Garbage in, garbage out” is both a familiar and an accurate refrain. The integrity of your data pipeline will be the biggest determinant of whether you can produce impactful AI/ML products to solve critical problems, the kind that face us in every industry to address…

  • HIPAA fails both patients and product innovators

    The HIPAA privacy rule fails to provide patient privacy or data value for enterprises that want to leverage sensitive data to innovate and improve patient care. A patient’s health information is vulnerable to disclosure, through accident or bad actor, even if an organization adheres to HIPAA. HIPAA should have established a much higher standard of…

  • Synthetic data is a hammer. Not everything is a nail.

    Synthetic data was designed to generate data, not to provide privacy. It is popular in the MLOps space, but it’s often used for the wrong applications. It is meant for amplifying data where you don’t have enough original data, and it’s the wrong method when applied to sensitive data that must be kept private, especially where GDPR requirements are…

  • The Sorry State of Data Anonymization Performance Metrics

    The lack of data anonymization standards forces customers to flip a coin when selecting a solution to protect their sensitive data. Many new data anonymization technologies have been launched to serve data scientists and ML engineers, but without a standard measure of privacy and utility, customers can’t make objective choices on which technology to use.…