Patronus AI launches EnterprisePII, the industry’s first LLM dataset for detecting business-sensitive information

NEW YORK, NY—THURSDAY, OCTOBER 19—Today, Patronus AI officially launched EnterprisePII, the industry’s first large language model (LLM) dataset for detecting business-sensitive information. AI researchers and developers can now freely use EnterprisePII to test whether their LLMs detect confidential information typically found in business documents like meeting notes, commercial contracts, marketing emails, performance reviews, and more.

Detecting and redacting business-sensitive information is a critical problem faced by many real-world enterprises using LLMs. Without this capability, there is a risk that LLMs leak confidential information to the public, third parties or internal users without permission. This is a business-critical risk that is creating uncertainty and holding enterprises back from adopting LLMs.

Typical PII detection models are based on Named Entity Recognition (NER), and only identify Personally Identifiable Information (PII) such as addresses, phone numbers, or information about individuals. These models fail to detect most business-sensitive information, such as revenue figures, customer accounts, salary details, project owners, and notes about strategy and commercial relationships.
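
The gap described above can be illustrated with a minimal sketch. It swaps in simple regular expressions for a trained NER model, so it is a simplified stand-in and not how any production detector or Patronus AI's platform works. The patterns, the `find_pii` helper, and both sample sentences are hypothetical and do not come from EnterprisePII.

```python
import re

# Illustrative patterns only. Real PII detectors typically rely on trained
# NER models; these regexes are an assumption made to keep the sketch small.
PII_PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return every pattern match found in text, keyed by PII type."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


# Hypothetical sample sentences, written for this sketch.
personal = "Call Dana at 555-867-5309 or write to dana@example.com."
business = "Q3 revenue fell to $4.2M and the Acme renewal is at risk."

print(find_pii(personal))  # phone number and email address are flagged
print(find_pii(business))  # nothing is flagged, although the text is sensitive
```

The second sentence contains a revenue figure and a note about a customer relationship, yet a detector that looks only for personal identifiers has nothing to match.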

The exposure of business-sensitive information through LLM usage is a well-recognized problem. These privacy concerns also hold back research and innovation, making it harder for research teams to test, compare, and track how effectively their LLMs work for real-world use cases.

Introducing EnterprisePII

EnterprisePII is the industry’s first LLM dataset for detecting business-sensitive information. The dataset contains 3,000 examples of annotated text excerpts from common enterprise text types such as meeting notes, commercial contracts, marketing emails, performance reviews, and more.

MosaicML, now part of Databricks, has included the EnterprisePII dataset in their open-source training code repository, llm-foundry. A format of the dataset that is compatible with their open-source training library, Composer, is also included in MosaicML’s LLM Eval Gauntlet, a comprehensive measurement technique for evaluating LLMs. Both will be publicly available soon.

Additionally, as a part of this release, Patronus AI has included the EnterprisePII dataset in its platform, and customers can now evaluate their LLM system on enterprise PII leakage using the platform.

While differential privacy and data practices for individuals’ PII are well studied, less attention has been paid to the exposure of sensitive data when using LLMs for enterprise applications. This issue has become critical due to the rapid adoption of generative AI models. Companies training language models routinely question whether they are leaking sensitive internal or customer data, and many have warned their employees about the use of public LLMs for work applications.

Organizations typically use categories like those listed below to determine how data should be stored, used, and distributed.

Public data. Information that is freely available and accessible to the general public.
Internal data. Information that is generated, collected, and used within an organization or a specific entity. It is usually proprietary.
Confidential data. Sensitive information that requires protection from unauthorized access, disclosure, or use.
Restricted data. A subset of confidential data that is subject to additional regulatory or legal restrictions. It typically includes highly sensitive information that has specific legal or contractual requirements for protection.
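
As a rough sketch, the four categories above could be encoded as an ordered type so that handling rules can be written as comparisons. The `DataClass` name, the numeric ordering, and the example policy are assumptions made for illustration; they are not part of EnterprisePII or the Patronus AI platform.

```python
from enum import IntEnum


class DataClass(IntEnum):
    """Data categories, ordered from least to most sensitive (assumed ordering)."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3  # a subset of confidential data with extra legal requirements


# Hypothetical policy: only public data may be sent to an external LLM.
MAX_LEVEL_FOR_EXTERNAL_LLM = DataClass.PUBLIC


def may_send_to_external_llm(level: DataClass) -> bool:
    """Return True if data at this level is allowed under the example policy."""
    return level <= MAX_LEVEL_FOR_EXTERNAL_LLM


for level in DataClass:
    print(level.name, may_send_to_external_llm(level))
```

An organization with a more permissive policy would raise `MAX_LEVEL_FOR_EXTERNAL_LLM`; the threshold here is only an example.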

Examples of confidential data include:

References to private company financials
Information about sales or customer accounts
Performance reviews and HR data
Customer-specific information

The following are a few representative examples from the dataset:

The full EnterprisePII dataset is available on the Patronus AI platform, where customers can detect, at scale, instances in which their LLM systems output confidential information. For more information about EnterprisePII, reach out to Patronus AI via contact@patronus.ai.

About Patronus AI

Patronus AI is the first automated AI evaluation and security platform for enterprise. The platform enables enterprise development teams to score LLM performance, generate adversarial test cases, benchmark LLMs, and more. Customers use Patronus AI to detect LLM mistakes at scale and deploy AI products safely and confidently. For more information, visit https://www.patronus.ai/.
