Legal Issues for Data Professionals: AI Creates Hidden Data and IP Legal Problems

As data has become a new and valuable class of business asset, and as AI is increasingly used in business operations, that use has created hidden data and IP risks. These risks must be identified, and measures must then be taken to protect against both a loss of rights and an infringement of rights.

This column uses a series of hypothetical fact patterns to illustrate how problems can arise and what effects can result if they do. As discussed below, a particular risk is introduced when third-party vendors use their AI on their customers’ databases.

Hypothetical 1

A company has a business unit-level database that contains data from the company’s customers along with other data. The customer data is provided to the company under a confidentiality and nondisclosure agreement between the customer and the business unit, so the business unit’s use of the data is subject to rules of confidentiality and nondisclosure. The company wants to use AI to perform analytics on the data in the database, which, as noted, includes confidential customer information. What are the risks?

From a legal perspective, the core risks are a) that analytics run on the database will disclose customer information in violation of the confidentiality agreement and b) that the use of the customer information could be outside the scope of permitted use. It is common for confidentiality and nondisclosure obligations to be integrated into the governing agreement rather than set out in a standalone nondisclosure agreement (NDA). Further, in many industries, the terms of customer confidentiality will be tailored specifically to the transaction and the agreement. As a result, determining the permitted uses, the prohibited uses, and the gray areas requires both business and legal analysis.

It is important to note that the company and the customer may also have entered into an NDA at or before the beginning of the transaction. The NDA may have different terms than the final agreement, but data received under both the NDA and the agreement, with their different terms, will be in the database.

In addition, a company’s use of confidential information outside its permitted scope of use constitutes a breach of contract and could result in liability and an award of monetary damages against the company. Further, the misuse could trigger an indemnification obligation on the part of the company, and the amount of damages in such circumstances is likely to be higher than for a regular breach of contract.

As lawyers become more aware of the risks of AI, we can predict that confidentiality and nondisclosure agreements will evolve to cover the use of both generative AI and non-generative AI.

Companies will begin to ask data professionals how to exclude customer confidential data from analytics run on a database that includes that data.
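One way a data professional might approach that question is to tag each record when it is ingested and filter on the tag before any analytics or AI step runs. The following is a minimal sketch, not a definitive implementation. It assumes a hypothetical `confidential` flag and `source` field on each record, and the `Record`, `analytics_safe`, and `average_amount` names are illustrative only. Whether a given record must be flagged is a legal determination under the governing agreement, not something the code decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: int
    source: str          # hypothetical: who supplied the row
    confidential: bool   # hypothetical flag set when the data is ingested
    amount: float


def analytics_safe(records):
    """Return only the records not flagged as customer confidential."""
    return [r for r in records if not r.confidential]


def average_amount(records):
    """Stand-in for whatever analytics or AI step runs downstream."""
    if not records:
        return 0.0
    return sum(r.amount for r in records) / len(records)


records = [
    Record(1, "internal", False, 100.0),
    Record(2, "customer_a", True, 900.0),
    Record(3, "internal", False, 300.0),
]

# Filter first, then analyze, so flagged rows never reach the analytics step.
safe = analytics_safe(records)
result = average_amount(safe)
```

Filtering before the analytics step, rather than removing confidential values from the output afterward, keeps the flagged data out of the process entirely, which is closer to what a restriction on use appears to require.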

Hypothetical 2

The customer data contains personally identifiable information (PII). What are the risks?

PII is subject to regulation, and regulatory schemes often overlap. Laws differ from jurisdiction to jurisdiction, and the picture is especially complicated in the U.S., where each state can have its own rules.

The legal risk of wrongful disclosure and use of PII is that it subjects the company to regulatory fines, sanctions, and lawsuits, in addition to breach of contract claims if the PII is part of the confidential information category.

Hypothetical 3

Same facts as Hypothetical 1, except that there are many customers involved. What are the risks?

More customers mean more confidentiality agreements, and more confidentiality agreements mean more variations in the scope of permitted use. The risks are magnified because the company must comply not with one scope of permitted use but with many different scopes at once.
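For data professionals, one way to picture the problem of multiple scopes is a lookup that records, for each customer agreement, the uses that agreement permits, and that refuses anything not listed. The sketch below is illustrative only. The customer names, the use labels, and the `PERMITTED_USES`, `is_permitted`, and `customers_allowing` names are all hypothetical, and translating a real agreement into such labels requires the legal and business analysis described in Hypothetical 1.

```python
# Hypothetical permitted-use scopes, one entry per customer agreement.
PERMITTED_USES = {
    "customer_a": {"billing", "support"},
    "customer_b": {"billing", "support", "internal_analytics"},
    "customer_c": {"billing"},
}


def is_permitted(customer, use, permitted_uses=PERMITTED_USES):
    """Deny by default: unknown customers and unlisted uses are not permitted."""
    return use in permitted_uses.get(customer, set())


def customers_allowing(use, permitted_uses=PERMITTED_USES):
    """Return, in sorted order, the customers whose agreement lists the use."""
    return sorted(c for c, uses in permitted_uses.items() if use in uses)


# Only data from these customers could feed the proposed analytics run.
eligible = customers_allowing("internal_analytics")
```

In this example only `customer_b` would be eligible for an analytics run labeled `internal_analytics`. The deny-by-default choice reflects the point in the text: with many agreements, the safer assumption is that a use is outside scope unless a specific agreement allows it.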

Hypothetical 4

Same facts as Hypothetical 3, except that the database includes confidential information of business partners as well as that of customers. What are the risks?

A critical legal risk here, which also applies to customer information, involves intellectual property. Confidential information is often protected by trade secret law, so wrongful disclosure or use can result in liability for trade secret misappropriation. It is also foreseeable that copyright infringement and patent infringement can occur. Because these forms of IP protection are different but overlapping, the infringement analysis becomes complicated.

The nature of the business relationship also affects the risk calculus. If the parties are involved in a joint venture, they often have a competitive as well as a cooperative relationship. This makes the wrongful use and disclosure of trade secret information a greater business risk and, because of the additional avenues of liability, a greater legal risk.

As noted above, some lawyers will begin to address the application of AI to business trade secrets, especially as they and their clients acknowledge that AI technology (including large language models) will be applied to databases that contain trade secrets. When the use of AI is virtually inevitable, the legal task becomes creating policies around that use and reconsidering how permitted use should be addressed. The conventional legal methodologies for restricting the use and disclosure of traditional trade secrets will therefore need to evolve to address the legal and business risks that accompany the benefits of AI. Another way to consider this is that “disclosure” and “use” will mean something different in the AI context. This, in turn, means that the definitions of confidentiality and nondisclosure in agreements will change to reflect how AI is applied.

Hypothetical 5

Same facts as Hypothetical 4, except that instead of using its own AI, the company hires a third-party vendor that brings its AI technology as part of its services and applies that AI to the company’s data. What are the risks?

Here, the risks may increase because the vendor has not fine-tuned its AI to the company, which means the AI has more “black box” operational features. This leads to more unknowns about what the AI will produce and less knowledge of the magnitude of the risk. This, in turn, will complicate vendor-customer contract negotiations because, in all likelihood, the vendor will want to reduce its financial exposure when it does not know what risks will result or how they can be controlled.


The use of third-party vendor AI is one of the most serious risks to customers. AI is evolving quickly, and the results of analytics and generative AI use cases are difficult to predict. Because of the known unknowns, wrongful disclosure creates business and legal risks of a high magnitude. It may also create a new vector for cybersecurity attacks.

William A. Tanenbaum

William A. Tanenbaum is a data, technology, privacy, and IP lawyer, and a partner in the 100-year-old New York law firm Moses Singer. Who’s Who Legal says Bill is a “go-to expert” on “the management of and protection of data across a variety of sectors.” It named him “one of the leading names” in AI and data, and ranked him as one of the international “Thought Leaders in Data.” Chambers, America’s Leading Lawyers for Business, says Bill has “notable expertise in cybersecurity, data law, and IP,” has a “solid national reputation,” and “brings extremely high integrity, a deep intellect, fearlessness, and a practical, real-world mindset to every problem.” Bill is a member of the DAMA Speakers Bureau and the Past President of the International Technology Law Association. He is a graduate of Brown University (Phi Beta Kappa), Cornell Law School, and the Bob Bondurant School of High-Performance Driving.
