Stop “Evangelizing” Generative AI without Data Governance

Sam Panini
2 min read · Aug 21, 2023

To paraphrase Lutz Finger: when investing in a business or using a service, checking whether it relies on public or proprietary data should be among the first steps of a product team's due diligence.

How real is the moat of training data which has non-trivial costs?

Even Andreessen Horowitz has public articles making it clear that the training costs are high (they will come down).

The business case for the information architecture also has to cover external expenses, such as defending lawsuits.

The CEO of Databricks said:

“The leading companies in the future are going to be data and AI companies — healthcare, retail, you name it….in five or 10 years, to be the CEO in any of these industries, you’ll need to have a data-and-AI background.”

Product thought leaders are writing today about how the product function needs to upskill.

The implication is that data products will be even more entrenched in B2B workflows and B2C offerings.

The phrase “data is the new oil” was coined by a mathematician in 2006.

Note that dinosaur juice is a volatile material that is highly regulated for a reason: it can literally blow up in your face.

If “software is eating the world” and digital transformation is inevitable, then integrating data governance should be Step 0 for leading Product organizations.

Put On Your Thinking Cap

Caveat Emptor:

The future promised by AI is written with stolen words.

A business evaluating the return on investment should consider:

  • where did the words in the language model come from?
  • where will training data come from in the future?
  • do we trust the quality of the sentences?
  • is the extraction, refinement, and use of data sustainable and non-toxic to operators and consumers?

I think it’s a business imperative that Product teams explore answers to these questions.
