Watch the video of pics from Legalweek in NYC 2025 - My 20th Legalweek/Legaltech
Thursday, March 27, 2025
Friday, March 21, 2025
Under whose umbrella? Navigating the Specialized Needs of Information Governance and Legal Operations
Under Whose Umbrella: Navigating the Specialized Needs of
Information Governance and Legal Operations
In the corporate world, two umbrellas, Information Governance
(IG) and Legal Operations (Legal Ops), shelter a sprawling array of specialized
needs, each vying for attention in an era of digital transformation. These
domains aren’t just buzzwords; they are frameworks that manage risk, ensure
compliance, and unlock value from data. But what lies beneath each umbrella?
How do their scopes intersect, and who holds the handle, be it the CIO, CISO,
CTO, or even the CEO? Let’s unpack this, spotlighting eDiscovery as a pivotal
element, alongside cybersecurity, computer forensics, IT, Legal IT, data
protection, records management, archival of records, data governance, data
privacy, risk management, and compliance.
Under the Information Governance Umbrella
Information Governance is the backbone of an organization’s
data strategy, a holistic approach to managing information assets across their
lifecycle. It’s about control, accountability, and foresight. Beneath this
umbrella, specialized needs emerge:
- eDiscovery:
The process of identifying, collecting, and producing electronically
stored information (ESI) for legal proceedings. IG ensures eDiscovery is
defensible: think retention policies that prevent spoliation, or data
mapping that locates ESI fast. It’s the foundation for litigation
readiness. I have argued for years that eDiscovery is a subset of an
information governance program.
- Cybersecurity:
Protecting data from breaches is non-negotiable. IG defines access
controls and encryption standards, aligning with security protocols to
safeguard sensitive information.
- Computer Forensics:
When incidents occur, IG supports forensic analysis, tracking
data trails to uncover breaches or misuse, often feeding into eDiscovery
efforts.
- IT:
The operational engine, IT executes IG policies, deploying systems for
storage, retrieval, and security. It’s the plumbing beneath the strategy.
- Data Protection:
IG ensures compliance with laws like GDPR or CCPA, setting
rules for data handling and breach response.
- Records Management:
From creation to disposal, IG governs how records are
classified, stored, and purged, balancing utility with regulatory
mandates.
- Archival of Records:
Long-term preservation falls here, ensuring historical
data remains accessible yet secure, often for audits or litigation.
- Data Governance:
A subset of IG, this focuses on data quality, consistency,
and ownership, critical for analytics and compliance.
- Data Privacy:
IG overlaps with privacy, enforcing policies that protect
personal data and manage consent.
- Risk Management:
By identifying data vulnerabilities, IG mitigates
financial and reputational risks.
- Compliance:
The glue that binds it all. IG ensures adherence to industry standards and
regulations.
Who Holds the IG Umbrella? Typically, the Chief
Information Officer (CIO) or Chief Data Officer (CDO), if the role exists, wields
control, given their oversight of IT and data strategy. However, the Chief
Information Security Officer (CISO) often co-owns cybersecurity and data
protection, while the Chief Compliance Officer (CCO) may weigh in on regulatory
alignment. In some firms, the CEO steps in when IG escalates to enterprise-wide
risk, signaling its strategic weight.
Under the Legal Operations Umbrella
Legal Operations, meanwhile, is the business engine of the
legal department, optimizing processes, managing costs, and aligning legal work
with corporate goals. Its umbrella covers needs that often overlap with IG but
serve a distinct purpose:
- eDiscovery:
Here, Legal Ops focuses on execution: managing vendors, streamlining
review workflows, and cutting costs. While IG sets the stage, Legal Ops
runs the play, often leveraging third-party solutions for efficiency.
- Cybersecurity:
Legal Ops collaborates with IG to address breach fallout (think litigation
risk or regulatory fines) rather than owning prevention.
- Computer Forensics:
Legal Ops taps forensics for evidence in disputes or
investigations, relying on IG’s groundwork.
- Legal IT:
A specialized subset of IT. Legal Ops owns tech stacks like
eDiscovery platforms, contract management systems, and case analytics: tools
that boost legal productivity.
- Data Protection:
Legal Ops ensures legal processes (e.g., contracts, NDAs)
comply with protection laws, leaning on IG for policy.
- Records Management:
Legal Ops manages legal-specific records (court filings, legal
hold obligations, agreements), while IG handles broader retention.
- Archival of Records:
Legal Ops archives case files for future reference, often
relying on IG’s systems.
- Data Governance:
Less central here, but Legal Ops uses IG’s data standards
for legal analytics or reporting.
- Data Privacy:
Legal Ops navigates privacy in legal contexts (for example,
client data in discovery), relying on IG’s framework.
- Risk Management:
Legal Ops mitigates legal risks (e.g., litigation
exposure), distinct from IG’s broader data risks.
- Compliance:
Legal Ops ensures legal activities meet regulatory and ethical standards,
overlapping with IG’s compliance arm.
Who Holds the Legal Ops Umbrella? The General Counsel (GC)
or Chief Legal Officer (CLO) typically oversees Legal Ops, with a Legal
Operations Manager handling day-to-day execution. The Chief Technology Officer
(CTO) may influence Legal IT, but control rarely shifts outside legal
leadership unless escalated to the CEO for budget or strategic calls.
The Nexus Debate: Where’s the Line?
The overlap between IG and Legal Ops, especially with
eDiscovery, sparks debate. IG builds the infrastructure (e.g., data retention
for eDiscovery), while Legal Ops drives its application (e.g., review
efficiency). But the nexus blurs with shared needs:
- Cybersecurity and Data Privacy:
IG owns the policies; Legal Ops handles legal
fallout. Who’s accountable when a breach triggers litigation?
- Legal IT vs. IT:
Legal Ops demands tailored tools, but IG’s IT backbone
supports them. Does the CIO or CLO dictate tech priorities?
- Compliance:
Both chase it, but IG’s scope is enterprise-wide, while Legal Ops is
legal-centric. Who resolves conflicts?
This tension often hinges on control. If the CIO or CISO
dominates IG, Legal Ops may feel sidelined, relying on IT without steering it.
If the GC holds sway, IG might bend to legal priorities, neglecting broader
data needs. The CEO becomes the tiebreaker when silos clash, but proactive
firms appoint a Chief Data Officer (CDO) or Chief Privacy Officer (CPO) to
bridge the gap, aligning both umbrellas under a unified vision.
Beyond the List: Additional Factors
- AI and Analytics:
Tools like Needle (from The Project Consultant) sit at
the IG-Legal Ops intersection, analyzing data for legal insights. Whose
budget funds them?
- Vendor Management:
Legal Ops often owns eDiscovery vendors, but IG may
oversee data security vendors, another overlap point.
- Cultural Buy-In:
Neither umbrella works without stakeholder alignment. Does the
C-suite or do department heads drive adoption?
Conclusion: A Shared Canopy
Information Governance and Legal Operations aren’t rivals; they’re
partners under a shared canopy. IG provides the data foundation; Legal Ops
turns it into action. eDiscovery exemplifies this dance: IG ensures readiness,
Legal Ops delivers results. Cybersecurity, IT, and the rest weave through both,
but their ownership depends on who holds the umbrella, and how well they
collaborate. As disputes over nexus persist, the answer isn’t one leader (CIO,
CISO, or GC), but a coalition, often led by the CEO or a hybrid role like the
CDO. Under that shared umbrella, the organization’s future is more likely to be protected.
Tuesday, March 18, 2025
How Better Data Drives Superior Generative AI Results
How Better Data Drives Superior Generative AI Results:
In the rush to develop and deploy generative AI solutions, we often overlook a fundamental truth: the quality of AI outputs is directly determined by the quality of its training data. Data that has been accurately classified prior to training creates a foundation for more reliable, useful, and trustworthy AI systems.
Why Classification Matters
When training large language models (LLMs), poorly classified data introduces noise and inconsistencies that the model inevitably learns and reproduces.
Consider these impacts:
1. Contextual Understanding: Precisely classified data helps models understand when and where specific information applies, reducing irrelevant or inappropriate responses.
2. Reduced Hallucinations: Well-classified training data creates clearer boundaries for an AI's knowledge, making it less likely to "hallucinate" or fabricate information when operating outside its knowledge base.
3. Enhanced Specialization: Models trained on accurately classified domain-specific data demonstrate superior performance in specialized fields like legal, medical, or technical domains.
4. Improved Reasoning: Clear classification patterns in training data translate to better logical reasoning capabilities in the resulting AI.
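As a rough illustration of what "classifying before training" can look like in practice, the sketch below splits records into per-domain datasets and sets aside anything unlabeled or weakly labeled for review. The record fields (`label`, `confidence`), the 0.8 threshold, and the function name `partition_by_label` are all assumptions made for this example, not a prescribed method.

```python
from collections import defaultdict

# Hypothetical records: each carries a domain label and a classifier confidence.
records = [
    {"text": "The court granted the motion to compel.", "label": "legal", "confidence": 0.97},
    {"text": "Patient presented with acute symptoms.", "label": "medical", "confidence": 0.93},
    {"text": "lol check this out", "label": "legal", "confidence": 0.41},
    {"text": "Untitled scan 0043", "label": None, "confidence": 0.0},
]

MIN_CONFIDENCE = 0.8  # assumed cut-off; tune for your own data


def partition_by_label(items, min_confidence=MIN_CONFIDENCE):
    """Group confidently labeled texts by domain; return the rest for review."""
    buckets = defaultdict(list)
    rejected = []
    for item in items:
        if item["label"] is None or item["confidence"] < min_confidence:
            rejected.append(item)  # unlabeled or low-confidence: keep out of training
        else:
            buckets[item["label"]].append(item["text"])
    return dict(buckets), rejected


datasets, needs_review = partition_by_label(records)
print(sorted(datasets), len(needs_review))
```

Only the confidently labeled records reach a training set; the rest go to a review queue rather than being silently dropped, which keeps the classification step auditable.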
The Business Case:
Organizations investing in data classification before AI training are seeing tangible benefits:
- 40-60% reduction in model retraining cycles
- Significantly higher accuracy in domain-specific applications
- Reduced risk of compliance issues and reputational damage
- More efficient use of computing resources during training
Looking Forward:
As we move from the "early adoption" phase of generative AI to more mature implementations, the competitive advantage will increasingly belong to those who prioritize data quality over quantity. The most successful AI implementations will be built on foundations of meticulously classified, contextually rich datasets.
#GenerativeAI #DataQuality #MachineLearning #AIStrategy #DataClassification #ediscovery #informationgovernance #dataprotection #dataprivacy #edrm #aceds #arma #iapp #compliance #grc #legalweek2025
Wednesday, March 12, 2025
Building the Roads for Generative AI – Using ETL (Extract, Transform, Load) What is an ETL tool?
Building the Roads for Generative AI – Using ETL (Extract,
Transform, Load)
What is an ETL tool?
In the context of generative artificial intelligence, an ETL
tool is a software solution or process that handles “Extract, Transform, Load”
operations, adapted to the unique needs of AI systems and their supporting
agents. Traditionally used in data warehousing, ETL has evolved in the AI era
to play a critical role in preparing and managing data for generative models
and the agentic frameworks that orchestrate their activities.
In a prior LinkedIn post, I used the metaphor of cars,
roads, and traffic regulations, likening them to the development of AI chatbots.
To extend that analogy, let’s take a look at ETL.
ETL Defined: Extract, Transform, Load
- Extract:
Pulling raw data from various sources: databases, APIs, text files, social
media, or even unstructured outputs from generative AI itself (e.g.,
bot-generated text or images). Think of this as gathering the
"fuel" for the AI "car."
- Transform:
Cleaning, structuring, and enriching that data to make it usable for AI
models or agents. This might involve normalizing text, removing biases,
tagging metadata, or converting formats, essentially tuning the
"engine" so the car runs smoothly.
- Load:
Delivering the processed data into a destination, such as a training
dataset for a generative AI model, a knowledge base for an agent, or a
storage system for downstream use. This is like parking the car on the
"road" where it can be accessed or deployed.
In the generative AI and agentic world, ETL isn’t just about
moving data; it’s about enabling bots (the AI "cars") and agents (the
"roads") to function effectively while adhering to AI governance (the
"streetlights and road signs").
ETL in the Generative AI Context
Generative AI models, like GPTs or image generators, rely on
massive, high-quality datasets to produce coherent outputs. ETL tools ensure
that the data feeding these models is fit for purpose. For example:
- Extract:
An ETL tool might scrape web data, pull user prompts from an X feed, or
collect outputs from a bot’s prior runs.
- Transform:
It could filter out noise (e.g., irrelevant or toxic content), standardize
formats (e.g., turning PDFs into plain text), or enrich data with context
(e.g., adding sentiment labels).
- Load:
The processed data is then fed into the AI’s training pipeline or a
real-time inference system, ready for the bot to generate responses.
Without ETL, generative AI would be like a car with no fuel,
or worse, fuel that clogs the engine. The bot might "drive" (generate
outputs), but it’d be erratic, biased, or stuck in a ditch of bad data.
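The extract, transform, and load steps described in this section can be sketched as a tiny pipeline. This is a minimal illustration under stated assumptions, not a production tool: the function names, the one-word `BLOCKLIST` standing in for a real toxicity filter, the `MIN_WORDS` threshold, and the in-memory list used as the "training set" are all hypothetical.

```python
import re

BLOCKLIST = {"spamword"}  # hypothetical; a real pipeline would likely use a classifier
MIN_WORDS = 3             # assumed threshold for "too short to be useful"


def extract(sources: list[list[str]]) -> list[str]:
    """Extract: flatten raw text records from every source into one list."""
    return [record for source in sources for record in source]


def is_noise(text: str) -> bool:
    """Flag records that are too short or contain a blocklisted word."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(words) < MIN_WORDS or any(word in BLOCKLIST for word in words)


def transform(records: list[str]) -> list[str]:
    """Transform: normalize whitespace, then drop noise and duplicates."""
    seen: set[str] = set()
    kept = []
    for record in records:
        normalized = " ".join(record.split())
        key = normalized.lower()  # case-insensitive duplicate check
        if is_noise(normalized) or key in seen:
            continue
        seen.add(key)
        kept.append(normalized)
    return kept


def load(records: list[str], destination: list[str]) -> int:
    """Load: append processed records to the destination and return the count."""
    destination.extend(records)
    return len(records)


# Two made-up sources: a web scrape and a bot's earlier outputs.
web_scrape = ["Generative models need clean data.", "Buy now spamword offer today"]
prior_bot_runs = ["generative   models need clean data.", "Too short"]

training_set: list[str] = []
loaded = load(transform(extract([web_scrape, prior_bot_runs])), training_set)
print(f"Loaded {loaded} of 4 records")
```

Keeping each stage as its own function makes it easy to swap the blocklist for a stronger filter later without touching extraction or loading.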
ETL and Agentic Tools: Building the Roads
Agentic tools (autonomous systems that manage workflows,
coordinate multiple AI models, or interact with environments) are the
"roads" in our metaphor. They rely on ETL to keep traffic flowing
smoothly. Here’s how:
- Extract:
Agents need real-time data to act, like a customer service agent pulling
live chat logs or a research agent fetching the latest papers. ETL tools
extract this dynamically.
- Transform:
Agents often work with multiple bots or systems, so ETL harmonizes
disparate data (e.g., converting a generative AI’s text output into a
structured JSON for an agent to parse). It’s like paving a road to connect
different cities.
- Load:
ETL delivers the transformed data to the agent’s decision engine or memory
bank, enabling it to orchestrate tasks, like routing a bot’s output to the
right user or triggering another AI process.
For instance, an agent managing a fleet of generative AI
bots (e.g., one writes copy, another designs images) uses ETL to ensure all
inputs and outputs align, much like a highway system keeps cars moving in sync.
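To make the "text output into structured JSON" step concrete, here is a small sketch. It assumes the bot replies in loose "key: value" lines, which is an assumption for illustration only; the `ROUTES` table, the bot names, and the `human_review` fallback are likewise hypothetical.

```python
import json


def text_to_json(raw_output: str) -> str:
    """Turn 'key: value' lines from a bot's free-text reply into a JSON object."""
    record = {}
    for line in raw_output.splitlines():
        if ":" not in line:
            continue  # ignore chatter that carries no structured field
        key, value = line.split(":", 1)
        record[key.strip().lower()] = value.strip()
    return json.dumps(record, sort_keys=True)


# Hypothetical routing table: task type -> bot that should handle it.
ROUTES = {"copy": "copywriting_bot", "image": "image_bot"}


def route(payload: str) -> str:
    """Pick a destination from the parsed payload; fall back to a human."""
    task = json.loads(payload).get("task", "")
    return ROUTES.get(task, "human_review")


bot_reply = "Task: copy\nAudience: legal ops managers\nThanks!"
payload = text_to_json(bot_reply)
destination = route(payload)
print(payload, "->", destination)
```

The fallback matters: when the transform step cannot find a recognizable task, the agent hands off to a person instead of guessing.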
AI Governance: The Streetlights and Road Signs
ETL tools also intersect with AI governance, ensuring the
"cars" (bots) and "roads" (agents) operate safely and
legally. Governance elements, like data privacy laws, ethical guidelines, or
bias audits, rely on ETL to enforce compliance:
- Extract:
Only pulling data that meets regulatory standards (e.g., GDPR-compliant
sources).
- Transform:
Anonymizing sensitive info, flagging biased content, or adding
traceability tags, akin to installing road signs that say “Speed Limit 55”
or “No U-Turn.”
- Load:
Storing data in secure, auditable systems, ensuring the AI’s
"journey" can be tracked and justified, like streetlights
illuminating the path for accountability.
Without ETL, governance would be blind, unable to monitor or
steer the AI traffic.
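A governance-aware transform might look something like the sketch below, which redacts email addresses and attaches traceability tags before a record is stored. The regular expression is deliberately simple and will miss or over-match some edge cases; the tag names (`source`, `sha256`, `processed_at`) and the `crm_export` source label are assumptions for the example, not a compliance standard.

```python
import hashlib
import re
from datetime import datetime, timezone

# Simplified pattern for illustration; real PII detection needs far more care.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def anonymize(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)


def with_audit_tags(text: str, source: str) -> dict:
    """Redact the text, then attach tags that let the record be traced later."""
    redacted = anonymize(text)
    return {
        "text": redacted,
        "source": source,
        # Hash of the stored text, so later tampering can be detected.
        "sha256": hashlib.sha256(redacted.encode("utf-8")).hexdigest(),
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }


record = with_audit_tags("Contact jane.doe@example.com for the file.", "crm_export")
print(record["text"])
```

The hash and timestamp are the "streetlights" here: they do not change what the model sees, but they let an auditor confirm what was loaded and when.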
Examples of ETL Tools in This Space
- Traditional ETL Adapted:
Tools like Apache NiFi, Talend, or Informatica are being
repurposed to handle AI data pipelines, extracting from cloud sources and
transforming for model training.
- AI-Specific ETL:
Platforms like Hugging Face’s Datasets or Google’s Dataflow cater
to generative AI, offering pre-built transformations for text, images, or
multimodal data.
- Agentic ETL:
Frameworks like Needle or LangChain include ETL-like components to
manage data flows between agents and bots, ensuring seamless
"road" conditions.
The Big Picture: ETL as the Mechanic’s Shop
In our analogy, ETL is the mechanic’s shop, tuning the cars
(bots), paving the roads (agents), and installing the streetlights
(governance). It’s not glamorous, but it’s indispensable. Just as early
automobiles needed mechanics to keep them roadworthy, generative AI and its
agentic ecosystem depend on ETL to turn raw potential into reliable
performance. As we race down this digital highway, ETL tools are the unsung
heroes ensuring we don’t stall, and hopefully don’t crash, along the way.
Wednesday, January 8, 2025
Legal Operations at a Crossroads: How Corporate Legal Teams Will Continue to Drive Innovation and Implement Digital Transformation in 2025
Corporate legal teams are at a pivotal juncture, focusing on innovation and digital transformation for 2025. A recent survey by iManage, LegalMation, and Neota Logic highlights key trends:
- Measuring Success: Legal operations are developing metrics to assess their effectiveness, emphasizing the importance of data-driven decision-making.
- Generative AI Integration: While generative AI offers significant potential, teams face challenges in its deployment, including data security concerns and the need for specialized expertise.
- Innovation Promotion: There's a concerted effort to foster innovation within legal departments, aiming to enhance efficiency and adapt to evolving business needs.
- Automation and Contract Management: Prioritizing automation and improving contract management processes are central to streamlining operations and reducing manual workloads.
The survey also notes that 93% of respondents believe the role of legal operations professionals has expanded, reflecting their growing influence in driving organizational change.
In summary, corporate legal teams are proactively embracing technological advancements and innovative practices to navigate the complexities of the modern legal landscape.
https://www.law.com/legaltechnews/2024/12/17/legal-operations-at-a-crossroads-how-corporate-legal-teams-will-continue-to-drive-innovation-and-implement-digital-transformation-in-2025/?slreturn=20241227184503
Friday, November 15, 2024
The Power of ‘Other People's Time’ (OPT) and ‘Other People's Money’ (OPM) - book authored by Joe Bartolo, J.D.
The Power of ‘Other People's Time’ (OPT) and ‘Other People's Money’ (OPM)
“The Power of ‘Other People's Time’ (OPT) and ‘Other
People's Money’ (OPM)” reveals the secrets of harnessing external
resources to supercharge success, whether for a business or personal goals. The
book dives into practical strategies for leveraging the expertise and effort of
others (OPT) alongside financial resources from external sources (OPM),
emphasizing ethical approaches and responsibility. Through case studies of
industry giants like Amazon and Tesla, it showcases how great leaders balance
time, capital, and trust. It's an essential read for anyone seeking to multiply
their impact by maximizing what they can achieve with the support of others.
Wednesday, September 4, 2024
Treacherous Technology: Generative AI and Quantum Computers - A Discussion of the risks posed by innovative technologies
I recently authored a book, Treacherous Technology, which discusses the risks posed by innovations such as generative AI and quantum computers. It examines threats to data security, data privacy, trust, and intellectual property, and includes a look at the history of major hacks and the disruption caused by data breaches. Available for purchase on Amazon at the link provided above.