What Happens When You Bring Real AI Problems to Lean Thinkers?
- Eric Olsen
Lean Into AI, Part 3: Observations from a Crowdsourced Exploration
Video: https://youtu.be/QjAtTWt9AFI
We asked a simple question: Can AI actually help with the work we do every day?
Not in theory. Not in a vendor demo. With our actual problems, from our actual workplaces.
On February 11, 2026, 340 operational excellence practitioners from manufacturing, healthcare, education, finance, and research—including participants from seven countries—gathered for the third installment of the Lean Into AI series. Jim Perry, a Fisher College of Business lecturer with over 20 years of digital transformation experience, did something we hadn’t seen before in this series: he took six automation challenges submitted by participants ahead of time and worked through them live. No script. No cherry-picked examples.
What followed was a 60-minute working session that taught us as much about what AI can’t do well as what it can. Early satisfaction ratings came in at 90%, with participants averaging 53 minutes of engagement—nearly the entire session.
Starting with Real Work
Before the session, participants submitted their actual challenges: an automotive manufacturer trying to automate sign-in workflows, a research institute buried in Excel analysis, a recruiter syncing activities across disconnected systems, organizations wrestling with email overload, a décor manufacturer exploring 3D rendering, and a national bank fighting PowerPoint formatting.

Perry didn’t just discuss these—he had already dug into each one and came ready to demonstrate solutions. Each submission included what technology the organization had access to, what problem they faced, and what outcome they needed. That specificity mattered, because it forced honest answers about what’s actually possible.
A Useful Decision Framework
Perry walked us through a three-question sequence that we’re finding useful:
1. Can AI do this?
2. Can the available tool do this?
3. What is the best way to solve this problem?
That third question is where things got interesting. In several cases, the best solution wasn’t AI at all. Or it was AI, but not the AI tool the organization already had. Or it required combining multiple tools in ways that create real headaches for enterprise IT governance.
Take the automotive manufacturer’s challenge: automate signing in, extracting data, and uploading to SharePoint. Can AI do this? Yes. Can Microsoft Copilot Pro/Business do this? No. What’s the best way? Microsoft Power Automate—because you need authentication workflows, and putting credentials into AI prompts is something you should never do.
For those of us who’ve spent years thinking about value from the customer’s perspective, this felt familiar: understand the actual need, then find the right process to deliver it. Not the other way around.
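To make the credential point concrete, here is a minimal sketch of the separation involved, written in Python purely for illustration. The portal endpoints and helper names are hypothetical, and the actual solution in this case was Power Automate, not code; the point is that the automation layer authenticates with secrets from a managed store, and only extracted, non-sensitive data ever reaches the AI step.

```python
import os
import requests

# The automation layer owns authentication. Credentials come from a
# managed secret store (environment variables here, for illustration)
# and are never embedded in an AI prompt.
PORTAL_URL = "https://portal.example.com/api"   # hypothetical endpoint
USERNAME = os.environ["PORTAL_USERNAME"]
PASSWORD = os.environ["PORTAL_PASSWORD"]

def fetch_report() -> dict:
    """Sign in and extract the data, entirely outside the AI tool."""
    session = requests.Session()
    session.post(f"{PORTAL_URL}/login",
                 data={"user": USERNAME, "password": PASSWORD})
    return session.get(f"{PORTAL_URL}/reports/latest").json()

def build_ai_prompt(data: dict) -> str:
    """Only the extracted, non-sensitive data reaches the AI step."""
    return f"Summarize this production report for the daily huddle:\n{data}"

if __name__ == "__main__":
    prompt = build_ai_prompt(fetch_report())
    # Hand `prompt` to whatever approved AI service you have;
    # the credentials above never appear in it.
    print(prompt)
```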
When Less Instruction Works Better
Steve Pereira, who facilitates the series, offered an analogy that stuck with many of us: Amelia Bedelia, the children’s book character who takes every instruction literally. The comparison points to something broader about how AI interprets what we tell it—it processes language differently than a human colleague would.
Perry showed what this looks like in practice. When he gave an AI agent overly specific spatial instructions to interact with a button on screen, the AI failed repeatedly—it interpreted the directional language literally rather than understanding the intent. The fix was simply to say “Press the button.” The AI could already see the entire page and identify the element; it didn’t need the kind of step-by-step guidance we’d give a person unfamiliar with the interface.
We’re learning that sometimes more detail in a prompt isn’t better—sometimes it gets in the way. This seems worth paying attention to as we develop our collective AI skills, and it’s an opportunity to think beyond just the tool itself: how does the way we communicate with AI need to differ from how we communicate with each other?
Navigating the Tool Landscape
One pattern that emerged across all six demonstrations: solving real problems often means working across multiple tools. Microsoft Power Automate for credential-based workflows. Copilot Agents for project knowledge management. Google Opal for visual content. Gamma.app for presentation design. Claude for advanced reasoning. NotebookLM for research synthesis.
This creates a tension many of us recognize. Enterprises want single-vendor solutions—for good reasons like security, compliance, and support. But in our experience so far, effective AI use often requires picking the right tool for each specific problem, which means working across vendor boundaries. It’s an interesting echo of the adjacent-communities thinking happening elsewhere in the Future of People at Work initiative: different specialized communities that could benefit from collaboration.
We’ve long known in continuous improvement that tools matter less than understanding the process. What seems to be changing is the pace—the AI tool landscape is evolving faster than most organizational governance structures can keep up with. In the Q&A appendix below, Perry offers a practical framework for managing this: a tiered “AI Service Catalog” that moves organizations from saying “No” to asking “How.”
Prompting as a Practitioner Skill
Perry shared a nuanced observation about prompting strategy: for learning and exploration, shorter and more open-ended prompts tend to work better because they give AI room to work; for specific deliverables, detailed and constrained prompts with precise specifications produce better results.
His framing resonated with many participants: “Think of prompts as storytelling—imagine you met Einstein on the corner with all this intellect but who knows nothing about you.” Generally, AI doesn’t retain context between sessions, so each prompt needs to carry the relevant information—though it’s worth noting that some platforms are developing memory features that vary by tool and configuration.
For those of us familiar with the tension between standardized work and experimentation, this feels like a related challenge—knowing when to constrain and when to leave room for discovery.
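To illustrate the contrast with a hedged example of our own (not one of Perry’s six): an exploration prompt might be as simple as “What patterns do you see in this quarter’s defect data?”, while a deliverable prompt for the same data might read “Produce a five-column table of defect type, count, trend versus last quarter, likely cause, and recommended countermeasure; flag any category with fewer than ten data points as low-confidence.” The first leaves room for discovery; the second is standardized work in miniature.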
Trust and Verification
Participants raised important concerns about AI hallucinations and the time required to verify outputs. Perry recommended clear acceptance criteria in prompts, RAG (Retrieval-Augmented Generation) architectures that cross-check against known sources, and asking AI to explain its methodology. His detailed responses to these questions are included in the appendix.
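To make the grounding idea concrete, here is a toy sketch of the core RAG move: retrieve relevant passages from a trusted corpus, then instruct the model to answer only from them, with acceptance criteria stated right in the prompt. This is our illustration, not Perry’s implementation; the three “sources” are invented, and the word-overlap retriever stands in for the embeddings and vector search a production system would use.

```python
# Toy RAG sketch: retrieval by word overlap, then a grounded prompt
# with explicit acceptance criteria. Illustrative only.

SOURCES = {
    "sop-12": "Changeover on line 3 requires a two-person verification step.",
    "sop-47": "All deviations must be logged in the quality system within 24 hours.",
    "memo-9": "The annual maintenance shutdown is scheduled for the first week of July.",
}

def retrieve(question: str, k: int = 2) -> list[tuple[str, str]]:
    """Rank sources by shared words with the question (stand-in for embeddings)."""
    q_words = set(question.lower().split())
    scored = sorted(SOURCES.items(),
                    key=lambda kv: len(q_words & set(kv[1].lower().split())),
                    reverse=True)
    return scored[:k]

def grounded_prompt(question: str) -> str:
    """Build a prompt that constrains the model to the retrieved sources."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(question))
    return (
        "Answer using ONLY the sources below. Cite the source id for every claim. "
        "If the sources do not contain the answer, say 'not found in sources'.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

if __name__ == "__main__":
    print(grounded_prompt("How quickly must deviations be logged?"))
```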
One participant raised what might have been the most important question of the session: When should we layer AI onto existing processes versus redesigning the process entirely? That’s a question any lean thinker would recognize. Perry acknowledged it directly—sometimes the right answer is to fix the underlying process, not to automate around it.
Connections to What We Already Know
What’s emerging from these sessions is a set of parallels between lean principles and thoughtful AI adoption that we’re finding useful:
Understanding the process now includes understanding which AI tools exist and what they’re actually capable of—not what vendor materials claim.
Defining value includes honest assessment of when traditional tools remain the better choice.
Respecting people means not forcing practitioners into approved-but-inferior tools when better options exist.
Going to gemba means actually trying these tools with real work, not just reading case studies or watching demos.
The session also surfaced questions we’re still wrestling with as a community: How do organizations balance security requirements with tool effectiveness? When does process redesign make more sense than AI augmentation? How do we build organizational capability to evaluate dozens of AI tools? These are organizational design questions, change management questions, people and culture questions—the kinds of questions we’ve been working on for decades, applied to a new domain.
What Participants Said
Early survey results reflect how the session landed with practitioners:
“I really like the approach and the topic.”
“Loved the real world examples!”
“I am new to using AI and this was very helpful.”
“Seeing the examples being worked live on the webinar was wonderful.”
“I recently started delving into AI, aside from specific topic chats, so curious about the real-life examples. Found it interesting.”
Continuing the Exploration
The Lean Into AI series continues this spring with additional sessions. The monthly Future of People at Work community gatherings provide ongoing space for working through these challenges together. Get involved at fpwork.org.
What this session modeled—and what seems to be working across the series—is a straightforward approach: bring real problems, try real solutions, share what you find. What works, what doesn’t, and why. Practitioners helping practitioners figure this out together.
If you’re navigating AI adoption in your own improvement work, we’d welcome your questions and experiences. That’s how we all get better at this.
Appendix: Q&A Responses from Jim Perry
The following questions were raised during the session. Jim Perry provided these detailed responses after the webinar.
Q: Which AI tools provide better privacy controls for company requirements?
A: Privacy is a moving target because its definition shifts based on your specific industry and data risk. Most major players like Microsoft, Google, OpenAI, and Anthropic now offer “Enterprise” tiers that legally guarantee your data is not used to train their public models. While Microsoft Copilot and Google Gemini provide the strongest “permission-trimmed” grounding by inheriting your existing security labels in SharePoint or Drive, the true path to total privacy is hosting your own LLM in-house. For most companies, the choice depends on where your data lives, but for highly sensitive IP or strict regulatory compliance (like HIPAA or defense), private/self-hosted models are the best practice. By running models locally or in a private cloud, your data never leaves your environment, which eliminates third-party risk and avoids the “Shadow AI” trap by providing a secure, high-stakes solution that aligns with internal governance.
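For readers wondering what “running models locally” looks like in practice, here is a minimal sketch using Ollama, one popular self-hosting option. This is our illustration, not something demonstrated in the session; it assumes Ollama is installed and a model such as llama3 has already been pulled.

```python
import requests

# A prompt sent to a locally hosted model never leaves this machine:
# by default, Ollama serves an HTTP API on localhost only.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",  # any model you have pulled locally
        "prompt": "Summarize our data-handling policy in three bullet points.",
        "stream": False,    # return a single JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["response"])
```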
Q: What are RAG architecture implementation best practices across different use cases?
A: This is a broad question, but it starts with understanding that RAG architecture varies based on volume, security, and accuracy needs. For simple, business-friendly solutions, tools like NotebookLM allow you to ground the AI strictly in up to 300 uploaded files for reference. At larger corporate scale, building a RAG pipeline requires the same due diligence as any other application, necessitating early alignment with security, governance, and risk partners. Start small to monitor for hallucinations and ensure the AI is interpreting questions correctly, but remember that “Content is King”: the RAG documents themselves must contain the right information for the answers to be accurate. Proactively monitoring these outputs allows you to refine the architecture and maintain a defensible risk profile as you scale.
Q: How do organizations balance security requirements with tool effectiveness when best-of-breed tools exist outside approved vendor lists?
A: Stop hiding from “Shadow AI.” It is happening because your approved tools aren’t cutting it. Move from a “No” organization to a “How” organization by adopting a Portfolio Strategy. Create an AI Service Catalog instead of a one-size-fits-all mandate—treat AI like Excel or Teams. Proactively vet tools and make them ready for use.
Tier 1: Universal Access. Provide a baseline tool like Gemini, ChatGPT, or Claude for general productivity and data security.
Tier 2: Role-Specific (The 80/20 Rule). Identify the 20% of specialized tools that drive 80% of the value for specific teams; find the best solutions for each and run them through proper procurement channels.
Tier 3: The Sandbox. Create a clear path for experimental tools, so they go through risk, governance, and security reviews before they hit the wild.
The bottom line: vet a best-of-breed stack to reduce your risk profile so people won’t feel the need to bypass IT. You get visibility and they get the effectiveness they need.
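To make the catalog idea concrete, here is a minimal sketch of what a tiered catalog could look like as data, with the lookup a practitioner might run. This is our illustration under assumed tier contents; the tool entries and the helper are hypothetical, not an implementation Perry presented.

```python
# Hypothetical AI Service Catalog: tiers and entries are illustrative only.
AI_SERVICE_CATALOG = {
    "tier_1_universal": {
        "audience": "all employees",
        "tools": ["Gemini", "ChatGPT", "Claude"],
        "status": "vetted: security, privacy, and procurement complete",
    },
    "tier_2_role_specific": {
        "audience": "named teams (the 20% of tools driving 80% of value)",
        "tools": ["Gamma.app", "NotebookLM"],
        "status": "vetted per tool through procurement",
    },
    "tier_3_sandbox": {
        "audience": "time-boxed experiments with isolated data",
        "tools": [],  # populated as requests clear risk and governance review
        "status": "review pending",
    },
}

def lookup(tool: str) -> str:
    """Answer the question practitioners actually ask: may I use this?"""
    for tier, entry in AI_SERVICE_CATALOG.items():
        if any(tool.lower() == t.lower() for t in entry["tools"]):
            return f"{tool}: approved under {tier} for {entry['audience']}"
    return f"{tool}: not in catalog; submit a Tier 3 sandbox request"

print(lookup("NotebookLM"))
print(lookup("BrandNewTool"))
```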
Q: What advantages exist when accessing NotebookLM sources through the Gemini app versus working directly within the NotebookLM interface?
A: Two key differences. First, standalone NotebookLM has only one chat window, which grows until you clear it. Using notebooks inside Gemini allows your conversations to be saved in your Gemini Chat History, making it easier to manage long-term projects and revisit complex analyses. Second, Gemini Gems (think of them as mini-agents) limit the number of documents you can upload into each Gem. You can use a NotebookLM notebook to increase this footprint to 300 documents.
Knowledge Map:
Process Keywords:
AI tool selection framework • Prompt engineering strategy • Value stream analysis before automation • Crowdsourced problem solving • Live demonstration methodology • Cross-sector practitioner collaboration • Tool ecosystem navigation • Security governance balance • RAG architecture implementation • AI Service Catalog design
Context Keywords:
Operational excellence practitioners • Enterprise AI adoption • Microsoft Copilot limitations • Claude Opus 4.5 capabilities • Gamma.app alternatives • Google Opal visualization • Authentication workflow security • Knowledge management systems • Best-of-breed vs single-vendor • Technology governance structures • Shadow AI management
Application Triggers:
When evaluating AI tools for specific organizational challenges • When experiencing AI hallucination concerns requiring verification protocols • When enterprise IT governance conflicts with practical tool effectiveness • When traditional automation methods may outperform AI solutions • When building cross-functional AI implementation teams • When designing prompt engineering training programs • When balancing innovation with security compliance • When developing an organizational AI Service Catalog
Related Continuous Improvement Themes:
Value definition from customer perspective • Respect for people in technology decisions • Go to gemba (actual conditions) with AI tools • Standard work vs experimentation balance • Waste elimination through proper tool selection • Process understanding before automation • Systems thinking in organizational design • Collaborative problem solving across boundaries • Honest assessment of what works • Adjacent communities collaboration
———
This blog post was developed through collaboration between webinar presenters and synthesized with Claude.AI assistance. Editorial contributions by Rachel Reuter and Jim Perry. It represents ongoing work by the Future of People at Work initiative, a collaboration of Catalysis, Central Coast Lean, GBMP Consulting Group, Imagining Excellence, Lean Enterprise Institute, Shingo Institute, The Ohio State University Center for Operational Excellence, Toyota Production System Support Center (TSSC), and University of Kentucky Pigman College of Engineering.
For more information about the Lean Into AI series and Future of People at Work community gatherings, visit fpwork.org.
