AI-Augmented Data Engineering
Real projects where AI delivered measurable business impact
🎯 Featured Project: Tennessee CCWIS Automation
Client: Tennessee Department of Children's Services (via Deloitte)
Duration: 5 months (2024)
Result: 70% reduction in manual data mapping effort
The Challenge
Tennessee was migrating its Comprehensive Child Welfare Information System (CCWIS) from a legacy platform to Salesforce. The data warehouse team needed to:
- Parse hundreds of Salesforce metadata files (XML/JSON)
- Interpret 100+ business requirement documents
- Create technical mapping specifications for each report
- Classify tables as facts vs dimensions
- Design Snowflake schema changes
Manual process: Analysts spent 4-6 hours per report reading docs, cross-referencing metadata, and writing specifications in spreadsheets.
Volume: 100+ reports across Child Care, Independent Living, and OCS domains.
Timeline pressure: State government deadlines, multi-million dollar project.
The Opportunity
I noticed the team doing repetitive pattern-matching work:
- Reading Salesforce field definitions → Matching to report columns
- Looking up data types → Classifying as fact or dimension
- Cross-referencing business rules → Writing mapping logic
Insight: This is exactly what LLMs are good at: reading structured documents, finding patterns, and generating structured output.
The Solution: Python + ChatGPT Automation
Architecture
Built a Python pipeline that:
- Ingested Salesforce metadata - Parsed XML/JSON files into structured data
- Extracted report requirements - Read business requirement docs, pulled relevant sections
- Generated AI prompts - Created context-rich prompts with metadata + requirements
- Called ChatGPT API - Sent prompts, received structured mappings as JSON
- Validated output - Checked for completeness, flagged ambiguities
- Generated Excel specs - Formatted into analyst-friendly spreadsheets
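The six pipeline steps can be sketched as a minimal orchestration skeleton. Everything below is illustrative, not the production code: the function names and field names are assumptions, and the ChatGPT call is stubbed with a canned reply so the sketch runs standalone.

```python
import json

def ingest_metadata(raw_fields):
    """Step 1: normalize parsed Salesforce field metadata."""
    return [{"name": f["fullName"], "type": f["type"]} for f in raw_fields]

def build_prompt(report_name, purpose, fields):
    """Step 3: assemble a context-rich prompt for the LLM."""
    field_lines = "\n".join(f"- {f['name']} ({f['type']})" for f in fields)
    return (
        "You are a data warehouse architect analyzing Salesforce metadata.\n"
        f"Report Name: {report_name}\nBusiness Purpose: {purpose}\n"
        f"METADATA:\n{field_lines}\nOUTPUT FORMAT: JSON"
    )

def call_llm(prompt):
    """Step 4: stubbed API call; a canned mapping stands in for GPT-4."""
    return json.dumps({"fact_table": "FACT_CASE", "dimensions": ["DIM_DATE"]})

def validate(mapping_json):
    """Step 5: check completeness before handing the draft to analysts."""
    mapping = json.loads(mapping_json)
    missing = [k for k in ("fact_table", "dimensions") if k not in mapping]
    return mapping, missing

fields = ingest_metadata([{"fullName": "Case_Number__c", "type": "Text"}])
prompt = build_prompt("Open Cases by Region", "Track caseload", fields)
mapping, missing = validate(call_llm(prompt))
```

The validated mapping then feeds step 6, the Excel spec generation (Pandas in the real pipeline).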
Prompting Strategy
The key was giving ChatGPT the right context:
Prompt Structure (High-Level):
You are a data warehouse architect analyzing Salesforce metadata.
CONTEXT:
- Report Name: [Report Title]
- Business Purpose: [From requirements doc]
- Salesforce Objects: [List of tables involved]
METADATA:
[Paste relevant Salesforce field definitions]
TASK:
Generate technical mapping specification:
1. Identify primary fact table
2. List dimension tables
3. Map each report column to Snowflake table.column
4. Classify field types (measure vs attribute)
5. Note any ambiguities or data quality concerns
OUTPUT FORMAT: JSON
What made it work:
- Structured output: Required JSON format for easy parsing
- Domain context: Explained DW concepts (facts, dimensions, SCD)
- Examples: Included 3-5 completed mappings as few-shot examples in the prompt
- Validation rules: Asked AI to flag uncertainties rather than guess
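Assembled as chat messages, those four ingredients (structured output, domain context, few-shot examples, validation rules) might look like the sketch below. It assumes the OpenAI Python SDK's message format; the names and example content are illustrative, and the actual API call is shown only as a comment so the sketch runs without a key.

```python
import json

# Few-shot examples: 3-5 completed mappings were pasted here in practice
# (abbreviated to one pair for illustration).
FEW_SHOT = [
    {"role": "user", "content": "Field: Case_Status__c (Picklist)"},
    {"role": "assistant", "content": json.dumps(
        {"column": "DIM_CASE.STATUS", "type": "attribute", "flags": []})},
]

def build_messages(report, purpose, metadata_block):
    """Combine domain context, few-shot examples, and validation rules."""
    system = (
        "You are a data warehouse architect. Facts hold measures; "
        "dimensions hold attributes (SCD type 2 where history matters). "
        "If a mapping is uncertain, list it under 'flags' instead of "
        "guessing. Respond with JSON only."
    )
    task = (
        f"Report Name: {report}\nBusiness Purpose: {purpose}\n"
        f"METADATA:\n{metadata_block}\nOUTPUT FORMAT: JSON"
    )
    return [{"role": "system", "content": system}, *FEW_SHOT,
            {"role": "user", "content": task}]

messages = build_messages("Open Cases by Region", "Track caseload",
                          "- Case_Number__c (Text)")

# With a key configured, the call itself would be roughly:
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(model="gpt-4", messages=messages)
```

Putting the "flag, don't guess" rule in the system message is what made the validation step downstream possible.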
Implementation Details
Technology Stack
- Python 3.x - Core automation logic
- OpenAI API - GPT-4 for text analysis and generation
- Pandas - Data manipulation and Excel output
- XML/JSON parsing - Salesforce metadata extraction
- Git - Version control for prompts and code
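The metadata-extraction step needs only the standard library. The sketch below parses a Salesforce-style `.object` file with namespace-aware ElementTree lookups; the field names are made up, and the snippet is a simplified sketch of the format rather than a full parser.

```python
import xml.etree.ElementTree as ET

# Minimal Salesforce object-metadata snippet (hypothetical fields).
OBJECT_XML = """<?xml version="1.0" encoding="UTF-8"?>
<CustomObject xmlns="http://soap.sforce.com/2006/04/metadata">
  <fields><fullName>Case_Number__c</fullName><type>Text</type></fields>
  <fields><fullName>Open_Date__c</fullName><type>Date</type></fields>
</CustomObject>"""

# Salesforce metadata files declare a default namespace, so every
# lookup must be namespace-qualified.
NS = {"sf": "http://soap.sforce.com/2006/04/metadata"}

def extract_fields(xml_text):
    """Pull (fullName, type) pairs out of an object-metadata file."""
    root = ET.fromstring(xml_text)
    return [
        (f.find("sf:fullName", NS).text, f.find("sf:type", NS).text)
        for f in root.findall("sf:fields", NS)
    ]

fields = extract_fields(OBJECT_XML)
```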
Workflow Integration
Human-in-the-loop design: the AI generated roughly 80% of each mapping; analysts reviewed and refined the rest.
- Python script generated draft specifications
- Analyst reviewed for accuracy (15-30 minutes vs 4-6 hours)
- Analyst added business context AI couldn't infer
- Final spec approved by lead architect
This preserved data quality while dramatically reducing time investment.
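The review split above can be sketched as a validator that separates hard failures (missing sections, which block the spec) from flagged ambiguities (which go into the analyst's 15-30 minute review). The schema keys here are assumptions for illustration, not the production format.

```python
REQUIRED_KEYS = {"fact_table", "dimensions", "column_mappings"}  # assumed schema

def review_queue(ai_mapping):
    """Return (errors, needs_review) for a draft mapping.

    errors: missing required sections -- the spec cannot proceed.
    needs_review: columns the model flagged instead of guessing,
    routed to the human analyst.
    """
    errors = sorted(REQUIRED_KEYS - ai_mapping.keys())
    needs_review = [
        m["report_column"]
        for m in ai_mapping.get("column_mappings", [])
        if m.get("flags")
    ]
    return errors, needs_review

errors, review = review_queue({
    "fact_table": "FACT_CASE",
    "dimensions": ["DIM_DATE", "DIM_REGION"],
    "column_mappings": [
        {"report_column": "Case #", "target": "FACT_CASE.CASE_NUMBER",
         "flags": []},
        {"report_column": "Region", "target": None,
         "flags": ["two candidate region fields"]},
    ],
})
```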
Results & Impact
Quantitative Results
- 70% time reduction: 4-6 hours → 1-1.5 hours per report
- 100+ reports processed across 3 domains
- 300+ manual hours saved over project duration
- Consistent quality: Standardized format, fewer errors
Business Impact
- Met aggressive state government deadlines
- Freed analysts for higher-value work (schema design, optimization)
- Created reusable framework for future Salesforce migrations
- Improved team morale - eliminated tedious manual work
Technical Learning
- AI excels at pattern matching in semi-structured documents
- Good prompts = good results (spent 2 days refining prompt templates)
- Human review essential - AI catches 80-90%, humans catch edge cases
- Metadata quality matters - garbage in, garbage out
Key Takeaways
When to Use AI for Automation
✅ Good candidates:
- Repetitive document analysis tasks
- Pattern matching across structured/semi-structured data
- Tasks requiring context from multiple sources
- Work where 80% accuracy + human review = acceptable
❌ Poor candidates:
- Tasks requiring 100% accuracy with no review
- Work with sensitive data that can't be sent to APIs
- Processes that change frequently (prompt maintenance overhead)
- Simple tasks already solved with traditional code
My AI Development Approach
- Start small: Automate one report as proof-of-concept
- Measure results: Compare AI output vs manual work
- Iterate prompts: Refine based on what AI gets wrong
- Add validation: Build checks for common failure modes
- Human-in-loop: Design for review, not full automation
- Document everything: Prompts, examples, edge cases
Other AI-Augmented Projects
xBrezzo CaseBuilder (2025)
Built family law case management system using Lovable AI platform:
- Converted business requirements to working application code
- Iterative prompt refinement for UI/UX improvements
- Production system deployed in weeks vs months
Daily Development Work
I use AI tools constantly for routine tasks:
- ChatGPT: Boilerplate code, SQL optimization, debugging
- Claude: Architecture discussions, requirement analysis
- GitHub Copilot: Code completion, test generation
Estimated productivity gain: 10x on routine tasks, 3-5x on complex problems
Want to Learn More?
I'm happy to discuss AI-augmented development approaches, share lessons learned, or collaborate on similar automation opportunities.
Contact me | View full experience