Ghost in the data
That Tuesday Morning When I Finally Fixed Our Ten-Minute Queries

The Ten-Minute Query
I’m sitting at my laptop on a Tuesday morning, waiting. The progress bar on my screen says ‘Query running… 4 minutes, 37 seconds.’ I lean back in my chair and let out a long sigh that probably says more than I intended. My manager walks past my desk. She glances at my screen, and I can see that look, the one that says she already knows what I’m about to tell her. I don’t need to explain.

  • AWS Glue
  • Dimensional Modeling
  • Kimball Methodology
  • Data Quality
  • ETL
  • Write-Audit-Publish
  • Apache Iceberg
  • Step Functions
  • SCD Type 2
Friday, January 16, 2026
The 2026 Data Engineering Strategy Nobody's Writing (But Everyone Needs)

What if I told you the biggest threat to your data platform isn’t technology—it’s that we’ve stopped building the next generation of engineers who’ll run it? Not the latest database that promises to solve everything. Not whether you picked the right orchestrator. The real crisis is that we’ve systematically broken our talent pipeline. And in 2026, that decision is going to start costing us in ways that no amount of tooling can fix.

  • Strategy
  • Team Building
  • Cost Optimization
  • DuckDB
  • AI Tools
  • Career Planning
  • 2026 Trends
  • Future of Work
Thursday, January 15, 2026
The Guerrilla Guide to Data Engineering Interviews

The Scenario That Changes Everything
Picture this: You’re sitting in an interview room (or, more likely these days, staring at a Zoom window with your carefully curated bookshelf background) and the interviewer asks you about data quality. “Tell me about your experience with data quality,” they say. You have two choices. Choice A: “Data quality is really important in data engineering. It involves ensuring data is accurate, complete, consistent, and timely. I believe strongly in implementing data quality checks throughout the pipeline.”

  • Interviews
  • Career Growth
  • Technical Assessment
  • SQL
  • Data Modeling
  • Problem Solving
  • Delta Lake
  • dbt
  • Data Quality
Sunday, January 11, 2026
Why Your Ideas Die in Planning Meetings

The silence that kills good ideas
One morning, I sat in yet another meeting. We had just spent two weeks backfilling a table, only to find the data was riddled with issues. Even if we resolved them, it would take another two weeks to backfill the data again. There had to be a better way. “So, what do we think? Give me your best ideas for tackling this.”

  • Team Culture
  • Collaboration
  • Psychological Safety
  • Innovation
  • Change Management
  • Technical Leadership
  • Data Teams
Wednesday, January 7, 2026
The Science of Conversation for people who hate small talk

One morning, I watched a data engineer struggle with an AI assistant for thirty minutes, trying to debug a dbt job. The problem wasn’t the LLM’s capabilities; it was how the engineer framed the question. No context about what they’d already tried. No explanation of the expected versus actual output. Just “fix this code” followed by a massive code dump. This same engineer had similar struggles with stakeholders. Presentations that assumed too much context. Emails that buried the ask. Meetings where they answered questions nobody asked.

  • Communication
  • Soft Skills
  • Leadership
  • Career Growth
  • Team Building
  • Stakeholder Management
  • Professional Development
Sunday, January 4, 2026
When AI Stops Treating Us Like People: The Twin Crises Nobody Wants to Discuss

The moment it stops being theoretical
A friend was telling me about their recent job interviews. Not the questions they struggled with or the technical challenge they bombed; that’s normal interview anxiety. No, they were frustrated because they’d spent forty-five minutes talking to an AI chatbot that analyzed their “enthusiasm for remote work” and tracked their eye movements while asking whether they’d “misrepresent themselves 3.8x more than average candidates.” The speed with which I would exit an AI interview would rival even the most seasoned teenager’s alt-tab skills.

  • AI Bubble
  • Financial Crisis
  • Trust Crisis
  • Data Teams
  • AI Ethics
  • Career Strategy
Friday, December 12, 2025
The Invisible PR You're Building Right Now

The Email That Changed Everything
I received a meeting invite labeled “Check In”. It turned out to be about the restructure: in a few months, my role would be gone. I was still processing it, sitting with that particular brand of numbness that comes when your career gets upended. In the following weeks, my inbox started filling up. LinkedIn messages. Texts. Emails from people I’d worked with years ago.

  • Professional Relationships
  • Trust
  • Reputation
  • Leadership
  • Career Development
  • Team Culture
Monday, December 8, 2025
Building Your First AWS Data Pipeline: A Guide for Data Professionals Who've Never Touched Cloud Infrastructure

The spreadsheet that changed everything
Here’s a story that might sound familiar. You’re pulling data from an API: maybe daily sales numbers, maybe customer interactions, maybe something else entirely. Every morning, you open your laptop, run a Python script, save the CSV somewhere, and get on with your actual work. It takes maybe five minutes, but it’s five minutes you can’t forget about. Miss a day and you’ve got a gap in your data. Go on vacation? Better hope someone remembers to run your script.

  • AWS
  • Data Pipelines
  • Lambda
  • S3
  • Athena
  • Cloud Computing
  • Data Ingestion
Wednesday, November 26, 2025
The Four Stages of Data Quality: From Hidden Costs to Measurable Value

This is the fundamental problem with data quality. You know it matters. Everyone knows it matters. But until you can quantify the impact, connect it to business outcomes, and build a credible business case, it remains this abstract thing that’s important but never urgent enough to properly fund. I wrote a practical guide to data quality last week that walks through hands-on implementation—the SQL queries, the profiling techniques, the actual mechanics of finding and fixing data issues. Think of that as the “how to use the tools” guide. This article is different. This is the “why these tools matter and how to convince your organization to actually use them” guide.

  • Data Quality
  • ROI
  • Business Case
  • Data Governance
  • Strategy
  • Frameworks
Monday, November 24, 2025
When Your Data Quality Fails at 9 PM on a Friday

When everything goes wrong at once
It’s 9 PM on a Friday. You’re halfway through your second beer, finally relaxing after a brutal week. Your phone buzzes. Then it buzzes again. And again. The support team’s in full panic mode, your manager’s calling, and somewhere in Melbourne, two very angry guests are standing outside the same Airbnb property, both holding confirmation emails that say the place is theirs for the weekend.

  • Data Quality
  • SQL
  • Database Design
  • Data Validation
  • Testing
  • Data Engineering
  • Production Issues
Saturday, November 22, 2025
Balancing Data Accessibility and Privacy in Financial Services

The Data Tightrope: Where Accessibility Meets Privacy
Let’s face it: in today’s data landscape, data is simultaneously your most valuable asset and your biggest potential liability. The challenge is finding that sweet spot where data remains accessible enough to drive business decisions while being locked down enough to satisfy privacy regulations. It’s not just about ticking compliance boxes; it’s about maintaining customer trust while still extracting every bit of analytical value from your data assets.

  • DataPrivacy
  • Anonymization
  • RetentionPolicies
  • BankingData
  • DataMinimization
  • GDPR
  • DataGovernance
Friday, November 21, 2025
When Pirates Offered Better Service

The Day Music Changed Forever
On June 1, 1999, an eighteen-year-old kid in a Northeastern University dorm room launched something that would bring the music industry to its knees. Shawn Fanning called it Napster, and within two years, 80 million people were using it to download 14,000 songs every minute.[1] The technology was simple: a central server indexed which songs each user had, then let computers talk directly to each other. No complicated setup. No technical expertise required. Just type in “Metallica” and boom, there it was.

  • DataGovernance
  • UserExperience
  • ShadowIT
  • DataDemocratization
  • Leadership
  • ServiceDesign
Sunday, November 16, 2025