Ghost in the data
That Tuesday Morning When I Finally Fixed Our Ten-Minute Queries

The Ten-Minute Query
I’m sitting at my laptop on a Tuesday morning, waiting. The progress bar on my screen says ‘Query running… 4 minutes, 37 seconds.’ I lean back in my chair and let out a long sigh that probably says more than I intended. My manager walks past my desk. She glances at my screen, and I can see that look, the one that says she already knows what I’m about to tell her. I don’t need to explain.

  • AWS Glue
  • Dimensional Modeling
  • Kimball Methodology
  • Data Quality
  • ETL
  • Write-Audit-Publish
  • Apache Iceberg
  • Step Functions
  • SCD Type 2
Friday, January 16, 2026
The 2026 Data Engineering Strategy Nobody's Writing (But Everyone Needs)

What if I told you the biggest threat to your data platform isn’t technology—it’s that we’ve stopped building the next generation of engineers who’ll run it? Not the latest database that promises to solve everything. Not whether you picked the right orchestrator. The real crisis is that we’ve systematically broken our talent pipeline. And in 2026, that decision is going to start costing us in ways that no amount of tooling can fix.

  • Strategy
  • Team Building
  • Cost Optimization
  • DuckDB
  • AI Tools
  • Career Planning
  • 2026 Trends
  • Future of Work
Thursday, January 15, 2026
The Guerrilla Guide to Data Engineering Interviews

The Scenario That Changes Everything
Picture this: You’re sitting in an interview room (or, more likely these days, staring at a Zoom window with your carefully curated bookshelf background) and the interviewer asks you about data quality. “Tell me about your experience with data quality,” they say. You have two choices. Choice A: “Data quality is really important in data engineering. It involves ensuring data is accurate, complete, consistent, and timely. I believe strongly in implementing data quality checks throughout the pipeline.”

  • Interviews
  • Career Growth
  • Technical Assessment
  • SQL
  • Data Modeling
  • Problem Solving
  • Delta Lake
  • dbt
  • Data Quality
Sunday, January 11, 2026
Why Your Ideas Die in Planning Meetings

The silence that kills good ideas
One morning, I sat in yet another meeting. We had just spent two weeks backfilling a table, only to find that the data was riddled with issues. Even if we resolved them, it would take another two weeks to backfill the table again. There had to be a better way. “So, what do we think? Give me your best ideas for tackling this.”

  • Team Culture
  • Collaboration
  • Psychological Safety
  • Innovation
  • Change Management
  • Technical Leadership
  • Data Teams
Wednesday, January 7, 2026
Building Your First AWS Data Pipeline: A Guide for Data Professionals Who've Never Touched Cloud Infrastructure

The spreadsheet that changed everything
Here’s a story that might sound familiar. You’re pulling data from an API: maybe daily sales numbers, maybe customer interactions, maybe something else entirely. Every morning, you open your laptop, run a Python script, save the CSV somewhere, and get on with your actual work. It takes maybe five minutes, but it’s five minutes you can’t forget about. Miss a day and you’ve got a gap in your data. Go on vacation? Better hope someone remembers to run your script.

  • AWS
  • Data Pipelines
  • Lambda
  • S3
  • Athena
  • Cloud Computing
  • Data Ingestion
Wednesday, November 26, 2025
When Your Data Quality Fails at 9 PM on a Friday

When everything goes wrong at once
It’s 9 PM on a Friday. You’re halfway through your second beer, finally relaxing after a brutal week. Your phone buzzes. Then it buzzes again. And again. The support team’s in full panic mode, your manager’s calling, and somewhere in Melbourne, two very angry guests are standing outside the same Airbnb property, both holding confirmation emails that say the place is theirs for the weekend.

  • Data Quality
  • SQL
  • Database Design
  • Data Validation
  • Testing
  • Data Engineering
  • Production Issues
Saturday, November 22, 2025
Balancing Data Accessibility and Privacy in Financial Services

The Data Tightrope: Where Accessibility Meets Privacy
Let’s face it: in today’s data landscape, data is simultaneously your most valuable asset and your biggest potential liability. The challenge is finding that sweet spot where data remains accessible enough to drive business decisions while being locked down enough to satisfy privacy regulations. It’s not just about ticking compliance boxes; it’s about maintaining customer trust while still extracting every bit of analytical value from your data assets.

  • DataPrivacy
  • Anonymization
  • RetentionPolicies
  • BankingData
  • DataMinimization
  • GDPR
  • DataGovernance
Friday, November 21, 2025
Why Dimensional Modeling Isn't Dead—It's Just Getting Started

The Great Data Modeling Debate Nobody Asked For
Another meeting where someone confidently declared, “We don’t need data modeling anymore; just dump everything in the data lake and let analysts figure it out.” I’ve heard variations of this statement for years now, in meetings and at conferences. The pitch is always the same: traditional data warehousing is dead, dimensional modeling is a relic from the 90s, and modern big data tools have made structured modeling obsolete. Schema-on-read is the future. Agility over architecture.

  • DimensionalModeling
  • DataWarehouse
  • DataModeling
  • DataQuality
  • Analytics
  • Kimball
  • BigData
Friday, November 7, 2025
Financial Independence: Your Shield Against Job Loss Fear

The Fear That Follows You Home
One evening, after pushing another commit past midnight, I couldn’t bring myself to get up. Not because I was tired, though I was. Not because the commit had issues; it went smoothly and all the tests passed. I couldn’t get up because I’d spent the entire day with a knot in my stomach, wondering if our team would survive the next round of “organizational restructuring.” Here’s what made it worse: I had no idea if my fear was rational. Were we really at risk? Or was I just catastrophizing? The uncertainty was eating me alive.

  • financial independence
  • job security
  • emergency fund
  • career development
  • mental health
  • workplace stress
  • budgeting
  • redundancy
Sunday, November 2, 2025
Building AI Agents with Claude Code

Introduction
Imagine you’re reviewing a pull request with dozens of SQL files, each containing complex queries for your data pipeline. You spot inconsistent formatting, or syntax that doesn’t work with your infrastructure. Sound familiar? It’s common for data professionals to struggle with maintaining consistent SQL standards across their projects, especially when working with specialized platforms, and checking these details during peer review can be time-consuming. Review time is better spent on the hard thinking, such as the logic, yet these small syntax and style issues can be distracting. Well, at least they are for me.

  • claude-code
  • sql-agents
  • starburst
  • delta-lake
  • trino
  • sql-validation
  • dbt
  • data-engineering
  • ai-tools
  • vscode
Saturday, September 13, 2025
Continuous Integration for Data Teams: Beyond the Buzzwords

The Day Everything Broke (And How CI Could Have Saved Us)
Picture this: It’s 9 AM on a Monday, and your Slack is exploding. The executive dashboard is showing impossible numbers. Customer support is fielding complaints about incorrect billing amounts. The marketing team is questioning why their conversion metrics suddenly dropped to zero. You trace it back to a seemingly innocent change you merged Friday afternoon: a simple column rename that seemed harmless enough. But that “harmless” change cascaded through your entire data pipeline, breaking downstream models, dashboards, and automated reports.

  • ContinuousIntegration
  • DataQuality
  • dbt
  • DevOps
  • DataEngineering
  • GitHub
  • Datafold
  • DataValidation
Saturday, June 28, 2025
dbt Fusion: The Engine Upgrade That's Got Everyone Talking

When Your Favorite Tool Gets a Makeover
You know that feeling when your favorite app suddenly changes its interface? That mix of excitement and anxiety about whether the changes will actually improve your workflow or just mess with muscle memory you’ve spent years building. That’s exactly what happened when dbt Labs dropped dbt Fusion on the analytics engineering community. The reactions were… let’s call them passionate. Some folks were celebrating like they’d just discovered fire, while others were questioning whether this marked the beginning of the end for open-source dbt.

  • dbt
  • DataEngineering
  • AnalyticsEngineering
  • OpenSource
  • DataTools
  • SQL
  • DataModeling
Saturday, June 21, 2025