Ghost in the data
Continuous Integration for Data Teams: Beyond the Buzzwords

The Day Everything Broke (And How CI Could Have Saved Us)

Picture this: It’s 9 AM on a Monday, and your Slack is exploding. The executive dashboard is showing impossible numbers. Customer support is fielding complaints about incorrect billing amounts. The marketing team is questioning why their conversion metrics suddenly dropped to zero. You trace it back to a change you merged Friday afternoon: a simple column rename that seemed harmless enough. But that “harmless” change cascaded through your entire data pipeline, breaking downstream models, dashboards, and automated reports.

  • ContinuousIntegration
  • DataQuality
  • dbt
  • DevOps
  • DataEngineering
  • GitHub
  • Datafold
  • DataValidation
Saturday, June 28, 2025
Leveraging LLMs for Business Impact: Part 2 - Building an AI Data Engineer Agent

Introduction

In Part 1 of this series, we explored the theoretical foundations of Large Language Models (LLMs), Retrieval Augmented Generation (RAG), and vector databases. Now, it’s time to put theory into practice. This is going to be a long read, so grab some coffee and one (or a couple) of your favorite biscuits. One use case for LLMs is creating an agent: a Senior Data Engineer AI that automatically reviews pull requests in your data engineering projects. This agent will be that nitpicky Data Engineer who enforces SQL formatting standards, ensures naming and data type consistency, validates data quality checks, and suggests improvements based on best practices. By integrating this into your GitHub workflow, you can maintain higher code quality, accelerate onboarding for new team members, and reduce the burden of manual code reviews.

  • GitHub Actions
  • CI/CD
  • AI Agents
  • Code Review
  • Data Quality
  • DBT
  • SQL Standards
Saturday, March 8, 2025
Navigating Incident Response Management with DevOps

Introduction

Incident response management (IRM) is a critical aspect of any organization’s overall security and risk management strategy. IT incidents can occur at any time, so it’s important to have a plan in place to manage them effectively and minimize their impact on your organization. The IRM lifecycle is a structured approach to managing incidents, from identification to resolution, and it involves a range of activities, including communication, coordination, and control. In this post, I’ll explore the IRM lifecycle in detail and discuss the roles and responsibilities of different individuals during each stage. I’ll also compare traditional incident management with DevOps incident management and discuss the advantages of adopting a DevOps approach.

  • Incident Response
  • Risk Management
Sunday, February 19, 2023