
Continuous Integration and Deployment Are Data Engineering’s Cruise Control

Julie Heckman, Principal Data Engineer

Curious about Continuous Integration and Continuous Deployment in Data Engineering?

As the names imply, CI/CD is a nonstop method for taking new code through development, testing and production environments in order to advance the good and press pause on the bad.

Think of CI/CD as the quickest route between two points, like breezing through the express lane instead of stopping at every toll booth. Although CI/CD has been prevalent in the development of web apps, the data world has been slower to adopt these practices. Fortunately, vendors of data pipeline tools, and specifically Informatica with its Intelligent Cloud Services (IICS), understand the increasing importance of CI/CD and are building these capabilities into their product suites. By applying them, engineers can ship development changes faster, delivering analytics and meeting business intelligence needs sooner.

Traditionally, the road from a tech project’s ideation stage to production has been slow and full of missteps, detours and late deliveries. In a word, it’s painful: data engineers often work in the middle of the night to road-test upgrades without putting what’s already in place at risk. Compounding matters, there’s an ever-present disconnect about timelines, priorities and quality, typically between the clients who’ve requested the work and the engineers who make it happen. By implementing CI/CD, both sides stand to gain and advance their causes. This is especially true when transactional data must be retrieved to serve a qualitative or strategic purpose.

CI/CD resembles a highway on-ramp, where you’re carefully accelerating to merge into traffic. Along the way, you’re testing, testing, testing: increasing your speed, checking your blind spot, moving over carefully and hoping to join without incident.

Similarly, in the development world, integration is plagued by stops, starts, conflicts and bottlenecks. But companies like Informatica are taking notice and creating tools that integrate with code repositories such as GitHub, for sharing and visualizing code, and with Jenkins, for process automation, so that engineers can build fluid CI/CD pipelines where many of these roadblocks take a backseat to progress.

Along the way:

  • Headway is made when data engineers can make changes within test environments to sample their efficacy, and successful changes are continuously green-lighted into production
  • Production environments stay maintained and free of debris, with every change documented
  • Development time is saved when changes are made once – and automated thereafter – thereby eliminating manual adjustments in multiple environments and reducing errors
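
To make the three points above concrete, here is a minimal Python sketch of a promotion gate. It is an illustration only, not how IICS, GitHub or Jenkins implement this: the `Change` and `Environment` classes, the `promote` function and the two sample checks are hypothetical names invented for this example. A change is first applied in the test environment, its checks run against sample rows, and it moves to production only if every check passes, leaving a documented history in each environment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A check receives sample rows from the test environment and returns True on success.
Check = Callable[[List[Dict]], bool]


@dataclass
class Change:
    """A hypothetical pipeline change plus the checks it must pass."""
    name: str
    checks: List[Check]


@dataclass
class Environment:
    """A hypothetical environment that records every change applied to it."""
    name: str
    deployed: List[str] = field(default_factory=list)


def has_rows(rows: List[Dict]) -> bool:
    # The pipeline should produce at least one row.
    return len(rows) > 0


def ids_present(rows: List[Dict]) -> bool:
    # Every row needs a customer_id (an assumed column name).
    return all(row.get("customer_id") is not None for row in rows)


def promote(change: Change, sample_rows: List[Dict],
            test_env: Environment, prod_env: Environment) -> bool:
    """Apply the change in test, then promote to production only if all checks pass."""
    test_env.deployed.append(change.name)       # change is made once, in test
    if all(check(sample_rows) for check in change.checks):
        prod_env.deployed.append(change.name)   # green-lighted automatically
        return True
    return False                                # production stays untouched
```

In a real setup, the call to `promote` would be triggered by an automation tool whenever a change is merged, so that nobody has to repeat the same adjustment by hand in each environment.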

By using CI/CD tools, data engineers can activate or replicate code changes without redundancy. Better yet, they can serve their internal clients more effectively, since those clients can themselves iterate in a test environment and quickly decide whether to advance an idea or course-correct instead.

Right Triangle has developed a custom solution for Informatica’s IICS that does exactly this! Our solution will allow you to step into the world of CI/CD without having to start from scratch and enable you to quickly see the results of implementing the approach.