Take a Byte Out of Your ETL Processes

Steven Johnson

At Shipyard, we're constantly engaging with our customers, learning about their needs and the tools they use to streamline their data work. Time and again, we've received requests for open source low-code blueprints specifically designed to integrate with these essential tools, making their data operations even more efficient. One of the most highly requested blueprints has been for Airbyte, a popular choice among organizations for their ETL tasks.

We've wanted to offer our users an Airbyte blueprint for quite some time, knowing the value it would bring to their day-to-day operations. However, we were waiting for one essential component, an API, to become available so we could build a seamless integration. Our dedicated research team has been monitoring the situation closely, and they have exciting news to share: the Airbyte API is now live and ready for action! That can only mean one thing...

Introducing our Airbyte Blueprints

As of today, Shipyard is rolling out two fresh low-code blueprints to help you swiftly work with Airbyte. This dynamic duo of Shipyard and Airbyte ensures that your team can keep loading data quickly into your databases using Airbyte, while also linking these syncs effortlessly to the rest of your data operations tasks.

By providing an Airbyte API Key and a specific Connection/Job ID, you can mix and match blueprints to trigger Airbyte syncs and check the status of those syncs.

Check out a demo of this workflow!

Execute Additional Data Processes Pre and Post Airbyte

In the current Airbyte environment, your data loading processes exist separately from your other data operations, even though they are typically your primary source of data. You have a general idea of when syncs start and finish each day, but there is no guarantee that subsequent processes will only run after the data has been loaded. Consequently, you find yourself managing business-critical data tasks through unreliable, schedule-based workflows, leading to extended runtimes and potentially inaccurate data.

By leveraging the "Trigger Sync" and "Check Sync Status" blueprints within Shipyard, you can create a streamlined workflow that triggers Airbyte syncs, waits for successful data loading, and then promptly initiates downstream operations such as dbt, Dataform, BI data refreshes, and more. This can be tailored per data source, allowing processes linked to Stripe, for instance, to run immediately after Stripe data is loaded, while comprehensive data processes only execute once all connectors have completed syncing successfully.
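To make the trigger, wait, then run pattern concrete, here is a minimal Python sketch of the control flow. It is not how the Shipyard blueprints are implemented. The function `run_after_sync`, its parameters, and the status names (`succeeded`, `failed`, `cancelled`) are assumptions made for illustration; the three callables stand in for whatever actually triggers the sync, checks its status, and starts your downstream work.

```python
import time
from typing import Callable

# Assumed status names for illustration; check the values your Airbyte API returns.
SUCCESS = "succeeded"
FAILURE_STATES = {"failed", "cancelled"}


def run_after_sync(
    trigger_sync: Callable[[], str],
    get_status: Callable[[str], str],
    downstream: Callable[[], None],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> str:
    """Trigger a sync, poll until it finishes, and run downstream work only on success."""
    job_id = trigger_sync()  # hypothetical: returns the ID of the job that was started
    for _ in range(max_polls):
        status = get_status(job_id)
        if status == SUCCESS:
            downstream()  # e.g. a dbt run or a BI extract refresh
            return status
        if status in FAILURE_STATES:
            return status  # downstream work is skipped when the sync fails
        time.sleep(poll_seconds)
    return "timed_out"  # the sync never reached a final state within max_polls
```

Because the downstream step only runs on the success branch, a failed or stalled sync can never feed stale data into later jobs. Calling the function once per data source gives you the per-connector behavior described above.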

Moreover, Shipyard automatically prevents downstream jobs from starting if your Airbyte data syncs fail. This lets you address the issue directly within Airbyte or implement your own custom alerting and resolution process. Imagine how much more efficient it would be if the appropriate clients or teams could automatically receive updates when their specific data is delayed.

Integrate All Your Data Tools with Airbyte

Shipyard empowers you to swiftly link the execution of Airbyte connectors with any script you compose in Python, Bash, or Node.js. Additionally, you can easily integrate it with other typical processes that require interaction with external databases (Snowflake, Redshift, BigQuery, etc.) or data storage solutions (AWS S3, Google Cloud Storage, etc.) through our low-code blueprint Library.

Shipyard automates and connects not just Airbyte, but any other data tool you might be using. With the Shipyard platform at your disposal, your team gains increased flexibility in designing a modular pipeline, where each stage shares data with the next, rather than depending on fragile, schedule-based pipelines among isolated systems.

Here are a few examples demonstrating how you can seamlessly integrate Airbyte with other services, creating a comprehensive solution for your Data Team:

  • Initiate dbt Cloud or dbt Core tasks following a successful Airbyte data import
  • Update Tableau extracts upon successful completion of Airbyte data loading
  • Execute a Python script to conceal PII details right after Airbyte uploads the data
  • Dispatch Slack notifications to relevant teams when Airbyte synchronizations encounter issues
  • Transfer raw data to a stakeholder's SFTP as soon as it becomes available
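As one illustration of the PII example in the list above, a script that runs right after a sync might look something like the following Python sketch. The field names (`email`, `phone`), the salt handling, and the helper functions `mask_value` and `mask_record` are all hypothetical; a real script would read rows from your warehouse and follow your own privacy requirements.

```python
import hashlib

# Hypothetical column names that are treated as PII in this sketch.
PII_FIELDS = {"email", "phone"}


def mask_value(value: str, salt: str) -> str:
    """Replace a value with a short, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


def mask_record(record: dict, salt: str, pii_fields=PII_FIELDS) -> dict:
    """Return a copy of the record with PII fields masked and all other fields unchanged."""
    return {
        key: mask_value(str(val), salt) if key in pii_fields and val is not None else val
        for key, val in record.items()
    }
```

The same input and salt always produce the same masked value, so masked columns can still be used to join tables without exposing the original data.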

Get Started Today

Airbyte blueprints are now accessible to all users and can be used with any account. For additional information, consult our documentation.

Shipyard is streamlining the automation of your complete data stack like never before. We're excited to witness how users will leverage these blueprints to integrate Airbyte with their data stack.

Register for our complimentary Developer Plan and begin automating your Airbyte synchronizations.