Automating Github with Code Sync

Blake Burch

Most data teams today use products like Github to manage version control for their code. Github is the perfect option for ensuring that code is up to date, free of conflicts, and reviewed by the entire team. However, even when code is managed through Github, it's often hard to ensure that it is deployed correctly in production, especially when the code powers backend processes like data pipelines. What if you could always automate the code in your Github repos with minimal effort?

Automation Made Easier

Today, Shipyard is announcing Github Code Sync, our native integration with Github. This update enables teams to connect their Github repositories directly to their Shipyard organization and quickly automate any script.

[Image: Connecting your Shipyard Organization to Github]

Every time a Vessel or Fleet is triggered, it immediately clones the latest version of your code from the specific branch or tag that you select.

[Image: Selecting a repo to sync your code with]

Once your repository is cloned, Shipyard runs your code as usual, executing the script that you've specified. Whether your Vessel runs on a schedule or on demand, you'll always be using up-to-date code. Shipyard also logs the commit hash of the code that was cloned, so it’s exceptionally easy to verify which version of the code ran at any given time.
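
To make the clone-then-run step concrete, here is a minimal Python sketch of the same idea using the plain `git` command line. It is not Shipyard's implementation: the function names and the destination path are hypothetical, and it assumes `git` is installed on the machine running the script.

```python
import subprocess


def build_clone_command(repo_url, ref, dest):
    """Return a git command that shallow-clones a single branch or tag."""
    # `--branch` accepts either a branch name or a tag name.
    return ["git", "clone", "--depth", "1", "--branch", ref, repo_url, dest]


def clone_and_get_commit(repo_url, ref, dest):
    """Clone repo_url at ref into dest and return the checked-out commit hash."""
    subprocess.run(build_clone_command(repo_url, ref, dest), check=True)
    # Ask git which commit the fresh checkout points at, so it can be logged.
    result = subprocess.run(
        ["git", "-C", dest, "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Calling `clone_and_get_commit` before each run and writing the returned hash to your logs gives you the same kind of audit trail: every execution can be traced back to an exact commit.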

Improve your CI/CD

This new integration makes Shipyard an integral part of your team's CI/CD flow. Stop relying on error-prone processes of copying and pasting code into external orchestration systems or scheduled Github syncs. With Github Code Sync, you can ensure that your team is always automating the latest code at runtime. As soon as you commit any code changes to Github, Shipyard will immediately pick up those changes without any extra work on your part. It couldn't be any easier!

Enhance your Workflows

Github Code Sync becomes even more powerful when used with Shipyard's workflow orchestration – Fleets. Every Vessel in a Fleet runs in isolation while automatically sharing files with downstream Vessels. Combined with this update, you can now run multiple versions of your code in parallel without worrying about conflicts. Alternatively, you can clone your repository once and have multiple Vessels access the files, with no need to clone your code again and again.

Use Cases

With a feature as powerful as code syncing, we open up many new opportunities to simplify how data teams get their solutions to production. Here are a few examples to get the wheels turning.

Automate SQL Executions

Does your analytics team manage thousands of reporting SQL queries through Github? With Github Code Sync, you can pull down all of the latest queries at once. From there, you can use Shipyard's Blueprints to automate the execution of those SQL queries against Snowflake, BigQuery, Redshift, Postgres, MySQL, SQL Server, and more.
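
As a rough illustration of the "pull down the queries, then execute them" pattern, the Python sketch below runs every `.sql` file in a cloned directory against a database connection. It uses the standard-library `sqlite3` module as a stand-in for a real warehouse, and `run_sql_files` is a hypothetical helper, not a Shipyard Blueprint.

```python
import sqlite3
from pathlib import Path


def run_sql_files(query_dir, connection):
    """Execute every .sql file under query_dir in sorted path order.

    Returns the names of the files that were executed, in order.
    """
    executed = []
    for path in sorted(Path(query_dir).rglob("*.sql")):
        # executescript allows a file to contain several statements.
        connection.executescript(path.read_text(encoding="utf-8"))
        executed.append(path.name)
    connection.commit()
    return executed
```

Sorting the paths keeps the execution order predictable, so numbering files (for example `01_create.sql`, `02_insert.sql`) is one simple way to control which queries run first.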

dbt Cloud Executions

Struggling to keep your dbt runs up to date? Keep all of your dbt-related code updated in a centralized Github repository and execute dbt in the cloud using Shipyard, pointed at the main branch of your dbt repository.

This setup lets your team connect dbt jobs to the larger workflows it needs to run.

Experiment with ML Algorithm Variations

Need to test out how changes to a machine learning algorithm affect the final output? With Github Code Sync, you can easily manage multiple branches with different variations and have all of them execute simultaneously on Shipyard using the same dataset. This setup allows you to make fewer queries against your database while experimenting with changes more rapidly.
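
The same idea can be sketched locally in Python: load the dataset once, then run each variation against it concurrently and compare the outputs. This is only an analogy for what happens across branches on Shipyard; `run_variants` and the variant functions passed to it are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_variants(dataset, variants):
    """Run each named variant function on the same dataset concurrently.

    `variants` maps a label (for example a branch name) to a function that
    takes the dataset and returns a result. Returns a dict of label -> result.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(variants))) as pool:
        # Submit every variant first so they can all run at the same time.
        futures = {name: pool.submit(fn, dataset) for name, fn in variants.items()}
        return {name: future.result() for name, future in futures.items()}
```

Because the dataset is loaded once and passed to every variant, the database is queried a single time no matter how many variations you compare.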

Run Test and Production Code in a Single Environment

Are you tired of managing multiple testing environments? With Shipyard, you no longer need to manage different infrastructure and setups for QA, Staging, and Production. Instead, you can duplicate each of your Vessels, give the copies the same schedule, and sync each one with the appropriate branch or tag.

With this setup, you can execute different versions of your code simultaneously on the same infrastructure. Use Shipyard to know precisely how performance will be affected before merging your code into the production-ready branch.

Get Started Today

Github Code Sync is now available to all subscribers and can be tested with any account. Sign up for our free Developer Plan to get started automating your code. You can also follow our guide for Automating your Code Deployment from Github.

We're looking forward to seeing how users will take advantage of this functionality to quickly develop and deploy data solutions!


About Shipyard:
Shipyard is a modern data orchestration platform for data engineers to easily connect tools, automate workflows, and build a solid data infrastructure from day one.

Shipyard offers low-code templates that are configured using a visual interface, replacing the need to write code to build data workflows while enabling data engineers to get their work into production faster. If a solution can’t be built with existing templates, engineers can always automate scripts in the language of their choice to bring any internal or external process into their workflows.

The Shipyard team has built data products for some of the largest brands in business and deeply understands the problems that come with scale. Observability and alerting are built into the Shipyard platform, ensuring that breakages are identified before being discovered downstream by business teams.

With a high level of concurrency and end-to-end encryption, Shipyard enables data teams to accomplish more without relying on other teams or worrying about infrastructure challenges, while also ensuring that business teams trust the data made available to them.

For more information, visit www.shipyardapp.com or get started for free.