"Shipyard is so easy to use. When we have internal data issues, I’m now able to build solutions that move data between APIs and databases in just a few hours."
Stop spending your time working around limitations and restrictions.
Spend more time working on what matters to your business.
"We don’t want to be in the business of maintaining servers and software. We want to be in the business of moving data. Shipyard lets our data team focus more on the things we’re good at and make a positive impact on the organization."
In Airflow, you can create reusable templates through Operators. The only problem? Someone has to know the Operator exists and be capable of writing a script that uses it.
Shipyard focuses on making templates usable without needing to know code. Each template you create maps a script's inputs to a basic form. End users only need to fill out that form and set triggers to start using the template.
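The mapping described above can be sketched in plain Python. This is an illustrative model only, not Shipyard's actual API: the function, form fields, and values are all hypothetical.

```python
# Hypothetical sketch: how a low-code template might map a script's
# named inputs to a simple form. Names are illustrative, not Shipyard's API.

def send_report(recipient: str, subject: str, channel: str = "email") -> str:
    """The underlying script a data engineer writes once."""
    return f"Sending '{subject}' to {recipient} via {channel}"

# A template exposes the script's parameters as labeled form fields,
# so end users fill out values instead of writing code.
template_form = {
    "recipient": {"label": "Who should receive the report?", "type": "text"},
    "subject":   {"label": "Report subject line",            "type": "text"},
    "channel":   {"label": "Delivery channel",               "type": "select",
                  "options": ["email", "slack"]},
}

# Values collected from the form are passed straight to the script.
user_input = {"recipient": "ops@example.com", "subject": "Daily KPIs",
              "channel": "slack"}
print(send_report(**user_input))
```

The key idea is that the form is just a thin layer over the script's parameters, so the engineer's code and the end user's form can never drift apart.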
There are more than 150 of these low-code Shipyard templates and they’re all open-source. You can see what’s under the hood and make any changes you feel are necessary using your own code.
Plus, when you use a template, your team gets observability into everywhere that template is used. No more guesswork to find every report you send to clients via Slack. You can finally scale the efforts of your data team and feel confident tracking every new solution.
Why code when it’s not necessary?
"Shipyard has a lot of great features sets for the semi technical as well as very technical users who have a need to automate data transformation or various business processes. With seamless connections into many major platforms as well as capability to build your own python scripts, it has unlocked more time and improved many of our day to day tasks for reporting and more."
Data pros are already tasked with considerable responsibilities, and once they've crafted a script to perform a specific task, the last thing they need is the added burden of modifying it just to ensure it executes properly.
In the context of Airflow, the direct transfer of a functional script from a notebook or code editor into your DAG is impractical. Additional steps are required: you must insert an Airflow-specific decorator into the code for it to operate within your DAG, complicating the development process and creating unnecessary hurdles. This complexity can act as a barrier, excluding non-technical team members from participating in the data process and presenting yet another learning curve for new hires.
Shipyard, in contrast, streamlines the transition from development to production. It guarantees that if your script works in a notebook or code editor, it will function identically in Shipyard without the need for any additional, platform-specific decorators. This simplicity accelerates the integration of your code into your pipelines, optimizing the workflow for efficiency and ease.
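The portability argument comes down to writing plain, framework-free functions. A minimal sketch, with an invented transform as the example: because the code below carries no platform-specific decorators, it behaves identically in a notebook, a code editor, or any orchestrator that runs ordinary Python.

```python
# Minimal sketch of a framework-free script. With no platform-specific
# decorators, the same function runs unchanged in a notebook, a code
# editor, or an orchestrator that executes plain Python.
# The transform itself is invented for illustration.

def normalize_revenue(rows):
    """Convert revenue strings like '$1,200' to floats."""
    return [float(r.replace("$", "").replace(",", "")) for r in rows]

print(normalize_revenue(["$1,200", "$350.50"]))  # [1200.0, 350.5]
```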
Why settle for unnecessary complexity?
"Shipyard is way easier to use than the other data pipeline automation tools I investigated. We are able to quickly automate a wide range of our data processing tasks leveraging the great Python support Shipyard provides, including extracting data from and sending data to APIs, data transformation within our data warehouse, generating and emailing reports, and sending system error push alerts to our phones."
Airflow recommends that every task you write be atomic, meaning data doesn't need to be shared between tasks. In practice, this structure results in monolithic scripts that run through a series of cascading steps. When errors inevitably occur, it's difficult to know exactly what went wrong. If you need to re-run a task, you have to start over from the beginning. And if part of your solution needs to be used elsewhere, you'll end up with a nightmare of copied code snippets.
If you choose not to make your scripts atomic, you'll be stuck adding extra code to store or download your files and variables externally, adding even more points of failure to the process.
With Shipyard, every task can be broken down into its core functionality, with data seamlessly transferred from one step to the next. Write three separate components to download data, transform the data with business logic, and upload it to an external service. Reuse these components as many times as you like across the platform.
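The download–transform–upload pattern described above can be sketched as three single-purpose functions that pass data from one step to the next. The function names and data below are hypothetical, with the external calls stubbed out.

```python
# Illustrative sketch of the modular pattern: three small,
# single-purpose steps that hand data from one to the next.
# Names and data are hypothetical; external I/O is stubbed.

def download_data():
    """Step 1: fetch raw records (stubbed for illustration)."""
    return [{"region": "east", "sales": 120}, {"region": "west", "sales": 80}]

def apply_business_logic(records, threshold=100):
    """Step 2: keep only regions above a sales threshold."""
    return [r for r in records if r["sales"] > threshold]

def upload_results(records):
    """Step 3: hand results to an external service (stubbed as a return)."""
    return f"uploaded {len(records)} record(s)"

# Each step can fail, be retried, or be reused independently,
# instead of re-running one monolithic script from the top.
raw = download_data()
filtered = apply_business_logic(raw)
print(upload_results(filtered))
```

Because each step has its own inputs and outputs, a failure in the transform doesn't force you to re-download the data, and the upload step can be reused by any other workflow.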
Plus, your data team can turn these components into custom templates that are available to the rest of the business. Teams can monitor template usage and make changes that bulk update every task built using it.
Why waste time trying to figure out what went wrong and then re-running tasks from the beginning?
"You can do so much with Shipyard that the possibilities are literally endless. Even when there isn't a Blueprint for your use case its very simple to create your own."
Need to run two scripts at the same time using different versions of Pandas or Python? In Airflow, this type of structure is nearly impossible. The typical recommendation is to use the same package versions and to keep these in sync across your multiple Airflow servers. The more complex recommendation is to always use the KubernetesPodOperator, no matter what you need to run.
With Shipyard, every task and workflow is automatically run in a container, using the package dependencies you specify, without the need to write a Docker file. You can orchestrate tasks that use different packages and different languages without worrying about one affecting the other. Stop worrying about managing virtual environments and rest easy that your code will run.
Wouldn't it be nice to never worry about scripts interfering with one another?
Airflow, as an open-source platform, affords you the autonomy to host the tool on your own infrastructure, which can be an advantage, especially when considering security concerns. But this benefit comes with its own set of challenges: as the number of pipelines increases, so does the infrastructure needed to support them. You can run out of storage, use too much memory, or max out compute, bringing your workflows to a halt.
Although larger organizations may have dedicated infrastructure teams capable of managing this growth, their resources might be more effectively allocated to other organizational needs.
Shipyard provides a solution that offloads the burden of infrastructure management from your team. Whether you’re running ten workflows or thousands at the same time, the Shipyard team delivers and maintains the necessary infrastructure, efficiently scaling it to align with your fluctuating demands. This service enables your organization to focus its energies on addressing data challenges, rather than diverting valuable time and resources to the maintenance of infrastructure.
Wouldn't it be nice to never worry about scale?
"For a team like ours, Shipyard's low-maintenance, standalone platform made it easy to build our data integration pipelines. It saved us from a lot of infrastructure work."
As an open-source platform, Airflow allows flexible setup aligned with unique business requirements.
However, customizing Airflow’s security can extend onboarding by weeks or even months, burdening teams and diverting resources. This is especially true for organizations without a dedicated security team, where implementing robust security measures in Airflow falls on the data team.
The bottom line is that working with Airflow means that you're on your own when it comes to ensuring your data is secure.
In contrast, Shipyard offers a streamlined solution that handles security for you. Our platform is equipped with single sign-on (SSO) and multi-factor authentication (MFA) as standard features, and it supports integration with various identity providers like Okta and Active Directory. Additionally, administrators can precisely manage user access throughout the application, enhancing control and security.
Shipyard's commitment to security extends to operational usage as well. When orchestrating workflows that require credentials for external vendor connections, Shipyard conceals these credentials in the user interface once entered, making them inaccessible to users thereafter.
Also, after the completion of a workflow run, Shipyard automatically deletes all generated data, ensuring your information remains secure and confidential. This comprehensive security approach not only safeguards your data but also frees your team to focus on their main goals.
Why take on security when it’s not your strength?
"Shipyard solved a huge problem for me and my team. It gives us a platform where we can take our existing data management scripts and run them in a secure environment without the overhead of managing our own compute resources."
Airflow forces you to manage your workflows through constant uploads to an external server. If you want to keep your code in sync with GitHub, you'll have to manage that process through cron jobs.
Shipyard connects directly to your GitHub repos, allowing you to sync your code to a specific branch or tag at runtime. Easily test out changes on development branches and rest easy knowing that you're always running the most up-to-date code.
Even if you’re not using GitHub, every change you make to your workflows, through the UI or API, gets tracked and versioned. You can easily explore what historical workflow configurations caused a run’s results and revert changes as needed.
Why use a tool that isn't tracking changes for you?
Airflow generates logs that include information from the web server, the scheduler, and the workers that execute tasks. These logs are important for making sure your pipelines run smoothly. If something goes wrong, you’ll need to find the error in your logs and start working on a solution.
But Airflow’s logs land in one giant log file that encompasses the entire workflow. You’ll have to sift through the whole file to find the issue, taking unnecessary time away from your data team that should be spent patching the pipeline.
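To make the "sifting" concrete, here is a small sketch of pulling one task's lines out of a combined log. The log format below is invented for illustration; real Airflow logs look different.

```python
# Sketch: sifting a single combined log for one task's lines.
# The log format here is invented purely for illustration.

log = """\
scheduler | 09:00:01 | dag daily_etl queued
worker    | 09:00:05 | task extract started
worker    | 09:00:09 | task transform started
worker    | 09:00:10 | task transform ERROR division by zero
webserver | 09:00:11 | serving /dags/daily_etl
"""

def lines_for_task(log_text, task_name):
    """Pull only the lines that mention a given task."""
    return [line for line in log_text.splitlines()
            if f"task {task_name}" in line]

for line in lines_for_task(log, "transform"):
    print(line)
```

Per-task logging skips this filtering step entirely: the failing task's output is already isolated when you open it.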
With Shipyard, you can see your logs in the user interface of the application. Shipyard provides logs at the workflow and task level, so you can stay on top of your workflows and make sure everything is running smoothly.
On top of logging in the application, Shipyard will also send you an email the moment a task fails during a run. This allows for continued observability into your processes without having to log in to the application every day.
If you want to keep an eye on how long processes run in your pipelines, you can pop into the application and look through the logs. We also let you export your logs into a dashboarding tool, so you can build the level of observability your organization wants.
Why are you willing to sacrifice observability?
"Shipyard helps us integrate our systems and monitor kicking off data workflows across a variety of tools and platforms. We can now develop custom data workflows easily for any new platform."
As an open-source tool, Airflow is “free” to use. However, you quickly learn that’s not the whole story. Airflow can run on your local machine, but if your computer goes down, the pipeline goes down with it. You need infrastructure that can host your Airflow instance.
Infrastructure costs to host Airflow are just the beginning. You must have a team with infrastructure knowledge to build out and continue maintaining that infrastructure as you scale up your pipelines.
Pipeline building in Airflow is also very technical. All pipelines have to be created using code, and time spent coding costs money. It’s also a sunk cost: due to the technical nature of the tool, the time to begin building in Airflow is measured in weeks and months.
Shipyard is a fully hosted tool. Your organization won’t have to worry about paying for infrastructure to handle your pipelines. When you start to scale up your workflows, you won’t need an infrastructure team. We take that work off your plate.
The option to code is still available with Shipyard. However, our low-code templates let you build pipelines from scratch in days instead of weeks, saving significant time and money without sacrificing capability.
Complexity also significantly narrows the number of people in your organization who can build with Airflow. Thanks to Shipyard’s low-code templates, less technical team members can be involved in the pipeline building process. This is especially important for teams who are being forced to do more with less.
Why settle for expensive complex tooling when an equally powerful and less expensive option is available?
"Shipyard's greatest strengths are its ease of use, ability to save time, and scalability. It's extremely affordable compared to other competitors in the space."
While Airflow can look enticing due to its “free” open-source price, costs start adding up immediately. Besides the need to write a lot of code, which takes up costly data engineering time, if your scripts won't run you'll have to dig through thousands of articles in hopes of finding a solution.
If you find a bug with the platform, you'll need to wait months for the community to fix it (or spend money dedicating engineering resources yourself). If all of your workflows stop working for any reason, you'll have a mad dash to troubleshoot your infrastructure. Why put that level of risk and frustration on your team?
Shipyard prides itself on its support, with a team of dedicated experts to help you through any issues you might run across. Can't get your script to run? We can help. Need a specific feature? We can build it. With the click of a button, our team is available to get you on the right track as soon as possible. Say goodbye to frustration and dead ends and hello to a team of experts eager to help you.
Why wait days, weeks, or months for support when it’s available immediately?
"Shipyard is a great tool run by great people who go above and beyond to help you get what you need, whether that's helping you fix an issue or creating a solution you are struggling to build yourself."
Shipyard is a cloud-native workflow orchestration platform designed for the modern data team. With its intuitive UI and powerful infrastructure, Shipyard can easily automate the toughest tasks in a matter of minutes. Solutions can be created modularly and shared across the organization to help spread the influence and reusability of a data team's work.
Apache Airflow is a well-known open-source project for workflow orchestration, created in 2014 by the data team at Airbnb. While flexible in its potential use cases, its rigidity and adherence to Python-specific code make it more difficult to work with. Infrastructure knowledge is required to ensure a solid foundation for your workflows.