How Computational Workflows for Genomic Analysis Can Be Simplified
Computational workflows are rarely just computational workflows. Most interface with Laboratory Information Management Systems (LIMS) and next-generation sequencing (NGS) instruments, perform complex input validation, and coordinate processing between on-premises and cloud services. Many developers find writing the code required for these integrations a daunting task, and the resulting complex, integrative workflows can be unwieldy and difficult for end users to operate. There is now a way to simplify these complex analyses. The Seven Bridges Automation Tools and Services allow a developer to quickly write automation scripts that end users can run at the push of a button directly on the Seven Bridges Platform.
The Seven Bridges Automation Development Kit (ADK) allows developers to write simple Python scripts that combine arbitrary operations in Python with calls to external asynchronous services, including task executions on the Seven Bridges Platform. The modules in the ADK take care of the low-level plumbing that is required for system integration, workflow orchestration and any custom steps required during task preparation.
Consider a use case where a workflow accepts inputs that reference different versions of the human genome. Depending on which version is used, the workflow needs a completely different parameter set and must access different external databases. The code to perform the individual operations is in place, but it takes a lot of work to combine it all and keep the complexity hidden from the end user, who simply wants to attach the input data, click “Run” and be served the analysis results.
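The version-dependent choice described above could be captured in a small lookup table. The following is a minimal sketch in plain Python, not ADK code: the build names, the database names, the parameter values, and the `ReferenceConfig` and `select_config` helpers are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReferenceConfig:
    """Settings that depend on the genome version (all values are placeholders)."""
    annotation_db: str                      # hypothetical external database name
    params: Dict[str, str] = field(default_factory=dict)


# Hypothetical mapping from genome version to its parameter set and database.
REFERENCE_CONFIGS = {
    "GRCh37": ReferenceConfig("annotations_grch37", {"contig_prefix": ""}),
    "GRCh38": ReferenceConfig("annotations_grch38", {"contig_prefix": "chr"}),
}


def select_config(genome_build: str) -> ReferenceConfig:
    """Return the settings for a genome version, or fail with a clear message."""
    try:
        return REFERENCE_CONFIGS[genome_build]
    except KeyError:
        supported = ", ".join(sorted(REFERENCE_CONFIGS))
        raise ValueError(
            f"Unsupported genome build {genome_build!r}; expected one of: {supported}"
        ) from None
```

Keeping the mapping in one place means the end user only states which genome version the data uses, and an unsupported version is rejected before any computation starts.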
With the ADK, a developer can quickly script a solution that checks the input data, selects the correct parameter set, calls the programs that talk to external databases, stages the inputs and launches workflows on the Seven Bridges Platform. The ADK script waits until all computations are complete, retrieves the results and then invokes the analysis scripts to generate the final result. All the end user needs to do is supply the script with the correct input data. It’s really a win-win for everyone.
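The sequence of steps above (validate, launch, wait, retrieve, summarize) can be sketched in plain Python. This is only an illustration of the orchestration pattern and does not use the ADK's actual modules: the external services are passed in as callables, and the function names, the accepted file extensions, and the status strings are assumptions made for this example.

```python
import time
from typing import Any, Callable, Sequence

# Assumed terminal states; a real service may use different names.
TERMINAL_STATES = ("COMPLETED", "FAILED", "ABORTED")


def validate_inputs(sample_files: Sequence[str]) -> None:
    """Reject obviously wrong inputs before any computation is launched."""
    if not sample_files:
        raise ValueError("No input files were provided")
    bad = [f for f in sample_files if not f.endswith((".fastq", ".fastq.gz"))]
    if bad:
        raise ValueError(f"Unexpected file types: {bad}")


def wait_until_done(get_status: Callable[[], str],
                    poll_seconds: float = 30.0,
                    timeout_seconds: float = 3600.0) -> str:
    """Poll a status callback until it reports a terminal state or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("Task did not finish before the timeout")
        time.sleep(poll_seconds)


def run_analysis(sample_files: Sequence[str],
                 submit_task: Callable[[Sequence[str]], str],
                 get_status: Callable[[str], str],
                 fetch_results: Callable[[str], Any],
                 summarize: Callable[[Any], Any],
                 poll_seconds: float = 30.0) -> Any:
    """Validate inputs, launch a task, wait for it, then post-process results."""
    validate_inputs(sample_files)
    task_id = submit_task(sample_files)
    status = wait_until_done(lambda: get_status(task_id), poll_seconds)
    if status != "COMPLETED":
        raise RuntimeError(f"Task {task_id} ended with status {status}")
    return summarize(fetch_results(task_id))
```

Passing the services in as callables keeps the sketch independent of any particular platform client, and the same structure makes the orchestration easy to exercise with stand-in functions before connecting it to real systems.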
The ADK script can be hosted on an organization’s internal infrastructure or on the Seven Bridges Platform. The script itself has very low computational requirements, but it must run continuously until finished and needs internet access, so there are benefits to hosting it on the Seven Bridges Platform. Additional benefits of hosting and running automation scripts on the platform include centrally accessible execution logs, collaborative debugging, versioning, quick re-runs, and total reproducibility.
For technical aspects related to the Seven Bridges Automation product suite, please visit our documentation.