The AQIS Engine - Dask Part
The AQIS includes submodules supporting efficient data access and processing (including inference), an AQIS Engine to automate Extreme Data transfer and processing workflows, and a catalogue of workflow examples.
The user can choose whether to automate workflows with a full-featured workflow engine (Apache Airflow / AQIS Engine - Airflow) or with a more lightweight approach (AQIS Engine - Dask, based on Python and Dask, which includes workflow/data-orchestration capabilities). The two approaches can also be combined.
AQIS Engine - Dask: Capabilities
The AQIS Engine - Dask submodule allows for Extreme Data workflows using local and remote Dask clusters. Data is accessed via Data System Adaptors.
It thus facilitates flexible management of distributed computing tasks. The Engine leverages the full capabilities of Dask for parallel computing and enhanced performance in data processing and machine learning. Functions can be executed either locally or remotely on a Dask cluster, giving you the flexibility to choose where your computations run.
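As a minimal sketch of the local-versus-remote choice (the function name and scheduler address below are illustrative, not part of the AQIS API), the same function can be submitted to an in-process Dask cluster or, by changing only the `Client` target, to a remote one:

```python
from dask.distributed import Client

def preprocess(values):
    # Illustrative workload; the actual AQIS functions may differ.
    return [v * 2 for v in values]

# In-process local cluster; for a remote cluster, pass its scheduler
# address instead, e.g. Client("tcp://scheduler:8786") (address is an example).
client = Client(processes=False)

future = client.submit(preprocess, [1, 2, 3])
result = future.result()  # [2, 4, 6]
client.close()
```

Because only the `Client` construction changes, the workflow code itself stays identical regardless of where the computation runs.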
In addition, the submodule provides a boilerplate REST API, making it easy to expose custom and predefined functions in an actual deployment. The interface has been designed to be extensible with custom operations, enabling you to add new functionalities tailored to specific use cases or requirements.
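The idea of exposing predefined and custom operations over REST can be sketched as follows; this is a hypothetical stand-in using only the Python standard library, not the submodule's actual API, and the operation names (`square`, `double`) and registry are invented for illustration:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical operation registry; extending the interface with a custom
# operation amounts to adding an entry here.
OPERATIONS = {
    "square": lambda x: x * x,
    "double": lambda x: 2 * x,
}

class OperationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        op = OPERATIONS.get(payload.get("op"))
        if op is None:
            self.send_response(404)
            self.end_headers()
            return
        body = json.dumps({"result": op(payload["x"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging for the sketch.
        pass

def run_server(port=8080):
    # Serve in a background thread so callers can keep working.
    server = HTTPServer(("127.0.0.1", port), OperationHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A client would then POST e.g. `{"op": "square", "x": 4}` and receive `{"result": 16}`; in a real deployment, each registered operation could dispatch its work to the Dask cluster instead of computing inline.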
Example Extreme Data Processing Workflows
We showcase the usage of the AQIS Engine - Dask with the other AQIS submodules via examples provided in the Workflow Catalogue.
Documentation in Detail
The AQIS Engine - Dask submodule is documented via
- a Component Overview,
- a guide for Getting Started,
- and a concise API reference.
Further important information can be found in the LICENSE and README.md files of the individual repositories.