Overview

The Conversion pipeline, also referred to as the Transfer pipeline, provides the main mechanism for getting data into the platform, out of the platform, and moving data within the platform. It is designed primarily for handling large files.

A transfer is often a multi-step process behind the scenes, and sometimes the client must complete several actions to initiate one. The file import scenario is a good example. First, the client must GET a URL where it can upload the file. This URL is an Azure Blob Storage URL with a valid SAS token, and the client must upload the data there, ideally using a blob storage client. This ensures the fastest and most robust way of getting the data close to the platform. When this step is complete, the client can initiate a new Transfer. A Transfer entity tracks the process from start to end.
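
As a rough illustration, the import flow could look like the following minimal Python sketch. The endpoint paths and JSON field names below are hypothetical placeholders, not the platform's actual API; the blob upload uses the azure-storage-blob client, as suggested above.

```python
# Minimal sketch of the file-import flow described above. The endpoint
# paths and JSON field names are hypothetical placeholders, not the
# platform's actual API.
import requests
from azure.storage.blob import BlobClient

BASE_URL = "https://platform.example.com"  # hypothetical platform address

# Step 1: ask the platform for an upload URL (an Azure Blob Storage URL
# with a valid SAS token).
resp = requests.get(f"{BASE_URL}/api/uploads")  # hypothetical endpoint
resp.raise_for_status()
upload_url = resp.json()["uploadUrl"]           # hypothetical field name

# Step 2: upload the file directly to blob storage with a blob client,
# the fastest and most robust way to get the data close to the platform.
blob = BlobClient.from_blob_url(upload_url)
with open("large-file.bin", "rb") as f:
    blob.upload_blob(f, overwrite=True)

# Step 3: initiate the Transfer; the returned Transfer entity tracks the
# process from start to end.
resp = requests.post(
    f"{BASE_URL}/api/transfers",        # hypothetical endpoint
    json={"sourceUrl": upload_url},     # hypothetical request body
)
resp.raise_for_status()
print("Transfer created:", resp.json().get("id"))
```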

The Transfer may be backed by a dedicated process that takes the input data, reads it with the selected reader, transforms the data according to the supplied parameters, and writes it to the selected destination storage/format. In the simplest case, however, when both the input and the result are plain files, the Transfer is handled directly by the web application, which simply moves the file from the upload URL to its final destination. For this to work, the file must have been uploaded to the location provided by the platform; see the GET URL step in the previous section.
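
The distinction between the two cases can be pictured with a small dispatch sketch. This is purely illustrative: TransferRequest, copy_blob, and enqueue_conversion_process are assumed names, not the platform's internals.

```python
# Illustrative dispatch logic only; TransferRequest, copy_blob, and
# enqueue_conversion_process are hypothetical names, not platform internals.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TransferRequest:
    source_url: str            # platform-provided upload location
    destination_url: str
    reader: Optional[str]      # None: input is a plain file, no parsing
    writer: Optional[str]      # None: output is a plain file, no conversion

def copy_blob(src: str, dst: str) -> None:
    """Stub: move/copy a blob from src to dst."""

def enqueue_conversion_process(request: TransferRequest) -> None:
    """Stub: schedule a dedicated process to run the conversion."""

def handle_transfer(request: TransferRequest) -> None:
    if request.reader is None and request.writer is None:
        # Simplest case: plain file in, plain file out. The web application
        # handles it directly by moving the file to its final destination.
        copy_blob(request.source_url, request.destination_url)
    else:
        # General case: a dedicated process reads the input with the
        # selected reader, transforms it, and writes it to the destination.
        enqueue_conversion_process(request)
```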

Further details are described in the rest of this section.

Limitations

Import, export, and conversion tasks running for more than one day are terminated. This limit can be adjusted in the metadata service settings (TransferTaskOptions.MaxWallClockTime).
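
For intuition, a wall-clock limit of this kind can be enforced roughly as in the sketch below. This is not the metadata service's implementation; run_transfer_task is a hypothetical stand-in for the real task body, and the one-day value mirrors the default limit described above.

```python
# Illustrative only: a worker terminating a task that exceeds its wall-clock
# budget. run_transfer_task is a hypothetical stand-in for the real task body.
import asyncio

MAX_WALL_CLOCK_TIME = 24 * 60 * 60  # seconds; one day, the default limit

async def run_transfer_task() -> None:
    """Stub for a long-running import/export/conversion task."""
    await asyncio.sleep(1)  # placeholder work

async def main() -> None:
    try:
        await asyncio.wait_for(run_transfer_task(), timeout=MAX_WALL_CLOCK_TIME)
    except asyncio.TimeoutError:
        print("Transfer terminated: exceeded MaxWallClockTime")

asyncio.run(main())
```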