How much bandwidth should I have for transferring data from the PromethION to my local infrastructure?

The P24 and P48 Data Acquisition Units can store approximately 24 hours of output from 24 or 48 flow cells, respectively, when writing .pod5 and FASTQ files.

A standard sequencing run lasts 64 to 72 hours, so it is essential that data is streamed from the device in real time to prevent runs from terminating due to lack of storage space. Currently, we recommend moving the data from the tower using rsync, run hourly through crontab. For further details, please contact the support team via Live Chat on our website.
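As an illustration only, the hourly transfer could be wrapped in a small Python helper that cron invokes. This is a sketch under stated assumptions, not the supported procedure: the function names, the source directory, and the destination mount point are all hypothetical, and the rsync options shown (`--archive`, `--partial`, `--remove-source-files`) are general-purpose choices rather than settings confirmed by Oxford Nanopore.

```python
import subprocess
from typing import List


def build_rsync_command(source_dir: str, dest_dir: str,
                        remove_source: bool = False) -> List[str]:
    """Build an rsync command that copies run output to network storage.

    All paths are supplied by the caller; none are defaults confirmed
    by the vendor.
    """
    # --archive preserves timestamps and permissions and recurses;
    # --partial keeps partially transferred files so that an
    # interrupted transfer can resume on the next hourly run.
    command = ["rsync", "--archive", "--partial"]
    if remove_source:
        # Assumption: only enable this if files are no longer being
        # written when the transfer runs.
        command.append("--remove-source-files")
    # A trailing slash on the source copies the directory contents
    # rather than the directory itself.
    command.extend([source_dir.rstrip("/") + "/", dest_dir])
    return command


def sync_once(source_dir: str, dest_dir: str,
              remove_source: bool = False) -> int:
    """Run one transfer and return rsync's exit code (0 means success)."""
    result = subprocess.run(
        build_rsync_command(source_dir, dest_dir, remove_source),
        check=False,
    )
    return result.returncode
```

A script that calls `sync_once` with the site's own paths could then be scheduled with a crontab entry such as `0 * * * * /usr/bin/python3 /opt/scripts/sync_run_data.py`, where the script path is again hypothetical. The support team can advise on the recommended configuration.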

To achieve this, the customer site must ensure that connectivity to the local infrastructure has sufficient bandwidth to prevent data backing up on the device. The PromethION offers two 10 Gbit/s ports for this purpose, and the customer can choose between Ethernet and fibre solutions. (USB ports do not provide sufficient bandwidth for real-time data transfer and should not be used.) Below is a worked example showing idealised data transfer times; real transfers could be slower depending on network configuration:

| Data volume | 1 Gbit/s | 10 Gbit/s |
| ---------- | ---------- | ---------- |
| 1 x 200-Gbase flow cell | ~7 hrs | ~1 hr |
| 48 x 200-Gbase flow cells | ~320 hrs | ~32 hrs |
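The arithmetic behind the table can be sketched in a few lines of Python. The article does not state how many bytes a 200-Gbase flow cell produces, so the figure of 3 TB per flow cell used here is an assumption, back-calculated from the ~7 hours at 1 Gbit/s shown above. The function names are illustrative, and the calculation ignores protocol overhead and network contention.

```python
BITS_PER_BYTE = 8

# Assumption: ~3 TB of output per 200-Gbase flow cell. This value is
# inferred from the table above and is not stated in the source.
ASSUMED_TB_PER_FLOW_CELL = 3.0


def transfer_hours(data_tb: float, link_gbit_per_s: float) -> float:
    """Idealised time in hours to move data_tb terabytes over a link."""
    bits = data_tb * 1e12 * BITS_PER_BYTE
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


def required_gbit_per_s(data_tb: float, window_hours: float) -> float:
    """Sustained link rate needed to move data_tb within window_hours."""
    bits = data_tb * 1e12 * BITS_PER_BYTE
    return bits / (window_hours * 3600) / 1e9


if __name__ == "__main__":
    for flow_cells in (1, 48):
        data_tb = flow_cells * ASSUMED_TB_PER_FLOW_CELL
        for link in (1, 10):
            hours = transfer_hours(data_tb, link)
            print(f"{flow_cells} flow cell(s) at {link} Gbit/s: {hours:.1f} h")
    # Rate needed to stream 48 flow cells' output within a 72-hour run.
    rate = required_gbit_per_s(48 * ASSUMED_TB_PER_FLOW_CELL, 72)
    print(f"Sustained rate for 48 flow cells over 72 h: {rate:.1f} Gbit/s")
```

Under the 3 TB assumption, this gives about 6.7 and 0.7 hours for one flow cell and 320 and 32 hours for 48 flow cells, in line with the table. It also suggests that streaming 48 flow cells over a 72-hour run needs roughly 4.4 Gbit/s sustained, which a 1 Gbit/s link could not deliver.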

The PromethION runs Ubuntu and can mount multiple file system types. We recommend storage presented as NFS or CIFS. Storage that receives the real-time data stream should be SSD-based, because SSDs offer higher write speeds than HDDs. After the initial write to networked SSD storage, data can be moved to storage with a slower write speed for long-term retention.

The form and volume of data to be stored will depend on customer requirements:

- Storing .pod5 files with raw read data is optional, but it permits re-basecalling of data when new algorithms are released by Oxford Nanopore. New basecaller releases have enabled significant improvements in the basecalling accuracy of existing datasets through re-basecalling. Further, selected Oxford Nanopore and third-party tools use the raw signal information contained within the .pod5 files to extract additional information, e.g. modified base calling, reference-guided SNP calling, or polishing of data.

- Retaining only FASTQ and/or BAM files allows the use of standard downstream analysis tools that work on the DNA/RNA sequence, but the data cannot be re-basecalled when improvements in basecalling become available.

Oxford Nanopore is unable to provide exact recommendations for storage, as these will be site-specific. The above guidelines and requirements should be taken into consideration.