Big Data Processing for I-LOFAR: REALTA

I-LOFAR is a true ‘big data’ project and, as part of the International LOFAR Telescope, is a pathfinder for the future Square Kilometre Array telescope, which will create perhaps the largest dataset in the world (an exabyte per day!).

To date, the data processing capabilities of I-LOFAR have not been sufficient to allow the telescope to study astronomical objects that vary rapidly. However, members of the I-LOFAR consortium at UCC and NUIG have recently obtained SFI funding to create a data processing system that can intensively analyse and rapidly store the data stream as it emerges from the telescope, giving I-LOFAR a new and powerful capability to study the transient radio sky. The system, called the REALtime Transient Acquisition cluster (REALTA), builds on the heritage of the ARTEMIS system designed by the Oxford e-Research Centre.

In July 2018, a team from UCC, NUIG, TCD and Oxford/Berkeley set up 5 Dell-EMC PowerEdge GPU/CPU blades and up to 300 terabytes of storage. Over the coming months, this will enable us to study a wide range of fascinating astronomical targets such as aurorae in planetary atmospheres, pulsars, solar radio bursts, nova explosions, and fast radio bursts.

In March 2020, the I-LOFAR processing cluster was enhanced with an additional compute node to assist in the Search for Extraterrestrial Intelligence (SETI), in collaboration with the Breakthrough Listen foundation and the Berkeley SETI Institute. This machine is planned to be integrated into the REALTA system by hooking into the live station data stream and performing transient signal analysis alongside any work being performed by observers on the existing REALTA compute nodes.

Technical specifications for REALTA are given in the table below; further technical details, together with first results, can be found in Murphy et al., Astronomy & Astrophysics, 2021.


| | Compute Nodes (x4) | Storage Node | BL Headnode | BL Compute Node |
|---|---|---|---|---|
| Machine Name | Dell PowerEdge R740XD | Dell PowerEdge R730XD | SuperMicro 1029U-TRTP2 | SuperMicro 6049P-E1CR24H |
| CPU Model | Intel Xeon Gold 6130 (2x) | Intel Xeon E5-2640 v4 (2x) | Intel Xeon Silver 4110 (2x) | Intel Xeon Silver 4110 (2x) |
| CPU Clock Speed | 2.10GHz | 2.40GHz | 2.10GHz | 2.10GHz |
| No. CPU Cores (Threads) | 32 (64) | 20 (40) | 16 (32) | 16 (32) |
| RAM | 256GB | 256GB | 93GB | 96GB |
| Storage | 210TB* | 128TB | N/A | 144TB |

* This 110TB is distributed across REALTA’s 4 compute nodes

The figure below shows the individual hardware components of REALTA on the right and how they are connected to the data stream from I-LOFAR on the left.


Block diagram for REALTA and I-LOFAR. Data recorded at the Remote Station Processing (RSP) boards are sent to the S1 fibre switch in the I-LOFAR container. Here the data are split into four ‘lanes’, where each lane contains the data from a maximum of one quarter of the beamlets from the observation. The four lanes of data are then sent over a fibre connection to the I-LOFAR control room, where they are recorded by REALTA. Orange arrows indicate the data path along fibre connections. Blue arrows are 1 Gbps Ethernet links and red arrows show InfiniBand connectivity. The dotted orange line is a fibre link to the BL compute node currently under development.
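The lane split described in the caption can be illustrated with a small sketch. This is not the station firmware or the REALTA recording software. It is a hypothetical Python example that assumes beamlets are numbered from 0 and assigned to the four lanes in contiguous blocks, so that no lane carries more than one quarter (rounded up) of the beamlets in an observation. The function name `split_into_lanes` and the beamlet count used below are illustrative assumptions.

```python
import math

N_LANES = 4  # from the caption: the data are split into four lanes


def split_into_lanes(n_beamlets, n_lanes=N_LANES):
    """Assign beamlet indices 0..n_beamlets-1 to lanes in contiguous blocks.

    Assumption (not taken from the source): blocks are contiguous and lanes
    are filled in order, so later lanes may be shorter or empty.
    """
    if n_beamlets < 0:
        raise ValueError("n_beamlets must be non-negative")
    # Largest number of beamlets any one lane may carry (a quarter, rounded up).
    per_lane = max(1, math.ceil(n_beamlets / n_lanes))
    lanes = []
    for lane in range(n_lanes):
        start = lane * per_lane
        stop = min(start + per_lane, n_beamlets)
        lanes.append(list(range(start, stop)))  # empty if start >= n_beamlets
    return lanes


if __name__ == "__main__":
    # Illustrative beamlet count only; the real number depends on the observation.
    for i, lane in enumerate(split_into_lanes(488)):
        print(f"lane {i}: {len(lane)} beamlets")
```

Under these assumptions every beamlet lands in exactly one lane, and an observation with fewer beamlets simply leaves the later lanes partly filled or empty.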

I-LOFAR and REALTA are supported by SFI Research Infrastructure grants. REALTA is owned by UCC and NUIG and maintained by TCD and DIAS.