Big Data Processing for I-LOFAR: REALTA

I-LOFAR is a true ‘big data’ project and, as part of the International LOFAR Telescope, a pathfinder for the future Square Kilometre Array telescope, which will create perhaps the largest dataset in the world (an exabyte per day!).

To date, the data processing capabilities of I-LOFAR have not been sufficient to allow the telescope to study astronomical objects that vary rapidly. However, members of the I-LOFAR consortium at UCC and NUIG have recently obtained SFI funding to create a data processing system that can intensively analyse and rapidly store the data stream as it emerges from the telescope, giving I-LOFAR a new and powerful capability to study the transient radio sky. The system, called the “REALtime Transient Acquisition Cluster (REALTA)”, builds on the heritage of the ARTEMIS system designed by the Oxford e-Research Centre.

In July 2018, a team from UCC, NUIG, TCD and Oxford/Berkeley set up 5 Dell-EMC PowerEdge GPU/CPU blades and up to 300 terabytes of storage. Over the coming months, this will enable us to study a wide range of fascinating astronomical targets such as aurorae in planetary atmospheres, pulsars, solar radio bursts, nova explosions, and fast radio bursts.

In March 2020, the I-LOFAR processing cluster was enhanced with an additional compute node to assist in the Search for Extraterrestrial Intelligence (SETI), in collaboration with the Breakthrough Listen foundation. The plan is to integrate this machine into the REALTA system by hooking it into the live station data stream, where it will perform transient signal analysis alongside any work being carried out by observers on the existing REALTA compute nodes.

I-LOFAR and REALTA are supported by research infrastructure grants from Science Foundation Ireland. REALTA is owned by UCC and NUIG and hosted at TCD’s Rosse Observatory.

Technical specifications for REALTA can be found in the table below.


| | Compute Nodes (x4) | Storage Node | BL Headnode | BL Compute Node |
| Machine Name | Dell PowerEdge R740XD | Dell PowerEdge R730XD | SuperMicro 1029U-TRTP2 | SuperMicro 6049P-E1CR24H |
| CPU Model | Intel Xeon Gold 6130 (2x) | Intel Xeon E5-2640 v4 (2x) | Intel Xeon Silver 4110 (2x) | Intel Xeon Silver 4110 (2x) |
| CPU Clock Speed | 2.10GHz | 2.40GHz | 2.10GHz | 2.10GHz |
| No. CPU Cores (Threads) | 32 (64) | 20 (40) | 16 (32) | 16 (32) |
| RAM | 256GB | 256GB | 93GB | 96GB |
| Storage | 110TB* | 128TB | N/A | 144TB |

* This 110TB is distributed across REALTA’s 4 compute nodes

The figure below shows the individual hardware components of REALTA on the right and how they are connected to the data stream from I-LOFAR on the left.

Block diagram of REALTA hardware and network connection to I-LOFAR.

Observing the Sun at the nanosecond scale with the I-LOFAR Transient Buffer Boards

Transient Buffer Boards (TBBs) are part of I-LOFAR’s data computation hardware. They allow signals from the Sun, stars and other astrophysical objects to be recorded at one of the finest time resolutions possible, showing us how the Sun and stars change at the nanosecond scale.

To do this, TBB data must be recorded to a cluster of computers so that the large amount of data can be stored. Together, I-LOFAR’s 12 TBBs can hold 384 GB, which is only 5 seconds’ worth of observations if all 96 antennas are used. Not only does the cluster need to store large volumes of data, it also has to write it fast enough that nothing is lost before more data comes in. The TBB data cluster in our control room in Birr writes data to a fast, 46 TB storage node.
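To give a feel for these numbers, the short Python sketch below works out the data rates implied by the figures quoted above (384 GB, 12 TBBs, 96 antennas, 5 seconds, 46 TB). It is only a back-of-the-envelope illustration: the function name and its parameters are made up for this example, and decimal units (1 TB = 1000 GB) are assumed.

```python
# Back-of-the-envelope TBB data budget, using only the figures quoted in the text.
# Assumption: decimal units (1 TB = 1000 GB). All names here are illustrative.

def tbb_budget(total_buffer_gb=384, boards=12, antennas=96,
               buffer_seconds=5, storage_tb=46):
    """Return the data rates implied by the quoted TBB figures."""
    aggregate_rate = total_buffer_gb / buffer_seconds  # GB/s across all boards
    return {
        "buffer_per_board_gb": total_buffer_gb / boards,
        "aggregate_rate_gb_per_s": aggregate_rate,
        "rate_per_antenna_gb_per_s": aggregate_rate / antennas,
        # How many complete 384 GB buffer dumps fit on the storage node.
        "full_dumps_on_storage": (storage_tb * 1000) // total_buffer_gb,
    }

if __name__ == "__main__":
    for name, value in tbb_budget().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions each board holds 32 GB, the full buffer corresponds to roughly 77 GB of data for every second observed (about 0.8 GB per second per antenna), and the 46 TB storage node has room for a little over a hundred complete buffer dumps.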

The picture above shows the TBB cluster in the control room. While it may not look pretty, it offers researchers in Trinity College Dublin the chance to observe the fine structure of radio bursts from the Sun at the highest time resolution yet achieved. This will allow them to tackle unanswered questions about energy release and particle acceleration on the Sun. The fat silver unit at the top (WN104) is the fast storage node where all the TBB data is stored, while the smaller units beneath it will be used in the future to process this data and look for new phenomena in the Sun’s atmosphere.