Artist’s impression of Gaia. Credit: ESA/ATG medialab; background image: ESO/S. Brunier
Gaia is a European Space Agency mission to map the Milky Way galaxy in three dimensions.
Gaia was launched in December 2013 and is currently orbiting the second Lagrange point (L2), 1.5 million kilometers from Earth in the anti-Sun direction, co-rotating with Earth in its one-year orbit around the Sun. Gaia observes continuously with its two telescopes.
The main goals of this €700 million mission are to measure the precise positions, motions and luminosities of ~1 billion stars and to discover thousands of supernovae and of planets around other stars.
Since July 2014 Gaia has made nearly 100 billion measurements with its one-billion-pixel digital camera. Gaia’s database will eventually grow to 1 petabyte, equivalent to about 200,000 DVDs’ worth of data.
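As a rough check of the DVD comparison, the short Python sketch below divides one petabyte by the capacity of a single disc. It assumes decimal units (1 petabyte = 10^15 bytes) and a single-layer DVD of 4.7 gigabytes; neither figure is stated in the text, and the function name is purely illustrative.

```python
# Assumption: decimal units, 1 petabyte = 10**15 bytes.
PETABYTE_BYTES = 10**15
# Assumption: single-layer DVD capacity of 4.7 GB (4.7 * 10**9 bytes).
DVD_BYTES = 4.7e9


def dvds_needed(total_bytes, dvd_bytes=DVD_BYTES):
    """Return how many discs of size dvd_bytes hold total_bytes (fractional)."""
    return total_bytes / dvd_bytes


if __name__ == "__main__":
    # Prints a figure a little above 200,000, consistent with the estimate in the text.
    print(round(dvds_needed(PETABYTE_BYTES)))
```

Under these assumptions the result lands slightly above 200,000 discs, so the figure quoted above is a sensible round number.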
Analysis of this data will result in the creation of a three-dimensional map of the Milky Way galaxy.
Data volumes in astronomy now exceed what even large teams of astronomers and volunteer citizen scientists can inspect visually. Software and hardware frameworks therefore play an increasingly central role in scientific discovery from astronomical archives, taking over tasks traditionally performed by humans in the loop.
To meet the extraordinary data analysis challenge posed by the key science goals of the mission, more than 80 institutes, involving over 500 researchers around the world, formed the Gaia ‘Data Processing and Analysis Consortium’, which has worked for more than a decade to solve these problems.
In the much richer data environment of the early 2020s, when the final Gaia archive will be available over the internet, astronomers will want to connect it with ground-based archives (e.g. from the Square Kilometre Array) and with other space-based archives. As complex applications are developed, they will need to run as distributed applications that access one or more complete archives without pulling entire datasets over the internet.
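The idea of running an application against complete archives without moving the data can be illustrated with a toy Python sketch. Everything here is hypothetical: the `Archive` class, its `run` method, the archive names and the sample rows are stand-ins invented for illustration, not part of any Gaia or Square Kilometre Array interface. The point is only that a small task travels to each archive and a small result travels back.

```python
from dataclasses import dataclass


@dataclass
class Archive:
    """Hypothetical stand-in for a remote archive that can run code next to its data."""
    name: str
    rows: list  # each row is a dict; in reality this would stay on the archive's servers

    def run(self, task):
        # The task executes "at the archive"; only its (small) result is returned.
        return task(self.rows)


def count_bright(rows, limit=15.0):
    """Example task: count sources brighter than a magnitude limit (smaller = brighter)."""
    return sum(1 for row in rows if row["mag"] < limit)


# Two made-up archives with made-up rows, purely for illustration.
archives = [
    Archive("gaia_like", [{"mag": 12.1}, {"mag": 17.3}, {"mag": 14.9}]),
    Archive("radio_like", [{"mag": 16.0}, {"mag": 13.2}]),
]

# Send the task to every archive and combine the partial results locally.
partial_counts = {archive.name: archive.run(count_bright) for archive in archives}
total_bright = sum(partial_counts.values())

if __name__ == "__main__":
    print(partial_counts, total_bright)
```

In this sketch only one integer per archive crosses the network, however large the underlying tables are. A real system would also need authentication, scheduling and a shared query language, none of which are modelled here.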