Combining machine learning with astrophysics

Collaborations across university schools and academic fields are on the increase. One deeply intertwined example is a project in which Birmingham astrophysicists and artificial intelligence (AI) experts are working together to detect galaxy clusters billions of light years away.

Five academics – two from the School of Computer Science and three from the School of Physics and Astronomy – have joined forces to discover more about the evolution of the Universe by pinpointing faraway groups of galaxies using a mixture of machine learning and astrophysics.

Their paper, ‘Automated detection of galaxy groups through Probabilistic Hough Transform’, was presented at an international conference late last year, and its extended version will be submitted to two journals (one in astrophysics, the other in machine learning) later this year. It won the College’s ‘Paper of the Month’ in November.

This is the first time the Probabilistic Hough Transform (PHT) – a feature extraction method motivated by techniques in image analysis and previously used successfully in bioinformatics – has been adapted to search for the proverbial ‘needle in a haystack’, and thus to help unlock the mysteries of space.

The PHT is a probabilistic variant of the Hough transform, an algorithm commonly used to detect shapes such as straight lines, circles and ellipses in images. In this case, it is being used to detect the weak signatures of galaxy groups.
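
To make the voting idea behind the Hough transform concrete, here is a minimal Python sketch of the classical (non-probabilistic) version for straight lines. It is not the PHT method from the paper: the function name `hough_lines`, the bin sizes and the sample points are all illustrative assumptions.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=0.1):
    """Classical Hough voting for straight lines (illustrative sketch only).

    A line is written in normal form: rho = x*cos(theta) + y*sin(theta).
    Each point votes for every (theta, rho) bin of a line passing through it.
    """
    votes = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_step))] += 1
    return votes


# Hypothetical data: ten points on the line y = x plus three background points.
points = [(k, k) for k in range(10)] + [(3, 8), (7, 1), (0, 9)]

n_theta = 180
votes = hough_lines(points, n_theta=n_theta)

# The bin with the most votes corresponds to the strongest line in the data.
(best_i, best_rho_bin), count = votes.most_common(1)[0]
best_theta_deg = 180.0 * best_i / n_theta
print(best_theta_deg, best_rho_bin, count)
```

Each point votes for every line that could pass through it, so the ten collinear points pile their votes into a single bin while the background points scatter theirs. That is how a faint pattern can stand out against a busy background.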

Professor Peter Tino from the School of Computer Science, one of the paper’s authors, says: ‘Our research centres on groups – or clusters – of galaxies. Our galaxy, the Milky Way, is only part of a huge conglomeration of stars; it turns out that galaxies are often grouped into clusters or groups.

‘Studying these groups tells us a lot about the evolution of the Universe, so it’s very important for astronomers to detect them. However, they are very far away. If you look through a powerful telescope, you see planets and stars, but galaxies are only points. Even if you see lots of points, how do you know which belong to which group? This is where AI comes into play.’

Automated discovery of galaxy groups is possible because of the large volume of detailed, realistic simulations of galaxy and galaxy group formation, together with survey data collected by instruments such as the Hubble Space Telescope.

Peter and his colleagues have introduced a novel methodology, based on the PHT, for finding galaxy groups embedded in a rich background. The model takes advantage of a typical signature pattern of galaxy groups known as ‘fingers-of-God’: the galaxy distribution appears elongated in redshift-space, with the axis of elongation pointing toward the observer. (The spatial distribution of galaxies appears squashed and distorted when their positions are plotted in redshift-space rather than real-space.)
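
The ‘fingers-of-God’ effect can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes the standard low-redshift approximation in which the distance inferred from a galaxy's redshift is its true distance plus its line-of-sight velocity within the group divided by the Hubble constant. The function `mock_group`, the value of `H0` and all group parameters are hypothetical choices made for illustration.

```python
import random
import statistics

H0 = 70.0  # Hubble constant in km/s/Mpc (assumed round value)


def mock_group(n, centre_dist, radius, sigma_v, rng):
    """Generate a hypothetical galaxy group.

    Returns a list of (transverse offset, true distance, redshift-space distance),
    all in Mpc. Galaxy motions inside the group shift the measured redshift,
    so the inferred distance is s = r + v / H0.
    """
    galaxies = []
    for _ in range(n):
        x = rng.gauss(0.0, radius)                # offset across the sky
        r = centre_dist + rng.gauss(0.0, radius)  # true line-of-sight distance
        v = rng.gauss(0.0, sigma_v)               # line-of-sight velocity, km/s
        s = r + v / H0                            # distance inferred from redshift
        galaxies.append((x, r, s))
    return galaxies


rng = random.Random(42)
group = mock_group(n=500, centre_dist=100.0, radius=0.5, sigma_v=400.0, rng=rng)

spread_transverse = statistics.pstdev([g[0] for g in group])
spread_real = statistics.pstdev([g[1] for g in group])
spread_redshift = statistics.pstdev([g[2] for g in group])

print(f"across the sky:          {spread_transverse:.2f} Mpc")
print(f"along line of sight:     {spread_real:.2f} Mpc (real-space)")
print(f"along line of sight:     {spread_redshift:.2f} Mpc (redshift-space)")
```

With these assumed numbers the group is roughly round in real-space but stretched several times over along the line of sight in redshift-space, which produces the finger-like shape pointing at the observer.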

‘Until now, astronomers have been using classical approaches, based on statistics, to detect these galaxy clusters,’ explains Slovakian-born Peter, who has been in the UK for 16 years and at Birmingham for 13. ‘What our approach does is to allow us to include prior astrophysical knowledge as an inherent part of the method.’

The proposed method is first tested in large-scale controlled experiments with 2D patterns and then verified on realistic 3D mock data, where it is compared with the well-known ‘friends-of-friends’ method used in astrophysics.
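
For readers unfamiliar with ‘friends-of-friends’, the basic idea is that any two galaxies closer than a chosen linking length are ‘friends’, and a group is any set of galaxies connected through a chain of friends. The following Python sketch shows that idea in its simplest form, with a single linking length in 3D; the function name, the linking length and the coordinates are illustrative assumptions, and survey implementations are typically more elaborate.

```python
import math


def friends_of_friends(points, linking_length):
    """Group points into connected sets of 'friends' (illustrative sketch).

    Two points are friends if they lie within linking_length of each other.
    Returns groups as lists of point indices, largest group first.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group's representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force pair check; fine for a small example.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical positions in Mpc: a chain of three, a pair, and an isolated galaxy.
galaxies = [
    (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),
    (10.0, 10.0, 10.0), (10.4, 10.0, 10.0),
    (50.0, 50.0, 50.0),
]
groups = friends_of_friends(galaxies, linking_length=0.6)
print(groups)
```

Note that galaxies 0 and 2 are further apart than the linking length but still end up in the same group, because galaxy 1 is a friend of both. The method relies purely on distances, which is the contrast drawn in the article with an approach that builds in astrophysical knowledge.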

‘We know that in the Universe there is dark matter that we don’t see, but that we suspect is there: we can only “see” things that produce waves of some detectable type,’ explains Peter. ‘The Universe looks like filaments, and between them there’s empty space. These galaxy clusters are actually centred in these filaments. The classical techniques trace filaments, but don’t tell us about the physics. Our approach says “we know something about those clusters and we can use that knowledge to help us detect them”.

‘What we’ve found is that this approach can detect groups at least as reliably as the traditional techniques can – as well as giving more information about astrophysics; so it’s not just about statistics. What is more, we think it can get even beyond the capabilities of traditional techniques.’

If you look at a galaxy cluster that’s two billion light years away, you’re seeing it as it was two billion years ago, because its light has taken that long to reach us.

‘So the further you are, the younger you are. We know that if you are younger, then probably the galaxy cluster is smaller, so all these things are in our model, which are not in the traditional approach: it is AI techniques heavily influenced by astrophysics, and we are the first to use Hough transforms in this way.’

Although this is a rapidly developing field, collaborations between astrophysicists and AI researchers are still relatively uncommon, which is why this project is so significant.

‘It is still quite unusual for people from diverse fields to collaborate in such a deep way, but such collaborations are going to be more and more important,’ says Peter. ‘The big thing in science today is interdisciplinary work, and this is a very good example of collaboration that genuinely wouldn’t be able to happen without all the parties involved.’