A machine-learning algorithm has sniffed out 50 highly likely exoplanets previously hidden in data collected by NASA’s now-defunct Kepler space telescope.
The system uses a Gaussian process classifier that crunches through a list of planet candidates and assigns each one a percentage describing how likely it is to be an alien world.
The software trained itself on readings of confirmed planets and a larger data set of false positives to learn what to spot and what to ignore. It was then shown fresh observations and assigned each one a percentage chance that the data indicated a fully fledged exoplanet. Those with a tiny false-positive chance were selected as likely alien planets.
“The algorithm we have developed lets us take fifty candidates across the threshold for planet validation, upgrading them to real planets,” said David Armstrong, a research fellow at the UK’s University of Warwick who led the study. “Rather than saying which candidates are more likely to be planets, we can now say what the precise statistical likelihood is. Where there is less than a one per cent chance of a candidate being a false positive, it is considered a validated planet.”
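To make the validation rule concrete, here is a minimal sketch of the thresholding step Armstrong describes. It is not the team's code: the candidate names and probabilities are invented for illustration, and the function name `validated` is hypothetical. Only the one per cent cut-off comes from the quote above.

```python
# Illustrative only: candidate names and probabilities are made up.
# The 1% false-positive cut-off is the only figure taken from the article.
VALIDATION_THRESHOLD = 0.01


def validated(candidates, threshold=VALIDATION_THRESHOLD):
    """Return names whose false-positive probability is below the threshold.

    `candidates` maps a candidate name to the classifier's estimated
    probability that the object is a planet (a value between 0 and 1).
    """
    return [
        name
        for name, p_planet in candidates.items()
        if (1.0 - p_planet) < threshold
    ]


# Hypothetical classifier output for four candidates.
candidates = {
    "candidate-A": 0.998,   # 0.2% false-positive chance
    "candidate-B": 0.95,    # 5% false-positive chance
    "candidate-C": 0.9905,  # 0.95% false-positive chance
    "candidate-D": 0.40,    # more likely a false positive than a planet
}

print(validated(candidates))
```

In this toy run only the first and third candidates clear the bar. The second, at 95 per cent, would still count as a strong candidate but not as a validated planet.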
Armstrong told The Register that the algorithm was “extremely accurate,” and only mislabeled three objects out of nearly 8,000 candidates. “We found a couple of apparent errors in the results, but these actually turned out to be errors in the previous labels,” he said.
The 50 exoplanets identified by the software have a variety of properties. Some are as big as Neptune, others are smaller than Earth. Some take 200 days to complete a full orbit around their host stars while others zip round in a single day.
The now-defunct Kepler space telescope detected exoplanets by watching the brightness levels of nearby stars and spotting periodic decreases of starlight that could be caused by a planet passing in front of them.
But such dips are not always caused by an exoplanet: similar signals can come from binary star systems or even from flaws in the telescope’s equipment. To make the algorithm more accurate, secondary features such as the object’s size and shape are taken into account.
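As a rough illustration of the transit method described above, the sketch below builds a synthetic, noise-free light curve with a periodic dip and measures the dip's depth by comparing in-transit and out-of-transit brightness. Everything here is an assumption made for the example, including the period, the dip depth, the sampling and the helper names `phase_offset` and `transit_depth`. It is not Kepler's pipeline or the Warwick team's method, and real data would be noisy and need a proper period search.

```python
def phase_offset(t, period, epoch):
    """Time in days from t to the nearest expected mid-transit."""
    return (t - epoch + 0.5 * period) % period - 0.5 * period


def transit_depth(times, flux, period, epoch, duration):
    """Mean out-of-transit flux minus mean in-transit flux."""
    inside, outside = [], []
    for t, f in zip(times, flux):
        in_transit = abs(phase_offset(t, period, epoch)) <= 0.5 * duration
        (inside if in_transit else outside).append(f)
    return sum(outside) / len(outside) - sum(inside) / len(inside)


# Made-up planet: a 1% dip lasting 0.2 days, repeating every 3 days.
period, epoch, duration, depth = 3.0, 1.0, 0.2, 0.01

# 30 days of observations, one sample every 0.05 days.
times = [i * 0.05 for i in range(600)]
flux = [
    1.0 - depth
    if abs(phase_offset(t, period, epoch)) <= 0.5 * duration
    else 1.0
    for t in times
]

estimate = transit_depth(times, flux, period, epoch, duration)
print(f"estimated dip depth: {estimate:.4f}")
```

A binary star or an instrument glitch could produce a similar dip in this kind of measurement, which is why depth alone does not settle whether a candidate is a planet.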
It’s difficult to find needles in the massive haystacks of data amassed by spacecraft like Kepler. There are too many samples for astronomers to eyeball; many exoplanets have been discovered with automated methods, such as the VESPA algorithm.
“Almost 30 per cent of the known planets to date have been validated using just one method, and that’s not ideal. Developing new methods for validation is desirable for that reason alone. But machine learning also lets us do it very quickly and prioritize candidates much faster,” Armstrong said.
The team is now working on open sourcing their algorithm so that other scientists can use it to analyse data from other astronomical surveys.
“We are aiming to use the algorithm to discover exoplanets in data from the NASA TESS mission. With a rigorous automated procedure like this we can look at the populations of planets TESS finds in a statistical way and try to uncover clues about where planets are common and how they formed,” he concluded. ®