‘Deep learning’ provides insights into cosmological structure formation
How can machine learning methods help us understand our tangled cosmic web? A new study presents a ‘deep learning’ framework to shed light on the physics of the formation of dark matter halos. The results show that spherical averages over the initial conditions of the Universe carry the most relevant information about the final mass of halos.
All cosmic structures in the Universe were seeded by tiny fluctuations in the density of matter in the early Universe. Due to gravitation, these small perturbations grow over cosmic time into extended halos of dark matter, connected by walls and filaments and surrounded by empty voids. Normal matter follows this dark matter distribution, so that large-scale observations of our Universe show that galaxies and galaxy clusters form a “cosmic web”. While the non-linear evolution of matter can be computed using cosmological simulations, a theoretical understanding of this complex process remains elusive.
In our study, we use a deep learning framework to learn more about the non-linear relationship between the initial conditions and the final dark matter halos in cosmological simulations (Fig. 1). With this framework, we want to improve our physical understanding of how non-linear, late-time cosmic structures emerge from the linear initial conditions. The major barrier to realising this goal is understanding and explaining how and why complex deep learning algorithms reach particular decisions; in most applications, they have effectively acted as “black boxes”. In our case, we wish to understand which features of the initial conditions the algorithm extracts to make its final predictions.
Our three-dimensional convolutional neural network (CNN) is trained on the non-linear relationship between the initial density field and the final mass of dark matter halos in cosmological simulations. The CNN consists of six layers, where features are extracted across the layers in a hierarchical fashion (from low-level, local features in the initial layers to high-level, more abstract ones in subsequent layers). Two fully connected layers then combine the features to return the final prediction. By training the network across many examples of mapping initial conditions to halo masses, the model learns to identify those aspects of the initial density field that impact the final mass of the resulting halos.
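The layered feature extraction described above can be illustrated with a toy forward pass. This is only a sketch, not the network used in the study: the 15-voxel input size, the 3×3×3 kernels, the all-ones weights, the single channel and the single linear readout (standing in for the fully connected layers) are all assumptions chosen so the arithmetic stays easy to follow. Real CNNs learn their weights and use many channels per layer.

```python
def conv3d_valid(volume, kernel):
    """Single-channel 3D 'valid' convolution (cross-correlation) on nested lists."""
    n, k = len(volume), len(kernel)
    m = n - k + 1  # each layer trims (k - 1) voxels per side length
    out = [[[0.0] * m for _ in range(m)] for _ in range(m)]
    for i in range(m):
        for j in range(m):
            for l in range(m):
                out[i][j][l] = sum(
                    volume[i + a][j + b][l + c] * kernel[a][b][c]
                    for a in range(k) for b in range(k) for c in range(k)
                )
    return out


def relu(volume):
    """Element-wise non-linearity applied after each convolution."""
    return [[[max(0.0, v) for v in row] for row in plane] for plane in volume]


def forward(volume, kernels, readout_weights, readout_bias):
    """Apply the conv + ReLU layers in order, flatten, then a linear readout."""
    x = volume
    for kernel in kernels:
        x = relu(conv3d_valid(x, kernel))
    flat = [v for plane in x for row in plane for v in row]
    prediction = readout_bias + sum(w * v for w, v in zip(readout_weights, flat))
    return flat, prediction


# Hypothetical setup: six identical 3x3x3 kernels applied to a 15^3 density cube.
ones_kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
kernels = [ones_kernel] * 6
density = [[[1.0] * 15 for _ in range(15)] for _ in range(15)]

# Six 'valid' layers shrink the cube from 15 to 3 voxels per side (27 features).
features, prediction = forward(
    density, kernels, readout_weights=[1.0] * 27, readout_bias=0.0
)
```

Each output voxel of a deeper layer depends on a larger region of the input than a voxel of an earlier layer, which is the sense in which later layers see more extended, higher-level features.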
The crucial step now is to generate a physical interpretation of the mapping learnt by the machine learning tool: we remove part of the input information, re-train the model and measure the resulting change in the model's performance. This simple and effective technique reveals which parts of the input influence the model's output.
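The remove-and-re-train procedure can be sketched on a toy problem. Everything here is an assumption made for illustration: a linear least-squares fit stands in for the CNN, the data are synthetic, and the helper names (`fit_least_squares`, `retrain_on_columns`) are hypothetical. The target depends only on column 0 by construction, so dropping column 1 should leave the held-out error unchanged, while dropping column 0 should degrade it badly.

```python
import random


def fit_least_squares(rows, targets):
    """Fit y ~ w.x + b by solving the normal equations (Gauss-Jordan elimination)."""
    design = [list(r) + [1.0] for r in rows]  # trailing 1.0 is the bias column
    p = len(design[0])
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(p)]
        + [sum(d[i] * t for d, t in zip(design, targets))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def mean_squared_error(coeffs, rows, targets):
    preds = [sum(c * v for c, v in zip(coeffs, list(r) + [1.0])) for r in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def retrain_on_columns(columns, train, test):
    """Keep only `columns`, re-fit from scratch, return the held-out error."""
    def keep(rows):
        return [[r[c] for c in columns] for r in rows]

    (x_train, y_train), (x_test, y_test) = train, test
    coeffs = fit_least_squares(keep(x_train), y_train)
    return mean_squared_error(coeffs, keep(x_test), y_test)


rng = random.Random(0)


def make_split(n):
    # Synthetic data: the target ignores column 1 by construction.
    x = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [2.0 * a + rng.gauss(0, 0.1) for a, _ in x]
    return x, y


train, test = make_split(500), make_split(500)
err_full = retrain_on_columns([0, 1], train, test)     # all inputs
err_no_col1 = retrain_on_columns([0], train, test)     # irrelevant input removed
err_no_col0 = retrain_on_columns([1], train, test)     # relevant input removed
```

Comparing `err_no_col1` and `err_no_col0` against `err_full` shows which inputs the fitted model actually relies on, without inspecting its internal weights.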
We remove anisotropic information about the initial density field by replacing the raw density with its spherical averages, and re-train the CNN (Fig. 2). The two models, one trained on the raw density inputs and the other on the averaged-density inputs, return consistent predictions; the performance of the CNN does not degrade if we remove anisotropic information about the density. Therefore, the features learnt by the CNN are equivalent to spherical averages over the initial density field. This means that anisotropic properties of the initial density field carry no relevant information for establishing the final mass of dark matter halos.
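One way to strip anisotropic information from a density cube is sketched below. The text does not spell out how the averages were computed, so the details here are assumptions: voxels are grouped into shells by their rounded distance from the central voxel, and each voxel is replaced by the mean of its shell. The function name `shell_average` and the 5-voxel toy cube are hypothetical.

```python
import math


def shell_average(cube):
    """Replace each voxel by the mean over all voxels at the same rounded
    distance from the central voxel, discarding directional information."""
    n = len(cube)
    c = n // 2  # index of the central voxel along each axis

    def shell(i, j, k):
        return round(math.sqrt((i - c) ** 2 + (j - c) ** 2 + (k - c) ** 2))

    sums, counts = {}, {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                s = shell(i, j, k)
                sums[s] = sums.get(s, 0.0) + cube[i][j][k]
                counts[s] = counts.get(s, 0) + 1

    return [
        [[sums[shell(i, j, k)] / counts[shell(i, j, k)] for k in range(n)]
         for j in range(n)]
        for i in range(n)
    ]


# Toy field: a radial profile (3 - shell index) plus a gradient along one axis.
n, c = 5, 2
cube = [
    [[3.0 - round(math.sqrt((i - c) ** 2 + (j - c) ** 2 + (k - c) ** 2))
      + 0.5 * (i - c)
      for k in range(n)]
     for j in range(n)]
    for i in range(n)
]
averaged = shell_average(cube)
```

In this toy field the gradient cancels within every shell, so `averaged` retains only the radial profile: voxels at equal distance from the centre end up with identical values regardless of direction.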
This finding leads to a re-evaluation of our existing interpretations of gravitational collapse, which are based on analytic approximations of structure formation. For decades, the interpretation from analytic models has been that accounting for anisotropic properties of the early Universe, such as external tidal shear effects, yields an improved halo collapse model compared to one based on isotropic properties alone. Instead, we show the contrary: anisotropic properties of the initial density field do not play a relevant role in establishing final halo masses. A crucial test of the robustness of our framework was to demonstrate that the deep learning model can effectively extract spatially local features on all scales and yield robust halo mass predictions that match expectations for a simpler test-case scenario.
Our work shows that interpretable deep learning frameworks can provide a powerful tool for extracting insights into cosmological structure formation. Developing toolkits for deep learning interpretability is also of great interest to the broader science community, as only by understanding how machine learning models reach their predictions can scientists trust AI tools in scientific applications.