Title: Autoencoding Galaxy Spectra I: Architecture
Authors: P. Melchior, Y. Liang, C. Hahn, et al.
Institution of the first author: Department of Astrophysical Sciences, Princeton University, NJ, USA and Center for Statistics & Machine Learning, Princeton University, NJ, USA
Status: Submitted to AJ
Imagine for a moment that you had never seen a car before and really wanted to build a toy car as a gift for one of your car-loving friends. Naturally, faced with this conundrum, you might feel that the outlook is bleak. However, now suppose another friend arrives and has the brilliant idea of sending you to the nearest busy street corner, explaining that anything passing by can be considered a car. They then bid you farewell and leave. Armed with this new information, you are invigorated. By watching vehicles pass for a few minutes, you quickly get an idea of what a car could be like. After about an hour, you head inside, ready for your challenge: you have to build your own (toy) car based on what you’ve seen.
This example is obviously contrived, but it provides a useful touchpoint from which we can heuristically understand how an autoencoder, or, more generally, any unsupervised machine learning algorithm, works (see these astrobites for examples of machine learning used in astronomy). If you think about how you would approach the above challenge, the general strategy might be clear: with just observations of passing cars, you would latch onto the patterns you noticed and use them to “reconstruct” the definition of a car in your mind. Perhaps you would see that most of them had four wheels, many had headlights, and they were all generally the same shape. From there, you might be able to build a decent little car that your friend would be proud of. This is the basic idea behind an unsupervised learning task: the algorithm is presented with data and tries to identify relevant features of that data to help achieve a goal presented to it. In the specific case of an autoencoder, that goal is to learn how to reconstruct the original data from a compressed representation of it, like you were trying to do when building a toy car from memory. In particular, you (or the computer) summarize the observations of cars (the data) into shared characteristics (the so-called “latent features”) and reconstruct the car from these characteristics (reconstruct the data). This process is shown schematically in Figure 1.
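To make the compress-then-reconstruct idea concrete, here is a minimal sketch of the simplest possible autoencoder: a linear one, fitted with a singular value decomposition (equivalent to principal component analysis). This is not the method of today's paper, which uses a nonlinear neural network; the function names, array sizes, and toy “car” data are all invented for illustration.

```python
import numpy as np

def fit_linear_autoencoder(data, n_latent):
    """Find the n_latent directions that capture the most variance in the data."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_latent]

def encode(data, mean, components):
    """Compress each observation into a handful of latent features."""
    return (data - mean) @ components.T

def decode(latents, mean, components):
    """Reconstruct the full observation from its latent features."""
    return latents @ components + mean

# Toy data: 200 "cars", each described by 50 measurements that are secretly
# controlled by only 2 underlying properties (hypothetical numbers).
rng = np.random.default_rng(0)
true_latents = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 50))
cars = true_latents @ mixing

mean, components = fit_linear_autoencoder(cars, n_latent=2)
latents = encode(cars, mean, components)
reconstruction = decode(latents, mean, components)
max_error = float(np.max(np.abs(reconstruction - cars)))
print(latents.shape, max_error)
```

Because the toy data really do have only two degrees of freedom, two latent features are enough to reconstruct all 50 measurements almost perfectly. Real spectra are not exactly low-dimensional, which is where a nonlinear encoder and decoder become useful.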
Application to astronomy
To understand how this machine learning method is applied in today’s article, let’s take the car-building example a little further. Rather than being able to observe a random sample of passing cars, let’s say this time your friend was less resourceful and just pulled up pictures of a few different cars on their phone. Based on that, you might be able to do a decent job of producing something that looks like a car, but your car probably wouldn’t run very well (for example, maybe the wheels would be fixed to the frame and your car wouldn’t move). Alternatively, maybe your friend wanted to challenge you more and just described how a car works. In this case, you might be able to build a functional little car, but it probably wouldn’t look very accurate. These scenarios are loosely analogous to some of the challenges present in current approaches to modeling galaxy spectra, the subject of today’s article.
Approaches to modeling galaxy spectra can be divided between empirical, data-driven models and theoretical models. The first of these are equivalent to the pictures of cars your friend showed you: astronomers use “template” spectra from observations of local galaxies to construct model spectra that can be fitted to observations of systems at higher redshifts. While useful, these are usually based on observations of local galaxies and can therefore be restricted to a limited wavelength range once the correction for cosmological redshift is applied. The theoretical models, on the other hand, reflect your friend’s last suggestion; that is, they produce model spectra based on a physical understanding of emission and absorption in the interstellar medium, in stars, and in nebulae. These are interpretable and physically motivated, so they can be applied to higher redshifts, for example, but they usually rely on some approximation and are therefore unable to accurately capture the complexity of true spectra.
Despite these challenges, today’s authors note that the historical utility of applying model spectra to describe new observations of other spectra implies that such data may not be as inherently complicated as they may seem: perhaps the variations between spectra can be narrowed down to a few relevant parameters. This recalls the discussion of autoencoders above and inspires the approach of today’s article: perhaps one can find a low-dimensional embedding (read: simpler representation) of the spectra that makes the reconstruction an easy task.
How to Build a Galaxy Spectrum
Most traditional galaxy spectrum analysis pipelines work by transforming the observed (redshifted) spectrum into the emitted spectrum in the galaxy’s rest frame, and then fitting the observation to a model. This means that spectra are generally limited in their usable wavelengths to the range that is shared among all of the different spectra in a survey sample. In their architecture, today’s authors choose to keep the spectra as they are observed, which means they don’t do any redshift processing before analyzing them. This allows them to present the algorithm with the full wavelength range of each observation, thus preserving more of the data. Today’s article presents this algorithm, called SPENDER, which is schematically represented in Figure 2.
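To see why shifting everything to the rest frame throws data away, recall that a rest-frame wavelength is the observed wavelength divided by (1 + z). The short sketch below works this out for two galaxies observed with the same instrument. The wavelength coverage and redshifts are illustrative assumptions, not values from the paper, and the function names are made up.

```python
def to_rest_frame(wavelength_obs, z):
    """Convert an observed wavelength to the galaxy's rest frame."""
    return wavelength_obs / (1.0 + z)

def shared_rest_range(obs_min, obs_max, redshifts):
    """Rest-frame wavelength range covered by every galaxy in the sample."""
    lo = max(to_rest_frame(obs_min, z) for z in redshifts)
    hi = min(to_rest_frame(obs_max, z) for z in redshifts)
    return lo, hi

# Assumed instrument coverage in Angstroms, and two assumed redshifts.
obs_min, obs_max = 3800.0, 9200.0
redshifts = [0.0, 0.5]

lo, hi = shared_rest_range(obs_min, obs_max, redshifts)
observed_width = obs_max - obs_min
shared_width = hi - lo
print(lo, hi, shared_width / observed_width)
```

Each galaxy is observed over the same span, but the part of the rest-frame spectrum that both galaxies cover is noticeably narrower, and it keeps shrinking as the redshift spread of the sample grows.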
The algorithm takes an input spectrum and first passes it through three convolutional layers to reduce its dimensionality. Then, because the spectra were not shifted to the rest frame, the processed data are passed through an attention layer. This is very similar to what you would do while watching cars go by on the street: although there are lots of cars passing by and they’re all in different places and moving at different speeds, you focus your attention on particular cars and particular features of those cars to train your neural network (read: brain) to learn what a car is. This layer does the same thing by identifying which parts of the spectrum it should focus on; that is, where the relevant emission and absorption lines may be found. Finally, to conclude the encoding, the data are passed through a multi-layer perceptron (MLP), which shifts the data to the galaxy’s rest frame and compresses them to s dimensions (the desired dimensionality of the latent space).
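The role of the attention step can be illustrated with a toy example. The sketch below assumes attention takes the common form of a softmax-weighted average over wavelength pixels; the paper's exact layer may differ, and in a real network the scores would be learned rather than taken from the flux as they are here. It shows that the summary comes out the same no matter where an emission line lands on the detector, which is what is needed when spectra arrive at different redshifts.

```python
import numpy as np

def softmax(x):
    """Turn arbitrary scores into positive weights that sum to one."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(features, scores):
    """Average the features, weighted by how much attention each pixel gets."""
    weights = softmax(scores)
    return float(np.sum(weights * features)), weights

def spectrum_with_line(center, n_pixels=100):
    """Flat continuum plus one Gaussian emission line at pixel `center`."""
    pixels = np.arange(n_pixels)
    return 1.0 + 5.0 * np.exp(-0.5 * ((pixels - center) / 2.0) ** 2)

# The same galaxy at two redshifts: the line lands on different pixels.
spec_a = spectrum_with_line(30)
spec_b = spectrum_with_line(60)

# Hypothetical stand-in for learned scores: brighter pixels score higher.
pooled_a, weights_a = attention_pool(spec_a, 3.0 * spec_a)
pooled_b, weights_b = attention_pool(spec_b, 3.0 * spec_b)

print(int(np.argmax(weights_a)), int(np.argmax(weights_b)))
print(pooled_a, pooled_b)
```

The attention weights peak at the location of the line in each spectrum, while the pooled value is essentially identical for both, so later layers see the same summary regardless of the shift.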
The model must now decode the embedded data and attempt to reproduce the original spectrum. It does this by passing the data through three “activation layers”, which process the data through a predefined function. These layers transform the simple, low-dimensional (latent) representation of the data into a spectrum in the galaxy’s rest frame. Finally, this representation is redshifted back to the observed frame and the reconstruction process is complete.
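The last step of the decoder, moving a rest-frame model back to what the telescope would see, can be sketched in a few lines. This is a simplified illustration under stated assumptions: it stretches the wavelengths by (1 + z) and linearly interpolates onto an assumed instrument grid, ignoring flux dimming and instrumental resolution. The grids, line position, and function name are hypothetical.

```python
import numpy as np

def redshift_and_resample(wave_rest, flux_rest, z, wave_obs):
    """Stretch rest-frame wavelengths by (1 + z), then interpolate the
    model onto the wavelengths the instrument actually samples."""
    wave_shifted = wave_rest * (1.0 + z)
    return np.interp(wave_obs, wave_shifted, flux_rest)

# Rest-frame model: flat continuum with one emission line at 5000 Angstroms.
wave_rest = np.linspace(2000.0, 10000.0, 8001)
flux_rest = 1.0 + 4.0 * np.exp(-0.5 * ((wave_rest - 5000.0) / 3.0) ** 2)

# Assumed observed-frame grid and redshift.
wave_obs = np.linspace(3800.0, 9200.0, 5401)
z = 0.2

flux_obs = redshift_and_resample(wave_rest, flux_rest, z, wave_obs)
peak_obs = float(wave_obs[np.argmax(flux_obs)])
print(peak_obs)
```

The line that sits at 5000 Å in the rest frame shows up near 5000 × (1 + 0.2) = 6000 Å in the reconstructed observation, so the model output can be compared directly with the data as they came from the instrument.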
In practice, the contributions of different parts of the data to the final result depend on initially unknown weights. To learn these weights, the model is trained: the reconstructed and original data are compared and the weights are adjusted (roughly by trial and error) until the optimal set of weights is reached.
So how does it work?
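Here is a minimal sketch of what such training can look like, assuming plain gradient descent on the mean squared reconstruction error of a tiny linear autoencoder. The optimizer, loss function, learning rate, and toy data are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "spectra": 200 samples of 20 pixels, driven by 2 hidden parameters.
n, d, k = 200, 20, 2
x = rng.normal(size=(n, k)) @ (rng.normal(size=(k, d)) / np.sqrt(d))

# Initially unknown weights: start from small random values.
w_enc = 0.1 * rng.normal(size=(d, k))
w_dec = 0.1 * rng.normal(size=(k, d))

learning_rate = 0.05
losses = []
for step in range(2000):
    latents = x @ w_enc        # encode
    recon = latents @ w_dec    # decode
    residual = recon - x       # compare reconstruction with the original
    losses.append(float(np.sum(residual ** 2) / n))

    # Gradients of the loss with respect to each set of weights.
    grad_dec = (2.0 / n) * (latents.T @ residual)
    grad_enc = (2.0 / n) * (x.T @ (residual @ w_dec.T))

    # Nudge the weights in the direction that lowers the loss.
    w_enc -= learning_rate * grad_enc
    w_dec -= learning_rate * grad_dec

initial_loss, final_loss = losses[0], losses[-1]
print(initial_loss, final_loss)
```

Each pass compares the reconstruction with the input and adjusts the weights slightly, so the “trial and error” is guided: the gradient indicates which way to change each weight to reduce the mismatch.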
The results of running the SPENDER model on an example spectrum of a galaxy in the Sloan Digital Sky Survey are given in Figure 3.
Visually, it looks like the model does quite well in reproducing the given spectrum. Figure 3 also shows one of the advantages of such a model. Not only is the model able to reproduce the different complexities of the spectrum, but by varying the resolution of the reconstructed spectrum, the model is able to distinguish features that overlap (or blend) in the input data (see the two nearby [OII] lines in Figure 3, for example). Ultimately, the way SPENDER is constructed means that data can be passed to the model as they are received from the instrument: because the model is trained on spectra without any redshift correction or cleanup, it learns to incorporate this processing into its analysis. Such an architecture can also be used to generate mock spectra, and it provides a new approach to detailed modeling of galaxy spectra that alleviates some of the problems that exist in current empirical galaxy modeling approaches.
Astrobite edited by Katy Proctor
Featured image credit: adapted from paper and bizior (via FreeImages)
About Sahil Hegde
I’m a first-year astrophysics PhD student at UCLA. I currently use semi-analytical models to study the formation of the first stars and galaxies in the universe. I completed my undergraduate studies at Columbia University and am originally from the San Francisco Bay Area. Outside of astronomy, you’ll find me playing tennis, surfing (read: wiping out), and playing board games/TTRPGs!