Publications

Photometric redshifts from SDSS images using a convolutional neural network


We developed a deep convolutional neural network (CNN), used as a classifier, to estimate photometric redshifts and associated probability distribution functions (PDFs) for galaxies in the Main Galaxy Sample of the Sloan Digital Sky Survey at z < 0.4. Our method exploits all the information present in the images without any feature extraction. The input data consist of 64 × 64 pixel ugriz images centered on the spectroscopic targets, plus the galactic reddening value along the line of sight. For training sets of 100k objects or more (≥20% of the database), we reach a dispersion σMAD < 0.01, significantly lower than the current best one obtained from another machine learning technique on the same sample. The bias is lower than 10⁻⁴, independent of the photometric redshift. The PDFs are shown to have very good predictive power. We also find that the CNN redshifts are unbiased with respect to galaxy inclination, and that σMAD decreases with the signal-to-noise ratio (S/N), achieving values below 0.007 for S/N > 100, as in the deep stacked region of Stripe 82. We argue that for most galaxies the precision is limited by the S/N of the SDSS images rather than by the method. The success of this experiment at low redshift opens promising perspectives for upcoming surveys.
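The dispersion quoted above, σMAD, is the normalized median absolute deviation of the redshift residuals. As an illustration only (the function name and toy values below are ours, not from the paper), it can be computed as:

```python
import numpy as np

def sigma_mad(z_spec, z_phot):
    """Normalized median absolute deviation of photo-z residuals:
    sigma_MAD = 1.4826 * median(|dz - median(dz)|),
    with dz = (z_phot - z_spec) / (1 + z_spec)."""
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))

# Toy example: residuals of ±0.01 around zero give sigma_MAD ≈ 0.0148.
print(sigma_mad([0.0, 0.0, 0.0], [0.01, 0.0, -0.01]))
```

The factor 1.4826 rescales the MAD to match the standard deviation of a Gaussian, which makes the statistic robust to the small outlier fraction typical of photo-z samples.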


Publication: Astronomy & Astrophysics, Volume 621, id.A26, 15 pp.
Pub Date: January 2019
DOI: 10.1051/0004-6361/201833617
arXiv: arXiv:1806.06607
Bibcode: 2019A&A...621A..26P
Keywords: galaxies: distances and redshifts; surveys; methods: data analysis; techniques: image processing; Astrophysics – Instrumentation and Methods for Astrophysics
E-Print Comments: Submitted to A&A, comments welcome; model and example available at https://github.com/jpasquet/Photoz; A&A 621, A26 (2019); doi:10.1051/0004-6361/201833617

A deep learning approach to observational cosmology with Supernovae


Abstract


Future large surveys like the Large Synoptic Survey Telescope (LSST) aim to increase the precision and accuracy of observational cosmology. In particular, LSST will observe a large quantity of well-sampled type Ia supernovae (SNIa) that will be one of the major probes of dark energy. However, the spectroscopic follow-up for the identification of SNe and the redshift estimation of their host galaxies will be limited. Therefore new automatic classification and regression methods that exploit only the photometric information become indispensable. We have developed two separate deep convolutional architectures to classify SN light curves and estimate photometric redshifts. PELICAN (deeP architecturE for the LIght Curve ANalysis) is designed to characterize and classify supernovae from multi-band light curves only. Despite using a small and non-representative spectroscopic training dataset (2,000 LSST simulated light curves), PELICAN is able to detect 85% of SNIa with a precision higher than 98%. The second convolutional neural network (CNN) was developed to estimate galaxy photometric redshifts and associated probability distribution functions. We tested it on the Main Galaxy Sample of the Sloan Digital Sky Survey (DR12). The input consisted of 64 × 64 ugriz images and the CNN was trained on 80% of the sample. We obtained a standard deviation σ(Δz) of 0.0091, with Δz = (zspec − zphot)/(1 + zspec), and an outlier fraction of 0.3%. This is a significant improvement over the current state-of-the-art value (σ ~ 0.0120, Beck et al. 2016). Using SNIa candidates that were well classified by PELICAN and whose host galaxy photometric redshifts were estimated by the CNN, we are able to construct a Hubble diagram from photometric information only. The bias introduced by the methods compared to a spectroscopic analysis will be presented.
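The two headline numbers here, σ(Δz) and the outlier fraction, follow directly from the residual definition given above. A minimal sketch (the outlier threshold of 0.05 is our illustrative choice; the abstract does not specify one):

```python
import numpy as np

def photoz_metrics(z_spec, z_phot, outlier_cut=0.05):
    """Residuals dz = (z_spec - z_phot) / (1 + z_spec); returns their
    standard deviation and the fraction with |dz| above outlier_cut."""
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)
    dz = (z_spec - z_phot) / (1.0 + z_spec)
    return float(dz.std()), float(np.mean(np.abs(dz) > outlier_cut))

# Perfect predictions give zero scatter and no outliers.
print(photoz_metrics([0.1, 0.2, 0.3], [0.1, 0.2, 0.3]))  # (0.0, 0.0)
```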


Publication: American Astronomical Society, AAS Meeting #233, id.225.07
Pub Date: January 2019
Bibcode: 2019AAS...23322507P

PhotoWeb redshift: boosting photometric redshift accuracy with large spectroscopic surveys


Abstract


Improving distance measurements in large imaging surveys is a major challenge to better reveal the distribution of galaxies on large scales and to link galaxy properties with their environments. As recently shown, photometric redshifts can be efficiently combined with the cosmic web extracted from overlapping spectroscopic surveys to improve their accuracy. In this paper we apply a similar method using a new generation of photometric redshifts based on a convolutional neural network (CNN). The CNN is trained on the SDSS images of the main galaxy sample (SDSS-MGS, r ≤ 17.8) and on the GAMA spectroscopic redshifts up to r ∼ 19.8. The mapping of the cosmic web is obtained with 680 000 spectroscopic redshifts from the MGS and BOSS surveys. The redshift probability distribution functions (PDFs), which are well calibrated (unbiased and narrow, ≤120 Mpc), intercept a few cosmic web structures along the line of sight. Combining these PDFs with the density field distribution provides new photometric redshifts, zweb, whose accuracy is improved by a factor of two (i.e., σ ∼ 0.004(1 + z)) for galaxies with r ≤ 17.8. For half of them, the distance accuracy is better than 10 cMpc. The narrower the original PDF, the larger the boost in accuracy. No gain is observed for original PDFs wider than 0.03. The final zweb PDFs also appear well calibrated. The method performs slightly better for passive galaxies than for star-forming ones, and for galaxies in massive groups, since these populations better trace the underlying large-scale structure. Reducing the spectroscopic sampling by a factor of 8 still improves the photometric redshift accuracy by 25%. Finally, extending the method to galaxies fainter than the MGS limit still improves the redshift estimates for 70% of the galaxies, with a gain in accuracy of 20% at low z, where the resolution of the cosmic web is the highest.
As two competing factors contribute to the performance of the method, the photometric redshift accuracy and the resolution of the cosmic web, the benefit of combining cosmological imaging surveys with spectroscopic surveys at higher redshift remains to be evaluated.
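The combination of a redshift PDF with the density field traced by the spectroscopic surveys can be pictured as a Bayesian re-weighting along the line of sight. The following is only a toy sketch of that idea (the function and the toy density field are ours, not the paper's pipeline), assuming a uniform redshift grid:

```python
import numpy as np

def zweb_posterior(z_grid, pdf, density):
    """Re-weight a photometric redshift PDF by the density field sampled
    along the line of sight, then renormalize on the (uniform) grid."""
    post = pdf * density
    step = z_grid[1] - z_grid[0]  # assumes uniform spacing
    return post / (post.sum() * step)

# Toy example: a Gaussian photo-z PDF at z = 0.20 and a cosmic web
# overdensity at z = 0.21 pull the posterior peak toward the structure.
z = np.linspace(0.0, 0.4, 401)
pdf = np.exp(-0.5 * ((z - 0.20) / 0.02) ** 2)
pdf /= pdf.sum() * (z[1] - z[0])
density = 1.0 + 5.0 * np.exp(-0.5 * ((z - 0.21) / 0.005) ** 2)
post = zweb_posterior(z, pdf, density)
print(z[np.argmax(post)])
```

This also illustrates why the narrowest original PDFs gain the most: a wide PDF overlaps many structures and the re-weighting cannot single one out.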


Publication: Astronomy & Astrophysics, Volume 636, id.A90, 13 pp.
Pub Date: April 2020
DOI: 10.1051/0004-6361/201937382
arXiv: arXiv:2003.10766
Bibcode: 2020A&A...636A..90S
Keywords: galaxies: distances and redshifts; Astrophysics – Astrophysics of Galaxies
E-Print Comments: A&A 636, A90 (2020); doi:10.1051/0004-6361/201937382

PELICAN: deeP architecturE for the LIght Curve ANalysis


Abstract


We developed a deeP architecturE for the LIght Curve ANalysis (PELICAN) for the characterization and classification of supernova light curves. It takes light curves as input, without any additional features. PELICAN can deal with the sparsity and irregular sampling of light curves. It is designed to overcome the problem of non-representativeness between the training and test databases that arises from the limitations of spectroscopic follow-up. We applied our methodology to different supernova light curve databases. First, we tested PELICAN on the Supernova Photometric Classification Challenge, for which we obtained the best performance ever achieved with a non-representative training database, reaching an accuracy of 0.811. Then we tested PELICAN on simulated light curves of the LSST Deep Fields, for which PELICAN is able to detect 87.4% of type Ia supernovae with a precision higher than 98%, using a non-representative training database of 2k light curves. PELICAN can be trained on light curves of the LSST Deep Fields to classify light curves of the LSST main survey, which have a lower sampling rate and are noisier. In this scenario, it reaches an accuracy of 96.5% with a training database of 2k light curves from the Deep Fields. This constitutes a pivotal result, as type Ia supernova candidates from the main survey might then be used to increase the statistics without additional spectroscopic follow-up. Finally, we tested PELICAN on real data from the Sloan Digital Sky Survey. PELICAN reaches an accuracy of 86.8% with a training database composed of simulated data and a fraction of 10% of real data. The ability of PELICAN to deal with the different causes of non-representativeness between the training and test databases, and its robustness against survey properties and observational conditions, place it at the forefront of light curve classification tools for the LSST era.
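The two figures of merit used throughout, the detected fraction of SNIa and the precision of the selected sample, are the usual recall and precision of a binary classifier. A minimal sketch (function name and toy data are ours):

```python
import numpy as np

def recall_precision(labels, scores, threshold=0.5):
    """labels: 1 for a true SNIa, 0 otherwise; scores: classifier
    probabilities. Returns (recall, precision) at the given threshold."""
    labels = np.asarray(labels, dtype=bool)
    pred = np.asarray(scores, dtype=float) >= threshold
    tp = np.sum(pred & labels)   # true positives
    recall = tp / labels.sum()   # fraction of SNIa detected
    precision = tp / pred.sum()  # purity of the selected sample
    return float(recall), float(precision)

# Toy example: two SNIa and two contaminants.
print(recall_precision([1, 1, 0, 0], [0.9, 0.4, 0.8, 0.1]))  # (0.5, 0.5)
```

In practice the threshold is tuned on a validation set: raising it trades recall for precision, which is how a sample purity above 98% is obtained at the cost of missing some SNIa.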


Publication: Astronomy & Astrophysics, Volume 627, id.A21, 15 pp.
Pub Date: July 2019
DOI: 10.1051/0004-6361/201834473
arXiv: arXiv:1901.01298
Bibcode: 2019A&A…627A..21P
Keywords: methods: data analysis; techniques: photometric; supernovae: general; Astrophysics – Instrumentation and Methods for Astrophysics
E-Print Comments: doi:10.1051/0004-6361/201834473