Future data products

Science-ready data products will include calibrated images and catalogs of detected sources, made available on Prompt (daily) and Data Release (annual) timescales.

The descriptions on this page refer to the future data products that will be produced after the limited data releases of Rubin's Early Science Program.

Learn more about the recent data releases.

The LSST data products, abridged (August 2022)

A brief, informal summary of the planned Rubin Observatory data products and analysis tools.

This information has been distilled from the Data Products Definition Document (DPDD; ls.st/dpdd) and the publication “LSST: From Science Drivers to Reference Design and Anticipated Data Products” (Ivezić et al. 2019), which remain the ultimate reference for detailed descriptions of the planned LSST data products and pipelines. As of October 2024 those two documents did not yet reflect the 80-hour timescale for Prompt-processed images.

The DPDD Abridged is available in several formats:

this webpage (scroll down)
the recorded video (below)
a DOI-citable Zenodo slide deck

Preparing a research grant funding proposal? Use the citable DOI as the reference for what data products, tools, and services Rubin will (and won't) provide, and the timescales for data releases.

Introduction

The Rubin Observatory’s LSST Science Pipelines will create the general-use data products and analysis tools that will enable scientists to produce the expected science deliverables in the four science pillars: probing dark energy and dark matter, taking an inventory of the solar system, exploring the transient optical sky, and mapping the Milky Way. The science deliverables are described in the Rubin Science Requirements Document (SRD; ls.st/srd). The general-use data products and analysis tools developed by Rubin staff incorporate algorithms and software that have been designed, built, and validated by the global astronomical community, and represent a culmination of shared knowledge and expertise.

Producing the expected LSST science deliverables, and pushing into new scientific frontiers, will also require the development of specialized algorithms, data products, analysis tools, and cyberinfrastructure that go beyond what will be provided by Rubin Observatory. This considerable amount of work is best left to the specific expertise of the science community, and the independent LSST Science Collaborations are driving this development.

Transients, variables, and moving objects

Processing Pipelines: The data products for transients, variables, and moving objects are primarily produced by the Prompt Processing pipelines, which perform reduction, calibration, difference image analysis (DIA), source detection and measurement, and alert distribution within 60 seconds of image readout. Solar System Processing for moving objects takes place during the day. Images and catalogs that result from Prompt Processing are available after 80 hours, and are fully described in Section 3 of the DPDD. All DIA data products are re-generated during the annual Data Release Processing. Source detection and measurement on direct images (i.e., non-difference images) is done only during the annual Data Release Processing.
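The Prompt timescales quoted on this page can be sketched as a small lookup. This is only an illustration: the dictionary keys and the function name are hypothetical and are not part of any Rubin API, and the latencies are nominal values taken from this page (alerts within 60 seconds of readout, DIA and Solar System catalogs at 24 hours, Prompt-processed images at 80 hours).

```python
from datetime import datetime, timedelta, timezone

# Nominal latencies quoted on this page. The keys are illustrative labels,
# not Rubin identifiers.
PROMPT_LATENCY = {
    "alert": timedelta(seconds=60),       # upper bound: "within 60 seconds"
    "dia_catalog": timedelta(hours=24),   # DIA and Solar System tables
    "prompt_image": timedelta(hours=80),  # PVIs and difference images
}


def nominal_availability(readout: datetime, product: str) -> datetime:
    """Return the readout time shifted by the nominal latency of a product."""
    return readout + PROMPT_LATENCY[product]


# Hypothetical readout time, chosen only for the example.
readout = datetime(2026, 1, 1, 3, 0, 0, tzinfo=timezone.utc)
for product in PROMPT_LATENCY:
    print(product, nominal_availability(readout, product).isoformat())
```

For a readout at 03:00 UTC on 1 January, the 80-hour image latency lands at 11:00 UTC on 4 January.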

Alert Packets are ASCII files containing data for a single source detected in a difference image (DIASource); they include catalog data and small cutouts of the difference and template images.

Images

processed visit images (PVIs or “direct images”; 80 hours and annually)
difference images (template-subtracted; 80 hours and annually)
template images (transient-free annual stacks; annually)

Catalogs

sources detected with SNR>5 via difference image analysis (DIA), and associated forced photometry: the DIASource, DIAObject, and DIAForcedSource tables (24 hours and annually)
DIASources linked as moving objects in the solar system (SS) and their orbital parameters: the SSSource, SSObject, and MPCORB tables (24 hours and annually)
sources detected with SNR>5 in PVIs, and associated forced photometry: the Source, Object, and ForcedSource tables (annually)
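As a rough illustration of the SNR>5 criterion in the catalog descriptions above, the sketch below takes the signal-to-noise ratio to be a flux divided by its uncertainty and keeps only measurements above the threshold. The record type and its field names are hypothetical, not LSST column names, and the sketch only considers positive fluxes; the actual detection algorithms are described in the DPDD.

```python
from dataclasses import dataclass

SNR_THRESHOLD = 5.0  # detection threshold quoted in the catalog descriptions


@dataclass
class Measurement:
    """Hypothetical record; field names are illustrative, not LSST columns."""
    source_id: int
    flux: float
    flux_err: float


def snr(m: Measurement) -> float:
    """Signal-to-noise ratio, taken here as flux over its uncertainty."""
    if m.flux_err <= 0:
        raise ValueError("flux_err must be positive")
    return m.flux / m.flux_err


def detected(measurements: list[Measurement]) -> list[Measurement]:
    """Keep only measurements whose SNR exceeds the threshold."""
    return [m for m in measurements if snr(m) > SNR_THRESHOLD]


# Made-up values, chosen only to exercise the threshold.
sample = [
    Measurement(1, flux=120.0, flux_err=10.0),  # SNR 12.0, kept
    Measurement(2, flux=30.0, flux_err=10.0),   # SNR 3.0, dropped
    Measurement(3, flux=55.0, flux_err=10.0),   # SNR 5.5, kept
]
kept_ids = [m.source_id for m in detected(sample)]
print(kept_ids)  # [1, 3]
```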

Catalog contents include:

unique identifiers (IDs)
measurements (e.g., coords, flux, mag, date/time, shape, size, PSF fit, proper motion, parallax)
IDs of nearby LSST static-sky catalog objects (i.e., host association)
orbital parameters derived by the Minor Planet Center (MPC)
time variability parameters (limited; to be determined with community input; ls.st/dmtn-118)
pre-discovery ("precovery") PSF photometry in difference images

Examples of additional specialized algorithms, data products, and analysis tools that will be left to the expertise of the science community include, but are not limited to:

photometric and spectroscopic follow-up observations
object classifications (e.g., light-curve types, astronomical categorization)
cyberinfrastructure for the large-scale acquisition, processing, and analysis of follow-up
cross-matching to non-LSST catalogs
host-galaxy confirmation (e.g., distinguishing faint or blended hosts)
orbital and/or time-variability parameters beyond what is in the LSST tables
light-curve parameters (e.g., rise/fall times, peak brightness, asteroid rotation rates)
shifted-and-stacked images (e.g., to detect faint moving objects)
multi-night stacks or difference images (e.g., to detect fainter objects)
physical parameters (e.g., redshift, distance, host extinction, composition, intrinsic magnitude)
event occurrence rates (e.g., volumetric rates)

Static-sky objects (stars and galaxies)

Processing Pipelines: The data products for static-sky objects (stars and galaxies) are primarily produced by the Data Release Processing pipelines, which reduce, calibrate, and combine (i.e., stack, coadd) all LSST images, and detect, measure, and characterize sources in both direct and “deep coadded” images. Images and catalogs that result from Data Release Processing will be available annually, and are fully described in Section 4 of the DPDD.

Images

processed visit images (PVIs or “direct images”; 80 hours and annually)
deep CoAdds (stack of all LSST images; one per filter; annually)

Catalogs

sources detected with SNR>5 in PVIs, and associated forced photometry: the Source, Object, and ForcedSource tables (annually)
forced photometry in PVIs at the location of all Objects: the ForcedSource tables (annually)

Catalog contents include:

unique identifiers (IDs)
measurements (e.g., flux, mag, color, date/time, shape, size, PSF fit, proper motion, parallax)
centroids and adaptive moments
Petrosian and Kron fluxes
deblending parameters (e.g., parent/child associations; priors for crowded fields)
model fits (e.g., point-source, bulge-disk)
aperture surface brightness measurements
photometric redshift (PZ) estimates (community-vetted algorithm TBD; see ls.st/dmtn-049)
local shear estimation measures

Examples of additional specialized algorithms, data products, and analysis tools that will be left to the expertise of the science community include, but are not limited to:

alternative types of deeply stacked coadded images (e.g., intermediate timescales, multi-band, best-seeing)
specialized deblending algorithms (e.g., for crowded fields)
probabilistic photometry catalogs (e.g., for crowded fields)
stellar types or physical parameters (e.g., metallicity)
Milky Way component associations (e.g., disk/bulge/halo stars)
specialized low-surface brightness measurements
galaxy PZ or physical parameters (e.g., star formation rates) beyond those from the adopted PZ algorithm
galaxy shear estimates beyond those provided by the adopted shear algorithm
other galaxy characterization (e.g., AGN, interacting galaxies, group or cluster membership, morphological classifications)
cyberinfrastructure to support large-scale compute-intensive processing (e.g., wide-area joint pixel analyses with non-LSST data sets or image reprocessing, cosmological simulations)

Computational resources and user-generated data products

In order for scientists to access and analyze the LSST data, Rubin Observatory provides the Rubin Science Platform (RSP). The RSP is a set of integrated web-based applications and services running at the Rubin Observatory Data Access Centers (DACs), which includes tools to query, visualize, subset, and analyze the full LSST data archives in a stable software environment located “next-to-the-data,” with storage space and compute resources for user-generated data products.

User-generated data products refers collectively to the specialized data products that will be generated by the science community. These will be created and stored using suitable Application Programming Interfaces (APIs) that are provided as part of the RSP. Users and groups are able to maintain access control over the data products they create, enabling them to have limited distribution or to be shared with the entire Rubin Observatory community.

As defined in the Science Requirements Document (SRD; ls.st/srd), the Rubin Data Management System provides at least 10% of its total capacity for user processing and storage. Scientists are able to pool their compute resource quotas in order to undertake larger processing jobs. If the compute resources provided by Rubin Observatory are oversubscribed, a “Resource Allocation Committee” will be established.
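The arithmetic behind the 10% floor and quota pooling can be sketched as follows. All numbers, units, user names, and function names here are hypothetical and chosen only for illustration; this page does not specify how quotas are sized or how pooling is administered.

```python
USER_FRACTION_MIN = 0.10  # "at least 10%" of total capacity, from the text


def min_user_capacity(total_capacity: float) -> float:
    """Smallest capacity reserved for user processing and storage."""
    return USER_FRACTION_MIN * total_capacity


def pooled_quota(quotas: dict[str, float]) -> float:
    """Combined quota when a group of scientists pool their allocations."""
    return sum(quotas.values())


def is_oversubscribed(requested: float, user_capacity: float) -> bool:
    """True when the requested resources exceed the user capacity."""
    return requested > user_capacity


# Hypothetical figures in arbitrary units (e.g., core-hours).
total = 1000.0
user_capacity = min_user_capacity(total)
group = {"alice": 20.0, "bob": 35.0, "carol": 15.0}
pooled = pooled_quota(group)

print(f"user capacity floor: {user_capacity:.1f}")
print(f"pooled group quota: {pooled:.1f}")
print("oversubscribed:", is_oversubscribed(pooled, user_capacity))
```

In this made-up case the pooled quota of 70 units fits under the 100-unit floor, so the group could run a job larger than any single member's quota without the system being oversubscribed.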

Because the LSST data set is unprecedentedly large, it is anticipated that some of the additional specialized algorithms, data products, and analysis tools left to the expertise of the science community will require significant external cyberinfrastructure support in addition to the RSP. A few examples include: processing and analyzing follow-up observations for LSST time-domain events; running wide-area joint pixel analyses with non-LSST data sets; building and using frameworks for probabilistic catalogs; iteratively developing and training machine learning algorithms; and many other applications in the big-data era of the LSST.

Questions?

Everyone is welcome to ask about data products in the "Support - Data Products" category of the Rubin Community Forum.

Rubin Community Forum

Ask questions, get help, report bugs or errors, and join in discussions about Rubin Observatory and its data products, pipelines, and services.
