DR7: Data Products

DASCH DR7 offers the following data products:

Lightcurves

The primary DASCH DR7 data product is a catalog of astrophysical lightcurves consisting of ~25 billion visible-band magnitudes of ~250 million sources, to depths of 14th-16th magnitude. Typical lightcurve RMSes are about 0.15 mag. Extensive quality metadata are included.

DASCH DR7 actually provides two such lightcurve catalogs. The preferred catalog is referenced to the AAVSO Photometric All-Sky Survey (APASS), data release 8. This catalog is photometrically calibrated to the Johnson-Morgan B-band magnitude system, whose definition traces back to the very plates scanned by DASCH. This results in the most reliable data for long-term analysis. APASS should be preferred by nearly all DR7 users.

However, DR7 also provides a catalog referenced to the ATLAS All-Sky Stellar Reference Catalog (ATLAS-refcat2). This catalog is photometrically calibrated to the SDSS g-band system, which can result in discontinuities and false long-term trends in the lightcurves. The benefit of ATLAS is that its astrometry, which is based on Gaia DR2, is better than that of APASS; this may be valuable in some use cases.

The recommended way to obtain DR7 lightcurves is via the daschlab Python package (Session.lightcurve() method). A cloud-powered quicklook notebook allows you to examine lightcurve data without needing to install any software locally. (Be warned: this is hosted using a free service, and so its speed and reliability can be spotty.) Lightcurves may also be retrieved using the underlying web API endpoint (POST /dasch/dr7/lightcurve). As of December 2025, the classic “Cannon” data access portal is still available, but its use is strongly discouraged.
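As a minimal sketch of the daschlab route, the helper below fetches the APASS-referenced lightcurve of the catalog source nearest a named target. Only Session.lightcurve() is named in this document. The open_session(), select_target(), and select_refcat() calls, their arguments, and the convention that index 0 refers to the catalog source closest to the target are assumptions based on the daschlab tutorials; check them against the version you have installed. The session directory name is a placeholder.

```python
def fetch_nearest_lightcurve(target_name, session_dir="dasch_session"):
    """Return the DR7 lightcurve of the refcat source nearest `target_name`.

    Requires the daschlab package and network access when called.
    """
    # Imported lazily so that defining this helper does not require daschlab.
    import daschlab

    sess = daschlab.open_session(root=session_dir)  # assumed entry point
    sess.select_target(name=target_name)            # assumed: resolves the name to coordinates
    sess.select_refcat("apass")                     # APASS is the recommended reference catalog
    # Assumed convention: integer 0 means the catalog source closest to the target.
    return sess.lightcurve(0)
```

Calling the helper with a resolvable object name should return a lightcurve table whose columns are described in Lightcurve Data Table Columns. Passing "atlas" instead of "apass" to select_refcat() would presumably select the ATLAS-referenced catalog.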

See Lightcurve Data Table Columns for information about the detailed contents of DR7 lightcurve data tables.

The entire DR7 APASS lightcurve database is about 7.2 TiB of data, while the ATLAS database is about 8.6 TiB. The underlying files, which use a custom binary format, are indexed in the digital inventory (see releases/dr7/photfiles/). For bulk access, inquire on the DASCH Astrophysics email list. If you want to analyze these files at scale, you will almost surely want to obtain the Source Catalog Databases as well.

Deep in the DASCH archive (pipeline/photfiles/) there are also historical photometric databases based on the following reference catalogs: gsc2.3.2, kepler, and gaia.

Cutouts / Postage Stamps

Plate cutout images around any sky location are available. These can be obtained using the daschlab analysis toolkit (Session.cutout() method) or the underlying web API (POST /dasch/dr7/cutout). Once again, a cloud-powered quicklook notebook allows you to obtain cutouts without needing to install any software locally.

Note that cutouts must technically be referenced to exposures rather than plates. Some plates were exposed multiple times at different sky positions. Therefore the same pixel of a plate image may map to multiple different RA/Dec coordinates. By the same token, if a plate has multiple overlapping exposures, one set of coordinates might validly map to multiple different pixel locations on the same image!
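The toy model below illustrates why a cutout has to name an exposure and not just a plate. It is a sketch only: the ToyExposure class, the pointings, and the flat linear pixel-to-sky mapping are all hypothetical, and real DASCH solutions are WCS transformations with distortion terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyExposure:
    """One exposure on a plate, with a deliberately simplified linear mapping."""
    number: int
    ra0_deg: float            # sky position mapped to pixel (0, 0)
    dec0_deg: float
    scale_deg_per_pix: float

    def pixel_to_sky(self, x, y):
        # Flat linear approximation; ignores projection and cos(Dec) effects.
        return (self.ra0_deg + x * self.scale_deg_per_pix,
                self.dec0_deg + y * self.scale_deg_per_pix)


# Hypothetical plate exposed twice at two different pointings.
exposures = [
    ToyExposure(number=0, ra0_deg=10.0, dec0_deg=20.0, scale_deg_per_pix=0.001),
    ToyExposure(number=1, ra0_deg=45.0, dec0_deg=-5.0, scale_deg_per_pix=0.001),
]

# The same plate pixel lands on a different sky position for each exposure.
pixel = (1000, 2000)
sky_positions = {e.number: e.pixel_to_sky(*pixel) for e in exposures}
for number, (ra, dec) in sky_positions.items():
    print(f"exposure {number}: pixel {pixel} -> RA {ra:.3f}, Dec {dec:.3f}")
```

Because each exposure carries its own solution, a request keyed only by plate and pixel would be ambiguous, which is why the interfaces are organized around exposures.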

Plate-Level Photometry

You can also obtain all of the photometric measurements from a specific exposure of a specific plate around a sky location of interest. Data can be retrieved using daschlab (Session.extract() method) or the underlying web API (POST /dasch/dr7/platephot). At the low level, these interfaces merely provide another way to access the photometric databases underlying the lightcurve catalogs.

Calibrated Plate Mosaics

Full FITS images of entire plates, with WCS metadata including plate distortions, can be obtained using daschlab (Session.mosaic() method). These files are called “mosaics”. (They do not combine multiple images of the sky, but they do combine multiple “tiles” produced by the DASCH scanner camera.) Typical full-size mosaics are 750 MiB each; the largest are around 2.2 GiB. Downsampled mosaics, binned in 16×16 pixel blocks, are also available.
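To give a feel for what 16×16 binning implies, the sketch below bins a small image in pure Python and estimates the size of a binned mosaic. It assumes that binning averages each block (the pipeline might sum or use another statistic) and that the binned file keeps the same number of bytes per pixel with no compression or header overhead, so treat the size as an order-of-magnitude figure.

```python
def bin_image(pixels, factor):
    """Average non-overlapping factor x factor blocks of a 2-D list of numbers.

    Trailing rows and columns that do not fill a complete block are dropped.
    """
    n_rows = len(pixels) // factor
    n_cols = len(pixels[0]) // factor if pixels else 0
    binned = []
    for by in range(n_rows):
        row = []
        for bx in range(n_cols):
            block = [
                pixels[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        binned.append(row)
    return binned


def binned_size_mib(full_size_mib, factor=16):
    """Rough binned size, assuming equal bytes per pixel and no compression."""
    return full_size_mib / factor**2


# 4x4 test image binned 2x2, then the size estimate for a typical mosaic.
small = bin_image([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], 2)
print(small)
print(round(binned_size_mib(750), 2), "MiB")
```

Under these assumptions, a typical 750 MiB mosaic shrinks by a factor of 256, to roughly 3 MiB, which is why the binned files are convenient for quick inspection.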

The daschlab software creates “value-added” mosaics that incorporate a variety of metadata available from other DR7 data sources. See Value-Added Mosaic FITS File Contents for reference information about the structure and contents of these files. The underlying web APIs return “base” mosaic FITS files that are not as standardized and have fewer metadata.

DR7 contains mosaics for 429,274 plates. The complete corpus of base mosaics held in the DASCH archive (full-resolution and 16×16 binned) is indexed in the digital inventory (see plates/) and adds up to about 200 TiB of data. For bulk access, inquire on the DASCH Astrophysics email list.

Photometric Calibration Metadata

For each mosaic that was successfully photometrically calibrated against one or both photometric reference catalogs, the calibration metadata are archived in an ASDF-format file.

These files can be retrieved through daschlab (ExposureRow.photcal_asdf_url() method) or the underlying web API (GET /dasch/dr7/asset/photcal_asdf/{hexid}). See Photometric Calibration ASDF File Contents for documentation of their structure and contents.

The complete corpus of DR7 photometric calibration files (against both the ATLAS and APASS reference catalogs) is indexed in the digital inventory (see pipeline/photcal/) and totals about 10.8 TiB of data. For bulk access, inquire on the DASCH Astrophysics email list.

Astrometric Calibration Metadata

For each mosaic that had a successful astrometric calibration, the metadata are also archived. DR7 includes 414,755 astrometric calibration solutions.

To obtain the astrometric solution for an individual plate, it may be simplest to obtain its value-added mosaic as described above. To obtain the entire collection of solutions, obtain the plate database, which includes this information.

The DASCH archive includes small amounts of additional metadata regarding the astrometric calibration process (see databases/astrometry_results in the digital inventory). For access, inquire on the DASCH Astrophysics email list.

Plate Database

Information about the plates in the HCO Plate Stacks collection, both scanned and unscanned, is gathered into a searchable database. This database includes information about exposures as well as plates, including the best-available astrometric calibrations.

For scientific users, the primary means of accessing this information is through daschlab using a positional exposure search (Session.exposures() method). The underlying web API is POST /dasch/dr7/queryexps. Information about plates can also be searched using the Starglass website or less-technical Starglass web APIs.

The DR7 plate database is archived as a snapshot export of the AWS DynamoDB table that serves the DR7 web APIs, using the DynamoDB JSON format. This export is about 1 GiB in size, and can either be restored into your own DynamoDB instance or analyzed offline. For access, inquire on the DASCH Astrophysics email list. The detailed structure of the database contents is not yet documented.
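For offline analysis, DynamoDB JSON has to be unwrapped, because every attribute value is tagged with its type (for example {"S": "text"} or {"N": "42"}). The sketch below decodes such records using only the standard library. It assumes the export follows the usual DynamoDB export layout of one {"Item": {...}} record per line; the attribute names in the sample record are hypothetical, since the real structure is not documented here.

```python
import base64
import json
from decimal import Decimal


def decode_attr(attr):
    """Convert one DynamoDB-JSON attribute value into a plain Python object."""
    (type_tag, value), = attr.items()  # each attribute carries exactly one type tag
    if type_tag == "S":
        return value
    if type_tag == "N":
        return Decimal(value)          # numbers are serialized as strings
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "B":
        return base64.b64decode(value)
    if type_tag == "L":
        return [decode_attr(v) for v in value]
    if type_tag == "M":
        return {k: decode_attr(v) for k, v in value.items()}
    if type_tag == "SS":
        return set(value)
    if type_tag == "NS":
        return {Decimal(v) for v in value}
    if type_tag == "BS":
        return {base64.b64decode(v) for v in value}
    raise ValueError(f"unknown DynamoDB type tag: {type_tag!r}")


def iter_items(lines):
    """Yield decoded items from newline-delimited {"Item": {...}} records."""
    for line in lines:
        line = line.strip()
        if line:
            record = json.loads(line)
            yield {k: decode_attr(v) for k, v in record["Item"].items()}


# Hypothetical record; the real DR7 attribute names may differ.
sample = ['{"Item": {"plate_id": {"S": "x00001"}, "n_exposures": {"N": "2"}, "scanned": {"BOOL": true}}}']
items = list(iter_items(sample))
print(items[0])
```

Numbers are decoded to Decimal to avoid silently losing precision; convert to int or float once you know which fields need it. In practice you would pass an open (possibly gzip-compressed) file object to iter_items() in place of the sample list.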

Metadata for un-scanned plates may be inaccurate due to errors in the historical logbooks.

Source Catalog Databases

The two “reference catalogs” underlying DASCH DR7, APASS and ATLAS, are also available as searchable databases. These catalogs can be queried through daschlab (Session.refcat() method) or the underlying web API (POST /dasch/dr7/querycat).

These databases are also archived as snapshot exports of two AWS DynamoDB tables using the DynamoDB JSON format. The APASS export is about 10.5 GiB in size, while the ATLAS export is about 31.5 GiB. These can be restored into your own DynamoDB instance(s) or analyzed offline. For access, inquire on the DASCH Astrophysics email list.

Both databases largely reproduce existing public data. However, they do include fields counting the number of detections of each source in DR7, as well as metadata supporting lookup of each source’s lightcurve in the DR7 photometric databases.

These databases are also archived in a binary format used by the DASCH pipeline (see pipeline/library/other_catalogs in the digital inventory). For some users, this format may be easier or faster to process than the DynamoDB JSON exports mentioned above, although the data volume is somewhat larger (17.5 GiB and 54.3 GiB). These forms of the catalogs do not include the summary information about DR7 detections.

Plate and Jacket Photos

Most plates were photographed prior to scanning. The photos cover both the paper jackets holding the plates, which sometimes have annotations of interest, and the plates themselves, which also sometimes have annotations. Prior to 2023, these annotations were erased before scanning, meaning that these photos are the only record of what was written on the plates. Individual photo files are typically 10 to 40 MiB in size.

Plate and jacket photos can be obtained using the Starglass website or the underlying web APIs (GET /plates/p/{plate_id}). Value-added mosaics include metadata about which photos are available for a given plate.

DASCH DR7 includes about 800,000 plate and jacket photos. These are indexed in the digital inventory (see plates/) and the total data volume is about 7.9 TiB. For bulk access, inquire on the DASCH Astrophysics email list.

Logbook Photos

Along with the glass plates, the Harvard Plate Stacks collection includes written logbooks summarizing the corresponding observations. These logbooks have been photographed in their entirety. Citizen-scientist volunteers transcribed limited forms of metadata from these photos, which were then incorporated into the plate database.

As of the DR7 release, the logbook photos can be navigated manually at http://dasch.rc.fas.harvard.edu/scans/.

DASCH DR7 includes about 166,000 photographs of logbooks as well as selected historical astronomer notebooks. These are indexed in the digital inventory (see logbooks/) and the total data volume is about 1.7 TiB. For access, inquire on the DASCH Astrophysics email list.

Raw Scan Data

Raw data and metadata associated with the plate scanning process are included in the DASCH archive. In principle, there should be no need to access these files as long as the corresponding calibrated full-plate FITS mosaic is available.

The raw data are indexed in the digital inventory (see scanning/ and tiles/) and the total data volume is about 417 TiB. For access, inquire on the DASCH Astrophysics email list.

Note that the raw tile data are stored using AWS S3’s Glacier Deep Archive system, which requires a 24-hour lead time for any data retrieval. The scanning calibration files, logs, and telemetry do not have this restriction.

Digital Inventory

The DASCH archive consists of more than 650 TiB of data underlying all of the products documented above, as well as a variety of other resources supporting and documenting all aspects of the DASCH project over multiple decades of history. This archive is indexed by the DASCH Digital Inventory, an exhaustive tabulation of the ~34 million files stored in the archive. If you are interested in duplicating some or all of the DASCH archive, the Inventory provides information about everything that is available.
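As a quick consistency check, the tally below adds up the volumes quoted above for the lightcurve databases, base mosaics, photometric calibration files, plate and jacket photos, logbook photos, and raw scan data. The dictionary keys are informal labels, and smaller products are left out, so the sum is a lower bound on the archive size.

```python
# Data volumes in TiB as quoted in the sections above.
volumes_tib = {
    "APASS lightcurve database": 7.2,
    "ATLAS lightcurve database": 8.6,
    "base mosaics": 200.0,
    "photometric calibration files": 10.8,
    "plate and jacket photos": 7.9,
    "logbook photos": 1.7,
    "raw scan data": 417.0,
}

total_tib = sum(volumes_tib.values())
raw_fraction = volumes_tib["raw scan data"] / total_tib

print(f"itemized total: {total_tib:.1f} TiB")
print(f"raw scan data share: {raw_fraction:.0%}")
```

The itemized products come to about 653 TiB, consistent with the "more than 650 TiB" figure, and the raw scan data alone account for nearly two thirds of it.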

While the actual DASCH archive itself is too large to duplicate on generalist services like Zenodo or Dataverse (for now), the inventory has been deposited to Zenodo and assigned the DOI 10.5281/zenodo.14563521. To retrieve the inventory, follow the DOI link. The inventory is about 1 GiB compressed and 10 GiB uncompressed.

Other Archived Data

The DASCH archive includes other data as well:

  • Derived data products needed to operate the DASCH data access services
  • Snapshots of all of the source code behind the DASCH software systems, from scanning to pipeline processing to data access services to end-user analysis
  • Logs relating to all modern DASCH pipeline processing, data management, and other operations tasks
  • All available internal project documentation
  • All other data files supporting DASCH operations

Consult the digital inventory for more information. For access, inquire on the DASCH Astrophysics email list.