WFCAM Data Handling

Raw Data Flow

The raw data generated by each of the 4 SDSU controllers are transmitted over a dedicated fiber-optic link to a PCI card in a data acquisition computer, one per controller. Each acquisition computer runs the UKIRT Data Handling System (DHS), which records the data, together with header information collected from various sources, in a standard UKIRT NDF HDS container. These files are written to a file system on a RAID5 volume local to the acquisition computer. Thus, there are 4 separate raw data disks for WFCAM, each located on a separate acquisition computer.

While observing is taking place, 4 other computers dedicated to data reduction read the raw data from the acquisition machines via NFS and reduce them using ORACDR running the so-called “summit pipeline”, producing data quality parameters, reduced images, and catalogues.

In the morning following a night of WFCAM observing, the raw data are written to LTO-1 tapes on drives located in the acquisition computers. Thus, 4 separate data tapes, one for each camera, are written in parallel.

The 4 Data Reduction computers also write copies of the raw data (now transformed into compressed FITS files) to LTO tapes. These tapes are mailed to CASU in the UK, where the data are processed and ingested into the WFCAM science archive.

Figure 1.1 – Raw data flow schematic

Note: The LTO tape systems in use operate most efficiently with a 64k block size. In Linux, this is achieved by running ‘mt setblk 0’ to configure the drive for variable block size, then using ‘tar -b128 …’ so that tar writes 64k records (128 blocks of 512 bytes each). The same parameters must be used when reading the tapes.
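
As a quick check of the arithmetic in the note above, the following Python sketch computes the tar blocking factor for a 64k record and assembles the corresponding command lines. The device path /dev/nst0, the directory name raw_data, and the function names are assumptions made for illustration and are not taken from the UKIRT setup. The commands are only built as lists here, not executed.

```python
TAR_UNIT_BYTES = 512  # tar's -b option counts 512-byte units


def blocking_factor(record_bytes: int) -> int:
    """Return the tar blocking factor for a record size given in bytes."""
    if record_bytes <= 0 or record_bytes % TAR_UNIT_BYTES != 0:
        raise ValueError("record size must be a positive multiple of 512 bytes")
    return record_bytes // TAR_UNIT_BYTES


def tape_commands(device: str, paths: list[str],
                  record_bytes: int = 64 * 1024) -> list[list[str]]:
    """Build (but do not run) the mt and tar commands for writing a tape.

    `device` is a hypothetical tape device path such as /dev/nst0.
    """
    factor = blocking_factor(record_bytes)
    return [
        # Put the drive into variable block size mode.
        ["mt", "-f", device, "setblk", "0"],
        # Write an archive using `factor` x 512-byte records.
        ["tar", f"-b{factor}", "-cf", device, *paths],
    ]


if __name__ == "__main__":
    for cmd in tape_commands("/dev/nst0", ["raw_data"]):
        print(" ".join(cmd))
```

When reading a tape back, the same ‘setblk 0’ and ‘-b128’ settings would apply, with tar extracting rather than creating the archive.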

Pipeline Processing

The WFCAM raw data are pipeline processed at UKIRT during observing. This gives the observing team near-real-time feedback on the quality of the data they are acquiring, and lets observers assess whether the current weather and conditions are suitable for the observations in progress or those they plan to execute. To do this, the summit pipeline carries out a reasonably complete data reduction sequence: flat fielding, microstep processing, and mosaicing, then object detection and catalog creation, and finally astrometric and photometric calibration.

An offline pipeline processes the raw WFCAM data shipped to CASU in the UK to generate reduced data for ingest into the WFCAM science archive. The offline pipeline has the advantage that it does not have to deal with data in the order it was observed; for example, calibration frames such as darks or flats can be built up from data spanning the entire night or even several nights.

It is expected that data will be dispatched to CASU within approximately 7 days of being taken. Allowing a further 7 days for transit and a further 7 days for processing and ingest, data should be at the science archive within 3 weeks of being observed.
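
The timeline above is simple addition; the short Python sketch below spells it out for a given observing date. The stage labels, the function name, and the example date are illustrative assumptions, and the 7-day figures are the approximate allowances quoted above, not guarantees.

```python
from datetime import date, timedelta

# Approximate allowances quoted in the text, in days.
STAGE_DAYS = {
    "dispatch to CASU": 7,
    "transit": 7,
    "processing and ingest": 7,
}


def expected_archive_date(observed: date) -> date:
    """Estimate the latest date by which data should reach the archive."""
    return observed + timedelta(days=sum(STAGE_DAYS.values()))


if __name__ == "__main__":
    night = date(2005, 4, 1)  # hypothetical observing date
    print(expected_archive_date(night))  # 21 days (3 weeks) after `night`
```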

Access to WFCAM Data

Note: this section deals with the practicalities of access to WFCAM data, not the politics. You must first have the right to access the data under the relevant policies before any of this applies. The UKIRT web pages have a UKIRT Infrared Deep Sky Survey Operations page, containing a UKIDSS survey data policies page, and a Non-UKIDSS WFCAM Operations page, which contains some notes on data access rights policies.

In general, WFCAM users (including UKIDSS, PATT, UH, Japanese, etc.) will obtain their WFCAM data from the UK’s WFCAM Science Archive (WSA). These will be reduced data processed by the offline (UK) pipeline. To gain access to the archive, WFCAM PIs must register with the WSA.

It is not expected that the majority of users will require access to raw data. Those that do may request it from CASU.

Reduced data products from the summit pipeline will be available to observers at UKIRT. These include calibrated catalogs and mosaiced images. In general, it will be possible for observers to take away small amounts of these data (e.g. catalogs and a few mosaics) if they wish. Likewise, small amounts can be supplied to remote PIs. Both of these are on a best-efforts basis, and will probably be limited to a few gigabytes in practice.

Some projects may have a valid case for expedited access to bulk data, where the ~6-week delay between observations being taken and data becoming available in the science archive presents a serious compromise to the project’s scientific potential. PIs who believe that this applies to their project should raise the issue with their support scientist well in advance.

Japanese and University of Hawaii WFCAM Runs

In addition, we recognize that retrieving bulk raw data from the UK may be difficult and slow for people located a long way from the UK (e.g. Hawaii). Strictly on a best-efforts basis, we may be able to write data at JAC to LTO-1 or LTO-2 tape or to USB 2 hard disks. These tapes or disks must be supplied by the observer. If you wish to take advantage of this, you must contact your support scientist well in advance of your data being taken, typically 2-3 weeks before the run. Last-minute requests will likely be unsuccessful. In general, there will be a delay of 1-2 weeks after your data are taken before they are available on your tapes or disks, so you will not be able to take the data away at the end of your run.

Access to Raw and/or Reduced Data at the Summit

Raw and reduced data can be accessed directly at the summit. The easiest way is to bring your laptop and transfer the data over the internal network. Once your laptop is connected (using a cable, not wireless), you will be able to use ssh and sftp to connect to the computer called lemi.ukirt.hawaii.edu.

Another possibility is to bring an external disk. There is only one computer into which you may plug an external disk: the weather monitor computer, currently located under the desk in the control room. Please ask your support astronomer or the TSS for instructions. We do NOT allow external disks to be plugged in anywhere else.