PDS_VERSION_ID          = PDS3
LABEL_REVISION_NOTE     = "
  2012-02-27 Cornell:BTCarcich  Updated to be consistent with
                                lien-resolved data products.
"
RECORD_TYPE             = STREAM
OBJECT                  = TEXT
  INTERCHANGE_FORMAT    = ASCII
  PUBLICATION_DATE      = 2012-02-27
  NOTE                  = "Contents of /DATA/ subdirectories in
                           Stardust-NExT CIDA Data Archive."
END_OBJECT              = TEXT
END

Stardust-NExT CIDA data
=======================

Contents
========

  Overview
  General EDF structure (layout of data within the files)
  Spectrum EDF structure
  Simple approach to reading a Spectrum EDF
  General approach to reading a Spectrum EDF
  HK EDF structure
  EDF directory layout and filename conventions
  /DATA/ directory and subdirectory layout


Overview
========

As described in /CATALOG/DATASET.CAT and elsewhere, there are two types
of Stardust-NExT Engineering Data Files (EDFs):

  Spectrum EDFs;
  Housekeeping (HK or ancillary) EDFs.

The Spectrum EDFs contain data from CIDA events; some are time-of-flight
mass spectra from particles that hit the CIDA target.  The HK EDFs
record the state of the CIDA instrument, but are not required to
interpret the spectra.  There are four sub-types of HK EDF.

The data in each EDF comprise a PDS label followed by one or more PDS
tables.  The table rows comprise comma-separated values (CSVs) in ASCII
format; the width of each field is constant for all rows in a table.
The PDS label describes each table in the file; the PDS label also
refers to ancillary files that further describe the data in each row of
a table.

Each Spectrum EDF represents a single event in time, and contains a
one-row CIDA header table and a one-row geometry table, followed by a
spectrum table for that event.

Each HK EDF covers one span of time: the entire mission, the cruise
phase, or the encounter phase.  Each row of an HK EDF table contains
values for a single point in time.  Each row comprises the data of the
CIDA synoptic header followed by the data apropos the HK EDF sub-type.

The /DATA/ subdirectory structure separates the EDFs by target and type.
The EDFs are distinguished by filename, which encodes the EDF time(s).
The HK EDFs are all in one subdirectory, and the HK EDF sub-types are
also distinguished in the filenames.


General EDF structure (layout of data within the files)
=======================================================

The PDS label at the start of each EDF describes in detail the position,
layout and content of the table(s) in that EDF.  The brief description
given here pertains only to CIDA EDFs and TABLEs.  Refer to
/DOCUMENT/ONLABELS.TXT for more details, and to the PDS Standards
Reference (available from http://pds.jpl.nasa.gov) for more general
details about PDS labels and PDS TABLEs.


Spectrum EDF structure
======================

There is one type of Spectrum EDF, so the layout of the sections of all
Spectrum EDF data is identical:

  a PDS label;
  a single-row CIDA header table;
  a single-row geometry table;
  an 8192-row event (spectrum) table.

Every row of the event table represents a TOF spectral (mass) bin.

Every table row is a sequence of printable non-control ASCII characters
terminated by a carriage return (CR, ASCII 13 decimal) and line feed
(LF, ASCII 10 decimal) character pair.

The columns in each table are described in the *.FMT files pointed to by
the PDS label.

The spectrum (CIDA_EVENT.FMT) table has thirteen columns:

  five columns of raw (EDR) spectral data (in counts) from the four
  channels of the instrument (see Note 1);

  five columns of calibrated (RDR) spectral data (in picoCoulombs, pC)
  from the same four channels (see Note 1);

  one column holding the 'best' of the high-sensitivity columns
  (see Note 2);

  one column with a saturation flag (see Note 2);

  one column of interpreted, estimated mass for the spectral line.

For example, the first ten columns look like this:

 38, 16, -1, -1, -1,    6.947,    0.000,-9999.999,-9999.999,-9999.999[...]
 -1, 16, -1, 15, -1,-9999.999,    0.000,-9999.999,    0.000,-9999.999[...]
  ^   ^   ^   ^   ^         ^         ^         ^         ^         ^
  |   |   |   |   |         |         |         |         |         |
  |   |   |   |   |         |         |         |         |         +--Low
  |   |   |   |   |         |         |         |         |            Delayed,
  |   |   |   |   |         |         |         |         |            pC
  |   |   |   |   |         |         |         |         |            (Note 1)
  |   |   |   |   |         |         |         |         |
  |   |   |   |   |         |         |         |         +--Low Straight,
  |   |   |   |   |         |         |         |            pC (Note 1)
  |   |   |   |   |         |         |         |
  |   |   |   |   |         |         |         +--High Delayed, pC
  |   |   |   |   |         |         |
  |   |   |   |   |         |         +--High Straight, pC
  |   |   |   |   |         |
  |   |   |   |   |         +--Target, pC (Note 1)
  |   |   |   |   |
  |   |   |   |   +--Low Delayed, counts (Note 1)
  |   |   |   |
  |   |   |   +--Low Straight, counts (Note 1)
  |   |   |
  |   |   +--High Delayed, counts
  |   |
  |   +--High Straight, counts
  |
  +--Target, counts, from Low Delayed or Straight (Note 1)

The final three columns look like this:

,    0.000,0, 0.023
,    0.000,0, 0.025
         ^ ^      ^
         | |      |
         | |      +--Estimated, interpreted TOF mass for the row, Da
         | |
         | +--Saturation flag for previous column (Note 2)
         |
         +--High Straight or High Delayed column, pC (Note 2)

Note 1:

The CIDA instrument multiplexes the Target signal onto one Low
Sensitivity Ion Detector channel.  For the first half of the multiplexed
spectrum (4096 samples), the measured signal is actually from the CIDA
target, i.e. where the particle impact occurs.  For the last half of the
spectrum the measured signal of that channel is from the ion detector.
All other channels measure signal from the ion detector only, for all
8192 samples.

In the EDF table, the multiplexed channel is separated into two columns.
The first half (4096 rows) of the Low Sensitivity column, which is
multiplexed, is set to MISSING_CONSTANT for the column (negative 1 (-1)
for the EDR data and -9999.999 for the RDR data); the second half of
that column contains the actual signal.  In the same table, the first
half of each Target column has the actual signal extracted from the
multiplexed data; the second half of the column is set to
MISSING_CONSTANT.

Note 2:

The eleventh column contains calibrated data from one of the
high-sensitivity COLUMNs.  It is a duplicate of either the High Straight
(pC) COLUMN if it is present (if raw COLUMN 2 is not -1), or the High
Delayed (pC) column.
The twelfth column is a flag indicating whether the data in the previous
column are saturated.  It contains either zero or one to indicate
unsaturated or saturated, respectively.

Because of the compression and data logging algorithm in the CIDA
firmware, not all columns have valid data in them; sometimes only the
Straight or only the Delayed data are stored and downlinked, and the
COLUMN for the other channel contains only MISSING_CONSTANT values.  The
eleventh COLUMN picks the 'best' of the two high-sensitivity columns and
will never have the MISSING_CONSTANT in it.


Simple approach to reading a Spectrum EDF
=========================================

Because of the CR/LF pair at the end of each row, one way to get the
rows for each table is to take the last 8194 lines from the file:

  the last 8192 lines are the spectrum mass lines;
  the 8193rd line from the end of the EDF is the geometry table;
  the 8194th line from the end of the EDF is the CIDA header table.

The comma-separated values (CSVs) in each table may be read into a
spreadsheet or database application.

It is also easy to read a spectrum programmatically.  E.g.
the following Python commands read the spectrum data into one list per
column, then use the Matplotlib module to plot the RDR (calibrated) Low
Delayed and RDR High Delayed signals against the bin number:

  import matplotlib.pyplot as plt

  # One list per COLUMN of the thirteen-column spectrum table
  chdata = [[] for _ in range(13)]

  # Read as bytes so the CR/LF row terminators are handled explicitly
  with open('T20110215043832C.TAB', 'rb') as f:
      rows = f.read().decode('ascii').splitlines()[-8192:]
  for row in rows:
      for i, v in enumerate(row.split(',')):
          chdata[i].append(float(v))

  bins = range(8192)
  plt.plot(bins, chdata[9], 'g')  # RDR Low Delayed, pC
  plt.plot(bins, chdata[7], 'b')  # RDR High Delayed, pC
  plt.show()


General approach to reading a Spectrum EDF
==========================================

The standard way to read PDS data is to use the information in the PDS
label as follows:

1) Determine the pointer to the start of each table
---------------------------------------------------

Search the PDS label at the start of the file for the following
keywords:

  ^HEADER_TABLE   = 2985
  ^GEOMETRY_TABLE = 3237
  ^EVENT_TABLE    = 4016

The number after the equals sign (=) is the one-based ordinal byte
position of, a.k.a. pointer to, the first byte of each table.  These
^HEADER_TABLE, ^GEOMETRY_TABLE and ^EVENT_TABLE pointers refer to the
CIDA header, geometry, and spectrum event tables, respectively.

2) Determine the extent (row width and number of rows) of each table
--------------------------------------------------------------------

Search the PDS label for the following keywords in each table:

  OBJECT       = EVENT_TABLE
    ^STRUCTURE = "CIDA_EVENT.FMT"
    ROW_BYTES  = 62
    ROWS       = 8192
    COLUMNS    = 9
  END_OBJECT   = EVENT_TABLE

Those keywords make up the TABLE OBJECT (N.B. not all OBJECT keywords
are shown in this example; the ROW_BYTES and COLUMNS values shown are
illustrative, and the label of the EDF being read gives the values to
use).  The keywords have the following meanings:

  OBJECT       Start of the OBJECT
  ROW_BYTES    Count of bytes in each row, including terminator
  ROWS         Count of rows in the table
  COLUMNS      Count of data columns (fields) in each row
  ^STRUCTURE   File pointer to a FMT file describing each data field
  END_OBJECT   End of the OBJECT

The FMT file referred to by the ^STRUCTURE pointer is in the same
directory as the EDF.
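
Steps 1 and 2 can be sketched in Python as shown here.  This is a
minimal sketch, not a general PDS reader: it assumes each pointer is a
bare integer byte position, as in the examples above, and the helper
names (find_pointer, table_extent, read_rows) and the tiny synthetic EDF
are hypothetical, introduced only so that the sketch runs without
archive data.

```python
import re


def find_pointer(label, table):
    """Return the one-based byte position given by '^<table> = N'."""
    pattern = r"^\s*\^" + re.escape(table) + r"\s*=\s*(\d+)\s*$"
    match = re.search(pattern, label, re.MULTILINE)
    if match is None:
        raise KeyError("no pointer for " + table)
    return int(match.group(1))


def table_extent(label, table):
    """Return (ROW_BYTES, ROWS) from the 'OBJECT = <table>' block."""
    name = re.escape(table)
    block = re.search(
        r"^\s*OBJECT\s*=\s*" + name + r"\s*$(.*?)"
        r"^\s*END_OBJECT\s*=\s*" + name + r"\s*$",
        label, re.MULTILINE | re.DOTALL)
    if block is None:
        raise KeyError("no OBJECT for " + table)
    body = block.group(1)
    row_bytes = re.search(r"^\s*ROW_BYTES\s*=\s*(\d+)", body, re.MULTILINE)
    rows = re.search(r"^\s*ROWS\s*=\s*(\d+)", body, re.MULTILINE)
    return int(row_bytes.group(1)), int(rows.group(1))


def read_rows(data, pointer, row_bytes, rows):
    """Slice fixed-width rows out of the file bytes; drop the CR/LF pair."""
    start = pointer - 1  # one-based pointer -> zero-based offset
    result = []
    for r in range(rows):
        raw = data[start + r * row_bytes:start + (r + 1) * row_bytes]
        result.append(raw.decode("ascii").rstrip("\r\n"))
    return result


# Tiny synthetic EDF (hypothetical values): a label, then two 9-byte rows.
template = (
    "^EVENT_TABLE = {ptr:5d}\r\n"
    "OBJECT       = EVENT_TABLE\r\n"
    "  ROW_BYTES  = 9\r\n"
    "  ROWS       = 2\r\n"
    "END_OBJECT   = EVENT_TABLE\r\n"
    "END\r\n"
)
pointer = len(template.format(ptr=0)) + 1  # table starts right after label
edf = (template.format(ptr=pointer) + " 38, 16\r\n -1, 15\r\n").encode("ascii")

label = edf.decode("ascii")  # a real reader would decode only the label
start = find_pointer(label, "EVENT_TABLE")
row_bytes, rows = table_extent(label, "EVENT_TABLE")
spectrum = [row.split(",") for row in read_rows(edf, start, row_bytes, rows)]
print(spectrum)
```

For an archive file, the bytes would instead come from something like
open('T20110215043832C.TAB', 'rb').read(); slicing by ROW_BYTES rather
than splitting on line ends keeps the reader tied to the label.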

3) Determine the location of each data field in a row
-----------------------------------------------------

Search the FMT file from the ^STRUCTURE keyword of the TABLE OBJECT for
COLUMN OBJECTs; the FMT file will look similar to this:

  OBJECT          = COLUMN
    COLUMN_NUMBER = 1
    NAME          = TIME
    DATA_TYPE     = CHARACTER
    START_BYTE    = 1
    BYTES         = 20
    DESCRIPTION   = "Event time"
    FORMAT        = "A20"
  END_OBJECT      = COLUMN
  OBJECT          = COLUMN
    COLUMN_NUMBER = 2
    NAME          = VERSION
    DATA_TYPE     = ASCII_INTEGER
    START_BYTE    = 22
    BYTES         = 3
    DESCRIPTION   = "Software version"
  END_OBJECT      = COLUMN

N.B. The first two COLUMN OBJECTs of the CIDA_HEADER.FMT file are shown
     as an example; select keywords were removed for brevity.

Each of these COLUMN OBJECTs describes the layout of one or more data
fields in a table row.  The keywords have the following meanings:

  OBJECT         Start of the OBJECT
  COLUMN_NUMBER  One-based ordinal position of this field in the row
  NAME           Name of the field
  DATA_TYPE      Data type of the field (ASCII_REAL => floating-point)
  START_BYTE     Ordinal byte position of start of the field
  BYTES          Length of the field, bytes
  DESCRIPTION    Description of the field
  FORMAT         Recommended display format; FORTRAN FORMAT dialect
  END_OBJECT     End of the OBJECT

The first 25 bytes, covering the first two COLUMNs of the corresponding
table data (starting at the ^HEADER_TABLE pointer in the file), look
like this:

  2011-02-15T04:39:09 , 33,
  ^                  ^^^ ^^
  |                  ||| ||
  |                  ||| |+-byte position 25, comma separating COLUMNs 2 & 3
  |                  ||| |
  |                  ||| +--byte position 24, last byte of COLUMN 2, VERSION
  |                  |||
  |                  ||+----byte position 22, first byte of COLUMN 2, VERSION
  |                  ||
  |                  |+-----byte position 21, comma separating COLUMNs 1 & 2
  |                  |
  |                  +------byte position 20, last byte of COLUMN 1, TIME
  |
  +-------------------------byte position 1, first byte of COLUMN 1, TIME


HK EDF structure
================

Each HK EDF comprises two sections:

  a PDS label;
  a multi-row PDS TABLE.

Each HK EDF covers some span of time; each row in its PDS TABLE contains
data for a single time in that span.
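
The rows of an HK TABLE are sliced into fields with the same START_BYTE
and BYTES arithmetic as in step 3 of the general approach above.  A
minimal Python sketch of that arithmetic follows; it assumes the two
COLUMN definitions have been copied by hand from the CIDA_HEADER.FMT
excerpt shown in step 3 (a real reader would parse the FMT file), and
the names COLUMNS and parse_field are hypothetical.

```python
# Hand-copied subset of the COLUMN OBJECTs shown in step 3 (assumption):
# (NAME, DATA_TYPE, START_BYTE, BYTES)
COLUMNS = [
    ("TIME", "CHARACTER", 1, 20),
    ("VERSION", "ASCII_INTEGER", 22, 3),
]


def parse_field(row, data_type, start_byte, nbytes):
    """Extract one field; START_BYTE is one-based, so subtract 1."""
    text = row[start_byte - 1:start_byte - 1 + nbytes]
    if data_type == "ASCII_INTEGER":
        return int(text)
    if data_type == "ASCII_REAL":
        return float(text)
    return text.rstrip()  # CHARACTER: drop trailing padding


# First 25 bytes of the example header row from step 3.
row = "2011-02-15T04:39:09 , 33,"
record = {name: parse_field(row, dtype, start, nbytes)
          for name, dtype, start, nbytes in COLUMNS}
print(record)
```

Slicing by byte position, rather than splitting on commas, follows the
label and stays correct even if a CHARACTER field contains a comma.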

The TABLE in an HK EDF is similar to the Spectrum EDF table above, but
each row is the concatenation of two CONTAINERs.  A CONTAINER is an
OBJECT that groups repeating sets of data fields (COLUMNs) together.  In
the realm of TABLE layout, CONTAINERs act like COLUMNs in that they have
row space allocated to them per their START_BYTE and BYTES values, and
they act like TABLEs in that they contain columns.  The CONTAINERs in
the HK EDF TABLEs appear similar to the following:

  OBJECT            = TABLE
    ROW_BYTES       = 1852
    ROWS            = 26
                                          __
    OBJECT          = CONTAINER             \
      NAME          = CIDA_HEADER_DATA       \
      START_BYTE    = 1                       \
      BYTES         = 250                      \__First CONTAINER
      REPETITIONS   = 1                        /
      DESCRIPTION   = "CIDA header"           /
      ^STRUCTURE    = "CIDA_HEADER.FMT"      /
    END_OBJECT      = CONTAINER           __/
                                          __
    OBJECT          = CONTAINER             \
      NAME          = CIDA_CALIB_DATA        \
      START_BYTE    = 251                     \
      BYTES         = 1600                     \__Second CONTAINER
      REPETITIONS   = 1                        /
      DESCRIPTION   = "CIDA FDAQ [...]"       /
      ^STRUCTURE    = "CIDA_CALIB.FMT"       /
    END_OBJECT      = CONTAINER           __/
  END_OBJECT        = TABLE

The byte arithmetic is the same for the CONTAINERs as it was for the
COLUMNs in the Spectrum EDF example above.  In the example given here,
each row is 1852 bytes.  The first 250 bytes contain COLUMNs as defined
in CIDA_HEADER.FMT.  The next 1600 bytes, starting at ordinal byte 251
of each row, contain COLUMNs as defined in CIDA_CALIB.FMT.  The
remaining two bytes (250 + 1600 + 2 = 1852) are the row terminator,
which ROW_BYTES includes.

N.B. For CIDA data, the REPETITIONS keyword in all CONTAINER OBJECTs is
     always one; CONTAINERs are used here so that CIDA_HEADER.FMT can be
     reused for both the Spectrum and HK EDFs.  The PDS-intended use for
     CONTAINER OBJECTs is when a sequence of columns repeats in a single
     row.

N.B. As before, some keywords have been left out, and comments have been
     added to the right of the keywords.

The first CONTAINER defines the first group of data fields in each row,
and for all HK EDFs the first CONTAINER uses the same CIDA_HEADER.FMT
file as was used for the Spectrum EDF HEADER TABLE described above.
The second CONTAINER determines the type of the HK EDF; the FMT file of
its ^STRUCTURE keyword defines the meaning of the COLUMNs within it.
There are four types of CONTAINER, and four corresponding types of HK
EDF:

  HK EDF type                        Filename prefix  CONTAINER NAME
  -----------                        ---------------  --------------
  Calibration parameters             CALIB_           CIDA_CALIB
  Global variables                   GLOBALS_         CIDA_GLOBALS
  Housekeeping parameters            HK_              CIDA_HOUSEKEEPING
  Interrupt/config. variable values  KEEP_            CIDA_KEEPALIVE

N.B. Although the terminology overlaps, there is a distinction between
     HK (HouseKeeping) EDFs (non-event data files containing one of
     four types of HK TABLE), and a Housekeeping parameters table (one
     of the HK TABLE types in the HK EDFs).


EDF directory layout and filename conventions
=============================================

EDF directory layout
--------------------

Under the /DATA/ directory of this data set, the EDFs are located in
subdirectories by type and target as noted in this table:

  Dir. & Subdir.    Target                  EDF Type
  --------------    ------                  --------
  /DATA/NONSCI/     NON SCIENCE             Spectrum
  /DATA/TEMPEL1/    9P/TEMPEL 1 (1867 G1)   Spectrum
  /DATA/ISP/        INTERSTELLAR PARTICLES  Spectrum
  /DATA/ANCILLARY/  N/A                     HK

See also the /DATA/ directory and subdirectory layout graphic below.

Spectrum EDFs for which data were not consistent with a time-of-flight
mass spectrum were assigned the target NON SCIENCE.  All other Spectrum
EDFs were assigned comet or particle targets based on the mission phase
(Cruise or Encounter) in effect at the time of the spectrum.

N.B. This assignment was subjective, and it is possible that some
     Spectrum EDFs were mis-assigned.

The HK EDFs have no target per se and are all under one directory.

Filename conventions
--------------------

All EDF names end in .TAB.
Before the .TAB:

1) Spectrum EDF names start with a T, followed by the time as
   yyyymoddhhmmssx, where yyyy, mo, dd, hh, mm, ss are the year, month
   of year, day of month, hour of day, minute of hour, and second of
   minute, respectively.  The final x is a letter (A-Z) to allow for
   multiple events within the same one-second interval.

2) HK EDF names start with the type of HK table (CALIB, GLOBALS, HK,
   KEEP for calibration, globals, housekeeping, and keepalive EDFs,
   respectively), then start and end dates as Tyyyymodd, all connected
   by underscores.


/DATA/ directory and subdirectory layout
========================================

  /DATA/
  |
  +--TEMPEL1/                            Spectra EDF directory
  |  +--Tyyyymoddhhmmssx.TAB             - Spectrum EDFs,
  |                                        target=9P/TEMPEL 1 (1867 G1)
  |
  +--ISP/                                Spectra EDF directory
  |  +--Tyyyymoddhhmmssx.TAB             - Spectrum EDFs,
  |                                        target=INTERSTELLAR PARTICLES
  |
  +--NONSCI/                             Spectra EDF directory
  |  +--Tyyyymoddhhmmssx.TAB             - Spectrum EDFs,
  |                                        target=NON SCIENCE
  |
  +--ANCILLARY/                          Housekeeping EDF directory
  |  +--CALIB_T20050611_T20110214.TAB    - Calibration parameters
  |  +--GLOBALS_T20050611_T20110214.TAB  - Global variables
  |  +--HK_T20040213_T20110215.TAB       - Housekeeping params, Cruise
  |  +--HK_T20110215_T20110215.TAB       - Housekeeping params, Encounter
  |  +--KEEP_T20050611_T20110214.TAB     - Interrupt/config variable values
  |
  +--DATAINFO.TXT                        This file
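
The filename conventions described in this document can be sketched in
Python as shown here.  This is a minimal sketch under the stated
conventions only; the function name parse_edf_name and the layout of the
returned dictionary are hypothetical, not part of the archive.

```python
import re
from datetime import datetime

# Spectrum EDF: T + yyyymoddhhmmss + one letter A-Z + .TAB
SPECTRUM_RE = re.compile(r"^T(\d{14})([A-Z])\.TAB$")
# HK EDF: type prefix, then start and end dates as Tyyyymodd, joined by "_"
HK_RE = re.compile(r"^(CALIB|GLOBALS|HK|KEEP)_T(\d{8})_T(\d{8})\.TAB$")


def parse_edf_name(name):
    """Classify an EDF filename and decode the time(s) it encodes."""
    match = SPECTRUM_RE.match(name)
    if match:
        return {
            "kind": "SPECTRUM",
            "time": datetime.strptime(match.group(1), "%Y%m%d%H%M%S"),
            "suffix": match.group(2),
        }
    match = HK_RE.match(name)
    if match:
        return {
            "kind": match.group(1),
            "start": datetime.strptime(match.group(2), "%Y%m%d").date(),
            "end": datetime.strptime(match.group(3), "%Y%m%d").date(),
        }
    raise ValueError("not a CIDA EDF filename: " + name)


spectrum_info = parse_edf_name("T20110215043832C.TAB")
hk_info = parse_edf_name("HK_T20040213_T20110215.TAB")
print(spectrum_info)
print(hk_info)
```

Anchoring both patterns at the start and end of the name keeps
DATAINFO.TXT and other non-EDF files from being misclassified.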