***** File VOLINFO.TXT

IHW Comet Halley Archive: Volume Description for Volumes 19-23, and a Brief History of the Generation of IHW CD-ROMs

assembled by E. Grayzeck, Jr. and M. B. Niedner, Jr.

[NOTE: To a certain extent, this file represents a combining of text files found elsewhere in the DOCUMENT directory of this disc.]

Contents

0. Acknowledgements
1. Introduction
2. The Comet Giacobini-Zinner Test Disc
3. Production of Large-Scale Phenomena (L-SP) Compressed Image CD-ROMs (Volumes 1-18)
4. Production of the Mixed-Data Discs (Volumes 19-23)
   A. Depositing the Data on the NASA/GSFC Mass Storage System
   B. Quality Assurance Programme and Test Mixed Disc
   C. Filenaming Conventions
   D. Directory Structure and Size
   E. Time Ranges of Discs, Datafile and Directory Counts
   F. Contents of Supplemental (non-data) Directories
5. Data Descriptions
   A. PDS Data Objects used in the IHW Archive
6. Some Techniques in the use of Volumes 19-23
7. Suggested References

SUPPLEMENTAL INFORMATION
I. Ephemeris
II. Subsampled Browse Images for the Large-Scale Phenomena Discipline
III. Calibration Data for three IHW Disciplines
   A. Infrared Studies Discipline
   B. Large-Scale Phenomena Discipline
   C. Spectroscopy and Spectrophotometry Discipline

APPENDIX: Data Formats
   A. FITS Format Information
   B. PDS Labels
      1. Keyword Definitions

0. ACKNOWLEDGEMENTS (A Brief History of the Generation of IHW CD-ROMs)

By now the story of the International Halley Watch (IHW) is well enough known that I need not attempt any recounting of its entire history as far as acknowledgements are concerned. Besides, that is not my place: Ray L. Newburn and Juergen Rahe, the IHW Co-Leaders, have already described in print (the so-called "IHW Summary Volume") much of what has transpired since the late-1970s and early-1980s in the world of IHW.
These Acknowledgements are concerned with the final steps of CD-ROM preparation and production, steps which were largely taken by a handful of individuals at the NASA/Goddard Space Flight Center, working in collaboration with the IHW Lead Center (LC) at the Jet Propulsion Laboratory (JPL). Let those who are considering the assembly of a CD-ROM archive of this size (20+ volumes of data) be aware of this truth, which we have learned empirically: depending on the nature of the data and on the diversity of data types, the "job"--defined here as all efforts leading to the shipment of pre-mastered tapes to a CD-ROM mastering vendor--may not be close to completion once all the data have been received from the outside world. That was certainly the case in our situation. The simple truth is that if the goal is to create a useful archive, one that is replete with searchable indices, tables of interest, software, and lucid documentation, and, moreover, one which possesses a useful and efficient directory layout (or "CD tree"), then a very large amount of effort is required. It probably comes as no surprise to the reader that many revisions of plan are encountered along the way, as a scheme which once seemed promising now looks like the course NOT to follow. A point which cannot be emphasized enough is that for CD-ROMs, like IHW's, which contain a very large number of files and whose directories contain many types of data originally resident on so many different magnetic tapes, the data need to reside on a "mass storage system" immediately prior to ingestion into the "pre-mastering workstation" (a device which converts data and files to a format which a CD-ROM mastering vendor can use). In other words: transfer the original tapes into mass storage and organize the data there, either writing output tapes or streaming the data directly to the workstation by electronic means. 
One advantage of this approach is that it is readily adaptable to new technologies, such as 8mm exabyte tape and FTP file transfer. The other approach of creating multiply-interleaved magnetic tapes directly from many input tapes (i.e., without intermediate storage) is not only excessively time-consuming, but it is more error prone and less adaptable to repeat attempts if something goes wrong the first time. ------------------- That NASA/GSFC became so involved in these last steps of IHW archive production came about as a direct result of the points raised in the last two paragraphs. The brief history is this. During 1986-89, the IHW Large-Scale Phenomena (L-SP) Discipline, the digital data portion of which resided at NASA/GSFC, was engaged in sending standardized, FITS-formatted data to the JPL LC (as were all the IHW Discipline Teams). However, because of the enormous disparity between the average file size for L-SP data (approx. 15 Mb) and those of the other IHW Disciplines, it was decided in late-1987 that L-SP's contribution to the IHW CD-ROM archive would reside on dedicated discs, and, further, that in order to reduce the number of discs required the L-SP data would be compressed by a factor of not less than two-to-one. As a result of follow-on studies conducted by Archibald ("Archie") Warnock III and Barbara B. Pfarr, both of STX Corp. and serving, respectively, as Senior Software Specialist and Archive Manager for the L-SP Discipline Specialist Team, it was decided that "previous pixel compression" was not only conceptually simple to end users but would yield 2:1 compression. It became the technique of choice. A development parallel to these decisions about L-SP data was one concerning the manner in which the IHW data were to be pre-mastered for CD-ROM. 
Specifically, an agreement was reached between the IHW and NASA/GSFC's National Space Science Data Center (NSSDC) which allowed IHW's use of the NSSDC's pre-mastering workstation for the entire set of CD-ROMs. There were several factors at work here, among them the obvious desirability from a management/cost viewpoint of having a government facility (NASA/GSFC-NSSDC) directly involved in the pre-mastering. Not the least of the factors, however, was the desire to continue Dr. Edwin J. Grayzeck's (then of Interferometrics, Inc., and under contract to the NSSDC) connection with the IHW CD-ROMs. Ed had, for several years, been on my L-SP Discipline Specialist Team, and with time he had "branched out" into the larger arena of IHW CD-ROM production. The IHW, to which Ed had served as a consultant for CD-ROM work, knew of his worth to the project. Indeed, many of us in the Discipline Specialist community received our "CD-ROM education" from Ed as a result of talks he gave at IHW meetings (Archie Warnock also possessed and communicated valuable CD-ROM expertise to the IHW). Returning to the subject of the L-SP data, it was felt that, due to the unique nature of those data, the compression should take place at NASA/GSFC following the completion of the microdensitometry effort. In other words, this very discipline-specific task should be done at the discipline level. We felt that this was "one more thing" that should NOT be added to the burden of Mikael Aronsson (JPL/LC) and the LC. Besides, our manner of data shipment to Mikael was one (uncompressed) image per magnetic tape, which had resulted in over 1,500 tapes shipped between 1986 and 1989. To ask Mikael, who did not have access to a mass storage platform, to run our compression code on files contained on 1,500 separate tapes, seemed "cruel and unusual." We offered to do the job at GSFC and to do whatever was necessary to get the files to Ed Grayzeck at NSSDC's pre-mastering workstation.
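For readers unfamiliar with the idea behind the "previous pixel compression" mentioned above, the following Python sketch shows one simple variant: each 16-bit pixel is replaced by its one-byte difference from the preceding pixel, and a reserved flag byte followed by the full 16-bit value is written whenever the difference does not fit in one byte. The exact encoding used for the L-SP images is not specified in this section; the flag value, byte order, and function names below are assumptions made purely for illustration and should not be taken as the IHW format.

```python
import struct

ESCAPE = -128  # assumed flag value; the real L-SP flag is not given here


def compress_row(pixels):
    """Encode signed 16-bit pixels as one-byte differences from the previous pixel."""
    out = bytearray()
    previous = 0
    for value in pixels:
        diff = value - previous
        if -127 <= diff <= 127:
            # Small change: a single signed byte is enough.
            out += struct.pack("b", diff)
        else:
            # Large change: flag byte, then the full 16-bit value (big-endian).
            out += struct.pack(">bh", ESCAPE, value)
        previous = value
    return bytes(out)


def decompress_row(data, count):
    """Invert compress_row for a row known to contain `count` pixels."""
    pixels = []
    previous = 0
    pos = 0
    while len(pixels) < count:
        (code,) = struct.unpack_from("b", data, pos)
        pos += 1
        if code == ESCAPE:
            (value,) = struct.unpack_from(">h", data, pos)
            pos += 2
        else:
            value = previous + code
        pixels.append(value)
        previous = value
    return pixels


if __name__ == "__main__":
    row = [1000, 1003, 1001, 1001, 5000, 4990]
    packed = compress_row(row)
    print(len(packed), "bytes instead of", 2 * len(row))
    print(decompress_row(packed, len(row)) == row)
```

In smoothly varying images most differences fit in one byte, which is why a scheme of this kind approaches the 2:1 ratio quoted above for 16-bit data.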
It was at this point--the end of 1988 and the first half of 1989--that NASA/GSFC/L-SP's Dr. Daniel A. Klinglesmith III, working closely with John M. Bogert III (also of NASA/GSFC), made unique contributions to the L-SP effort which were to have great value later on with the entire IHW dataset. Dan and John transferred the entire set of uncompressed L-SP imagery to NASA/GSFC's IBM/3081 mass storage system (over 20 gigabytes of data), compressed the data there, then wrote the compressed datafiles to magnetic output tapes in chronological order (of observation date/time) and shipped them across GSFC to Ed. In the process of setting up this "system," John and Dan also created software which generated a set of on-line catalogs listing: every datafile, a subset of the more important FITS keywords associated with each, and the location of each file within the IBM "disk farm." At this point in the second half of 1989 we were, theoretically, ready to pre-master all 18 volumes of L-SP compressed images, but it was important to create a "test disc" to ascertain, not only if the data preparation, disc layout, and pre-mastering had been done correctly and intelligently, but also what type of CD-ROM "performance" could be expected of a high-quality mastering vendor. Toward this end, we (Ed, Dan, John, Archie, and I) created a "Halley Armada Test Disc" containing 80 compressed L-SP images spanning 1986 March 6-14 (Armada Week). The mastering vendor for this "one shot" venture was known to be at the top of the CD-ROM profession, and extensive testing of the resulting disc by us and an outside testing company confirmed the disc's high quality (low block linear error rates, etc.). As important, we liked the layout of the disc and decided to go forward with most of its features for the full set of 18 L-SP discs. ["Armada" was actually the second IHW test disc: the first one had been a disc containing IHW data on comet P/Giacobini-Zinner. 
The G-Z test disc--its history and purpose--is discussed more fully in Section 2 below]. [Something of an aside, perhaps, but I should nonetheless state that the drawing-up of technical specifications, the writing of a "Request for Proposal" (RFP), the actual selection of a CD-ROM mastering vendor, and the writing of the Contract, were all aspects of the IHW CD-ROM work which occurred at NASA/GSFC. By agreement between Ray Newburn and me, I was in charge of performing these tasks, including the judging of proposals and the awarding of the Contract (out of funds shipped from JPL to NASA/GSFC). My primary interaction in all of this was with the NASA/GSFC Procurement Office, and it is a pleasure to thank Ms. Cindy Tart; she was very patient with me (explaining the vagaries of government procurement) and was as interested as the IHW in securing the services of an excellent CD-ROM vendor. We were strongly guided by the high performance characteristics of the Armada Test Disc.] ------------------- Production of the 18 L-SP compressed image discs followed in fairly routine order, Ed Grayzeck and an assistant doing the actual pre-mastering from tapes created by Dan Klinglesmith and John Bogert. In the meantime, Mikael Aronsson at the JPL LC was working on the myriad of tasks required for preparing the datafiles of the other IHW Disciplines, datafiles which would reside on a shorter series of 5 "mixed discs." The idea was that Ed Grayzeck would receive from Mikael chronologically-sorted magnetic tapes on which six of the IHW Disciplines' data would reside interleaved; this would include uncompressed, subsampled "browse versions" of the L-SP images which we had shipped Mikael. Three of the IHW Disciplines were to have their data deposited on CD-ROM in different directory levels, and they could be separately treated. The tricky question was: how does one interleave over 16,000 datafiles from 6 sets of input tapes (one set per Discipline) without some form of mass storage? 
The answer, of course, is that if enough tape drives are available and if enough human intervention time is committed (for tape mounts, monitoring/correction of media errors, tape drive breakdowns, etc.), it can be done. At NASA/GSFC, we were concerned about the huge number of tasks which confronted the JPL LC (especially Mikael). The largest of these, undoubtedly, was the creation of interleaved datatapes in a many-tapes-to-tape operation involving about 100 input tapes. As a result of our L-SP work, which included all the tasks from initial archiving to actual disc production, we knew that the mass storage techniques developed by Dan and John were very powerful when applied to datasets like IHW's. I made an appeal to Ray Newburn, which was accepted, to have Mikael ship us the ENTIRE set of IHW data for ingestion into the NASA/GSFC IBM/3081 "mass store." In other words, the final steps of data preparation would take place at NASA/GSFC. It is important to state, however, that this transfer allowed Mikael to concentrate on many other tasks such as index construction, standardizing and re-formatting of Discipline Appendix files, etc. ------------------- Once the entire IHW dataset had been transferred and was on-line at NASA/GSFC, a tripartite decision was made in late-1990--by NASA/GSFC, JPL, and the Small Bodies Node (SBN) of the Planetary Data System (PDS)--to create a third IHW test disc, this one containing data from the entire IHW in much the same structure as envisioned for the so-called "mixed discs" [Michael F. A'Hearn is Node Manager of the SBN/PDS, and was a Discipline Specialist for IHW]. The emphasis here was not at all on testing mastering quality (the Contract having already been awarded), but on scrutinizing the characteristics of disc design and layout, these being, in contrast to the L-SP discs, extremely complicated discs. 
Further, there was the hope that any systematic problems with subsets of data might surface in disc review and be correctable before the final discs were made. In addition (and finally), this disc would test our ability to transfer files electronically to the pre-mastering workstation via FTP (the Armada Disc was assembled from output tapes written off the IBM mass storage device). The plan was not just to examine the disc ourselves, but to distribute copies to a handful (5-10) of outside reviewers. Also sent out for review was the earlier L-SP "Armada Week" test disc. Due to the exigencies of time, it was not possible to fabricate the "IHW Test Disc" exactly according to the mixed disc design. For example, PDS labels were not included in this second test disc, and the documentation and index tables were far from complete. Although our reviewers did point out these deficiencies to us, and had, in some cases, complaints about our decision to split off FITS headers from the data, they generally were quite favorable in their remarks about the test discs. It is a pleasure now to thank the following individuals, our "outside peer review panel": Drs. Anita Cochran (Univ. of Texas-Austin), Mike DiSanti and Susan Hoban (NASA/GSFC), Michel Festou (Observatoire de Besancon), Barry Lutz and David Schleicher (Lowell Observatory), Karen Meech (Univ. of Hawaii), and Al Schultz and Wayne Kinzel (Space Telescope Inst.). Disc reviews within the IHW community were performed by M. A'Hearn, M. Aronsson, E. Grayzeck, D. Klinglesmith, R. Newburn, M. Niedner, and A. Warnock.

-------------------

This brings the story nearly up-to-date (i.e., October 1991). In the last 12 months a great deal of work has been expended at NASA/GSFC (in collaboration with the JPL LC) in:

o managing an IBM on-line archive consisting of (approx. 3x) 37,700 datafiles (the FITS headers and PDS labels are distinct files separate from the data; "approx. 3x" because some files are "dataless", consisting of only headers and labels);

o reviewing and revising the layout, or "CD tree," of the mixed discs;

o writing software to analyze the temporal distribution of files across IHW disciplines, and creating CD-ROM data subdirectories of time widths which satisfy our chosen maximum number of files per directory, 256;

o creating an intermediate "staging area" out of disk space on the Laboratory for Astronomy and Solar Physics' (LASP) VAX cluster, in order to build the contents of individual CD-ROMs (in other words, the electronic data flow was: IBM--VAX--workstation);

o responding to calls by the Discipline Specialists for error correction of headers and data (hundreds of files across several of the disciplines), made possible by the headers/data being "on-line";

o creating searchable, delimited tables and indices from on-line headers and data;

o generating PDS labels for all datafiles;

o writing/editing of documentation to allow the archive user to understand the disc contents and layout; and

o frequent checking of procedures and products.

The above should be considered a partial list of the activities which occurred even after the IHW data were deposited on the IBM mass store in the late-summer of 1990. If it is appropriate to single out a particular individual within the last 8-12 months, then that person surely is Dan Klinglesmith, who has been extremely active in all phases of the work. This is not to diminish anyone else, however: we've all been very busy and are eager to move on to other things! I have truly lost track of the number of IHW "planning sessions" attended by Dan, Ed, Archie, and me, and I'm equally hazy about the number of e-mail messages swapped back and forth (it's LARGE, and includes those sent by Mikael Aronsson, Ray Newburn, and Mike A'Hearn). On it goes.... We are nearing the end now, however.
Pre-mastering of the mixed discs will start in earnest in a matter of weeks at most, and should be completed in several months. Data preparation for the third series of IHW CD-ROMs, that of the "Space Data", is getting underway at the SBN/PDS, University of Maryland, under the direction of Ed Grayzeck and Mike A'Hearn.

Malcolm B. Niedner, Jr.
IHW Discipline Specialist for Large-Scale Phenomena
Laboratory for Astronomy and Solar Physics
NASA/Goddard Space Flight Center
Greenbelt, MD 20771 USA
October 2, 1991

1. INTRODUCTION

The International Halley Watch (IHW) Archive of comet P/Halley contains tens of thousands of observations obtained by many international astronomers and scientists during the years 1981-89. The Archive's several components reflect the diversity of ways in which data and information may be disseminated to the scientific community, but of these the largest and most important is the set of approximately 25 compact discs (CD-ROMs) containing digital Halley data obtained from the ground, earth orbit, and in situ. The body of work required to produce these discs has very much been the core effort of IHW personnel, from the Discipline Specialists and their teams to the Lead Centers (LC) at JPL and, more recently, NASA/Goddard Space Flight Center.

The IHW compact discs come in four subsets:

o Compressed images from the Large-Scale Phenomena Discipline (Vols. 1-18)
o Data from all "ground-based" IHW Disciplines (Vols. 19-23)
o IHW "Trial Runs" on comets P/Crommelin & P/Giacobini-Zinner (Vol. 24)
o In situ Halley data (Vols. 25-26)

While the major purpose of this text file is to describe the contents of Volumes 19-23, which have sometimes been called the "mixed discs" due to their multi-Disciplinary nature, an additional purpose is to present a short history of the effort devoted to producing the entire set of IHW CD-ROMs.
It is worth naming here the nine IHW Disciplines: Astrometry, Infrared Studies, Large-Scale Phenomena, Meteor Studies, Near Nucleus Studies, Photometry and Polarimetry, Radio Studies, Spectroscopy and Spectrophotometry, and Amateur Observations.

-----------------

As the International Halley Watch (IHW) became a reality during 1980-81, it became obvious that distribution of images in any digital form would be a problem because of the enormous amount of data involved. Since the IHW was producing an archive, there was no need to use a medium that could be overwritten. What was needed were longevity, accuracy, speedy access, and a standardized format for which inexpensive playback equipment was readily available. Cost and ease of production were clearly factors. Commercial laser discs were tested by the Planetary Data System (PDS) for storage, as was the compact disc being promoted for the audio market. In the production of audio CDs, Philips and SONY reached an agreement on the physical structure of discs. The so-called Red Book described the size of the disc, placement of center hole, useable area, and encoding of the data. SONY and Philips also realized the potential for this medium to store other digital data for distribution if the error correction could be improved. Using a layered EDC/ECC scheme to improve upon the standard error correction code called CIRC (Cross-Interleaved Reed-Solomon Code) by a factor of 10,000 meant that character, tabular, and image data could be archived on CD-ROM. Eventually, a Yellow Book was generated which described the physical encoding of these data in the same structure as audio CDs, i.e., 2048 byte blocks with 304 bytes for housekeeping. Typical error rates indicate only one lost bit per 2000 discs. The use of the constant linear velocity (CLV) recording format provides maximum data packing but has the disadvantage of slow access times when compared to other media using the constant angular velocity (CAV) approach.
Access time usually includes the changing speed for the disc, the radial movement of the laser diode which requires a settling time, and the location procedure that often demands a full rotation of the disc. Current players have reduced the access times to under 400 msec, or a factor of 4 slower than typical magnetic hard disks. Coupled with the low transfer rates set by the audio requirements (150 KB/s of useful data), this means that the placement of data on the CD-ROM requires a strategy for efficient use. However, these disadvantages are outweighed by the low cost of this medium and its longevity as an archiving tool. When the CD-ROM technique became accepted as a digital storage medium, a number of vendors attempted to write application software, primarily for PCs. This resulted in proprietary formats which quickly became non-standard. At about this time, Microsoft organized an informal working group that developed a logical structure then called the High Sierra proposal. Eventually, this resolution was modified and has been documented as the International Standards Organization 9660 format. At this writing, even those vendors with proprietary formats such as UNIFILE (DEC) and HFS (APPLE) have announced their support of that standard. In the PC market, Microsoft has supported an extension to MS-DOS which is supplied in its 4.0 operating system. The main advantage of this logical structure is that there are well defined rules for volume descriptors, placement of files, and record structures. Descriptors in the data area identify the volume, establish a character set, locate the path table, and indicate the presence of boot records. Data are located by logical sectors (2048 byte blocks) or a finer division into logical blocks (minimum 512 bytes). The path table provides a quick means to point at data since the structure is hierarchical as in MS-DOS. 
Finally, Extended Attribute Records (XARs) can be used to carry associated information about the record structure, key dates, global permission, and hidden files. The key to this standard is its three levels of interchange which span various machines and operating systems. In the lowest level 1, a file is a continuous byte stream spanning only one sector. Directory and file names are restricted to 8 characters with a 3 character file extension allowed. This level is designed for PC style machines but must be acceptable to drivers for higher levels. The advent of these standards has proved to be a major advantage to archivists. The low cost of the media and CD players, and the existence of widespread applications software, ensure that the data can be widely distributed. The longevity of optical media is considerably greater than that of more volatile magnetic storage and could rival such media as photographic plates. But the main disadvantage to this approach is that the CD-ROM is really a "publishing" medium. In the data preparation phase, an archivist has complete control over integrity and structure. In order to produce the CD-ROM, the data need to be shipped to a commercial vendor for actual mastering and replication. To ensure that the organization of the data follows scientific standards, the "pre-mastering" phase is done by the archivist. In this way, the directories, path table, and layout of the disc, as well as customized application programs, can be tested on the complete data set. Once the integrity of the data is secure then final tapes in the ISO format are sent to a mastering facility. There the actual EDC/ECC is supplied, along with synch information to complete the pre-mastering phase. Creation of the IHW Archive has required several advances in data formatting and handling. Astronomical data transfer began to be standardized with acceptance by the International Astronomical Union of a system called FITS (Flexible Image Transport System).
The IHW adopted this format, including an extension to FITS tailored for tabular material. The IHW had proposed and is using a further extension for compressed data; a standard similar to the IHW approach is under review by the FITS Working Group for Astronomical Software (WGAS). Meanwhile, the PDS has developed an independent system of formatting data which has some advantages over FITS. The IHW has included in the Archive detached PDS labels in order that the data can be accessed via readers for either format. The techniques for indexing CD-ROMs were developed by the National Space Science Data Center (NSSDC) and IHW for the IHW Archive, which includes data not only of comet P/Halley but also of comets P/Crommelin and P/Giacobini-Zinner. The software required to read CD-ROM data has been continuously developed by the PDS and has been made available to the IHW and the NSSDC. 2. THE GIACOBINI-ZINNER TEST DISC The IHW instituted a "trial run" on comet P/Giacobini-Zinner (G-Z), centered around the time of the International Cometary Explorer (ICE) encounter with the comet on 1985 September 11. The exercise's dual purposes were to support the ICE mission and to test the data flow paths and internal organization of the IHW. With the goal of making a test CD-ROM, in 1989 the data from the IHW G-Z Archive were brought to NSSDC and ported to the CD Publisher (at NSSDC) via 9-track magnetic tape. The directory structure originally envisioned for the Halley "mixed discs" was modified to include ICE data and Large-Scale Phenomena compressed and (subsampled) browse images. The actual pre-mastering process, i.e., the building of a CD-ROM "image" on the CD Publisher (also known as a pre-mastering workstation), was carried out in the batch mode in order that the disc could be iteratively tailored to a convenient layout. The premastered tapes were sent to Disctronics, a vendor chosen by IHW-LC. 
The official release of the discs and accompanying software by the IHW-LC took place in May, 1989. In addition to the CD-ROM, a floppy disk with modified IMDISP code and a user SHELL were included in the Beta release. A guide, which could be printed as ASCII text, was added to the floppy disk. The user SHELL was designed to be flexible, i.e., make use of existing astronomy software packages. The evaluation of the G-Z Test Disc (GZ_0001) was conducted by questionnaire, at meetings, and at a CD-ROM Workshop. Initially, a report form was designed and included with the CD-ROM distribution. As a follow-up, a poster presentation at the American Astronomical Society meeting in June 1989 was used to demonstrate the search capability of Database Management System (DBMS) indices. As part of this process, a CD-ROM Workshop was held at NSSDC later that same month. The CD-ROM Workshop focused on the premastering workstation at NSSDC. Through a series of talks, the entire process was outlined and procedures for its use were proposed. In addition, the general topic of ISO format guidelines and even label art was discussed. The summary document from this first CD-ROM Workshop in the NASA environment could be used as a set of guidelines for other technical and government agencies involved in CD-ROM production. Finally, the participants were able to share experiences from many different projects involving data types and formats that would be used in the future. It became clear that the contents of the G-Z Test Disc must be modified for the Halley Archive to include NSSDC guidelines. A subsequent meeting attended only by IHW participants took place immediately after the CD-ROM Workshop. The Minutes from that session have formed the guidelines for the current design of the full Comet Halley Archive, which included some additional background steps leading up to the eventual premastering of the compressed image discs. 
An important change involved the choice of new filenames to reflect the Discipline and sub-Discipline, and to keep a chronological running count of the number of files throughout the Archive. The G-Z data have been included as part of the Comet Halley Archive on a separate disc (HAL_0024) using the new filenames and directory design. In other words, Volume 24 is a "remake" of the initial G-Z Test Disc, done according to the new design guidelines established by IHW. Also included on Volume 24 are IHW observations of comet P/Crommelin. 3. PRODUCTION OF LARGE-SCALE PHENOMENA (L-SP) COMPRESSED IMAGE CD-ROMS (VOLUMES 1-18) The next phase in IHW CD-ROM production addressed the voluminous set of Large-Scale Phenomena digital imagery which, even in compressed form, was projected to occupy 18 discs. Because the decision had been made to deposit the L-SP images on dedicated discs (separate from the other Disciplines' data), and also because their homogeneity permitted a relatively simple directory structure, it was felt that building the L-SP compressed image discs would be considerably more straightforward than the same exercise applied to the so-called "mixed discs." Moreover, the L-SP Discipline Specialist Center was at NASA-GSFC, and proximity to the NSSDC (also at GSFC) was a distinct advantage from the standpoint of data transport. Initial work included definition of the method for producing the IHW CD-ROM set for a wide array of platforms. Expertise was developed using the SUN, MicroVAX, MAC, and PC to access CD-ROM data. A large number of players (SCSI,Q-bus,PC-bus) were swapped among machines to develop a working knowledge of ISO constraints and a testbed of systems for evaluating the first CD-ROM. An immediate concern was the current implementation of Extended Attribute Records (XARs) to describe variable length files in the DEC environment, a problem that still exists. 
It was concluded that for the IHW CD-ROMs, no XARs would be included, but that the text and data would be presented in fixed length format with instructions on the conversion procedure for a VMS system. In designing the L-SP compressed image CD-ROMs, much effort went into visualizing the characteristics and disc structure for the entire set of IHW CD-ROMs, and making the L-SP discs consistent with that total set in terms of filenaming conventions, directory layout, documentation and software provided, etc. For example, with the exception of calibration data, the numerical portion of filenames was set to be rigorously chronological (by time of observation), beginning at 0001; calibration data are also (internally) chronological, but their numerical filenames start at 4001. Although, in terms of imagery, the L-SP dedicated discs were to contain only digital datafiles (and associated headers and PDS labels), it was decided that the filenaming would also reflect those images which the L-SP Team had received but not digitized. So, for example, the "first" digital L-SP image is not LSPN0001, but LSPN0059. The initial real test of these guidelines and techniques took place when a second test disc was fabricated which contained a set of selected L-SP images from the time period 1986 March 6-14, when comet Halley was encountered by an "armada" of spacecraft. Unlike the first test disc which showcased the "design", this project was meant to streamline the process of both reformatting the data (on the pre-mastering workstation) and describing them by adequate metadata. A number of critical discoveries were made as part of the review process. First, the building of a CD-ROM "image" on the workstation's MS-DOS partition was not needed if the input tapes were thoroughly verified. 
Second, it was realized that if care was not exercised in choosing the disc-to-disc "splits" for the full series of L-SP CDs, it would be quite possible to split (unintentionally) the final image on a disc, if that image had been digitized in two or more segments. Third, it was found that the pre-mastering environment is not the place to be modifying metadata, and that such changes on the workstation should be kept to an absolute minimum. This second test disc, known informally as the "Armada Disc", did have some errors in labels and text files; such problems were deemed both acceptable for a test disc and instructive for the future.

With an eye toward optimizing the pre-mastering process for the full run of 18 L-SP discs, there was from this point on a conscious move to organize the incoming data on magnetic tapes on a per-disc basis, and to minimize the number of documentation changes which needed to be made from one disc to the next. Specifically, the text files that did not change were held on a master floppy disk; only three were updated (AAREADME, CDTREE, and VOLUME) from a master table composed in the IHW log. Similarly, the PDS labels were made as a set, corrected, and held on floppy disks for each CD-ROM volume. Only those index files specific to each volume changed (CDSTRUCT, EPHEM, NETLARGE, PATHTABL); care was taken to correctly reflect these changes in the FITS headers accompanying those files.

A production schedule was instituted that called for composing about two discs per week of scheduled pre-mastering workstation time. This was begun immediately after the test run of selected images for the Armada Week was reformatted. Each disc was checked in three steps: the display of all BROWSE images, a random check of compressed comet images, and a random check of calibration images. A file count for each disc was kept and verified before the output tapes were written.
A set of structure files, called CDSTRUCT, was composed to provide a listing of the physical location of each data file. These files were constructed at the very end and inserted into the CD-ROM "image." The output was essential to providing the path and filenames for the SFDU inventory file termed VOLDESC.SFD, which is described later.

The set of 18 compressed L-SP data discs was pre-mastered at NSSDC by July, 1990, and one disc was mastered that October. A new technique which emerged during this production was the splitting of workstation disk space into separate ISO partitions to increase speed. One of the more important new ideas was to utilize as much of the extra storage space on the last volume (Vol. 18) as possible, by depositing the full run of 1,612 subsampled browse images into 18 separate "volume subdirectories" (path=\SUMMARY\BROWSE\HAL_00nn, where HAL_00nn is the volume number). In addition, a full PATHTABL index was generated which included each datafile in the entire set of 18 discs, and an ERRATA directory was set up in which errors grouped by volume (HAL_00nn) were enumerated (and replacement header or label files presented). It was felt that this type of "summary use" of any extra disc space on the last volume of a subset of discs might also be possible with Volume 23 (last of the "mixed discs").

As the project progressed, software was developed to construct various levels of metadata, beginning with PDS labels and including SFDU pointers to specific reference documentation. Following guidelines provided by PDS and CCSDS, a VOLDESC.SFD inventory file was created for each disc including the final summary volume. Working with NSDSSO, a procedure had been developed to design reference files to self-document the disc and then provide an inventory of pointer files for the data. The original code was developed under C on a mainframe, and ported to the pre-mastering workstation.
After a number of iterations and modifications, a series of steps to provide this inventory was streamlined into a software package now available at NSDSSO.

The entire set of L-SP discs (HAL_0001-HAL_0018) was premastered according to standards proposed by NSSDC for disc structure (including subdirectories), conformance to the ISO 9660 standard (Volume Descriptor table), and disc art, e.g., full VOLUME identification. Many of these guidelines were introduced at the CD-ROM Workshop in June of 1989, and follow-up discussions took place which addressed unresolved questions. At the subsequent meetings, CD-ROM manufacturing processes were scrutinized as was the longevity of this archive medium.

During the CD-ROM testing phases of the IHW project, we defined and implemented procedures to evaluate the quality of test discs. This included not only full-disc "read checks" on the assortment of CD-players resident at NASA/GSFC and the NSSDC, but also a sophisticated set of electronic tests conducted under contract at an off-site facility.

4. PRODUCTION OF THE MIXED-DATA DISCS (VOLUMES 19-23)

A. Depositing the Data on the NASA/GSFC Mass Storage System

As described in the Acknowledgements (Section 0. & ACKNWLDG.TXT), production of the L-SP test disc, as well as Volumes 1-18, had shown the value of mass storage techniques for the creation even of relatively straightforward discs. In the so-called "mixed discs" (Volumes 19-23), the IHW was facing a rather different matter. Whereas the L-SP disc series contained 1,612 independent datafiles on 18 discs, the mixed discs were projected to hold more than 37,000 datafiles on only 5 discs. Given the IHW decision to create a "main data directory level" which would contain interleaved observations from six of the professional Disciplines (and all their subdisciplines), the task of organizing split data, header, and PDS label files into the proper directories was at the very least a daunting one.
The total set of IHW data, which had originally been shipped by the Discipline Specialists to the JPL Lead Center, and which had been so carefully checked there by Mikael Aronsson, was therefore sent to NASA/Goddard Space Flight Center to reside on code 930's IBM3081 Mass Storage System, where they would be organized, verified, and ultimately transported to the NSSDC's pre-mastering workstation (also at NASA/GSFC).

At a glance, the Disciplines and subdisciplines whose data came together on the "main level" on these mixed-data CD-ROMs are as follows:

   Infrared Studies Subdisciplines:
      2.1  Infrared Photometry
      2.2  Infrared Polarimetry
      2.3  Infrared Spectroscopy
      2.4  Infrared Imaging
   3.0  Large-Scale Phenomena Discipline (browse images & dataless headers)
   4.0  Near Nucleus Studies Discipline
   Photometry and Polarimetry Subdisciplines:
      5.1  Broadband Photometry
      5.2  Narrowband Photometry
      5.3  Polarimetry
      5.4  Stokes Parameters
   Radio Studies Subdisciplines:
      6.1  Hydroxyl Feature at 18 cm
      6.2  Spectral Line
      6.3  Continuum
      6.4  Occultation
      6.5  Radar
   7.0  Spectroscopy and Spectrophotometry Discipline

The data from the Astrometry Discipline and the Amateur Observations Discipline (4 subdisciplines) were deposited one level lower (and off the main level), and data from the Meteor Studies Discipline were placed in a dedicated directory on Volume 23. More details on the overall organization of data on these mixed discs are to be found in Section 3. of the file HALGUIDE.TXT.

Transfer of all IHW datafiles to the IBM3081 involved the creation of, among other things, a series of on-line catalogs which contained some of the more important FITS keywords associated with each file (such as date, time, filenumber, system code, and filename). The catalogs also listed the IBM "partitioned data set" for each file, which gave the location of the file on the IBM "disk farm."
These preliminary steps were crucial to the organization of the data: the catalogs would be the basis for creating a time-ordered stream of (multi-Discipline) datafiles to the pre-mastering workstation.

B. Quality Assurance Programme and Test Mixed Disc

In the process of depositing the data in the Mass Storage System, an initial check was performed on the size of each datafile. It was found that in 101 cases, the file size did not agree with that expected on the basis of the FITS keywords NAXISn and BITPIX. While this was not considered a high failure rate, it did stimulate a broader effort at Quality Assurance. The steps taken, described briefly, were these:

   Across all Disciplines:
      - datafile size agrees with header axis information
        (naxisn x bitpix/8)?
      - duplicate keyword values for FILE-NUM?
      - check for completeness of FITS header, look for Keyword = END

   For the individual Disciplines:
      - consistency of independent variable (naxis1)
      - consistency of additional variable (e.g., naxis2, naxis3, naxis6)
      - consistency of dependent variable (BUNIT) description
      - consistency of DAT-TYPE with INSTRUME
      - consistency of SYSTEM with OBSVTORY
      - consistency of TIME-OBS with other time parameters, e.g., EXPOSURE

A relatively small number of errors were found as a result of these checks, and the procedure was to inform the appropriate Discipline Specialist of the problem (or of our questions) and then take corrective action. In some cases, once the IHW Discipline Specialist community became fully aware of the file-editing ability afforded by the data being on-line at NASA/GSFC, changes were made in data and/or header files at the Discipline Specialist's instigation.
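The first of the checks listed above (datafile size against NAXISn and BITPIX) can be illustrated with a short sketch. This is not the software used at NASA/GSFC, which is not described here; it is a minimal Python illustration that assumes the header file consists of standard 80-byte FITS cards ending in an END card, and that the data file is padded to a whole number of 2880-byte records. The function names are hypothetical. The simple product of axis lengths does not apply to every data structure (the NAXIS = 6 visibility files, for instance, would need separate treatment).

```python
import math

CARD = 80      # length of one FITS header card, in bytes
BLOCK = 2880   # FITS logical record length, in bytes


def parse_cards(header: bytes) -> dict:
    """Collect KEYWORD = value pairs from 80-byte FITS cards, stopping at END."""
    values = {}
    for i in range(0, len(header), CARD):
        card = header[i:i + CARD].decode("ascii", errors="replace")
        key = card[:8].strip()
        if key == "END":
            break
        if card[8:10] == "= ":
            # Drop any trailing "/ comment"; adequate for numeric keywords.
            values[key] = card[10:].split("/")[0].strip()
    return values


def expected_data_bytes(values: dict) -> int:
    """Return NAXIS1 * ... * NAXISn * |BITPIX| / 8 (zero when NAXIS = 0)."""
    naxis = int(values["NAXIS"])
    if naxis == 0:
        return 0
    npix = math.prod(int(values[f"NAXIS{n}"]) for n in range(1, naxis + 1))
    return npix * abs(int(values["BITPIX"])) // 8


def padded(nbytes: int) -> int:
    """Round a byte count up to a whole number of 2880-byte records."""
    return -(-nbytes // BLOCK) * BLOCK
```

A datafile would pass this check when its size on disc equals padded(expected_data_bytes(parse_cards(header))), where header holds the bytes of the matching .HDR file.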
The IHW Team at NASA/GSFC felt that, while its own efforts at Quality Assurance and Disc Concept Review were productive, it was important to have a sample of potential end users of the IHW Archive examine a fraction of the data, deposited on a third test disc in more or less the structure and layout envisioned for the full set of "mixed discs." This final test disc exercise resulted in a disc which spanned the period 1986 February 9--April 15. The test disc was "partial" in the sense that the documentation and indices were quite fragmentary, and PDS labels were not included. Several reviewers found errors in datasets they had submitted to the IHW years before, and corrections were made to those. It is to be noted that this third and last IHW test disc was the first one we generated which utilized FTP electronic file transfer from the Mass Storage System to the pre-mastering workstation. Further comments on the results of this test disc are to be found in the Acknowledgements section.

C. Filenaming Conventions

Filenaming has been described elsewhere in the DOCUMENT directory (cf. Section 5 of HALGUIDE.TXT and FITSFORM.TXT). Briefly here, however, it should be said that filenames have three parts: a 3-5 character string identifying the Discipline/Subdiscipline, a running number which is chronologically ordered within Discipline/Subdiscipline, and a file extension. The number portion of the filename begins at 0001 for Halley observations and at 4001 for calibration object datafiles. A table linking the character codes with the actual subdiscipline names is given in Section 5 of HALGUIDE.TXT.

D. Directory Structure and Size

We have restricted directories to a reasonable number of files while allowing enough information for useful browsing; 256 was adopted as the desired maximum number, which includes datafiles, headers, and PDS labels.
Given the large variation of the temporal density of IHW observations throughout the apparition, the "reasonable N" < 256 criterion resulted in directories widely divergent in duration, as is discussed below.

For the "main data levels" described earlier (containing data from 6 of the professional Disciplines: Infrared, Large-Scale Phenomena, Near Nucleus, Photometry, Radio Science, and Spectroscopy), the naming scheme for the lowest level directories is as follows:

   Y19xx\Myy\Dzz\Haa\NETNfile.ext , where

      xx ranges between 81 and 89
      yy ranges between 01 and 12
      zz ranges between 01 and 31
      aa ranges between 00 and 21, in increments of 03H

For times during the apparition when the density of observations was relatively low, data are placed in directories whose names do not contain the full assortment of time parameters. For example, all observations for 1983 were deposited in one directory (name: Y1983), whereas for 1986 April there were many days which required directories only 3 hours wide (sample directory name: Y1986\M04\D10\H18). The smallest hourly subdivision is, in fact, 3 hours (03, 06, 09, ... hours UT). No subdirectory was created for days on which data were not submitted. Across the entire set of ground-based data discs (Volumes 19-23), the typical file count in a directory is 50, and the average byte count is 1.0 Mbyte.

There are four additional sets of Halley data, which are located elsewhere. Astrometry observations from the 1985-86 apparition, and Amateur Observations, are placed one level below the "main data level."
Generic directories are as follows (all Amateur subdisciplines shown):

   Y19xx\Myy\Dzz\Haa\ASTROM\ASTRfile.ext
   Y19xx\Myy\Dzz\Haa\AMDRAW\AMDRfile.ext
   Y19xx\Myy\Dzz\Haa\AMPHOTO\AMPGfile.ext
   Y19xx\Myy\Dzz\Haa\AMSPECTR\AMSPfile.ext
   Y19xx\Myy\Dzz\Haa\AMVIS\AMVfiles.ext

The other two sets of Halley data are for Meteor Studies, whose data are located in a dedicated directory on Volume 23; and Astrometry, "historical data" from 1835 and 1910 being placed in the AST_HIST directory on all 5 mixed discs.

Finally, it should be noted that some Disciplines submitted supplemental (mostly calibration) data which include filter tables, non-comet images, flat fields, and laboratory spectra. These are in the CALIB or IR_FILTR subdirectories of Volume 23. As mentioned earlier, the numeric portion of all calibration filenames begins at a higher number (4001) than that of the Halley datafiles (0001).

A listing of the entire directory structure of this disc is given in the document CDTREE.TXT although, for brevity, the data directories have been highly abridged and are only meant to be representative. If the Archive user wishes to see the entire data directory structure, he or she should examine the text file DATATREE.TXT.

E. Time Ranges of Discs, Datafile and Directory Counts

The table below shows, for Volumes 19-23, the important entities named in the title of this section. Perhaps the table is as good an indication as any, both of the enormous size of the IHW Archive (by file count as well as Megabytes) and of the high data density afforded by compact discs (> 10,000 datafiles on one disc alone).
                   Basic Properties of Volumes 19-23

                                                        Number of
  Volume   Start              Stop                MB      files    dirs
  ------   ----------------   ----------------   ----   -------   -----
    19     \Y1981\            \Y1985\M12\D08      440    10,902     679
    20     \Y1985\M12\D09\    \Y1986\M02\D09\     410     8,885     452
    21     \Y1986\M02\D10\    \Y1986\M04\D13\     540     8,820     470
    22     \Y1986\M04\D14\    \Y1987\M04\D03\     560     8,416     624
    23     \Y1987\M04\D04\    \Y1989\M04\         440       692      87

The actual number of files will be about 3 times larger because the table does not include the header and PDS label files when digital data are present. Of the 37,715 entries counted in the table, 5,468 are dataless headers (two files each: header and label) and the remaining 32,247 have digital data (three files each: data, header, and label), so a true count of the total number of split data, header, and label files for P/Halley is given by (2*5,468 + 3*32,247) = 107,677. The directory files themselves are not included in the above totals, nor are 59 Meteor Studies datafiles on Volume 23, which are not placed with the other types of IHW data but rather in their own dedicated directory.

By Discipline, the number of datafiles breaks down as follows:

        NUMBER OF FILES FOR EACH IHW DISCIPLINE
        ---------------------------------------
        Discipline             Number

        Astrometry               6477
        Infrared Studies          498
        Large Scale              3383
        Meteor                     59
        Near Nucleus             3523
        Photometry               3436
        Radio Studies            1950
        Amateur                 15150
        Spectroscopy             3368
                                -----
                                37844

The above file counts include 68 calibration files as well as the 1835 and 1910 Astrometry tables (one file each); this explains why 37,844 is not equal to the sum of the individual disc file counts (in the first table) + 59, i.e., 37,715 + 59 = 37,774, which is 70 files short of 37,844. The reader is referred to the file VOLSET.TXT, which gives a breakdown of the number of files contributed by each IHW Discipline on each of the mixed discs.

F. Contents of Supplemental (non-data) Directories

There are four directories (DOCUMENT, EPHEM, INDEX and SOFTWARE) on this disc that contain supplementary files.
The DOCUMENT directory contains text files that give the background to this CD-ROM project, present a general guide to its use, and detail experience with previous CD-ROM products, including a test disc of comet Giacobini-Zinner data (also archived by the IHW) and two test discs of Halley data. A discussion of the FITS and PDS formats and the metadata used specifically for the Halley data is located in the files FITS_IHW.TXT and PDS_IHW.TXT. HALGUIDE.TXT (and IMAGUIDE.TXT on Volumes 1-18) are meant to serve as general overviews of the discs and their contents. Documents in the APPENDIX subdirectory, written by the IHW Discipline Specialists, contain information on data collection, subsequent processing steps, and archiving techniques, at the Discipline level. Although we have tried both to keep the documents to a reasonable number and to minimize duplication of information, we are aware that the number of text files is large and there is some overlap between files. Our attitude has been that the Archive user should not have to hunt endlessly to find information, and that it might therefore be advantageous to have some key pieces of information repeated in several places.

In the INDEX directory, tables of useful information have been indexed in various forms in order to allow automated searching of the data. The QUIK.IDX index contains a selected set of mandatory FITS keywords from all Disciplines. On each of Volumes 19-23, QUIK.IDX includes only the observations on that disc. Volume 23 has an additional, "summary quick index", QUIK_SUM.IDX, which includes all observations contained in Volumes 19-23; the last field in QUIK_SUM.IDX includes the Volume number. A set of tables in the subdirectory NETABLES contains the metadata/data from the proposed printed archive, organized by network and subnetwork and chronologically ordered in each index. In this subdirectory, also, are more complete indices of FITS keywords for five of the IHW Disciplines.
The filenames (Disciplines) are: NETAMATV.IDX (Amateur Observations), NETLARGE.IDX (Large-Scale Phenomena), NETMETR.IDX and NETMETV.IDX (Meteor Studies), NETRADIO.IDX (Radio Science), and NETSPECT.IDX (Spectroscopy and Spectrophotometry). We constructed a separate index called PATHTABL.IDX to specify the full path to each datafile; these are organized by disc, and a summary version is contained on Volume 23. We attempted to make all index tables transportable to relational DBMS by delimiting the tables and providing structure (.STR) and dBASE-compatible (.DBF) files. Further information about IHW indices is contained in the file INDXINFO.TXT. The SOFTWARE directory contains source code and executables for display of imaging and spectral data, interpolation of ephemeris tables, reading of FITS tables, and manipulation of metadata. To be specific, IMDISP.EXE contains various utilities for manipulating visual data on image display devices; IMDISP was originally developed by the Planetary Data System (PDS) at the Jet Propulsion Laboratory (JPL), and has been augmented and improved by them and by outside users. The interpolation software is meant to be used on the EPHEM.TAB file in the EPHEM directory; the algorithm uses values of ephemeris data for 7 consecutive integral days to perform the interpolation. The Fortran source code is called OBSNTERP.FOR, which we have compiled and linked on VAX and PC computers; the resulting executables for VAX/VMS and MS-DOS operating systems are VAXNTERP.EXE and PCNTERP.EXE, respectively. Also provided on these discs is a "FITS Table Browser" called FTB.EXE, which was developed by the Astronomical Data Center (ADC) of the National Space Science Data Center (NSSDC). Several other support programs for manipulating the metadata--FITSUTIL, FITSXTND, FITS2TXT, and TXT2FITS--are also provided. 
The archive user should take note of the fact that on the L-SP compressed image discs (Volumes 1-18), additional source code and executables exist for compression and decompression of the large image files contained on those discs.

5. DATA DESCRIPTIONS

The International Halley Watch agreed early in the project that all data would be submitted from the individual Disciplines to the Lead Center using the FITS format (Wells et al., 1981). When the decision was made to distribute this information on CD-ROM, it was determined that the data had to have even broader accessibility. For this reason the original FITS files, with contiguous headers and data, were split into separate files distinguishable by their filename extensions (.HDR for headers). The file sizes were preserved as multiples of 2880 bytes, allowing the original FITS byte stream to be recovered by concatenating the appropriate header and datafile. PDS labels were constructed to allow definition of the datafiles for the Planetary Data System.

For each datafile there must always be an associated FITS header. In cases where no digital data had been supplied the .HDR file carries information about upper limits, values reported by observers, references gleaned from the literature, or the characteristics of data in analog form. The table below identifies these "dataless" files and provides the correspondence between file extension and types of data so that a concatenated file (.FIT) can be reconstructed.

The convention for naming files on the IHW CD-ROMs was proposed by the Lead Center and NASA/Goddard Space Flight Center (GSFC) personnel to include a unique data qualifier for the data. Specifically, a set of letter codes was established to enable identification of the IHW Discipline/subdiscipline from the filename itself. A CD-ROM running number and file extension complete the filename (example: LSPN0059.IBG).
A short list of this convention by Discipline and subnet (or experiment) is given below:

PDS Object                 FITS      Discipline        Subnet  File Extensions
(description)              NAXIS =                     Code
____________________________________________________________________________
text                       1         Astrometry        ASTR    .dat .hdr .lbl
fits_label (no data)       0         IR Studies        IRSP         .hdr .lbl
table (filter)             0,2           "             IRFT    .tab .hdr .lbl
table (photometry)         0,2           "             IRPH    .tab .hdr .lbl
table (polarimetry)        0,2           "             IRPOL   .tab .hdr .lbl
spectrum (filter)          2             "             IRFC    .dat .hdr .lbl
spectrum                   2             "             IRSP    .dat .hdr .lbl
image                      2             "             IRIM    .img .hdr .lbl
fits_label (no data)       0         Large Scale Phen  LSPN         .hdr .lbl
image(browse)              2             "             LSPN    .ibg .hdr .lbl
image                      2         Near Nucleus      NNSN    .img .hdr .lbl
table (narrow band)        0,2       Photometry Polar  PFLX    .tab .hdr .lbl
table (broad band)         0,2           "             PMAG    .tab .hdr .lbl
table (polarization)       0,2           "             PPOL    .tab .hdr .lbl
table (Stokes parameters)  0,2           "             PSTOKE  .tab .hdr .lbl
fits_label (no data)       0         Radio Studies     RSCN         .hdr .lbl
fits_label (no data)       0             "             RSSL         .hdr .lbl
spectrum                   1             "             RSSL    .dat .hdr .lbl
spectrum (multiple)        1 or 2        "             RSOH    .dat .hdr .lbl
spectrum (multiple)        2             "             RSRDR   .dat .hdr .lbl
image                      2             "             RSCN    .img .hdr .lbl
image(multiple)            3             "             RSOC    .img .hdr .lbl
spectrum (visibility)      6             "             RSCN    .dat .hdr .lbl
spectrum (visibility)      6             "             RSOH    .dat .hdr .lbl
spectrum (visibility)      6             "             RSSL    .dat .hdr .lbl
spectrum                   1         Spectroscopy      SPEC    .dat .hdr .lbl
spectral image qube        2             "             SPEC    .dat .hdr .lbl
image (spectrum)           2             "             SPEC    .img .hdr .lbl
fits_label (no data)       0         Amateur Studies   AMDR         .hdr .lbl
fits_label (no data)       0             "             AMPG         .hdr .lbl
fits_label (no data)       0             "             AMSP         .hdr .lbl
table (magnitude)          0,2           "             AMV     .tab .hdr .lbl
table (radar)              0,2       Meteor Studies    MSNRDR  .tab .hdr .lbl
table (visual)             0,2           "             MSNVIS  .tab .hdr .lbl
____________________________________________________________________________

A table linking the letter codes above and the subdiscipline names is given in Section 5 ('Filenaming Conventions') of the file HALGUIDE.TXT.
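As an illustration of the convention tabulated above, the following Python sketch splits a filename such as LSPN0059.IBG into its letter code, running number, and extension. It is not part of the software distributed on the discs. The regular expression, the allowance for a 4- or 5-digit running number, and the names IHWName and parse_ihw_filename are assumptions made for this example. The calibration test relies on the statement in the next paragraph that calibration files for IRIM, IRSP, LSPN, and SPEC begin at 4001; it is deliberately restricted to those four codes, since running numbers in other subnets may exceed 4001 for ordinary Halley data.

```python
import re
from dataclasses import dataclass

# Letter code, running number, three-letter extension (assumed layout).
PATTERN = re.compile(r"^([A-Z]+)(\d{4,5})\.([A-Z]{3})$", re.IGNORECASE)

# Codes whose calibration files are stated to begin at 4001.
CALIBRATION_CODES = {"IRIM", "IRSP", "LSPN", "SPEC"}


@dataclass(frozen=True)
class IHWName:
    code: str             # Discipline/subnet letter code, e.g. "LSPN"
    number: int           # chronological running number
    extension: str        # e.g. "IBG", "HDR", "LBL"
    is_calibration: bool  # True only for the four codes listed above


def parse_ihw_filename(name: str) -> IHWName:
    """Split an IHW filename into code, running number, and extension."""
    match = PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not an IHW-style filename: {name!r}")
    code = match.group(1).upper()
    number = int(match.group(2))
    extension = match.group(3).upper()
    is_cal = code in CALIBRATION_CODES and number >= 4001
    return IHWName(code, number, extension, is_cal)
```

For example, parse_ihw_filename("LSPN0059.IBG") yields the code LSPN, the number 59, and the extension IBG, with the calibration flag unset.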
Concerning the numeric portion of filenames, calibration files for IRIM, IRSP, LSPN, and SPEC begin at 4001, whereas the Halley data themselves for all disciplines and subdisciplines start at 0001.

The above listing corresponds to unique datasets (called subnets by the IHW) except in two cases: visibility data in the Radio Science Network, and both Meteor Studies Subnetworks. In the former instance, interferometric data were submitted that cover three Radio subnets (OH, continuum, spectral line) but actually correspond to one type of data called "UV Visibility." In the PDS formulation, these are grouped as "UV" observations. In the second case, the Meteor Studies Subnetworks (radar, visual) actually record the data on an event (meteor shower) related to Comet Halley but do not directly observe the comet. Each meteor stream is identified in the PDS formulation.

The file extensions follow suggestions by the Planetary Data System (SPIDS v1.1; Martin et al., 1988) for tabular and image data. In addition, for IHW FITS, the original headers and data were split into separate files, with filename extensions as listed below.

   .DAT - other non-image and non-tabular data
   .FIT - original FITS file
   .HDR - FITS header records
   .IBG - data records for subsampled browse image
   .IMG - image data records
   .LBL - detached PDS stream format
   .TAB - table data records as ASCII

There are five PDS objects in this archive: FITS_LABEL (header), IMAGE, TABLE, TEXT, and SPECTRUM; a LABEL occurs for each datafile. Files that remain in the original FITS form (extension=.FIT) do not have a PDS label. On Volumes 19-23, the only .FIT files are in the \DOCUMENT\APPENDIX\SOL_ATLS directory (3 files).

These PDS labels are metadata (as headers describing data submitted to the archive). There has been no effort to duplicate the documentation contained in the full FITS headers because the PDS label and the FITS header for a given datafile share the same filename and differ only in the filename extension.
Instead we have attempted to use the power of the PDS label syntax to fully describe the data structures and thus gain access to software by that group. "Standards for the Preparation and Interchange of Data Sets", document version 1.1 (by Martin, T. Z., et al., Document D-4683, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA), was the primary reference to the Object Description Language (ODL) necessary to create the PDS labels. (We acknowledge R. Borgen and M. Martin, PDS-CN, JPL, for assisting the IHW through version 2.0 of the ODL implementation for SPECTRUM.) The basic PDS descriptors such as SFDU_LABEL, RECORD_TYPE, RECORD_BYTES, and FILE_RECORDS are explained in the SPIDS document. The RECORD_TYPE for all data files is FIXED_LENGTH. The PDS labels have been formed as fixed length (78 byte) records plus an embedded CR and LF.

A. PDS Data Objects used in the IHW Archive

FITS_LABEL
----------
We have conformed with the PDS definition of a specific keyword to indicate the presence of a FITS header (the keyword TYPE = FITS) when the "data" object is a foreign label (FITS_LABEL). In FITS, if NAXIS=0, then no data records need follow, as in the case of an upper limit. The "dataless" headers can be recognized by the NAXIS value or the IHW keyword DAT-FORM = NODATA in the FITS header. The PDS label (with the same filename but a different extension) points at the "header" file as its data object. As shown in the above table, the "dataless" header can occur for different types of data: images (LSPNNNN or IRIMNNNN), spectra which can be ordered groups (IRSPNNNN) or standard (RSSLNNNN), and data which exist but are not present on the discs, as for AMDRNNNN, AMPGNNNN, and AMSPNNNN.

IMAGE
-----
In the case of images, we have included a new keyword describing the byte ordering of the data (MSB_INTEGER) required by FITS.
In PDS, images (.IMG, .IMQ, .IBG) are described in terms of LINES (FITS keyword NAXIS2) and SAMPLES (FITS keyword NAXIS1), given knowledge of the SAMPLE_BYTES (FITS keyword BITPIX); this mapping is straightforward for the split files. The final form of the label for compressed images under v2.0 is under discussion. Unlike previous PDS efforts with compressed images, we chose not to compress the header (or label) and thus have included a keyword to describe the type of compression (ENCODING_TYPE = "PREVIOUS_PIXEL") used. The label for compressed images also contains information to permit software to skip over the data if the decoding algorithm is unknown (ITEMS, ITEM_TYPE, and ITEM_BITS). We use ODL to indicate various subclass structures for the data objects. An example of this is the DIFFERENCE modifier applied to IMAGE, yielding the keyword DIFFERENCE_IMAGE, which indicates that a processing step was applied to the original image.

TABLE
-----
In creating the TABLE descriptions we have found a good correspondence between the FITS and PDS syntax. For tables, NAXIS2 = ROWS, TFIELDS = COLUMNS, and NAXIS1 = ROW_BYTES; in both cases, the default FORMAT is ASCII. We have attempted to describe the values in each column as a direct translation of the FITS header file; the data themselves follow the FITS record format, i.e., ASCII characters with no delimiters and padded to multiples of 2880 bytes. The FITS data structures are currently supported by public domain software that will be distributed with the Archive.

TEXT
----
The TEXT object (which is used for the Astrometry Discipline's data with extension .DAT) is an 80-byte fixed length record that contains only ASCII values. In the FITS formulation, the 80-byte records are strung together, typically as 4 or 5 "card" images with no delimiters and padded to fill the 2880 byte record structure. It can be recognized in the FITS formulation by the NAXIS=1 statement, which indicates that a byte stream follows, usually carrying a "text" description.
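Because both the TABLE and TEXT objects described above are fixed-length ASCII records with no delimiters, padded to a multiple of 2880 bytes, the same slicing logic can recover their rows. The Python sketch below is an illustration only and is not taken from the Archive software; the function name split_records is hypothetical. For a table, row_bytes and rows would come from NAXIS1 and NAXIS2 (ROW_BYTES and ROWS in the PDS label). For a TEXT object, row_bytes would be 80, and taking the record count as NAXIS1 divided by 80 is an assumption rather than something stated in this document.

```python
def split_records(data, row_bytes, rows):
    """Slice an undelimited ASCII byte stream into fixed-length rows.

    Bytes beyond row_bytes * rows are treated as padding and ignored.
    """
    if row_bytes <= 0 or rows < 0:
        raise ValueError("row_bytes must be positive and rows non-negative")
    needed = row_bytes * rows
    if len(data) < needed:
        raise ValueError(
            f"stream holds {len(data)} bytes but {needed} are required"
        )
    return [
        data[start:start + row_bytes].decode("ascii")
        for start in range(0, needed, row_bytes)
    ]
```

Individual columns can then be cut from each row with the start byte and width given for that column in the FITS header or PDS label.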
SPECTRUM
--------
The SPECTRUM class description was refined in v2.0 by working closely with the PDS group to ensure definition of data groups that included both uniformly spaced data (as a single array) as well as ordered groups of observations. From guidelines for dealing with the SPECTRUM data structure, we consider the spectra as tabular data (COLUMN, NAME, DATA_TYPE, START_BYTE, BYTES) which are binary. The independent variable (e.g., WAVELENGTH) is described by the keywords SAMPLING_PARAMETER_NAME, MINIMUM_SAMPLING_PARAMETER, SAMPLING_PARAMETER_INTERVAL, and SAMPLING_PARAMETER_UNIT. (There are special cases for Radio or IR data using Doppler VELOCITY, FREQUENCY, or FREQUENCY_OFFSET.) Another case is a table from the Infrared Studies Network of ordered sets of data, in which we interpreted the column of signal/noise or ratios as an associated ERROR. A NOTE about this nonstandard use is included in the labels for the appropriate datasets. We have also attempted to use the NOTE keyword to identify the contributing IHW discipline, subnet, and generic comments about the data. As in the situation for multiple images, we have subclasses for the spectra indicated by a modifier, e.g., LHC_POLARIZATION_SPECTRUM.

A special effort was made to describe 2-dimensional spectra by working with the PDS to establish a SPECTRAL_IMAGE_QUBE object. The data are reduced measurements that have the slit oriented either along the tail or perpendicular to the tail of the comet. To capture the positional information, a vectorial notation was adopted for the SPECTRAL_IMAGE_QUBE that could allow for such observational selection. In cases where the derived units were non-standard, a text DESCRIPTION is embedded in the label.

There was one case of a binary table that was used to describe the UVFITS data.
A hybrid description (VISIBILITY_SPECTRUM), incorporating both the ordered sets and uniformly spaced data, describes this intermediate processing step in data reduction; the integer values are Complex numbers, not currently supported under PDS.

6. SOME TECHNIQUES IN THE USE OF VOLUMES 19-23

There are multiple access routes to the data on these discs, but perhaps the best approach in answering scientific questions is to:

   o search the various indices, some of which contain a large fraction of the total set of FITS keywords for a given IHW Discipline,
   o generate a list of filenames satisfying the search criteria,
   o browse the data using software such as IMDISP (which might eliminate some files from further interest), and
   o perform subsequent data analysis as needed using sophisticated astronomical data packages such as IRAF, AIPS, MIDAS, etc.

Although it was not in the IHW "charter" to provide a complete set of software for manipulating the data on these compact discs, some software has nonetheless been provided which will be of use to the Archive user. On the Large-Scale Phenomena (L-SP) compressed image discs (Volumes 1-18), for example, a code called PCDECLSP (written in C) has been provided which will decompress the images on MS-DOS machines (PCs), writing an abridged FITS header and the decompressed image to a full FITS file. An important general point is that the software provided on these discs was written for a DOS environment since that is the operating system of the pre-mastering workstation.

On Volumes 19-23, which include this disc, the IMDISP package (v. 7.7, executable and documentation) will allow users working with PCs to display and manipulate some types of IHW data. It is envisioned that use of IMDISP will perform the "browse" function in the bulleted activities above. Many IHW datafiles are in the form of table (.TAB) files, which have been split from their headers (.HDR).
Concatenation of headers and tables into full FITS files using the provided utility FITSUTIL will allow the use of a FITS table software package called FTB (for "FITS Table Browser"), which has also been included on these discs.

These discs are replete with a wide assortment of index (.IDX) files, all of which are cast in the form of delimited tables. The reader is referred to the file INDXINFO.TXT in the INDEX directory, and to NETINFO.TXT in the INDEX\NETABLES directory, for further details on the types of indices provided. Each index is provided in a delimited form for import to DataBase Management Systems (DBMS) such as dBase; both dBIII+ and dBIV were, in fact, used to check and verify the various index files. We attempted to maximize the use of the index tables by including an associated FITS header (.HDR) for each which, in effect, records where the delimiters are in the .IDX byte stream. This feature of the headers allows the use of the FTB package on the concatenated .HDR + (delimited) .IDX file; note, however, that the .IDX file must first be padded before concatenation is performed.

Experience with dBase packages indicates that the following steps would be generally useful in an index search:

   1. First, set up for the index you wish to use. At this point, view the structure so that you can see the column headings.

   2. Get the provided .IDX file and place it into the provided "dummy" dBase file of the same name (.DBF).

   3. Use of the List (or equivalent) command gives columns of interest for conditions that depend on the range and type of search. The typical Sort function will permit choosing ranges by Boolean conditions.

There are similar commands for other database packages once the data have been imported using the structure file (or FITS header or PDS label) to correctly recognize the fixed width format of the fields.
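For users without dBase, the same kind of index search can be sketched in a general-purpose language. The Python fragment below is only an illustration under stated assumptions: it treats the index as comma-delimited text with quoted strings, and the column names (FILENAME, DATE_OBS, SYSTEM), the date format, and the sample rows are all hypothetical. The real column layout should be taken from the .HDR or .LBL file that accompanies each .IDX file.

```python
import csv
import io

# Hypothetical column names; take the real ones from the index's .HDR/.LBL.
COLUMNS = ["FILENAME", "DATE_OBS", "SYSTEM"]

# Stand-in for the contents of an .IDX file (made-up rows, comma-delimited).
SAMPLE_IDX = (
    '"LSPN0001","1986-03-10","30000001"\n'
    '"LSPN0002","1986-03-12","30000002"\n'
    '"LSPN4001","1986-03-12","30000002"\n'
)


def search_index(stream, first_date, last_date):
    """Return filenames whose DATE_OBS lies in [first_date, last_date]."""
    reader = csv.DictReader(stream, fieldnames=COLUMNS)
    # The sample dates are written year-first, so they compare
    # correctly as plain strings (an assumption of this sketch).
    return [row["FILENAME"] for row in reader
            if first_date <= row["DATE_OBS"] <= last_date]


# The resulting list plays the role of the "report" of filenames.
report = search_index(io.StringIO(SAMPLE_IDX), "1986-03-11", "1986-03-12")
print(report)
```

With a real index, the in-memory sample would be replaced by an opened file, e.g. open(path, newline=""), and the date comparison adapted to whatever date format the index actually uses.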
Once the search has been performed and a "report" is generated containing a list of filenames (with extensions such as .DAT, .TAB, .IMG, .IBG), it is time to choose the manipulation program. The simplest case is for 8- and 16-bit 2-d data (.IMG, .IBG), for which IMDISP is fully implemented both in the FITS and PDS options (see IMDISP.DOC file). You should use the Browse set of commands if the data are appropriate, i.e., can be quickly displayed. For the L-SP Discipline, a set of subsampled images has been constructed with this purpose in mind. This command can be used on 2-d data, but files larger than the L-SP browse images (maximum 256x256, 1 byte) will cost in display time (typically 20 s for a 512x512, 1 byte image). There are also Radio Science data that contain multiple images which can simply be displayed by pointing at the PDS label. In all cases, the display time is dramatically improved if the PDS label is used rather than the FITS option, which requires the original (embedded) data structure. If the datafile has the extension .DAT, then the choice within IMDISP can be to display the data using "plot", which is described in the on-line help. The detached PDS objects may read spectral_image_qube (a 2-d spectrum), spectrum (ordered values or multiple column matrix), and spectrum with qualifier (LHP_POLARIZATION_SPECTRUM). The FITS option requires that the data be in the original form, i.e., embedded headers with data. Consequently, FITSUTIL was developed to concatenate files in an MS-DOS environment (as well as split data for assembling the compact disc). Therefore at this point, branch to FITSUTIL and work on those files you chose to save on hard disk. The program (FITSUTIL) does not overwrite data, so you can check for consistency; however, this means that storage space will quickly get used as file copies appear. First use a full scale plot. 
Since the normal data range can be very large, a zoom query is allowed to pick a restricted range; use the normal cursor commands in IMDISP. Multiple spectra can also be plotted.

All descriptive "textual" information can be displayed on the screen using various commands within IMDISP like LABEL. For example, use of the LABEL command on an ASCII file such as a header (.HDR) would simply type the text to the screen. Although the Astrometry data are identified by the PDS object TEXT, it is a continuous byte stream (as demanded by FITS) without the normal end of line characters (carriage return, line feed, or both). These data could be broken out as 80-byte chunks (card images) for further use, if desired.

The FITS data loaded in the extended "tables" format can be accessed in a manner similar to the Astrometry data. FTB (for "FITS Table Browser"), one of the software modules supplied on this disc and discussed earlier, was written by the Astronomical Data Center at NASA/GSFC. FTB parses the table file (.TAB) "byte stream" of ASCII characters into a "tabular display." To run this program, however, the file must be concatenated to the original form, i.e., .HDR and .TAB file must be combined into a .FIT file while preserving the 2880-byte record format.

A basic flow pattern for searching for interesting data, then browsing them, is given in the following diagram:

       Choose data for Display
                 |
       Search Index by DBMS:            <---- Inspection
          - Create Structure
          - Load *.IDX via operating system
          - Isolate fields
          - Save results in "report"
                 |
       check system setup
       choose display function
                 |
       FITSUTIL (.IMG,.IBG,.DAT,.HDR,.TAB)
          concatenate, split
                 |
       IMDISP (.FIT,.HDR,.IMG,.IBG,.DAT,.LBL)
          Display, Browse, Plot
                 |
       .HDR + .TAB = .FIT  ---->  FTB (.FIT)
                                     List, Edit


7.
SUGGESTED REFERENCES

King, J.H. and Grayzeck, E.J. "Minutes of the CD-ROM Workshop", June 19-20, 1989, NSSDC 89-11.

Martin, T., Martin, M., Braun, M., Johnson, T., Davis, R., and Mehlman, R., SPIDS v1.1: Standards for the Preparation and Interchange of Data Sets, JPL D-4683: October 3, 1988.


E. Grayzeck, Jr.                    D. Klinglesmith III
Small Bodies Node of                IHW Large-Scale Phenomena
Planetary Data System               Discipline
Astronomy Program                   Laboratory for Astronomy
Dept of Physics and Astronomy       and Solar Physics
University of Maryland              NASA/GSFC, Code 684
College Park, MD 20742              Greenbelt, MD 20771

M. B. Niedner, Jr.
IHW Discipline Specialist for Large-Scale Phenomena
Laboratory for Astronomy and Solar Physics
NASA/GSFC, Code 684
Greenbelt, MD 20771


SUPPLEMENTAL INFORMATION

I. EPHEMERIS

The geocentric ephemeris for 0h UT each day has been calculated by the Astrometry Network from the following set of osculating orbital elements (Astrometry Network orbit no. 61). The orbital solution was fit to 7469 astrometric observations over the interval from 1835 August 21 to 1989 January 9 with a weighted rms residual = 1.2 arcsec. Full planetary and nongravitational perturbations have been taken into account at each time step in the ephemeris computations. The angular elements are referred to the ecliptic plane and the equinox of 1950.

   Epoch of Osculation            1986 Feb. 19.0 TDT (ET)
   Time of Perihelion Passage     1986 Feb. 9.45895 TDT (ET)
   Perihelion Distance            0.5871036 AU
   Eccentricity                   0.9672769
   Argument of Perihelion         111.84656 deg.
   Longitude of Ascending Node    58.14339 deg.
   Inclination                    162.23925 deg.

Nongravitational Parameters and center-of-light/center-of-mass offset:

   Radial component, A1           +3.883 E-10 AU/(day)**2
   Transverse component, A2       +1.554 E-10 AU/(day)**2
   So (see explanation below)     851 km

The nongravitational acceleration model (Style II) is described in the following reference:

   Marsden, B.G., Sekanina, Z., and Yeomans, D.K. Comets and nongravitational forces. V. In Astronomical journal, v. 78, 1973, p. 211 - 225.

Because of rather systematic trends in comet Halley's orbit residuals during March - April 1986, it was necessary to model an observation bias to obtain solutions that fit the observations to the level of the data noise itself. However, it is not entirely clear whether the effect is instrumental or an actual displacement of the comet's photometric center from its center of mass. The comet's center of mass was assumed to be offset a distance (S) radially toward the Sun from the observed center of light. This measurement bias, S, varies as the inverse square of the heliocentric distance (r) and the expression was normalized to a heliocentric distance of one AU (i.e., at r = 1 AU, S = So).

   S = So / r**2

This measurement bias was assumed operative during all three apparitions included in the orbit solution. The value of the parameter So resulting from solution No. 61 is 851 km.

The following osculating orbital elements are consistent with orbit No. 61 for comet Halley. Using these orbital elements and the export version of the Astrometry Network's Two-Body Ephemeris Generation program, users can generate their own ephemeris information. If care is taken to use the set of orbital elements with the epoch of osculation closest to the desired ephemeris dates, the Two-Body program can generate ephemeris information that is equivalent to corresponding information in the perturbed ephemeris (to approximately the one arc second level of accuracy). Each set of orbital elements is in the same order as the elements listed above - the only differences being that the epochs of osculation and dates of perihelion passage time are given as Julian dates rather than calendar dates. The second line of each element set contains the calendar date corresponding to the epoch directly above it on the first line.
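As a rough illustration of two points made above (the inverse-square measurement bias, and the advice to pick the element set whose epoch of osculation lies nearest the date of interest), a short Python sketch follows. The function names are ours and purely illustrative; the value of So is the one quoted for solution No. 61, and the four epochs are copied from the element table that follows. This is not the Astrometry Network's Two-Body program, only a sketch of the selection step and of the formula S = So / r**2.

```python
from typing import Sequence

# So for orbit solution No. 61, in km, normalized to r = 1 AU.
S0_KM = 851.0


def light_center_offset_km(r_au: float, s0_km: float = S0_KM) -> float:
    """Measurement bias S = So / r**2 (km) at heliocentric distance r (AU)."""
    if r_au <= 0:
        raise ValueError("heliocentric distance must be positive")
    return s0_km / r_au ** 2


def closest_epoch(target_jd: float, epochs_jd: Sequence[float]) -> float:
    """Return the epoch of osculation (Julian date) nearest to target_jd."""
    return min(epochs_jd, key=lambda jd: abs(jd - target_jd))


# A few epochs of osculation taken from the two-body element table.
EPOCHS = [2446375.5, 2446420.5, 2446515.5, 2446625.5]

print(light_center_offset_km(1.0))        # equals So by construction
print(light_center_offset_km(2.0))        # one quarter of So
print(closest_epoch(2446470.5, EPOCHS))   # nearest tabulated epoch
```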
                 *** P/HALLEY TWO-BODY ELEMENTS ***

2445200.5  2446470.32863  0.5852278  0.9675859  111.82385  58.10886  162.25637
1982 AUG 19.0
2445310.5  2446470.45296  0.5858829  0.9675453  111.80417  58.10083  162.25872
1982 DEC  7.0
2445430.5  2446470.57072  0.5864306  0.9675064  111.79191  58.09832  162.25950
1983 APR  6.0
2445540.5  2446470.69050  0.5869451  0.9674637  111.78220  58.09763  162.25970
1983 JUL 25.0
2445680.5  2446470.79138  0.5872224  0.9674243  111.78673  58.10574  162.25698
1983 DEC 12.0
2445840.5  2446470.88815  0.5874794  0.9673746  111.79348  58.11507  162.25353
1984 MAY 20.0
2445990.5  2446470.94022  0.5874862  0.9673322  111.80837  58.12618  162.24895
1984 OCT 17.0
2446070.5  2446470.95080  0.5873858  0.9673142  111.82000  58.13272  162.24593
1985 JAN  5.0
2446190.5  2446470.96022  0.5872995  0.9672880  111.83062  58.13796  162.24307
1985 MAY  5.0
2446185.5  2446470.95983  0.5873038  0.9672895  111.83011  58.13774  162.24321
1985 APR 30.0
2446275.5  2446470.96216  0.5871911  0.9672652  111.84044  58.14134  162.24063
1985 JUL 29.0
2446330.5  2446470.96064  0.5871307  0.9672605  111.84482  58.14232  162.23958
1985 SEP 22.0
2446375.5  2446470.95982  0.5871094  0.9672624  111.84616  58.14247  162.23925
1985 NOV  6.0
2446420.5  2446470.95925  0.5871015  0.9672710  111.84644  58.14247  162.23920
1985 DEC 21.0
2446515.5  2446470.95901  0.5871055  0.9672780  111.84688  58.14343  162.23928
1986 MAR 26.0
2446625.5  2446470.95965  0.5871410  0.9672928  111.85290  58.14647  162.24019
1986 JUL 14.0
2446730.5  2446470.96823  0.5871630  0.9673312  111.86639  58.15668  162.24171
1986 OCT 27.0
2446820.5  2446470.98321  0.5870762  0.9673555  111.87354  58.16592  162.24268
1987 JAN 25.0
2446935.5  2446471.01007  0.5869611  0.9673842  111.88703  58.18176  162.24389
1987 MAY 20.0
2447040.5  2446471.05245  0.5866979  0.9674136  111.89612  58.19769  162.24481
1987 SEP  2.0
2447145.5  2446471.10491  0.5863577  0.9674421  111.90371  58.21398  162.24552
1987 DEC 16.0
2447220.5  2446471.15008  0.5859881  0.9674632  111.90348  58.22335  162.24584
1988 FEB 29.0
2447325.5  2446471.19641  0.5855362  0.9674834  111.89906  58.23058  162.24603
1988 JUN 13.0
2447435.5  2446471.27010  0.5849150  0.9675144  111.89678  58.24318  162.24627
1988 OCT  1.0
2447525.5  2446471.30900  0.5843262  0.9675342  111.88097  58.24162  162.24626
1988 DEC 30.0
2447640.5  2446471.34163  0.5836317  0.9675556  111.86132  58.23841  162.24624
1989 APR 24.0
2447765.5  2446471.33194  0.5829334  0.9675705  111.83024  58.22380  162.24622
1989 AUG 27.0


D.K. Yeomans
Discipline Specialist for Astrometry
Jet Propulsion Laboratory
4800 Oak Grove Dr.
Pasadena, CA 91109


II. SUBSAMPLED BROWSE IMAGES FOR THE LARGE-SCALE PHENOMENA DISCIPLINE

Effective use of the 1,612 images contained in the IHW/Large-Scale Phenomena (L-SP) compressed image CD-ROMs requires that the user of the discs be able to "browse through the data" quickly to find those images and intervals which are of high scientific interest. Because of the long decompression and transfer times of the full-resolution images with current image display hardware, the goal of efficient browsing of the data can be met only if the images are placed on the discs at least a second time, in either subsampled or filtered form, and uncompressed.

The browse images are actually stored in three places within the total set of IHW CD-ROMs. In addition to the subset of images stored in the BROWSE directory of each IHW/L-SP compressed image CD-ROM (HAL_0001 - HAL_0018), the entire set of 1,612 digital images exists on the last of the IHW/L-SP dedicated discs (HAL_0018) in the "volume subdirectories" of the BROWSE subdirectory of SUMMARY (sample path is SUMMARY\BROWSE\HAL_0006). The browse images are also interleaved with data from the other IHW disciplines in the daily data subdirectories on these "mixed data" CD-ROMs (HAL_0019 - HAL_0023).

A "browsed image" is one that has been generated from the original uncompressed image. It has been subsampled and is no larger than 256 pixels in either dimension.
In addition, the digital data have been scaled into a numerical range of 0 to 255 (one byte per pixel; the precision for most of the original images is 10 bits, requiring two bytes per pixel).

The BROWSE directory of each L-SP compressed image CD-ROM contains datafiles, FITS headers, and PDS labels for the compressed images on that CD-ROM; this includes both images of Comet Halley and of calibration objects. Note that the 1,612 total L-SP digital images (1,439 of the comet, 173 of calibration objects) are deposited on multiple CD-ROMs "dedicated" to the L-SP imagery.

The browse data were obtained by taking every "n"th row and column of the original image, starting at row "n/2" and column "n/2". The value for "n" was determined from the larger of the two axes such that the quantity (original length / n) was less than or equal to 256. For the images which were digitized at GSFC, the original densitometer values ranged between 0 and 1023. The density values in the images were divided by 4 in order to convert the density to a single byte. For those images digitized elsewhere, the density scaling factor was chosen so that the density in the browse image was less than or equal to 255.

The FITS header records for the browse images have had their astrometric information adjusted to reflect the change both in pixel spacing and image origin. Thus, should the user wish, (crude) astrometry can be performed with the browse images. In addition, HISTORY keywords have been inserted to document the linear scale and density scale changes. The creation of the browse images was accomplished using the program MIDGET, which can be found as MIDGET.FOR in the SOFTWARE directory of this CD-ROM.

The filename extensions for the files of the browse data (.IBG = image, .HDR = header, .LBL = PDS label) follow the IHW filename conventions. To reconstruct the original FITS byte stream, the .HDR and .IBG files for the appropriate observation should be concatenated.


D. Klinglesmith III                          M. B. Niedner, Jr.
IHW Large-Scale Phenomena Network (LSPN)     IHW Discipline Specialist for
Laboratory for Astronomy                     Large-Scale Phenomena
and Solar Physics                            Laboratory for Astronomy
NASA/GSFC Code 684                           and Solar Physics
Greenbelt, MD 20771                          NASA/GSFC, Code 684
                                             Greenbelt, MD 20771


III. CALIBRATION DATA FOR THREE IHW DISCIPLINES

Supplemental (calibration) data from three IHW Disciplines were submitted to the IHW Archive as described in the subsections below. The calibration datafiles have been placed twice on these "mixed discs" (Volumes 19-23): once in the daily data subdirectories of the 5 discs, and once in a dedicated CALIB directory on Volume 23. Also in the CALIB directory are flat ASCII tables listing various parameters for the calibration datafiles, such as date, time, system code, and the Halley datafile(s) with which the specific calibration file is associated. These tables are also to be found on Volume 23 in delimited index form, in the INDEX directory.

The calibration datafiles themselves are easily distinguished from P/Halley data by the numerical portion of the filename: it is > 4000.

Finally, the L-SP Discipline calibration files resident on these mixed discs are, like the P/Halley images, subsampled browse images. The full calibration datafiles for L-SP are contained on the compressed image discs (Volumes 1-18) in compressed form, in the CALIB directories of those discs. For the most part, the write-up on the L-SP calibration data below (section B.) describes the situation on the L-SP compressed image discs.

--------------

A. INFRARED STUDIES DISCIPLINE

The columns of the calibration table and associated index are as follows:

   (1) Calibration filename, in ascending order, first for Infrared images (filenames of calibration data = IRIM4*.*), followed by Infrared spectra (filenames of calibration data = IRSP4*.*).
   (2) System Code of observatory/instrument/location combination; refer to OBSCODES.TXT in the DOCUMENT directory for a listing of Observatories, etc., or to the IRS_OBSR.TXT file in the DOCUMENT/OBSERVER directory.

   (3) Calibration Object. Names of sky objects are self-explanatory.

   (4) Date is in UT for the date of the calibration file.

   (5) Time is in UT day fraction of the middle of observation of the calibration file.

   (6) NAXIS1 specifies the number of values (e.g. pixels or columns) along the most rapidly varying axis.

   (7) NAXIS2 specifies the number of values (e.g. pixels or rows) along the second-most rapidly varying axis.

   (8) Associated Halley Filename is the filename of the Halley file for which the calibration was made. There is not a one-to-one correspondence, as some calibration files apply to several Halley files and vice versa.


B. LARGE-SCALE PHENOMENA DISCIPLINE

Most of the plates and films of P/Halley submitted by LSPN observers to the Discipline Specialist were uncalibrated. However, those which were calibrated were calibrated in a variety of ways, and this is reflected in our differing treatment of those data.

Some observers provided calibration as sensitometer spots or step wedges on photographic plates distinct from the Halley plates they calibrate. As a rule we did not digitize these due to extreme demands on microdensitometer time, concerns about differing background density levels between the calibration and associated Halley plates, and microdensitometer zero-level drift between scans. However, in all such cases the existence of calibration has been noted in the FITS header of the Halley image in the keyword CALAVL (=T), as well as in the index table NETLARGE.IDX.

Other calibrated plates had sensitometer spots and strips on the same plate as the comet, and these calibration data were regularly digitized by the L-SP Team.
Most of the time, due to the (large) plate size or the location of calibration data on the plate, the calibration area was scanned separately from the comet; every effort was made to minimize microdensitometer drift and elapsed time between scans of the calibration and the comet. On the compact discs, such calibration scans have been placed in the CALIB directory, in separate files named LSPN4*.*, with the FITS keyword OBJECT='CALIBRAT' in the header (.HDR) file. The calibration tables and indices serve to link the calibration datafiles with the associated P/Halley image files. In some cases the calibration area on the plate was physically close enough to the comet to include it in the Halley scan. In these situations there is only one datafile and it follows the naming convention for Halley data (filename=LSPNnnnn.* with nnnn<4000). There are 46 occurrences of this type and they are listed in the section following Table 3 of the file CALIB.TXT on the compressed image discs (or CALIB_LS.TXT in the CALIB/LSPN directory on Volume 23). Finally, a small number of observers provided comet and calibration data to the L-SP Team already in digital form, and in separate files. In these situations the calibrating object is either spots and wedges on plates (OBJECT='PHOTOMET') or a sky object (e.g., OBJECT='NGC1817'). We have deposited these calibration data on compact disc as separate files (filename=LSPN4*.*), and hence the L-SP calibration tables and indices should be used to link them with Halley data. For all calibration data in the L-SP portion of the archive, the known information about intensity ratios for step wedges, exposure times, etc., is contained in the FITS headers associated with the calibration datafiles. The columns of the L-SP calibration indices are described below: (1) Calibration filename, in ascending order; compressed calibration data reside in the CALIB subdirectory. 
Uncompressed, sometimes subsampled calibration data reside in the BROWSE subdirectory along with subsampled Halley imagery (filenames of calibration data = LSPN4*.*).

   (2) System Code of observatory/instrument/location combination; refer to LSPNOBS.TXT in the DOCUMENT subdirectory for a listing of Observatories, Instruments, and System Codes, or to the delimited index LSPNOBS.IDX in the INDEX subdirectory.

   (3) Calibration Object. "CALIBRAT" and "PHOTOMET" refer to sensitometer spots or wedges on the Halley plates. Names of sky objects are self-explanatory.

   (4) Date is in UT for the date of the calibration plate.

   (5) Time is in UT day fraction of mid-exposure of the calibration plate.

   (6) NAXIS1 is the number of samples per line for the uncompressed calibration image.

   (7) NAXIS2 is the number of lines in the uncompressed calibration image.

   (8) Associated Halley Filename is the filename of the Halley image for which the calibration was made. There is not a one-to-one correspondence as some large Halley plates with calibration were scanned (by the L-SP Team) in two segments, resulting in two files. Moreover, calibrating sky objects were frequently photographed once for several Halley images. 'UNKNOWN' is used for a few situations in which both calibration and Halley data were submitted to L-SP in digital form, and the calibration/Halley associations were not specifically stated by the observers. Nonetheless, we chose to include these calibration data in the archive. The Halley digital data reside in the BROWSE and Cyymmmdd subdirectories.


Malcolm B. Niedner, Jr.
IHW Discipline Specialist for Large-Scale Phenomena
NASA/Goddard Space Flight Center
Greenbelt, MD 20771


C.
SPECTROSCOPY AND SPECTROPHOTOMETRY DISCIPLINE

Since spectroscopic observations require a variety of different types of calibrations, such as flux standards, flat fields, arcs, etc., most of which are specific to individual observations, the calibration files for the Spectroscopy and Spectrophotometry Discipline have been intermingled with the data. To find which calibrations correspond to which data, the user must search for the proper calibration files. Generally this requires searching for a type of calibration file, by the same observers on the same instrument, at a time as close to the observations as is possible. This search may be accomplished either by searching the (FITS) header files directly, or by using the meta-databases provided.

Note that most users submitted fully reduced digital spectra, as was preferred. These files would have no calibration files. In some instances, the observers did not submit all the calibration files which a user of the archive might wish. This might be due to an oversight on the part of the submitter or, more likely, to the submitter's judgment that the calibration was not necessary. For instance, with extremely high resolution spectra, flux calibration is often impossible, and not particularly necessary.

The data in this Archive are deposited into directories whose temporal widths vary widely. Calibrations, in general, will have been taken on the same date as the P/Halley observations. However, at times when the IHW data density is very high, the data directories are divided at intervals of 3-hour multiples; the result is that a calibration file may be in a directory adjacent to that containing the P/Halley observation.

Calibration files for spectroscopy have filenames of the type SPECT4xxx (where xxx is a number between 000 and 999). The calibration files and the observation files should have matching DIS-CODE keyword values (except for the last digit, which is a measure of quality). This may be used to search for the correct calibration files.
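The DIS-CODE comparison just described can be sketched in a few lines of Python. This is a minimal illustration, not archive software: it assumes the DIS-CODE values have already been read from the FITS headers as character strings, and the record layout, file names, and code values shown are made up for the example.

```python
def same_dis_code(code_a: str, code_b: str) -> bool:
    """True when two DIS-CODE values agree except for the final (quality) digit."""
    a, b = code_a.strip(), code_b.strip()
    return len(a) == len(b) and len(a) > 1 and a[:-1] == b[:-1]


def find_calibrations(observation, candidates):
    """Return the names of candidate files whose DIS-CODE matches the observation's."""
    return [c["file"] for c in candidates
            if same_dis_code(c["dis_code"], observation["dis_code"])]


# Made-up records; real values would come from the header (.HDR) files.
obs = {"file": "OBS_A", "dis_code": "1234561"}
cands = [
    {"file": "CAL_A", "dis_code": "1234563"},  # same code, different quality digit
    {"file": "CAL_B", "dis_code": "7654321"},  # unrelated code
]

print(find_calibrations(obs, cands))
```

A match on DIS-CODE narrows the candidates only; the date, observer, and instrument checks described above still apply.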
E. Grayzeck, Jr.
Small Bodies Node of the Planetary Data System
Dept of Astronomy
University of Maryland
College Park, MD 20742

-----------------------------

APPENDIX: DATA FORMATS

A. FITS Format Information

All data were submitted to the International Halley Watch Lead Center on magnetic tape, written in standard FITS format. There are three primary references to basic FITS (Wells et al., 1981) and its extensions (Greisen and Harten, 1981, Harten et al., 1988). Although commonly viewed as a magnetic tape format, the actual FITS specifications can be interpreted to describe a general byte stream. As such, FITS files may be written on any storage medium, including CD-ROM. Note that there is no inherent record structure called for in the FITS agreements, only a blocking structure for block oriented media such as magnetic tape.

The basic FITS agreements call for only a few required keywords (SIMPLE, BITPIX, NAXIS, and END must be present; EXTEND may appear; NAXIS1, ..., NAXISn appear as defined by the value of NAXIS). We have also followed recommended conventions for the representation of values of keywords (dates in the format 'dd/mm/yy', SI units used where possible, etc.).

The IHW has defined an additional set of mandatory keywords for all submissions to the Lead Center. These are presented in the list below:

   OBJECT   - Name of the object in the datafile, a text string.

   FILE-NUM - Unique 6-digit number of the file submitted to the Lead Center. The first digit identifies the network, the other digits are assigned by the individual Disciplines, but must uniquely identify the file.

   DATE-OBS - UT Date of mid-observation, in the format 'dd/mm/yy'.

   TIME-OBS - UT Time of mid-observation, expressed as fractional day.

   DATE-REL - IHW internal data release date, a date string.

   DISCIPLN - Name of the network submitting the file, a text string.

   LONG-OBS - Longitude of the submitting observatory, in the format 'ddd/mm/ss', in degrees from 0 to 360, increasing in the eastward sense.

   LAT--OBS - Latitude of the submitting observatory, in the format 'sdd/mm/ss'.

   SYSTEM   - An 8-digit coded character string identifying the Discipline, observatory and instrument which supplied the data. The first character identifies the network (1 = Astrometry, 2 = IR Studies, 3 = Large-Scale Phenomena, 4 = Near Nucleus Studies, 5 = Photometry & Polarimetry, 6 = Radio Studies, 7 = Spectroscopy & Spectrophotometry, and 8 = Amateur Observation), the next three identify the observatory (by IAU code number, when one is assigned, 500 otherwise). The next four digits either identify the telescope/instrument combination (if there is an IAU number for the observatory) or the country and observatory (if no IAU number). See the file OBSCODES.TXT for a listing of the system codes used for P/Halley.

   OBSERVER - Name of the observer(s) who took the data, a text string. The notation "ET AL." indicates that there were more than two observers, and the names of the additional observers are given in a COMMENT later in the header, with the subkeyword "ADD. OBS."

   SUBMITTR - Name of the person submitting the data to the Lead Center, a text string.

   SPEC-EVT - A logical value indicating that the observation is a special event. Either T or F.

   DAT-FORM - A character string defining the form of the data, e.g., 'ASCII', 'NODATA'.

The individual Disciplines have written appendices which describe the keywords used in addition to the mandatory ones. Refer to the text files in the subdirectory APPENDIX below this one for more details on those keywords. A more complete listing of keywords used, including their definitions, can also be found in the file FITSHDRS on the CD-ROM discs.

REFERENCES

Greisen, E. W. and Harten, R. H.: 1981, Astron. Astrophys. Suppl. Ser. 44, 371.

Harten, R. H., Grosbol, P., Greisen, E. W. and Wells, D. C.: 1988, Astron. Astrophys. Suppl. Ser. 73, 365.

Wells, D. C., Greisen, E. W. and Harten, R. H.: 1981, Astron. Astrophys. Suppl. Ser. 44, 363.


M. Aronsson
International Halley Watch Lead Center
Jet Propulsion Laboratory
Mail Stop 169-237
4800 Oak Grove Dr
Pasadena, CA 91109


B. PDS LABELS

The International Halley Watch agreed early in the project that all data would be submitted from the individual disciplines to the Lead Center using the FITS format. When the decision was made to distribute this information on CD-ROM, it was determined that the data had to have even broader accessibility. For this reason, the original FITS files, with contiguous headers and data, were split into separate files. The original FITS byte stream could then be recovered by concatenating the appropriate header and data files. In addition, detached PDS labels were constructed to allow parallel definition of the datafiles for the Planetary Data System. The SPIDS (Standards for the Preparation and Interchange of Data Sets, Martin, T. Z., et al, Document D-4683, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA) document version 1.1 was the primary reference to the Object Description Language (ODL) necessary to create the PDS labels. (We acknowledge R. Borgen and M. Martin, PDS-JPL, for assisting the IHW through version 2.0 of the ODL that allows for description of FITS_LABEL, TEXT, and SPECTRUM.)

There are five fundamental data objects in this archive: IMAGE, TABLE, TEXT, FITS_LABEL, and SPECTRUM. Our aim was to construct a basic PDS label for each datafile on the CD-ROM. These PDS labels contain pointers to the actual datafiles (or to headers describing data submitted to the archive). There has been no effort to duplicate the documentation contained in the full FITS headers, because the PDS label and FITS header files for a given datafile differ in name only by the filename extension.
Instead we have attempted to use the power of the PDS label syntax to fully describe the data structures and thus gain access to the powerful software already supported by that group. A more full explanation of each object is carried in text (.TXT) files HDRFORM, IMAGFORM, SPECFORM, TABLFORM, and TEXTFORM. In addition, a further description of the PDS detached label is listed in file LBLFORM.TXT. Each of these files was used as a reference for the SFDU pointers in the VOLDESC.SFD. Most keywords were already in the Planetary Science Data Dictionary but a few dealing with the spectral_image_qube were introduced to specifically describe the IHW data. A listing of these keywrods with definitions follow. 1. Keyword Definitions The definitions of the keywords used in the detached PDS labels on the International Halley Watch Archive CD-ROMs are given below. Where applicable the definitions are taken from the draft PSDD (Planetary Science Data Dictionary 1990) or the PDS Standards for the Preparation and Interchange of Data Sets (SPIDS) document (Martin et al., 1988). There are some differences from the PDS data dictionary in the definitions used on this CD-ROM, since the definitions of some keywords in the current PDS data dictionary do not describe comet data. In addition, some keywords for processed (UVFITS) data do not exist in the current PDS data dictionary. AXES The number of independent variables in a data array. BAND_STORAGE_TYPE The arrangement of data in a qube ordered by spectral bands. BYTES The number of bytes contained in a data item. COLUMNS The number of items of information in each row of a data table. CORE_DESCRIPTION The dependent variable expressed in a spectral_image_qube. CORE_ITEMS The number of elements along an independent axis. CORE_NAME Identifying name for a variable along an independent axis. DATA_SET_ID A unique alphanumeric identifier for a dataset. It is used as a primary key in the PDS catalog. 
DATA_SET_PARAMETER_NAME
    The name of the physical parameter represented in an image. Note this definition differs from the PDS data dictionary definition.
DATA_TYPE
    The data type of a data item. Valid values are INTEGER, FLOAT, BINARY, and CHARACTER.
DERIVED_MAXIMUM
    The maximum value held in a data record.
DERIVED_MINIMUM
    The minimum value held in a data record.
DESCRIPTION
    Text describing an object. Sometimes this is expressed as a pointer to another file containing the descriptive text; e.g., FITS header.
ENCODING_TYPE
    Previous pixel compression of 16-bit data; also called first difference.
END_OBJECT
    This keyword is used by ODL to indicate the end of a data object definition.
FILE_RECORDS
    The number of physical records in a data file.
FORMAT
    The Fortran 77 representation of the format statement needed to read a data item.
IMAGE
    The data in an image file, expressed as a pointer to the record where the data begins. For example, ^IMAGE = ("filename",3) indicates that image data begins in record 3 of file "filename".
INTERCHANGE_FORMAT
    The type of data stored in a data table, such as ASCII or BINARY.
ITEMS
    Elements held in any arbitrary variable.
ITEM_BYTES
    Number of bytes per item.
ITEM_TYPE
    The data type of an item.
LINES
    The number of lines in an image.
LINE_SAMPLES
    The number of samples contained in each image line.
MINIMUM_SAMPLING_PARAMETER
    For the spectrum object, the first value along the fastest varying axis.
NAME
    The name of a column in a table.
NOTE
    Descriptive text about a data file, referring to IHW Disciplines.
OBJECT
    This keyword specifies the name of a data object. It is used by ODL to indicate the start of a data object definition.
OBSERVATION_TIME
    A time associated with the midpoint of the International Halley Watch set of observations.
OBSERVATION_ID
    A unique number held to identify each archived measurement gathered by the International Halley Watch.
OFFSET
    A shift in zero point required to properly calculate the reduced value represented in a FITS data record.
PRODUCER_FULL_NAME
    The full name of those mainly responsible for production of a data set.
RECORDS
    The number of records in the object being described; for example, the number of records in a header object.
RECORD_BYTES
    The number of bytes in each record of a data file.
RECORD_TYPE
    The record structure type of a data file. Valid values are FIXED_LENGTH, VARIABLE_LENGTH, and STREAM. Images and data tables usually have fixed-length records, whereas text files have stream format records.
ROWS
    The number of logical records in a data table.
ROW_BYTES
    The number of bytes in each row (i.e., logical record) of a data table.
SAMPLE_BYTES
    The number of bytes of data comprising one sample or pixel in an image or element in other objects.
SAMPLE_TYPE
    The data type of an image sample or pixel. The table below lists the values used on this CD-ROM:
        UNSIGNED_INTEGER
            An unsigned integer value. Samples with a length of 16 bits are in most-significant-byte first order.
        MSB_INTEGER
            A signed integer with most-significant-byte leading as required by FITS format.
        COMPLEX_INTEGER
            The value represented in a process step (UVFITS) for certain types of radio data.
SAMPLING_PARAMETER_DESCRIPTION
    Text explanation of independent variable.
SAMPLING_PARAMETER_INTERVAL
    Smallest uniform change of independent variable.
SAMPLING_PARAMETER_ITEMS
    Number of elements along independent axis.
SAMPLING_PARAMETER_NAME
    Name associated with independent variable.
SAMPLING_PARAMETER_UNIT
    Unit associated with independent variable.
SCALING_FACTOR
    The factor that must be applied to the FITS data record to scale the values as described by UNIT.
START_BYTE
    The byte position of the beginning of a data item within a row of data.
START_TIME
    The date and time of the beginning of an event, such as data collection, in PDS standard (UTC) format.
TARGET_NAME
    The name of a planetary body, such as a planet or satellite.
TYPE
    The type of header in a data file, such as a VICAR2 label embedded in the image file.
UNIT
    The units of measure of a data item.

REFERENCE

Martin, T. Z., Martin, M. D., Braun, M., Johnson, T., Davis, R., and Mehlman, R. (1988), "SPIDS v1.1: Standards for the Preparation and Interchange of Data Sets", JPL D-4683, Pasadena, CA.

Planetary Data System Planetary Science Data Dictionary (draft Nov, 1990), Cribbs, M., Pasadena, CA.

E. Grayzeck, Jr.
Small Bodies Node of the Planetary Data System
Dept of Astronomy
University of Maryland
College Park, MD 20742
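
ILLUSTRATIVE SKETCH

The short Python sketch below illustrates two points made in Section B above: that the original FITS byte stream can be recovered by concatenating a header file with its data file, and that MSB_INTEGER samples are turned into reduced values with SCALING_FACTOR and OFFSET. It is only a sketch under stated assumptions. The function names and the sample bytes are hypothetical and are not part of the archive or of any PDS software. It assumes 16-bit samples and the usual FITS linear relation (value = OFFSET + SCALING_FACTOR x sample); the PDS label and FITS header of a given datafile should be consulted for the actual sample size and constants.

```python
import struct


def rebuild_fits(header_bytes: bytes, data_bytes: bytes) -> bytes:
    """Recover a FITS byte stream by concatenating header and data.

    Hypothetical helper: the caller reads the detached header file and
    the data file from the disc and passes their contents in here.
    """
    return header_bytes + data_bytes


def decode_msb_int16(data: bytes, scaling_factor: float = 1.0,
                     offset: float = 0.0) -> list:
    """Decode signed 16-bit, most-significant-byte-first samples.

    Assumption: reduced value = offset + scaling_factor * sample,
    i.e. the usual FITS linear scaling.
    """
    count = len(data) // 2
    # ">" selects big-endian (MSB first); "h" is a signed 16-bit integer.
    samples = struct.unpack(f">{count}h", data[:count * 2])
    return [offset + scaling_factor * sample for sample in samples]
```

For example, the four bytes 00 01 FF FE decode to the samples 1 and -2; with a SCALING_FACTOR of 2.0 and an OFFSET of 10.0 the reduced values are 12.0 and 6.0.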