This paper describes a grouping convention for FITS that facilitates the construction of hierarchical associations of Header Data Units (HDUs). The grouping convention uses FITS table structures (ASCII or binary) to encapsulate pertinent information about the HDUs belonging to a group. Group members may reside in a single FITS file or be distributed in many FITS files; the FITS files themselves may reside on different computer systems.
The rules for generalized extensions in FITS (Grosbøl et al., 1988) provide for FITS formatted files containing more than one header data unit. By using combinations of ASCII tables (Harten et al., 1988), binary tables (Cotton et al., 1994) and image extensions (Ponz et al., 1994), related data sets requiring different data structures may be stored in the same FITS file, each within its own HDU. Unfortunately, once the related data sets are segregated into separate HDUs, the relationship between them is often lost.
The FITS standard currently allows for simple hierarchical associations of HDUs within a single FITS file through use of the EXTLEVEL keyword. However, this mechanism has several major limitations. First, its use is not well defined. Different organizations may use EXTLEVEL for widely varying purposes and still not violate the FITS standard. Secondly, it does not specify a mechanism for defining distinct multiple groups of HDUs within a FITS file. Lastly, it cannot be used to associate HDUs residing in different FITS files. Except for very simple cases, FITS contains no mechanism for creating or preserving associations between HDUs or groups of HDUs.
As the volume and complexity of FITS formatted data grow, the need for a recognized and versatile HDU grouping mechanism increases. Individuals can be overwhelmed trying to manage and analyze large data sets unless those sets are logically organized. Software tools also require data organization in order to access all necessary components of an observation, simulation or experimental data set.
As an example of where grouping capabilities within FITS would be useful, consider the following. It is desirable to combine a set of observations from a given time period into a single FITS file for transport and archival purposes. For each observation there is an observation log, an event list, a derived image and a set of instrument calibration data; furthermore, several observations share a common set of calibration data. By using a grouping mechanism each [log, event list, image, calibration] set could be logically tagged as an associated observation group and the calibration data could be made a part of many different observation groups, thus eliminating the need to store it more than once. Software could retrieve all the information about a given observation simply by extracting those HDUs defined in the table that identifies members of the group. Also, observations of the same object from different observational periods could be combined into a group and accessed as a unit, even though the HDU sets comprising the different observations reside in separate FITS files.
The following sections describe a scheme for implementing a hierarchical grouping of header data units within single and multiple FITS files. Section 2 discusses the content of table extensions used to define HDU groupings. Section 3 lists those keywords recommended for headers of group member extensions. Finally, Section 4 provides sample headers from FITS table extensions containing grouping structures.
A group table, as defined in this convention, is a FITS table extension that contains a list of all the associated member HDUs in the group. Group tables may be represented by either FITS ASCII tables (XTENSION= 'TABLE   ') or binary tables (XTENSION= 'BINTABLE'), and are uniquely distinguished from other types of FITS tables by having the EXTNAME = 'GROUPING' keyword and value in the header. The other required or recommended keywords and columns in a group table are described in the following sections.
There may be zero, one, or more group tables within a given FITS file. Each group table may reference any number of HDUs. The entire set of HDUs referenced in a group table, along with the group table itself, form a group. Individual HDUs referenced in a group table are said to be members of the group or group members.
Groups can contain any type and mix of HDU. This includes all of the IAU-endorsed extensions as well as other extensions that conform to the requirements for generalized FITS extensions. Note that a group may also contain other groups as members, since a group table is itself a FITS extension. This feature allows for the construction of hierarchical structures of HDUs within a single FITS file or across many FITS files.
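Because a group may contain other groups, software that walks a hierarchy has to recurse, and a cautious reader may wish to guard against a group that is reachable from itself. The following Python sketch illustrates the idea on a plain in-memory model. The dictionary layout, the member tags and the function name `walk_group` are assumptions made for illustration only; they are not part of the convention.

```python
def walk_group(groups, group_id, seen=None):
    """Return the non-group members reachable from group_id, depth first.

    `groups` maps a group ID to a list of members. Each member is either
    ("hdu", label) or ("group", other_group_id). This layout is hypothetical.
    """
    if seen is None:
        seen = set()
    if group_id in seen:
        # Already visited: stop here so a circular reference cannot loop forever.
        return []
    seen.add(group_id)
    found = []
    for kind, value in groups[group_id]:
        if kind == "group":
            found.extend(walk_group(groups, value, seen))
        else:
            found.append(value)
    return found


groups = {
    1: [("hdu", "log"), ("group", 2)],
    2: [("hdu", "events"), ("hdu", "image"), ("group", 1)],  # refers back to group 1
}
print(walk_group(groups, 1))  # ['log', 'events', 'image']
```

The `seen` set is a defensive design choice of this sketch.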
Group tables specify the names and locations of FITS files containing member HDUs as well as identifying members within their FITS files. The name and location of each FITS file is specified by using the World-Wide Web (Berners-Lee, 1994) Uniform Resource Identifiers, or URIs. All current and future forms of URIs, such as Uniform Resource Locators (URL) and the proposed Uniform Resource Names (URN), shall constitute valid names, although the group table must specify the type of URI being used. If the group member resides in a different FITS file but on the same computer system then partial URIs (specifically partial URLs) may be used instead of absolute URIs to specify the member's file location. If the group member resides in the same FITS file as the group table itself, then the URI field may be left blank.
The location of member HDUs within FITS files may be specified in two different ways, either by reference or by absolute position. The reference identification method uses the values of the XTENSION, EXTNAME and EXTVER keywords to uniquely identify the member HDU within the FITS file. The position method uses the HDU order number to identify members, with the primary array having order value 0, the first extension order value 1, and so on. Users may choose either or both identification methods when constructing a group table.
While the reference method is not invalidated by a reordering of HDU positions within FITS files, it does require that each member HDU have a unique set of (non-FITS-required) keyword values. Thus, this method may present problems for FITS files whose headers cannot be easily modified, such as FITS files on read-only media. The position identification method provides for quick ``random'' access to the member HDUs, since software does not have to search through each extension looking for the correct set of keyword values, but it will be affected if the order of member HDUs within their FITS files is changed (please note: there is nothing within the current FITS standard governing how or when HDUs may be reordered within their files).
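The trade-off between the two identification methods can be sketched in Python as follows. Here a FITS file is modelled as a list of (XTENSION, EXTNAME, EXTVER) tuples whose list index stands for the HDU order number. The function name `find_member`, and the policy of trusting a position only when it agrees with the reference values, are assumptions of this sketch; the convention does not prescribe which method takes precedence when both are present.

```python
def find_member(hdus, position=None, reference=None):
    """Return the index of a member HDU in `hdus`, or None if it is not found.

    `hdus` is a list of (XTENSION, EXTNAME, EXTVER) tuples; index 0 is the
    primary array. `reference` is a tuple of the same shape.
    """
    # Fast path: direct access by position, accepted only if it is in range
    # and (when a reference is supplied) still points at the expected HDU.
    if position is not None and 0 <= position < len(hdus):
        if reference is None or hdus[position] == reference:
            return position
    # Slow path: search every HDU for the matching keyword values.
    if reference is not None:
        for index, keys in enumerate(hdus):
            if keys == reference:
                return index
    return None


hdus = [
    ("PRIMARY", None, 1),
    ("BINTABLE", "EVENTS", 1),
    ("IMAGE", "SKY", 1),
]
print(find_member(hdus, position=2))                                 # 2
print(find_member(hdus, position=1, reference=("IMAGE", "SKY", 1)))  # 2 (stale position)
```

In the second call the stored position no longer matches the reference values, as could happen after HDUs are reordered, so the sketch falls back to the keyword search.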
In addition to the standard required FITS table extension keywords, the following keywords are required in the header of a group table:
EXTNAME = 'GROUPING'
EXTVER: each group table must have a unique integer EXTVER value. This group number may also be used in the header of each group member to identify the group(s) to which the member belongs (see section 3, GRPIDn keyword).
The following keyword is strongly recommended for inclusion in the header of each group table:
The number of columns required in a group table depends on which method is used to identify the members (and recall that both methods may be used within the same group). If the members are identified by reference then the following columns are required:
TTYPEn = 'MEMBER_XTENSION' - character field: Contains the value of the XTENSION keyword from the group member's header. In the case of primary HDUs where there is no required XTENSION keyword, the value of `PRIMARY' will be used instead. Therefore, the current valid entries for this column are 'PRIMARY ', 'TABLE   ', 'BINTABLE', 'IMAGE   ' or any other IAU FITS Working Group registered XTENSION value. Note that the single quotation marks are used only to designate the string boundaries and are NOT to be included with the XTENSION values in the column entries; the trailing blanks shown in each string are optional. This field may contain the FITS null value appropriate for this column type if the value is unknown (e.g., if the position identification method described below is used to identify the member location).
TTYPEn = 'MEMBER_NAME' - character field: Contains the value of the EXTNAME keyword from the group member's header. In the case of primary HDUs where the EXTNAME keyword is not defined or when the member extension has no EXTNAME keyword present, this field may contain the FITS null value appropriate for the column type.
TTYPEn = 'MEMBER_VERSION' - integer field: Contains the value of the EXTVER keyword from the group member's header. In the case of primary HDUs, or if the EXTVER keyword is not present in the member header, a value of 1 should be assumed.
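The defaulting rules for the three reference columns can be sketched in Python as follows. The header is modelled as a plain dictionary, `None` stands in for the FITS null value, and the function name `reference_key` and the `is_primary` flag are assumptions made for illustration.

```python
def reference_key(header, is_primary=False):
    """Build (MEMBER_XTENSION, MEMBER_NAME, MEMBER_VERSION) from a member header.

    `header` is a plain dict of keyword -> value (a hypothetical stand-in
    for a parsed FITS header).
    """
    # Primary HDUs carry no XTENSION keyword, so 'PRIMARY' is used instead.
    # Trailing blanks are optional, hence the strip().
    xtension = "PRIMARY" if is_primary else header["XTENSION"].strip()
    # A missing EXTNAME maps to the null value (None in this sketch).
    name = header.get("EXTNAME")
    # A missing EXTVER is assumed to be 1.
    version = header.get("EXTVER", 1)
    return (xtension, name, version)


print(reference_key({"SIMPLE": True}, is_primary=True))
# ('PRIMARY', None, 1)
print(reference_key({"XTENSION": "IMAGE   ", "EXTNAME": "SKY"}))
# ('IMAGE', 'SKY', 1)
```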
If members are identified by file position then the following column is required:
TTYPEn = 'MEMBER_POSITION' - integer field: Contains a group member's position within its FITS file. The file's primary header is given a position value of 0, the first extension is given a position value of 1, and so on. If for some reason a group member's `MEMBER_POSITION' value becomes invalid or undefined, then this column field should be filled with the FITS null value appropriate for the column format.
If some or all of the group members reside in FITS files separate from the group table itself then the following two columns are also required:
TTYPEn = 'MEMBER_LOCATION' - character field: Contains the location of the group member's FITS file using Uniform Resource Identifiers. If the FITS file resides on the same computer system as the group table, then partial URIs may be used instead of absolute URIs. If the group member resides in the same FITS file as the group table, or the MEMBER_LOCATION value becomes invalid, then this field may be filled with the FITS null value appropriate for the column type.
TTYPEn = 'MEMBER_URI_TYPE' - character field: Contains the mnemonic for the Uniform Resource Identifier type used in the corresponding MEMBER_LOCATION field. Recommended values for this column field are `URL' for the Uniform Resource Locator and `URN' for the Uniform Resource Name. As other URI types are defined their mnemonics will also become acceptable values for this field. In cases where the MEMBER_URI_TYPE is undefined (such as a null or blank MEMBER_LOCATION field value) this field may contain the FITS null value appropriate for the column type.
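A reader of a group table has to decide, for each MEMBER_LOCATION entry, whether the member is in the same file, in another file on the same system, or at an absolute location. The Python sketch below makes that decision by checking for a URI scheme with the standard library's `urllib.parse.urlsplit`. The function name `classify_location` and the three return labels are assumptions of this sketch, and `None` again stands in for the FITS null value.

```python
from urllib.parse import urlsplit


def classify_location(location):
    """Classify a MEMBER_LOCATION entry as one of three illustrative labels."""
    # Null or blank: the member is in the group table's own file, or the
    # location has become invalid; the entry alone cannot tell which.
    if location is None or location.strip() == "":
        return "same-file-or-unknown"
    # An entry with a scheme (http:, urn:, ...) is treated as absolute.
    if urlsplit(location.strip()).scheme:
        return "absolute"
    # Anything else is taken to be a partial URI on the same system.
    return "partial"


print(classify_location("http://fits.gsfc.nasa.gov/FITS/file1.fits"))  # absolute
print(classify_location("/FITS/file5.fits"))                           # partial
print(classify_location("   "))                                        # same-file-or-unknown
```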
Besides the table columns defined above, a group table may contain any number of user defined columns. Group table columns may appear in any order within the table and their TTYPEn values are not to be considered case-sensitive.
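Since columns may appear in any order and TTYPEn values are not case-sensitive, software should look columns up by name rather than by a fixed column number. A minimal Python sketch, in which the function name `column_index` and the list-of-strings input are assumptions for illustration:

```python
def column_index(ttypes, name):
    """Return the 1-based n of the TTYPEn that matches `name`, or None.

    `ttypes` lists the TTYPE1, TTYPE2, ... values in order. Matching ignores
    case and surrounding blanks.
    """
    wanted = name.strip().upper()
    for n, ttype in enumerate(ttypes, start=1):
        if ttype.strip().upper() == wanted:
            return n
    return None


ttypes = ["USER_INFO_1", "member_location", "Member_Position"]
print(column_index(ttypes, "MEMBER_POSITION"))  # 3
print(column_index(ttypes, "MEMBER_NAME"))      # None
```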
No additional keywords are required for HDUs that are members of a group. This rule ensures that all currently existing FITS files and their constituent HDUs may be part of this convention. There are, however, several grouping related keywords whose presence is strongly recommended in newly created headers. Descriptions of these keywords follow.
XTENSION= 'TABLE   ') and binary tables (XTENSION= 'BINTABLE').
The following are examples of valid group table headers that use different combinations of identification methods.
Example 1: A group containing five members all of which reside in the same file as the group table. This group is itself a member of two other groups and both of those groups' tables reside in the same file as this extension. The member position identification method is used to locate member HDUs.
XTENSION= 'BINTABLE'           / This is a binary table
BITPIX  =                    8 / Table contains 8-bit bytes
NAXIS   =                    2 / Number of axis
NAXIS1  =                    4 / Width of table in bytes
NAXIS2  =                    5 / Number of member entries
GCOUNT  =                    1 / Mandatory FITS keyword
PCOUNT  =                    0 / Number of bytes in HEAP area
TFIELDS =                    1 / Number of columns in table
EXTNAME = 'GROUPING'           / This BINTABLE contains a group
EXTVER  =                    3 / The ID number of this group
GRPID1  =                    1 / Part of group 1
GRPID2  =                    2 / Part of group 2
TTYPE1  = 'MEMBER_POSITION'    / Position of member within file
TFORM1  = '1J'                 / Datatype descriptor
END
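In a binary group table the NAXIS1 value is the sum of the byte widths of the TFORMn fields, which gives a simple consistency check on a header such as the one above. The Python sketch below computes that sum for fixed-width fields only. The function name `bintable_row_width` is an assumption, and bit arrays ('X') and variable-length descriptors ('P', 'Q') are deliberately left out.

```python
import re

# Bytes per element for fixed-width binary table data type codes.
BYTES_PER_ELEMENT = {
    "L": 1, "B": 1, "A": 1, "I": 2, "J": 4,
    "K": 8, "E": 4, "D": 8, "C": 8, "M": 16,
}


def bintable_row_width(tforms):
    """Sum the byte widths of TFORMn values such as '1J' or '30A '."""
    total = 0
    for tform in tforms:
        match = re.fullmatch(r"(\d*)([A-Z])", tform.strip())
        if match is None or match.group(2) not in BYTES_PER_ELEMENT:
            raise ValueError(f"unsupported TFORM: {tform!r}")
        # An omitted repeat count means one element.
        repeat = int(match.group(1)) if match.group(1) else 1
        total += repeat * BYTES_PER_ELEMENT[match.group(2)]
    return total


# Example 1 has a single '1J' column, so each row is 4 bytes wide.
print(bintable_row_width(["1J"]))  # 4
```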
Example 2: A group containing 150 members, some of which reside in FITS files different from that of the group table. This group is not a member of any other group, although it is the seventh group table defined in the FITS file. All member identification methods are used.
XTENSION= 'BINTABLE'           / This is a binary table
BITPIX  =                    8 / Table contains 8-bit bytes
NAXIS   =                    2 / Number of axis
NAXIS1  =                   79 / Width of table in bytes
NAXIS2  =                  150 / Number of member entries
GCOUNT  =                    1 / Mandatory FITS keyword
PCOUNT  =                    0 / Number of bytes in HEAP area
TFIELDS =                    6 / Number of columns in table
EXTNAME = 'GROUPING'           / This BINTABLE contains a group
EXTVER  =                    7 / The ID number of this group
TTYPE1  = 'MEMBER_LOCATION'    / URI of file containing member HDU
TFORM1  = '30A '               / Datatype descriptor
TTYPE2  = 'MEMBER_URI_TYPE'    / URI type of MEMBER_LOCATION field
TFORM2  = '3A '                / Datatype descriptor
TTYPE3  = 'MEMBER_POSITION'    / Position of member within file
TFORM3  = '1J '                / Datatype descriptor
TTYPE4  = 'MEMBER_XTENSION'    / XTENSION keyword value of member
TFORM4  = '8A '                / Datatype descriptor
TTYPE5  = 'MEMBER_NAME'        / EXTNAME keyword value of member
TFORM5  = '30A '               / Datatype descriptor
TTYPE6  = 'MEMBER_VERSION'     / EXTVER keyword value of member
TFORM6  = '1J '                / Datatype descriptor
END
Example 3: A group containing 17 members, some of which reside in FITS files different from that of the group table. This group is a member of six other groups, two of which are defined in FITS files on other computer systems and one that is defined in a FITS file on the same computer system. The member reference identification and member file location methods are used. Two user defined columns are also present.
XTENSION= 'BINTABLE'           / This is a binary table
BITPIX  =                    8 / Table contains 8-bit bytes
NAXIS   =                    2 / Number of axis
NAXIS1  =                  180 / Width of table in bytes
NAXIS2  =                   17 / Number of member entries
GCOUNT  =                    1 / Mandatory FITS keyword
PCOUNT  =                    0 / Number of bytes in HEAP area
TFIELDS =                    7 / Number of columns in table
EXTNAME = 'GROUPING'           / This BINTABLE contains a group
EXTVER  =                    7 / The ID number of this group
GRPID1  =                    3 / Member of group 3
GRPID2  =                    6 / Member of group 6
GRPID3  =                   18 / Member of group 18
GRPID4  =                   -1 / Member of external group 1
GRPLC4  = 'http://fits.gsfc.nasa.gov/FITS/file1.fits' / location of
COMMENT   FITS file containing group
GRPID5  =                   -5 / Member of external group 5
GRPLC5  = '/FITS/file5.fits'   / Location of file containing group
GRPID6  =                   -2 / Member of external group 2
GRPLC6  = 'http://www.noao.edu/irafdir/file2.fits' / location of
COMMENT   FITS file containing group
TTYPE1  = 'USER_INFO_1'        / A user supplied column
TFORM1  = '25J '               / Datatype descriptor
TTYPE2  = 'MEMBER_LOCATION'    / URI of file containing member HDU
TFORM2  = '30A '               / Datatype descriptor
TTYPE3  = 'MEMBER_XTENSION'    / XTENSION keyword value of member
TFORM3  = '8A '                / Datatype descriptor
TTYPE4  = 'MEMBER_NAME'        / EXTNAME keyword value of member
TFORM4  = '30A '               / Datatype descriptor
TTYPE5  = 'USER_INFO_2'        / A user supplied column
TFORM5  = '5A '                / Datatype descriptor
TTYPE6  = 'MEMBER_VERSION'     / EXTVER keyword value of member
TFORM6  = '1J '                / Datatype descriptor
TTYPE7  = 'MEMBER_URI_TYPE'    / URI type of MEMBER_LOCATION field
TFORM7  = '3A '                / Datatype descriptor
END
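The GRPIDn and GRPLCn keywords in the header above can be read back with a few lines of Python. The interpretation used here is inferred from the header comments in Example 3: a positive GRPIDn names a group in the same file, and a negative GRPIDn names an external group whose file is given by the matching GRPLCn. The function name `group_memberships`, the plain-dictionary header, and the assumption that the indices n run consecutively from 1 are all illustrative choices.

```python
def group_memberships(header):
    """Return (group_id, location) pairs from GRPIDn / GRPLCn keywords.

    `header` is a plain dict of keyword -> value. `location` is None for a
    group whose table is in the same file as this HDU.
    """
    memberships = []
    n = 1
    # Assumes GRPID1, GRPID2, ... are numbered without gaps.
    while f"GRPID{n}" in header:
        group_id = header[f"GRPID{n}"]
        if group_id < 0:
            # Negative ID: external group, located by GRPLCn.
            memberships.append((-group_id, header.get(f"GRPLC{n}")))
        else:
            memberships.append((group_id, None))
        n += 1
    return memberships


header = {
    "GRPID1": 3,
    "GRPID2": 6,
    "GRPID3": 18,
    "GRPID4": -1, "GRPLC4": "http://fits.gsfc.nasa.gov/FITS/file1.fits",
    "GRPID5": -5, "GRPLC5": "/FITS/file5.fits",
    "GRPID6": -2, "GRPLC6": "http://www.noao.edu/irafdir/file2.fits",
}
for group_id, location in group_memberships(header):
    print(group_id, location or "(same file)")
```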
Example 4: A group containing 82 members, some of which reside in FITS files different from that of the group table. This group is a member of three other groups, and makes use of the member position and member file location methods. One user defined column is present. Note that in this example an ASCII table (as opposed to a binary table) is used to define the group.
XTENSION= 'TABLE   '           / This is an ASCII table
BITPIX  =                    8 / Table contains 8-bit ASCII characters
NAXIS   =                    2 / Number of axis
NAXIS1  =                   46 / Width of table in bytes
NAXIS2  =                   82 / Number of member entries
GCOUNT  =                    1 / Mandatory FITS keyword
PCOUNT  =                    0 / Mandatory FITS keyword
TFIELDS =                    4 / Number of columns in table
EXTNAME = 'GROUPING'           / This TABLE contains a group
EXTVER  =                   31 / The ID number of this group
GRPID1  =                    3 / Member of group 3
GRPID2  =                    9 / Member of group 9
GRPID3  =                   27 / Member of group 27
TTYPE1  = 'USER_INFO_1'        / A user supplied column
TFORM1  = 'E10.3 '             / Datatype descriptor
TBCOL1  =                    1 / Starting table column for field
TTYPE2  = 'MEMBER_LOCATION'    / URI of file containing member HDU
TFORM2  = 'A30 '               / Datatype descriptor
TBCOL2  =                   11 / Starting table column for field
TTYPE3  = 'MEMBER_URI_TYPE'    / URI type of MEMBER_LOCATION field
TFORM3  = 'A3 '                / Datatype descriptor
TBCOL3  =                   41 / Starting table column for field
TTYPE4  = 'MEMBER_POSITION'    / Position of member within file
TFORM4  = 'I3 '                / Datatype descriptor
TBCOL4  =                   44 / Starting table column for field
END
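For an ASCII group table the row width follows from the TBCOLn starting columns and the field widths in TFORMn, so the NAXIS1 value above can be cross-checked. The Python sketch below returns the column of the last character occupied by any field; NAXIS1 must be at least that large. The function names `ascii_field_width` and `ascii_row_width` are assumptions for illustration.

```python
import re


def ascii_field_width(tform):
    """Width in characters of an ASCII table field such as 'A30', 'I3' or 'E10.3'."""
    match = re.fullmatch(r"[AIFED](\d+)(?:\.\d+)?", tform.strip())
    if match is None:
        raise ValueError(f"unsupported TFORM: {tform!r}")
    return int(match.group(1))


def ascii_row_width(fields):
    """Return the last occupied column given (TBCOLn, TFORMn) pairs."""
    # A field starting at column c with width w ends at column c + w - 1.
    return max(tbcol + ascii_field_width(tform) - 1 for tbcol, tform in fields)


# (TBCOLn, TFORMn) pairs taken from the Example 4 header.
fields = [(1, "E10.3 "), (11, "A30 "), (41, "A3 "), (44, "I3 ")]
print(ascii_row_width(fields))  # 46
```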
We gratefully acknowledge the support of the NASA Applied Information Systems Research Program, under which this effort is partially funded.
In certain circumstances, it may be convenient to point, or refer, to an HDU from another HDU. Such references neither imply nor require the hierarchical association information allowed by grouping table structures, but still serve a similar function by pointing to another data structure residing in a separate HDU.
If referring to a single HDU is preferable to forming a hierarchical association and including the given HDU as a member, then keyword and table column values may employ the same syntax as used for the identification of group members. For notational convenience, so that all the information can be included in a single keyword value or table column entry, the reference should be expressed as a single character string of either type 1 format,
or of type 2 format,
where each quantity enclosed in single quotation marks is replaced by its corresponding value as defined in section 2.3. The colons (':', ASCII 58) appearing in the expressions are significant and must be used to separate the fields of the string. Such expressions are known as reference strings.
Default values in the HDU reference strings are permitted but must obey the following rules. Note that by implication a reference string may begin with a colon field separator (':', ASCII 58) but may not terminate with a colon field separator.
Below are examples of valid reference strings. In each case the following values are assumed:
Please note that reference strings are meant only to supplement and enhance the hierarchical grouping convention as described above. In particular, reference strings should be used sparingly and with care; they do not provide the same level of data format structure and long-term archival stability as the grouping tables themselves.
Berners-Lee, T., 1994. ``World Wide Web Initiative'', CERN - European Particle Physics Lab. http://info.cern.ch/hypertext/WWW/TheProject.html.
Cotton, W. D., Tody, D., and Pence, W., 1994. ``Binary Table Extension to FITS: A Proposal'', version dated June 13, 1994, final form. NRAO FITS Archives. http://fits.cv.nrao.edu/fits/documents/standards/bintable3.ps.
Grosbøl, P., Harten, R. H., Greisen, E. W., and Wells, D. C., 1988. ``Generalized extensions and blocking factors for FITS'', Astronomy and Astrophysics Suppl., 73, 359-364.
Harten, R. H., Grosbøl, P., Greisen, E. W., and Wells, D. C., 1988. ``The FITS tables extension'', Astronomy and Astrophysics Suppl., 73, 365-372.
Ponz, J. D., Thompson, R. W., and Munoz, J. R., 1994. ``FITS Image Extension'', Astronomy and Astrophysics Suppl., 105, 53-55.