From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:24:22 1991
Newsgroups: alt.sci.astro.fits
Organization: ST Systems Corp. - NASA/NCDS
News-Software: VAX/VMS VNEWS 1.41
Nntp-Posting-Host: nssdca.gsfc.nasa.gov
From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock)
Subject: Re: compression of fits
Date: 24 Sep 91 20:26:00 GMT

I am willing to post the LaTeX source of the FITS Compression proposal
here, although it's fairly long. OTOH, I could E-mail it, if I don't
get too many requests. I'm delighted to see such an interest.

First of all, compression and FITS are separate issues. The proposal is
for a FITS syntax to _indicate_ a compressed data stream. It does allow
for keywords to indicate the compression method used on the data, but
does not (yet) mandate any particular algorithms - though we do give an
example.

In general, compression is tricky - binary files just don't compress
very well. Tables, on the other hand, compress wonderfully - helped no
small amount by the padding FITS requires to "round out" the header and
data lengths.

Further discussion _very_ welcome!

----------------------------------------------------------------------------
-- Archie Warnock                     Internet:  warnock at nssdc.gsfc.nasa.gov
-- ST Systems Corp.
                                      SPAN:      NSSDC::WARNOCK
-- NASA/GSFC                          "Unix - JCL for the 90s"

From gsh7w at fermi.clas.Virginia.EDU Wed Oct 9 09:24:53 1991
Newsgroups: alt.sci.astro.fits
Organization: University of Virginia
From: gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy)
Subject: Re: compression of fits
Date: Tue, 24 Sep 91 22:36:35 GMT

Archie Warnock writes:
#I am willing to post the LaTeX source of the FITS Compression proposal
#here, although it's fairly long.

Please do post such.

#In general, compression is tricky - binary files just don't compress
#very well.

As with many things, the answer is "that depends". I have large
amounts of 2Kx2K images from the Astro-1 mission on my disk, and to
save disk space I use the unix "compress" facility. The images are all
16 bit signed integers, which are certainly "binary" images, but these
compress very well, not only because the large field of view (40
arcmin) means that there is lots of sky, but because in general images
tend to be rather smooth, and the compress utility works rather well
on these, compressing them to about 30 percent of the original size.

As an experiment, I converted these images from 16 bit integer to 32
bit floating point, IEEE format, and I was surprised to find that the
compression ratio remained much the same. I would have ventured the
opinion that floating point numbers would not compress well, unless
fancy things were done like splitting the bytes to compress the
exponents and some of the most significant bits; however, this trial
worked fine.

However, trying to compress a UV data set may not achieve the same
results.
For a single data point though, I just ran the compress utility
on a UV data set that I have. The data set was 18 hours of VLA time,
continuum mode, 2 IFs, full polarization, and 20 second integration
time. The total dataset was 41.6 MBy, and the compressed data set was
23.1 MBy, so the compressed data was 55% of the original, a nice
healthy savings. This data is IEEE format, and probably about one of
the hardest data sets to work with.

However, how will the FITS reading and writing programs deal with
compressed data? Currently the header (which I would argue should
NEVER be compressed) allows an exact determination of how much data is
in the file. If a new keyword is introduced (COMPRESSED = T?), then
the header could be identical to current practice, and just the bytes
of the compressed data written, with padding to the 2880 octet buffer
size (and whatever blocking factor), but it is unknown how the
uncompress algorithm would deal with the padding values.

It is unclear if compress should be used at all. My understanding is
that compress is what POSIX 1003.2 has chosen, but with Unisys and IBM
both claiming patents on compress, the legality of using compress is
not clear. My understanding is that compress will never give out a
larger output than the input, but I am not certain of this.

Suggestions are most welcome.
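On the question of whether a compressor can ever produce output larger
than its input: no lossless scheme can shrink every possible input, so
some inputs must come out larger unless the program falls back to
storing them unchanged. A minimal sketch that measures this, assuming
Python's zlib module (DEFLATE) as a stand-in for Unix compress (LZW),
which is a different algorithm; the test data are invented for the
example:

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of the zlib (DEFLATE) compression of data."""
    return len(zlib.compress(data, 9))


# Hypothetical "smooth sky": one 16-bit level repeated 50,000 times.
smooth = (1000).to_bytes(2, "big") * 50_000

# Hypothetical incompressible data: seeded pseudo-random bytes, same length.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(smooth)))

print("smooth:", len(smooth), "->", compressed_size(smooth))
print("noise: ", len(noise), "->", compressed_size(noise))
```

The smooth buffer shrinks dramatically, while the random buffer comes
out slightly larger than it went in. Whether a given compress
implementation refuses to write such a file is a separate matter that
this sketch does not test.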
--
-Greg Hennessy, University of Virginia
 USPS Mail:     Astronomy Department, Charlottesville, VA 22903-2475 USA
 Internet:      gsh7w at virginia.edu
 UUCP:          ...!uunet!virginia!gsh7w

From markus at mso.anu.edu Wed Oct 9 09:25:06 1991
Newsgroups: alt.sci.astro.fits
Organization: Mt. Stromlo Observatory
From: markus at mso.anu.edu (Markus Buchhorn)
Subject: Re: compression of fits
Date: 25 Sep 91 04:42:34 GMT

I apologise in advance for the length of this posting. I'm interested
in this topic, and want to stir up some conversation :-)

gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes:
>Archie Warnock writes:
>#I am willing to post the LaTeX source of the FITS Compression proposal
>#here, although it's fairly long.

>Please do post such.

Agreed - or put it up for ftp on fits.cx.nrao.edu in the appropriate
directory.

>#In general, compression is tricky - binary files just don't compress
>#very well.

>As with many things, the answer is "that depends".
[...]
>compression ratio remained much the same. I would have ventured the
>opinion that floating point numbers would not compress well, unless
>fancy things were done like splitting the bytes to compress the
>exponents and some of the most significant bits, however, this trial
>worked fine.
[...]
>However, how will the FITS reader and writing programs deal with
>compressed data? Currently the header (which I would argue should
>NEVER be compressed) allows an exact determination of how much data is
>in the file. If a new keyword is introduced (COMPRESSED = T?)
>then the header could be identical to current practice,
>and just the bytes of the compressed data written, with padding to
>the 2880 octet buffer size (and whatever blocking factor), but it is
>unknown how the uncompress algorithm would deal
>with the padding values.

>Suggestions are most welcome.

>-Greg Hennessy, University of Virginia
> Internet:      gsh7w at virginia.edu
> UUCP:          ...!uunet!virginia!gsh7w

Sorry about including all of that - but I think it's relevant.

The choice of compression algorithm and the storage of the compressed
data are the two main problems we would face here. I don't know what
Mr Warnock's work has done, so I can't comment on that. But a couple
of ideas spring to mind:

First off, for 2D image type of data (CCD images say), even somewhat
noisy ones, one way of helping the compression algorithm might be to
store the data as differences from one pixel to the next. If the
Universe was well behaved and skies were truly smooth, the gain would
be astounding, as d(pixel)/dx = 0 or const for many pixels in a row.
Any run-length encoding technique would be quite happy with this. As
the noise increases, the efficiency would decrease. Another advantage
(possibly the main one) is that all the dp/dx values would be closer in
number-space, which means that a trick like compressing high and low
bytes separately would work much better.

But what about a lossy compression technique ? Before anyone jumps on
that, look at the way HST will store their digitised sky survey. They
bin the data into 2x2 blocks, generate 4 coeffs, then treat the coeffs
as pixel values for another image (binned down by a factor of 2
effectively). Loop until you get down to a single pixel / group of.
By choosing which coeffs to keep and which to turf out, they have
achieved quite respectable compression (4:1), and, *they claim* that
photometry is not compromised by this. (Dunno about crowded fields
though ???)
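As a rough illustration of the pixel-difference idea above (a minimal
sketch in Python, not anything taken from the proposal; the function
names and the sample row are invented for the example), a row is stored
as its first pixel followed by pixel-to-pixel differences, and the
differences are then run-length encoded:

```python
def delta_encode(row):
    """Keep the first pixel, then store pixel-to-pixel differences."""
    if not row:
        return []
    return [row[0]] + [b - a for a, b in zip(row, row[1:])]


def delta_decode(deltas):
    """Rebuild the row with a running sum (one addition per pixel)."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def run_lengths(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


# Hypothetical row: flat sky, a linear ramp, then flat again.
row = [100, 100, 100, 100, 102, 104, 106, 108, 108, 108]
deltas = delta_encode(row)   # [100, 0, 0, 0, 2, 2, 2, 2, 0, 0]
runs = run_lengths(deltas)   # [(100, 1), (0, 3), (2, 4), (0, 2)]
assert delta_decode(deltas) == row
```

Ten pixels collapse to four runs here because both the flat and the
linearly rising stretches have constant differences. Adding noise to
the row would break up the runs, which is the loss of efficiency
mentioned above.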
The other thing is how do we store this compressed stream. I agree with
Greg that the header should not be compressed. If you want to use
'compress' to save disk space - fine. But you uncompress them
yourself. What we want is a technique that does this transparently to
The User.

Perhaps what one would need to do is read in either the entire
image into RAM, compress all of it, then write out the compressed stream
in 2880-byte chunks, or else read in some fraction of that (N lines),
compress just that, and write out 2880-byte chunks. Padding is not a
good idea as (i) it wastes space and (ii) decompression could have a
problem with this.

Also, the nice thing with raw FITS files is that one can access them
randomly/sequentially (see the FITSIO package). By chunk-ifying the
image, we can still achieve some of the advantages, as reading any
individual record won't need the previous one to yield raw data. (Ok,
possibly one would need to read a few adjacent records, but not the
whole thing..... )

Anyway - I can see problems(*) with the above already. I'm a busy
student so don't have much time to play with this sort of thing. But I
am very interested in the problem, and look forward to seeing
Mr Warnock's proposal.

Cheers,
        Markus

Markus Buchhorn          markus at mso.anu.edu.au

(*) One problem with any(?) run-length compression scheme is that
a row or chunk of data may depend on the first pixel value
being known - what happens if there is a transmission error/tape-error/
generic-Murphy-Law error in that pixel? Then we need CRC/checksums ????
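The chunking and checksum ideas above can be sketched together. This is
only an illustration under stated assumptions: Python's zlib (DEFLATE)
stands in for whatever compressor would actually be chosen, and the
index layout and function names are hypothetical. Each group of N image
lines is compressed on its own, and an index records where each chunk
starts plus a CRC-32 of its uncompressed contents:

```python
import zlib


def compress_chunks(lines, lines_per_chunk):
    """Compress groups of image lines independently.

    Returns (stream, index); index holds one (offset, length, crc32)
    entry per chunk, where crc32 covers the uncompressed bytes.
    """
    stream = bytearray()
    index = []
    for start in range(0, len(lines), lines_per_chunk):
        raw = b"".join(lines[start:start + lines_per_chunk])
        packed = zlib.compress(raw)
        index.append((len(stream), len(packed), zlib.crc32(raw)))
        stream += packed
    return bytes(stream), index


def read_chunk(stream, index, n):
    """Decompress chunk n alone and verify it against its checksum."""
    offset, length, crc = index[n]
    try:
        raw = zlib.decompress(stream[offset:offset + length])
    except zlib.error as exc:
        raise ValueError(f"chunk {n} is corrupt") from exc
    if zlib.crc32(raw) != crc:
        raise ValueError(f"chunk {n} failed its checksum")
    return raw


# Hypothetical image: 8 lines of 16 one-byte pixels, 4 lines per chunk.
lines = [bytes([i]) * 16 for i in range(8)]
stream, index = compress_chunks(lines, lines_per_chunk=4)
assert read_chunk(stream, index, 1) == b"".join(lines[4:8])
```

Because every chunk is self-contained, damage to one chunk is reported
when that chunk is read and does not prevent the others from being
decoded. The cost is a somewhat worse compression ratio than
compressing the whole image in one pass.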
From gsh7w at fermi.clas.Virginia.EDU Wed Oct 9 09:25:09 1991
Newsgroups: alt.sci.astro.fits
Organization: University of Virginia
From: gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy)
Subject: Re: compression of fits
Date: Wed, 25 Sep 91 15:12:41 GMT

Markus Buchhorn writes:
#Perhaps what one would need to do is read in either the entire
#image into RAM, compress all of it, then write out the compressed stream
#in 2880-byte chunks, or else read in some fraction of that (N lines),
#compress just that, and write out 2880-byte chunks. Padding is not a
#good idea as (i) it wastes space and (ii) decompression could have a
#problem with this.

It is very common already to have FITS data sets that will not fit
into RAM, and I think that it would be a mistake to rely on a
compression scheme that required reading the whole data set into RAM.

Another aspect is what the existing FITS readers will do with a
compressed image. The original FITS papers considered this point, and
conceded that not all reader programs can read every FITS file, but
the programs should at least exit gracefully.

I have encountered data sets that were called FITS files, that did not
have the last record padded out to 2880 octets, and some FITS readers
die horribly at this. Blocking out the last record to 2880 octets is
not very wasteful of space, at least for modern data sets (trying to
transport icons for your workstation with FITS isn't very efficient
spacewise to begin with). The more important issue is how the
decompression program deals with this padding.
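One way the padding question could be handled is to record the true
length of the compressed stream and have the reader hand only that many
bytes to the decompressor. A minimal sketch, assuming the length would
be carried in some header keyword (nothing in this thread says the
proposal does so) and using Python's zlib as a stand-in compressor; the
function names are hypothetical:

```python
import zlib

BLOCK = 2880  # FITS record size in bytes


def pad_to_block(payload: bytes) -> bytes:
    """Zero-fill payload up to the next multiple of BLOCK bytes."""
    return payload + b"\0" * (-len(payload) % BLOCK)


def pack(raw: bytes):
    """Compress raw, then pad; return (padded_stream, compressed_length)."""
    packed = zlib.compress(raw)
    return pad_to_block(packed), len(packed)


def unpack(padded: bytes, compressed_length: int) -> bytes:
    """Decompress only the first compressed_length bytes, skipping padding."""
    return zlib.decompress(padded[:compressed_length])


# Hypothetical data unit: 10,240 bytes of a repeating pattern.
raw = bytes(range(256)) * 40
padded, n = pack(raw)
assert len(padded) % BLOCK == 0
assert unpack(padded, n) == raw
```

With the length known, the decompressor never sees the fill bytes, so
their values do not matter. The wasted space is at most 2879 bytes per
data unit.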
It is also very important to remember during all these discussions what
the non-unix users will do. While POSIX 1003.2 may specify compress,
there are going to be interesting computers that will not have this
algorithm, and it seems a disservice to leave these systems high and
dry.

--
-Greg Hennessy, University of Virginia
 USPS Mail:     Astronomy Department, Charlottesville, VA 22903-2475 USA
 Internet:      gsh7w at virginia.edu
 UUCP:          ...!uunet!virginia!gsh7w

From tody at noao.edu Wed Oct 9 09:25:12 1991
Newsgroups: alt.sci.astro.fits
Organization: National Optical Astronomy Observatories, Tucson, AZ, USA
From: tody at noao.edu (Doug Tody NOAO/IRAF CCS)
Subject: Re: compression of fits
Date: 26 Sep 91 06:29:17 GMT

>From article <1991Sep25.151241.1531 at murdoch.acc.Virginia.EDU>, by gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy):
> Markus Buchhorn writes:
> #Perhaps what one would need to do is read in either the entire
> #image into RAM, compress all of it, then write out the compressed stream
> #in 2880-byte chunks, or else read in some fraction of that (N lines),
> #compress just that, and write out 2880-byte chunks. Padding is not a
> #good idea as (i) it wastes space and (ii) decompression could have a
> #problem with this.
>
> It is very common already to have FITS data sets that will not fit
> into RAM, and I think that it would be a mistake to rely on a
> compression scheme that required reading the whole data set into RAM.
Unless there are good reasons to do otherwise it would be wise to use a
compression technique which is local in nature (assuming the pixel data is
to be compressed instead of the entire FITS file). This is particularly
important if the data is to be randomly accessed at runtime, for example,
when reading a FITS image from a CD-ROM, or from disk. It would also aid in
recovery from data losses. A simple technique would be to compress each
image line independently, using an index to record the offset of each
compressed line. An existing example of this is the pixel mask image format
in IRAF, which preserves all the semantics of the random access image while
storing the image in a compressed form.

A line-oriented, random access approach would be one way of justifying a
FITS specific compression technique, rather than simply compressing the
entire FITS file with a standard adaptive compression program like
COMPRESS. I don't think it is worthwhile to try to compress headers; if you
want to compress the entire file, just use a non-FITS file oriented
compression program. This would not necessarily have to be COMPRESS; we
could write our own adaptive compression program if we wished, more oriented
towards noisy image data, without having to tie it to FITS or any other
specific image format.

> It is also very important to remember during all these discussions what
> the non-unix users will do. While POSIX 1003.2 may specify compress,
> there are going to be interesting computers that will not have this
> algorithm, and it seems a disservice to leave these systems high and
> dry.

Source for the compress/uncompress programs is available on the net. A
while back we hacked a version of this here for VMS. Source for both the
UNIX and VMS versions, plus VMS binaries, are in the util subdirectory on
iraf.noao.edu (I believe gatekeeper.dec.com has an older copy as well). The
VMS version isn't as easy to use as the UNIX one due to the record
structures of VMS files, but it works.
As many will be aware there are
efforts to patent the LZ algorithm used in compress, but the authors of the
original algorithm oppose this, and since the source has been publicly
available for years it may be difficult for the lawyers to win this one.

I think more research, and trial implementations, are needed before we want
to settle on a FITS standard for image compression. Getting experimental
support for compression into our online image formats, which can be more
easily changed than the FITS standard, would be a good start. Some basic
algorithms research is badly needed as well. A general purpose program like
COMPRESS, with an adaptive algorithm which could do well on noisy pixel data
as well as text, could prove immediately useful and would provide an
opportunity for algorithms development.

--
Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217
UUCP: {arizona,decvax,ncar}!noao!tody or uunet!noao.edu!tody
Internet: tody at noao.edu SPAN/HEPNET: NOAO::TODY (NOAO=5355)

From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:25:22 1991
Newsgroups: alt.sci.astro.fits
News-Software: VAX/VMS VNEWS 1.41
Nntp-Posting-Host: nssdca.gsfc.nasa.gov
Organization: ST Systems Corp. - NASA/NCDS
From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock)
Subject: Re: compression of fits
Date: Thu, 26 Sep 1991 14:29:00 GMT

In article <1991Sep26.062917.1517 at noao.edu>, tody at noao.edu (Doug Tody NOAO/IRAF CCS) writes...
>recovery from data losses.
>A simple technique would be to compress each
>image line independently, using an index to record the offset of each
>compressed line. An existing example of this is the pixel mask image format

This would, indeed, be very handy. One of the drawbacks of the
bit-oriented compression schemes is that random access into the image is
tricky, at best. On the other hand, I also want to address the
_interchange_ of data, in which case the maximum compression ratio is
desirable. I still think of FITS as an interchange format (which it is
good at), rather than a working format (which it isn't so good at).

>COMPRESS. I don't think it is worthwhile to try to compress headers; if you
>want to compress the entire file, just use a non-FITS file oriented
>compression program. This would not necessarily have to be COMPRESS; we

Except that this is decidedly non-FITS-like. It relies on external means
to document the algorithm, rather than packaging the data in
self-documenting form, and doesn't necessarily select the compression
algorithm from what I hope will be a widely available set. The point of
the proposal we made is to implement the strategy _within the FITS
framework_.

>VMS version isn't as easy to use as the UNIX one due to the record
>structures of VMS files, but it works. As many will be aware there are
>efforts to patent the LZ algorithm used in compress, but the authors of the
>original algorithm oppose this, and since the source has been publicly
>available for years it may be difficult for the lawyers to win this one.

Two strong arguments for implementing compression as part of the FITS
syntax, rather than relying exclusively on "external" techniques.

>I think more research, and trial implementations, are needed before we want
>to settle on a FITS standard for image compression. Getting experimental
>support for compression into our online image formats, which can be more
>easily changed than the FITS standard, would be a good start.
>Some basic algorithms research is badly needed as well. A general
>purpose program like

I can't agree more strongly.

----------------------------------------------------------------------------
-- Archie Warnock                     Internet:  warnock at nssdc.gsfc.nasa.gov
-- ST Systems Corp.                   SPAN:      NSSDC::WARNOCK
-- NASA/GSFC                          "Unix - JCL for the 90s"

From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:25:25 1991
Newsgroups: alt.sci.astro.fits
Organization: ST Systems Corp. - NASA/NCDS
News-Software: VAX/VMS VNEWS 1.41
Nntp-Posting-Host: nssdca.gsfc.nasa.gov
From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock)
Subject: Re: compression of fits
Date: 25 Sep 91 18:34:00 GMT

In article <1991Sep25.044234.18639 at newshost.anu.edu.au>, markus at mso.anu.edu (Markus Buchhorn) writes...
>even somewhat noisy ones, one way of helping the compression
>algorithm might be to store the data as differences from one pixel to
>the next. If the Universe was well behaved and skies were truly smooth,

Ladies and gentlemen, we have a winner in the "IHW Compression
Contest". Seriously, this is virtually the algorithm IHW used for
compressing the digitized large-scale images. We observed that, for
most of our 1600 images, although the data itself required 2 bytes for
storage, the pixel-to-pixel differences fit in 1 byte. Presto - instant
50% compression (almost instant - it requires only a single addition per
pixel to decompress the image). Best bang for the buck, in our
estimation. Not the best possible, but the best for the lowest
(computational) cost.

>But what about a lossy compression technique ?
>Before anyone jumps on
>that, look at the way HST will store their digitised sky survey.

Excellent idea - especially if we can arrange the compression so what's
lost is the noise, not the signal. Still, astronomers are conservative
folks and want to preserve every last bit of noise.

>(*) One problem with any(?) run-length compression scheme is that
>a row or chunk of data may depend on the first pixel value
>being known - what happens if there is a transmission error/tape-error/
>generic-Murphy-Law error in that pixel? Then we need CRC/checksums ????

A checksum keyword would be an excellent idea. Currently, none of the
FITS data structures inherently incorporate such a thing - it's not
unique to compressed data. The successive-differences scheme used by
IHW has to "re-calibrate" the 8-bit differences whenever the
pixel-to-pixel difference is too big. This means that such transmission
errors wouldn't propagate too far down the image.

----------------------------------------------------------------------------
-- Archie Warnock                     Internet:  warnock at nssdc.gsfc.nasa.gov
-- ST Systems Corp.                   SPAN:      NSSDC::WARNOCK
-- NASA/GSFC                          "Unix - JCL for the 90s"

From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:25:29 1991
Newsgroups: alt.sci.astro.fits
Organization: ST Systems Corp.
    - NASA/NCDS
News-Software: VAX/VMS VNEWS 1.41
Nntp-Posting-Host: nssdca.gsfc.nasa.gov
From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock)
Subject: Re: compression of fits
Date: 25 Sep 91 18:26:00 GMT

In article <1991Sep24.223635.14295 at murdoch.acc.Virginia.EDU>, gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes...
>As with many things, the answer is "that depends". I have large
>amounts of 2Kx2K images from the Astro-1 mission on my disk, and to
>save disk space I use the unix "compress" facility. The images are all
>16 bit signed integers, which are certainly "binary" images, but these
>compress very well, not only because the large field of view (40
>arcmin) means that there is lots of sky, but because in general images
>tend to be rather smooth, and the compress utility works rather well

If you got those from where I think you did (digitized by my colleagues
at Goddard), I'm intimately familiar with the microdensitometer which
generated those digital images (we digitized 1600 plates of Halley's
Comet with the same PDS). It uses a 10-bit A/D converter, so the data
is only 10 bits deep (although it still requires 16 bits to store it).
And you're right - at 10 bits, the data is quite smooth and easily
compressed (see examples in the Compression proposal from the
International Halley Watch).

In any event, far be it from me to deny that certain compression schemes
work well on certain data. Clearly, that's so. There are two things
necessary for interchanging compressed data under the FITS syntax. The
first is that the astronomical community has to agree on what that
syntax shall be - that's the primary subject of the proposal. The
second thing is to have software which is "smart" enough to select an
_appropriate_ compression scheme for the data in question. Just
applying Unix compress won't cut it - there are too many counter
examples. My claim is that there's no single best scheme for all
astronomical data.
New algorithms need to be developed, existing algorithms need to
be modified and ultimately (I predict) we'll find that the approach of
the PC-based compressors will be best - to compress the data in question
with a variety of techniques and keep the one which works best (a la
PKZIP). The "trick" of the compression proposal is that it allows the
header to specify the type of compression used, so we only need to agree
on names for the particular scheme to use it within the FITS framework.

>However, how will the FITS reader and writing programs deal with
>compressed data? Currently the header (which I would argue should

As a FITS extension:

XTENSION= 'COMPRESS'
[...]
COMPRES1= 'HUFFMAN'
COMPRES2= 'PREVPIXEL'
[etc]

Additional informational keywords can be added to the extension header,
if desired.

The second "trick" is that, when you compress the data, you compress
everything, including a FITS header. The reason for this is that it
allows you to compress _any_ FITS data stream, and the result of the
decompression will always be a fully-qualified FITS data stream, which
you can feed directly back into your FITS reader.

----------------------------------------------------------------------------
-- Archie Warnock                     Internet:  warnock at nssdc.gsfc.nasa.gov
-- ST Systems Corp.
                                      SPAN:      NSSDC::WARNOCK
-- NASA/GSFC                          "Unix - JCL for the 90s"

From gsh7w at fermi.clas.Virginia.EDU Wed Oct 9 09:25:33 1991
Newsgroups: alt.sci.astro.fits
Organization: University of Virginia
From: gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy)
Subject: Re: compression of fits
Date: Wed, 25 Sep 91 19:16:36 GMT

Archie Warnock writes:
#In any event, far be it from me to deny that certain compression schemes
#work well on certain data. . . . The
#second thing is to have software which is "smart" enough to select an
#_appropriate_ compression scheme for the data in question. Just
#applying Unix compress won't cut it - there are too many counter
#examples. My claim is that there's no single best scheme for all astronomical
#data. New algorithms need to be developed, existing algorithms need to
#be modified and ultimately (I predict) we'll find that the approach of
#the PC-based compressors will be best - to compress the data in question
#with a variety of techniques and keep the one which works best (a la
#PKZIP).

While no single compression scheme is best for the many types of data
that are available, I would argue that the number of different schemes
should be kept small. It is a large burden to expect that every
observatory and every department support a large number of different
algorithms. Even saying "Oh, as long as
[IRAF,AIPS,MIDAS,STSDAS,whatever] supports the decompression I am ok"
is simply dumping the problems onto Doug Tody, Bill Cotton, Preben
Grosbol, Bob Hanisch, or whoever supports the software, i.e., people
already under heavy burdens.
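For concreteness, the "compress with a variety of techniques and keep
the one which works best" approach quoted above could look roughly like
the following. This is a minimal sketch, assuming Python's standard
zlib, bz2 and lzma modules as stand-ins for whatever algorithms might
eventually be agreed on; the scheme names are hypothetical labels, not
keyword values from the proposal:

```python
import bz2
import lzma
import zlib

# Candidate schemes: label -> (compress, decompress). Labels are invented.
CODECS = {
    "NONE": (lambda b: b, lambda b: b),
    "ZLIB": (zlib.compress, zlib.decompress),
    "BZIP2": (bz2.compress, bz2.decompress),
    "LZMA": (lzma.compress, lzma.decompress),
}


def compress_best(data: bytes):
    """Run every codec and return (label, packed) for the smallest output."""
    results = {name: enc(data) for name, (enc, _) in CODECS.items()}
    best = min(results, key=lambda name: len(results[name]))
    return best, results[best]


def decompress(label: str, packed: bytes) -> bytes:
    """Undo compress_best, given the label that was recorded with the data."""
    return CODECS[label][1](packed)


# Hypothetical data: highly repetitive, so some real codec should win.
data = b"sky " * 5000
label, packed = compress_best(data)
assert decompress(label, packed) == data
```

The "NONE" entry guarantees the result is never larger than the input.
The cost is the one at issue in this thread: the writer runs every
codec on every data set, and every reader has to carry a decoder for
every label that might turn up.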
Also, I doubt the utility of running (as in the Warnock et al. draft)
five different compression schemes to obtain a further 5 percent
reduction in space. Table 1 in the Warnock et al. draft clearly shows
what we all know now. Some data sets compress well (using a generic
compress, not referring to any particular algorithm) and some data sets
do not. My conclusion from
Table 1 is that the space savings from different algorithms is a small
fraction of the range of savings possible, hence picking a good but
admittedly not best for all cases algorithm to standardize on is a win
in keeping down software complexity over supporting multiple
compression algorithms to gain 5-10 percent savings.

Does anyone draw a different conclusion?

Cheers,

--
-Greg Hennessy, University of Virginia
 USPS Mail:     Astronomy Department, Charlottesville, VA 22903-2475 USA
 Internet:      gsh7w at virginia.edu
 UUCP:          ...!uunet!virginia!gsh7w

From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:25:36 1991
Newsgroups: alt.sci.astro.fits
Organization: ST Systems Corp. - NASA/NCDS
News-Software: VAX/VMS VNEWS 1.41
Nntp-Posting-Host: nssdca.gsfc.nasa.gov
From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock)
Subject: Re: compression of fits
Date: 25 Sep 91 21:42:00 GMT

In article <1991Sep25.191636.4068 at murdoch.acc.Virginia.EDU>, gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes...
>While no single compression scheme is best for the many types of data
>that are available, I would argue that the number of different schemes
>should be kept small. It is a large burden to expect that every

No argument there.
It's time-consuming to repeatedly compress data to see
what works best. On the other hand (as an earlier posting showed -
Nelson?) there are pathological cases which can get bigger, not smaller.
Either we require the user to somehow characterize their data, or
educate the user as to which schemes compress which data the best, or we
let the software do that - or we use some generic scheme which doesn't
do too badly on anything, but doesn't do too well, either. In any event,
the proposal allows for all those eventualities.

I don't believe _anyone_ has done the type of comprehensive testing
necessary to draw conclusions about _which_ compression algorithms ought
to be adopted. Whatever the result of such studies in the future, the
main point of our proposal is to make the data compressed by _any_ of
those schemes transportable. The framework is what's proposed - how to
fill in the details remains to be resolved/investigated.

>particular algorithm) and some data sets do not. My conclusion from
>Table 1 is that the space savings from different algorithms is a small
>fraction of the range of savings possible, hence picking a good but
>admittedly not best for all cases algorithm to standardize on is a win
>in keeping down software complexity over supporting multiple
>compression algorithms to gain 5-10 percent savings.

Bang for the buck - a good concept. Computing time goes up
substantially with increasing compression ratio. I hope no one claims
to be able to decide for everyone else what investment level is
sufficient.

----------------------------------------------------------------------------
-- Archie Warnock                     Internet:  warnock at nssdc.gsfc.nasa.gov
-- ST Systems Corp.
SPAN: NSSDC::WARNOCK -- NASA/GSFC "Unix - JCL for the 90s" From dwells at fits.cx.nrao.edu Wed Oct 9 09:25:40 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["5440" "Wed" "25" "September" "1991" "20:44:29" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " "<6824208 at toto.iv>" "104" "Re: compression of fits" "^From:" nil nil "9" "1991092520:44:29" "compression of fits" (number " " mark " Don Wells Sep 25 104/5440 " thread-indent "\"Re: compression of fits\"\n") nil] nil) Newsgroups: alt.sci.astro.fits In-Reply-To: gsh7w at fermi.clas.Virginia.EDU's message of Tue, 24 Sep 91 22: 36:35 GMT Organization: National Radio Astronomy Observatory, Charlottesville, VA From: dwells at fits.cx.nrao.edu (Don Wells) Subject: Re: compression of fits Date: Wed, 25 Sep 1991 20:44:29 GMT In article <1991Sep24.223635.14295 at murdoch.acc.Virginia.EDU> gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes: > Archie Warnock writes: > #In general, compression is tricky - binary files just don't compress > #very well. > As with many things, the answer is "that depends". I agree with both statements. The real issue is what fraction of the bits of the file are "random", or alternatively what types of patterns can the compressor detect in binary data. > ... 2Kx2K images from... Astro-1 ... I use... "compress" ... 16 bit > signed integers ... compress utility ... to about 30 percent of... > origional ... experiment... converted... to 32 bit... IEEE... > suprized... that... compression ratio remained much the same. Most of the image probably has sky background with DN values of order 25-35, i.e., 5 bits per pixel typically, or else it is smooth with 5 bits of noise; this enables about 10 bits to be eliminated in every pixel. When such data are converted to IEEE they still only have a modest number of significant bits. In particular, the lower 16 bits of the mantissa are probably zero; this is worth 50%. 
In addition, there will be only a few possible exponents in the IEEE values, and so most of the exponent information will compress. Compression of the most significant byte is similar to the original problem. If the four bytes were separately compressed you might get down to about 15% of original, i.e., about the same number of bits as for the 16-bit version, but with the bytes interleaved compress will not be able to take full advantage of the fact that the four bytes have very different statistical distributions. > ... ran... compress on a UV data set ... 18 hours of VLA time, > continuum mode, 2 IFs, full polarization... 41.6 MBy... > compressed... was 23.1 MBy... 55% of... origional... I am surprised that the gain was this much. My guess is that much of the gain came from the random parameters, with things like baseline number and maybe weights having few values. I suspect that if it had not been full polarization the relative gain would have been greater (visibility data is probably fairly random noise to compress). =-=-=-= The strategy which I speculate will generally make the maximum gain is to split the datastream into streams which have homogeneous statistics, and compress each stream separately. For a table, compress each column as a stream. For random groups, compress each random parameter separately. For binary data I advocate using first differences (integer subtraction) in the streams to eliminate even more of the variation (this should be tested). It may be advantageous to compress even and odd bytes separately for 16-bit data, or use four streams for 32-bit. =-=-=-= I have done only one small experiment. Last November two astronomers in Germany asked me to help them get copies of several HST images which were in anonymous FTP on fits.cx.nrao.edu. They were having trouble with slow, congested, unreliable circuits in Germany at that time, and could not reliably FTP a 5 MB file.
I wrote a script and a trivial C routine to split the file into an even-byte file and an odd-byte file and compress each separately. The even-byte (high order byte) file compressed by 80%, but the noisy odd-byte file would not compress, so the net reduction was by 40%. Fortunately, it turned out that it was possible to FTP a 3 MB file. Sometimes compression can make the difference between success and failure! If this interests you, do anonFTP to fits.cx.nrao.edu [192.33.115.8] and fetch the smallest of the following three files (the tar file) in directory FITS/HST:
-rw-r--r-- 1 dwells vlb 5158080 Oct 19 1990 w0bs0102t_cvt.c0h
-rw-r--r-- 1 dwells vlb 4380234 Nov 15 1990 w0bs0102t_cvt.c0h.Z
-rw-r--r-- 1 dwells vlb 3088384 Nov 16 1990 w0bs0102t_cvt.c0h.tar
The tar file will unpack to two files and a C program and a Makefile, and it will compile the program and execute it to uncompress and zipper the two files back together to recover the 5.1 MB original. As you can see, compress on the un-split file only got about 15%, as compared to 40% on the split file. =-=-=-= I recommend that anyone who is serious about compression should consider applying a low-pass filter to the sampled data (e.g., convolve with Gaussian with, say, 2 pixel halfwidth). The purpose of the filtering is to increase the correlation of adjacent pixels (i.e., reduce high frequency noise) so that the subtraction will eliminate more bits. Even a modest reduction in resolution can make a gain in smoothness, with a resulting saving in bits. It will also be *very* important to consider how many of the least significant bits you really need after the filtering. Every least significant bit that you mask is a bit that won't go into your compressed dataset! -- Donald C.
Wells Associate Scientist dwells at nrao.edu National Radio Astronomy Observatory +1-804-296-0277 Edgemont Road Fax= +1-804-296-0278 Charlottesville, Virginia 22903-2475 USA 78:31.1W, 38:02.2N From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:30:38 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1357" "Thu" "26" "September" "1991" "14:19:00" "GMT" "Archie Warnock" "warnock at nssdca.gsfc.nasa.gov " "<5947676 at toto.iv>" "24" "Re: compression of fits" "^From:" nil nil "9" "1991092614:19:00" "compression of fits" (number " " mark " Archie Warnock Sep 26 24/1357 " thread-indent "\"Re: compression of fits\"\n") nil] nil) Newsgroups: alt.sci.astro.fits News-Software: VAX/VMS VNEWS 1.41 Nntp-Posting-Host: nssdca.gsfc.nasa.gov Organization: ST Systems Corp. - NASA/NCDS From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock) Subject: Re: compression of fits Date: Thu, 26 Sep 1991 14:19:00 GMT In article , dwells at fits.cx.nrao.edu (Don Wells) writes... >The general strategy which I speculate will generally make the maximum >gain is to split the datastream into streams which have homogeneous >statistics, and compress each stream separately. For a table, compress Another reason this might help matters, at least for compression schemes like that used in Unix compress, is that the symbol table for such schemes usually assumes 8-bit data (Huffman, LZ, etc. are usually designed for text, not numeric, data). I'm not aware of any investigation into 16-bit tables. >I recommend that anyone who is serious about compression should >consider applying a low-pass filter to the sampled data (e.g., >convolve with Gaussian with, say, 2 pixel halfwidth). The purpose of Yes, yes, yes - the more often we say it, the better our chances of actually getting the astronomical community to believe it - you don't necessarily need (or even _want_) all those bits, particularly if you're limited in storage space and/or bandwidth. 
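The even-byte/odd-byte experiment described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the script and C routine from the tar file: it assumes big-endian 16-bit pixels, uses zlib as a stand-in for Unix compress, and runs on synthetic data. The names split_bytes, zipper and synthetic_image are made up for this sketch.

```python
import random
import zlib


def split_bytes(data: bytes) -> tuple[bytes, bytes]:
    """Split a stream of 16-bit words into even-offset and odd-offset bytes."""
    return data[0::2], data[1::2]


def zipper(even: bytes, odd: bytes) -> bytes:
    """Interleave the two byte streams to recover the original stream."""
    out = bytearray(len(even) + len(odd))
    out[0::2] = even
    out[1::2] = odd
    return bytes(out)


def synthetic_image(npix: int, seed: int = 0) -> bytes:
    """Hypothetical test data: big-endian 16-bit pixels, a slowly rising
    background plus about 5 bits of noise per pixel."""
    rng = random.Random(seed)
    pixels = bytearray()
    for i in range(npix):
        value = 1000 + i // 64 + rng.randint(0, 31)
        pixels += value.to_bytes(2, "big")
    return bytes(pixels)


image = synthetic_image(100_000)
even, odd = split_bytes(image)          # even = high-order bytes here

whole_size = len(zlib.compress(image))
split_size = len(zlib.compress(even)) + len(zlib.compress(odd))
print("original bytes:        ", len(image))
print("compressed whole:      ", whole_size)
print("compressed as 2 planes:", split_size)

# The split must be lossless, whatever the compression gain turns out to be.
assert zipper(even, odd) == image
```

Whether the two-stream version wins, and by how much, depends on the data and on the compressor, so the printed sizes are something to measure rather than assume.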
---------------------------------------------------------------------------- -- Archie Warnock Internet: warnock at nssdc.gsfc.nasa.gov -- ST Systems Corp. SPAN: NSSDC::WARNOCK -- NASA/GSFC "Unix - JCL for the 90s" From pwb at newt.phys.unsw.oz.au Wed Oct 9 09:32:06 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1819" "" "25" "September" "91" "02:48:05" "GMT" "Paul W. Brooks" "pwb at newt.phys.unsw.oz.au " "<2611496 at toto.iv>" "34" "Re: compression of fits" "^From:" nil nil "9" "1991092502:48:05" "compression of fits" (number " " mark " Paul W. Brooks Sep 25 34/1819 " thread-indent "\"Re: compression of fits\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Summary: MSDOS compression 20% - 50% From: pwb at newt.phys.unsw.oz.au (Paul W. Brooks) Subject: Re: compression of fits Date: 25 Sep 91 02:48:05 GMT I have experimented extensively with compressing FITS images on our MS-DOS system (to try to squeeze more images on each tape). I tried all sorts of archiver algorithms, and found that uniform flat frames (such as bias or dark frames) could be compressed to about 45% of the original length, but any structure (such as large galaxies) reduced the effectiveness, until our flat fields (which are quite non-flat :-( ) could only be compressed by 20% - i.e. to 80% of the original length. This is because the compression algorithms all go looking for repeated sequences of bytes - the larger the dynamic range covered by an image, the less compression is possible, as there are fewer repeated sequences. In practice, at the byte level images are effectively random sequences (assuming you are sampling the sky noise or the CCD readout noise), which don't vary much.
What is required is a compression algorithm that: a) works on whole pixels (1, 2, 4 or 8 bytes depending on BITPIX) rather than single bytes; b) encodes the difference between a pixel and its neighbour (which is small, and might normally be stored in a single byte). Probably some other smarts are required as well, but compression algorithms designed for text probably will never work effectively on binary data. At least with the above system, the larger your pixel size, the better compression you get! (BITPIX=16 images could only achieve under 50% compression, BITPIX=32 might approach 75% compression) P.S. my images tested above were 600x400x16bits - close enough to 0.5 MB in size. Paul Brooks |Internet: pwb at newt.phys.unsw.edu.au Uni. of N.S.W. |If you have trouble sleeping, try lying on the end of Kensington NSW 2033| your bed. With a little luck you'll drop off. AUSTRALIA | - Mark Twain. From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:32:10 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["659" "" "25" "September" "91" "18:36:00" "GMT" "Archie Warnock" "warnock at nssdca.gsfc.nasa.gov " "<4462057 at toto.iv>" "11" "Re: compression of fits" "^From:" nil nil "9" "1991092518:36:00" "compression of fits" (number " " mark " Archie Warnock Sep 25 11/659 " thread-indent "\"Re: compression of fits\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Summary: MSDOS compression 20% - 50% Organization: ST Systems Corp. - NASA/NCDS News-Software: VAX/VMS VNEWS 1.41 Nntp-Posting-Host: nssdca.gsfc.nasa.gov From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock) Subject: Re: compression of fits Date: 25 Sep 91 18:36:00 GMT In article <2589 at usage.csd.unsw.oz.au>, pwb at newt.phys.unsw.oz.au (Paul W. Brooks) writes... >which don't vary much. What is required is a compression algorithm that: I claim that what is required is a set of algorithms to choose from. The range of data representations is too varied to find a single algorithm which works in all cases.
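The pixel-differencing idea in (a) and (b) above can be illustrated with a small sketch. It is not the PREVPIXEL algorithm from the proposal (whose details are not given in this thread), just a plain first-difference encoder under the assumption of signed big-endian 16-bit pixels (BITPIX=16); the function names are hypothetical.

```python
import struct
from itertools import accumulate


def unpack_pixels(raw: bytes) -> list[int]:
    """Interpret raw bytes as signed big-endian 16-bit pixels."""
    return list(struct.unpack(f">{len(raw) // 2}h", raw))


def delta_encode(pixels: list[int]) -> list[int]:
    """Keep the first pixel, then store each pixel minus its predecessor."""
    if not pixels:
        return []
    return [pixels[0]] + [b - a for a, b in zip(pixels, pixels[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Undo delta_encode with a running sum."""
    return list(accumulate(deltas))


raw = struct.pack(">5h", 1000, 1003, 1001, 1004, 1002)
pixels = unpack_pixels(raw)
deltas = delta_encode(pixels)

print(deltas)                 # [1000, 3, -2, 3, -2]
print(delta_decode(deltas))   # [1000, 1003, 1001, 1004, 1002]

# After the first value, every difference in this example fits in one signed byte.
assert all(-128 <= d <= 127 for d in deltas[1:])
```

On a smooth image most differences are small, so a later coding stage sees far fewer distinct symbols than it would in the raw pixels. A noisy frame gains much less, which matches the flat-field results reported above.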
Think bigger ---------------------------------------------------------------------------- -- Archie Warnock Internet: warnock at nssdc.gsfc.nasa.gov -- ST Systems Corp. SPAN: NSSDC::WARNOCK -- NASA/GSFC "Unix - JCL for the 90s" From dwells at fits.cx.nrao.edu Wed Oct 9 09:33:12 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1330" "Wed" "25" "September" "1991" "19:03:57" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " "<4007442 at toto.iv>" "26" "Warnock etal Compression Paper Available" "^From:" nil nil "9" "1991092519:03:57" "Warnock etal Compression Paper Available" (number " " mark " Don Wells Sep 25 26/1330 " thread-indent "\"Warnock etal Compression Paper Available\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Organization: National Radio Astronomy Observatory, Charlottesville, VA Distribution: alt From: dwells at fits.cx.nrao.edu (Don Wells) Subject: Warnock etal Compression Paper Available Date: Wed, 25 Sep 1991 19:03:57 GMT "An Extension of FITS for Data Compression", A.Warnock, R.Hill, B.Pfarr and D.Wells, 11 pages, is now available for anonymous-FTP access on fits.cx.nrao.edu [192.33.115.8] in directory FITS/doc: -rw-r--r-- 1 dwells vlb 171951 Sep 25 13:47 fitscomp.ps -rw-r--r-- 1 dwells vlb 27267 Sep 25 13:46 fitscomp.tex LaTeX typeset the date of the document as today 25Sept91. This is misleading; the document was originally composed circa 1988, and was presented in a WGAS paper session about that date. I (DCW) am a co-author on this paper, but this does not mean that I think that the approach presented in the paper is the only possible approach to compression of FITS files, or even the best approach. I reserve judgement on the latter question. My attitude is that the FITS community needs to study and debate a series of papers like this one, and construct some prototypes (hopefully interoperable prototypes!), before we settle on one or more canonical standard schemes. This paper is an essential step in the process. -- Donald C. 
Wells Associate Scientist dwells at nrao.edu National Radio Astronomy Observatory +1-804-296-0277 Edgemont Road Fax= +1-804-296-0278 Charlottesville, Virginia 22903-2475 USA 78:31.1W, 38:02.2N From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:33:35 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1286" "Thu" "26" "September" "1991" "14:14:00" "GMT" "Archie Warnock" "warnock at nssdca.gsfc.nasa.gov " "<7855565 at toto.iv>" "21" "Re: Warnock etal Compression Paper Available" "^From:" nil nil "9" "1991092614:14:00" "Warnock etal Compression Paper Available" (number " " mark " Archie Warnock Sep 26 21/1286 " thread-indent "\"Re: Warnock etal Compression Paper Available\"\n") nil] nil) Newsgroups: alt.sci.astro.fits News-Software: VAX/VMS VNEWS 1.41 Nntp-Posting-Host: nssdca.gsfc.nasa.gov Organization: ST Systems Corp. - NASA/NCDS Distribution: alt From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock) Subject: Re: Warnock etal Compression Paper Available Date: Thu, 26 Sep 1991 14:14:00 GMT In article , dwells at fits.cx.nrao.edu (Don Wells) writes... >I (DCW) am a co-author on this paper, but this does not mean that I >think that the approach presented in the paper is the only possible >approach to compression of FITS files, or even the best approach. I >reserve judgement on the latter question. My attitude is that the >FITS community needs to study and debate a series of papers like this >one, and construct some prototypes (hopefully interoperable >prototypes!), before we settle on one or more canonical standard >schemes. This paper is an essential step in the process. I agree strongly with this statement. We made Don a co-author in recognition of some major suggestions he made which (in our opinion) made the proposal a viable one, not just to share in the blame . It is a _proposal_, though. It's a starting point for discussion, and I hope it will generate plenty. I want problems to be uncovered _before_ it's in use too widely. 
---------------------------------------------------------------------------- -- Archie Warnock Internet: warnock at nssdc.gsfc.nasa.gov -- ST Systems Corp. SPAN: NSSDC::WARNOCK -- NASA/GSFC "Unix - JCL for the 90s" From warnock at nssdca.gsfc.nasa.gov Wed Oct 9 09:33:46 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1910" "" "26" "September" "91" "14:09:00" "GMT" "Archie Warnock" "warnock at nssdca.gsfc.nasa.gov " "<6362044 at toto.iv>" "38" "Re: compression of fits" "^From:" nil nil "9" "1991092614:09:00" "compression of fits" (number " " mark " Archie Warnock Sep 26 38/1910 " thread-indent "\"Re: compression of fits\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Organization: ST Systems Corp. - NASA/NCDS News-Software: VAX/VMS VNEWS 1.41 Nntp-Posting-Host: nssdca.gsfc.nasa.gov From: warnock at nssdca.gsfc.nasa.gov (Archie Warnock) Subject: Re: compression of fits Date: 26 Sep 91 14:09:00 GMT In article <1991Sep25.234751.26451 at newshost.anu.edu.au>, markus at mso.anu.edu (Markus Buchhorn) writes... >Uhmmm - 'scuse me - if you compress the header using >then how can the decompression program know which algorithm you chose ? >(You've compressed the keyword which actually tells you....) Not quite. Take your original FITS file, header and all, and compress it however you like. Take the resulting byte stream and glue it onto a header. The latter header is the one which gives the compression algorithm, along with the other information necessary to read the file. >Some sort of magic-cookie at the head of the file ? or am I misinterpreting Yep - another FITS header... (sorry - I just had to...) >I think that the FITSIO library written by .... [dang....sorry, forgotten] Bill Pence. >is going to become quite widespread. News I heard was that the FITSIO >stuff was *very* popular. 
Thus the need for support is dropped onto >only one person - and if people are prepared to help him, there shouldn't Well, it seems that's the way lots of stuff is done these days - by contributed labor. My hope/intention is that the "accepted" compression algorithms will be published so that anyone can code them up. That is the purpose of the appendix in the compression proposal itself - to illustrate the process by publishing the first algorithm (PREVPIXEL). >[BTW - have you put your comp. proposal docs up for ftp somewhere ? If you >did (and announced it) then we're not getting all the messages down (up :-) ) >here in Australia :-( ] Yep - Don Wells has posted it on fits.cx.nrao.edu. ---------------------------------------------------------------------------- -- Archie Warnock Internet: warnock at nssdc.gsfc.nasa.gov -- ST Systems Corp. SPAN: NSSDC::WARNOCK -- NASA/GSFC "Unix - JCL for the 90s" From dwells at fits.cx.nrao.edu Wed Oct 9 09:34:32 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1265" "Thu" "26" "September" "1991" "16:24:20" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " "<3432267 at toto.iv>" "36" "Sample single-dish FITS table available" "^From:" nil nil "9" "1991092616:24:20" "Sample single-dish FITS table available" (number " " mark " Don Wells Sep 26 36/1265 " thread-indent "\"Sample single-dish FITS table available\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Organization: National Radio Astronomy Observatory, Charlottesville, VA Distribution: alt From: dwells at fits.cx.nrao.edu (Don Wells) Subject: Sample single-dish FITS table available Date: Thu, 26 Sep 1991 16:24:20 GMT Frank Ghigo posted the following message to the old "dishfits" exploder about an hour ago: Message-Id: <9109261407.AA13070 at lodestar.gb.nrao.edu> From: fghigo at lodestar.GB.NRAO.EDU (FRANK GHIGO) Sender: dishfits-request at fits.CX.NRAO.EDU To: dishfits at lodestar.gb.nrao.edu Date: Thu, 26 Sep 91 10:07:17 EDT Sept. 
26, 1991 A sample single-dish FITS binary table is now available for examination or testing by anyone. The file is available by anonymous ftp from the machine "fits.cx.nrao.edu" in Charlottesville. To retrieve the demo file, do the following: ftp to "fits.cx.nrao.edu" login as "anonymous" cd FITS/SingleDish get gbfits.demo This file contains a few sample spectral line scans from the Green Bank 140-ft and from the Kitt Peak 12-meter. It conforms to the single-dish FITS agreement of Nov. 1989. Any comments or criticisms are welcome. -- frank ghigo (fghigo at nrao.edu) -- Donald C. Wells Associate Scientist dwells at nrao.edu National Radio Astronomy Observatory +1-804-296-0277 Edgemont Road Fax= +1-804-296-0278 Charlottesville, Virginia 22903-2475 USA 78:31.1W, 38:02.2N From sla at helios.ucsc.edu Wed Oct 9 09:35:04 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["2737" "" "26" "September" "91" "15:12:56" "GMT" "Steve Allen" "sla at helios.ucsc.edu " "<7390194 at toto.iv>" "47" "those trailing zeros in the last 2880 byte block" "^From:" nil nil "9" "1991092615:12:56" "those trailing zeros in the last 2880 byte block" (number " " mark " Steve Allen Sep 26 47/2737 " thread-indent "\"those trailing zeros in the last 2880 byte block\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Summary: just how important are they on a disk? Keywords: zeros, padding, logical records, random access media Organization: UCO/Lick Observatory From: sla at helios.ucsc.edu (Steve Allen) Subject: those trailing zeros in the last 2880 byte block Date: 26 Sep 91 15:12:56 GMT In article <1991Sep25.151241.1531 at murdoch.acc.Virginia.EDU> gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes: > The original FITS papers considered this point, and >conceded that not all reader programs can read every FITS file, but >the programs should at least exit gracefully. 
I have encounted data >sets that were called FITS files, that did not have the last record >padded out to 2880 octects, and some FITS readers die horribly at >this. Blocking out the last record to 2880 octects is not very >wastefull of space, at least for modern data sets >-Greg Hennessy, University of Virginia First, to make something perfectly clear: When you observe at Lick Observatory and bring home a magnetic tape (9-track or exabyte) written by the data acquisition system, that tape always has physical records which are multiples of 2880 bytes long. The last record after the image data is padded with zeros. So Lick generates conforming FITS tapes. However, most images spend at least some time on disk during the observing run, and during this time they are usually manipulated by an image processing package (namely Vista*). The images on the disk are not written with the zero padding to fill out the records to 2880 bytes. They simply terminate immediately after the last byte of the image. (This is a Berkeley Unix system). Vista is not bothered by this. It has been becoming more and more common for observers to transfer their images via methods other than magnetic tape (namely ftp--Lick does not officially endorse this yet.) When such an image is then read by IRAF's rfits, IRAF chokes and fails. I do not consider this to be exiting "gracefully." I have suggested to these observers that they ask NOAO to fix IRAF to be "graceful" in this case. Lick is not opposed to changing the data acquisition system such that it writes out the extra zeros. This will cost some disk space on a disk which is already too small. Furthermore, when dealing with the images from some of the new (and small) IR array detectors, the proportional increase in size is not insignificant. We would welcome opinions on the importance of the zero-padding to 2880, along with reasons that explain why not having those zeros is bad for FITS readers other than IRAF. (*) P.S.
Vista is available for most modern computers via anonymous ftp. Contact Jon Holtzman (holtz at lowell.edu) for more details. _______________________________________________________________________________ Steve Allen | | sla at helios.ucsc.edu UCO/Lick Observatory | This space for rent. | If the UC were opining, Santa Cruz, CA 95064 | | it wouldn't tell me. From bschlesinger at nssdcb.gsfc.nasa.gov Wed Oct 9 09:35:14 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["3107" "Fri" "27" "September" "1991" "20:16:00" "GMT" "Barry Schlesinger" "bschlesinger at nssdcb.gsfc.nasa.gov " "<5339465 at toto.iv>" "57" "Re: those trailing zeros in the last 2880 byte block" "^From:" nil nil "9" "1991092720:16:00" "those trailing zeros in the last 2880 byte block" (number " " mark " Barry Schlesinger Sep 27 57/3107 " thread-indent "\"Re: those trailing zeros in the last 2880 byte block\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Summary: just how important are they on a disk? News-Software: VAX/VMS VNEWS 1.41 Keywords: zeros, padding, logical records, random access media Nntp-Posting-Host: nssdcb.gsfc.nasa.gov Organization: NASA - Goddard Space Flight Center From: bschlesinger at nssdcb.gsfc.nasa.gov (Barry Schlesinger) Subject: Re: those trailing zeros in the last 2880 byte block Date: Fri, 27 Sep 1991 20:16:00 GMT In article <21295 at darkstar.ucsc.edu>, sla at helios.ucsc.edu (Steve Allen) writes... >In article <1991Sep25.151241.1531 at murdoch.acc.Virginia.EDU> gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes: >> The original FITS papers considered this point, and >>conceded that not all reader programs can read every FITS file, but >>the programs should at least exit gracefully. I have encounted data >>sets that were called FITS files, that did not have the last record >>padded out to 2880 octects, and some FITS readers die horribly at >>this. 
Blocking out the last record to 2880 octects is not very >>wastefull of space, at least for modern data sets >>-Greg Hennessy, University of Virginia > > ... > >It has been becoming more and more common for observers to transfer their >images via methods other than magnetic tape (namely ftp--Lick does not >officially endorse this yet.) When such an image is then read by IRAF's >rfits, IRAF chokes and fails. I do not consider this to be exiting >"gracefully." I have suggested to these observers that they ask NOAO >to fix IRAF to be "graceful" in this case. > > ... > >We would welcome opinions on the importance of the zero-padding to 2880, >along with reasons that explain why not having those zeros is bad for >FITS readers other than IRAF. >_______________________________________________________________________________ >Steve Allen | | sla at helios.ucsc.edu >UCO/Lick Observatory | This space for rent. | If the UC were opining, >Santa Cruz, CA 95064 | | it wouldn't tell me. The mandated logical record (hereafter record) size for FITS is 2880 bytes. FITS readers take advantage of this requirement by picking up the data sets 2880 bytes at a time, particularly for media other than tape and electronic transmissions where there is no indicator of an end of record. So, to begin, the software has to be able to handle a record of a different length and distinguish the end record from an error. That's one complication. Now suppose someone gets this data set and adds extensions after it. For standard FITS, the software need only check the beginning of each 2880-byte gulp for the XTENSION keyword and type name. If the data are not padded, then the reader has to be prepared to look for the XTENSION keyword elsewhere. That is a more complicated process than simply picking up 2880 bytes at a time. A minimalist FITS reader designed to take advantage of the 2880-byte unit specified in the FITS papers would not expect to have to make such a check.
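To make the argument concrete, here is a minimal sketch of a writer that pads to the record size and a "minimalist" reader that looks for XTENSION only at the start of each 2880-byte record. The function names are hypothetical, the toy headers are not complete FITS headers, and the data fill is assumed to be zero bytes as discussed in this thread.

```python
FITS_RECORD = 2880  # mandated logical record size in bytes


def pad_to_record(data: bytes, fill: bytes = b"\x00") -> bytes:
    """Append fill bytes so the length is a multiple of 2880."""
    shortfall = -len(data) % FITS_RECORD
    return data + fill * shortfall


def extension_offsets(stream: bytes) -> list[int]:
    """Offsets of the records that begin with the XTENSION keyword.

    Only record boundaries are examined, as a minimalist reader would do."""
    return [
        offset
        for offset in range(0, len(stream), FITS_RECORD)
        if stream[offset:offset + 8] == b"XTENSION"
    ]


# Toy file: primary header, 100 bytes of data, then one extension header.
header = pad_to_record(b"SIMPLE  =                    T", fill=b" ")
data = b"\x01" * 100
extension = pad_to_record(b"XTENSION= 'BINTABLE'", fill=b" ")

padded_file = header + pad_to_record(data) + extension
unpadded_file = header + data + extension

print(extension_offsets(padded_file))    # [5760]
print(extension_offsets(unpadded_file))  # []
```

With the padding in place the extension is found at the third record. Without it the XTENSION keyword sits at byte 2980, off any record boundary, and the simple reader never sees it.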
Standards are devised to provide a common ground for different users and systems. As soon as a group starts letting a requirement slide because it is inconvenient, this common ground is lost. As a lack of fill is a violation of the FITS standard, as expressed in the FITS papers, any FITS-checking software should at the very least flag it. Ours will. Barry Schlesinger NASA/OSSA Office of Standards and Technology (NOST) FITS Support Office From thompson at stars.gsfc.nasa.gov Wed Oct 9 09:35:53 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["4688" "" "26" "September" "91" "16:24:09" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<6966542 at toto.iv>" "91" "Any thoughts on nD arrays in binary tables?" "^From:" nil nil "9" "1991092616:24:09" "Any thoughts on nD arrays in binary tables?" (number " " mark " Bill Thompson Sep 26 91/4688 " thread-indent "\"Any thoughts on nD arrays in binary tables?\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Reply-To: thompson at stars.gsfc.nasa.gov Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA News-Software: VAX/VMS VNEWS 1.3-4 Nntp-Posting-Host: stars.gsfc.nasa.gov From: thompson at stars.gsfc.nasa.gov (Bill Thompson) Subject: Any thoughts on nD arrays in binary tables? Date: 26 Sep 91 16:24:09 GMT I am connected with the Coronal Diagnostic Spectrometer for the Solar Heliospheric Observatory (SOHO). We plan to use FITS as the standard for data storage. My problem is that our data does not seem to exactly fit into the FITS standard. The detector we are building uses a CCD chip for readout. We will be able to operate this chip in "windowed-mode" in which we will be able to select out up to 40 rectangular areas on the CCD chip that will actually be read out, with the rest of the chip being ignored for speed. One way to represent this in a data file would be to reconstruct the entire chip, with areas outside the windows being set to zero or NaN. 
However, this would be extremely wasteful of space, so I would prefer to store each window as a separate array within the same file. The problem is that each window could have a different size from all the others. I just downloaded the document "Binary Table Extension to FITS: A Proposal" by W. D. Cotton and D. Tody. The draft date is given as 20 September 1991. I found it very interesting. In particular I was interested in Appendix A, the "Multidimensional Array" Convention (I haven't digested Appendix B yet). It seems to address this very issue. Basically, they suggest using, as part of the binary tables extension, the additional keywords TDIM1, TDIM2, etc., where the number refers to the column associated with it. I give an example below of my interpretation of how this would be coded in the header. This example shows only a single exposure, but I can imagine a series of identically formatted exposures with each row in the binary table representing a specific exposure. This would be quite consistent with the way we intend to operate our instrument. I guess I'm still a little nervous about the fact that the TDIMnnn keyword may not be generally supported by FITS readers. I'm not worried that we couldn't read our data, but that colleagues we send our data to could not use it. According to the Cotton and Tody document, this keyword (Appendix A) is not part of their proposed binary table definition. I've also seen mention of something called the "Green Bank Convention" in an earlier news article by Don Wells. If I understand this correctly, the format and dimensions of the array (the equivalents of NAXIS, NAXIS1, etc.) would each appear as separate columns in the binary table, and each row would represent a different window from the same exposure. The data itself could either be null filled to the maximum array size, or stored as variable length arrays using the facility described in Appendix B of the Cotton and Tody document. Either way would work at some level.
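The bookkeeping behind the TDIMn approach can be checked with a short sketch. It takes the three window sizes used in the example header of this message, assumes 16-bit integer pixels (TFORM code I), and derives the TFORMn and TDIMn values and the row width NAXIS1. The helper names are invented for illustration and are not part of any FITS library.

```python
from math import prod

BYTES_PER_PIXEL = 2  # TFORM code 'I' is a 16-bit integer

# Window dimensions as they would appear in TDIMn.
windows = [(10, 100), (20, 100), (50, 50)]


def column_keywords(n: int, dims: tuple[int, ...]) -> dict[str, str]:
    """TFORMn and TDIMn values for one window stored as a single table column."""
    npix = prod(dims)
    return {
        f"TFORM{n}": f"{npix}I",
        f"TDIM{n}": "(" + ",".join(str(d) for d in dims) + ")",
    }


def row_width(all_dims: list[tuple[int, ...]]) -> int:
    """Width of one table row in bytes (the NAXIS1 value)."""
    return sum(prod(dims) for dims in all_dims) * BYTES_PER_PIXEL


for n, dims in enumerate(windows, start=1):
    print(column_keywords(n, dims))
print("NAXIS1 =", row_width(windows))  # NAXIS1 = 11000
```

The 1000 + 2000 + 2500 pixels at two bytes each give the 11000-byte row width. A reader that ignores TDIMn still sees three one-dimensional columns of the right length, which is the fallback behaviour one would hope for.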
Personally, I prefer the "TDIMn" approach--it seems less complicated to me. I figure, though, that other people must be struggling with similar problems, and I was wondering what the community saw themselves as standardizing on. Thank you, Bill Thompson

Example (using TDIMn)

From my understanding, the following should be a valid binary tables extension header. It describes an observation with three windows on the CCD.

Window   Size     Position
  1      10x100   10,10
  2      20x100   200,10
  3      50x50    500,100

The keywords TROWnnn and TCOLnnn were introduced by myself. This can easily be generalized to a series of observations made with the same window configuration simply by modifying NAXIS2.

XTENSION= 'BINTABLE'           / Binary table extension
BITPIX  =                    8 / Binary data
NAXIS   =                    2 / Table is a matrix
NAXIS1  =                11000 / Width of table row in bytes
NAXIS2  =                    1 / Number of rows in table
PCOUNT  =                    0 / Random parameter count
GCOUNT  =                    1 / Group count
TFIELDS =                    3 / Number of windows
TFORM1  = '1000I'              / 16-bit integer
TTYPE1  = 'Window #1'          / Label for window 1
TDIM1   = '(10,100)'           / Dimensions of window 1
TROW1   =                   10 / Starting row of window 1
TCOL1   =                   10 / Starting column of window 1
TFORM2  = '2000I'              / 16-bit integer
TTYPE2  = 'Window #2'          / Label for window 2
TDIM2   = '(20,100)'           / Dimensions of window 2
TROW2   =                  200 / Starting row of window 2
TCOL2   =                   10 / Starting column of window 2
TFORM3  = '2500I'              / 16-bit integer
TTYPE3  = 'Window #3'          / Label for window 3
TDIM3   = '(50,50)'            / Dimensions of window 3
TROW3   =                  500 / Starting row of window 3
TCOL3   =                  100 / Starting column of window 3
END

From tody at noao.edu Wed Oct 9 09:35:58 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["4458" "" "27" "September" "91" "21:47:21" "GMT" "Doug Tody" "tody at noao.edu " "<5338129 at toto.iv>" "77" "Re: Any thoughts on nD arrays in binary tables?" "^From:" nil nil "9" "1991092721:47:21" "Any thoughts on nD arrays in binary tables?"
(number " " mark " Doug Tody Sep 27 77/4458 " thread-indent "\"Re: Any thoughts on nD arrays in binary tables?\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Organization: National Optical Astronomy Observatories, Tucson, AZ, USA From: tody at noao.edu (Doug Tody) Subject: Re: Any thoughts on nD arrays in binary tables? Date: 27 Sep 91 21:47:21 GMT > From article <1991Sep26.152517.28870 at nsisrv.gsfc.nasa.gov>, by > thompson at stars.gsfc.nasa.gov (Bill Thompson): > > I am connected with the Coronal Diagnostic Spectrometer for the Solar > Heliospheric Observatory (SOHO). We plan to use FITS as the standard for > data storage. My problem is that our data does not seem to exactly fit into > the FITS standard. > > The detector we are building uses a CCD chip for readout. We will be able > to operate this chip in "windowed-mode" in which we will be able to select > out up to 40 rectangular areas on the CCD chip that will actually be read > out, with the rest of the chip being ignored for speed. One way to > represent this in a data file would be to reconstruct the entire chip, with > areas outside the windows being set to zero or NaN. However, this would be > extremely wasteful of space, so I would prefer to store each window as a > separate array within the same file. The problem is that each window could > have a different size from all the others. A third possibility which you didn't mention would be to write each subregion out as a separate FITS image. If ease of reading the data at another site is important this might be the best approach. Provided the subregions are large (several hundred pixels square or larger) the overhead would not be so bad. I agree that if you have to record many small subregions, the complexity of a table might well be justified. If you decide to use a table there are two possible approaches, as outlined in your article: one table row per subregion, or all the subregions for an exposure on one row. Either approach will work. 
If you want another opinion, I think I would probably go with a simple record structure and output one subregion per table row (as in the Green Bank convention), using two columns to specify the subregion size in X and Y, and one array type column for the pixels. A fixed size array should be used if the regions do not vary greatly in size; only if this becomes seriously inefficient should you consider using a variable length array. > I just downloaded the document "Binary Table Extension to FITS: A Proposal" > by W. D. Cotton and D. Tody. The draft date is given as 20 September 1991. > I found it very interesting. In particular I was interested in Appendix A, > the "Multidimensional Array" Convention (I haven't digested Appendix B > yet). It seems to address this vary issue. If you want to use binary tables I should remind you that this is not yet a FITS standard. That doesn't mean that you shouldn't use binary tables, just that there is no guarantee that things will not change. Also there is not a whole lot of software out there yet that can do anything very useful with such data. > Basically, they suggest using, as part of the binary tables extension, the > additional keywords TDIM1, TDIM2, etc., where the number refers to the > column associated with it. I give an example below of my interpretation of > how this would be coded in the header. This example shows only a single > exposure, but I can imagine a series of identically formatted exposures with > each row in the binary table representing a specific exposure. This would > be quite consistent with the way we intend to operate our instrument. > > I guess I'm still a little nervous about the fact that the TDIMnnn keyword > may not be generally supported by FITS readers. I'm not worried that we > couldn't read our data, but that colleagues we send our data to could not > use it. According to the Cotton and Tody document, this keyword (Appendix > A) is not part of their proposed binary table definition. 
The binary tables proposal tries to avoid the issue of how to represent multidimensional arrays because it is a harder problem than it might at first appear. This very discussion points out the kind of problem that you can run into - TDIM doesn't work if the array size changes in each record. We need more practical experience with embedding multidimensional arrays in tables before we can extend FITS in this area. > Either way would work at some level. Right. It is a database design problem. Tables just provides the method, it doesn't help you decide how to represent the data. -- Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217 UUCP: {arizona,decvax,ncar}!noao!tody or uunet!noao.edu!tody Internet: tody at noao.edu SPAN/HEPNET: NOAO::TODY (NOAO=5355) From thompson at stars.gsfc.nasa.gov Wed Oct 9 09:36:06 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["4190" "" "27" "September" "91" "23:06:13" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<3884511 at toto.iv>" "83" "Multidimensional arrays in binary tables" "^From:" nil nil "9" "1991092723:06:13" "Multidimensional arrays in binary tables" (number " " mark " Bill Thompson Sep 27 83/4190 " thread-indent "\"Multidimensional arrays in binary tables\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Reply-To: thompson at stars.gsfc.nasa.gov Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA News-Software: VAX/VMS VNEWS 1.3-4 Nntp-Posting-Host: stars.gsfc.nasa.gov From: thompson at stars.gsfc.nasa.gov (Bill Thompson) Subject: Multidimensional arrays in binary tables Date: 27 Sep 91 23:06:13 GMT Earlier, I sent an example of what I thought a header would look like with Cotton and Tody's suggested TDIMnnn convention. I was in favor of that convention, but wondered whether other people were planning to use and/or support this convention. This time I'm sending the same example case, but using something Don Wells calls the "Green Bank Convention". 
My understanding of the G.B. convention is as follows. A column is assigned for each keyword of the basic FITS header, with the column label designating which keyword, except that NAXISn becomes MAXISn. Keywords which would be the same for each row of the table can be entered as a keyword in the binary table header rather than as a column (it is described as a virtual column). The multi-dimensional data array is stored as another column with a specific column label which I've assumed to be ARRAY. I chose to put the keywords MAXIS1 and MAXIS2 in columns 1 and 2. Since MAXIS would always be 2, I put it in the header. I didn't think that the equivalent of BITPIX (MBITPIX?) would be necessary, as that same information is given by the keyword TFORM5. While using the variable length facility is not required by this convention, I used it because it represents the most efficient way of storing the data. Otherwise, padding would be required. A significant difference between this convention and the TDIMnnn convention, is that this allows only one multidimensional array per row in the binary table. It is therefore not obvious to me how one would most adequately describe a situation in which there would be a series of identical exposures, with the same window configuration in each exposure. With the TDIMnnn convention, the most straightforward way to do this was to make each row in the binary table a different exposure. Another difference is that TDIMnnn signaled its use from the very presence of the TDIMnnn keyword. It seems to me that it would be harder for a generalized FITS reader to recognize when the "Green Bank Convention" was being used. Example (using "Green Bank Convention") >From my understanding, the following should be a valid binary tables extension header. This is as close an approximation as I can get to the "Green Bank Convention" based on the information I currently have. It describes an observation with three windows on the CCD. 

    Window    Size      Position
    1         10x100    10,10
    2         20x100    200,10
    3         50x50     500,100

In this example, the array sizes and positions are stored inside the binary table.

XTENSION= 'BINTABLE'           / Binary table extension
BITPIX  =                    8 / Binary data
NAXIS   =                    2 / Table is a matrix
NAXIS1  =                   12 / Width of table row in bytes
NAXIS2  =                    3 / Number of rows in table
PCOUNT  =                11000 / Variable length parameter count
GCOUNT  =                    1 / Group count
TFIELDS =                    5 / Number of columns
THEAP   =                   36 / Start byte of variable length parameters
TFORM1  = '1I      '           / 16-bit integer
TTYPE1  = 'MAXIS1  '           / Column 1 contains the 1st dimension size
TFORM2  = '1I      '           / 16-bit integer
TTYPE2  = 'MAXIS2  '           / Column 2 contains the 2nd dimension size
TFORM3  = '1I      '           / 16-bit integer
TTYPE3  = 'COLUMN0 '           / Column 3 contains window starting column
TFORM4  = '1I      '           / 16-bit integer
TTYPE4  = 'ROW0    '           / Column 4 contains window starting row
TFORM5  = '1PI(2500)'          / Variable length 16-bit integer array
TTYPE5  = 'ARRAY   '           / Column 5 contains the window array
MAXIS   =                    2 / Each window has 2 dimensions
END

The data portion of this FITS extension would be organized as is shown symbolically below:

    10 ; 100 ;  10 ;  10 ; ;
    20 ; 100 ; 200 ;  10 ; ;
    50 ;  50 ; 500 ; 100 ; ;
    Variable array 1 (2000 bytes) ;
    Variable array 2 (4000 bytes) ;
    Variable array 3 (5000 bytes)

There are no gaps anywhere in the datastream. The variable length arrays start immediately after the last byte in the binary table.
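
The byte counts in the example above can be rechecked with a few lines of Python. This is only a sketch: the function name and the dictionary keys are made up for illustration, and the 4-byte width of the variable-length descriptor is an assumption inferred from NAXIS1 = 12 in the example (four 2-byte integers plus 4 bytes). The BINTABLE extension as later standardized stores the 'P' descriptor as two 32-bit integers (8 bytes); passing descriptor_bytes=8 shows how the numbers would shift in that case.

```python
def heap_layout(windows, scalar_columns=4, descriptor_bytes=4):
    """Recompute the header values for a table with one window per row.

    windows          -- list of (MAXIS1, MAXIS2) sizes, one per table row
    scalar_columns   -- number of 16-bit integer columns (TFORM '1I')
    descriptor_bytes -- width of the variable-length descriptor column;
                        4 is an assumption that reproduces NAXIS1 = 12
    """
    bytes_per_pixel = 2  # TFORM code 'I' is a 16-bit integer
    row_bytes = scalar_columns * 2 + descriptor_bytes
    array_bytes = [nx * ny * bytes_per_pixel for nx, ny in windows]

    # Each array starts where the previous one ended: no gaps in the heap.
    offsets, position = [], 0
    for size in array_bytes:
        offsets.append(position)
        position += size

    return {
        "NAXIS1": row_bytes,                 # width of one table row
        "NAXIS2": len(windows),              # one row per window
        "THEAP": row_bytes * len(windows),   # heap follows the last row
        "PCOUNT": position,                  # total bytes in the heap
        "max_elements": max(nx * ny for nx, ny in windows),
        "array_bytes": array_bytes,
        "heap_offsets": offsets,
    }


# The three windows of the example: 10x100, 20x100 and 50x50 pixels.
windows = [(10, 100), (20, 100), (50, 50)]
layout = heap_layout(windows)
print(layout)
```

With the default arguments this gives NAXIS1 = 12, THEAP = 36 and PCOUNT = 11000, matching the header, and max_elements = 2500 matches the maximum length declared in TFORM5.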
From tody at noao.edu Wed Oct 9 09:36:10 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1310" "" "28" "September" "91" "03:02:26" "GMT" "Doug Tody" "tody at noao.edu " "<5209261 at toto.iv>" "22" "Re: Multidimensional arrays in binary tables" "^From:" nil nil "9" "1991092803:02:26" "Multidimensional arrays in binary tables" (number " " mark " Doug Tody Sep 28 22/1310 " thread-indent "\"Re: Multidimensional arrays in binary tables\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Organization: National Optical Astronomy Observatories, Tucson, AZ, USA From: tody at noao.edu (Doug Tody) Subject: Re: Multidimensional arrays in binary tables Date: 28 Sep 91 03:02:26 GMT >From article <1991Sep27.220632.11388 at nsisrv.gsfc.nasa.gov>, by thompson at stars.gsfc.nasa.gov (Bill Thompson): > A significant difference between this convention and the TDIMnnn convention, > is that this allows only one multidimensional array per row in the binary > table. It is therefore not obvious to me how one would most adequately > describe a situation in which there would be a series of identical > exposures, with the same window configuration in each exposure. With the > TDIMnnn convention, the most straightforward way to do this was to make each > row in the binary table a different exposure. Bill, that's easy - you need to add another column giving the exposure number to the table. Select all rows having the same exposure number, and you have all the regions output for that exposure. The advantage of this over the all-on-a-row approach is that the record structure remains the same regardless of the number of subregions you read out. This, along with the modest number of columns, could significantly simplify the software required to produce and read the table. 
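
The selection described above might look like this in Python. It is a sketch rather than a reader for real files: rows are plain dictionaries, the column name EXPOSURE is hypothetical (the post only says "another column giving the exposure number"), and the remaining column names are taken from Bill Thompson's Green Bank example.

```python
from collections import defaultdict

# One table row per subregion. EXPOSURE is a hypothetical column name;
# MAXIS1, MAXIS2, COLUMN0 and ROW0 follow the earlier example header.
rows = [
    {"EXPOSURE": 1, "MAXIS1": 10, "MAXIS2": 100, "COLUMN0": 10, "ROW0": 10},
    {"EXPOSURE": 1, "MAXIS1": 20, "MAXIS2": 100, "COLUMN0": 200, "ROW0": 10},
    {"EXPOSURE": 1, "MAXIS1": 50, "MAXIS2": 50, "COLUMN0": 500, "ROW0": 100},
    {"EXPOSURE": 2, "MAXIS1": 10, "MAXIS2": 100, "COLUMN0": 10, "ROW0": 10},
    {"EXPOSURE": 2, "MAXIS1": 20, "MAXIS2": 100, "COLUMN0": 200, "ROW0": 10},
    {"EXPOSURE": 2, "MAXIS1": 50, "MAXIS2": 50, "COLUMN0": 500, "ROW0": 100},
]


def regions_by_exposure(table_rows):
    """Group rows by exposure number, keeping their order in the table."""
    groups = defaultdict(list)
    for row in table_rows:
        groups[row["EXPOSURE"]].append(row)
    return dict(groups)


groups = regions_by_exposure(rows)
for exposure, regions in groups.items():
    sizes = [(r["MAXIS1"], r["MAXIS2"]) for r in regions]
    print(exposure, sizes)
```

The record structure is the same for every row, so adding a fourth window to an exposure only adds a row and does not change the set of columns.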
-- Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217 UUCP: {arizona,decvax,ncar}!noao!tody or uunet!noao.edu!tody Internet: tody at noao.edu SPAN/HEPNET: NOAO::TODY (NOAO=5355) From thompson at stars.gsfc.nasa.gov Wed Oct 9 09:36:14 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1646" "" "30" "September" "91" "16:26:48" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<1603633 at toto.iv>" "29" "Re: Multidimensional arrays in binary tables" "^From:" nil nil "9" "1991093016:26:48" "Multidimensional arrays in binary tables" (number " " mark " Bill Thompson Sep 30 29/1646 " thread-indent "\"Re: Multidimensional arrays in binary tables\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Reply-To: thompson at stars.gsfc.nasa.gov Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA News-Software: VAX/VMS VNEWS 1.3-4 Nntp-Posting-Host: stars.gsfc.nasa.gov From: thompson at stars.gsfc.nasa.gov (Bill Thompson) Subject: Re: Multidimensional arrays in binary tables Date: 30 Sep 91 16:26:48 GMT In article <1991Sep28.030226.5900 at noao.edu>, tody at noao.edu (Doug Tody) writes... >> A significant difference between this convention and the TDIMnnn convention, >> is that this allows only one multidimensional array per row in the binary >> table. It is therefore not obvious to me how one would most adequately >> describe a situation in which there would be a series of identical >> exposures, with the same window configuration in each exposure. With the >> TDIMnnn convention, the most straightforward way to do this was to make each >> row in the binary table a different exposure. > >Bill, that's easy - you need to add another column giving the exposure >number to the table. Select all rows having the same exposure number, >and you have all the regions output for that exposure. Yes, I also thought of that. 
The difficulty with that approach is that it doesn't really express the underlying structure of my anticipated database. I think that should also be the goal of designing a FITS file format. However, I could also think of cases where that approach would make more sense than putting everything in a separate column. I am in the process of preparing a proposal that would allow me to use the "Green Bank" convention, but still delineate the difference between windows in the same exposure, and different exposures. I will send out this proposal in a later message. Another one of my goals is to avoid padding out oversized arrays. The TDIMnnn approach allows me to do that in one way. The "Green Bank" convention also allows me to do that, but only if I use the more complicated variable array size facility. From thompson at stars.gsfc.nasa.gov Wed Oct 9 09:36:26 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["2620" "" "27" "September" "91" "23:07:39" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<7785368 at toto.iv>" "48" "Proposed BINTABLE binary table extension to FITS" "^From:" nil nil "9" "1991092723:07:39" "Proposed BINTABLE binary table extension to FITS" (number " " mark " Bill Thompson Sep 27 48/2620 " thread-indent "\"Proposed BINTABLE binary table extension to FITS\"\n") nil] nil) Newsgroups: alt.sci.astro.fits Reply-To: thompson at stars.gsfc.nasa.gov Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA News-Software: VAX/VMS VNEWS 1.3-4 Nntp-Posting-Host: stars.gsfc.nasa.gov From: thompson at stars.gsfc.nasa.gov (Bill Thompson) Subject: Proposed BINTABLE binary table extension to FITS Date: 27 Sep 91 23:07:39 GMT I have been looking at the document "Binary Table Extension to FITS: A Proposal" (DRAFT 20 September 1991), by W. D. Cotton and D. Tody. Principally, I have been looking into the question of entering multidimensional arrays into the binary tables. 
So far I have come across two different possible conventions on how to do this. The proposal discusses one such convention, the TDIMnnn convention, in Appendix A, and discusses the possibilities of other conventions in its section 4. Don Wells advocates something called the "Green Bank Convention". Cotton and Tody also discuss the possibility of more complicated structural information being entered into elements of binary tables.

Since everyone I've talked to seems to think that binary tables are the wave of the future, it may be advantageous to think about how the binary tables concept will be extended in the future. The TDIMnnn convention requires that the keywords TDIM1, TDIM2, etc. be reserved for the use of this convention. Certainly other conventions will also impose similar requirements.

What I propose is that another keyword be reserved as part of the basic binary tables (BINTABLE) extension. This keyword (let's call it CONVENTN for now) would signal the presence of additional constraints on the creation of user defined keywords. For instance, if the value of CONVENTN were 'GBC', then this would mean that the additional keywords MAXIS, MAXIS1, MAXIS2, etc. were reserved.

The idea is that rather than try to anticipate all the possible reserved words now, we could reserve one additional keyword which allows us to anticipate future improvements. Otherwise we would have to register each improvement as a different version of XTENSION, which I feel is more awkward.

The following rules would apply to the use of the CONVENTN keyword.

1. The CONVENTN keyword would not be a required keyword, but would be optional. If the CONVENTN keyword is absent, then only the standard restrictions apply on making up names for keywords.

2. A file using the CONVENTN keyword would be written in such a way that a FITS reader that did not support the use of the CONVENTN keyword, or that did not recognize its value, would still be able to read the file according to the standard of the Binary Tables Extension.

3. Only registered values would be allowed for the CONVENTN keyword. These values, and the associated description of the convention, would become part of the definition of the Binary Table Extension through some controlled process.

I throw this out for general discussion.

Bill Thompson

From pence at heawk1.gsfc.nasa.gov Wed Oct 9 09:36:35 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["656" "Mon" "30" "September" "1991" "16:12:37" "GMT" " William Pence" "pence at heawk1.gsfc.nasa.gov " "<1101365 at toto.iv>" "18" "Re: FITS V3.0 available from NRL" "^From:" nil nil "9" "1991093016:12:37" "FITS V3.0 available from NRL" (number " " mark " William Pence Sep 30 18/656 " thread-indent "\"Re: FITS V3.0 available from NRL\"\n") nil] nil)
Newsgroups: alt.sci.astro.fits
Nntp-Posting-Host: heawk1.gsfc.nasa.gov
Organization: Goddard Space Flight Center
From: pence at heawk1.gsfc.nasa.gov ( William Pence)
Subject: Re: FITS V3.0 available from NRL
Date: Mon, 30 Sep 1991 16:12:37 GMT

In general I would discourage users from obtaining 2nd hand copies of the FITSIO software because one may not get the latest version. As a case in point, the new version 3.01 of FITSIO, which fixes a relatively minor bug, was released just last Friday. Currently there are 2 ways to receive the most recent version of FITSIO:

1) by anonymous ftp from tetra.gsfc.nasa.gov in subdirectory pub/fitsio3

2) over the SPAN network (e.g., with DECNET copy) from NDADSA::HEASARC:[EXOSAT.XANADU.FITSIO.VERSION3]

If any users have difficulty accessing the files by either of these methods, then they should send me a message.
-Bill Pence HEASARC/GSFC From dwells at fits.cx.nrao.edu Wed Sep 18 18:15:41 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["2034" "Wed" "18" "September" "1991" "22:55:27" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " nil "44" "Re: wrong name for this group" "^From:" nil nil "9" nil nil nil nil] nil) X-VM-Summary-Format: "%n %*%a %-17.17F %-3.3m %2d %4l/%-5c %I\"%s\"\n" X-VM-Labels: nil X-VM-VHeader: ("Resent-" "From:" "Sender:" "To:" "Apparently-To:" "Cc:" "Subject:" "Date:") nil X-VM-Bookmark: 1 Newsgroups: alt.sci.astro.fits In-Reply-To: sla at fast.ucsc.edu's message of 18 Sep 91 17: 39:54 GMT Organization: National Radio Astronomy Observatory, Charlottesville, VA From: dwells at fits.cx.nrao.edu (Don Wells) Subject: Re: wrong name for this group Date: Wed, 18 Sep 1991 22:55:27 GMT In article <21022 at darkstar.ucsc.edu> sla at fast.ucsc.edu (Steve Allen) writes: In article <1991Sep18.114743.9862 at uwovax.uwo.ca> 17001_1511 at uwovax.uwo.ca writes: >Again I say, should this not have been sci.astro.fits ???? There is now general agreement in the FITS community that our long-term goal should be to create sci.astro.fits. SA> only 46 messages... not exactly high traffic... Several key groups in the FITS community have not yet demonstrated that they are able to participate in this newsgroup. Our goal must be to create a true worldwide discussion forum for FITS-related matters. SA> ... Nor do I believe that we actually have 100 subscribers... FITS is a matter of such importance in astronomy, throughout the world and across the entire range of observing frequencies and for both space-based and ground-based astronomy, that I will be surprised if we have trouble meeting the 100-excess-yes-votes test when the moment comes. Preben Grosbol is currently, this week, polling the European FITS Committee. 
In a message to me only 12 hours ago he said: "On the USEnet News, there are still a number of people who do not get the News or only with significant difficulty. I will make the final statistics end next week. I can already see that we cannot rely on News only. Some link/exchange to normal e-mail is needed." His conclusion regarding the need for an exploder is consistent with my own knowledge of the community. I expect that we will soon be operating an exploder with a bi-directional gateway to this newsgroup. SA> ... does anybody want to act as the central administration site SA> [for a rollcall and/or a USEnet vote]? You can expect that I will volunteer for that duty. I created this newsgroup. -- Donald C. Wells Associate Scientist dwells at nrao.edu National Radio Astronomy Observatory +1-804-296-0277 Edgemont Road Fax= +1-804-296-0278 Charlottesville, Virginia 22903-2475 USA 78:31.1W, 38:02.2N From dwells at fits.cx.nrao.edu Thu Sep 19 21:34:47 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["1688" "Fri" "20" "September" "1991" "00:52:35" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " nil "33" "Re: is there an ftp archive site for std | c++ sw | test data" "^From:" nil nil "9" nil nil nil nil] nil) Newsgroups: alt.sci.astro.fits In-Reply-To: alans at juliet.ll.mit.edu's message of 19 Sep 91 12: 20:40 Organization: National Radio Astronomy Observatory, Charlottesville, VA From: dwells at fits.cx.nrao.edu (Don Wells) Subject: Re: is there an ftp archive site for std | c++ sw | test data Date: Fri, 20 Sep 1991 00:52:35 GMT In article alans at juliet.ll.mit.edu ( Alan Stein) writes: AS> ... I believe... that my copy of the "FITS standard" AS> specification is outdated, and would like to obtain a current AS> version... Do anonymous-FTP to fits.cx.nrao.edu [192.33.115.8] and look in directory FITS/doc. 
You will find several interesting documents:

-rw-r--r--  1 dwells   vlb   169172 Jul 25 22:45 bintable.ps
-rw-r--r--  1 dwells   vlb    29273 May 27 14:36 bintable.tex
-rw-r--r--  1 dwells   vlb      243 Jun 16 00:43 fits-docs.bib
-rw-r--r--  1 dwells   vlb   363903 Jul 21 17:56 fits_standard.ps
-rw-r--r--  1 dwells   vlb     7201 Jul 21 17:58 fits_standard.readme
-rw-r--r--  1 dwells   vlb    99523 Jun 27  1990 fitsdbmsapp.ps
-rw-r--r--  1 dwells   vlb    42193 Jun 27  1990 fitsdbmsapp.ps.Z
-rw-r--r--  1 dwells   vlb    20603 Jun 27  1990 fitsdbmsapp.tex
-rw-r--r--  1 dwells   vlb     7775 Feb 28  1989 fitsfp89.txt
-r--r--r--  1 dwells   vlb   364130 Jun 16 00:16 vf.ps
-rw-r--r--  1 dwells   vlb    71244 Jun 16 00:45 vf.tex

The "fits_standard" is the most recent draft (version 0.2) of the NASA version of the FITS standard. The "bintable" is the May draft of the BINTABLE extension draft standard, and it should be regarded as superseding the binary table appendix in the NASA standard document.

--
Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
Edgemont Road                                          Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA               78:31.1W, 38:02.2N

From dwells at fits.cx.nrao.edu Fri Sep 20 15:59:09 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["395" "Fri" "20" "September" "1991" "19:28:37" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " nil "7" "Gateway test#1" "^From:" nil nil "9" nil nil nil nil] nil)
Newsgroups: alt.sci.astro.fits
Path: cv3.cv.nrao.edu!mail-to-news-gateway
Message-ID: <9109201928.AA10853 at fits.cx.nrao.edu>
Organization: National Radio Astronomy Observatory
Lines: 7
From: dwells at fits.cx.nrao.edu (Don Wells)
Sender: news at nrao.edu
Subject: Gateway test#1
Date: Fri, 20 Sep 1991 19:28:37 GMT

Please ignore this test of a bi-directional Email exploder gateway to be associated with alt.sci.astro.fits.

Donald C.
Wells Associate Scientist dwells at nrao.edu National Radio Astronomy Observatory +1-804-296-0277 Edgemont Road Fax= +1-804-296-0278 Charlottesville, Virginia 22903-2475 USA 78:31.1W, 38:02.2N From dwells at fits.cx.nrao.edu Sat Sep 21 23:13:12 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["5520" "Sun" "22" "September" "1991" "04:05:41" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " nil "102" "Re: a FITS puzzle, suggestions welcome" "^From:" nil nil "9" nil nil nil nil] nil) Newsgroups: alt.sci.astro.fits Path: cv3.cv.nrao.edu!cv3.cv.nrao.edu!dwells In-Reply-To: sla at helios.ucsc.edu's message of 19 Sep 91 04: 26:32 GMT Message-ID: Organization: National Radio Astronomy Observatory, Charlottesville, VA References: <21064 at darkstar.ucsc.edu> Lines: 102 From: dwells at fits.cx.nrao.edu (Don Wells) Sender: news at nrao.edu Subject: Re: a FITS puzzle, suggestions welcome Date: Sun, 22 Sep 1991 04:05:41 GMT In article <21064 at darkstar.ucsc.edu> sla at helios.ucsc.edu (Steve Allen) writes: SA> ... at Lick ... construction of ... 2x2 mosaic of ... 2048**2 SA> chips... adjacent edges ... butted together ... When ... all 4 SA> chips [are] readout, we expect that the best method of storing SA> these images is as a 3-d FITS array... akin to .. HST's WF/PC SA> images ... The question ... is...: What happens when subregions SA> of the CCDs are readout as part of a single exposure? More SA> precisely, if the subregions of the 4 CCDs do not have the same SA> dimensions, how can these images be stored as a FITS image? This question does not currently have a widely-accepted answer. However, the general version of your question, and related questions, have been debated during the past two years. I can think of five possible answers, and there may be several more that I have forgotten at the moment. (1) Pad to full size SA> ... pad the edges of the images with Null values .. up to SA> 4096x4096 ... take up A LOT of space ... 
This is not only inefficient, it is definitely not elegant.

(2) Use separate Basic FITS files

SA> ... store... images as 4 separate FITS files ... unappealing ..
SA> would like to keep data from the same exposure in one file.

I also prefer to keep a set of related data in one file, but this multi-file solution is a practical workaround until a better solution becomes generally accepted.

(3) Argue to standardize XTENSION='MATRIX' (or 'IMAGE')

The Generalized Extensions Agreement of 1984 specifies an obvious syntax which could concatenate multiple "Basic FITS" images of different sizes in one FITS file. As a matter of policy, we decided to deprecate such usage in order to ensure that Basic FITS continues to be the preferred way to encode matrices, but in principle the FITS community could agree to relax the prohibition for cases like yours.

(4) Use XTENSION='BINTABLE' with the "Green Bank Convention" and padding

The BINTABLE design encodes tables (arrays of C struct) in which fields may be vectors. One can store a matrix in a vector, and encode the dimensionality of the matrix in other fields by convention (the BINTABLE draft also specifies an optional matrix dimensionality keyword). Imagine a set of Basic FITS files, as in case (2) above. We could have a vector field big enough to hold the largest matrix of the set. We could define a column of the table for each keyword that occurs in the set, including such keywords as NAXIS, NAXIS1 and NAXIS2, and label the columns with the keyword names. Now let each Basic FITS file become one row in our table, and pad the matrices smaller than the maximum with blanks. It is typical that many of the keywords of the set will be the same for all the headers, so that the corresponding columns of the table will be constant. We can agree, by convention, that we will put the keyword=value into the table header and eliminate the column (this is the virtual column convention).
The resulting encoding of a set of Basic FITS files is efficient for a wide variety of cases. In particular, at the "Single-Dish FITS" workshop at Green Bank in November 1989 this whole concept was proposed as the mechanism for encoding sets of spectra in FITS. If a set of fixed-length 1-D spectra has more than about 20 members the overhead of the self-describing header structure for the table becomes essentially negligible, thus removing the major classical objection to the use of FITS for spectroscopy. (Obviously the efficiency is lower for variable-dimension arrays.) This concept also has the advantage of keeping a *set* of observations together in one file. Note that generally the convention is invertible: the Basic FITS files can be recreated from the table. Two radio single-dish prototype implementations of the Green Bank Convention are currently being tested for interoperability. (5) Use XTENSION='BINTABLE' with the GBC and variable vectors The May-91 version of Cotton and Tody's BINTABLE draft standard specifies a "pointer" data type. A field, rather than containing a vector (matrix), can instead contain a byte-offset pointing to the vector, plus a byte-length for the vector. This mechanism supports variable-length vectors of various datatypes in the table fields. The vectors of various types would be concatenated into a vector of bytes which would be appended to the body of the table of fixed-length rows, and the length of the concatenated vector would be encoded in the PCOUNT keyword of the generalized extension header. Therefore, we could use the pointer type for our matrix field and the Green Bank conventions for keywords of our set of variable-matrix observations. As far as I know there are still *no* prototype implementations of this *WONDERFUL* variable-structure idea, but in my opinion it is likely to become the most popular FITS data structure by the end of the 90's, because of its versatility, efficiency, and elegance. 
It would be risky to use pointers for any production application at this time, before the necessary R&D and testing have been done, and before any of the major datasystems (MIDAS, IRAF, AIPS,...) have released code supporting the pointer type in binary tables.

--
Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
Edgemont Road                                          Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA               78:31.1W, 38:02.2N

From tody at noao.edu Thu Sep 26 09:55:07 1991 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["3767" "" "26" "September" "91" "06:29:17" "GMT" "Doug Tody NOAO/IRAF CCS" "tody at noao.edu " nil "61" "Re: compression of fits" "^From:" nil nil "9" nil nil nil nil] nil)
Newsgroups: alt.sci.astro.fits
Organization: National Optical Astronomy Observatories, Tucson, AZ, USA
From: tody at noao.edu (Doug Tody NOAO/IRAF CCS)
Subject: Re: compression of fits
Date: 26 Sep 91 06:29:17 GMT

From article <1991Sep25.151241.1531 at murdoch.acc.Virginia.EDU>, by gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy):
> Markus Buchhorn writes:
> #Perhaps what one would need to do is read in either the entire
> #image into RAM, compress all of it, then write out the compressed stream
> #in 2880 chunks, or else read in some fraction of that (N lines), compress
> #just that, and write out 2880 chunks. Padding is not a good idea as
> #(i) wastes space and (ii) decompression could have a problem with
> #this.
>
> It is very common already to have FITS data sets that will not fit
> into RAM, and I think that it would be a mistake to rely on a
> compression scheme that required reading the whole data set into ram.

Unless there are good reasons to do otherwise it would be wise to use a compression technique which is local in nature (assuming the pixel data is to be compressed instead of the entire FITS file).
This is particularly important if the data is to be randomly accessed at runtime, for example, when reading a FITS image from a CD-ROM, or from disk. It would also aid in recovery from data losses. A simple technique would be to compress each image line independently, using an index to record the offset of each compressed line. An existing example of this is the pixel mask image format in IRAF, which preserves all the semantics of the random access image while storing the image in a compressed form. A line-oriented, random access approach would be one way of justifying a FITS specific compression technique, rather than simply compressing the entire FITS file with a standard adaptive compression program like COMPRESS. I don't think it is worthwhile to try to compress headers; if you want to compress the entire file, just use a non-FITS file oriented compression program. This would not necessarily have to be COMPRESS; we could write our own adaptive compression program if we wished, more oriented towards noisy image data, without having to tie it to FITS or any other specific image format. > It is also very import to remember during all these discussions on what the > non-unix users will do. While POSIX 1003.2 may specifiy compress, there are > going to be interesting computers that will not have this algorithm, and it > seems a disservice to leave these systems high and dry. Source for the compress/uncompress programs is available on the net. A while back we hacked a version of this here for VMS. Source for both the UNIX and VMS versions, plus VMS binaries, are in the util subdirectory on iraf.noao.edu (I believe gatekeeper.dec.com has an older copy as well). The VMS version isn't as easy to use as the UNIX one due to the record structures of VMS files, but it works. 
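
The line-oriented scheme mentioned earlier in this message (compress each image line independently and keep an index of the offset of each compressed line) could be sketched as follows. This is an illustration only, not the IRAF pixel mask format and not a proposed FITS encoding: the function names are hypothetical, and Python's standard zlib module (DEFLATE) stands in for the LZW algorithm used by compress.

```python
import zlib


def compress_lines(lines):
    """Compress each image line on its own and build an offset index.

    lines -- list of bytes objects, one per image line
    Returns (index, blob), where index[n] is the (offset, length) of
    compressed line n inside blob.
    """
    index, chunks, offset = [], [], 0
    for line in lines:
        packed = zlib.compress(line)
        index.append((offset, len(packed)))
        chunks.append(packed)
        offset += len(packed)
    return index, b"".join(chunks)


def read_line(index, blob, n):
    """Random access: decompress line n without touching other lines."""
    offset, length = index[n]
    return zlib.decompress(blob[offset:offset + length])


# A toy 5-line "image": each line is 64 identical bytes.
image = [bytes([value]) * 64 for value in range(5)]
index, blob = compress_lines(image)
assert read_line(index, blob, 3) == image[3]
```

Because every line is an independent compressed stream, damage to one line leaves the others readable, at the cost of a somewhat worse ratio than compressing the whole image at once.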
As many will be aware there are efforts to patent the LZ algorithm used in compress, but the authors of the original algorithm oppose this, and since the source has been publicly available for years it may be difficult for the lawyers to win this one.

I think more research, and trial implementations, are needed before we want to settle on a FITS standard for image compression. Getting experimental support for compression into our online image formats, which can be more easily changed than the FITS standard, would be a good start. Some basic algorithms research is badly needed as well. A general purpose program like COMPRESS, with an adaptive algorithm which could do well on noisy pixel data as well as text, could prove immediately useful and would provide an opportunity for algorithms development.

--
Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217
UUCP: {arizona,decvax,ncar}!noao!tody or uunet!noao.edu!tody
Internet: tody at noao.edu SPAN/HEPNET: NOAO::TODY (NOAO=5355)

From sla at helios.ucsc.edu Fri Sep 27 12:17:37 1991
Newsgroups: alt.sci.astro.fits
Summary: just how important are they on a disk?
Keywords: zeros, padding, logical records, random access media
Organization: UCO/Lick Observatory
From: sla at helios.ucsc.edu (Steve Allen)
Subject: those trailing zeros in the last 2880 byte block
Date: 26 Sep 91 15:12:56 GMT

In article <1991Sep25.151241.1531 at murdoch.acc.Virginia.EDU> gsh7w at fermi.clas.Virginia.EDU (Greg Scott Hennessy) writes:
> The original FITS papers considered this point, and
>conceded that not all reader programs can read every FITS file, but
>the programs should at least exit gracefully.
>I have encountered data
>sets that were called FITS files, that did not have the last record
>padded out to 2880 octets, and some FITS readers die horribly at
>this. Blocking out the last record to 2880 octets is not very
>wasteful of space, at least for modern data sets
>-Greg Hennessy, University of Virginia

First, to make something perfectly clear: When you observe at Lick Observatory and bring home a magnetic tape (9-track or exabyte) written by the data acquisition system, that tape always has physical records which are multiples of 2880 bytes long. The last record after the image data is padded with zeros. So Lick generates conforming FITS tapes.

However, most images spend at least some time on disk during the observing run, and during this time they are usually manipulated by an image processing package (namely Vista*). The images on the disk are not written with the zero padding to fill out the records to 2880 bytes. They simply terminate immediately after the last byte of the image. (This is a Berkeley Unix system). Vista is not bothered by this.

It has been becoming more and more common for observers to transfer their images via methods other than magnetic tape (namely ftp--Lick does not officially endorse this yet.) When such an image is then read by IRAF's rfits, IRAF chokes and fails. I do not consider this to be exiting "gracefully." I have suggested to these observers that they ask NOAO to fix IRAF to be "graceful" in this case.

Lick is not opposed to changing the data acquisition system such that it writes out the extra zeros. This will cost some disk space on a disk which is already too small. Furthermore, when dealing with the images from some of the new (and small) IR array detectors, the proportional increase in size is not insignificant.

We would welcome opinions on the importance of the zero-padding to 2880, along with reasons that explain why not having those zeros is bad for FITS readers other than IRAF.

(*) P.S.
Vista is available for most modern computers via anonymous ftp. Contact Jon Holtzman (holtz at lowell.edu) for more details.
_______________________________________________________________________________
Steve Allen          |                      | sla at helios.ucsc.edu
UCO/Lick Observatory | This space for rent. | If the UC were opining,
Santa Cruz, CA 95064 |                      | it wouldn't tell me.

From koffley at nrlvx1.nrl.navy.mil Fri Sep 27 17:07:02 1991
Newsgroups: alt.sci.astro.fits
Organization: NRL SPACE SYSTEMS DIVISION
From: koffley at nrlvx1.nrl.navy.mil
Subject: FITS V3.0 available from NRL
Date: 27 Sep 91 14:44:51 GMT

NRL has a mail server which will send the FITS version 3 package to you if you have no FTP connectivity to GSFC. To use it, send a mail message containing only the following lines in the body:

HELP
LIST FITSIO3
SEND FITSIO3

The last line will send the package as a multi part mail message. To recreate, (assumes you are a VMS site) simply concatenate all the messages comprising the software package into a single file and execute it as a DCL command of the form:

@filename

You will see informational messages telling you that it is unpacking the source code for you.

The address you want to send to is:

NRL_ARCHIVE at nrlvax.nrl.navy.mil

If you are incapable of dealing with the above methods (because you are a UNIX site perhaps), send mail to me or to nrl_archive-mgr at nrlvax.nrl.navy.mil and I'll try to accommodate you.

I have dabbled with FITS, i.e. I used another package to convert a GIF image to a FITS file and then used the FITS package to read the data. It does work! I am not an astronomer/physicist, I am the site system manager, network manager and security manager so don't ask me for too many details on FITS per se.
There are other people who read this group who are much more conversant than I on FITS. I only verified that it worked in a limited test case or two.

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
< Joe Koffley                   KOFFLEY at NRLVAX.NRL.NAVY.MIL >
< Naval Research Laboratory     KOFFLEY at CCF.NRL.NAVY.MIL    >
< Space Systems Division        AT&T : 202-767-0894            >
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

From pence at heawk1.gsfc.nasa.gov Mon Sep 30 13:17:48 1991
Newsgroups: alt.sci.astro.fits
Nntp-Posting-Host: heawk1.gsfc.nasa.gov
Organization: Goddard Space Flight Center
From: pence at heawk1.gsfc.nasa.gov ( William Pence)
Subject: Re: FITS V3.0 available from NRL
Date: Mon, 30 Sep 1991 16:12:37 GMT

In general I would discourage users from obtaining 2nd hand copies of the FITSIO software because one may not get the latest version. As a case in point, the new version 3.01 of FITSIO, which fixes a relatively minor bug, was released just last Friday. Currently there are 2 ways to receive the most recent version of FITSIO:

1) by anonymous ftp from tetra.gsfc.nasa.gov in subdirectory pub/fitsio3

2) over the SPAN network (e.g., with DECNET copy) from NDADSA::HEASARC:[EXOSAT.XANADU.FITSIO.VERSION3]

If any users have difficulty accessing the files by either of these methods, then they should send me a message.
-Bill Pence
 HEASARC/GSFC

From landsman at stars.gsfc.nasa.gov Fri Oct 4 00:01:31 1991
Newsgroups: alt.sci.astro.fits
Reply-To: landsman at stars.gsfc.nasa.gov
Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA
News-Software: VAX/VMS VNEWS 1.3-4
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: landsman at stars.gsfc.nasa.gov (Wayne Landsman)
Subject: Looking for a test REAL*8 FITS file
Date: 19 Sep 91 04:47:46 GMT

Specifically, I'm looking for a REAL*8 FITS data file (preferably accessible via FTP) which I can use to help develop and test a FITS I/O package.

More generally, I have found the FITS files on the "FITS Test Tape" (available on fits.cx.nrao.edu) extremely useful in debugging my FITS I/O routines. But these files are now 9 years old and do not include more recent developments such as the floating point standards. Are there any sites with more recent test FITS files available?

Thanks,
Wayne Landsman        stars::landsman
ST Systems Co.        landsman at stars.gsfc.nasa.gov
NASA/GSFC
Greenbelt, MD 20771

From nelson at stsci.edu Fri Oct 4 00:01:37 1991
Newsgroups: alt.sci.astro.fits
Summary: Fresh FITS data now at stsci anonymous containing scaled real*8 and ieee values.
Organization: Space Telescope Science Institute
From: nelson at stsci.edu (Nelson Zarate)
Subject: Re: Looking for a test REAL*8 FITS file
Date: 20 Sep 91 18:08:23 GMT

>
> Specifically, I'm looking for a REAL*8 FITS data file (preferably accessible
> via FTP) which I can use to help develop and test a FITS I/O package.
>

Yes, I have put some FITS files in the anonymous account on the stsci.edu machine. They are scaled (BITPIX = 32) double precision and IEEE versions of ST data.

Nelson Zarate
zarate at stsci.edu

From sla at helios.ucsc.edu Fri Oct 4 00:02:04 1991
Newsgroups: alt.sci.astro.fits
Summary: well, we wanted some traffic in this group, right?
Keywords: mosaics of CCDs
Organization: UCO/Lick Observatory
From: sla at helios.ucsc.edu (Steve Allen)
Subject: a FITS puzzle, suggestions welcome
Date: 19 Sep 91 04:26:32 GMT

Here at Lick the engineering shops have begun construction of a dewar which will eventually hold a 2x2 mosaic of Ford 2048**2 chips. The adjacent edges of the CCDs will be cut off so that they can be butted together with very little gap.

When the entirety of all 4 chips is read out, we expect that the best method of storing these images is as a 3-d FITS array. This is exactly akin to the method used in the HST's WF/PC images, which has a very similar layout (conceptually).

The question we have not yet answered is this: What happens when subregions of the CCDs are read out as part of a single exposure? More precisely, if the subregions of the 4 CCDs do not have the same dimensions, how can these images be stored as a FITS image?

Obviously we could simply pick the largest dimensions in each direction, and pad the edges of the images with Null values for pixels which were not read.
However, when dealing with up to 4096x4096 pixels, these Null values begin to take up A LOT of space on disk, tape, whatever; it would really be preferable not to store them at all.

Again, we could store such images as 4 separate FITS files, but that is also unappealing. We would like to keep data from the same exposure in one file.

Any suggestions are welcome. Let's see how busy this newsgroup might be.

Steve Allen
sla at helios.ucsc.edu
...!ucbvax!ucscc!helios!sla

From steve at cfht.hawaii.edu Fri Oct 4 00:04:57 1991
Newsgroups: alt.sci.astro.fits
Keywords: drift scan, FITS
Organization: University of Hawaii
Nntp-Posting-Host: quonset.cfht.hawaii.edu
From: steve at cfht.hawaii.edu (Steven S Smith)
Subject: storage of drift scan images?
Date: 19 Sep 91 20:22:50 GMT

Has anyone used FITS to store drift scanned images?

In our case, the telescope will not be tracking an object, but the detector will be read out, row by row, at a rate equal to the normal tracking rate. Since this gives a strip of the sky, the resulting data can be very long. While I am sure that the FITS standard can easily do this, I doubt that very many FITS implementations would be able to utilize the resulting data.

I guess just arbitrarily hacking the data into some nice length would work, but it seems inelegant (it could leave an object partly on two FITS files).

Any prior experience with this problem, or general comments are welcome.

steve
--
Steven S. Smith
Canada-France-Hawaii Telescope Corp.
steve at cfht.hawaii.edu

From pwb at newt.phys.unsw.oz.au Fri Oct 4 00:05:09 1991
Newsgroups: alt.sci.astro.fits
Summary: Mine can!
Keywords: drift scan, FITS
From: pwb at newt.phys.unsw.oz.au (Paul W. Brooks)
Subject: Re: storage of drift scan images?
Date: 22 Sep 91 14:49:19 GMT

In article , steve at cfht.hawaii.edu (Steven S Smith) writes:
> Has anyone used FITS to store drift scanned images?
>
> In our case, the telescope will not be tracking an object,
> but the detector will be read out, row by row, at a rate equal to the
> normal tracking rate. Since this gives a strip of the sky, the
> resulting data can be very long. While I am sure that the FITS
> standard can easily do this, I doubt that very many FITS
> implementations would be able to utilize the resulting data.

My implementation could, and it runs on a lowly PC!

In general, I would expect most implementations which deal with FITS images would not try to load the entire image into RAM, but would deal with it on a line-by-line basis, as a means of ensuring that the software would not be limited as the images grow larger, which they are steadily doing from year to year. As your drift-scanned images have a limited (normal) number of columns, I would hope that most software would not be too fazed by having more ROWS than the programmer expected - A programmer can never outguess the users of the software, and set limits that "will never be exceeded in practice". Rather s/he should create software without limits!

> Any prior experience with this problem, or general comments are
> welcome.
>
> steve
> --
> Steven S. Smith Canada-France-Hawaii Telescope Corp.
> steve at cfht.hawaii.edu

Paul Brooks        |Internet: pwb at newt.phys.unsw.edu.au
Uni. of N.S.W.     |If you have trouble sleeping, try lying on the end of
Kensington NSW 2033| your bed. With a little luck you'll drop off.
AUSTRALIA          | - Mark Twain.
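
A minimal sketch of the line-by-line approach described in the message above, included only as an illustration and not taken from any poster's software. It assumes 16-bit integer pixels (BITPIX = 16, stored big-endian as FITS requires) and a header that occupies a single 2880-byte block; the function name, the in-memory test "file", and the row sums are all hypothetical.

```python
import io
import struct

def iter_rows(stream, data_start, naxis1, naxis2):
    """Yield one row of big-endian 16-bit pixels at a time."""
    stream.seek(data_start)
    row_bytes = naxis1 * 2          # two bytes per pixel for BITPIX = 16
    fmt = f">{naxis1}h"             # ">" selects big-endian byte order
    for _ in range(naxis2):
        # Only one row is ever held in memory, however many rows exist.
        yield struct.unpack(fmt, stream.read(row_bytes))

# Hypothetical stand-in for a file: one blank header block, then 3 rows of 4 pixels.
pixels = list(range(12))
fake = io.BytesIO(b" " * 2880 + struct.pack(">12h", *pixels))
row_sums = [sum(row) for row in iter_rows(fake, 2880, 4, 3)]
assert row_sums == [6, 22, 38]
```

Memory use here depends on the row width alone, so a drift scan with a normal number of columns and a very large number of rows costs no more per step than an ordinary frame.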
From steve at cfht.hawaii.edu Fri Oct 4 00:05:29 1991
Newsgroups: alt.sci.astro.fits
Keywords: drift scan, FITS
Organization: University of Hawaii
Nntp-Posting-Host: quonset.cfht.hawaii.edu
From: steve at cfht.hawaii.edu (Steven S Smith)
Subject: Re: storage of drift scan images?
Date: 23 Sep 91 21:00:49 GMT

pwb at newt.phys.unsw.oz.au (Paul W. Brooks) writes:
...
>My implementation could, and it runs on a lowly PC!
>In general, I would expect most implementations which deal with FITS
>images would not try to load the entire image into RAM, but would
>deal with it on a line-by-line basis, as a means of ensuring that the
>software would not be limited as the images grow larger, which they are
>steadily doing from year to year. As your drift-scanned images have
>a limited (normal) number of columns, I would hope that most software
>would not be too fazed by having more ROWS than the programmer expected
>-
>A programmer can never outguess the users of the software, and set
>limits that "will never be exceeded in practice". Rather s/he should
>create software without limits!

While I agree that this is a good goal, in reality something has to give. Assuming a 2048x2048 ccd, with a shift rate of ~30 seconds for a full frame equivalent, the data sizes we are looking at are on the order of a gigabyte/hour. An individual scan could last several hours. Your program may handle this, but the operating and/or file systems may fail.

And indeed, since the main benefit of FITS (for us, anyhow) is that it allows people to easily drag their data back to their home institutions, the data we produce needs to work with most of the FITS software available.
Expecting Joe Astronomer's FITS-to-whatever program to work on these monsters is not realistic, since simply changing locations (or names) of some header cards causes complaints.

I believe that we will have to wait a few years before the computer hardware/software infrastructure will have expanded to allow multi-gigabyte files to work. The FITS "standard" may not need any improvement to allow this, but individual implementations will.

So until then, any ideas on how to chop them up? Is it useful to add some heuristics to the splitting routine, for example, only split on rows that contain relatively low data values? Or do we duplicate a few rows from one FITS file to the next, so that an object of interest will be contained entirely on one of the files? And how many is a few?

>Paul Brooks |Internet: pwb at newt.phys.unsw.edu.au
>Uni. of N.S.W. |If you have trouble sleeping, try lying on the end of
>Kensington NSW 2033| your bed. With a little luck you'll drop off.
>AUSTRALIA | - Mark Twain.

--
Steven S. Smith
Canada-France-Hawaii Telescope Corp.
steve at cfht.hawaii.edu

From 17001_1511 at uwovax.uwo.ca Fri Oct 4 00:08:42 1991
Newsgroups: alt.sci.astro.fits
From: 17001_1511 at uwovax.uwo.ca
Subject: FITS to VICAR?
Date: 20 Sep 91 14:49:07 GMT

Is there a convenient routine for converting from FITS to VICAR format? Anything out there?
phil stooke

From dwells at fits.cx.nrao.edu Fri Oct 4 00:09:08 1991
Newsgroups: alt.sci.astro.fits
Organization: National Radio Astronomy Observatory, Charlottesville, VA
Distribution: alt
From: dwells at fits.cx.nrao.edu (Don Wells)
Subject: BINTABLE document available
Date: Mon, 23 Sep 1991 15:03:43 GMT

Bill Cotton and Doug Tody have agreed on the latest version of their proposal for XTENSION= 'BINTABLE'. The document is available via anonymous-FTP to fits.cx.nrao.edu [192.33.115.8], in directory FITS/doc:

-rw-r--r-- 1 dwells vlb 179394 Sep 23 09:45 bintable.ps
-rw-r--r-- 1 dwells vlb 32332 Sep 23 09:00 bintable.tex

Appendix B ("Variable Length Array" Facility) is especially interesting.

This announcement is alt.sci.astro.fits in action --- you heard it here first! I will post an announcement to other places later today.

This announcement is also probably the first message which will pass through the gateway to the "fitsbits" exploder. The exploder now has two subscribers, one in Texas and the other in Germany.

--
Donald C. Wells             Associate Scientist
dwells at nrao.edu          National Radio Astronomy Observatory
+1-804-296-0277             Edgemont Road
Fax= +1-804-296-0278        Charlottesville, Virginia 22903-2475 USA
78:31.1W, 38:02.2N

From pence at heawk1.gsfc.nasa.gov Fri Oct 4 00:09:14 1991
Newsgroups: alt.sci.astro.fits
Summary: Version 3 of the FITSIO package is now available.
Keywords: FITS
Organization: Goddard Space Flight Center
Nntp-Posting-Host: heawk1.gsfc.nasa.gov
From: pence at heawk1.gsfc.nasa.gov ( William Pence)
Subject: FITSIO Version 3
Date: 23 Sep 91 21:15:24 GMT

FITSIO - Version 3.00

This is to announce that Version 3 of the FITSIO package is now available for general use. FITSIO is a Fortran-77 subroutine interface for reading and writing files in the FITS (Flexible Image Transport System) format. This new version of FITSIO has been completely rewritten and offers many additional capabilities and enhancements not found in the previous version, including:

- direct access I/O which allows the data in the FITS file to be read or written in any random order. The previous version of FITSIO had required that all the bytes of data in the file be accessed in strict sequential order.

- implicit datatype conversion. This allows users to read (or write) data from a FITS array or table column of a given datatype (e.g. short integers) into a different datatype array (e.g. Real*4).

- automatic data scaling can be performed (using the BZERO and BSCALE, or TZEROnnn and TSCALnnn values).

- access to data in ASCII and Binary table extensions is now much more convenient. In general, the interface reads or writes individual columns of data in a table with a single subroutine call.

- a more logical and consistent calling parameter sequence is used throughout all the subroutines in the interface. Unfortunately, this also means that the calling sequences in Version 3 are entirely different from those in Version 2. Future versions, if any, will be designed to be upwardly compatible with Version 3.

As with the previous version, this new FITSIO package is strictly oriented towards access to FITS files on magnetic disk, not magnetic tape. Support for tape devices might be added later, but this may be difficult due to the random access capabilities which are fundamental to this implementation.
(Programmers interested in modifying this package to support mag tapes should probably start with FITSIO Version 2, since it accesses the FITS file in a purely sequential order).

About 95% of the FITSIO code is written in strict Fortran-77. A small set of subroutines specific to a particular machine are also provided (VAX/VMS, DECstations, SUN workstations, and IBM PCs are currently supported with an IBM mainframe version due in a couple weeks). For programmers preferring a C language implementation, these Fortran subroutines can be processed through the public domain 'f2c' conversion program. We have only done limited testing of this, but it appeared that the C version of the code runs and produces the same FITS file output as the Fortran version.

Because this is a new release of a large set of code, it is likely that some bugs still remain. Please send any problem reports to me at: pence at tetra.gsfc.nasa.gov or to LHEAVX::PENCE. Users should also periodically check to see if a new release has been made. (The file called 'version.nnn' in the release directory indicates the current release version. The current release is 'version.300'.)

This version conforms to the FITS file formats defined in the draft standard published by the NASA/OSSA Office of Standards and Technology (NOST) and in the draft 'Binary Table Extensions to FITS' dated 20 May 1991. The only significant feature not supported in the current version of FITSIO is the 'Variable Length Array' facility in binary tables. Support for this will be added in the near future. A new version of FITSIO will be issued whenever the above documents are revised.
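
The automatic data scaling listed among the new features follows the usual FITS convention that a physical value equals BZERO + BSCALE * stored value. The short sketch below only illustrates that arithmetic; it is not FITSIO code, the function name is hypothetical, and the BZERO = 32768 example (a commonly used way to carry unsigned 16-bit data in signed integers) is an assumption chosen for illustration.

```python
def to_physical(stored, bscale=1.0, bzero=0.0):
    """Apply the FITS scaling relation: physical = BZERO + BSCALE * stored."""
    return [bzero + bscale * value for value in stored]

# Signed 16-bit values as they might sit in the file, scaled with BZERO = 32768.
raw = [-32768, 0, 32767]
physical = to_physical(raw, bscale=1.0, bzero=32768.0)
assert physical == [0.0, 32768.0, 65535.0]
```

The same relation applies column by column in tables, with TSCALnnn and TZEROnnn in place of BSCALE and BZERO.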
To get a copy of the FITSIO software and documentation, use anonymous ftp to connect to:

    ftp tetra.gsfc.nasa.gov    or    ftp 128.183.8.77

Then type the following:

    ftp> user anonymous
    Password:              [type your username as a password]
    ftp> cd pub/fitsio3    [to move to the version 3 subdirectory]
    ftp> ls                [to see a list of the available files]
    ftp> get read.me       [contains latest information about FITSIO]
    ftp> get fitsio.doc    [complete user documentation]
    ftp> get ...           [get any additional desired files]
    ftp> exit

Alternatively, the FITSIO files may be copied from the following SPAN node:

    NDADSA::HEASARC:[EXOSAT.XANADU.FITSIO.VERSION3]

William Pence
HEASARC Code 668
NASA/Goddard Space Flight Center
Greenbelt, MD 20771