Dapper in-situ conventions spec available

Roy Mendelssohn Roy.Mendelssohn at noaa.gov
Wed Oct 11 12:04:42 PDT 2006


First, it is nice to see some efforts to agree on conventions for 
in-situ data.  It is long overdue and will simplify a lot of the work 
we (ERD) do.

I have been trying to follow this discussion as best as I can, and 
would like to add several comments that may be somewhat orthogonal to 
previous discussion points.  I would like to add that some of the 
points raised in previous emails appear to be more about how to store 
in-situ data than about a formal convention for transmitting them in 
OPeNDAP.  I assume that the latter is the primary purpose of the 
specification.

We now have several Dapper servers serving a fairly large amount of 
data using the precursor to this specification (though it is 
essentially the same), and my comments are directed at the types of 
in-situ data that we have found do not make for a natural fit with 
this convention.  It may be that for a first pass we do not want to 
include these cases in the specification, as no one spec will always 
please everybody.

1.  Station data with an inexact station location.  In many fisheries 
and oceanographic surveys, data are taken at "stations" whose 
location is inexact, so it is necessary to carry changing lat/lon 
information with the observations.  You can do this by having a 
separate "file" for each profile and having the station number as a 
variable in the inner sequence, or by including the lat/lon in the 
inner sequence (which to some extent would violate the convention). 
But since most programs will look to the outer sequence for 
coordinate-type information, neither of these solutions works that 
well.  A possibility would be an option to carry the station number 
in the inner sequence in a set way, so that servers and clients know 
to look for it (the present Dapper server actually does this).
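
As a rough sketch of the "station number in the inner sequence" idea 
(in Python; the variable name "station_id", the record layout, and all 
values are made up for illustration and are not taken from the spec or 
from the Dapper server), a client that knows the agreed-upon name could 
regroup per-profile "files" by station like this:

```python
from collections import defaultdict

# Hypothetical name for the station-number variable in the inner sequence.
STATION_KEY = "station_id"

# Made-up data: one outer record per profile, carrying that profile's
# actual position; every inner row repeats the nominal station number.
profiles = [
    {"lat": 36.62, "lon": -121.90, "time": "2006-10-01T08:00Z",
     "inner": [{"station_id": 7, "depth": 0.0, "temp": 14.1},
               {"station_id": 7, "depth": 10.0, "temp": 13.2}]},
    {"lat": 36.64, "lon": -121.93, "time": "2006-10-08T08:30Z",
     "inner": [{"station_id": 7, "depth": 0.0, "temp": 13.8}]},
    {"lat": 35.10, "lon": -121.00, "time": "2006-10-01T15:00Z",
     "inner": [{"station_id": 3, "depth": 0.0, "temp": 15.0}]},
]


def group_by_station(profiles, key=STATION_KEY):
    """Group profiles by the station number found in their inner rows."""
    groups = defaultdict(list)
    for profile in profiles:
        ids = {row[key] for row in profile["inner"]}
        if len(ids) != 1:
            raise ValueError("expected exactly one station number per profile")
        groups[ids.pop()].append(profile)
    return dict(groups)


by_station = group_by_station(profiles)
# Positions differ from visit to visit even though the station is the same.
positions_at_7 = [(p["lat"], p["lon"]) for p in by_station[7]]
```

The point is only that the grouping works without the client having to 
guess: the outer sequence keeps the real coordinates, and the inner 
sequence says which nominal station the profile belongs to.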

2.  Ragged arrays and either z or t in the inner sequence.  Netcdf-4 
will have ragged arrays - though I haven't had a chance yet to see 
how they handle the dimensioning for the ragged array.  Do we want 
something that can handle that in one file?  Again, stay with the 
idea that we have a set of profiles at depth at a "station" with 
inexact positions, and we would like to send all the data from that 
station together.  When you have subsurface data, the biggest problem 
is that the depths vary with each profile.  So to combine the 
profiles, you either have a depth dimension with all possible depths 
and a lot of missing data, or else you do one "file" per profile.  The 
latter, combined with having either the 't' or the 'z' axis in the 
inner sequence, tends to make it so we can't readily do time series 
from the same station, though that is an obvious thing to want to do 
(to be more precise - clearly one could do that if they know what to 
look for in our files and that they have that structure - but there 
is nothing in the spec per se that would make this a general 
solution).  So do we want something in the spec that describes ragged 
arrays?
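
To make the ragged-array question concrete, here is a minimal sketch 
(in Python) of one possible layout: all samples from a station's 
profiles stored end to end, plus a per-profile count.  This is an 
assumption for illustration only; it is not how netCDF-4 or the 
current spec does it, and the function and field names ("row_size", 
"pack_ragged", etc.) are hypothetical.

```python
def pack_ragged(profiles):
    """Pack [(time, [(depth, value), ...]), ...] into flat arrays.

    No padding is needed: row_size records how many samples belong
    to each profile, so the depths may differ from profile to profile.
    """
    packed = {"time": [], "row_size": [], "depth": [], "value": []}
    for time, samples in profiles:
        packed["time"].append(time)
        packed["row_size"].append(len(samples))
        for depth, value in samples:
            packed["depth"].append(depth)
            packed["value"].append(value)
    return packed


def unpack_ragged(packed):
    """Rebuild the list of profiles from the flat arrays."""
    profiles = []
    start = 0
    for time, count in zip(packed["time"], packed["row_size"]):
        stop = start + count
        samples = list(zip(packed["depth"][start:stop],
                           packed["value"][start:stop]))
        profiles.append((time, samples))
        start = stop
    return profiles


def time_series_at_depth(packed, target_depth):
    """Return (time, value) pairs for every profile sampled at target_depth."""
    return [(time, value)
            for time, samples in unpack_ragged(packed)
            for depth, value in samples
            if depth == target_depth]


# Made-up profiles from one station; the second one stops at 10 m.
station_profiles = [
    ("2006-10-01", [(0.0, 14.1), (10.0, 13.2), (25.0, 11.9)]),
    ("2006-10-08", [(0.0, 13.8), (10.0, 13.0)]),
]
packed = pack_ragged(station_profiles)
series_10m = time_series_at_depth(packed, 10.0)
```

With something like this in one "file", a time series at a given depth 
for a station falls out directly, without a depth dimension padded 
with missing values.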

3.  has_data attribute.  To use the spec effectively, particularly in 
a time series sense, we have found that any parameter in the inner 
sequence should always be there, but often it will not have been 
observed while other parameters were.  Rather than having to look at 
the data itself to see if it is totally missing, do we want to 
require a "has_data" attribute?
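
A minimal sketch of how a server might fill in such an attribute (in 
Python; the missing-value code -9999.0, the helper names, and the 
sample rows are assumptions for illustration, not anything the spec 
defines):

```python
import math

# Hypothetical missing-value code; a real dataset would declare its own.
MISSING = -9999.0


def is_missing(value):
    """True if value is absent, the missing-value code, or NaN."""
    if value is None or value == MISSING:
        return True
    return isinstance(value, float) and math.isnan(value)


def compute_has_data(rows, variables):
    """Map each variable name to True if any inner row has a real value."""
    return {name: any(not is_missing(row.get(name)) for row in rows)
            for name in variables}


# Made-up inner sequence: temperature was observed, salinity never was.
rows = [
    {"depth": 0.0, "temp": 14.1, "salinity": MISSING},
    {"depth": 10.0, "temp": MISSING, "salinity": MISSING},
]
has_data = compute_has_data(rows, ["temp", "salinity"])
```

A client could then read the attribute and skip a parameter entirely, 
rather than pulling the values just to find that they are all missing.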

I hope these comments are at least somewhat clear.  I am a little 
fuzzy-headed normally and a bad cold hasn't helped.  May have more 
comments but those are my initial ones.

BTW - for others on my staff who are not on the mail-list - is the 
sequence of emails being archived somewhere that they can view it? 
The discussion has been very interesting.

-Roy M.
-- 
**********************
"The contents of this message do not reflect any position of the U.S. 
Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: Roy.Mendelssohn at noaa.gov (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."


