Dapper in-situ conventions spec available

Steve Hankin Steven.C.Hankin at noaa.gov
Wed Oct 11 12:41:02 PDT 2006


All,

I'd like to second Roy's point about the "has_data" attribute.  A real 
weakness of the current OPeNDAP/DAPPER formulation is that there is no 
standard way defined to request (say)

    "give me all of the profiles in <space-time region> that contain 
measurements of TEMPERATURE AND SALINITY"

In real-world practice, collections of observations very frequently have 
lists of variables that vary from one site to the next.  The ability to 
constrain requests to only those observations that contain the variables 
of interest seems pretty fundamental.
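
As a rough illustration of the kind of request described above, the 
sketch below filters a collection of profiles on the client side, by a 
space-time box and by which variables each profile actually contains.  
The Profile class, its field names, and select_profiles are hypothetical 
and exist only for this example; nothing like them is defined by OPeNDAP 
or Dapper, and missing values are assumed to be stored as None.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical in-memory profile: position, time, per-variable values."""
    lat: float
    lon: float
    time: float  # assumed: days since some agreed epoch
    data: dict = field(default_factory=dict)  # name -> list of values, None = missing


def has_values(profile, name):
    """True if the profile holds at least one non-missing value for `name`."""
    return any(v is not None for v in profile.data.get(name, []))


def select_profiles(profiles, lat_range, lon_range, time_range, required):
    """Return profiles inside the box that contain every variable in `required`."""
    def inside(p):
        return (lat_range[0] <= p.lat <= lat_range[1]
                and lon_range[0] <= p.lon <= lon_range[1]
                and time_range[0] <= p.time <= time_range[1])

    return [p for p in profiles
            if inside(p) and all(has_values(p, v) for v in required)]
```

A call such as select_profiles(profiles, (0, 20), (-160, -140), (0, 10), 
["TEMPERATURE", "SALINITY"]) would then express the request above, though 
only after all the candidate profiles have already been transferred, which 
is exactly the cost a server-side constraint would avoid.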

    - Steve

===========================================

Roy Mendelssohn wrote:
> First, it is nice to see some efforts to agree on conventions for 
> in-situ data.  It is long overdue and will simplify a lot of the work 
> we  (ERD) do.
>
> I have been trying to follow this discussion as best as I can, and 
> would like to add several comments that may be somewhat orthogonal to 
> previous discussion points.  I would like to add that some of the 
> points raised in previous emails appear more about how to store 
> in-situ data, rather than a formal convention for transmitting them in 
> OPeNDAP.  I assume that this is the primary purpose of the specification.
>
> We now have several Dapper  servers serving a fairly large amount of 
> data using the precursor to this specification (though it is 
> essentially the same),  and my comments are directed at the types of 
> in-situ data that we have found do not make for a natural fit with 
> this convention.  It may be that for a first pass we do not want to 
> include this in the specification, as no one spec will always please 
> everybody.
>
> 1.  Station data with an inexact station location.  In many fisheries 
> and oceanographic surveys data are taken at "stations" but the 
> location is inexact, so that it is necessary to have changing lat/lon 
> information with the observations.  You can do this by having a 
> separate  "file" for each profile, and having the station number as a 
> variable in the inner sequence, or including the lat/lon in the inner 
> sequence (which to some extent would violate the convention), but 
> since most programs will look to the outer sequence for coordinate 
> type information, neither of these solutions works that well.  A 
> possibility would be to have an option to have station number in the 
> inner sequence in a set way, and that server/clients know to look for 
> this  (the present Dapper server actually does this).
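
One way to picture the option described in point 1 is sketched below: 
each profile keeps the station number and the position actually occupied 
in its inner rows, and a client groups profiles by that station number.  
The dictionary layout and the key names ("inner", "station") are 
assumptions made for this example, not part of the Dapper convention.

```python
from collections import defaultdict


def group_by_station(profiles):
    """Collect profiles by the station number stored in their inner rows.

    Each profile is assumed to be a dict with an "inner" list of row dicts,
    and every row is assumed to carry a "station" key.
    """
    stations = defaultdict(list)
    for profile in profiles:
        ids = {row["station"] for row in profile["inner"]}
        if len(ids) != 1:
            raise ValueError("expected exactly one station number per profile")
        stations[ids.pop()].append(profile)
    return dict(stations)
```

Grouping on the station number rather than on lat/lon means that two 
casts taken a few hundred metres apart still end up in the same series.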
>
> 2.  Ragged arrays and either z or t in the inner sequence.  NetCDF-4 
> will have ragged arrays - though I haven't had a chance yet to see how 
> they handle the dimensioning for the ragged array.  Do we want 
> something that can handle that in one file?  Again, take the case of a 
> set of profiles at depth at a "station" with inexact positions, where 
> we would like to send all the data from that station together.  
> When you have subsurface data, the biggest problem is that 
> the depths vary with  each profile.  So to combine the profiles, you 
> either have a depth dimension with all possible depths and a lot of 
> missing data, or else you do one "file" per profile. The latter, 
> combined with either the 't' or 'z' axis in the inner sequence, tends 
> to make it so we can't readily do time series from the same station, 
> though that is an obvious thing to want to do  (to be more precise - 
> clearly one could do that if they know what to look for in our files 
> and that they have that structure - but there is nothing in the spec 
> per se that would make this a general solution). So do we want 
> something in the spec that describes ragged arrays?
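
The trade-off in point 2 can be sketched as follows: a packed "ragged" 
layout (flat arrays plus a count per profile) versus a single shared 
depth axis padded with missing values.  This is only an illustration of 
the two layouts, not a description of how NetCDF-4 or Dapper stores 
anything; the function names are made up for the example, and each 
profile is assumed to be a list of (depth, value) pairs.

```python
def to_ragged(profiles):
    """Pack variable-length profiles into flat arrays plus a per-profile count."""
    depths, values, counts = [], [], []
    for profile in profiles:
        counts.append(len(profile))
        for depth, value in profile:
            depths.append(depth)
            values.append(value)
    return depths, values, counts


def from_ragged(depths, values, counts):
    """Rebuild the list of profiles from the packed form."""
    profiles, start = [], 0
    for n in counts:
        profiles.append(list(zip(depths[start:start + n], values[start:start + n])))
        start += n
    return profiles


def to_padded(profiles, missing=None):
    """Alternative: one depth axis holding every depth seen, padded with `missing`."""
    axis = sorted({d for p in profiles for d, _ in p})
    rows = []
    for p in profiles:
        lookup = dict(p)
        rows.append([lookup.get(d, missing) for d in axis])
    return axis, rows
```

The packed form stores only the values that exist, while the padded form 
grows with the number of distinct depths across all profiles, which is 
the "lot of missing data" problem mentioned above.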
>
> 3. has_data attribute.  To use the spec effectively, particularly in a 
> time series sense, we have found that any parameter in the inner 
> sequence should always be there, but often it will not be observed while 
> other parameters were.  Rather than having to look at the data itself 
> to see if it is totally missing, do we want to require a "has_data" 
> attribute?
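
A minimal sketch of how such a flag might be derived on the server side, 
so that clients do not have to scan the values themselves.  The fill 
value, the row layout (one dict per inner-sequence row), and the function 
name are all assumptions for this example; the convention under 
discussion does not define them.

```python
MISSING = -9999.0  # assumed fill value; the real marker is dataset-specific


def has_data_flags(inner_rows, variables, missing=MISSING):
    """Map each variable name to True if any inner row has a real value for it."""
    flags = {name: False for name in variables}
    for row in inner_rows:
        for name in variables:
            # A variable absent from a row is treated the same as a fill value.
            if row.get(name, missing) != missing:
                flags[name] = True
    return flags
```

Note that this compares against a single sentinel; datasets that mark 
missing values with NaN would need a different test.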
>
> I hope these comments are at least somewhat clear.  I am a little 
> fuzzy-headed normally and a bad cold hasn't helped.  May have more 
> comments but those are my initial ones.
>
> BTW - for others on my staff who are not on the mail-list - is the 
> sequence of emails being archived somewhere that they can view them? 
> The discussion has been very interesting.
>
> -Roy M.

-- 

Steve Hankin, NOAA/PMEL -- Steven.C.Hankin at noaa.gov
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744



More information about the Opendap-tech mailing list