Structures in Dapper dds & Loaddap
Joe Sirott
Joe.Sirott at noaa.gov
Thu May 4 12:45:47 PDT 2006
Hi All,
Just a bit of clarification on the Dapper attributes structures... Many
in-situ datasets have two classes of attributes -- those that are a
function of a given profile or time series (e.g. who the PI was, who
made the measurement, what instrument was used for the measurement) and
those that are a function of a given profile but are also
associated with a specific variable (e.g. the valid data range for, say, the
pressure). The first class is represented by the 'attributes' structure
and the second by the 'variable_attributes' structure. The reason that a
nested structure is used for the 'variable_attributes' structure:
    Structure {
        Structure {
            Float32 valid_range[2];
        } PRES;
    } variable_attributes;
is that the attribute needs to be associated with a variable. If the
structure wasn't nested, then some kind of naming convention would be
required:
    Structure {
        Float32 valid_range[2];
    } PRES_ATTRIBUTE;
which IMHO would make life harder for clients attempting to recover the
variable attributes.
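
To make the difference concrete, here is a minimal Python sketch of what a
client has to do in each case. It is not Dapper or loaddap code: plain dicts
stand in for decoded DAP Structures, the helper names are hypothetical, the
TEMP entry is added only for illustration, and the numeric ranges are
invented sample values.

```python
# Hypothetical in-memory form of one 'location' record after decoding.
# Plain dicts stand in for DAP Structures; the ranges are invented values.
nested_record = {
    "variable_attributes": {
        "PRES": {"valid_range": [0.0, 2000.0]},
        "TEMP": {"valid_range": [-2.0, 40.0]},
    },
}

# The same information under a naming convention: flat structures whose
# names end in a suffix that the client has to know about.
flat_record = {
    "PRES_ATTRIBUTE": {"valid_range": [0.0, 2000.0]},
    "TEMP_ATTRIBUTE": {"valid_range": [-2.0, 40.0]},
}


def attrs_nested(record, variable):
    """Nested layout: the variable name is the key, no name parsing needed."""
    return record["variable_attributes"].get(variable, {})


def attrs_by_convention(record, variable, suffix="_ATTRIBUTE"):
    """Flat layout: the client must rebuild the structure name itself."""
    return record.get(variable + suffix, {})


def variables_nested(record):
    """Listing the variables that carry attributes is a direct key walk."""
    return sorted(record["variable_attributes"])


def variables_by_convention(record, suffix="_ATTRIBUTE"):
    """Here the client has to pattern-match names and strip the suffix."""
    return sorted(
        name[: -len(suffix)] for name in record if name.endswith(suffix)
    )


if __name__ == "__main__":
    print(attrs_nested(nested_record, "PRES")["valid_range"])
    print(variables_by_convention(flat_record))
```

Both lookups work on this toy data, but the convention-based one depends on
the server and client agreeing on the suffix, and it gets ambiguous if a real
variable name happens to end in that suffix.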
- Joe
Thomas LOUBRIEU wrote:
> Hi Dan,
> Thanks very much for your quick response.
> It will take us some time to fully take into account the information
> you gave us, but I am already very happy to read the reason why the
> 'attributes' structure exists in dapper (for keeping its fields
> separated from the usual profile dimensions (x,y,z,t)).
> Actually I didn't understand that by myself. That 'attributes'
> structure used to look strange to me, but now I think it's a
> wonderful idea.
>
> Bye,
>
> Thomas
>
>
>
>
> Daniel Holloway wrote:
>> Hi Arnaud,
>>
>> There are several issues here; the primary culprit is the handling of
>> complex sequences, not structures. I'll provide some input inline below.
>>
>> On May 4, 2006, at 10:09 AM, Arnaud FOREST wrote:
>>
>>> Hi all,
>>>
>>> I work on an opendap server for in-situ vertical profiles stored in
>>> an Oracle DBMS.
>>> Our dds structure is copied from the 'argo' dapper interface
>>> (http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp.dds) but I
>>> notice that the matlab struct tools (Libdap: 3.6.2, loaddap: 3.5.2,
>>> Matlab: 7) don't completely handle the in-situ profile dapper
>>> interface.
>>>
>>> In particular, the 'structures' are not handled very well by the matlab
>>> client:
>>
>> The problem is with handling nested sequences for this particular
>> data source. The following data source is a complex structure that
>> loads fine with loaddap(v-3.5.2)
>>
>> http://test.opendap.org:8080/dods/dts/complex_structs.03.dds
>>
>> ---------
>> >> loaddap('http://test.opendap.org:8080/dods/dts/complex_structs.03')
>> >> whos
>>   Name           Size    Bytes  Class
>>
>>   Outermost      1x1      1472  struct array
>>
>> Grand total is 68 elements using 1472 bytes
>>
>> >> Outermost
>>
>> Outermost
>>
>> SimpleStructure: [1x1 struct]
>>
>> >> Outermost.SimpleStructure
>>
>> ans
>>
>> Innermost: [1x1 struct]
>>
>> >> Outermost.SimpleStructure.Innermost
>>
>> ans
>>
>> i32: [10x1 double]
>> ui32: [10x1 double]
>> i16: [10x1 double]
>> ui16: [10x1 double]
>> f32: [10x1 double]
>> f64: [10x1 double]
>>
>> >> Outermost.SimpleStructure.Innermost.i32
>>
>> ans
>>
>> 0
>> 2048
>> 4096
>> 6144
>> 8192
>> 10240
>> 12288
>> 14336
>> 16384
>> 18432
>>
>> >>
>>
>> ---------
>>
>> Place those structs inside a Sequence and baboom.... So yes,
>> there is a bug in loaddap with respect to these nested sequences. I
>> didn't realize it was there as I used loaddap recently to load
>> another Dapper data source and it worked fine.
>>
>> ---------
>> >> clear
>> >> loaddap('http://las.pfeg.noaa.gov/dods/ndbcMet/all_noaa_time_series.cdp?LAT,LON,WSPD1&LAT>39.58&LAT<43&TIME<=1105315200000&TIME>=1104537600000')
>>
>> >> whos
>>   Name          Size    Bytes  Class
>>
>>   location      1x1     11348  struct array
>>
>> Grand total is 1328 elements using 11348 bytes
>>
>> >> location
>>
>> location
>>
>> profile: [6x1 struct]
>> LON: [6x1 double]
>> LAT: [6x1 double]
>>
>> >> location.LAT(2)
>>
>> ans
>>
>> 42.7500
>>
>> >> location.LON(2)
>>
>> ans
>>
>> 235.1500
>>
>>
>> >> location.profile(2).WSPD1(1:5)
>>
>> ans
>>
>> 5.1000
>> 5.8000
>> 5.9000
>> 7.2000
>> 4.6000
>>
>> >>
>> ---------------
>>
>> I was not aware that Dapper output could cause loaddap to
>> fail, so I'll look into what problem this particular data source
>> is causing the client.
>>
>> Nested Sequences can be a bear to deal with...
>>
>>>
>>> 1) "Loaddap" doesn't support structures of structures; a
>>> fatal error occurs (the matlab crash dump is at the end of the
>>> message).
>>>
>>>
>>>
>>> Is there a known bug, or are updates foreseen, for structure
>>> handling in the loaddap matlab toolbox?
>>
>> There is now, at least for this particular data source.
>>>
>>> Why has the dapper interface been defined with so many structures in
>>> it?
>>
>>
>> I shouldn't try to explain the rationale behind the Dapper use
>> of Sequences but IMO it's a pretty good representation of the
>> underlying relationships.
>>
>> Basically Dapper provides the following form for all of its data
>> sources:
>>
>> Dataset {
>>
>>     Sequence { ... } location;
>>
>>     Structure { independent dims } constrained_ranges;
>>
>> } data source;
>>
>>
>> Each data source has a series of location data, and a single
>> structure listing the extent of the independent dimensions (x,y,z,t)
>> for the data source itself (as constrained by the request)
>>
>> ------------
>>
>> To represent the 'location' series Dapper uses a nested sequence
>> that captures the relationships between the parts of the series, such
>> that values which do not change for a particular series (in this
>> case the lat/lon/juld variables) are stored in the outer sequence
>> element. The variables that change most frequently, whether that is
>> a time series, a vertical profile, etc., are stored in the inner
>> sequence, in this case the variable 'profile'.
>>
>> -----------
>> Dataset {
>>     Sequence {
>>         Float64 JULD;
>>         Float32 LONGITUDE;
>>         Float32 LATITUDE;
>>         Int32 _id;
>>         Sequence {
>>             Float32 PSAL_QC;
>>             Float32 CNDC_ADJUSTED_QC;
>>             Float32 TEMP_ADJUSTED_ERROR;
>>             ...
>>         } profile;
>>         ...
>>     } location;
>> }
>>
>> ------------
>>
>> Logically, you have a set of locations, which have a JULD, LAT,
>> LON, and for each of these you have a series of observations (profile).
>>
>> The potentially confusing part is that for this data source Dapper
>> includes two additional structure variables, but it's important to
>> note that these structure variables reside in the outermost sequence
>> element of 'location'. That means that the values in the
>> 'attributes' structure and the 'variable_attributes' structure apply
>> to the outermost relationship. Long story short, the values in
>> 'attributes' like 'PLATFORM_NUMBER' don't change as a function of the
>> 'profile' series. The nested Structure 'variable_attributes' lists
>> the extent or range of the independent dimension implicit within the
>> nested sequence 'profile', which in this example is PRES (pressure).
>>
>> Extending the above, you have a set of locations, which have a
>> JULD, LAT, LON and attributes like PLATFORM_NUMBER, etc. The
>> range of the independent dim (PRES) for the series 'profile' is
>> contained in the structure
>> 'variable_attributes.PRES.valid_range[0:1]', and all the
>> observations recorded at that location are stored in the profile
>> series, which happens to use PRES as the independent dimension between
>> observations.
>>
>> ------------
>>
>> OK, I've probably butchered that explanation... There are a couple
>> of issues at work here that you should be aware of:
>>
>> 1: Not sure why they use a nested structure for
>> 'variable_attributes', maybe they envision supporting more than 2
>> levels of nesting to represent a complicated relationship but I think
>> you don't have to have nested structures for this particular variable.
>>
>> 2: If you write a server that uses nested sequences, the server
>> should serialize any constructor variables (e.g., structures) before
>> the inner nested sequence itself. I believe this is documented in a
>> recent RFC on the DAP but I'll have to double check. Logically, from
>> a client's perspective there's no difference between the orderings,
>> but from the existing implementations' standpoint there is a big
>> difference. I'm not sure if that's the reason why the client is
>> failing for this particular data source or not but will look into it.
>>
>>>
>>> Does anyone know what opendap clients are fully compliant with the
>>> dapper output (structures of structures, sequences of sequences,
>>> sequences of structures...) and what is planned to improve
>>> compatibility between the dapper interface and opendap clients
>>> (matlab, ferret, nco, Opendap Data connector, pyDAP, GrADS...)? I
>>> guess the C++ and Java APIs are fine.
>>>
>>
>> I can't speak for other client developers, but we will make every
>> effort to ensure that our supported clients can read any valid DAP
>> response. We support the Matlab, IDL and ODC clients, as well as the API
>> implementations we distribute. I doubt that every client application
>> will be able to support reading Dapper responses, since they won't easily
>> map into some of the underlying client APIs.
>>
>> Dan
>>
>>>
>>> Thanks a lot,
>>>
>>> Arnaud and Thomas
>>>
>>> --------------------- Matlab Crash Dump ---------------------
>>>
>>>
>>> >>loaddap('http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp?
>>> &location.JULD>1143929418000&location.JULD<1144447818000&location.LATITUDE>30
>>>
>>> &location.LATITUDE<50&location.LONGITUDE>-51&location.LONGITUDE<-5')
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> Segmentation violation detected at Thu May 4 15:06:08 2006
>>>> ------------------------------------------------------------------------
>>>>
>>>> Configuration:
>>>> MATLAB Version: 7.0.1.24704 (R14) Service Pack 1
>>>> MATLAB License: 122551
>>>> Operating System: Linux 2.6.9-11.EL #1 Wed Jun 8 16:59:52 CDT
>>>> 2005 i686
>>>> Window System: Hummingbird Communications Ltd. (7000), display
>>>> br144-122:0.0
>>>> Current Visual: 0x23 (class 4, depth 24)
>>>> Processor ID: x86 Family 15 Model 0 Stepping 10, GenuineIntel
>>>> Virtual Machine: Java is not enabled
>>>> Default Charset: UTF-8
>>>> Register State:
>>>> eax = 00000000 ebx = 00cbc148
>>>> ecx = 02d09560 edx = 0896bd40
>>>> esi = 025250d0 edi = 00000003
>>>> ebp = bfff8bc8 esp = bfff8bc8
>>>> eip = 00cb737b flg = 00210286
>>>> Stack Trace:
>>>> [0] loaddap.mexglx:vfprintf~(0x0896bd40, 0x025250d0, 0xbfffafb0
>>>> "location", 0x00cb8d64) + 13179 bytes
>>>> [1] loaddap.mexglx:0x00cb8e6c(0x08642338, 0xbfffb20c, 0x00cba7c6, 1)
>>>> [2] loaddap.mexglx:0x00cb97ba(0x08642338, 0x00cba429, 1, 0xbfffbe30)
>>>> [3] loaddap.mexglx:mexFunction~(0, 0xbfffbdd0, 1, 0xbfffbe30) +
>>>> 304 bytes
>>>> [4] libmex.so:mexRunMexFile(0, 0xbfffbdd0, 1, 0xbfffbe30) + 93 bytes
>>>> [5] libmex.so:Mfh_mex::dispatch_file(int, mxArray_tag**, int,
>>>> mxArray_tag**)(0x02b40fb0, 0, 0xbfffbdd0, 1) + 537 bytes
>>>> [6] libmwm_dispatcher.so:Mfh_file::dispatch_fh(int,
>>>> mxArray_tag**, int, mxArray_tag**)(0x02b40fb0, 0, 0xbfffbdd0, 1) +
>>>> 262 bytes
>>>> [7] libmwm_interpreter.so:inDispatchFromStack(455, 0x0869ad20
>>>> "loaddap", 0, 1) + 1240 bytes
>>>> [8] libmwm_interpreter.so:inDispatchCall(char const*, int, int,
>>>> int, int*, int*)(0x0869ad20 "loaddap", 455, 0, 1) + 112 bytes
>>>> [9] libmwm_interpreter.so:.L924(2, 0, 0, 0) + 165 bytes
>>>> [10] libmwm_interpreter.so:inInterPcodeSJ(inDebugCheck, int, int,
>>>> opcodes, inPcodeNest_tag*)(2, 0, 0, 0) + 315 bytes
>>>> [11] libmwm_interpreter.so:inInterPcode(2, 0, 0xbfffc3f8,
>>>> 0x0095e39b) + 93 bytes
>>>> [12] libmwm_interpreter.so:in_local_call_eval_function(int*,
>>>> _pcodeheader*, int*, mxArray_tag**, inDebugCheck)(0, 0xbfffcdf0,
>>>> 0xbfffce7c, 0xbfffcea8) + 163 bytes
>>>> [13]
>>>> libmwm_interpreter.so:inEvalStringWithIsVarFcn(_memory_context*,
>>>> char const*, EvalType, int, mxArray_tag**, inDebugCheck,
>>>> _pcodeheader*, int*, bool (*)(void*, char const*),
>>>> void*)(0x008dd468, 0x08744bd0 "loaddap('http://dapper.pmel.noaa..",
>>>> 0, 0) + 2358 bytes
>>>> [14] libmwm_interpreter.so:inEvalCmdNoEnd(0x08744bd0
>>>> "loaddap('http://dapper.pmel.noaa..", 0x08744bd0
>>>> "loaddap('http://dapper.pmel.noaa..", 0xbfffd048 ", 0x00de8c27) +
>>>> 85 bytes
>>>> [15] libmwbridge.so:mnParser(0x00cd21e8 "@@@", 0x00cd22d8
>>>> "mnParser", 1, 0xbfffd0a4) + 471 bytes
>>>> [16] libmwmcr.so:mcrInstance::mnParser()(0x080a01c0, 0,
>>>> 0xbffff398, 0x0804a902) + 96 bytes
>>>> [17] MATLAB:mcrMain(int, char**)(2, 0xbffff444, 0x0804ad1c,
>>>> 0xbffff3b8) + 308 bytes
>>>> [18] MATLAB:main(2, 0xbffff444, 0xbffff450, 0x0047ebe6) + 23 bytes
>>>> [19] libc.so.6:__libc_start_main~(0x0804a7c4, 2, 0xbffff444,
>>>> 0x0804a3d8) + 211 bytes
>>>> This error was detected while a MEX-file was running. If the MEX-file
>>>> is not an official MathWorks function, please examine its source code
>>>> for errors. Please consult the External Interfaces Guide for
>>>> information
>>>> on debugging MEX-files.
>>>
>>> FOREST Arnaud
>>