Structures in Dapper dds & Loaddap

Joe Sirott Joe.Sirott at noaa.gov
Thu May 4 12:45:47 PDT 2006


Hi All,

Just a bit of clarification on the Dapper attribute structures... Many 
in-situ datasets have two classes of attributes -- those that are a 
function of a given profile or time series (e.g. who the PI was, who 
made the measurement, what instrument was used for the measurement) and 
those that are a function of a given profile but are also 
associated with a variable (e.g. the valid data range for, say, the 
pressure).  The first class is represented by the 'attributes' structure 
and the second by the 'variable_attributes' structure. The reason that a 
nested structure is used for the 'variable_attributes' structure:

        Structure {
            Structure {
                Float32 valid_range[2];
            } PRES;
        } variable_attributes;

is that the attribute needs to be associated with a variable. If the 
structure wasn't nested, then some kind of naming convention would be 
required:

        Structure {
            Float32 valid_range[2];
        } PRES_ATTRIBUTE;

which IMHO would make life harder for clients attempting to recover the 
variable attributes.
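
To make the comparison concrete, here is a minimal Python sketch (not 
Dapper or client code) of how a client might recover the PRES attributes 
under each layout. It assumes the structures have already been decoded 
into plain nested dictionaries; the helper names, the "_ATTRIBUTE" 
suffix handling and the numeric range are made up for illustration.

```python
# Hypothetical decoded records; field names mirror the DDS fragments above,
# and the valid_range values are illustrative only.
nested = {"variable_attributes": {"PRES": {"valid_range": [0.0, 2000.0]}}}
flat = {"PRES_ATTRIBUTE": {"valid_range": [0.0, 2000.0]}}

SUFFIX = "_ATTRIBUTE"  # assumed naming convention for the flat layout


def attrs_nested(record, variable):
    """Nested layout: the variable name is itself a field name."""
    return record["variable_attributes"].get(variable, {})


def attrs_flat(record, variable):
    """Flat layout: the client has to know and rebuild the suffix."""
    return record.get(variable + SUFFIX, {})


def variables_nested(record):
    """Variables that carry attributes are simply the field names."""
    return sorted(record["variable_attributes"])


def variables_flat(record):
    """Here the same listing requires parsing the structure names."""
    return sorted(k[:-len(SUFFIX)] for k in record if k.endswith(SUFFIX))
```

With the nested layout the variable name is used directly as a key; with 
the flat layout every client has to agree on the suffix and strip it 
again to find out which variables have attributes.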

- Joe


Thomas LOUBRIEU wrote:
> Hi Dan,
> Thanks very much for your quick response.
> It will take us some time to fully take into account the information 
> you gave us, but I am already very happy to read the reason why the 
> 'attributes' structure exists in Dapper (to keep its fields 
> separated from the usual profile dimensions (x,y,z,t)).
> Actually, I hadn't understood that by myself. That 'attributes' 
> structure used to look strange to me, but now I think it's a 
> wonderful idea.
>
> Bye,
>
> Thomas
>
>
>
>
> Daniel Holloway wrote:
>> Hi Arnaud,
>>
>>     There are several issues here; the primary culprit is the handling 
>> of complex sequences, not structures.  I'll provide some input inline below.
>>
>> On May 4, 2006, at 10:09 AM, Arnaud FOREST wrote:
>>
>>> Hi all,
>>>
>>> I work on an OPeNDAP server for in-situ vertical profiles stored in 
>>> an Oracle DBMS.
>>> Our DDS structure is copied from the 'argo' Dapper interface 
>>> (http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp.dds) but I 
>>> notice that the Matlab struct tools (Libdap: 3.6.2, loaddap: 3.5.2, 
>>> Matlab: 7) don't completely handle the in-situ profile Dapper 
>>> interface.
>>>
>>> The 'structures' in particular are not handled very well by the Matlab 
>>> client:
>>
>>     The problem is with handling nested sequences for this particular 
>> data source.  The following data source is a complex structure that 
>> loads fine with loaddap(v-3.5.2)
>>
>>     http://test.opendap.org:8080/dods/dts/complex_structs.03.dds
>>
>>     ---------
>> >> loaddap('http://test.opendap.org:8080/dods/dts/complex_structs.03')
>> >> whos
>>   Name            Size                    Bytes  Class
>>
>>   Outermost       1x1                      1472  struct array
>>
>> Grand total is 68 elements using 1472 bytes
>>
>> >> Outermost
>>
>> Outermost
>>
>>     SimpleStructure: [1x1 struct]
>>
>> >> Outermost.SimpleStructure
>>
>> ans
>>
>>     Innermost: [1x1 struct]
>>
>> >> Outermost.SimpleStructure.Innermost
>>
>> ans
>>
>>      i32: [10x1 double]
>>     ui32: [10x1 double]
>>      i16: [10x1 double]
>>     ui16: [10x1 double]
>>      f32: [10x1 double]
>>      f64: [10x1 double]
>>
>> >> Outermost.SimpleStructure.Innermost.i32
>>
>> ans
>>
>>            0
>>         2048
>>         4096
>>         6144
>>         8192
>>        10240
>>        12288
>>        14336
>>        16384
>>        18432
>>
>> >>
>>
>> ---------
>>
>>      Place those structs inside a Sequence and baboom....   So yes, 
>> there is a bug in loaddap with respect to these nested sequences.  I 
>> didn't realize it was there as I used loaddap recently to load 
>> another Dapper data source and it worked fine.
>>
>> ---------
>> >> clear
>> >> loaddap('http://las.pfeg.noaa.gov/dods/ndbcMet/all_noaa_time_series.cdp?LAT,LON,WSPD1&LAT>39.58&LAT<43&TIME<=1105315200000&TIME>=1104537600000')
>>
>> >> whos
>>   Name           Size                    Bytes  Class
>>
>>   location       1x1                     11348  struct array
>>
>> Grand total is 1328 elements using 11348 bytes
>>
>> >> location
>>
>> location
>>
>>     profile: [6x1 struct]
>>         LON: [6x1 double]
>>         LAT: [6x1 double]
>>
>> >> location.LAT(2)
>>
>> ans
>>
>>    42.7500
>>
>> >> location.LON(2)
>>
>> ans
>>
>>   235.1500
>>
>>
>> >> location.profile(2).WSPD1(1:5)
>>
>> ans
>>
>>     5.1000
>>     5.8000
>>     5.9000
>>     7.2000
>>     4.6000
>>
>> >>
>> ---------------
>>
>>     So, I was not aware that Dapper output would cause loaddap to 
>> fail, so I'll look into what the problem might be that this 
>> particular data source is causing the client.
>>
>>     Nested Sequences can be a bear to deal with...
>>
>>>
>>>    1) "Loaddap" doesn't support a structure of structures; a fatal
>>>       error occurs. (The Matlab crash dump is at the end of the
>>>       message.)
>>>
>>>
>>>
>>> Is there a known bug, or are updates planned, for structure 
>>> handling in the loaddap Matlab toolbox?
>>
>>       There is now, at least for this particular data source.
>>>
>>> Why has the Dapper interface been defined with so many structures in 
>>> it?
>>
>>
>>       I shouldn't try to explain the rationale behind the Dapper use 
>> of Sequences but IMO it's a pretty good representation of the 
>> underlying relationships.
>>
>>      Basically Dapper provides the following form for all of its data 
>> sources:
>>
>>       Dataset {
>>
>>            Sequence {  ...  } location;
>>
>>            Structure {  independent dims } constrained_ranges;
>>
>>       }  data source;
>>
>>
>>       Each data source has a series of location data, and a single 
>> structure listing the extent of the independent dimensions (x,y,z,t) 
>> for the data source itself (as constrained by the request)
>>
>> ------------
>>
>>      To represent the 'location' series Dapper uses a nested sequence 
>> to represent the relationships between the parts of the series, such 
>> that values which do not change for a particular series (in this 
>> case the lat/lon/juld variables) are stored in the outer sequence 
>> element.   The variables that change most frequently, whether that is 
>> a time-series, or vertical profile, etc., are stored in the inner 
>> sequence, in this case the variable 'profile'.
>>
>> -----------
>> Dataset {
>>     Sequence {
>>         Float64 JULD;
>>         Float32 LONGITUDE;
>>         Float32 LATITUDE;
>>         Int32 _id;
>>         Sequence {
>>             Float32 PSAL_QC;
>>             Float32 CNDC_ADJUSTED_QC;
>>             Float32 TEMP_ADJUSTED_ERROR;
>>             ...
>>        } profile;
>>        ...
>>      } location;
>>  }
>>
>> ------------
>>
>>    Logically, you have a set of locations, which have a JULD, LAT, 
>> LON, and for each of these you have a series of observations (profile).
>>
>>    The potentially confusing part is that for this data source Dapper 
>> includes two additional structure variables, but it's important to 
>> note that these structure variables reside in the outermost sequence 
>> element of 'location'.   That means that the values in the 
>> 'attributes' structure and the 'variable_attributes' structure apply 
>> to the outermost relationship.  Long story short, the values in 
>> 'attributes' like 'PLATFORM_NUMBER' don't change as a function of the 
>> 'profile' series.  The nested Structure 'variable_attributes' lists 
>> the extent or range of the independent dimension implicit within the 
>> nested sequence 'profile', which in this example is PRES (pressure).
>>
>>      Extending the above, you have a set of locations, which have a 
>> JULD, LAT, LON and attributes like PLATFORM_NUMBER, etc.; the 
>> range of the independent dim (PRES) for the series 'profile' is 
>> contained in the structure 
>> 'variable_attributes.PRES.valid_range[0:1]'; and all the 
>> observations recorded at that location are stored in the profile 
>> series, which happens to use PRES as the independent dimension between 
>> observations.
>>
>> ------------
>>
>>   OK, I've probably butchered that explanation... There are a couple 
>> of issues at work here that you should be aware of:
>>
>>    1:  Not sure why they use a nested structure for 
>> 'variable_attributes', maybe they envision supporting more than 2 
>> levels of nesting to represent a complicated relationship but I think 
>> you don't have to have nested structures for this particular variable.
>>
>>    2:  If you write a server that uses nested sequences the server 
>> should serialize any constructor variables (e.g., structures) before 
>> the inner nested sequence itself.  I believe this is documented in a 
>> recent RFC on the DAP but I'll have to double check.  Logically, from 
>> a client's perspective there's no difference between the orderings, 
>> but from the existing implementations' standpoint there is a big 
>> difference.  I'm not sure if that's the reason why the client is 
>> failing for this particular data source or not but will look into it.
>>
>>>
>>> Does anyone know which OPeNDAP clients are fully compliant with the 
>>> Dapper output (structures of structures, sequences of sequences, 
>>> sequences of structures...), and what is planned to improve 
>>> the compatibility between the Dapper interface and OPeNDAP clients 
>>> (Matlab, Ferret, NCO, OPeNDAP Data Connector, pyDAP, GrADS...)? I 
>>> guess the C++ and Java APIs are fine.
>>>
>>
>>    I can't speak for other client developers, but we will make every 
>> effort to ensure that our supported clients can read any valid DAP 
>> response.  We support the Matlab, IDL and ODC clients, as well as the API 
>> implementations we distribute.   I doubt that every client application 
>> will be able to support reading Dapper responses; they won't easily 
>> map into some of the underlying client APIs.
>>
>>    Dan
>>
>>>
>>> Thanks a lot,
>>>
>>> Arnaud and Thomas
>>>
>>> --------------------- Matlab Crash Dump ---------------------
>>>
>>>
>>> >>loaddap('http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp?
>>> &location.JULD>1143929418000&location.JULD<1144447818000&location.LATITUDE>30 
>>>
>>> &location.LATITUDE<50&location.LONGITUDE>-51&location.LONGITUDE<-5')
>>>
>>>> ------------------------------------------------------------------------ 
>>>>
>>>>        Segmentation violation detected at Thu May  4 15:06:08 2006
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> Configuration:
>>>>   MATLAB Version:   7.0.1.24704 (R14) Service Pack 1
>>>>   MATLAB License:   122551
>>>>   Operating System: Linux 2.6.9-11.EL #1 Wed Jun 8 16:59:52 CDT 
>>>> 2005 i686
>>>>   Window System:    Hummingbird Communications Ltd. (7000), display 
>>>> br144-122:0.0
>>>>   Current Visual:   0x23 (class 4, depth 24)
>>>>   Processor ID:     x86 Family 15 Model 0 Stepping 10, GenuineIntel
>>>>   Virtual Machine:  Java is not enabled
>>>>   Default Charset:  UTF-8
>>>> Register State:
>>>>   eax = 00000000   ebx = 00cbc148
>>>>   ecx = 02d09560   edx = 0896bd40
>>>>   esi = 025250d0   edi = 00000003
>>>>   ebp = bfff8bc8   esp = bfff8bc8
>>>>   eip = 00cb737b   flg = 00210286
>>>> Stack Trace:
>>>>   [0] loaddap.mexglx:vfprintf~(0x0896bd40, 0x025250d0, 0xbfffafb0 
>>>> "location", 0x00cb8d64) + 13179 bytes
>>>>   [1] loaddap.mexglx:0x00cb8e6c(0x08642338, 0xbfffb20c, 0x00cba7c6, 1)
>>>>   [2] loaddap.mexglx:0x00cb97ba(0x08642338, 0x00cba429, 1, 0xbfffbe30)
>>>>   [3] loaddap.mexglx:mexFunction~(0, 0xbfffbdd0, 1, 0xbfffbe30) + 
>>>> 304 bytes
>>>>   [4] libmex.so:mexRunMexFile(0, 0xbfffbdd0, 1, 0xbfffbe30) + 93 bytes
>>>>   [5] libmex.so:Mfh_mex::dispatch_file(int, mxArray_tag**, int, 
>>>> mxArray_tag**)(0x02b40fb0, 0, 0xbfffbdd0, 1) + 537 bytes
>>>>   [6] libmwm_dispatcher.so:Mfh_file::dispatch_fh(int, 
>>>> mxArray_tag**, int, mxArray_tag**)(0x02b40fb0, 0, 0xbfffbdd0, 1) + 
>>>> 262 bytes
>>>>   [7] libmwm_interpreter.so:inDispatchFromStack(455, 0x0869ad20 
>>>> "loaddap", 0, 1) + 1240 bytes
>>>>   [8] libmwm_interpreter.so:inDispatchCall(char const*, int, int, 
>>>> int, int*, int*)(0x0869ad20 "loaddap", 455, 0, 1) + 112 bytes
>>>>   [9] libmwm_interpreter.so:.L924(2, 0, 0, 0) + 165 bytes
>>>>   [10] libmwm_interpreter.so:inInterPcodeSJ(inDebugCheck, int, int, 
>>>> opcodes, inPcodeNest_tag*)(2, 0, 0, 0) + 315 bytes
>>>>   [11] libmwm_interpreter.so:inInterPcode(2, 0, 0xbfffc3f8, 
>>>> 0x0095e39b) + 93 bytes
>>>>   [12] libmwm_interpreter.so:in_local_call_eval_function(int*, 
>>>> _pcodeheader*, int*, mxArray_tag**, inDebugCheck)(0, 0xbfffcdf0, 
>>>> 0xbfffce7c, 0xbfffcea8) + 163 bytes
>>>>   [13] 
>>>> libmwm_interpreter.so:inEvalStringWithIsVarFcn(_memory_context*, 
>>>> char const*, EvalType, int, mxArray_tag**, inDebugCheck, 
>>>> _pcodeheader*, int*, bool (*)(void*, char const*), 
>>>> void*)(0x008dd468, 0x08744bd0 "loaddap('http://dapper.pmel.noaa..", 
>>>> 0, 0) + 2358 bytes
>>>>   [14] libmwm_interpreter.so:inEvalCmdNoEnd(0x08744bd0 
>>>> "loaddap('http://dapper.pmel.noaa..", 0x08744bd0 
>>>> "loaddap('http://dapper.pmel.noaa..", 0xbfffd048 ", 0x00de8c27) + 
>>>> 85 bytes
>>>>   [15] libmwbridge.so:mnParser(0x00cd21e8 "@@@", 0x00cd22d8 
>>>> "mnParser", 1, 0xbfffd0a4) + 471 bytes
>>>>   [16] libmwmcr.so:mcrInstance::mnParser()(0x080a01c0, 0, 
>>>> 0xbffff398, 0x0804a902) + 96 bytes
>>>>   [17] MATLAB:mcrMain(int, char**)(2, 0xbffff444, 0x0804ad1c, 
>>>> 0xbffff3b8) + 308 bytes
>>>>   [18] MATLAB:main(2, 0xbffff444, 0xbffff450, 0x0047ebe6) + 23 bytes
>>>>   [19] libc.so.6:__libc_start_main~(0x0804a7c4, 2, 0xbffff444, 
>>>> 0x0804a3d8) + 211 bytes
>>>> This error was detected while a MEX-file was running.  If the MEX-file
>>>> is not an official MathWorks function, please examine its source code
>>>> for errors.  Please consult the External Interfaces Guide for 
>>>> information
>>>> on debugging MEX-files.
>>>
>>> FOREST Arnaud
>>


