NOAA ERDDAP™
Easier access to scientific data
   
Brought to you by NOAA NMFS SWFSC ERD    
 

Working with the datasets.xml File

[This web page will only be of interest to ERDDAP™ administrators.]

After you have followed the ERDDAP™ installation instructions, you must edit the datasets.xml file in tomcat/content/erddap/ to describe the datasets that your ERDDAP™ installation will serve.


Introduction

Some Assembly Required
Setting up a dataset in ERDDAP™ isn't just a matter of pointing to the dataset's directory or URL. You have to write a chunk of XML for datasets.xml which describes the dataset.

If you buy into these ideas and expend the effort to create the XML for datasets.xml, you get all the advantages of ERDDAP™.

Making the datasets.xml takes considerable effort for the first few datasets, but it gets easier. After the first dataset, you can often re-use a lot of your work for the next dataset. Fortunately, ERDDAP™ comes with two Tools to help you create the XML for each dataset in datasets.xml.
If you get stuck, please send an email with the details to erd dot data at noaa dot gov.
Or, you can join the ERDDAP™ Google Group / Mailing List and post your question there.

Data Provider Form
When a data provider comes to you hoping to add some data to your ERDDAP, it can be difficult and time-consuming to collect all of the metadata (information about the dataset) needed to add the dataset to ERDDAP. Many data sources (for example, .csv files, Excel files, databases) have no internal metadata, so ERDDAP™ has a Data Provider Form which gathers metadata from the data provider and gives the data provider some other guidance, including extensive guidance for Data In Databases. The information submitted is converted into the datasets.xml format, emailed to the ERDDAP™ administrator (you), and appended to bigParentDirectory/logs/dataProviderForm.log. Thus, the form semi-automates the process of getting a dataset into ERDDAP, but the ERDDAP™ administrator still has to complete the datasets.xml chunk and deal with getting the data file(s) from the provider or connecting to the database.

The submission of actual data files from external sources is a huge security risk, so ERDDAP™ does not deal with that. You have to figure out a solution that works for you and the data provider, for example, email (for small files), pull from the cloud (for example, Dropbox or Google Drive), an sftp site (with passwords), or sneakerNet (a USB thumb drive or external hard drive). You should probably only accept files from people you know. You will need to scan the files for viruses and take other security precautions.

There isn't a link in ERDDAP™ to the Data Provider Form (for example, on the ERDDAP™ home page). Instead, when someone tells you they want to have their data served by your ERDDAP, you can send them an email saying something like:
Yes, we can get your data into ERDDAP. To get started, please fill out the form at https://yourUrl/erddap/dataProviderForm.html (or http:// if https:// isn't enabled).
After you finish, I'll contact you to work out the final details.

If you just want to look at the form (without filling it out), you can see the form on ERD's ERDDAP: Introduction, Part 1, Part 2, Part 3, and Part 4. These links on the ERD ERDDAP™ send information to me, not you, so don't submit information with them unless you actually want to add data to the ERD ERDDAP.

If you want to remove the Data Provider Form from your ERDDAP™, put
<dataProviderFormActive>false</dataProviderFormActive>
in your setup.xml file.

The impetus for this was NOAA's 2014 Public Access to Research Results (PARR) directive (external link), which requires that all NOAA environmental data funded through taxpayer dollars be made available via a data service (not just files) within 12 months of creation. So there is increased interest in using ERDDAP™ to make datasets available via a service ASAP. We needed a more efficient way to deal with a large number of data providers.

Feedback/Suggestions? This form is new, so please email erd dot data at noaa dot gov if you have any feedback or suggestions for improving this.

Tools
ERDDAP™ comes with two command line programs which are tools to help you create the XML for each dataset that you want your ERDDAP™ to serve. Once you have set up ERDDAP™ and run it (at least one time), you can find and use these programs in the tomcat/webapps/erddap/WEB-INF directory. There are Linux/Unix shell scripts (with the extension .sh) and Windows scripts (with the extension .bat) for each program. [On Linux, run these tools as the same user (tomcat?) that will run Tomcat.] When you run each program, it will ask you questions. For each question, type a response, then press Enter. Or press ^C to exit a program at any time.

Program won't run?

The tools print various diagnostic messages:

The two tools are a big help, but you still must read all of these instructions on this page carefully and make important decisions yourself.

Bonus Third-Party Tool: ERDDAP-lint
ERDDAP-lint is a program from Rob Fuller and Adam Leadbetter of the Irish Marine Institute that you can use to improve the metadata of your ERDDAP™ datasets. ERDDAP-lint "contains rules and a simple static web application for running some verification tests against your ERDDAP™ server. All the tests are run in the web browser." Like the Unix/Linux lint tool (external link), you can edit the existing rules or add new rules. See ERDDAP-lint (external link) for more information.

This tool is especially useful for datasets that you created some time ago and now want to bring up-to-date with your current metadata preferences. For example, early versions of GenerateDatasetsXml didn't put any effort into creating global creator_name, creator_email, creator_type, or creator_url metadata. You could use ERDDAP-lint to identify the datasets that lack those metadata attributes.

Thanks to Rob and Adam for creating this tool and making it available to the ERDDAP™ community.
 

The Basic Structure of the datasets.xml File
The required and optional tags allowed in a datasets.xml file (and the number of times they may appear) are shown below. In practice, your datasets.xml will have lots of <dataset> tags and only use the other tags within <erddapDatasets> as needed.

<?xml version="1.0" encoding="ISO-8859-1" ?>
<erddapDatasets>
  <angularDegreeUnits>...</angularDegreeUnits> <!-- 0 or 1 -->
  <angularDegreeTrueUnits>...</angularDegreeTrueUnits> <!-- 0 or 1 -->
  <cacheMinutes>...</cacheMinutes> <!-- 0 or 1 -->
  <commonStandardNames>...</commonStandardNames> <!-- 0 or 1 -->
  <convertInterpolateDatasetIDVariableExample /> <!-- 0 or more -->
  <convertInterpolateDatasetIDVariableList /> <!-- 0 or more -->
  <convertToPublicSourceUrl /> <!-- 0 or more -->
  <decompressedCacheMaxGB>...</decompressedCacheMaxGB> <!-- 0 or 1 -->
  <decompressedCacheMaxMinutesOld>...</decompressedCacheMaxMinutesOld> <!-- 0 or 1 -->
  <drawLandMask>...</drawLandMask> <!-- 0 or 1 -->
  <emailDiagnosticsToErdData>...</emailDiagnosticsToErdData> <!-- 0 or 1 -->
  <graphBackgroundColor>...</graphBackgroundColor> <!-- 0 or 1 -->
  <ipAddressMaxRequests>...</ipAddressMaxRequests> <!-- 0 or 1 -->
  <ipAddressMaxRequestsActive>...</ipAddressMaxRequestsActive> <!-- 0 or 1 -->
  <ipAddressUnlimited>...</ipAddressUnlimited> <!-- 0 or 1 -->
  <loadDatasetsMinMinutes>...</loadDatasetsMinMinutes> <!-- 0 or 1 -->
  <loadDatasetsMaxMinutes>...</loadDatasetsMaxMinutes> <!-- 0 or 1 -->
  <logLevel>...</logLevel> <!-- 0 or 1 -->
  <nGridThreads>...</nGridThreads> <!-- 0 or 1 -->
  <nTableThreads>...</nTableThreads> <!-- 0 or 1 -->
  <palettes>...</palettes> <!-- 0 or 1 -->
  <partialRequestMaxBytes>...</partialRequestMaxBytes> <!-- 0 or 1 -->
  <partialRequestMaxCells>...</partialRequestMaxCells> <!-- 0 or 1 -->
  <requestBlacklist>...</requestBlacklist> <!-- 0 or 1 -->
  <slowDownTroubleMillis>...</slowDownTroubleMillis> <!-- 0 or 1 -->
  <subscriptionEmailBlacklist>...</subscriptionEmailBlacklist> <!-- 0 or 1 -->
  <unusualActivity>...</unusualActivity> <!-- 0 or 1 -->
  <updateMaxEvents>...</updateMaxEvents> <!-- 0 or 1 -->

  <standardLicense>...</standardLicense> <!-- 0 or 1 -->
  <standardContact>...</standardContact> <!-- 0 or 1 -->
  <standardDataLicenses>...</standardDataLicenses> <!-- 0 or 1 -->
  <standardDisclaimerOfEndorsement>...</standardDisclaimerOfEndorsement> <!-- 0 or 1 -->
  <standardDisclaimerOfExternalLinks>...</standardDisclaimerOfExternalLinks> <!-- 0 or 1 -->
  <standardGeneralDisclaimer>...</standardGeneralDisclaimer> <!-- 0 or 1 -->
  <standardPrivacyPolicy>...</standardPrivacyPolicy> <!-- 0 or 1 -->
  <startHeadHtml5>...</startHeadHtml5> <!-- 0 or 1 -->
  <startBodyHtml5>...</startBodyHtml5> <!-- 0 or 1 -->
  <theShortDescriptionHtml>...</theShortDescriptionHtml> <!-- 0 or 1 -->
  <endBodyHtml5>...</endBodyHtml5> <!-- 0 or 1 -->

  <user username="..." password="..." roles="..." /> <!-- 0 or more -->

  <dataset>...</dataset> <!-- 1 or more -->
</erddapDatasets>
It is possible that other encodings will be allowed in the future, but for now, only ISO-8859-1 is recommended.
 

XInclude
New in version 2.25 is support for XInclude. This requires that you use the SAX parser: set <useSaxParser>true</useSaxParser> in your setup.xml. XInclude lets you write each dataset in its own file and then include them all in the main datasets.xml, reuse parts of dataset definitions, or both. If you want to see an example, EDDTestDataset.java sets up XInclude to reuse variable definitions.
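
As a rough illustration of the one-file-per-dataset layout, here is a minimal sketch that uses the standard W3C XInclude syntax. The file names and the datasets/ subdirectory are hypothetical, and how ERDDAP™ resolves relative href values is an assumption that you should check against your own installation.

```xml
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- Main datasets.xml: each dataset definition is pulled in from its own file. -->
<!-- The xi prefix is bound to the standard XInclude namespace. -->
<erddapDatasets xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- Hypothetical file names; each file would hold one complete dataset chunk. -->
  <xi:include href="datasets/myGridDataset.xml" />
  <xi:include href="datasets/myTableDataset.xml" />
</erddapDatasets>
```

With this layout, each included file can be edited and version-controlled separately, and the main file stays short.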
 


Notes

Working with the datasets.xml file is a non-trivial project. Please read all of these notes carefully. After you pick a dataset type, please read the detailed description of it carefully.
 

List of Dataset Types

If you need help choosing the right dataset type, see Choosing the Dataset Type.

The types of datasets fall into two categories. (Why?)


 

Detailed Descriptions of Dataset Types

EDDGridFromDap handles grid variables from DAP (external link) servers.

EDDGridFromEDDTable lets you convert an EDDTable tabular dataset into an EDDGrid gridded dataset. Remember that ERDDAP™ treats datasets as either gridded datasets (subclasses of EDDGrid) or tabular datasets (subclasses of EDDTable).

EDDGridFromErddap handles gridded data from a remote ERDDAP™ server.
EDDTableFromErddap handles tabular data from a remote ERDDAP™ server.
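
A minimal sketch of what these two dataset types might look like in datasets.xml follows. The datasetID values and the remote.example.org URLs are hypothetical placeholders; the sourceUrl is assumed to be the remote dataset's griddap or tabledap URL without a file type extension.

```xml
<!-- Gridded data re-served from a remote ERDDAP (hypothetical URL and IDs). -->
<dataset type="EDDGridFromErddap" datasetID="remoteGridExample" active="true">
  <sourceUrl>https://remote.example.org/erddap/griddap/someGridDatasetID</sourceUrl>
</dataset>

<!-- Tabular data re-served from a remote ERDDAP (hypothetical URL and IDs). -->
<dataset type="EDDTableFromErddap" datasetID="remoteTableExample" active="true">
  <sourceUrl>https://remote.example.org/erddap/tabledap/someTableDatasetID</sourceUrl>
</dataset>
```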

EDDGridFromEtopo just serves the ETOPO1 Global 1-Minute Gridded Elevation Data Set (external link) (Ice Surface, grid registered, binary, 2byte int: etopo1_ice_g_i2.zip) which is distributed with ERDDAP.

EDDGridFromFiles is the superclass of all EDDGridFrom...Files classes. You can't use EDDGridFromFiles directly. Instead, use a subclass of EDDGridFromFiles to handle the specific file type:

Currently, no other file types are supported. But it is usually relatively easy to add support for other file types. Contact us if you have a request. Or, if your data is in an old file format that you would like to move away from, we recommend converting the files to NetCDF v3 .nc files. NetCDF is a widely supported binary format that allows fast random access to the data and is already supported by ERDDAP.

Details -- The following information applies to all of the subclasses of EDDGridFromFiles.

EDDGridFromAudioFiles and EDDTableFromAudioFiles aggregate data from a collection of local audio files. (These first appeared in ERDDAP™ v1.82.) The difference is that EDDGridFromAudioFiles treats the data as a multidimensional dataset (usually with 2 dimensions: [file startTime] and [elapsedTime within a file]), whereas EDDTableFromAudioFiles treats the data as tabular data (usually with columns for the file startTime, the elapsedTime within the file, and the data from the audio channels). EDDGridFromAudioFiles requires that all files have the same number of samples, so if that is not true, you must use EDDTableFromAudioFiles. Otherwise, the choice of which EDD type to use is entirely yours. One advantage of EDDTableFromAudioFiles: you can add other variables with other information, e.g., stationID, stationType. In both cases, the lack of a unified time variable makes it more difficult to work with the data from these EDD types, but there was no good way to set up a unified time variable.

See these classes' superclasses, EDDGridFromFiles and EDDTableFromFiles, for general information on how these classes work and how to use them.

We strongly recommend using the GenerateDatasetsXml program to make a rough draft of the datasets.xml chunk for this dataset. Since audio files have no metadata other than information related to the encoding of the sound data, you will have to edit the output from GenerateDatasetsXml to provide essential information (e.g., title, summary, creator_name, institution, history).

Details:

EDDGridFromMergeIRFiles aggregates data from local, MergeIR (external link) files, which are from the Tropical Rainfall Measuring Mission (TRMM) (external link), which is a joint mission between NASA and the Japan Aerospace Exploration Agency (JAXA). MergeIR files can be downloaded from NASA (external link).

EDDGridFromMergeIRFiles.java was written and contributed to the ERDDAP™ project by Jonathan Lafite and Philippe Makowski of R.Tech Engineering (license: copyrighted open source).

EDDGridFromMergeIRFiles is a little unusual:

See this class' superclass, EDDGridFromFiles, for general information on how this class works and how to use it.

We strongly recommend using the GenerateDatasetsXml program to make a rough draft of the datasets.xml chunk for this dataset. You can then edit that to fine tune it.
 

EDDGridFromNcFiles aggregates data from local, gridded, GRIB .grb and .grb2 (external link) files, HDF (v4 or v5) .hdf (external link) files, .ncml files, NetCDF (v3 or v4) .nc (external link) files, and Zarr (external link) files (as of version 2.25). Zarr files have slightly different behavior and require either the fileNameRegex or the pathRegex to include "zarr".

This may work with other file types (for example, BUFR); we just haven't tested it. Please send us some sample files.

EDDGridFromNcFilesUnpacked is a variant of EDDGridFromNcFiles which aggregates data from local, gridded NetCDF (v3 or v4) .nc and related files. The difference is that this class unpacks each data file before EDDGridFromFiles looks at the files:

The big advantage of this class is that it provides a way to deal with different values of scale_factor, add_offset, _FillValue, missing_value, or time units in different source files in a collection. Otherwise, you would have to use a tool like NcML or NCO to modify each file to remove the differences so that the files could be handled by EDDGridFromNcFiles. For this class to work properly, the files must follow the CF standards for the related attributes.

EDDGridLonPM180 modifies the longitude values of a child (enclosed) EDDGrid dataset that has some longitude values greater than 180 (for example, 0 to 360) so that they are in the range -180 to 180 (Longitude Plus or Minus 180, hence the name).

EDDGridLon0360 modifies the longitude values of a child (enclosed) EDDGrid dataset that has some longitude values less than 0 (for example, -180 to 180) so that they are in the range 0 to 360 (hence the name).
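
To make the parent/child (enclosing/enclosed) relationship concrete, here is a minimal sketch of an EDDGridLonPM180 dataset wrapping a child dataset. The datasetID values and the URL are hypothetical, the child is assumed to be an EDDGridFromErddap dataset, and giving the child a datasetID different from the parent's is also an assumption. An EDDGridLon0360 dataset would presumably be laid out the same way with the type name changed.

```xml
<!-- Parent: presents the child's longitudes in the range -180 to 180. -->
<dataset type="EDDGridLonPM180" datasetID="exampleSst_LonPM180" active="true">
  <!-- Child (enclosed) dataset whose source longitudes run 0 to 360. -->
  <dataset type="EDDGridFromErddap" datasetID="exampleSst_LonPM180Child">
    <sourceUrl>https://remote.example.org/erddap/griddap/exampleSst</sourceUrl>
  </dataset>
</dataset>
```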

EDDGridSideBySide aggregates two or more EDDGrid datasets (the children) side by side.

EDDGridAggregateExistingDimension aggregates two or more EDDGrid datasets each of which has a different range of values for the first dimension, but identical values for the other dimensions.

EDDGridCopy makes and maintains a local copy of another EDDGrid's data and serves data from the local copy.

EDDTableFromCassandra handles data from one Cassandra (external link) table. Cassandra is a NoSQL database.

EDDTableFromDapSequence handles variables within 1- and 2-level sequences from DAP (external link) servers such as DAPPER (was at https://www.pmel.noaa.gov/epic/software/dapper/, now discontinued).

EDDTableFromDatabase handles data from one relational database table or view (external link).

EDDTableFromEDDGrid lets you create an EDDTable dataset from any EDDGrid dataset.

EDDTableFromFileNames creates a dataset from information about a group of files in the server's file system, including a URL for each file so that users can download the files via ERDDAP's "files" system. Unlike all of the EDDTableFromFiles subclasses, this dataset type does not serve data from within the files.

EDDTableFromFiles is the superclass of all EDDTableFrom...Files classes. You can't use EDDTableFromFiles directly. Instead, use a subclass of EDDTableFromFiles to handle the specific file type:

Currently, no other file types are supported. But it is usually relatively easy to add support for other file types. Contact us if you have a request. Or, if your data is in an old file format that you would like to move away from, we recommend converting the files to NetCDF v3 .nc files (and especially .nc files with the CF Discrete Sampling Geometries (DSG) (external link) Contiguous Ragged Array data structure -- ERDDAP™ can extract data from them very quickly). NetCDF is a widely supported binary format that allows fast random access to the data and is already supported by ERDDAP.

Details -- The following information applies to all of the subclasses of EDDTableFromFiles.

EDDTableFromAsciiService is essentially a screen scraper. It is intended to deal with data sources which have a simple web service for requesting data (often an HTML form on a web page) and which can return the data in some structured ASCII format (for example, a comma-separated-value or columnar ASCII text format, often with other information before and/or after the data).

EDDTableFromAsciiService is the superclass of all EDDTableFromAsciiService... classes. You can't use EDDTableFromAsciiService directly. Instead, use a subclass of EDDTableFromAsciiService to handle specific types of services:

Currently, no other service types are supported. But it is usually relatively easy to support other services if they work in a similar way. Contact us if you have a request.

Details -- The following information applies to all of the subclasses of EDDTableFromAsciiService.

EDDTableFromAsciiServiceNOS makes EDDTable datasets from the ASCII text data services offered by NOAA's National Ocean Service (NOS) (external link). For information on how this class works and how to use it, see this class's superclass EDDTableFromAsciiService. It is unlikely that anyone other than Bob Simons will need to use this subclass.

Since the data within the response from a NOS service uses a columnar ASCII text format, data variables other than latitude and longitude must have a special attribute which specifies which characters of each data line contain that variable's data, for example,
<att name="responseSubstring">17, 25</att>
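
For context, an attribute like this would normally sit inside a variable's addAttributes block. The following is only a sketch: the variable name waterLevel and the float data type are hypothetical, and the meaning of the two numbers (which character positions they select) should be taken from the EDDTableFromAsciiService details, not from this example.

```xml
<dataVariable>
  <!-- Hypothetical variable; names and type are placeholders. -->
  <sourceName>waterLevel</sourceName>
  <destinationName>waterLevel</destinationName>
  <dataType>float</dataType>
  <addAttributes>
    <!-- Which characters of each data line hold this variable's value. -->
    <att name="responseSubstring">17, 25</att>
  </addAttributes>
</dataVariable>
```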
 

EDDTableFromAllDatasets is a higher-level dataset which has information about all of the other datasets which are currently loaded in your ERDDAP. Unlike other types of datasets, there is no specification for the allDatasets dataset in datasets.xml. ERDDAP™ automatically creates one EDDTableFromAllDatasets dataset (with datasetID=allDatasets). Thus, an allDatasets dataset will be created in each ERDDAP™ installation and will work the same way in each ERDDAP™ installation.

The allDatasets dataset is a tabular dataset. It has a row of information for each dataset. It has columns with information about each dataset, e.g., datasetID, accessible, institution, title, minLongitude, maxLongitude, minLatitude, maxLatitude, minTime, maxTime, etc. Because allDatasets is a tabular dataset, you can query it the same way you can query any other tabular dataset in ERDDAP™, and you can specify the file type for the response. This lets users search for datasets of interest in very powerful ways.
 

EDDTableFromAsciiFiles aggregates data from comma-, tab-, semicolon-, or space-separated tabular ASCII data files.

EDDTableFromAwsXmlFiles aggregates data from a set of Automatic Weather Station (AWS) XML data files using the WeatherBug Rest XML API (which is no longer active).

EDDTableFromColumnarAsciiFiles aggregates data from tabular ASCII data files with fixed-width columns.

EDDTableFromHttpGet is different from all other types of datasets in ERDDAP™ in that it has a system whereby specific "authors" can add data, revise data, or delete data from the dataset by regular HTTP GET or POST requests from a computer program, a script or a browser. The dataset is queryable by users in the same way that all other EDDTable datasets are queryable in ERDDAP. See the description of this class's superclass, EDDTableFromFiles, to read about the features which are inherited from that superclass.

The unique features of EDDTableFromHttpGet are described below. You need to read all of this initial section and understand it; otherwise, you may have unrealistic expectations or get yourself into trouble that is hard to fix.

Set Up

Here are the recommended steps to setting up an EDDTableFromHttpGet dataset:
  1. Make the main directory to hold this dataset's data. For this example, let's use /data/testGet/ . The user running GenerateDatasetsXml and the user running ERDDAP™ must both have read-write access to this directory.
     
  2. Use a text editor to make a sample JSON Lines CSV file with the extension .jsonl in that directory.
    The name isn't important. For example, you could call it sample.jsonl
    Make a 2-line JSON Lines CSV file, with column names on the first line and dummy/typical values (of the correct data type) on the second line. Here is a sample file that is suitable for a collection of featureType=TimeSeries data that measured air and water temperature.
    [For featureType=Trajectory, you might change stationID to be trajectoryID.]
    [For featureType=Profile, you might change stationID to be profileID and add a depth variable.]
    ["stationID", "time", "latitude", "longitude", "airTemp", "waterTemp", "timestamp", "author", "command"]
    ["myStation", "2018-06-25T17:00:00Z", 0.0, 0.0, 0.0, 0.0, 0.0, "SomeBody", 0]
    
    Note:
    • The actual data values don't matter because you will eventually delete this file, but they should be of the correct data type. Notably, the time variable should use the same format that the actual data from the source will use.
    • For all variables, the sourceName will equal the destinationName, so use the correct/final variable names now, including time, latitude, longitude and sometimes depth or altitude if variables with that information will be included.
    • There will almost always be a variable named time which records the time the observation was made. It can be dataType String with units suitable for string times (e.g., yyyy-MM-dd'T'HH:mm:ss.SSSZ) or dataType double with units suitable for numeric times (e.g., seconds since 1970-01-01T00:00:00Z, or some other base time).
    • Three of the columns (usually the last three) must be timestamp, author, command.
    • The timestamp column will be used by EDDTableFromHttpGet to add a timestamp indicating when it added a given line of data to the data file. It will have dataType double and units seconds since 1970-01-01T00:00:00Z.
    • The author column with dataType String will be used to record which authorized author provided this line's data. Authorized authors are specified by the httpGetKeys global attribute. Although the keys are specified as author_key and are in the "request" URL in that form, only the author part is saved in the data file.
    • The command column with dataType byte will indicate if the data on this line is an insertion (0) or a deletion (1).
       
  3. Run GenerateDatasetsXml and tell it
    1. The dataset type is EDDTableFromHttpGet
    2. The directory is (for this example) /data/testGet/
    3. The sample file is (for this example) /data/testGet/sample.jsonl
    4. The httpGetRequiredVariables are (for this example) stationID, time. See the description of httpGetRequiredVariables below.
    5. If data is collected every 5 minutes, the httpGetDirectoryStructure for this example is stationID/2months . See the description of httpGetDirectoryStructure below.
    6. The httpGetKeys
    Add the output (the chunk of datasets.xml for the dataset) to datasets.xml.
     
  4. Edit the datasets.xml chunk for this dataset to make it correct and complete.
    Notably, replace all the ??? with correct content.
     
  5. For the <fileTableInMemory> setting:
    • Set this to true if the dataset will usually get frequent .insert and/or .delete requests (e.g., more often than once every 10 seconds). This helps EDDTableFromHttpGet respond faster to .insert and/or .delete requests. If you set this to true, EDDTableFromHttpGet will still save the fileTable and related information to disk periodically (as needed, roughly every 5 seconds).
    • Set this to false (the default) if the dataset will usually get infrequent .insert and/or .delete requests (e.g., less often than once every 10 seconds).
       
  6. Note: It is possible to use <cacheFromUrl> and related settings in datasets.xml for EDDTableFromHttpGet datasets as a way to make and maintain a local copy of a remote EDDTableFromHttpGet dataset on another ERDDAP. However, in this case, this local dataset will reject any .insert and .delete requests.
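
The settings named in the steps above might end up looking something like the fragment below. This is a sketch, not complete GenerateDatasetsXml output: the datasetID and the author_key value are hypothetical, the three httpGet... settings are assumed to be global attributes (as the text says of httpGetKeys), and the dataVariable definitions and other required tags are left out.

```xml
<dataset type="EDDTableFromHttpGet" datasetID="testGet" active="true">
  <!-- Main directory from step 1. -->
  <fileDir>/data/testGet/</fileDir>
  <!-- Step 5: false (the default) suits infrequent .insert/.delete requests. -->
  <fileTableInMemory>false</fileTableInMemory>
  <addAttributes>
    <!-- Step 3.5: one subdirectory per station, one file per 2 months. -->
    <att name="httpGetDirectoryStructure">stationID/2months</att>
    <!-- Step 3.6: author_key pairs; this value is a made-up placeholder. -->
    <att name="httpGetKeys">SomeBody_replaceWithASecretKey</att>
    <!-- Step 3.4: variables that every request must supply. -->
    <att name="httpGetRequiredVariables">stationID, time</att>
  </addAttributes>
</dataset>
```

Only the author part of each key (here, SomeBody) would be written to the author column of the data files.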

Using EDDTableFromHttpGet Datasets

Detailed Information about EDDTableFromHttpGet

The topics are: Here is the detailed information:

EDDTableFromHyraxFiles (deprecated) aggregates data files with several variables, each with one or more shared dimensions (for example, time, altitude (or depth), latitude, longitude), and served by a Hyrax OPeNDAP server (external link).

EDDTableFromInvalidCRAFiles aggregates data from NetCDF (v3 or v4) .nc files which use a specific, invalid, variant of the CF DSG Contiguous Ragged Array (CRA) files. Although ERDDAP™ supports this file type, it is an invalid file type that no one should start using. Groups that currently use this file type are strongly encouraged to use ERDDAP™ to generate valid CF DSG CRA files and stop using these files.

Details: These files have multiple row_size variables, each with a sample_dimension attribute. The files are non-CF-standard files because the multiple sample (obs) dimensions are to be decoded and related to each other with this additional rule and promise that is not part of the CF DSG specification: "you can associate a given e.g., temperature value (temp_obs dimension) with a given depth value (z_obs dimension, the dimension with the most values), because: the temperature row_size (for a given cast) will be either 0 or equal to the corresponding depth row_size (for that cast) (that's the rule). So, if the temperature row_size isn't 0, then the n temperature values for that cast relate directly to the n depth values for that cast (that's the promise)."

Another problem with these files: the Principal_Investigator row_size variable doesn't have a sample_dimension attribute and doesn't follow the above rule.

Sample files for this dataset type can be found at https://data.nodc.noaa.gov/thredds/catalog/ncei/wod/ [2020-10-21 This server is no longer reliably available].

See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

We strongly recommend using the GenerateDatasetsXml program to make a rough draft of the datasets.xml chunk for this dataset. You can then edit that to fine tune it.

The first thing GenerateDatasetsXml does for this type of dataset after you answer the questions is print the ncdump-like structure of the sample file. So if you enter a few goofy answers for the first loop through GenerateDatasetsXml, at least you'll be able to see if ERDDAP™ can read the file and see what dimensions and variables are in the file. Then you can give better answers for the second loop through GenerateDatasetsXml.
 

EDDTableFromJsonlCSVFiles aggregates data from JSON Lines CSV files (external link). See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

EDDTableFromMultidimNcFiles aggregates data from NetCDF (v3 or v4) .nc (or .ncml) files with several variables, each with one or more shared dimensions. The files may have character variables with or without an additional dimension (for example, STRING14). See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

EDDTableFromNcFiles aggregates data from NetCDF (v3 or v4) .nc (or .ncml) files and Zarr (external link) files (as of version 2.25) with several variables, each with one shared dimension (for example, time) or more than one shared dimension (for example, time, altitude (or depth), latitude, longitude). The files must have the same dimension names. A given file may have multiple values for each of the dimensions and the values may be different in different source files. The files may have character variables with an additional dimension (for example, STRING14). See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

Zarr files have slightly different behavior and require either the fileNameRegex or the pathRegex to include "zarr".
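
One way to satisfy the "zarr" requirement is sketched in the fragment below. The directory, the datasetID, and the specific regular expressions are hypothetical, and the fragment omits the variable definitions and other tags that a complete dataset needs; the only point is that the text "zarr" appears in the pathRegex.

```xml
<dataset type="EDDTableFromNcFiles" datasetID="exampleZarrData" active="true">
  <!-- Hypothetical directory that contains the Zarr stores. -->
  <fileDir>/data/zarrExample/</fileDir>
  <fileNameRegex>.*</fileNameRegex>
  <recursive>true</recursive>
  <!-- The regex includes the text "zarr", as required for Zarr files. -->
  <pathRegex>.*zarr.*</pathRegex>
</dataset>
```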

EDDTableFromNcCFFiles aggregates data from NetCDF (v3 or v4) .nc (or .ncml) files which use one of the file formats specified by the CF Discrete Sampling Geometries (DSG) (external link) conventions. See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

For files using one of the multidimensional CF DSG variants, use EDDTableFromMultidimNcFiles instead.

The CF DSG conventions define dozens of file formats and include numerous minor variations. This class deals with all of the variations we are aware of, but we may have missed one (or more). So if this class can't read data from your CF DSG files, please email Chris.John at noaa.gov and include a sample file.
Or, you can join the ERDDAP™ Google Group / Mailing List and post your question there.

We strongly recommend using the GenerateDatasetsXml program to make a rough draft of the datasets.xml chunk for this dataset. You can then edit that to fine tune it.
 

EDDTableFromNccsvFiles aggregates data from NCCSV ASCII .csv files. See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

EDDTableFromNOS (DEPRECATED) handles data from a NOAA NOS (external link) source, which uses SOAP+XML for requests and responses. It is very specific to NOAA NOS's XML. See the sample EDDTableFromNOS dataset in datasets2.xml.
 

EDDTableFromOBIS handles data from an Ocean Biogeographic Information System (OBIS) server (was http://www.iobis.org ). It is possible that there are no more active servers which use this now out-of-date type of OBIS server system.

EDDTableFromParquetFiles handles data from Parquet (external link) files. See this class' superclass, EDDTableFromFiles, for information on how this class works and how to use it.

EDDTableFromSOS handles data from a Sensor Observation Service (SWE/SOS (external link)) server.

EDDTableFromThreddsFiles (deprecated) aggregates data files with several variables, each with one or more shared dimensions (for example, time, altitude (or depth), latitude, longitude), and served by a THREDDS OPeNDAP server (external link).

EDDTableFromWFSFiles (DEPRECATED) makes a local copy of all of the data from an ArcGIS MapServer WFS server so the data can then be re-served quickly to ERDDAP™ users.

EDDTableAggregateRows can make an EDDTable dataset from a group of "child" EDDTable datasets.

EDDTableCopy can make a local copy of many types of EDDTable datasets and then re-serve the data quickly from the local copy.


Details

Here are detailed descriptions of common tags and attributes.

Several tags can appear between the <dataset> and </dataset> tags.
There is some variation in which tags are allowed by which types of datasets. See the documentation for a specific type of dataset for details.


 

Contact

Questions, comments, suggestions? Please send an email to erd dot data at noaa dot gov and include the ERDDAP™ URL directly related to your question or comment.

Or, you can join the ERDDAP™ Google Group / Mailing List by visiting https://groups.google.com/forum/#!forum/erddap (external link) and clicking on "Apply for membership". Once you are a member, you can post your question there or search to see if the question has already been asked and answered.
 


ERDDAP, Version 2.25