EML, KNB, and ERDDAP

Bob Simons
DOC / NOAA / NMFS / SWFSC / ERD
Monterey, CA
[email protected]
(Special thanks to Margaret O'Brien)

What ERDDAP looks like to a user:
[Diagram: ERDDAP and Your Favorite Client Software]

What ERDDAP looks like to a data provider:
[Diagram: data sources (OBIS, SOS, THREDDS, Hyrax, Database, Files, ...) connected through ERDDAP to Your Favorite Client Software, with steps numbered 1 to 4]

Acting as a middleman allows ERDDAP to:
- Improve each dataset's metadata.
- Generate ISO 19115 metadata.
- Standardize the format of time data.
- Provide a unified way for users to search for datasets.
- Offer a standard way to request data from any dataset.
- Let users specify the response file format.
- Make life easier for data providers and for users.

Gridded Data

A RESTful URL specifies an entire request (dataset, response file type, subset):
http://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.html?analysed_sst[(2016-07-01T09:00:00Z)][(-89.99):(89.99)][(-179.00):(180)]

Special file types: .html (Data Access Form), .graph (graph form), .fgdc, .iso19115, .das, .dds
Data file types: .asc, .csv, .esriAscii, .json, .htmlTable, .mat, .nc, .odvTxt, .tsv, ...
Image file types: .geotif, .kml, .pdf, .png, .transparentPng

Tabular Data

A RESTful URL specifies an entire request (dataset, response file type, subset):
http://coastwatch.pfeg.noaa.gov/erddap/tabledap/pmelTaoDySst.html?longitude,latitude,T_25,time&time=2011-08-10T12:00:00Z

Special file types: .html (Data Access Form), .graph (graph form), .fgdc, .iso19115, .das, .dds, .subset
Data file types: .asc, .csv, .esriCsv, .htmlTable, .geoJson, .json, .mat, .nc, .ncCF, .ncCFMA, .odvTxt, .tsv, .xhtml, ...
Image file types: .geotif, .kml, .pdf, .png, .transparentPng

Knowledge Network for Biocomplexity (KNB)

- An NSF-funded data repository for ecologists and environmentalists
- A DataONE (NSF) Member Node
- Downloadable: EML metadata files and tabular data files
https://knb.ecoinformatics.org/#

Ecological Metadata Language (EML)

EML is a great XML-based metadata language for describing ecological tabular scientific datasets.
- It is standardized, mature, and well documented.
- The XML is easy for a human to read and understand.
- Indentation that is just right (cough, not like OGC, cough)
- There are tools to help users.
- It has XML tags for a full description of the variables.
- Everything is very well done!
https://knb.ecoinformatics.org/#external//emlparser/docs/index.html
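
To make the "XML tags for a full description of the variables" point concrete, here is a minimal sketch that reads variable names, variable definitions, and the data file link out of an EML document using Python's standard library. The XML fragment is a hypothetical, heavily simplified example written for illustration (real EML files contain many more elements, and the dataset, URL, and variable names here are invented). The function `summarize_eml` is likewise a hypothetical helper, not part of ERDDAP or any EML tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified EML fragment (assumption: EML 2.1-style layout,
# where elements under <eml:eml> carry no namespace prefix).
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
         packageId="example.1.1" system="example">
  <dataset>
    <title>Hypothetical example dataset</title>
    <dataTable>
      <entityName>example_table</entityName>
      <physical>
        <distribution>
          <online>
            <url>https://example.org/data/example_table.csv</url>
          </online>
        </distribution>
      </physical>
      <attributeList>
        <attribute>
          <attributeName>sample_date</attributeName>
          <attributeDefinition>Date the sample was collected</attributeDefinition>
        </attribute>
        <attribute>
          <attributeName>water_temp</attributeName>
          <attributeDefinition>Water temperature at the site</attributeDefinition>
        </attribute>
      </attributeList>
    </dataTable>
  </dataset>
</eml:eml>"""


def summarize_eml(xml_text):
    """Return one dict per dataTable: its name, data file URL, and variables."""
    root = ET.fromstring(xml_text)
    tables = []
    for table in root.iterfind("dataset/dataTable"):
        variables = [
            (attr.findtext("attributeName"), attr.findtext("attributeDefinition"))
            for attr in table.iterfind("attributeList/attribute")
        ]
        tables.append({
            "name": table.findtext("entityName"),
            "url": table.findtext("physical/distribution/online/url"),
            "variables": variables,
        })
    return tables


if __name__ == "__main__":
    for info in summarize_eml(SAMPLE_EML):
        print(info["name"], info["url"])
        for var_name, definition in info["variables"]:
            print("  ", var_name, "-", definition)
```

The element paths used here (dataTable, attributeList, attribute, physical/distribution/online/url) follow the usual EML layout; a real file may nest additional elements or describe several tables.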

What if we combined ERDDAP and EML?!

GenerateDatasetsXml
- A tool that comes with ERDDAP.
- It generates the description of a dataset that ERDDAP needs to serve the dataset.
- New EDDTableFromEML option
- New EDDTableFromEMLBatch option
https://coastwatch.pfeg.noaa.gov/erddap/download/EDDTableFromEML.html

It works great!
- Added 200 SBC LTER datasets in 20 minutes!
- The EML has all the information ERDDAP needs. That was key.
- The EML always has a link to the data file. That was key. EML succeeds brilliantly!
- The data files have diverse but usable formats.
- The date time formats are not well described in the EML, but can be determined.
- I added other new ERDDAP features, e.g., timezones.
- The conversion succeeds with about 90% of EML files.
- The rest are fixable (e.g., make a date+time variable).

Advantages of having the datasets in ERDDAP:
- Users can now do Google-like searches for datasets of interest.
- Time is now formatted consistently.
- Users can now see subsets of the dataset and make custom graphs and maps without downloading the data.
- Data are now available in a consistent file format of the user's choice.
- ERDDAP generates ISO 19115 XML for inclusion in the data.gov catalog.
- It is interactive: not download, parse, then explore.

Public Access to Research Results (PARR) Requirements

Government-funded data shall be publicly and freely:
- Discoverable via a catalog (data.noaa.gov -> data.gov)
- Understandable via metadata
- Accessible via a web service (e.g., ERDDAP / DAP), not just downloadable files or a shopping cart
- Available within one year (that is, by last year, so ASAP!)

ERDDAP can help with all of these requirements!
https://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research
https://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

A Bridge

One can easily envision one ERDDAP with NOAA/USGS datasets and KNB/EML datasets, offering a consistent way to search for and access all ecological/environmental datasets.

What's Next?

- Work with people using EML / tabular datasets to set up an ERDDAP and add datasets.
- Call or email me! Let's talk!

Thank you!

Questions? Comments? Suggestions? Want to work on this?
Email [email protected]

Give ERDDAP a try!
http://coastwatch.pfeg.noaa.gov/erddap/
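
One low-effort way to try it is to assemble request URLs in a script. The sketch below rebuilds the two example URLs from the Gridded Data and Tabular Data slides. The helper names `griddap_url` and `tabledap_url` are hypothetical (they are not part of ERDDAP), and the sketch only builds strings; it makes no network request. Values are passed as strings so that they appear in the URL exactly as written. It also assumes the special characters can be left as-is, as on the slides; some HTTP clients may need the query percent-encoded.

```python
# Base URL of the ERDDAP installation named on the slides.
BASE = "http://coastwatch.pfeg.noaa.gov/erddap"


def _dimension(d):
    """Format one griddap dimension: a single value or a (start, stop) pair."""
    if isinstance(d, tuple):
        start, stop = d
        return f"[({start}):({stop})]"
    return f"[({d})]"


def griddap_url(dataset_id, file_type, variable, dimensions, base=BASE):
    """Build a gridded-data request: dataset, response file type, subset."""
    subset = variable + "".join(_dimension(d) for d in dimensions)
    return f"{base}/griddap/{dataset_id}.{file_type}?{subset}"


def tabledap_url(dataset_id, file_type, variables, constraints=(), base=BASE):
    """Build a tabular-data request: dataset, response file type, subset."""
    query = ",".join(variables) + "".join("&" + c for c in constraints)
    return f"{base}/tabledap/{dataset_id}.{file_type}?{query}"


if __name__ == "__main__":
    # The gridded example from the slides: one time, then latitude and
    # longitude ranges.
    print(griddap_url(
        "jplMURSST41", "html", "analysed_sst",
        ["2016-07-01T09:00:00Z", ("-89.99", "89.99"), ("-179.00", "180")],
    ))
    # The tabular example from the slides: four variables, one time constraint.
    print(tabledap_url(
        "pmelTaoDySst", "html",
        ["longitude", "latitude", "T_25", "time"],
        ["time=2011-08-10T12:00:00Z"],
    ))
```

Changing the `file_type` argument (for example to `csv`, `nc`, or `png`) is all it takes to ask for a different response format, which is the point of putting the whole request in the URL.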