A NSW Government website
Metadata.NSW (beta)

Concept help - Data Set

A Data Set describes a record of data, including any location or time boundaries for the data, that has been captured and is available for use under a specific licence. A Data Set may be included in a Data Catalog, and can reference multiple Distributions that record different parts or formats of the data that are available to download.

A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset does not have to be available as a downloadable file. For example, a dataset that is available via an API can be defined as an instance of dcat:Dataset and the API can be defined as an instance of dcat:Distribution. DCAT itself does not define properties specific to API descriptions. These are considered out of scope for this version of the vocabulary. Nevertheless, such properties can be defined in a profile of the DCAT vocabulary.
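As a rough illustration of the relationship described above, the sketch below builds a JSON-LD-style description of a dataset that is reachable only through an API. The terms dcat:Dataset, dcat:Distribution, dcat:distribution, dcat:accessURL and dct:title come from the DCAT and Dublin Core vocabularies. The helper name, title and URL are hypothetical placeholders, and the output is not the Metadata.NSW export format.

```python
import json


def build_dataset(title, distributions):
    """Return a JSON-LD-style dict for a dcat:Dataset (illustrative only)."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        # Each entry is one way of getting at the data: a file, or an API.
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dct:title": d["title"],
                "dcat:accessURL": d["access_url"],
            }
            for d in distributions
        ],
    }


# Hypothetical dataset that is only available through an API.
dataset = build_dataset(
    "Example air quality readings",
    [{"title": "Readings API", "access_url": "https://example.org/api/readings"}],
)
print(json.dumps(dataset, indent=2))
```

A dataset with several downloadable formats would simply carry more entries in the dcat:distribution list.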

Fields available on this metadata type

Field ISO definition
Name The primary name used for human identification purposes.
Definition Representation of a concept by a descriptive statement which serves to differentiate it from related concepts. (3.2.39)
Is Federated
Is Not Federable
Version Unique version identifier of this metadata item.
References Significant documents that contributed to the development of the metadata item which were not the direct source for the metadata content.
Origin The source (e.g. document, project, discipline or model) for the item (8.1.2.2.3.5)
Comments Descriptive comments about the metadata item (8.1.2.2.3.4)
Deleted The date after which the item has been soft deleted and is no longer visible in the registry
License Information about the license document under which the dataset is made available.
Rights Information about rights held in and over the dataset.
Release Date Date of formal publication of the dataset.
Modification Date Most recent date on which the dataset was changed, updated or modified.
Frequency The frequency at which the dataset is published.
Spatial Coverage Spatial or geographic coverage of the dataset.
Temporal Coverage The temporal or time period that the dataset covers.
Catalog An entity responsible for making the dataset available.
Landing Page A Web page that can be navigated to in a Web browser to gain access to the dataset, its distributions and/or additional information.
Contact Point Relevant contact information for the Dataset.
Conforming Specification An established standard to which the described resource conforms.
Item Base

Custom Fields

Field Short definition Long definition
Security Classification [Core] Security classification of information
[Core] Security classification of information. The security classification applied to the asset, as specified by the Australian Government Protective Security Policy Framework (PSPF). The Australian Government uses 3 security classifications:
* PROTECTED
* SECRET
* TOP SECRET
All other information from business operations and services is OFFICIAL or, where it is sensitive, OFFICIAL: Sensitive. [NB: the old UNCLASSIFIED classification was renamed OFFICIAL (see PSPF v2018.1, Sep 2018).] The originator of the data asset is responsible for applying the relevant Security Classification. This attribute relates to Sensitive Data and Access Rights.
Sensitive Data [Additional] The type of sensitivity of the data asset, where applicable.
[Additional] The type of sensitivity of the data asset, where applicable. If Security Classification, as specified by the Australian Government Protective Security Policy Framework (PSPF), has the value “OFFICIAL: Sensitive”, provide the type of sensitivity. Where multiple sensitivity types exist within the data asset, provide the most restrictive dissemination limiting marker (DLM). This attribute relates to Security Classification and Access Rights.
Data Custodian [Core] The custodian(s) of the data asset.
Contact Point [Core] Key data roles. The relevant contacts from whom information about the asset can be obtained.
[Core] Key data roles. The relevant contacts from whom information about the asset can be obtained: Directorate, Asset owner position, Asset steward position, Current Owner, Current Steward, Subject matter expert, External Custodian.
Update Frequency [Additional] The frequency at which new, revised or updated versions of this data asset are made available.
[Additional] The frequency at which new, revised or updated versions of this data asset are made available.
Keyword [Core] Word(s) or tags that describe the data asset subject matter.
[Core] Word(s) or tags that describe the data asset subject matter. These word(s) or terms describe the topic(s) covered by the data asset. They answer the question “what is this data asset about?” and support data discovery. When selecting keywords, consider what search terms your users may choose when searching for the data asset. It is recommended to include at least one term from the Australian Governments’ Interactive Functions Thesaurus (AGIFT), which covers words and terms related to Australian Government agencies’ core business functions and activities. Also include words such as Indigenous, Disability or Gender, if appropriate, to better support the Government’s priority data activities. Where multiple keywords apply, separate the terms with a comma ‘,’.
Resource Type [Core] The type of data asset being described
[Core] The type of data asset being described. This attribute specifies the type of data asset. The most common types of data asset applicable are listed below with their definitions. (This attribute could be supplemented by attribute Format.)
* collection: an aggregation of items. The term collection means that the resource is described as a group; its parts may be separately described and navigated.
* dataset: structured information encoded in lists, tables, databases, etc., which will normally be in a format available for direct machine processing. For example - spreadsheets, databases, GIS data, midi data. Note that unstructured numbers and words would be considered as text.
* image: the content is primarily symbolic visual representation other than text. For example - images and photographs of physical objects, paintings, prints, drawings, other images and graphics, animations and moving pictures, film, diagrams, maps, musical notation. Note that image may include both electronic and physical representations.
* interactive resource: a resource which requires interaction from the user to be understood, executed, or experienced. For example - forms on web pages, applets, multimedia learning objects, virtual reality.
* model: an abstraction of the real thing, i.e. some generalisation and interpretation. Models could be considered a symbolic representation. Examples include performance models, cost models, mechanical models, etc.
* service: a system that provides one or more functions of value to the end-user. Examples include: a photocopying service, a banking service, an authentication service, interlibrary loans, a Z39.50 or Web server.
* software: a computer program in source or compiled form which may be available for installation non-transiently on another machine. For software which exists only to create an interactive environment, use interactive instead.
* sound: a resource whose content is primarily audio or intended to be realised in audio. For example - music, speech, recorded sounds. This category includes musical notation, including score, which is unrealised in sound.
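The rules stated in the custom field definitions above can be sketched as a small validation helper. This is a minimal sketch under stated assumptions: the dictionary keys (security_classification, sensitive_data, keyword, resource_type) and both function names are hypothetical and are not part of any Metadata.NSW API. Since the resource types listed are only described as the most common ones, a value outside the list is flagged for review rather than treated as definitely wrong.

```python
# Values taken from the Security Classification and Resource Type definitions.
SECURITY_CLASSIFICATIONS = {
    "OFFICIAL", "OFFICIAL: Sensitive", "PROTECTED", "SECRET", "TOP SECRET",
}
COMMON_RESOURCE_TYPES = {
    "collection", "dataset", "image", "interactive resource",
    "model", "service", "software", "sound",
}


def split_keywords(raw):
    """Split a comma-separated Keyword value into trimmed, non-empty terms."""
    return [term.strip() for term in raw.split(",") if term.strip()]


def validate_custom_fields(record):
    """Return a list of problems found in a dict of custom field values.

    The key names used here are assumptions made for this sketch.
    """
    problems = []
    classification = record.get("security_classification")
    if classification not in SECURITY_CLASSIFICATIONS:
        problems.append(f"unknown security classification: {classification!r}")
    # Sensitive Data is expected when the classification is OFFICIAL: Sensitive.
    if classification == "OFFICIAL: Sensitive" and not record.get("sensitive_data"):
        problems.append("sensitive_data is required for OFFICIAL: Sensitive")
    resource_type = record.get("resource_type")
    if resource_type not in COMMON_RESOURCE_TYPES:
        problems.append(f"resource type not in the common list: {resource_type!r}")
    # Keyword is a [Core] field, so at least one term is expected.
    if not split_keywords(record.get("keyword", "")):
        problems.append("at least one keyword is expected")
    return problems


# Hypothetical record that omits the sensitivity type.
example = {
    "security_classification": "OFFICIAL: Sensitive",
    "keyword": "health, Indigenous , ",
    "resource_type": "dataset",
}
problems = validate_custom_fields(example)
keywords = split_keywords(example["keyword"])
```

For the example record, the only problem reported is the missing sensitivity type, and the keyword string splits into the two terms health and Indigenous.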

Official Definition

A representation of a dataset in a catalog. Data Catalog Vocabulary (DCAT): 5.3 Class: Dataset