The Australasian Virtual Herbarium (AVH) is the contribution of Australian and New Zealand herbaria to the Atlas of Living Australia (ALA). Currently, data is delivered to AVH by 14 Australian and nine New Zealand herbaria.
Herbarium specimen data is catalogued in accordance with the data entry standards and protocols of each herbarium. Once aggregated in the ALA BioCache, the data is standardised and a range of quality checks are applied to enhance data retrieval and analysis. The standardisation and data processing applied in the BioCache is described below. The unprocessed data provided by the herbaria is always available; an overview of the provided versus processed data is available for each record via the Record detail page.
The AVH contains records of the holdings of Australian and New Zealand herbaria, including specimens collected outside those countries. It also contains records of cultivated plants and introduced occurrences and while we have the Establishment means field to indicate whether a plant is cultivated or not native, this has not always been used. AVH records may also just be wrong. A record in the AVH of a species from a certain location only means that there is a specimen with this species name from the given location, not necessarily that the species with that name naturally occurs at that location. The major asset of the AVH is that all records are based on preserved specimens, so they can, to a large extent, be verified. It is up to the user to decide whether or not verification is warranted. By itself the AVH data is not more reliable than data you get from anywhere else.
The name of the herbarium at which the specimen is held. The following herbaria currently contribute specimen data to the AVH:
BRI – Queensland Herbarium
HO – Tasmanian Herbarium
LINC – Lincoln University Herbarium
NZFRI – National Forestry Herbarium NZ
PDD – New Zealand Fungarium
PERTH – Western Australian Herbarium
The Index Herbariorum code of the herbarium at which the specimen is held.
The catalogue number is an identifier for the physical specimen. In the AVH, catalogue numbers consist of the Herbarium code, followed by a space and then the catalogue number used at the institution. The search for catalogue number uses an exact match, so the catalogue number entered needs to be exactly the same as it is in the AVH. Catalogue number formats applied at the different herbaria are listed in Table 1.
Table 1. Catalogue number formats used by the herbaria that contribute data to Australia’s Virtual Herbarium
|State Herbarium of South Australia||AD 00000000||a number of up to nine digits, sometimes followed by a single letter|
|AD-A 00000(X)||a number of up to five digits, sometimes followed by a single letter|
|AD-C 00000(X)||a number of up to five digits, sometimes followed by a single letter|
|Auckland War Memorial Museum Herbarium||AK000000||a number of two to six digits, with no space after the herbarium code|
|Australian National Herbarium||CANB 000000||a number of up to six digits|
|CBG 0000000||a number of up to seven digits|
|University of Canterbury Herbarium||CANU 000000||a number of two to six digits, sometimes followed by a single letter|
|Allan Herbarium, Landcare Research NZ Ltd||CHR 000000||a a number of up to six digits, sometimes followed by a space and a letter|
|Australian Tropical Herbarium||CNS 000000||a number of up to six digits|
|QRS 000000||a number of up to six digits|
|Northern Territory Herbarium||DNA X0000000||a string of one letter followed by seven digits – the letter indicates whether the specimen originates from the Darwin Herbarium (D) or the Alice Springs Herbarium (A)|
|Tasmanian Herbarium||HO 000000||a number of up to six digits|
|James Cook University Herbarium||JCT JCT-X00000.0||a string of a single letter (indicating the major group to which the specimen belongs) followed by up to five digits, a full stop, then another digit; the herbarium code is repeated at the start of the catalogue number, and separated from the rest of the string by a hyphen|
|Lincoln University Herbarium||LINC-0000||a number of up to four digits, separated from the herbarium code by a hyphen (with no spaces)|
|La Trobe University Herbarium||HO 000000||a string of nine digits (with leading zeroes for numbers less than 100000000)|
|National Herbarium of Victoria||MEL 0000000X||a string of seven digits (with leading zeroes for numbers less than 1000000) followed by one letter|
|The University of Melbourne Herbarium||MELU X000000x||a string of a single letter (indicating the major group to which the specimen belongs) followed by six digits and a lower-case letter|
|Dame Ella Campbell Herbarium, Massey University||MPN 00000||a number of up to five digits|
|University of New England N.C.W. Beadle Herbarium||NE 000000||a number of one to six digits|
|National Herbarium of New South Wales||NSW 000000||a number of up to six digits|
|National Forestry Herbarium NZ||NZFRI 00000||a number of up to five digits|
|New Zealand Fungarium||PDD 000000||a string of six digits (with leading zeroes for numbers less than 100000)|
|Western Australian Herbarium||PERTH 0000000||a number of up to eight digits|
|Unitec Institute of Technology Herbarium||UNITEC 000000||a string of six digits (with leading zeroes for numbers less than 100000)|
|Herbarium of the Museum of New Zealand – Te Papa Tongarewa||WELT.X000000/X||a string of a single letter (indicating the major group to which the specimen belongs) followed by six digits (with leading zeroes for numbers less than 100000); the string is separated from the herbarium code by a full stop, and may be appended by a forward slash and a letter to indicate multiple parts|
|Janet Cosh Herbarium, University of Wollongong||MEL 0000000X||a number of up to five digits|
ALA record ID
The record id is a unique identifier applied to the specimen record in the ALA BioCache. The ALA record ID is listed in the CSV downloads and in the URL of the Record detail page. You can use the ALA record ID to find the most up-to-date data for the specimen record by typing ‘/[record ID]’ in the navigation bar of your browser.
Basis of record
Basis of record describes the source of the information in a record. Occurrence records are generally either based on an object (such as a preserved specimen), or on an observation of an organism in the field. Object-based records are more verifiable than observation-based records, because the specimens are held in permanent collections and can be examined at a later date and the identification can be verified. All AVH records are based on objects.
Note that there are currently about 15,000 records from CANB and PERTH which are displayed with the record type ‘Human observation’, however, this is a data delivery error; all records in the AVH are specimen-based.
The preparation type of the specimen (‘Sheet’, ‘Packet’ etc.).
Date last updated
The date the record was last updated. On the Record detail pages this is shown as Date loaded and will appear under the location map. The Date last processed indicates when the record was last processed within the ALA BioCache. This may happen, for instance, when there is a change in the backbone taxonomy or the Sensitive Data Service, or when new environmental layers are loaded.
The name of the collector, as provided with the specimen record.
The number (or other identifier) assigned to the specimen by the collector.
The names of any additional collectors who were present at the time the specimen was collected.
The date that the specimen was collected. Collecting dates in the AVH are delivered in two ways: as Collecting date, which has to be ISO compliant, or as Verbatim collecting date. Some herbaria deliver Verbatim collecting date only when an ISO-compliant Collecting date can’t be delivered, or not all information can be delivered as an ISO-compliant date (e.g. ‘Summer of ’69’), while others deliver it for all collecting dates. Because of database restrictions, some herbaria are only able to deliver incomplete collecting dates as Verbatim collecting date. We hope to remedy this in the near future.
A small proportion of records in the AVH are undated. These will predominantly be historical (pre-1900) records, but will also include records collected in the twentieth century.
Querying by date
The Collecting date term in the Advanced search form allows you to query for a range of collecting dates. If you enter just a start date or an end date, your results will include records of specimens collected since or up to that date respectively. To query for a particular collecting date, enter the same date in both fields. Because of problems associated with querying incomplete dates, the results for a search that includes a collecting date term will only return records with complete collecting dates.
There are Year, Month and Decade facets on the Results page that allow you to filter your results for records collected in a particular year, month or decade. Note that the Decade facet and chart only include records with complete collecting dates; records with only a month and a year, or just a year, will not be counted.
The locality at which the specimen was collected.
A description of the habitat that the plant, alga or fungus was growing in, provided by the collector(s) of the specimen. Some herbarium databases have this information split into multiple fields, e.g. ‘Habitat’, ‘Associated taxa’, ‘Substrate’ and ‘Host’, but the information from the separate fields is concatenated before being delivered to the AVH.
Any additional notes about the specimen or the collecting event provided by the collector. Some herbaria store this information in multiple fields in their herbarium databases, for example ‘Collecting notes’ and ‘Descriptive notes’, but this information is concatenated for delivery to the AVH. Some herbarium databases have a general notes field and no other place to store notes from determiners or data entry personnel – and even those that do have dedicated fields for other types of notes have not always had them – so, for many records, Collecting notes will contain more than just collecting notes.
The Phenology field indicates whether the specimen is bearing flowers or fruits, etc.
Establishment means describes how the specimen was established at the collecting locality. It combines the HISPID concepts ‘Cultivation status’ and ‘Natural occurrence’. Cultivation status can be one of the following values: ‘Cultivated’, ‘Not cultivated’, ‘Assumed to be cultivated’ and ‘Doubtfully cultivated’. Cultivation status is not recorded consistently among the different herbaria, and is not provided by all herbaria. Natural occurrence indicates whether the occurrence is natural or whether the specimen has been introduced at the collecting locality. Allowed values are: ‘Native’, ‘Assumed to be native’, ‘Doubtfully native’ and ‘Not native’. This information is recorded rather haphazardly.
The conservation status associated with the taxon in the state or territory in which it was collected. The classification codes for each state or territory are listed in Table 2.
Table 2. Conservation status classification codes for Australian states and territories.
|New South Wales||
|Australian Capital Territory||
Taxon names in the AVH
The ALA applies taxon name resolution to incoming data. The taxon name with a record will be stored in the ALA BioCache exactly as it is delivered to the AVH – called ‘Taxon name (provided)’ in the AVH – but the taxon name will also be processed. The processing of the taxon name includes parsing the name into its parts, e.g. genus name and epithet(s). If the name can be parsed, the name resolver will try to match the canonical name against the ALA name list. If no match can be found or the name could not be parsed, it will try to match the genus name; if the genus name is not in the ALA name list, it will try to match the name of the family, which is provided separately.
If a matched name is in one of the authoritative national checklists that are part of the ALA name list, such as the Australian Plant Census (APC), and if the matched name is considered a synonym, the processed name will be the accepted name from the checklist. The processed taxon name is called ‘Taxon name (processed)’ in the AVH.
The Identified to rank is the rank of the processed taxon name. This may be the same rank as the rank of the provided name, or a higher rank. If the rank of the processed name is different from that of the provided name, and the name resolver can work out the rank of the provided name, the rank of the provided name will be given on the Record detail page as well.
The Name match metric describes how the provided name was matched to a name in the ALA name list. If the parsed name was matched, the Name match metric will be ‘Canonical name match’. ‘Higher taxa match’ means that the name itself could not be matched, but the name of a higher taxon – genus or family – could. ‘No match’ means that neither the provided name nor the name of a higher taxon could be matched to a name in the ALA name list.
Author is the authorship of the processed name. The authorship of the provided name is given as part of Taxon name (provided).
The Common name (processed) is the common name recorded in the ALA name list. Common names are never provided by the herbaria.
There are facets for taxonomic groups of all mandatory ranks (Kingdom, Phylum, Class, Order, Family, Genus, Species) and for Infraspecific taxon. These taxonomic groups are also given on the Record detail page and in the downloads. If there was no match for a provided name, the names of the taxonomic groups provided with the specimen record will be displayed.
The Botanical group facet allows you to select records of a botanical group that is not necessarily a taxonomic group, for example, bryophytes, angiosperms or dicotyledons. Some other useful groupings, e.g. lichens, have not been implemented yet, but will be added when the National Species Lists have been completed. As these are not taxonomic groups, botanical groups may overlap and not all taxa are represented in one of the recognised groups.
The Identification qualifier facet can be used to select records with certain determination qualifiers, or to exclude records with uncertain determination. Note that excluding records with uncertain determination disqualifies all records with determination qualifiers, no matter at what rank the qualifier applies. If a qualifier applies at the infraspecific rank you might want to include the determination in the results of a taxon name query. In this case you are better off keeping the uncertain determinations and deleting the ones that you don’t want from the output, or do a filter on uncertain determinations and see what you are going to throw out.
The taxon name addendum is a qualifier that comes after the taxon name. Name addenda endorsed by HISPID (translated in proper English) are ‘s. str.’ (in the narrow sense), ‘s.l.’ (in the broad sense) and ‘agg.’ (aggregate, group, complex). The first two are supposed to be used to differentiate between competing taxon concepts, but s.l. is often used to indicate uncertainty about whether the specimen belongs to the taxon in question, or if it belongs to a similar taxon. In this case, ‘agg.’ would have been more appropriate to use. Often aggregates are not formally recognised, although there is mostly general agreement on what the complexes are. For general use it is probably best to ignore ‘s. str.’ and treat determinations with the other name addenda as uncertain determinations.
The person or persons who last determined the specimen. Due to the practice at some Australian herbaria of changing the name on specimens as part of the curatorial process, without examining the specimen, a lot of this information is meaningless or even misleading.
The role the determiner has played in the determination, e.g. determined the specimen (‘Det.’) or confirmed an earlier determination (‘Conf.’).
The Identified date term in the Advanced search allows you to query for a range of determination dates. If you fill in just a start date or an end date, your results will include records of specimens determined since or up to that date respectively. In order to query for a particular determination date, enter the same date in both fields. Because of problems associated with querying incomplete dates, a result for a search that includes a determination date term will only include records with complete determination dates. It is common practice to give only a month and year on determination slips, so be aware that a large part of the determinations will be missed when querying by determination date.
Determination notes are any notes that are made by the determiner at the time of the determination. These may include diagnostic features, a reference to the work that was used to identify the specimen, or an indication that the specimen is not typical for the taxon. Due to the structure of some herbarium databases, determination notes have often been included in Collecting notes.
Specimen type (Type status)
The type status of any type specimens in the results. Note that, if you searched by a taxon name, a type specimen in the result is not necessarily a type of the name that you searched for.
The values in many of the following geography fields have been inferred from the latitude and longitude provided with the specimen records. In some cases, the geography values stored in the herbarium record may differ from the inferred values due to geocoding errors. If you suspect that there is an error with a record, you can flag an issue on the Record detail page.
The country in which the specimen was collected. Note that, while most Australian and New Zealand herbaria hold specimens collected outside Australia and New Zealand, only a small proportion of foreign holdings have been databased and made available through the AVH.
States (including coastal waters)
The state, territory or province that Australian specimens were collected in. Queries for records from countries other than Australia and New Zealand cannot be limited to political divisions below the country level.
Local government area
The local government area in which the specimen was collected, based on the latitude and longitude provided with the specimen record.
Latitude and longitude
The latitude and longitude of the collecting locality. If correctable errors are detected, for example the latitude or longitude is in the wrong hemisphere or the latitude and longitude are transposed, the values are corrected and the unprocessed latitude and longitude are given in the Latitude (provided) and Longitude (provided)fields in the Record detail page and in downloaded results.
The geodetic datum is the reference from which the latitude and longitude are measured. The most common datums are WGS84, GDA94 and AGD66. While important for records of more recent collections, the inaccuracy of the georeferences in the bulk of the AVH records is much greater than the differences between the different datums. Geodetic datum is very haphazardly delivered with AVH data, even with georeferences that are accurate enough for the geodetic datum to be meaningful.
The Geocode uncertainty is a measurement or estimate of how far away in metres the point represented by the latitude and longitude may be from the actual location where the specimen was collected. Most herbaria use ranges in their collections databases, for example 0-100 m, 100 m-1 km, 1-10 km, 10-25 km and >25 km. The values that are delivered to the AVH are the maximum values of these ranges.
The method by which the georeference was obtained, for example by GPS or by using a topographic map. In the HISPID standard, which is used for data delivery to the AVH, ‘Geocode source’ is a mixed concept and includes both generalisations for the method used and for the person who provided the georeference. In the AVH, the terms that refer to the person who provided the georeference are stored under Georeferenced by.
A general term for the person who provided the georeference (‘Collector’ or ‘Compiler’).
The altitude in metres of the collecting locality, if provided with the specimen record. Altitude is presented in the AVH as Minimum altitude (m) and Maximum altitude (m). If a single value for altitude is provided, which will mostly be the case, this will be given as Minimum altitude (m). If a range is provided a Maximum altitude (m) will also be given.
The depth in metres where the specimen was collected. Depth is presented in the AVH as Minimum depth (m)and Maximum depth (m). If a single value for depth is provided, which will mostly be the case, this will be given as Minimum depth (m). If a range is provided a Maximum depth (m) will also be given.
IBRA 7 regions
The Interim Biogeographic Regionalisation for Australia (IBRA) 7 region that corresponds to the latitude and longitude provided with the specimen record. (This data is only available for Australia.)
IBRA 7 subregions
The Interim Biogeographic Regionalisation for Australia (IBRA) 7 subregion that corresponds to the latitude and longitude provided with the specimen record. (This data is only available for Australia.)
IMCRA 4 region
The Integrated Marine and Coastal Regionalisation of Australia (IMCRA) 4 bioregion that corresponds to the latitude and longitude provided with the specimen record. (This data is only available for Australia.)
IMCRA Meso-scale bioregions
The Integrated Marine and Coastal Regionalisation of Australia (IMCRA) meso-scale bioregion that corresponds to the latitude and longitude provided with the specimen record. (This data is only available for Australia.)
NZ ecological districts
The New Zealand ecological district that corresponds to the latitude and longitude provided with the specimen record.
NZ land districts
The New Zealand land district that corresponds to the latitude and longitude provided with the specimen record.
The New Zealand region that corresponds to the latitude and longitude provided with the specimen record.
The combined IBRA and IMCRA biogeographic regions. This facet is hidden by default, but can be selected under the Refine results options. (This data is only available for Australia.)
The ecoregion (non-marine, marine, or limnetic) in which the taxon occurs. The data comes from the Interim Register of Marine and Nonmarine Genera (IRMNG) and is based on the genus name, not from the latitude and longitude associated with the record. Not all genus names in the AVH are in IRMNG. (This data is only available for Australia.)
The Major Vegetation Groups (from the National Vegetation Information System) at the collecting locality, inferred from the latitude and longitude provided with the specimen record. There are layers and facets for both the extant vegetation type (Vegetation types: extant) and the estimated vegetation before European settlement (Vegetation types: pre-1750). (This data is only available for Australia.)
Commonwealth Electoral Boundaries
The Commonwealth electoral division that corresponds to the latitude and longitude provided with the specimen record (for Australian specimens).
The Australian Natural Resource Management (NRM) region that corresponds to the latitude and longitude provided with the specimen record.
Duplicates sent to
The herbaria to which duplicates have been sent. Herbaria are identified by their Index Herbariorum acronym. When querying this field, note that the specimen records in the results will be from the herbarium that sent the duplicates, not the herbaria that received the duplicates. A potential use case of the Duplicates sent to query term in the Advanced search would be a herbarium trying to find the original records of specimens they have received on exchange from all other Australian herbaria, or, if used in combination with the Herbarium field, from one particular herbarium.
Herbarium received from
The herbarium from which the specimen was received. In most cases, but not all, this will be the herbarium where the original specimen is held. Herbaria are identified by their Index Herbariorum acronym.
Ex herb. catalogue number
The catalogue number for the original specimen at the herbarium from which a duplicate specimen was received.
The Loan number is the reference number or identifier assigned to the loan by the lending institution, and is used by the lender and the borrower for administrative purposes. Searching by loan number enables botanists who have borrowed from a herbarium that delivers loan data to the AVH (currently only AD and MEL) to retrieve the records for all specimens in a particular loan.
The Index Herbariorum acronym of the borrowing institution.
Data quality checks
When AVH data is uploaded into the BioCache, a range of quality assurance checks are performed and potential data issues are flagged. Some data issues (such as transposed or negated latitude and longitude) will result in the data being modified; other issues will simply be flagged. The details of any changes made during processing can be viewed by clicking on the Original vs processed button on the Record detail page. Users can also flag potential issues with specimen records by using the Flag an issue feature on the Record detail page. Data issues detected during processing or flagged by users are available as a facet on the Results page, and can be used to narrow down your search results. The data issues are also available on the Record detail page and in the downloads.
An assessment of whether or not the location is spatially valid, based on a range of data quality checks and user-contributed annotations. If the record suffers from one or more of the geospatial Data issues listed in Table 2 it is considered ‘Spatially suspect’, otherwise it is ‘Spatially valid’. Note that a ‘Spatially valid’ record is not necessarily correctly georeferenced, and, depending on the data issue, a ‘Spatially suspect’ record is not necessarily incorrectly georeferenced.
Outliers are observations that are distant from the rest of the data in a sample. In the AVH the sample is observations of a taxon. The presence of outliers might indicate that specimens (the outliers) have been incorrectly identified or georeferenced, but also that the distribution is skewed or disjunct, or that the taxon has been under-collected in certain areas. Checks for outliers in the AVH are done using five climate surfaces: precipitation seasonality, precipitation of the driest quarter, radiation seasonality, radiation of the warmest quarter and mean moisture index of the quarter with the highest moisture index. The tests are conducted only where there are 20 or more unique locations for a taxon. For more information on how the tests are done, see the notes on the spatial outlier detection method used by the ALA.
The Outlier for layer facet indicates if a specimen is an outlier for an environmental layer, based on the known environmental range of the taxon to which the specimen has been identified. The Outlier layer count facet allows you to filter your results for records that are outliers for certain numbers of environmental layers. You can also display your results by Outlier for layer or Outlier layer count on the distribution map. If a record is an outlier for one or more layers, the Record detail page will display graphs for each of the variables for which the record is an outlier with the distribution of the records of the taxon to which it belongs. The layers for which a record is an outlier will also be given in the downloads.
Australia’s Virtual Herbarium contains data that may be considered sensitive because of conservation or biosecurity issues. The ALA Sensitive Data Service contains authoritative lists of taxa that are considered sensitive, obtained in collaboration with Commonwealth, state and territory agencies and data providers, with information on how data of these taxa should be handled for each state. Distribution data for these sensitive taxa may be either withheld or generalised. The latter means that instead of a detailed locality the local government area will be given and that the latitude and longitude will be rounded to, for example, a single decimal place. Currently, distribution data is only completely withheld for a single native species, the Wollemi Pine (Wollemia nobilis). There is a Sensitive data facet that can be used to check for sensitive data among query results. The Record detail page and the downloads indicate whether distribution data has been withheld or generalised for each record.
The Multimedia facet allows you to filter your results for records that have images or other multimedia attached.