Table 1 Terminology

From: Data integration in biological research: an overview

| Term | Definition |
| --- | --- |
| Schema | A structured and “queryable” way of storing data |
| Database | A single schema or a collection of schemata |
| Sources | A number of databases that contain data. Data that reside in each source can duplicate and/or complement data from other sources |
| Data Integration | The process of combining data that reside in different sources to provide users with a unified view of such data |
| Data Standards | Agreements on representation, format, and definition for common data |
| Data Formats | A structured way to represent data and metadata in a file |
| Data Warehousing | A model for integrating data in which the data from different sources reside in a central repository (also known as a data warehouse) |
| Federated Databases | A model for integrating data in which the data remain in the original sources and users are provided with a unified view of the data based on mapping mechanisms |
| Linked Data | The network of interlinked data that is available on the web. It is used to automatically share semantically rich information and represents the biggest attempt to convert significant amounts of human knowledge across all fields into a computer-readable format |
| Ontology | A structured way of describing data, often presented in a computer-readable format. In bioinformatics, ontologies are sets of unambiguous, universally agreed terms used to describe biological phenomena and “entities”, their properties and their relationships |
| Controlled Vocabulary | A collection of terms for describing a certain domain of interest |
| Unique Identifier | A unique representation for a biological entity (molecule, organism, ontology term, etc.). Usually an alphanumeric string that is used to refer to this entity and distinguish it from others (much like an ID or passport number for a person) |
| Metadata | Data describing data, i.e., additional information (e.g., a comment, explanation, attributes, etc.) for a specific biological entity or process. As an example, in the context of an ontology, metadata are used to specify significant properties of the ontology |
| Annotation | The process of attaching relevant information (metadata) to a raw biological entity |
| Automatic Annotation | Annotation performed by computer software (often by transferring information from one source to another). This is a way of producing a large amount of metadata |
| Manual Annotation | As opposed to automatic annotation, annotation performed by a person |
| GUI | Graphical User Interface. The way that a user interacts with a computer by using graphical icons and visual indicators such as buttons, forms, etc. In the scope of this paper, the term GUI refers to interfaces that allow biologists to search, read and edit integrated biological data |
| API | Application Programming Interface. A set of tools and protocols that a power user can use to gain automated access to functionality and/or data that have been developed or gathered by another individual or organisation |
| UX | User eXperience. The process of improving user satisfaction by focusing on the usability of a given product |
| Visualisation Tools | Applications that help biologists view the data in a more human-friendly way, such as 3D or graph representations of the data (e.g., Cytoscape for visualising complex networks) |
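
The difference between the Data Warehousing and Federated Databases entries can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the two in-memory "sources", the identifiers `P001` to `P003`, and the function names `build_warehouse` and `federated_lookup` are hypothetical and do not come from the paper or from any real database.

```python
# Two hypothetical sources keyed by a shared unique identifier (made-up IDs).
SOURCE_A = {"P001": {"name": "kinase X"}, "P002": {"name": "receptor Y"}}
SOURCE_B = {"P001": {"organism": "human"}, "P003": {"organism": "mouse"}}


def build_warehouse(*sources):
    """Warehousing: copy every record into one central repository up front."""
    warehouse = {}
    for source in sources:
        for identifier, record in source.items():
            # Merge fields from all sources under the same identifier.
            warehouse.setdefault(identifier, {}).update(record)
    return warehouse


def federated_lookup(identifier, *sources):
    """Federation: leave data in the sources and build the view per query."""
    view = {}
    for source in sources:
        view.update(source.get(identifier, {}))
    return view


warehouse = build_warehouse(SOURCE_A, SOURCE_B)
print(warehouse["P001"])
print(federated_lookup("P001", SOURCE_A, SOURCE_B))
```

Both calls return the same unified record here. The trade-off shows up when a source changes: the warehouse copy stays as it was until it is rebuilt, whereas the federated lookup reads the sources at query time and so reflects the change immediately.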