Schema | A structured and “queryable” way of storing data |
Database | A single schema or a collection of schemata |
Sources | A number of databases that contain data. Data residing in one source can duplicate and/or complement data from other sources |
Data Integration | The process of combining data that reside in different sources, to provide users with a unified view of such data |
Data Standards | Agreements on representation, format, and definition for common data |
Data Formats | A structured way to represent data and metadata in a file |
Data Warehousing | Model for integrating data in which the data from different sources reside in a central repository (also known as a data warehouse) |
Federated Databases | Model for integrating data in which the data remain in the original sources and users are provided with a unified view of the data through mechanisms that map the information |
Linked Data | The network of interlinked data that is available on the web. It is used to automatically share semantically rich information and represents the biggest attempt to convert significant amounts of human knowledge across all fields into a computer-readable format |
Ontology | A structured way of describing data, often presented in a computer-readable format. In bioinformatics, ontologies are sets of unambiguous, universally agreed terms used to describe biological phenomena and “entities”, their properties and their relationships |
Controlled Vocabulary | A collection of terms for describing a certain domain of interest |
Unique Identifier | A unique representation for a biological entity (molecule, organism, ontology term, etc.). Usually an alphanumeric string that is used to refer to this entity and distinguish it from others (much like an ID or passport number for a person) |
Metadata | Data describing data, i.e., additional information (e.g., a comment, explanation, or attributes) for a specific biological entity or process. For example, in the context of an ontology, metadata are used to specify significant properties of the ontology |
Annotation | The process of attaching relevant information (metadata) to a raw biological entity |
Automatic Annotation | Annotation performed by computer software (often by transferring information from one source to another). This is a way of producing a large amount of metadata |
Manual Annotation | As opposed to automatic annotation, manual annotation is performed by a person |
GUI | Graphical User Interface. The way a user interacts with a computer through graphical icons and visual indicators such as buttons and forms. In the scope of this paper, we use the term GUI to refer to interfaces that allow biologists to search, read, and edit integrated biological data |
API | Application Programming Interface. A set of tools and protocols that a power user can use to gain automated access to functionality and/or data that have been developed or gathered by another individual or organisation |
UX | User eXperience. The process of improving user satisfaction by focusing on the usability of a given product. |
Visualisation Tools | Applications that help biologists view the data in a more human-friendly way, such as 3D or graph representations (e.g., Cytoscape for visualising complex networks) |
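
The difference between the Data Warehousing and Federated Databases models defined above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the two in-memory sources, the identifiers `P12345` and `Q67890`, the field names, and the helper functions `to_unified`, `build_warehouse` and `federated_lookup` are hypothetical and do not correspond to any real database or tool.

```python
# Two hypothetical sources keyed by a shared unique identifier.
# Their field names differ, so a mapping is needed for a unified view.
source_a = {"P12345": {"gene": "ABC1", "organism": "human"}}
source_b = {
    "P12345": {"gene_symbol": "ABC1", "function": "transporter"},
    "Q67890": {"gene_symbol": "XYZ2", "function": "kinase"},
}

# Mapping from each source's field names to the unified field names.
FIELD_MAP = {
    "a": {"gene": "gene", "organism": "organism"},
    "b": {"gene_symbol": "gene", "function": "function"},
}


def to_unified(record, source_name):
    """Rename the fields of one record according to FIELD_MAP."""
    return {FIELD_MAP[source_name][key]: value for key, value in record.items()}


def build_warehouse(sources):
    """Warehousing: copy and merge every record into one central store."""
    warehouse = {}
    for name, source in sources.items():
        for uid, record in source.items():
            warehouse.setdefault(uid, {}).update(to_unified(record, name))
    return warehouse


def federated_lookup(uid, sources):
    """Federation: leave data in place and query each source on request."""
    merged = {}
    for name, source in sources.items():
        if uid in source:
            merged.update(to_unified(source[uid], name))
    return merged


sources = {"a": source_a, "b": source_b}
warehouse = build_warehouse(sources)

# Both models expose the same unified view for a given identifier.
print(warehouse["P12345"])
print(federated_lookup("P12345", sources))
```

In this sketch the warehouse is built once and then queried locally, so it can become stale when a source changes, whereas the federated lookup always reads the current content of the sources at the cost of contacting each of them per query.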