In the IT world, there is a lot of buzz about semantics. The term is usually surrounded by a lot of vagueness and is sometimes used as nothing more than marketing speak.
The principal idea of semantics is to give the true identity, true meaning or essence to a word, phrase or thing.
In the computer science world, this is usually done by mapping these things to either a taxonomy or an ontology.
- A taxonomy is a tree which is used to classify the “thing”. In principle, everything should thus have exactly one place in the taxonomy where it can be classified. One of the problems is that if the taxonomy is incomplete, too much ends up in the “other” basket. Another problem is that a taxonomy does not really help to determine how similar things are. On the other hand, a taxonomy is a simple concept and can be useful, for example, when doing something like a Google search. Say you search for “capital”: it would be useful if Google asked whether you mean capital as in money, the capital of a country or region, or another meaning. This would greatly increase the relevance of the results with just one extra click.
- An ontology is a set of triples (concept, relation, concept) which together build a graph. The semantic meaning of a “thing” is then defined as a commitment on the ontology: the commitment indicates all triples and all concepts which are relevant for the “thing” in question. When two objects both have a commitment on the same ontology, you can compare them semantically. This is a graph comparison which determines the amount of overlap between the two commitments. For example, you could compare a table and a chair. Depending on the ontology and commitment, they could have common triples like “object can be made of wood”, “object has legs”, “object can be used to sit on”. The comparison will probably indicate that there is more overlap between a chair and a table than between a chair and a radio.
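To make the idea of comparing commitments a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy ontology, the commitments for chair, table and radio, and the choice of Jaccard similarity (shared triples divided by all triples in either commitment) as the overlap measure. A real ontology and a real graph comparison would be considerably richer.

```python
# Toy ontology: a set of (concept, relation, concept) triples.
# All triples here are hypothetical and exist only for illustration.
Triple = tuple[str, str, str]

ONTOLOGY: set[Triple] = {
    ("object", "can be made of", "wood"),
    ("object", "has", "legs"),
    ("object", "can be used to", "sit on"),
    ("object", "can be used to", "eat at"),
    ("object", "runs on", "electricity"),
    ("object", "produces", "sound"),
}

# A commitment is the subset of ontology triples relevant to one "thing".
COMMITMENTS: dict[str, set[Triple]] = {
    "chair": {
        ("object", "can be made of", "wood"),
        ("object", "has", "legs"),
        ("object", "can be used to", "sit on"),
    },
    "table": {
        ("object", "can be made of", "wood"),
        ("object", "has", "legs"),
        ("object", "can be used to", "eat at"),
    },
    "radio": {
        ("object", "runs on", "electricity"),
        ("object", "produces", "sound"),
    },
}

# Every commitment must only refer to triples that exist in the ontology.
assert all(c <= ONTOLOGY for c in COMMITMENTS.values())


def overlap(a: set[Triple], b: set[Triple]) -> float:
    """Jaccard similarity: shared triples divided by all triples in a or b."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


chair_vs_table = overlap(COMMITMENTS["chair"], COMMITMENTS["table"])
chair_vs_radio = overlap(COMMITMENTS["chair"], COMMITMENTS["radio"])
print(f"chair vs table: {chair_vs_table:.2f}")
print(f"chair vs radio: {chair_vs_radio:.2f}")
```

With these made-up commitments the chair and the table share two of four distinct triples, while the chair and the radio share none. The numbers mean nothing on their own; they only become useful when comparing several pairs against the same ontology.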
When I think of semantics, I tend to think of the ability to compare different things and learn how well they match; I think of ontologies and semantic (or ontological) commitments. However, semantics and ontologies are used for several different things.
- Data modelling, which I would rather refer to as terminology or defining a glossary. A couple of domain experts get together and agree on a common terminology: the concepts which are relevant for a certain domain and how they are interlinked (the relations between these concepts). This is very useful to ensure that different data formats can be mapped to each other. However, it is somewhat limited in scope, as it results in commitments which typically refer to a single concept in the ontology (for example, an “invoice-reference” in one program could map to the “invoice-id” in the ontology).
- Matching. When the ontology defines the relevant part of the world (in a certain context), the commitment is a cloud of triples and concepts which indicates the mapping between the thing and the ontology. These commitments can be compared, which may give an indication of how similar things are. You can imagine that it is a lot more difficult to build a good ontology for these cases.
- In many cases, attaching meta-data to objects is also referred to as semantics. This can be very useful, and is often used as a simple alternative to either a taxonomy or an ontological commitment. For example, for an article about the amount of capital needed to start a company, you could add some meta-data to indicate that this is an article about money. This is similar to tagging or assigning categories as used in blogs. However, not all use of meta-data has semantic value.
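The data modelling case from the list above, where each field in a program commits to exactly one concept in the ontology, can be sketched in a few lines of Python. Only “invoice-reference” and “invoice-id” come from the example in the text. The program names, the other field names and the `translate` helper are hypothetical and only there to make the sketch runnable.

```python
# Hypothetical glossary: each program's field names mapped to shared concepts.
# Only "invoice-reference" -> "invoice-id" is taken from the text above.
FIELD_TO_CONCEPT: dict[str, dict[str, str]] = {
    "billing_app": {
        "invoice-reference": "invoice-id",
        "client": "customer",
    },
    "accounting_app": {
        "inv_no": "invoice-id",
        "debtor": "customer",
    },
}


def translate(record: dict[str, str], source: str, target: str) -> dict[str, str]:
    """Re-key a record from the source program's field names to the target's,
    going through the shared concepts both programs map to."""
    to_concept = FIELD_TO_CONCEPT[source]
    # Invert the target mapping: concept -> field name in the target program.
    from_concept = {
        concept: field for field, concept in FIELD_TO_CONCEPT[target].items()
    }
    return {
        from_concept[to_concept[field]]: value for field, value in record.items()
    }


billing_record = {"invoice-reference": "2024-001", "client": "ACME"}
print(translate(billing_record, "billing_app", "accounting_app"))
```

Because every field maps to a single concept, the translation is a plain lookup. That is what makes this use of an ontology easy to build, and also why it says little about how similar two things are.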