Category Archives: semantics

Why XML security is broken

I am currently also part of the TAS3 European project, which is about a “Trusted Architecture for Securely Shared Services”.
This results in very interesting discussions about how to handle security, at which layer, etc.
The aim of the project is to avoid having to define, for each person or role, the details of who is allowed to do, see or get something, as this causes problems. You do not know in advance what your data or service will be used for, so this would require a lot of foresight. Another aspect is that the role or id of the client can be insufficient: an indication of the purpose for which the service or data is needed is also important in deciding whether access is granted.
The intended solution for removing the need for foresight is to use semantic footprints (commitments) to determine whether access is allowed or forbidden. Instead of just comparing role and purpose by id or description, you match on the semantic definitions, and when they match to a sufficiently high degree, you can draw a conclusion.
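
As a minimal sketch of what such a match could look like: the triples, the coverage measure and the 0.6 threshold below are my own illustrative assumptions, not something defined by TAS3. A purpose is described as a set of triples, and access is granted when the request covers enough of the purpose the policy allows.

```python
# Hypothetical sketch: grant access when the semantic footprint of the
# requested purpose covers enough of the purpose allowed by the policy.
# All triples, names and the threshold are illustrative assumptions.

def coverage(request: frozenset, policy: frozenset) -> float:
    """Fraction of the policy's triples that also occur in the request."""
    if not policy:
        return 0.0  # an empty policy matches nothing, so access is denied
    return len(request & policy) / len(policy)


def is_access_allowed(request: frozenset, policy: frozenset,
                      threshold: float = 0.6) -> bool:
    """Allow access when the match is at least the given threshold."""
    return coverage(request, policy) >= threshold


POLICY_PURPOSE = frozenset({
    ("treatment", "performed_by", "medical_staff"),
    ("treatment", "uses", "patient_record"),
    ("treatment", "benefits", "patient"),
})

REQUEST_CARE = frozenset({
    ("treatment", "uses", "patient_record"),
    ("treatment", "benefits", "patient"),
})

REQUEST_MARKETING = frozenset({
    ("marketing", "uses", "patient_record"),
    ("marketing", "benefits", "company"),
})

care_allowed = is_access_allowed(REQUEST_CARE, POLICY_PURPOSE)
marketing_allowed = is_access_allowed(REQUEST_MARKETING, POLICY_PURPOSE)
```

The care request covers two of the three policy triples and passes the threshold. The marketing request shares no triple with the policy and is refused, even though it asks for the same record.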

As a result of discussions about this, I received a mail from Dave Chadwick about XML security. It contains some interesting links to documents about problems with the WS-* stack and how older (non-XML, specifically SSL) solutions can work better in many situations. It makes for an interesting read.

For more details see:

displaying a data model ontology

Applying the “commercial” use of semantics, it is possible in equanda to automatically generate a data model ontology. This is exported as an OWL file.
The main use for this is to generate a graphical representation of the data model. Such an OWL file can be displayed in Protégé, which (after enabling the Jambalaya plugin in the project properties tab) can generate a graph like the one below.

[Image: Jambalaya view of a data model ontology generated by equanda]

This exercise has revealed some strange facets of the OWL file format. You would assume that a triple (A,X,B) and a triple (C,X,D) result in two arcs, one from A to B and one from C to D. This is not the case: arcs from A to D and from C to B are generated as well. Apparently you have to ensure that the relations have unique names. This makes the relation names which equanda generates less nice, but at least it works.
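
My guess at the cause, which I have not verified against the exported file or against Jambalaya: if a relation name is declared only once, with all source concepts collected as its domain and all target concepts as its range, then a viewer which draws an arc from every domain to every range will show the cross product. The sketch below only mimics that assumed behaviour. The function name and the suffixed relation names are hypothetical, not what equanda actually generates.

```python
from itertools import product

# Hypothetical sketch of a viewer that derives arcs from the domain and
# range collected per relation name, rather than from individual triples.

def arcs_from_domain_range(triples):
    """Connect every domain concept to every range concept per relation."""
    domains, ranges = {}, {}
    for subject, relation, obj in triples:
        domains.setdefault(relation, set()).add(subject)
        ranges.setdefault(relation, set()).add(obj)

    arcs = set()
    for relation in domains:
        for source, target in product(domains[relation], ranges[relation]):
            arcs.add((source, relation, target))
    return arcs


# The same relation name X is used twice: four arcs appear instead of two.
shared_name = [("A", "X", "B"), ("C", "X", "D")]
shared_arcs = arcs_from_domain_range(shared_name)

# Unique relation names keep the two arcs separate.
unique_names = [("A", "X_A_B", "B"), ("C", "X_C_D", "D")]
unique_arcs = arcs_from_domain_range(unique_names)
```

With the shared name, the result also contains the unwanted arcs from A to D and from C to B. With unique names, only the two intended arcs remain.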

introduction to semantics

In the IT world, there is a lot of buzz about semantics. This is usually surrounded by a lot of vagueness and is sometimes just used as marketing speak.
The principal idea of semantics is to give the true identity, true meaning or essence to a word, phrase or thing.
In the computer science world, this is usually done by mapping these things to either a taxonomy or an ontology.

  • A taxonomy is a tree which is used to classify the “thing”. In principle, everything should thus have exactly one place in the taxonomy where it can be classified. One of the problems is that if the taxonomy is incomplete, too much will be put in the “other” basket. Another problem is that a taxonomy does not really help to determine the amount of similarity between things. On the other hand, a taxonomy is a simple concept and can for example be useful when doing something like a Google search. Say you were searching for “capital”: it would be useful if Google asked you whether you mean capital as in money, the capital of a country or region, or another meaning. This would greatly increase the relevance of the results with just one extra click.
  • An ontology is a set of triples (concept, relation, concept) which together build a graph. The semantic meaning is then defined as a commitment on the ontology. The commitment indicates all triples and all concepts which are relevant for the “thing” in question. When two objects both have a commitment on the same ontology, you can compare them semantically. This is a graph comparison which determines the amount of overlap between the two commitments. For example, you could compare a table and a chair. Depending on the ontology and commitment, they could have common triples like “object can be made of wood”, “object has legs”, “object can be used to sit on”. The comparison will probably indicate that there is more overlap between a chair and a table than between a chair and a radio.
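
The chair, table and radio comparison can be sketched as follows. This is a rough illustration: the choice of triples for each commitment and the use of the Jaccard index as the overlap measure are my assumptions, and a real comparison would work on the graph structure rather than on a flat set of triples.

```python
# Hypothetical sketch: commitments as sets of triples taken from one shared
# ontology, compared by how many triples they have in common.

def overlap(a: frozenset, b: frozenset) -> float:
    """Jaccard index: shared triples divided by all triples in either set."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


CHAIR = frozenset({
    ("object", "can_be_made_of", "wood"),
    ("object", "has", "legs"),
    ("object", "can_be_used_to", "sit_on"),
    ("object", "has", "backrest"),
})

TABLE = frozenset({
    ("object", "can_be_made_of", "wood"),
    ("object", "has", "legs"),
    ("object", "can_be_used_to", "sit_on"),
    ("object", "has", "tabletop"),
})

RADIO = frozenset({
    ("object", "can_be_made_of", "wood"),
    ("object", "has", "speaker"),
    ("object", "uses", "electricity"),
})

chair_vs_table = overlap(CHAIR, TABLE)  # 3 shared triples out of 5
chair_vs_radio = overlap(CHAIR, RADIO)  # 1 shared triple out of 6
```

With these commitments the chair and the table overlap far more than the chair and the radio. A different ontology or a different choice of commitment could of course change the outcome.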

When I think of semantics I tend to think of the ability to compare different things and learn how well they match; I think of ontologies and semantic (or ontological) commitments. However, semantics and ontologies are used for different things.

  • Data modelling, though I would rather refer to this as terminology or defining a glossary. A couple of domain experts get together and agree on a common terminology: the concepts which are relevant for a certain domain and how they are interlinked (the relations between these concepts). This is very useful to ensure that different data formats can be mapped. However, it is somewhat limited in scope, as this results in commitments which typically refer to one concept in the ontology (for example an “invoice-reference” in one program could then map to the “invoice-id” in the ontology).
  • Matching. When the ontology defines the relevant part of the world (in a certain context), the commitment will be some cloud which indicates the mapping between the thing and the ontology. These commitments can be compared, and this may give an indication of how similar things are. You can imagine that it is a lot more difficult to build a good ontology for these cases.
  • In many cases, attaching meta-data to objects is also referred to as semantics. This can be very useful, and is often used as a simple alternative to using either a taxonomy or an ontological commitment. For example, for an article about the amount of capital needed to start a company, you could add some meta-data to indicate that this is an article about money. This is similar to tagging or assigning categories as used in blogs. However, not all use of meta-data has semantic value.
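
For the data modelling case, the mapping between data formats can be sketched like this. Only “invoice-reference” and “invoice-id” come from the example above. The second program, the other field names and the `translate` helper are hypothetical. Each program maps its own field names to a concept in the shared ontology, and a record is translated by going through those concepts.

```python
# Hypothetical sketch: two programs each commit their field names to
# concepts in a shared ontology, so records can be mapped via the concepts.

BILLING_TO_ONTOLOGY = {
    "invoice-reference": "invoice-id",
    "client": "customer",
}

ARCHIVE_TO_ONTOLOGY = {
    "inv_no": "invoice-id",
    "cust": "customer",
}


def translate(record: dict, source_map: dict, target_map: dict) -> dict:
    """Rename fields from the source format to the target format.

    Fields without a concept in the ontology, or without a counterpart
    in the target format, are dropped.
    """
    concept_to_target = {concept: field for field, concept in target_map.items()}
    result = {}
    for field, value in record.items():
        concept = source_map.get(field)
        target_field = concept_to_target.get(concept)
        if target_field is not None:
            result[target_field] = value
    return result


billing_record = {
    "invoice-reference": "2008-0042",
    "client": "ACME",
    "note": "not part of the ontology",
}
archive_record = translate(billing_record, BILLING_TO_ONTOLOGY, ARCHIVE_TO_ONTOLOGY)
```

Each commitment here refers to exactly one concept, which is why this use of an ontology stays close to a glossary and says nothing about how similar two things are.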