AAA
I quite like the idea of AAA: Anyone can say Anything about Anything. It comes from the RDF world, and I learned about it in the Working Ontologist book. I have the feeling that it is not really possible to store AAA in an RDF database (see RDF Confusion).
Having AAA in our database means that we need to store contradicting and/or redundant information about things. Let’s have a look at an idea for storing it in an LPG (labeled property graph), e.g. using Neo4j or Memgraph.
Storing the information
Let’s say we have persons, Alice and Bob, and that Alice likes Bob. Or so we think:
The green boxes are nodes, the strings on top of the box are the
labels, followed by relevant properties. The first string on relations
describes the type of the relation, again followed by properties.
_id is the internal id of an object, so that we can talk
about it.
- n1 describes :Alice as a Person with the name Alice
- n2 describes :Alice as a Person with the age 30
- n3 describes :Alice as a Human and has the name Ally
- n4 describes :Bob as a Person with the name Bob
- n5 describes :Charlie as a Person with the name Charlie
- r1 says :Alice likes :Bob
- r2 says :Alice likes :Charlie since 2025
- r3 says :Alice likes :Charlie since 2026
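The statements listed above can be sketched in plain Python, without any Neo4j or Memgraph API. This is only an illustration under assumptions: the `about` key is a hypothetical way to record which thing a node talks about, and the start node of each relation is inferred from the aggregation examples later in this post (r1 from n1, r2 from n2, r3 from n3).

```python
# Statement nodes n1-n5. "about" is a hypothetical property naming the
# thing a node talks about; the post does not say how this is stored.
nodes = {
    "n1": {"about": ":Alice", "labels": {"Person"}, "props": {"name": "Alice"}},
    "n2": {"about": ":Alice", "labels": {"Person"}, "props": {"age": 30}},
    "n3": {"about": ":Alice", "labels": {"Human"}, "props": {"name": "Ally"}},
    "n4": {"about": ":Bob", "labels": {"Person"}, "props": {"name": "Bob"}},
    "n5": {"about": ":Charlie", "labels": {"Person"}, "props": {"name": "Charlie"}},
}

# Relations r1-r3. Start and end nodes are assumptions inferred from the
# aggregation examples in the post.
relations = {
    "r1": {"type": "likes", "start": "n1", "end": "n4", "props": {}},
    "r2": {"type": "likes", "start": "n2", "end": "n5", "props": {"since": 2025}},
    "r3": {"type": "likes", "start": "n3", "end": "n5", "props": {"since": 2026}},
}


def statements_about(thing):
    """Return the ids of all nodes that say something about `thing`."""
    return sorted(node_id for node_id, node in nodes.items() if node["about"] == thing)


def values_of(thing, key):
    """Collect every distinct value stated for one property of `thing`."""
    return sorted({
        node["props"][key]
        for node in nodes.values()
        if node["about"] == thing and key in node["props"]
    })


print(statements_about(":Alice"))   # ['n1', 'n2', 'n3']
print(values_of(":Alice", "name"))  # ['Alice', 'Ally']
```

Nothing is merged or rejected at write time: both names for :Alice simply sit next to each other in separate nodes.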
Does the Person called Alice maybe also have the name Ally? Are they the same person? Is she poly and does she like both Bob and Charlie, or not, and since when? Lots of questions…
Aggregation creates an image of reality
In order to create a picture of reality, we need to pick nodes and aggregate the information in them. The database holds contradicting information. We need to select which nodes to rely on, and that selection paints our picture of the world.
- n6 aggregates n1 and n2, and learns that :Alice is a Person, age 30, name Alice
- n7 aggregates n2 and n3. :Alice could be Human, Person or maybe both? The name seems to be Ally
- n8 aggregates n1 and n3. :Alice is again Human or Person or both. We see two names. Are both valid or is this a problem?
- n9 and n10 - nothing special here
These questions need to be handled by the aggregation mechanism; giving a clean answer is outside the scope of the storage system. Having meta information around could be helpful: if we knew that ‘Person’ and ‘Human’ are compatible terms, both could be true.
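One possible shape for such an aggregation is sketched below in plain Python. It is an assumption, not a fixed design: labels are unioned, every distinct value of a property is kept in a list, and the helper `conflicts` (a hypothetical name) reports properties that ended up with more than one value.

```python
def aggregate(*statements):
    """Merge statement nodes: union the labels, keep every distinct value."""
    labels = set()
    props = {}
    for statement in statements:
        labels |= statement["labels"]
        for key, value in statement["props"].items():
            values = props.setdefault(key, [])
            if value not in values:
                values.append(value)
    return {"labels": labels, "props": props}


def conflicts(aggregated):
    """Return the property names that carry more than one value."""
    return sorted(key for key, values in aggregated["props"].items() if len(values) > 1)


n1 = {"labels": {"Person"}, "props": {"name": "Alice"}}
n2 = {"labels": {"Person"}, "props": {"age": 30}}
n3 = {"labels": {"Human"}, "props": {"name": "Ally"}}

n6 = aggregate(n1, n2)  # Person, name Alice, age 30
n7 = aggregate(n2, n3)  # Person and Human, name Ally, age 30
n8 = aggregate(n1, n3)  # Person and Human, two names

print(conflicts(n6))  # []
print(conflicts(n8))  # ['name']
```

Whether two names or two labels count as a problem is left to the caller; the sketch only makes the disagreement visible.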
Aggregated relations
Depending on the aggregation one also sees different relations between the objects:
- n6 sees r12 & r13 (because of r1 & r2): :Alice likes :Bob and :Charlie (since 2025)
- n7 sees r14 (because of r2 & r3): obviously :Alice likes :Charlie, but it is not clear when it started. Relations of the same type get aggregated. The property _number tells the system, as helper information, that two relations were merged into this one
- n8 sees r15 and r16 (because of r1 and r3): she does like both, and it’s clear that it started with :Charlie in 2026.
The only thing that is obvious is that there is more than one truth, and the perspective on the world depends on the facts that you pick.
Note: we know that r12 - r16 are aggregated relations because they are between ‘Agg’ nodes
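The relation merging described above can be sketched in plain Python as follows. The sketch rests on assumptions: relations are grouped by type and aggregated target, `_number` counts how many source relations were merged, and n9 and n10 are taken to be the aggregates of n4 (:Bob) and n5 (:Charlie). The start node of each relation is inferred from the list above.

```python
relations = [
    {"id": "r1", "type": "likes", "start": "n1", "end": "n4", "props": {}},
    {"id": "r2", "type": "likes", "start": "n2", "end": "n5", "props": {"since": 2025}},
    {"id": "r3", "type": "likes", "start": "n3", "end": "n5", "props": {"since": 2026}},
]

# Assumed mapping from original target nodes to their 'Agg' nodes.
agg_of = {"n4": "n9", "n5": "n10"}


def aggregate_relations(picked, relations, agg_of):
    """Merge the relations leaving the picked nodes, grouped by (type, target)."""
    merged = {}
    for rel in relations:
        if rel["start"] not in picked:
            continue
        key = (rel["type"], agg_of[rel["end"]])
        entry = merged.setdefault(key, {"_number": 0, "props": {}})
        entry["_number"] += 1  # helper information: how many relations were merged
        for name, value in rel["props"].items():
            entry["props"].setdefault(name, []).append(value)
    return merged


n6_view = aggregate_relations({"n1", "n2"}, relations, agg_of)  # r12, r13
n7_view = aggregate_relations({"n2", "n3"}, relations, agg_of)  # r14
n8_view = aggregate_relations({"n1", "n3"}, relations, agg_of)  # r15, r16

print(len(n6_view), len(n7_view), len(n8_view))  # 2 1 2
print(n7_view[("likes", "n10")])  # {'_number': 2, 'props': {'since': [2025, 2026]}}
```

The n7 view keeps both start years side by side, which matches the observation that it is not clear when the relation started.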
Provenance
What’s left out of the picture is the provenance - who said it, when,
with what certainty. The idea is to store this information on the
original data nodes (n1-n5), in properties that won’t get aggregated
(e.g. properties starting with _). This helps us work around a
limitation of LPG: we can’t really talk about the assignment of labels, types
and properties. If we bundle the information with the same
context/provenance together in the same node, we have a way to store and
differentiate it from other contexts.
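A minimal Python sketch of this idea follows. The property names `_source` and `_certainty` and all of their values are made up for illustration; the only thing taken from the text is the convention that properties starting with `_` stay on the original node and are not aggregated.

```python
# Original statement nodes with invented provenance properties.
statements = {
    "n1": {"labels": {"Person"}, "props": {"name": "Alice", "_source": "source_a", "_certainty": 0.9}},
    "n2": {"labels": {"Person"}, "props": {"age": 30, "_source": "source_a", "_certainty": 0.9}},
    "n3": {"labels": {"Human"}, "props": {"name": "Ally", "_source": "source_b", "_certainty": 0.5}},
}


def select_by_source(statements, source):
    """Pick the ids of all statements that share one provenance."""
    return sorted(
        node_id
        for node_id, statement in statements.items()
        if statement["props"].get("_source") == source
    )


def aggregatable_props(statement):
    """Drop the underscore-prefixed provenance properties before aggregation."""
    return {
        key: value
        for key, value in statement["props"].items()
        if not key.startswith("_")
    }


print(select_by_source(statements, "source_a"))  # ['n1', 'n2']
print(aggregatable_props(statements["n3"]))      # {'name': 'Ally'}
```

Selecting by provenance first and aggregating afterwards is one way to make the choice of facts explicit.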
Conclusion
Using this approach we can actually do AAA in LPG. It doesn’t bend the system too much, and it allows “truth decision” at read time. There is no need to decide what is true beforehand.