Archive for January, 2004

foaf:mbox and other identifiers

Thursday, January 8th, 2004

There’s been recent discussion on the foaf mailing list saying that foaf:mbox doesn’t make a sensible identifier, for a variety of reasons. There seem to be some serious misconceptions here: foaf does not require, or care whether you have, a foaf:mbox (or even an mbox_sha1sum). The entire foaf universe would work fine if no-one had one.

foaf:mbox is defined as being unique to an agent. That’s the definition of a foaf:mbox; it’s not a definition of an email address. Because it is unique to an agent rather than to a person, there is no problem with more than one person sharing an address: the group sharing it is the agent, and that’s already supported. Some tools may make assumptions about individuals, and others may want to, but that’s a limitation of those assumptions. Often, though, such assumptions are fine for the use case, e.g. thanking Aunt Maud, and Grandma and Grandad, for the pressies is equivalent; who cares that one’s a person and one’s a group?

foaf:mbox’s definition is usable for the task that it’s used for - a convenient distributed identifier for many people - and foaf:weblog and foaf:homepage are just as convenient. It’s not much use for answering the question “what email address do I use to email someone?” If you want to answer that question you do need to start looking very seriously at temporal issues (which aren’t as simple as has been suggested recently: I believe interpretationProperties and RDF architecture - no smushing - prevent the solution with dcterms from being workable).

So foaf:mbox isn’t required, it’s just defined so that authors can use it. A GUID is not a workable alternative: to author with one, people will always need a central server with total knowledge of the system so they can discover what the appropriate GUID for a person is, and there is still the question of what happens when that person doesn’t have one.

If someone wants to construct a uniquely identifying string/guid or whatever, they’re free to do so; it’s simple:

    <rdf:Property rdf:about="http://jibbering.com/vocabs/invalid/GUID">
      <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
      <rdfs:domain rdf:resource="http://xmlns.com/foaf/0.1/Agent"/>
      <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
    </rdf:Property>

Job done, defined, and foafnaut will automatically use it as a unique identifier and smush on it, as will other tools; this is what RDF gives you. If you can’t use an mbox for someone, use your invented GUID, or any other IFP. A single guid system will never work, which is why it shouldn’t be in the FOAF namespace. A local guid system in local areas can work, but that local group must take responsibility for solving a lot of the problems it brings; there’s no problem using it, though.
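To make the smushing step concrete, here is a minimal sketch in plain Python of how a tool might merge nodes that share a value for any inverse functional property. It is not how foafnaut or any other tool actually does it; the tuple representation, the blank node names, the GUID value and the `smush` function are all made up for illustration.

```python
# Illustrative only: triples are plain (subject, predicate, object) tuples,
# and the node names (_:a, _:b, ...) and GUID value are invented for this sketch.
FOAF = "http://xmlns.com/foaf/0.1/"
GUID = "http://jibbering.com/vocabs/invalid/GUID"

# Properties this hypothetical tool treats as inverse functional (IFPs).
IFPS = {FOAF + "mbox", FOAF + "homepage", FOAF + "weblog", GUID}


def smush(triples, ifps=IFPS):
    """Map every subject to a canonical node, merging subjects that share
    the same value for any inverse functional property."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexically smaller name as the canonical node.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (predicate, object) -> first subject seen with it
    for s, p, o in triples:
        find(s)  # register every subject, even ones with no IFP
        if p in ifps:
            key = (p, o)
            if key in first_seen:
                union(first_seen[key], s)
            else:
                first_seen[key] = s
    return {node: find(node) for node in parent}


triples = [
    ("_:a", FOAF + "name", "Aunt Maud"),
    ("_:a", FOAF + "mbox", "mailto:maud@example.org"),
    ("_:b", FOAF + "mbox", "mailto:maud@example.org"),
    ("_:b", GUID, "maud-0001"),
    ("_:c", GUID, "maud-0001"),
    ("_:d", FOAF + "name", "Grandad"),
]
canonical = smush(triples)
```

Here `_:a` and `_:b` merge on the shared mbox, `_:c` joins them through the invented GUID, and `_:d` stays on its own because nothing identifies it. No one property has to be present on every node, which is the point: any IFP will do.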

There’s also been a suggestion that governments currently use things about people’s birth and appearance as identifiers. This is true, and it works for some governments; however, I do not believe it works on a distributed internet system that is modelling life, not simply trying to ensure people have an identity. Equally, it has collision problems anyway.

The problem with birth place/date is that not everyone knows them even about themselves, let alone other people, and being able to talk about other people without hassling them is a requirement in FOAF as I see it. Parents also change. This doesn’t matter to a government: its database is private, and it can force people to tell the truth, and not actually care if it’s wrong, since it’s only used as an identifier anyway. It does, however, matter to a person. Not everyone wants to reveal such information to the world - do I really want to admit I’m Ronald Reagan’s and Margaret Thatcher’s love child?

Everything that can be used to identify a person has problems. The RDF approach to identifiers is that anything can be used to identify things; this means you can pick one which doesn’t cause problems for the thing you’re wanting to identify, and RDF-aware tools will pick up on this.

Equally, everything about a person or a resource has a temporal aspect (who my parents were last week and who they are next week does not change biologically, but it does change sociologically, and because not everyone has complete knowledge of situations). So every property needs to be able to be qualified by the time at which it is valid, and for many this cannot go into the future 100% reliably; being completely correct also means you’ll likely be completely useless.

For simplicity, and generally because it’s actually only relevant to a few properties, the temporal methods can be handled outside of the RDF document (e.g. at the HTTP level). If I know from an RDF document that SomeBod someNS:hairColourOnHeadNotFacial Red, and the file was created at 2004-01-08 with an expiry of 3 days, then I can safely conclude that the hair was red; in a week’s time I couldn’t.
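As a rough sketch of that reasoning, the check below keeps the time information outside the statement itself. The function name `still_trusted` and the `created`/`lifetime` values are hypothetical stand-ins for whatever the transport level supplies (HTTP Date and Expires headers, say); nothing here is a real FOAF or RDF API.

```python
from datetime import datetime, timedelta


def still_trusted(created, lifetime, now):
    """True while 'now' falls inside [created, created + lifetime]."""
    return created <= now <= created + lifetime


# The statement carries no time information of its own.
statement = ("SomeBod", "someNS:hairColourOnHeadNotFacial", "Red")

# Assumed to come from outside the RDF document, e.g. the HTTP response.
created = datetime(2004, 1, 8)
lifetime = timedelta(days=3)

next_day = still_trusted(created, lifetime, datetime(2004, 1, 9))
next_week = still_trusted(created, lifetime, datetime(2004, 1, 15))
```

With these values `next_day` is true and `next_week` is false: a day after the file was created the statement is still believed, a week later it isn’t, and the RDF itself never had to change.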

Of course, if I wasn’t interested in the hair colour right now, but only in answering the question “Has SomeBod ever had Red hair?”, then I can do that too. Hair colour is an incredibly transient thing for some people, and for others it changes maybe once in their lifetime; a single model doesn’t work.
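A small sketch of why the question matters: the same dated observations answer “ever?” and “most recently?” differently. The observation list, dates and function names are invented for illustration.

```python
from datetime import date

# Made-up, dated observations of one property for one subject.
observations = [
    (date(2003, 6, 1), "Red"),
    (date(2003, 11, 20), "Brown"),
    (date(2004, 1, 8), "Blonde"),
]


def ever_had(observations, value):
    """Has the property ever been observed with this value?"""
    return any(v == value for _, v in observations)


def latest(observations):
    """Value of the most recent observation, or None if there are none."""
    if not observations:
        return None
    return max(observations, key=lambda ob: ob[0])[1]


was_ever_red = ever_had(observations, "Red")
most_recent = latest(observations)
```

Here `was_ever_red` is true while `most_recent` is "Blonde", so which answer is right depends entirely on what you were asking.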

In fact that’s my conclusion to all RDF modelling - A single model doesn’t work. Fortunately in RDF, you don’t need a single model.