The Semantic Lunch

Lunch today with John Davies, who’s in charge of next-web research for BT. It was quite a long, or rather intense, discussion, so I’ll only tackle the basics here. I’ve been trying to nail this semantic web issue for some time, but every time I start reading an academic paper, my attention seems to wander off. So this was a good opportunity for me. I wasn’t going to deviate. As soon as he sat down, I was in with my carefully prepared, top journalist’s question: “So what’s this semantic web thingy, then?”

It turns out that that is one of the more difficult questions. (Damn!) It depends on what you mean. You might mean making the billions of existing web sites semantic, or you might mean only possible future sites or services. The second of these is the more likely outcome at present. The semantic web is partly about annotating web pages to make them amenable to machines. John prefers the expression ‘semantic technologies’ to avoid this confusion.

At the moment, information on the web is pretty much designed for human consumption. You and I know when we go to a shopping site that the figure in bold is the price, that a certain number is the product code and that this block of text is the shipping information. To a machine, it may make no sense whatsoever. If machines are to be able to bring together all these different pages to make the web more useful, then they need to be able to read them.

We’ll see the first applications of semantic technologies in the enterprise space, where the need is more acute. Enterprises have lots of databases, all built by different people according to different rules. Integrating the information from those is already a very costly and time-consuming activity. One database may talk about CustomerName, another may refer to CustomerID, for example. Joining these things together, so that, perhaps, a support department knows what equipment the logistics department has installed for a customer, improves business efficiency. Semantic technologies put what Davies called a “wrapper” around these different data sources to create overarching access, connecting them in a way that doesn’t require nearly so much human effort.
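
To make the “wrapper” idea a bit more concrete, here is a minimal sketch in Python. It is my own toy illustration, not how BT or any real semantic toolkit does it: the rows, the `SHARED_ID` mapping and the function names are all invented, and a real system would derive the mapping from an ontology rather than a hand-written table.

```python
# Hypothetical rows from two departmental databases with different schemas.
SUPPORT_ROWS = [{"CustomerName": "Acme Ltd", "open_tickets": 2}]
LOGISTICS_ROWS = [{"CustomerID": "C-001", "installed": ["router", "switch"]}]

# Assumed, hand-maintained mapping: which local key refers to which shared customer.
SHARED_ID = {
    ("CustomerName", "Acme Ltd"): "customer:acme",
    ("CustomerID", "C-001"): "customer:acme",
}


def wrap(rows, key_field):
    """Re-key each row under the shared customer identifier."""
    for row in rows:
        shared = SHARED_ID[(key_field, row[key_field])]
        facts = {k: v for k, v in row.items() if k != key_field}
        yield shared, facts


def unified_view():
    """Merge the wrapped sources into one record per customer."""
    view = {}
    sources = [(SUPPORT_ROWS, "CustomerName"), (LOGISTICS_ROWS, "CustomerID")]
    for rows, key_field in sources:
        for shared, facts in wrap(rows, key_field):
            view.setdefault(shared, {}).update(facts)
    return view


if __name__ == "__main__":
    print(unified_view())
```

The point of the sketch is only that neither database has to change: the support desk can ask for `customer:acme` and see the installed equipment alongside its own tickets.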

People developing semantic technologies work by developing an ontology for understanding the sort of data being looked at, and the technology is then able to do some reasoning based on it. An ontology is a form of classification system for whatever it is that’s being examined. For foods, that might include their ingredients, nutritional properties, suppliers and type. It’s not just a list, though: it also captures the relationships between different items. An ontology developed for food might come across E101, additives and the CrispyPop bar. It will know that E101 falls under additives, which are part of the ingredients of the bar. If that description then gets combined with a wholesaler’s database of shops where you might send the bar, then the semantic agent will calculate that health food shops aren’t going to stock CrispyPop bars. It’s not intelligence in any way, but the application of rules that the creators have decided upon.
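
The CrispyPop reasoning can be sketched in a few lines of Python. This is a toy under stated assumptions: the `IS_A`, `HAS_INGREDIENT` and `REFUSES` tables, the “Oat slice” product and the rule that health food shops refuse additives are all mine, invented to mirror the example, and real ontologies would be written in something like OWL rather than dictionaries.

```python
# Toy ontology: every name and rule here is made up for illustration.
IS_A = {"E101": "additive", "additive": "ingredient"}
HAS_INGREDIENT = {"CrispyPop bar": {"E101", "sugar"}, "Oat slice": {"oats"}}

# Rule decided by the ontology's creators (assumed): what each shop type refuses.
REFUSES = {"health food shop": {"additive"}, "newsagent": set()}


def classes_of(term):
    """Return the term plus everything it falls under via IS_A."""
    found = {term}
    while term in IS_A:
        term = IS_A[term]
        found.add(term)
    return found


def will_stock(shop, product):
    """A shop stocks a product unless an ingredient falls under something it refuses."""
    for ingredient in HAS_INGREDIENT[product]:
        if classes_of(ingredient) & REFUSES[shop]:
            return False
    return True


if __name__ == "__main__":
    for shop in REFUSES:
        for product in HAS_INGREDIENT:
            print(shop, product, will_stock(shop, product))
```

Nothing here is clever: the “calculation” is just following the relationships from bar to ingredient to additive and then applying a rule somebody wrote down, which is exactly the not-intelligence point above.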

Because the semantic technologies are lightweight and open source, they are potentially available to any company. For this reason, enterprises that get some of their data from external sources will still be able to use semantic approaches to integrate and drill into the ensuing combination. These are my words, not John’s, but one way to think of the technologies is as providing a toolkit for more easily creating web mashups. Companies such as Cerebra and Ontoprise already exist to sell ways to integrate enterprise information using OWL, the Web Ontology Language.

I kind of understood, so far, but I needed a good example. I suspect you may be the same.

BT works closely with the National Health Service in the UK. The Service has already gone a long way into digitising and collating its information through the Electronic Patient Record system, and its medical knowledge through the SNOMED classification system. Unfortunately, though, the data can still be very dispersed. The X-ray department might have a patient’s data on a different system to the Pharmacy, for example, and those might be completely unconnected to the systems used in a different hospital or by a clinic.

What semantic web technologies bring to this is unity and also what John called Description Logics. They can prise open the different databases and allow an overview, but also calculate with the data. Imagine a patient’s medical record says that they are allergic to almonds. Then a doctor misses this and somehow prescribes a nut-based food. When the nurse enters this into the patient’s record, the semantic application will use its ontology to work out that almonds are nuts and that therefore this is a very bad idea. Semantic technologies that can perform calculations like this, and potentially save lives, are already in use in the UK Health Service.
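
For the curious, the almond check might look something like this in Python. It is a stand-in, not a Description Logic reasoner and certainly not the NHS system: the `SUBCLASS_OF` table and the function names are invented, and the only “logic” is asking whether an allergy and an ingredient sit on the same branch of the classification.

```python
# Assumed mini-classification; a real system would draw on a medical ontology.
SUBCLASS_OF = {"almond": "nut", "hazelnut": "nut"}


def superclasses(term):
    """Return the term plus every broader class above it."""
    found = {term}
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        found.add(term)
    return found


def related(a, b):
    """True if one term is the same as, or a kind of, the other."""
    return a in superclasses(b) or b in superclasses(a)


def check_prescription(allergies, contents):
    """List (allergy, ingredient) pairs that should raise a warning."""
    return [
        (allergy, item)
        for allergy in sorted(allergies)
        for item in sorted(contents)
        if related(allergy, item)
    ]


if __name__ == "__main__":
    print(check_prescription({"almond"}, {"nut", "honey"}))
```

The check runs in both directions on purpose: an almond allergy should flag a food recorded only as “nut-based”, and a general nut allergy should flag a food recorded as containing almonds.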

I’ll leave this there for today. My head hurts already. If there’s interest, I’ll be happy to do a follow-up on another day.

9 comments to The Semantic Lunch

  • Ian, I share your pain, it’s the semantic migraine ;-)

    But I really want to understand! So, with that in mind I went to a lecture on ‘Ontologies & The Semantic Web’ given by Professor Ian Horrocks at the Royal Society in December last year. It was a great talk and it actually helped me over the basic hurdles. However, some of the details were too much (read my report on the talk and you’ll see exactly where my brain melted).

    At a recent Beers & Innovation event (which I organise) on Web Services & Mash Ups, Simon Willison, who’s Technology Dev at Yahoo!, said that the development of web data has come in three broad stages: unstructured, structured and standardised. We’re still struggling to get data structured at the minute.

    Tom Loosemore, Project Director for BBC 2.0, said at the same event that standards cannot be fixed when the environment is not mature. Tim Berners-Lee would now find it hard to argue that the web has not grown as a mess, Tom said, and he stressed that standards emerge and imposing them stifles innovation. Interesting point. And of course I don’t dream that he is saying they emerge unconsciously, but from collaboration and testing I suppose…

    Anyways, that’s my less than 2 cents. Horrocks’ talk was good and I didn’t see any other write ups of it, but then most of the other people there weren’t random non-sciencey types like me ;-)

  • Hey… how come you have no link to the set of articles (or at least the original article) I had published that seeded your interest in the semantic Web?

    ;-)

    Marc

  • My guy was quite keen to separate the idea of annotating individual web pages through XML from the idea of semantic web services.
    In fact, he was quite disparaging about Microformats because they are prescriptive of the content they are used to represent. Semantic web services are designed to be descriptive, and thus inherently more flexible. While the ontologies are formal frameworks, the documents they describe can be anything.
    Hang on, my headache’s coming back.

  • Marc — I’ll find some pretext for writing about it soon. ;)

  • We’re bloggers, we need no stinking pretexts!

    BTW, I used to think it’s 98 all over again with Web 2.0 and all the hype. Now I think it’s 95 all over again. Well, at least in the area I’m in, which is experiencing the same growth the Web was experiencing in 95.

    Your blog’s template/html has been evolving faster than any blog out there and somehow it has gone from good to better through all those changes. Thumbs up for getting rid of the Google NonSense ads.

    :)

    Marc

  • Like you, I blog for lots of reasons: personal pleasure, because it helps me work things out, to have a bit of dialogue about those things, to perhaps increase the audience for the eventual book, to maybe get some other work, to attract sponsorship.

    Google doesn’t really factor into *any* of that. Sod ‘em.

  • Interesting is Google’s problem with the semantic web. A web site should have a machine readable and understandable version of its content, whose current form is a microformat. But given the abuse of existing web page descriptives by spammers, like the metanames abuse, Google is not too impressed.

    I think it is a minor issue and, as you say, annotating the current web should be separate from the overall semantic web project. Yet it sadly intrudes into the debate, as in Berners-Lee’s recent Question & Answer session with Google’s Norvig.

  • ‘Semantic Technologies’ in the Enterprise…

    Ian Delaney, a journalist based in London, reports on an interview he had with John Davies of BT (the former British Telecom) in The Semantic Lunch. I have previously written about Davies’ BT colleague Paul Warren and his call for the …

  • […] Techmeme is full to bursting with posts announcing/decrying the announcement of something called Web 3.0. The kerfuffle follows an article in the New York Times yesterday, which is actually about semantic technologies. I gave a little overview in August and there’s more here. The ideas have been around since at least 1999, and are part of Berners-Lee’s vision for Web 1.0. My friend Marc Fawzi gave a good introduction to the idea in June: […]
