Right now I am working on several projects at Dog Blue Software that revolve around natural language processing and its application to Internet search. Natural language processing looks at how the words on a page are used in sentences and gives the computer some idea of what those words mean (the semantics).
Working with information from the Internet is challenging on a number of levels. There is the amount of data. There are the limitations of the interface. There are the multiple languages and idioms.
Some people talk about "The Semantic Web" as a kind of saviour for the problem of sifting through large amounts of unstructured data. Clay Shirky neatly critiques the limitations of the concept, limitations that would remain even if it were to coalesce into being.
Rather than a simplistic solution - "we'll just ask everyone to become an Internet librarian" - or Google's needle-in-a-haystack solution - "we'll give you ten needles that everybody already knows about, you can forget about all the other ones" - the solution to navigating large information spaces will probably require a level of sophistication bordering on artificial intelligence.
But until then, letting the computer find the meaning of text rather than its representation is an idea that seems to have promise.