Saturday 12 July 2014

Watsonian

On waking this morning, I returned to the Watson of my first post of 10th July, starting out in a dream but eventually waking up to the extent of arriving here.

In the dream I was wondering about, simplifying a little, the various models of Rolls Royce cars (see http://www.rolls-roycemotorcars.com/, not to be confused with the people who make power plants). We suppose that the marketing chaps have arranged the various models in an array three deep and twelve across. The three deep might be, for example, three different styles of headlight. Further, that a buyer is allowed to choose a colour, perhaps with a different choice of colour scheme for each of the 36 models.

The task for Watson, on the basis of having looked at a reasonable number of cars from a vantage point on the outside, is first to be able to enumerate the various models available, second to be able to say which model some new car is and third to describe the various models in a way that a car enthusiast would relate to.

We suppose that Watson is equipped with two eyes, eyes which are capable of jointly scanning the car and so capable, by triangulation, of plotting the position in space of every point on the visible surface of the car.

We suppose in passing that the models are such that one can tell what model a car is from whatever reasonable angle that one looks at it. And we leave aside the business of getting enough images of any one model to be able to build up a complete picture.

We suppose further that such plotting can be turned into a three dimensional array where each element of the array can be described as (surface as boolean, colour as string, transparent as boolean, edge as boolean). So the array describes a brick shaped portion of space and for each point in the array we can say whether or not it is part of the surface of the object being described; if it is, whether or not that bit of the surface is transparent (and so probably glass or some such); and if it is not transparent, what colour it is and whether or not it is on a boundary line of some kind on that surface. A boundary which might be, for example, that of the windscreen or that between two colours.
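By way of illustration, the array just described might be sketched in Python along the following lines. The names Voxel, empty_grid and surface_count are my own inventions for the purpose, as is the colour "midnight blue", and a nested list is simply the easiest thing to write down, not a claim about how Watson would actually store anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Voxel:
    """One element of the array: (surface, colour, transparent, edge)."""
    surface: bool = False          # is this point part of the car's surface?
    colour: Optional[str] = None   # a name picked from a short list, not an RGB tuple
    transparent: bool = False      # probably glass or some such
    edge: bool = False             # on a boundary line, e.g. round the windscreen


def empty_grid(nx, ny, nz):
    """A brick shaped portion of space with nothing in it yet (hypothetical helper)."""
    return [[[Voxel() for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]


def surface_count(grid):
    """How many points of the brick lie on the surface of the object."""
    return sum(v.surface for plane in grid for row in plane for v in row)


grid = empty_grid(4, 3, 2)
# A painted point lying on a boundary between two colours.
grid[1][2][0] = Voxel(surface=True, colour="midnight blue", edge=True)
# A point on the windscreen: on the surface, transparent, so no colour recorded.
grid[1][2][1] = Voxel(surface=True, transparent=True)
```

Making the elements immutable (frozen) is a matter of taste; it just stops a point from being half changed by accident.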

Exercise for the reader: would it be necessary or helpful for Watson to know about inside and outside? To be able to say of all points which were not part of the surface, whether or not they were in or out?

We suppose in passing that the colours can be picked from a list. That there are only a few dozen of them to bother about. No need to get into RGB tuples. No need to worry too much about the quality of the ambient light.

We suppose in passing that the cars are more or less convex bodies and that their surfaces are more or less topologically equivalent to the surfaces of spheres. That is to say they are reasonably straightforward objects which Watson can reasonably be expected to have a stab at.

Then how does Watson get from such an array, vaguely comparable to the output from the eye-side processing of vision in the brain, to the various models of Rolls Royce cars?

To make things a bit easier for Watson, we assume that all the cars have four wheels in the same place and that all cars have bilateral symmetry. This means that he can start on comparing two images by getting them to face the same way, superimposing the wheels and then seeing what he has got.
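A minimal sketch of the getting-to-face-the-same-way step might look like this. It assumes, which the above does not say, that Watson already knows which two wheels are the front ones and that the vertical is the z axis; the function name align is hypothetical. It slides the car so that the middle of the four wheels sits at the origin and then turns it about the vertical so that the front points along the x axis.

```python
import math


def align(points, front_wheels, rear_wheels):
    """Move (x, y, z) points so the wheel centre is at the origin, car facing +x.

    Assumes two front and two rear wheel centres, and that z is vertical.
    """
    wheels = list(front_wheels) + list(rear_wheels)
    cx = sum(w[0] for w in wheels) / len(wheels)
    cy = sum(w[1] for w in wheels) / len(wheels)

    # Direction of travel: from the middle of the rear axle to the middle of the front.
    fx = sum(w[0] for w in front_wheels) / 2 - sum(w[0] for w in rear_wheels) / 2
    fy = sum(w[1] for w in front_wheels) / 2 - sum(w[1] for w in rear_wheels) / 2
    angle = math.atan2(fy, fx)

    # Rotate by minus that angle so the direction of travel lands on the x axis.
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    aligned = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        aligned.append((dx * cos_a - dy * sin_a, dx * sin_a + dy * cos_a, z))
    return aligned


# A toy car pointing along the y axis, with one surface point above the bonnet.
front = [(0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
rear = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
aligned = align([(1.0, 3.0, 1.0)], front, rear)
```

With two cars both put through this, superimposing the wheels comes for free, given the assumption that all the cars have their wheels in the same place.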

One could probably decide on this basis whether two different images were of the same shape. If the images had good resolution, one would not get very many models to the shape, maybe just one most of the time. We suppose that it is just one all of the time.

Allowing for a bit of noise in the system, Watson can then, given enough teaching cars, work out how many models there are and classify any inbound car to model. I think he would be able to do this without knowing anything much about cars at all. No domain knowledge to speak of.
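To give a flavour of how little domain knowledge is needed, here is one rough way it might be done. It assumes each car has already been lined up and boiled down to the set of array positions lying on its surface, and it swaps in a simple overlap measure (the Jaccard distance) and a greedy grouping of my own choosing; the names shape_distance, group_into_models and classify, and the tolerance of 0.2, are all made up for the purpose.

```python
def shape_distance(a, b):
    """Jaccard distance between two sets of surface positions: 0 same, 1 disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def group_into_models(shapes, tolerance=0.2):
    """Greedy grouping: a car joins the first model it is close enough to,
    otherwise it founds a new model. Returns (exemplars, labels)."""
    exemplars, labels = [], []
    for shape in shapes:
        for idx, exemplar in enumerate(exemplars):
            if shape_distance(shape, exemplar) <= tolerance:
                labels.append(idx)
                break
        else:
            exemplars.append(shape)
            labels.append(len(exemplars) - 1)
    return exemplars, labels


def classify(shape, exemplars, tolerance=0.2):
    """Say which known model an inbound car is, or None if it matches nothing."""
    if not exemplars:
        return None
    best = min(range(len(exemplars)),
               key=lambda i: shape_distance(shape, exemplars[i]))
    return best if shape_distance(shape, exemplars[best]) <= tolerance else None


# Toy teaching cars: a flat slab, a slightly noisy copy of it, and something else.
box = {(x, y, 0) for x in range(10) for y in range(2)}
noisy_box = (box - {(0, 0, 0)}) | {(0, 0, 1)}
other = {(x, y, 5) for x in range(10) for y in range(2)}

exemplars, labels = group_into_models([box, noisy_box, other])
# labels is [0, 0, 1]: two models found, the noisy copy filed with the original.
```

The number of models is then just the number of exemplars. The grouping depends on the order in which the teaching cars turn up and on the tolerance chosen, so something more careful would be wanted in earnest.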

But what about saying what it was that made one model different from another? How could he start to pick out relevant features of cars and say that this model had fatter headlights than that model? Would he ever arrive at the three by twelve array that the marketing chaps had come up with?

I dare say that, again given enough teaching cars, which would probably need to include cars which were not Rolls Royces, he could work out that there were common features, features that most cars shared. Things like headlights, windscreens, doors and door handles, to which he could assign his own names. He could perhaps ask his human teacher for the human names for such features, which names would help a lot with interactions with humans.

And having got this far, it would be interesting to talk to the Watson engineers to find out how they would actually go about it. Do IBM do tours?
