This is a question I have been pondering on my long (4 km) morning walks: "What is 'Understanding'?"

We have been talking about Artificial General Intelligence (AGI), that is, intelligence that is human-like. In my opinion, part of intelligence is 'understanding'. But what is 'understanding' in the first place? Only once we have a definition can we set up metrics to determine whether an AGI that we have built understands, and how much it understands.

So far, what I can think of is this: if I want to know whether a human understands what I have shared, I can see the "understanding" in three ways.

Firstly, through the application of the knowledge. If a person understands a piece of knowledge well, that person will apply it well, to its full extent. However, this is where I run into a bit of a challenge: how do I measure the "apply well" part? Is there a way to do it, and is there an objective way to do it?

Secondly, we can tell whether another person understands in much the same way as the comprehension section of our English tests or exams, where the person reads a story and then answers questions about it. Machines already do this quite well through information retrieval methods using deep learning and transformers. However, is that really understanding? Yes, the generated answer may be paraphrased, but I would argue that is an illusion, because it is merely a manipulation of the order of keywords.
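To make that worry a little more concrete, here is a minimal sketch in Python. It assumes a very crude "comprehension" score: the overlap between the set of words in the source sentence and the set of words in the answer. The function names and the example sentences are made up for illustration, and real question-answering systems are far more sophisticated than this. The point is only that a score like this cannot tell a reordered sentence from a genuine answer.

```python
import re


def keyword_set(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def keyword_overlap(source, answer):
    """Jaccard overlap between the word sets of two texts (0.0 to 1.0)."""
    a, b = keyword_set(source), keyword_set(answer)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical example: the "answer" only shuffles the words of the source.
source = "The queen gave Snow White a poisoned apple"
shuffled = "A poisoned apple the queen gave Snow White"

# The two sentences share exactly the same words, so the overlap is maximal
# even though no understanding was needed to produce the second one.
print(keyword_overlap(source, shuffled))
```

A metric built on surface overlap rewards exactly the keyword shuffling described above, which is why it says little about understanding.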

Thirdly, we know a person understands something when he or she can present the knowledge from many different perspectives and draw relationships with other pieces of knowledge. For instance, Alice is told what an apple is, and from there Alice knows that it can be made into apple juice or a pie, that it appears in the story of Snow White and the Seven Dwarfs, and so on. The Chinese call this being able to "举一反三": from one piece of knowledge, he or she can relate it to another three pieces of knowledge. Most will agree this is the best way to see if someone understands... but again, how do we measure it? What would be a more objective metric?
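As one possible starting point, and not a real answer to the question, here is a rough sketch of what a "举一反三" score might look like: count how many distinct associations a person states that also appear in a reference set. Everything here is an assumption for illustration, including the function name `association_score` and the hand-written reference set for "apple".

```python
def association_score(stated, reference):
    """Count distinct stated associations that also appear in the reference set."""
    def normalise(s):
        return s.strip().lower()

    stated_set = {normalise(s) for s in stated}
    reference_set = {normalise(r) for r in reference}
    return len(stated_set & reference_set)


# Hypothetical reference associations for "apple" (chosen by hand).
reference = {"apple juice", "apple pie", "snow white"}

# Alice repeats "apple pie", but duplicates are only counted once.
alice = ["Apple juice", "apple pie", "Snow White", "apple pie"]

print(association_score(alice, reference))  # 3
```

The weakness is easy to see: someone still has to decide what goes into the reference set, so the subjectivity has only been moved, not removed.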

So these are my thoughts so far. I am keen to think it through more when I have more time, but for now this is what I can think of. What are your thoughts on this?

