Tuesday, November 29, 2005

Google and Genetics Research

A recent article ( http://www.timesonline.co.uk/article/0,,2095-1892323,00.html ) highlights Google's possible entry into the field of genetics data mining.

The article correctly points out that in genetics and biology there is a very real "islands of information" problem that makes just accessing data painful. Google apparently wants to bring its search technology to bear so that people can search on some biological term(s) and get useful results back from these various sources. While this is a decent first step, I see the problem as being much more complicated than this simplistic approach suggests.

Biological data comes in many forms. There is the Gene Ontology, which is hierarchical. There is the KEGG pathway database, which is more akin to nodes and links. There is the sequence database, which lends itself to custom chromosome viewer graphics. The list goes on and on. Much of this data lends itself to specific visual/graphical views that may be distinct to the data type. But these various data sources represent atoms of information that are not independent...they are related to each other. Sequence information denotes the underpinnings of genes, genes produce proteins/enzymes that are in pathways, and genes are also characterized by function in the Gene Ontology. The point I am making is that for this information to be manageable and digestible to the researcher, a seamless graphical interface has to be layered on top of the data. This interface must allow users to move easily between these disparate sources and piece together the puzzle of what a SNP/Gene/Protein is doing. I don't see Google solving this problem soon...the graphics are challenging, as is the data integration problem. There are well over 100 significant biological databases that could be utilized in the manner described above.
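To make the "related atoms of information" point concrete, here is a minimal Python sketch of moving from a gene to its protein, pathways and ontology terms across separate sources. Everything in it is an assumption made for illustration: the identifiers (geneA, protA, pathway1, GO:term1) are placeholders rather than real database entries, each source is reduced to a small dictionary, and the ontology is treated as a simple tree even though the real Gene Ontology allows a term to have several parents.

```python
# Toy, hand-made records. The identifiers are illustrative placeholders,
# not real entries from the Gene Ontology, KEGG, or any sequence database.
GENE_TO_GO_TERMS = {"geneA": ["GO:term1", "GO:term2"], "geneB": ["GO:term2"]}
GENE_TO_PROTEIN = {"geneA": "protA", "geneB": "protB"}
PROTEIN_TO_PATHWAYS = {"protA": ["pathway1"], "protB": ["pathway1", "pathway2"]}
GO_PARENT = {"GO:term1": "GO:root", "GO:term2": "GO:term1"}  # child -> parent


def go_ancestors(term):
    """Walk the child -> parent links up to the root of the toy hierarchy."""
    chain = []
    while term in GO_PARENT:
        term = GO_PARENT[term]
        chain.append(term)
    return chain


def gene_summary(gene):
    """Join the separate 'islands' into one record for a single gene."""
    protein = GENE_TO_PROTEIN.get(gene)
    terms = GENE_TO_GO_TERMS.get(gene, [])
    return {
        "gene": gene,
        "protein": protein,
        "pathways": PROTEIN_TO_PATHWAYS.get(protein, []),
        "go_terms": terms,
        "go_ancestors": sorted({a for t in terms for a in go_ancestors(t)}),
    }


if __name__ == "__main__":
    print(gene_summary("geneB"))
```

The hard part in practice is not the join itself but agreeing on identifiers across 100+ databases, which this sketch simply assumes away.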

The problem gets harder when you realize that almost all of the real data out there that is useful to researchers is stored in research papers. I can tell you that this is not a task for your mother's NLP system to address. No language processor has yet been able to comb through thousands of research papers and reliably determine what each paper contains. Until this information is successfully mined, researchers will continue to be constrained by this bottleneck. I am rather dismayed that no standards body has stepped forward to mandate an XML-type standard that would accompany every new paper and allow for easy mining of papers' content. Perhaps I should undertake to do this :)
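As a rough illustration of what such a machine-readable companion file might buy us, here is a small Python sketch. The XML layout (paper, finding, subject, relation, object) is entirely made up for this example and is not an existing standard, and the gene and pathway names are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical companion record for one paper. The element names and the
# contents are invented for illustration only.
PAPER_XML = """
<paper id="example-1">
  <title>Hypothetical paper title</title>
  <finding>
    <subject type="gene">geneA</subject>
    <relation>upregulates</relation>
    <object type="gene">geneB</object>
  </finding>
  <finding>
    <subject type="protein">protB</subject>
    <relation>participates_in</relation>
    <object type="pathway">pathway1</object>
  </finding>
</paper>
"""


def extract_findings(xml_text):
    """Return each finding as a (subject, relation, object) triple."""
    root = ET.fromstring(xml_text.strip())
    return [
        (f.findtext("subject"), f.findtext("relation"), f.findtext("object"))
        for f in root.iter("finding")
    ]


if __name__ == "__main__":
    for triple in extract_findings(PAPER_XML):
        print(triple)
```

With something like this attached to every paper, mining the literature would become a parsing problem rather than a natural language understanding problem.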

Lastly, but not leastly, what really is needed is to take all this aggregated information and apply some real AI to it. We don't need superhuman, or even human level, AI here...we just need a very good and accurate inference system. The depth and breadth of data that resides out "there" tells me that there are discoveries to be made by looking collectively at these many sources and inferring relations that weren't previously known. A human could do the same, but the sheer volume of data, in many different places and in many different forms, makes it laborious at best, and intractable at worst. This is a perfect application for a competent AI inference system.
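To give a flavor of the simplest kind of inference I mean, here is a toy Python sketch that chains facts drawn from different sources and proposes links that no single source states. It is an assumption-laden illustration, not the design mentioned below: the facts, the source labels and the single chaining rule are all invented, and a real system would need typed relations and some measure of confidence.

```python
# Each fact is (from_entity, to_entity, source). All names are placeholders.
FACTS = [
    ("geneA", "protA", "sequence_db"),
    ("protA", "pathway1", "pathway_db"),
    ("pathway1", "phenotypeX", "literature"),
]


def infer_indirect_links(facts):
    """Return (a, c) pairs where a -> b and b -> c are known but a -> c is not.

    This is a single step of chaining, so longer chains would need the
    function to be applied repeatedly.
    """
    known = {(a, b) for a, b, _source in facts}
    successors = {}
    for a, b in known:
        successors.setdefault(a, set()).add(b)

    inferred = set()
    for a, b in known:
        for c in successors.get(b, ()):
            if c != a and (a, c) not in known:
                inferred.add((a, c))
    return inferred


if __name__ == "__main__":
    for link in sorted(infer_indirect_links(FACTS)):
        print("candidate link:", link)
```

Each proposed pair is only a candidate for a researcher to check, which is the point: the machine does the laborious cross-referencing and the human judges the result.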

I have sketched out a design to do all of the above, and even more. The problem is resources: time and money. I hope someday to have enough of both to implement this system... I firmly believe that it would have a major impact on research in the biological field.

Cheers,
Kevin Cramer

Review of the Librie "Electronic Book" Reader

Matt Bamberger has posted an interesting review of the Librie electronic book reader here:

http://www.mattbamberger.com/Main/updates/TheSonyLibrie

This is definitely a technology whose time WILL come, and Matt makes a decent argument that its time of arrival is near in terms of technology -- though there is not yet enough content published in Librie format to make it worthwhile for most people to buy one.

-- Ben Goertzel

Sunday, November 27, 2005

Review of "The Mind and the Brain" by Jeffrey Schwartz

I just read a fairly interesting book called

The Mind and the Brain: Neuroplasticity and the Power of Mental Force

by Jeffrey Schwartz and Sharon Begley.

The interesting part of the book is the wealth of biological examples, which illustrate the powerful ways in which the human brain can modify its own structure during adult life. When I first studied neuroscience in the 1980s we were taught that the brain can't grow new neurons or synapses after childhood. I was always skeptical of this, and now it turns out that the old wisdom was false: brains can and do grow new neurons and synapses during adulthood, and this is a significant aspect of the way humans learn over their lifetimes.

The lead author is an expert on the treatment of obsessive-compulsive disorder and gives interesting examples of how appropriate therapies allow patients to overcome OCD via neural restructuring (which can be observed via brain scanning).

The frustrating part of the book is the end where the author draws on Stapp's ideas to argue that neural restructuring is a consequence of quantum dynamics in the brain -- that this restructuring is caused by some quantum-enabled "force of will". None of the biological material presented seems to demand this kind of recourse to quantum magic, and it weakens the book considerably (although from the author's perspective it's one of the main points of the book!).

(A concise version of my own current view on the relation between quantum theory and consciousness is here:
http://www.goertzel.org/blog/2005/10/quantum-theory-and-consciousness.html)

IBM's new Cell processor and its potential uses for AI

The following link

http://www.ibm.com/developerworks/power/library/pa-fpfunleashing/?ca=dgr-lnxw01CellUnleash

contains some interesting information on IBM's new Cell processors (used inside the PlayStation 3).

I am left wondering whether the Cell Broadband Engine could potentially be effective for Genetic Programming learning. (In Novamente we use an evolutionary learning algorithm different from GP, but it's similar enough that if the CBE can do GP, then it can do our algorithm as well.)

I am curious for the opinion of others who are more knowledgeable about such things. (I'm not really a hardware guy.)

Looking over the article referenced, it seems to me that the PPE (the main processing unit, a PowerPC variant) could be used to run the main GP algorithm, and then fitness evaluations could be carried out on the 8 SPEs ("synergistic processing elements") in parallel.

For Novamente-relevant cases (as opposed to mathematical optimization problems) this would require what the above article calls a "Large single-SPE programming model", meaning that the SPE would need to access main memory to do its fitness evaluation. (Because for Novamente learning, fitness evaluation of evolved programs has to do with comparison of programs against fairly large databases of experientially and inferentially acquired knowledge.)
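Here is a minimal Python sketch of the division of labor I have in mind, with heavy caveats. It is not Cell code and not Novamente's algorithm: a thread pool with 8 workers merely stands in for the SPEs (real speedups in Python would need processes, and on the Cell would need SPE-specific code), the candidates are simple (a, b) number pairs rather than evolved programs, and the shared DATASET is a made-up stand-in for the knowledge base that fitness evaluation would read from main memory.

```python
import random
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 8  # stands in for the 8 SPEs; an assumption of this sketch

# Shared data that every fitness evaluation reads, standing in for the
# knowledge base in main memory. The hidden target relation is y = 3x + 1.
DATASET = [(x, 3 * x + 1) for x in range(20)]


def fitness(candidate):
    """Negative squared error of y = a*x + b on DATASET (higher is better)."""
    a, b = candidate
    return -sum((a * x + b - y) ** 2 for x, y in DATASET)


def mutate(candidate, rng):
    """Return a randomly perturbed copy of a candidate."""
    a, b = candidate
    return (a + rng.gauss(0, 0.5), b + rng.gauss(0, 0.5))


def evolve(generations=30, population_size=32, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5))
                  for _ in range(population_size)]
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        for _ in range(generations):
            # "PPE" role: farm the fitness evaluations out to the workers.
            scores = list(pool.map(fitness, population))
            ranked = [c for _, c in
                      sorted(zip(scores, population), reverse=True)]
            survivors = ranked[: population_size // 4]
            # Keep the survivors unchanged and refill with mutated copies.
            population = survivors + [
                mutate(rng.choice(survivors), rng)
                for _ in range(population_size - len(survivors))
            ]
        final_scores = list(pool.map(fitness, population))
    best_score, best = max(zip(final_scores, population))
    return best, best_score


if __name__ == "__main__":
    best, score = evolve()
    print("best candidate:", best, "fitness:", score)
```

The main loop stays serial and cheap while all the expensive work sits inside fitness(), which is why the scheme seems a natural fit for one control core plus several worker cores.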

A downside is that the PS3 pairs the Cell with only 256MB of main RAM. This doesn't seem to be a fundamental obstacle to GP applications, but it means care would have to be taken in coding/design.... In the application I envision, most of the RAM would be taken up by the set of data against which the candidate programs are compared during the fitness evaluation process.

Of course, the basic idea is that if this worked it would be much cheaper to buy PS3s than 8-processor PCs, so a much larger evolutionary learning farm could be constructed on a relatively modest budget.

Hmmmm...

-- Ben Goertzel

Thursday, November 03, 2005

Chimps may lack altruism

This article

http://www.thestate.com/mld/thestate/news/nation/13007022.htm

reports research showing that chimpanzees, at least in a laboratory setting, lack the altruistic impulse that characterizes some humans.

To quote:

"
The experiment gave the animals the opportunity to pull a lever and provide treats for chimps in adjacent cages — without receiving anything in return and at no cost to themselves.

No cost. No benefit. Sorry, Bonzo, no banana.
"

If this holds up it has interesting implications regarding the evolution of human altruism.

I was reminded however of some experiments I read about in the book "The Curse of the Self" this morning (it's a decent though not awesome book, with a lot of interesting research-psychology tidbits assembled in favor of its Buddhistic theme on the dangers associated with the psychological construct called "self"). In these experiments, a set of people was divided into two groups by a series of coin tosses, and then informed about who was in their group and who was not. Then, people were asked questions about the individuals in their group and in the other group -- and systematically people rated individuals in their (randomly selected!) group higher on various scales than people in the other group. This sort of finding has been replicated repeatedly and seems pretty robust.

Maybe if you divided the chimps into randomly selected groups and let them know who was on their team, then the chimps in each group would be willing to pull the levers to give each other bananas! Hmmm....

It is nice to know that we are more ethically advanced than chimps, but unfortunately I don't think we've advanced all that far, given the society I see around me....

But at least we do have the notion of balancing selfishness with altruism -- i.e. of balancing the good of our own individual system with the good of the larger systems in which we're embedded. We have a hard time figuring out how to carry out this balancing act, but even choosing to carry out such a balancing act at all is progress beyond our chimp forebears, it would seem.

And I do call this "progress" intentionally -- I think that it's progress not only in the narrow sense of agreeing better with human value systems, but also in the broader sense that it leads to more complex and interesting structures than the chimp way. Altruism, kept within appropriate bounds, promotes the emergence of complex inter-organismic social structures, which support things like mathematics, literature, books, articles, experiments and blogs. Some libertarians have argued that pure selfishness would lead to more complex and productive emergent inter-human structures, but I tend not to believe it. My guess is that, if the emergence of interesting social and cultural systems is the goal, there is some optimal level of altruism which is between zero and maximal.