Last year a new coordinate system for the mouse brain, named Waxholm Space (WHS), was published (see “WHS: The Standard Mouse Brain Coordinate System?”). The space was named after the Swedish city Waxholm where, in February 2007, a group of scientists assembled through the International Neuroinformatics Coordinating Facility (INCF) discussed what they might do to help coordinate mouse brain research data. They decided to establish a reference atlas of the mouse brain that would act as a universal coordinate system for mouse data. A new paper, “Digital Atlasing and Standardization in the Mouse Brain” (published February 3, 2011 in PLoS Computational Biology), describes the INCF Digital Atlasing Infrastructure team’s goal: a framework that not only enables interoperability between existing and future mouse data resources but also provides tools for discovering and publishing data aggregated from distributed resources.
Figure 1. The International Neuroinformatics Coordinating Facility (INCF) Digital Atlasing Infrastructure enables interoperability between existing and future mouse brain data resources. Figure 3 from “Digital Atlasing and Standardization in the Mouse Brain” by Michael Hawrylycz, Richard A. Baldock, Albert Burger, Tsutomu Hashikawa, G. Allan Johnson, Maryann Martone, Lydia Ng, Chris Lau, Stephen D. Larson, Jonathan Nissanov, Luis Puelles, Seth Ruffins, Fons Verbeek, Ilya Zaslavsky and Jyl Boline. PLoS Computational Biology 7(2), February 3, 2011.
Providing a common mouse brain coordinate system was just one step toward these goals. The team also needed to provide a collection of distributed services to support the publication, discovery, and aggregate use of different distributed atlas resources. A prototype version of the INCF Digital Atlasing Infrastructure has been released and supports mapping between the WHS reference space and several existing online atlas resources.
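To make the idea of mapping between reference spaces concrete, here is a minimal sketch of carrying a coordinate from one atlas space to another with an affine transform. The matrix values, space names, and helper function are hypothetical placeholders; the actual infrastructure publishes registered (often nonlinear) transforms through web services rather than hard-coded matrices.

```python
import numpy as np

# Hypothetical 4x4 affine transform from "WHS" to another atlas space.
# These numbers are made up for illustration only.
WHS_TO_OTHER = np.array([
    [1.02,  0.00, 0.00, -0.35],   # scale/shear plus translation (mm)
    [0.00,  0.98, 0.03,  1.10],
    [0.00, -0.03, 0.97, -0.80],
    [0.00,  0.00, 0.00,  1.00],
])

def map_point(affine, xyz_mm):
    """Apply a homogeneous affine transform to a 3-D coordinate in mm."""
    x, y, z = xyz_mm
    mapped = affine @ np.array([x, y, z, 1.0])
    return tuple(mapped[:3])

# Example: a (hypothetical) WHS coordinate mapped into the other space.
print(map_point(WHS_TO_OTHER, (2.5, -1.0, 3.2)))
```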
Kudos to the coordinating organization INCF and all of the people from organizations around the world that are working hard together to make a global atlas-based data sharing framework a reality! Their work will have a huge impact on the future of neuroscience, including the application of research data for medical purposes, and should be supported by everyone.
We’ve looked at some recent papers that take rate coding to task and argue that individual action potentials and their precise timing are important for signal processing in the brain. The recent paper “Sensitivity to perturbations in vivo implies high noise and suggests rate coding in cortex” (published July 1, 2010 in Nature) provides support for rate coding in the cerebral cortex. In fact, it goes further and states that the evidence rules out the importance of individual spikes and their precise timing for signal processing in the cerebral cortex.
But what does that have to do with paper bloat? It just so happens that this six-page paper has 42 pages of supplementary material associated with it. This is not unusual, at least not for the types of papers discussed in this blog. A six-page paper discussed two days ago had 88 pages of supplementary material, and a paper discussed last November had 177 pages of supplementary material. Each paper is becoming a book. How can we, with finite life spans, keep up?
The nature of the journal article must change. Supplementary material is a necessity, but instead of each article sitting on top of a submerged, mammoth-sized text, it will be linked (using Semantic Web technologies) to visualizations, simulations, and other high-level methods for efficiently conveying large masses of information. Naturally, interested parties will be able to drill down into the text and equations, but they’ll also be able to absorb much of the information by working with the products of the study rather than reading about them.
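To give a flavor of what linking an article to its products with Semantic Web technologies might look like, here is a minimal sketch using Python’s rdflib. The URIs and predicate names are hypothetical placeholders, not an established publishing vocabulary.

```python
from rdflib import Graph, Namespace, URIRef, Literal

# Hypothetical namespace and resource URIs, for illustration only.
EX = Namespace("http://example.org/publishing#")
article = URIRef("http://example.org/articles/perturbation-study")
model = URIRef("http://example.org/models/layer5-pyramidal")
dataset = URIRef("http://example.org/data/in-vivo-recordings")

g = Graph()
g.bind("ex", EX)

# Link the article to the resources a reader would want to drill into.
g.add((article, EX.hasModel, model))
g.add((article, EX.hasDataset, dataset))
g.add((article, EX.supplementaryPages, Literal(42)))

# Serialize the links so other tools (viewers, simulators) can follow them.
print(g.serialize(format="turtle"))
```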
There is another reason to bring up the urgent need for a change in publishing practices in connection with this article. The authors base some of their work on an existing model that runs in NEURON and is available from the SenseLab ModelDB repository (Spike Initiation in Neocortical Pyramidal Neurons (Mainen et al 1995)). However, they did not publish their own models to an open repository. That’s the least they should have done to help readers evaluate their conclusions. They should also have published their physiological data to an open data repository.
Fundamentally, the team asked if small perturbations to spiking activity in cortical networks are amplified. Here’s what they found:
A perturbation consisting of a single extra spike in one neuron produced approximately 28 additional spikes in its postsynaptic targets (a back-of-envelope sketch of how this can add up follows these findings).
A single spike in a neuron produced a detectable increase in firing rate in the local network.
The observed amplification led to intrinsic, stimulus independent variations in membrane potential of the order of 2.2 to 4.5 millivolts.
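To build some intuition for how a single extra spike could translate into roughly 28 more, here is a back-of-envelope expected-value sketch. The fan-out and per-connection probability below are hypothetical numbers chosen only to show the order of magnitude; they are not the paper’s measured values.

```python
# Back-of-envelope: expected extra spikes caused downstream by one added spike.
# Both numbers below are assumptions for illustration, not measurements.
n_postsynaptic_targets = 750        # hypothetical fan-out of a cortical neuron
p_extra_spike_per_target = 0.037    # hypothetical chance one EPSP adds a spike

expected_extra_spikes = n_postsynaptic_targets * p_extra_spike_per_target
print(f"Expected extra spikes: {expected_extra_spikes:.1f}")   # ~27.8
```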
Their conclusions hinge on the idea that a well-defined perturbation resulted in stimulus-independent variations in membrane potential. How can a controlled perturbation produce variations that are nonetheless independent of the stimulus? The authors state that the variations in membrane potential “are pure noise, and so carry no information at all.” They go on to conclude that “for the brain to perform reliable computations, it must either use a rate code, or generate very large, fast depolarizing events, such as those proposed by the theory of synfire chains.” They follow this up by recording activity from layer 5 pyramidal cells in somatosensory cortex and state that their “findings are consistent with the idea that cortex is likely to use primarily a rate code.” They reached this conclusion because they found that large, fast depolarizing events were very rare.
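As a toy illustration of the difference between a rate code and a timing-sensitive code (this is not the paper’s analysis), the sketch below builds two spike trains with the same spike count in a window but different spike times. A rate readout cannot tell them apart, while a coincidence readout against a fixed reference train generally can. All parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
window_ms = 200.0
n_spikes = 20

# Two trains with identical rates (same count in the window) but different timing.
train_a = np.sort(rng.uniform(0, window_ms, n_spikes))
train_b = np.sort(rng.uniform(0, window_ms, n_spikes))
reference = np.sort(rng.uniform(0, window_ms, n_spikes))

def rate_readout(train):
    """Rate code: only the spike count in the window matters."""
    return len(train) / (window_ms / 1000.0)   # spikes per second

def coincidence_readout(train, ref, tol_ms=2.0):
    """Timing-sensitive readout: spikes falling within tol_ms of a reference spike."""
    return sum(np.any(np.abs(ref - t) <= tol_ms) for t in train)

print(rate_readout(train_a), rate_readout(train_b))        # identical: 100.0 100.0
print(coincidence_readout(train_a, reference),
      coincidence_readout(train_b, reference))             # generally differ
```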
The question of signal processing in the brain is fundamental so these conclusions warrant a careful look and deep consideration. Over the next couple of days I’ll post closer looks at the work reported in 48 pages of journal article and supplementary material.
Figure 1. The frequency of use of the words genomic and holonomic in English language books published from 1880 through 2008. Smoothing is set to zero to show raw data results.
A quick search for the words genomic and holonomic in books from 1880 through 2008 offers a familiar lesson from history (see Figure 1 above): the words have been in use for a lot longer than I expected. Notice the little bumps as far back as the late 1890s. But how were the words used?
First let’s consider the accuracy of the data. When I look at books cited as containing the word genomic before the 1930s, most if not all of the citations are errors. The great majority are mistakes in optical character recognition: many of the late 19th century hits come from misreadings of the French word generale, but also of aenemic, economics, and Cenozoic. Some of the errors are due to wrong dates; for example, a book from 1982 may be listed as from 1882. This changes around the 1930s. For example, a 1939 lecture given by Richard Goldschmidt stated, “The facts reported indicate differences between species which are on a chromosomal level and, maybe, frequently even on a genomic level.” This was published in a 1940 book titled “The Material Basis of Evolution.”
Note: A 1-gram is a string of characters uninterrupted by a space. An n-gram is a sequence of 1-grams, such as the phrases “holonomic brain” (a 2-gram) and “the neuron doctrine” (a 3-gram). Usage frequency (y-axis in graphs by Google’s Ngram Viewer) is computed by dividing the number of instances of an n-gram in a given year by the total number of n-grams in the corpus in that year.
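As a small worked example of that frequency computation, here is a sketch using made-up counts (the real counts come from Google’s n-gram dataset); it also converts the fraction into the percentage scale that the Ngram Viewer plots on its y-axis.

```python
# Hypothetical counts for one year; real values come from Google's n-gram data.
count_of_ngram = 300                 # occurrences of, say, "genomic" in that year's books
total_ngrams_in_year = 100_000_000   # all 1-grams in the corpus for that year

frequency = count_of_ngram / total_ngrams_in_year   # fraction of all 1-grams
percent = frequency * 100                           # what the Ngram Viewer plots

print(f"{frequency:.9f} as a fraction, i.e. {percent:.6f}% of all words")
# 0.000003000 as a fraction, i.e. 0.000300% of all words
```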
The y-axis in Figure 1 shows the search word’s percentage of all the words (1-grams; see note above) in all the books published in English that are currently part of the database. Even when genomic becomes relatively common, the word peaks at about 0.000300% of all words, meaning it occurs three ten-thousandths of one percent of the time in English language books. The little bumps between 1930 and 1960 are far smaller: less than 0.00000050% (five ten-millionths of one percent) during most of the 1930s and less than 0.00000250% (twenty-five ten-millionths of one percent, or five times more) by 1960. These little bumps for genomic aren’t even detectable in the graph shown in Figure 1, but a high percentage of them reflect real usage (rather than errors) in ways similar to how the word is used today. Around 1970 the use of genomic becomes detectable in Figure 1 at 0.00000700%, or seven millionths of one percent.
Figure 2. The 2-gram holonomic brain appears in English language books beginning around the publication of “Brain and Perception” on June 1, 1991. Smoothing is set to zero to show raw data results.
Interestingly, even though the word holonomic has remained rare, the early references actually pan out as genuine rather than errors. For example, a search covering 1902 through 1905 returned a set of 16 books containing the word holonomic, and all 16 citations were correct. On the other hand, the word is so rare in the corpus as to be barely detectable, at less than 0.00000040%, or four ten-millionths of one percent. Holonomic was defined as “a dynamical system for which a displacement represented by arbitrary infinitesimal changes in the coordinates is in general a possible displacement” in the 1904 book “A Treatise on the Analytical Dynamics of Particles and Rigid Bodies” by Edmund Taylor Whittaker. The word was not used in relation to the brain but in mathematical definitions of specific types of dynamical systems. It wasn’t until around the publication of “Brain and Perception” on June 1, 1991 that the 2-gram holonomic brain appeared in the literature (see Figure 2 above).
All of this points to how fun the tools and data set presented in the paper “Quantitative Analysis of Culture Using Millions of Digitized Books” can be. The paper states that over 15 million books (about 12% of all books ever published) have been digitized by Google so far. The authors carried out some cultural investigations on a subset of those data containing 5,195,769 books (about 4% of all books ever published).
Note: Those interested in research methods and other details should download the supporting online material for this article, an 88-page PDF file available from the journal’s website.
Mass access to our published heritage is a positive development. However, even the most voracious reader can read only an extremely small fraction of published books and literature. As the authors said in the paper, “If you tried to read only English-language entries from the year 2000 alone, at the reasonable pace of 200 words/min, without interruptions for food or sleep, it would take 80 years.”
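A quick sanity check of the arithmetic behind that quote, assuming non-stop reading at 200 words per minute; the implied total of roughly 8.4 billion words is an inference from the quote, not a figure taken from the paper.

```python
# Reading non-stop at 200 words per minute for 80 years:
words_per_minute = 200
minutes_per_year = 60 * 24 * 365
words_in_80_years = words_per_minute * minutes_per_year * 80
print(f"{words_in_80_years:,} words")   # 8,409,600,000 -> roughly 8.4 billion words
```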
How will we, as finite beings, be able to keep up? Even within our areas of special interest? Clearly twenty-first century breakthroughs will be about extending our capabilities through automated knowledge acquisition. That’s where the Semantic Web comes in.