Tag: Semantic Web

  • Finding the Right Neuron: Introducing Protégé, OWL, and the Neuron Phenotype Ontology

    Figure 1. Details of two layer 2 pyramidal neurons sitting side-by-side from a sample of around 50,000 neurons in a cubic millimeter of human cortex scanned at electron microscopic resolution and provided in a publicly available database.

    It’s been three years since I last wrote to you here. I’ve been busy in theoretical and computational neuroscience after several years focused on the semantic web and artificial intelligence technologies company I founded in 2001. My big surprise is that neuroscientists continue to painstakingly search for and extract data from research papers by hand!

    Of course there will always be a need to read and pull data and ideas straight from research papers, but I thought that by now, 20 years later, easy-to-use web pages would provide globally aggregated brain research data. Not yet. There is exciting linked-data-based work going on, and brain researchers with a sophisticated working understanding of semantic web technologies use a wide range of useful ontologies, but there appears to be a large gap in tools for those not steeped in this knowledge. Neuroscientists don’t have easy, rapid access to data answering, for example, the question, “What is the average spike rate of pyramidal neurons in layer 5B primary motor cortex?”

    The infrastructure that enables expert searches through piles of data is everywhere. Google, Apple, and the Allen Institute for Brain Science are just a few of the institutions that use semantic web technologies (ontologies, reasoners, linked-data) but these technologies remain mostly hidden and unrecognized by the public. Indeed, most programmers are at best only vaguely aware of them.

    Speed Date with Semantic Web Technologies

    Let’s take a look at semantic web technologies in the brain sciences today by diving into the important new Neuron Phenotype Ontology. The Neuron Phenotype Ontology was created to help find and categorize neurons with particular sets of traits (phenotypes).

    Figure 2. Protégé loaded with the Neuron Phenotype Ontology.

    To follow along, download the latest Protégé version and set up the ELK Reasoner (see the Protégé Setup with ELK Reasoner box below). Then download the Neuron Phenotype Ontology OWL file from the National Center for Biomedical Ontology BioPortal (see the Neuron Phenotype Ontology OWL File box below).

    Protégé Setup with ELK Reasoner

    1. To download the ELK Reasoner, select Check for plugins … from the Protégé File menu and the Automatic Update window will pop up.

    2. Check the Install box next to ELK Reasoner and click on the window’s Install button.

    3. Select the ELK Reasoner by opening the Protégé Reasoner menu and selecting ELK Reasoner.

    4. Run the reasoner by opening the Protégé Reasoner menu and selecting Start Reasoner.

    Ideally, an ontology has access to a world of linked-data through the Internet. The Neuron Phenotype Ontology is in its infancy, so it’s available as an evaluation version with a relatively small set of data from three sources (Henry Markram’s Blue Brain Project, the Allen Institute for Brain Science, and Josh Huang’s laboratory). With limited links and data, just any search won’t do. For example, most of the linked-data are from visual and somatosensory cortices; there are a small number of neurons from auditory cortex, and there are no neurons from motor cortex. With this in mind, let’s take a look.

    Neuron Phenotype Ontology OWL File

    1. On the Neuron Phenotype Ontology page in the BioPortal, look for the Submissions section, find the latest version (top), and click on the OWL link under Downloads. The npo.ttl file will download to your computer.

    2. In Protégé, open the File menu and select Open. Find and load the npo.ttl file.

    A Brief Tour of Neuron Phenotype Ontology

    We can do a simple query but first a very brief tour of our ontology so that you may start building your own queries. Select the Entities tab at top left and you see several tabs below. Open the Classes sub-tab. These are the things, in OWL known as classes, that you may work with in the Neuron Phenotype Ontology. You may only see the top thing owl:Thing. Select the > to the left of owl:Thing and a whole sub-tree of classes should appear. Labels for many of these may not make sense to the human reader but are important classes for machine reasoning.

    Figure 3. Searching for morphological phenotype in the Protégé Search window.

    Let’s find an important human readable class in this ontology using Protégé search. Click on the Search… button in the far upper right. The Search window pops up (see Figure 3 above). Type in morphological phenotype. At the top of the resulting list should be the class, represented by an orange circle at left, named ‘Morphological phenotype’. The blue rectangles represent properties. More on these in a minute. Double click on the ‘Morphological phenotype’ class. Move or close your Search window and you’ll see that a sub-tree opened up in your Classes sub-tab list and the class Morphological phenotype is highlighted. Searching is the best way to get around in an ontology.

    Look at the list of sub-classes under the Morphological phenotype class (select the > at left to see them). These are the pre-defined cell morphology phenotypes currently in the Neuron Phenotype Ontology. Also notice that there are many other classes under the high-level Phenotype class that break out neuron phenotypes in different ways, including axon, cell, dendrite, and electrophysiological. You can use these classes to find the data you want.

    To carry out a query you’ll need properties in addition to classes. You can think of classes as things in the world. Physical or mental things. Properties are, well, properties of those things. A bicycle is a thing with a color property. A blue bicycle. Blue is a thing that is a property of the bicycle. Likewise, a morphological phenotype is a thing but it is also a property of cells, including neurons. Notice that listed under the ‘Morphological phenotype’ class (yellow circle) in Figure 3 above is the hasMorphologicalPhenotype property (blue rectangle). Now you’re ready to query the data for a particular neuron phenotype.

    Query for Neurons with Dendritic Spines

    Many of the most complex neurons in cortex have dendrites with a lot of spines on them, as in the image of two human cortical neurons in Figure 1 at top. Each spine has a synapse (yellow dot) with an incoming axon from another neuron. We will query for data about neurons with spiny dendrites. Select the DL Query tab in Protégé and type Neuron and hasMorphologicalPhenotype some 'Spiny phenotype' into the Query (class expression) text box. Use plain straight single quotes around the class label, not curly ones.

    Neuron is the class of cells you’re looking for with spiny morphology. That is, you’re looking for every neuron that has the morphological phenotype of dendrites with spines (‘Spiny phenotype’ is a sub-class of ‘Dendrite phenotype’).

    You should get around 454 results. Most, and perhaps all, of the data for this query were collected at the Allen Institute. Normally what you want a query to return is linked-data from around the world. In this case we are using a proof-of-concept version of the ontology with a small local dataset. Also, the classes you see in your results represent the data. Typically, end results provide links (URLs) to the actual data.
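    To make the class expression above a little more concrete, here is a minimal sketch in base R of what "Neuron and hasMorphologicalPhenotype some 'Spiny phenotype'" asks for: everything under Neuron that is also linked, through hasMorphologicalPhenotype, to Spiny phenotype or one of its sub-classes. The class names and links in the two small tables are made up for illustration and are not taken from the Neuron Phenotype Ontology. A real reasoner such as ELK does far more than this simple lookup.

    ```r
    # Toy subclass hierarchy. All rows are hypothetical illustration data.
    subclass_of <- data.frame(
      child  = c("Dendrite phenotype", "Spiny phenotype", "Aspiny phenotype",
                 "Densely spiny phenotype", "Neuron", "Pyramidal neuron",
                 "Stellate neuron", "Basket neuron", "Non-neuron cell"),
      parent = c("Phenotype", "Dendrite phenotype", "Dendrite phenotype",
                 "Spiny phenotype", "Cell", "Neuron",
                 "Neuron", "Neuron", "Cell"),
      stringsAsFactors = FALSE
    )

    # Toy restrictions: every member of `class` has some morphological
    # phenotype belonging to `filler`. Also hypothetical.
    has_morph <- data.frame(
      class  = c("Pyramidal neuron", "Stellate neuron",
                 "Basket neuron", "Non-neuron cell"),
      filler = c("Spiny phenotype", "Densely spiny phenotype",
                 "Aspiny phenotype", "Spiny phenotype"),
      stringsAsFactors = FALSE
    )

    # Return `root` plus every class below it in the hierarchy.
    descendants <- function(root) {
      found <- root
      frontier <- root
      while (length(frontier) > 0) {
        kids <- subclass_of$child[subclass_of$parent %in% frontier]
        frontier <- setdiff(kids, found)
        found <- c(found, frontier)
      }
      found
    }

    # "Neuron": anything at or below the Neuron class.
    neurons <- descendants("Neuron")

    # "hasMorphologicalPhenotype some 'Spiny phenotype'": any class whose
    # filler is Spiny phenotype or one of its sub-classes.
    spiny <- descendants("Spiny phenotype")
    with_spiny <- has_morph$class[has_morph$filler %in% spiny]

    # "and": keep only the classes that satisfy both parts.
    result <- intersect(neurons, with_spiny)
    print(result)
    ```

    In this toy data the non-neuron cell is linked to a spiny phenotype but drops out because it is not under Neuron, while the stellate neuron is kept because its phenotype is a sub-class of Spiny phenotype.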

    Conclusions

    Why do neuroscientists continue to painstakingly search for and extract data from research papers by hand when technology is available to automate much of the process? The number one reason is most likely the complexity of neuroscience data. A lot of work went into creating the Neuron Phenotype Ontology … and there is much still to do. Another probable reason is the work and expense that goes into building and maintaining the necessary web-based user friendly tools. Finally, there is the question of how much linked-data is openly available to the neuroscience community. I believe that there is a large and growing collection of neuroscience linked-data now openly available. Check back soon and I’ll share with you what I find.

  • R and RDF: Where Statistics and the Semantic Web Meet

    Many of the most exciting developments in information-based technologies today incorporate Semantic Web technologies. These include IBM’s Watson, Apple’s Siri (originally created and developed at SRI), and Google’s Search Engine. It’s not surprising, but perhaps unfortunate, to read in Egon Willighagen’s preprint “Accessing biological data in R with semantic web technologies” that “most new databases do not yet use semantic web technologies.”

    Note: R is a freely available and open source tool especially useful for interactive data analytics and visualization.

    The happy news is that Egon Willighagen has once again contributed to the community tool chest. While his article is targeted at a life sciences audience, the tool (a set of R packages) he has written, known as rrdf, is useful to anyone who wants to import RDF data into R.

    Want to start working with triples from within your R environment? Install the rrdf package:

    install.packages("rrdf")

    Installing the rrdf package also installs two dependencies: rJava and rrdflibs. The rrdf package provides RDF and SPARQL functionality through Apache Jena. The rrdflibs package contains the Apache Jena libraries which are written in Java. The rJava package provides an interface to Java so that Apache Jena may run. The rrdf package itself contains the R functions that wrap Jena functionality and convert data into the appropriate structures where needed.

    Load the rrdf package into your R environment:

    library(rrdf)

    Now you’re able to query your favorite triple store and pull triples into your R environment. Here we will query Live DBpedia for a list of 40 programming languages. First provide the URL for the SPARQL endpoint:

    endpoint <- "http://dbpedia.org/sparql"

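    Next provide the SPARQL query itself. Each triple pattern needs a subject, a predicate, and an object. The two predicates used here (DBpedia’s programmingLanguage property and rdfs:label) are an assumption about which links to follow, so swap in whichever predicates suit your question:

    # Assumed predicates: dbo:programmingLanguage links software to a language; rdfs:label gives its name.
    query <- "SELECT DISTINCT ?language WHERE { ?s <http://dbpedia.org/ontology/programmingLanguage> ?o . ?o <http://www.w3.org/2000/01/rdf-schema#label> ?language } LIMIT 40"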

    Finally, carry out the query using the sparql.remote() function and assign the results to a variable:

    data <- sparql.remote(endpoint, query)

    You’ve imported your first set of triple data into R!
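    Once the results are in R, the usual data tools apply. The sketch below assumes that sparql.remote() returns a character matrix with one column per SPARQL variable (here a single column named language). So that it runs without a network connection or the rrdf package, it builds a small stand-in matrix with made-up values in place of the real query result.

    ```r
    # Stand-in for the value returned by sparql.remote(endpoint, query).
    # Assumption: a character matrix with one column per SPARQL variable.
    # The values are made up for illustration.
    data <- matrix(
      c("Python", "Haskell", "Python", "Scala"),
      ncol = 1,
      dimnames = list(NULL, "language")
    )

    # Move the column into a data frame so standard R tools can be used.
    languages <- data.frame(
      language = data[, "language"],
      stringsAsFactors = FALSE
    )

    # Drop repeated labels and sort them alphabetically.
    unique_languages <- sort(unique(languages$language))
    print(unique_languages)

    # Count how often each label appears in the raw result.
    counts <- table(languages$language)
    print(counts)
    ```

    With a live result, replace the stand-in matrix with the output of sparql.remote() and leave the rest unchanged, adjusting the column name if your query uses a different variable.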

  • A Powerful Mouse Whole Brain Connectivity Atlas

    Figure 1. The authors hypothesize the above local circuit connections to and from the main olfactory bulb (MOB) based on data they added to the whole brain connectivity atlas. From Figure 8 in “Comprehensive connectivity of the mouse main olfactory bulb: analysis and online digital atlas” (published August 7, 2012 in Frontiers in Neuroanatomy).

    The recent paper “Comprehensive connectivity of the mouse main olfactory bulb: analysis and online digital atlas” (published August 7, 2012 in Frontiers in Neuroanatomy) introduced the first olfactory (sense of smell) mouse brain connectivity data available as an open resource in the Mouse Connectome Project online database. And not from just any mouse, but from the C57BL/6J mouse. The C57 black 6 mouse (C57BL/6J) is the most common genetic strain of mouse used in biomedical research today.

    The Mouse Connectome Project’s publicly available whole brain connectivity atlas of the C57BL/6J mouse is being created to help provide data for generating testable ideas (hypotheses) about local brain circuits, brain function, behavior, and disease. The project provides data through the following link: http://www.mouseconnectome.org/iConnectome (see Figure 2 below).

    Figure 2. This webpage is the access point to the Mouse Connectome Project’s data repository. The page lists the data and data characteristics which are viewed through the iConnectome Viewer (see text below).

    There is no direct access to the raw data that I can find. Instead a tool named the iConnectome Viewer is provided. The user selects the data set they want to work with by checking the check box in the left-most column titled “Show” and then clicking on the “VIEW” button above.

    The paper includes several figures displaying the data that led the authors to hypothesize the connections to and from the main olfactory bulb (MOB) shown above in Figure 1. The beauty of shared data access begins to become apparent on noticing the case numbers referencing the data included in each figure caption.

    Case numbers are listed in the Mouse Connectome Project data repository webpage in the column next to Show (see Figure 2 above). Type a case number into the Search box (under the “REFRESH” button at the far right and at the same level as the “VIEW” button) and the case list shrinks. I typed in case SW110403-01A from the paper’s Figure 7E-H into the search box, checked the check box next to the case number, and then clicked the “VIEW” button. An impressive looking viewer appeared, as shown in Figure 3 below.

    Figure 3. The iConnectome Viewer displays the data contained in the Mouse Connectome Project’s whole brain connectivity atlas.

    There are five main areas of the iConnectome Viewer. The navigation controls are grouped in the upper left-hand area. This area provides controls for zooming, panning, and refreshing the selected viewport (the viewer opens with one viewport but it may show up to four viewports at a time). Most navigation functions may also be accomplished through double-clicking, dragging, and other interface gestures. A thumbnail view of the section in the currently selected viewport is also shown here. There is one set of navigation controls for all open viewports.

    The data set and layer controls appear in the upper right portion of each open viewport. The case number of the data displayed in the viewport is displayed above a list of data types that may be layered one upon the other. The base layer that displays a standard Nissl stained section is selected by default. Above the base layer is the atlas layer which shows the appropriate section and data from the Allen Reference Atlas (ARA). The Allen Reference Atlas has become the standard high-resolution anatomic reference atlas accompanied by a systematic, hierarchically organized taxonomy of mouse brain structures created by the Allen Institute for Brain Science. The other layers are for each type of label used in the case study. A lot of important information for each section is displayed when you mouse over the blue information button “i” to the left of the case study number including the section coordinates.

    The section controls are grouped along the bottom of the iConnectome Viewer. The sagittal view of the brain at bottom left has a yellow vertical bar you may drag right and left to show sections from different coronal planes. The data layer buttons above the brain are the same as the data layers provided in each viewport. To the right is an array of coronal sections with the particular section you selected using the vertical bar shown with a yellow outline. You may click on a different coronal section and it will be outlined in yellow and the yellow vertical bar in the sagittal section at left will reposition appropriately.

    The viewport and menu tool bar is displayed along the top of the iConnectome Viewer. At left, above the navigation controls, are the buttons that control the layout of the viewports. From left to right they are the single, double, 2×2, and tab viewport layouts. The button to the right of the four viewport buttons is very important. It’s the synchronize viewports button. When synchronize is activated, any navigation action performed in one viewport is mirrored in the others.

    Figure 4. The Mouse Connectome Project’s data repository paired with the iConnectome Viewer provides a powerful mouse brain atlas for the 21st century.

    It’s time for you to take a test drive.

    • Click on the Dual View button in the viewport tool bar at top left.
    • Now click on the Link Viewports button.
    • Move the vertical yellow line to the right in the sagittal brain section at bottom left. The sections in the left and right viewports synchronize. In fact, they’re exactly the same because the data layers are both set to base.
    • To make this more interesting, in the right viewport click on the base button in the data set and layer controls area. The section should disappear.
    • Click on the atlas button. Now the right viewport should display the exact same part of the brain displayed in the left viewport except the section at right is from the Allen Reference Atlas and it includes a diagram with structure names.
    • Zoom in to take a closer look by double clicking in the viewport or by using the slider in the navigation area.
    • Make this even more interesting by selecting the fg (Fluorogold) data set in the left viewport.

    Your iConnectome Viewer should look similar to the one pictured in Figure 4 above. The Mouse Connectome Project’s whole brain connectivity atlas and iConnectome Viewer combine to provide a powerful mouse brain atlas for the 21st century. The main shortfall that I see is the apparent lack of public access to the raw data in the repository. Access to the raw data is essential for investigators to be able to go beyond the atlas and use machine processing to quantitatively analyze mouse brain connectivity.