R and RDF: Where Statistics and the Semantic Web Meet

Many of the most exciting developments in information-based technologies today incorporate Semantic Web technologies. These include IBM’s Watson, Apple’s Siri (originally created and developed at SRI), and Google’s search engine. It’s not surprising, but perhaps unfortunate, to read in Egon Willighagen’s preprint “Accessing biological data in R with semantic web technologies” that “most new databases do not yet use semantic web technologies.”

Note: R is a freely available and open source tool especially useful for interactive data analytics and visualization.

The happy news is that Egon Willighagen has once again contributed to the community tool chest. While his article is targeted at a life sciences audience, the tool (a set of R packages) he has written, known as rrdf, is useful to anyone who wants to import RDF data into R.

Want to start working with triples from within your R environment? Install the rrdf package:

install.packages("rrdf")

Installing the rrdf package also installs two dependencies: rJava and rrdflibs. The rrdf package provides RDF and SPARQL functionality through Apache Jena. The rrdflibs package contains the Apache Jena libraries which are written in Java. The rJava package provides an interface to Java so that Apache Jena may run. The rrdf package itself contains the R functions that wrap Jena functionality and convert data into the appropriate structures where needed.

Load the rrdf package into your R environment:

library(rrdf)

Now you’re able to query your favorite triple store and pull triples into your R environment. Here we will query Live DBpedia for a list of 40 programming languages. First provide the URL for the SPARQL endpoint:

endpoint <- "http://dbpedia.org/sparql"

Next provide the SPARQL query itself:

query <- "SELECT DISTINCT ?language WHERE { ?s ?o . ?o ?language } LIMIT 40"

Finally, carry out the query using the sparql.remote() function and assign the results to a variable:

data <- sparql.remote(endpoint, query)

You've imported your first set of triple data into R!
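
The query string above appears to have lost its predicate terms, possibly because angle-bracketed URIs were stripped when the page was rendered; each SPARQL triple pattern needs a subject, a predicate, and an object. Here is a minimal sketch of what a complete query might look like. The two predicate URIs (the DBpedia ontology property programmingLanguage and rdfs:label) and the helper build_language_query() are my assumptions for illustration, not necessarily what the original query used.

```r
# Hypothetical helper: assembles a complete query from assumed predicate URIs.
build_language_query <- function(limit = 40) {
  paste(
    "SELECT DISTINCT ?language WHERE {",
    # Assumed predicate: links a piece of software to its programming language.
    "  ?s <http://dbpedia.org/ontology/programmingLanguage> ?o .",
    # Assumed predicate: fetches the human-readable label of that language.
    "  ?o <http://www.w3.org/2000/01/rdf-schema#label> ?language",
    "}",
    sprintf("LIMIT %d", as.integer(limit)),
    sep = "\n"
  )
}

query <- build_language_query(40)
cat(query, "\n")

# Attempt the remote call only when rrdf is installed. The endpoint must
# also be reachable, so a failed request is turned into NULL.
if (requireNamespace("rrdf", quietly = TRUE)) {
  endpoint <- "http://dbpedia.org/sparql"
  result <- tryCatch(
    rrdf::sparql.remote(endpoint, query),
    error = function(e) NULL
  )
  if (!is.null(result)) print(head(result))
}
```

Building the query in a small function keeps the predicates in one place, so swapping in different ones is a one-line change.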

A Powerful Mouse Whole Brain Connectivity Atlas

A local brain circuit wiring diagram in the mouse olfactory bulb.
Figure 1. The authors hypothesize the above local circuit connections to and from the main olfactory bulb (MOB) based on data they added to the whole brain connectivity atlas. From Figure 8 in “Comprehensive connectivity of the mouse main olfactory bulb: analysis and online digital atlas” (published August 7, 2012 in Frontiers in Neuroanatomy).

The recent paper “Comprehensive connectivity of the mouse main olfactory bulb: analysis and online digital atlas” (published August 7, 2012 in Frontiers in Neuroanatomy) introduced the first olfactory (sense of smell) mouse brain connectivity data available as an open resource in the Mouse Connectome Project online database. And not from just any mouse, but from the C57BL/6J mouse. The C57 black 6 mouse (C57BL/6J) is the most common genetic strain of mouse used in biomedical research today.

The Mouse Connectome Project’s publicly available whole brain connectivity atlas of the C57BL/6J mouse is being created to help provide data for generating testable ideas (hypotheses) about local brain circuits, brain function, behavior, and disease. The project provides data through the following link: http://www.mouseconnectome.org/iConnectome (see Figure 2 below).

The Mouse Connectome Project's data repository.
Figure 2. This webpage is the access point to the Mouse Connectome Project’s data repository. The page lists the data and data characteristics which are viewed through the iConnectome Viewer (see text below).

There is no direct access to the raw data that I can find. Instead, a tool named the iConnectome Viewer is provided. The user selects the data set they want to work with by checking the check box in the left-most column titled “Show” and then clicking on the “VIEW” button above.

The paper includes several figures that display the data that led the authors to hypothesize the connections to and from the main olfactory bulb (MOB) shown above in Figure 1. The beauty of shared data access begins to become apparent on noticing the case numbers referencing the data included in each figure caption.

Case numbers are listed on the Mouse Connectome Project data repository webpage in the column next to Show (see Figure 2 above). Type a case number into the Search box (under the “REFRESH” button at the far right and at the same level as the “VIEW” button) and the case list shrinks. I typed case SW110403-01A from the paper’s Figure 7E-H into the search box, checked the check box next to the case number, and then clicked the “VIEW” button. An impressive-looking viewer appeared, as shown in Figure 3 below.

Figure 3. The iConnectome Viewer displays the data contained in the Mouse Connectome Project’s whole brain connectivity atlas.

There are five main areas of the iConnectome Viewer. The navigation controls are grouped in the upper left-hand area. This area provides controls for zooming, panning, and refreshing the selected viewport (the viewer opens with one viewport but it may show up to four viewports at a time). Most navigation functions may also be accomplished through double-clicking, dragging, and other interface gestures. A thumbnail view of the section in the currently selected viewport is also shown here. There is one set of navigation controls for all open viewports.

The data set and layer controls appear in the upper right portion of each open viewport. The case number of the data displayed in the viewport is displayed above a list of data types that may be layered one upon the other. The base layer, which displays a standard Nissl stained section, is selected by default. Above the base layer is the atlas layer, which shows the appropriate section and data from the Allen Reference Atlas (ARA). The Allen Reference Atlas, created by the Allen Institute for Brain Science, has become the standard high-resolution anatomic reference atlas and is accompanied by a systematic, hierarchically organized taxonomy of mouse brain structures. The other layers are for each type of label used in the case study. A lot of important information for each section, including the section coordinates, is displayed when you mouse over the blue information button “i” to the left of the case study number.

The section controls are grouped along the bottom of the iConnectome Viewer. The sagittal view of the brain at bottom left has a yellow vertical bar you may drag right and left to show sections from different coronal planes. The data layer buttons above the brain are the same as the data layers provided in each viewport. To the right is an array of coronal sections with the particular section you selected using the vertical bar shown with a yellow outline. You may click on a different coronal section and it will be outlined in yellow and the yellow vertical bar in the sagittal section at left will reposition appropriately.

The viewport and menu tool bar is displayed along the top of the iConnectome Viewer. At left, above the navigation controls, are the buttons that control the layout of the viewports. From left to right they are the single, double, 2×2, and tab viewport layouts. The button to the right of the four viewport buttons is very important. It’s the synchronize viewports button. When synchronize is activated, any navigation action performed in one viewport is mirrored in the others.

Figure 4. The Mouse Connectome Project’s data repository paired with the iConnectome Viewer provides a powerful mouse brain atlas for the 21st century.

It’s time for you to take a test drive.

  • Click on the Dual View button in the viewport tool bar at top left.
  • Now click on the Link Viewports button.
  • Move the vertical yellow line to the right in the sagittal brain section at bottom left. The sections in the left and right viewports synchronize. In fact, they’re exactly the same because the data layers are both set to base.
  • To make this more interesting, in the right viewport click on the base button in the data set and layer controls area. The section should disappear.
  • Click on the atlas button. Now the right viewport should display the exact same part of the brain displayed in the left viewport except the section at right is from the Allen Reference Atlas and it includes a diagram with structure names.
  • Zoom in to take a closer look by double-clicking in the viewport or by using the slider in the navigation area.
  • Make this even more interesting by selecting the fg (Fluorogold) data set in the left viewport.

Your iConnectome Viewer should look similar to the one pictured in Figure 4 above. The Mouse Connectome Project’s whole brain connectivity atlas and iConnectome Viewer combine to provide a powerful mouse brain atlas for the 21st century. The main shortfall that I see is the apparent lack of public access to the raw data in the repository. Access to the raw data is essential if investigators are to go beyond the atlas and use machine processing to quantitatively analyze mouse brain connectivity.

Using Graph Theory in the Brain Sciences

Graph theory has provided a new set of tools for helping us understand signal processing networks in the brain (see “Other related blog posts” below for some earlier posts on graph-theoretic approaches). In particular, a field known as network theory or complex network theory, which is rooted in graph theory, has been helping to provide insight into the way circuits in the brain are wired and how those circuits contribute to brain function. The new review paper “Dissecting functional connectivity of neuronal microcircuits: experimental and theoretical insights” (published online April 2, 2011 in Trends in Neurosciences) provides a high-level look at using network theory in the brain sciences.

First the authors explain key definitions for those unfamiliar with network theory. They discuss the features of networks that are typically captured in network theoretic equations. These equations are used for analyzing graphs composed of nodes and edges (links). Results of the analyses provide insights into the physical and functional organization of graphs.

The nature of the insights provided by the analysis of a graph depends on the graph’s composition. Nodes may represent any number of brain structures. For example, nodes may represent neurons, brain areas, or dendritic spines. Edges represent the means of communication amongst the structures represented by the nodes. For example, edges may represent chemical synapses, gap junctions, or diffusible molecules.
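
To make the node and edge vocabulary concrete, here is a small sketch in base R of a toy microcircuit stored as an edge list and converted to an adjacency matrix. The neuron names, the connections, and the choice to treat gap junctions as bidirectional are all illustrative assumptions of mine, not data from the review.

```r
# Toy edge list: each row is one connection between two hypothetical neurons.
edges <- data.frame(
  from = c("neuron_A", "neuron_A", "neuron_B", "neuron_C"),
  to   = c("neuron_B", "neuron_C", "neuron_C", "neuron_D"),
  type = c("chemical_synapse", "chemical_synapse",
           "gap_junction", "chemical_synapse"),
  stringsAsFactors = FALSE
)

# The node set is every name that appears at either end of an edge.
nodes <- sort(unique(c(edges$from, edges$to)))

# Adjacency matrix: a 1 in row i, column j means "i signals to j".
adjacency <- matrix(0L, nrow = length(nodes), ncol = length(nodes),
                    dimnames = list(nodes, nodes))
adjacency[cbind(edges$from, edges$to)] <- 1L

# Assumption: gap junctions pass signals both ways, so add the reverse edge.
gap <- edges$type == "gap_junction"
adjacency[cbind(edges$to[gap], edges$from[gap])] <- 1L

print(adjacency)
print(rowSums(adjacency))  # number of outgoing connections per node
```

Changing what the rows of the edge list stand for (brain areas instead of neurons, say) changes what the same analysis tells you, which is the point the authors make about graph composition.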

In network theory some general organizational principles have emerged. For instance, a scale-free network is a graph containing nodes that exhibit a wide range in their number of connections with other nodes. These graphs include rare hub nodes that have an extraordinarily large number of connections with other nodes. Hub nodes have a strong impact on signal processing within a network.
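
One way to get a feel for hub nodes is to grow a toy graph and count connections. The sketch below uses preferential attachment, where each new node links to an existing node with probability proportional to that node's current number of connections. This growth rule is my choice for illustration and is not taken from the review; graphs grown this way tend to contain a few highly connected nodes alongside many sparsely connected ones. The graph size and random seed are arbitrary.

```r
set.seed(1)      # arbitrary seed so the run is repeatable
n_nodes <- 200   # arbitrary toy graph size

degree <- integer(n_nodes)                                # connections per node
edges  <- matrix(NA_integer_, nrow = n_nodes - 1, ncol = 2)

# Start from two connected nodes.
edges[1, ] <- c(1L, 2L)
degree[1:2] <- 1L

# Each new node attaches to one existing node, favoring well-connected ones.
for (new_node in 3:n_nodes) {
  existing <- seq_len(new_node - 1)
  target <- sample(existing, size = 1, prob = degree[existing])
  edges[new_node - 1, ] <- c(new_node, target)
  degree[c(new_node, target)] <- degree[c(new_node, target)] + 1L
}

# Degree distribution: how many nodes have 1, 2, 3, ... connections.
print(table(degree))

# The five most connected nodes are the candidate hubs in this toy graph.
hub_nodes <- order(degree, decreasing = TRUE)[1:5]
print(data.frame(node = hub_nodes, connections = degree[hub_nodes]))
```

Comparing the top few entries against the bulk of the degree table shows the wide range in connection counts described above.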

Scale-free networks are found in neuronal circuits in the brain. For example, an important structure for learning and memory known as the hippocampus exhibits scale-free network organization during development. The inhibitory GABAergic neurons in this structure act as hubs that help orchestrate synchronous activity across the network. If these concepts or their application in the brain sciences are interesting and new to you then you may find this review paper to be a good introduction.

Other related blog posts:

New Analysis Methodologies and the Case for Data Sharing in Brain Research

Sex Matters But the Brain is Like Nothing Else

Explosive Change in Network Connectivity and Catastrophic Information Loss

Synthetic Brain Cells and Graph Theory