Neighborhood Explorer Documentation

Neighborhood Explorer: initial appearance


SPOKE (Scalable Precision Medicine Knowledge Engine) is a very large network containing multiple types of biological data. Pooling such diverse data into a single knowledge environment makes it possible to identify new connections, with implications for biomedical applications such as personalized medicine, for example suggesting which drugs may be effective for a specific patient. An earlier version of the network was used to suggest new uses for existing drugs (Himmelstein et al., eLife 2017 Sep 22;6. pii: e26726).

SPOKE is a heterogeneous network, meaning that different nodes (points) within the network can represent different types of data. The edges between pairs of nodes represent known connections. Paths that follow a series of edges may connect nodes not previously known to be related.

SPOKE is much too large and dense to comprehend visually all at once. Although automated analyses may operate on the entire network, a graphical interface for exploring limited areas or subsets is needed for human interaction.

The Neighborhood Explorer allows interactively finding and viewing specific “neighborhoods” within SPOKE. As detailed below, it allows searching SPOKE for a specific drug compound, gene, or protein, and seeing what other nodes are in its immediate neighborhood, connected by one or a few edges. These other nodes could be any of the data types in the heterogeneous network, including related diseases, side effects, pathways, and other compounds, genes, or proteins.

The search can be restricted to certain node and edge types, and by edge value. The resulting network is displayed with color-coding by node type, with detailed information for any node or edge popping up on mouseover. Sets of terminal nodes (“leaf” nodes) are grouped into rectangles that can be collapsed to simplify the view. The network can be extended further from any node(s) of interest by one or more additional rounds of searching.
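The idea of a neighborhood search with type restrictions can be illustrated with a minimal sketch. This is not the Neighborhood Explorer's actual implementation; the toy network, the placeholder node names, the edge type labels, and the function `neighborhood` are all hypothetical, chosen only to show how a breadth-first search might collect nodes within a given number of edges of a query node while honoring node-type and edge-type filters.

```python
from collections import deque

# Hypothetical toy network (placeholder names, not real SPOKE content).
NODE_TYPES = {
    "GeneA": "Gene",
    "DiseaseX": "Disease",
    "CompoundC": "Compound",
    "SideEffectS": "SideEffect",
    "PathwayP": "Pathway",
}
# Undirected edges as (node, node, edge type); the edge type labels are assumptions.
EDGES = [
    ("GeneA", "DiseaseX", "ASSOCIATES"),
    ("CompoundC", "GeneA", "BINDS"),
    ("CompoundC", "SideEffectS", "CAUSES"),
    ("GeneA", "PathwayP", "PARTICIPATES"),
]


def neighborhood(query, max_edges=1, node_types=None, edge_types=None):
    """Return {node: distance} for nodes within max_edges of the query node.

    node_types and edge_types are optional sets; None means no restriction.
    """
    # Build an adjacency list from the undirected edge list.
    adjacency = {}
    for a, b, etype in EDGES:
        adjacency.setdefault(a, []).append((b, etype))
        adjacency.setdefault(b, []).append((a, etype))

    depth = {query: 0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        if depth[node] == max_edges:
            continue  # do not expand beyond the requested distance
        for neighbor, etype in adjacency.get(node, []):
            if edge_types is not None and etype not in edge_types:
                continue
            if node_types is not None and NODE_TYPES[neighbor] not in node_types:
                continue
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


one_step = neighborhood("GeneA", max_edges=1)
two_steps = neighborhood("GeneA", max_edges=2)
diseases_only = neighborhood("GeneA", max_edges=1, node_types={"Disease"})
```

In this sketch, raising `max_edges` from 1 to 2 adds the side-effect node reached through the compound, which loosely mirrors extending a displayed network by an additional round of searching.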

Searching and Search Options

Search types, with the required query in parentheses:

The identifier to enter is rarely known in advance. Clicking the Source button shows the website corresponding to the current search type, for example, Disease Ontology for diseases. The identifier(s) can be looked up at this website. For SMILES string lookup, however, sites other than the SEA website are recommended, such as PubChem Compound or ChEMBL. SMILES string example, for lurasidone: C1CCC(C(C1)CN2CCN(CC2)C3=NSC4=CC=CC=C43)CN5C(=O)C6C7CCC(C7)C6C5=O

results from CFTR sample query (Apr 2020)

Entering an exact match for a long name, such as that of a pathway or species, is often very difficult, and sometimes a broader search is desired. For these reasons, all search types except SEA and Node type allow entering a partial match as a regular expression. A leading tilde symbol ~ is required to indicate that a regular expression is being used. Examples:
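As an illustration of the tilde convention only (this is not the Neighborhood Explorer's matching code), the following minimal sketch treats a query with a leading ~ as a regular expression and any other query as an exact match. The function `match_names` and the candidate names are hypothetical, and the choices of case-insensitive matching and matching anywhere in the name are assumptions; the actual server-side behavior may differ.

```python
import re


def match_names(query, names):
    """Return the names matching the query.

    A leading "~" marks the rest of the query as a regular expression;
    otherwise the query must equal the name exactly.
    """
    if query.startswith("~"):
        # Assumption: case-insensitive, match anywhere in the name.
        pattern = re.compile(query[1:], re.IGNORECASE)
        return [name for name in names if pattern.search(name)]
    return [name for name in names if name == query]


# Hypothetical list of species names to search.
species = ["Homo sapiens", "Homo neanderthalensis", "Mus musculus"]

regex_hits = match_names("~^homo", species)
exact_hits = match_names("Mus musculus", species)
partial_without_tilde = match_names("Homo", species)
```

Without the tilde, the partial string "Homo" matches nothing in this sketch, because it is compared against each whole name.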

It is very important to set options before submitting the search, mainly to limit the results to a reasonable number of nodes and edges. Searches that are too broad not only take longer but may also return results that are impossible to view. A good way to get a feel for a reasonable number of results and the corresponding option settings is to run some of the sample queries.

Checkboxes show/hide parts of the interface:

Clicking Submit initiates the search, which may take several seconds depending on the specific query and option settings.

leaf-node group selected leaf-node group collapsed

Exploring the Network

The resulting network is displayed when the search is complete. The query node is shown with a double border. Most edges are solid lines; the exception is compound-protein binding predictions from a SEA search, which are shown as dashed lines that vary in width by significance value.

General interactions with the network:

GATA2 gene node selected after extending from GATA2 and redoing layout

When numerous “leaf nodes” (those connected by only one edge) emanate from the same central node, they are grouped into a rectangle and can be collapsed onto that node to simplify the view.

Clicking the rectangle representing a leaf-node group selects it for action by the Collapse button, or if the rectangle is in the collapsed state, for re-expansion with the Expand button.
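The leaf-node grouping described above can be sketched as follows. This is an illustration under stated assumptions, not the interface's actual code: the function `group_leaf_nodes`, the example edge list, and the `min_group_size` threshold (standing in for "numerous") are all hypothetical, and the sketch assumes at most one edge between any pair of nodes.

```python
from collections import defaultdict


def group_leaf_nodes(edges, min_group_size=2):
    """Group leaf nodes (nodes with exactly one neighbor) by the node they attach to.

    Returns {central node: sorted list of its leaf nodes}, keeping only
    groups with at least min_group_size leaves.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    groups = defaultdict(list)
    for node, adjacent in neighbors.items():
        if len(adjacent) == 1:
            (central,) = adjacent  # the single node this leaf connects to
            groups[central].append(node)

    return {
        central: sorted(leaves)
        for central, leaves in groups.items()
        if len(leaves) >= min_group_size
    }


# Hypothetical network: G has three leaves; D has one leaf and also connects to G.
example_edges = [("G", "p1"), ("G", "p2"), ("G", "p3"), ("G", "D"), ("D", "s1")]
leaf_groups = group_leaf_nodes(example_edges)
```

Here only the three leaves attached to G form a group; the single leaf on D falls below the assumed threshold and would be left ungrouped.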

Other buttons:

UCSF Resource for Biocomputing, Visualization, and Informatics / December 2020