SPOKE

Neighborhood Explorer Documentation

Neighborhood Explorer: initial appearance

Introduction

SPOKE (Scalable Precision Medicine Knowledge Engine) is a very large network containing multiple types of biological data. Pooling such diverse data into a single knowledge environment makes it possible to identify new connections, with implications for biomedical applications such as personalized medicine, for example suggesting which drugs may be effective for a specific patient. An earlier version of the network was used to suggest new uses for existing drugs (Himmelstein et al., eLife 2017 Sep 22;6. pii: e26726).

SPOKE is a heterogeneous network, meaning that different nodes (points) within the network can represent different types of data. The edges between pairs of nodes represent known connections. Paths that follow a series of edges may connect nodes not previously known to be related.

SPOKE is much too large and dense to comprehend visually all at once. Although automated analyses may operate on the entire network, a graphical interface for exploring limited areas or subsets is needed for human interaction.

The Neighborhood Explorer allows interactively finding and viewing specific “neighborhoods” within SPOKE. As detailed below, it allows searching SPOKE for a specific drug compound, gene, or protein, and seeing what other nodes are in its immediate neighborhood, connected by one or a few edges. These other nodes could be any of the data types in the heterogeneous network, including related diseases, side effects, pathways, and other compounds, genes, or proteins.
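The idea of an immediate neighborhood, connected by one or a few edges, can be sketched as a breadth-first search limited to a maximum number of edges. This is only an illustration under stated assumptions: the node names, node types, and edges below are hypothetical toy data, and the code does not reflect how the Neighborhood Explorer is actually implemented.

```python
from collections import deque

# Hypothetical toy data, not real SPOKE content.
NODE_TYPES = {
    "geneA": "Gene",
    "proteinA": "Protein",
    "compoundX": "Compound",
    "diseaseY": "Disease",
    "sideEffectZ": "SideEffect",
}
EDGES = [
    ("geneA", "proteinA"),
    ("proteinA", "compoundX"),
    ("compoundX", "diseaseY"),
    ("compoundX", "sideEffectZ"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighboring nodes."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def neighborhood(adjacency, start, max_edges):
    """Return {node: distance} for every node within max_edges of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_edges:
            continue  # do not expand beyond the requested radius
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


adjacency = build_adjacency(EDGES)
one_step = neighborhood(adjacency, "proteinA", 1)
two_steps = neighborhood(adjacency, "proteinA", 2)
```

In this toy network, one step from the protein reaches only the gene and the compound, while two steps also reach the disease and side effect, which are nodes of other types.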

The search can be restricted to certain node and edge types, and by edge value. The resulting network is displayed with color-coding by node type, with detailed information for any node or edge popping up on mouseover. Sets of terminal nodes (“leaf” nodes) are grouped into rectangles that can be collapsed to simplify the view. The network can be extended further from any node(s) of interest by one or more additional rounds of searching.
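Restricting a search by node type, edge type, and edge value can be pictured as a simple filter over edges. The sketch below is a minimal illustration only: the `Edge` class, the type names, the numeric values, and the "keep values at or above a threshold" rule are all assumptions made for the example, not the Neighborhood Explorer's actual data model or option semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    edge_type: str  # hypothetical edge type label
    value: float    # hypothetical numeric edge value


def filter_edges(edges, node_types, allowed_node_types, allowed_edge_types, min_value):
    """Keep edges whose type, value, and both endpoint node types pass the filters."""
    kept = []
    for edge in edges:
        if edge.edge_type not in allowed_edge_types:
            continue
        if edge.value < min_value:
            continue
        if node_types[edge.source] not in allowed_node_types:
            continue
        if node_types[edge.target] not in allowed_node_types:
            continue
        kept.append(edge)
    return kept


# Hypothetical toy data, not real SPOKE content.
node_types = {
    "compoundX": "Compound",
    "proteinA": "Protein",
    "proteinB": "Protein",
    "diseaseY": "Disease",
}
edges = [
    Edge("compoundX", "proteinA", "binds", 0.9),
    Edge("compoundX", "proteinB", "binds", 0.2),
    Edge("compoundX", "diseaseY", "treats", 0.8),
]

kept = filter_edges(
    edges,
    node_types,
    allowed_node_types={"Compound", "Protein"},
    allowed_edge_types={"binds"},
    min_value=0.5,
)
```

Here only the compoundX to proteinA edge survives: the second edge falls below the value threshold, and the third is excluded by both its edge type and its Disease endpoint. Tighter filters of this kind are what keep a result set small enough to view.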

Searching and Search Options

All search fields except SEA searches accept any text that matches the name or identifier of the item being searched. Once a full-text query is entered, the matching values are displayed in a pop-down window. Selecting one of the values places the required identifier or name in the search box.

Search types, with the search type in parentheses:

The identifier to enter is rarely known in advance. Clicking the Source button shows the website corresponding to the current search type, for example, Disease Ontology for diseases. The identifier(s) can be looked up at this website. For SMILES string lookup, however, sites other than the SEA website are recommended, such as PubChem Compound or ChEMBL. SMILES string example, for lurasidone: C1CCC(C(C1)CN2CCN(CC2)C3=NSC4=CC=CC=C43)CN5C(=O)C6C7CCC(C7)C6C5=O

results from CFTR sample query (Apr 2020)

Entering an exact match for a long name, such as a pathway or species name, is often difficult, and sometimes a broader search is desired. For these reasons, in addition to the full-text search mentioned above, all search types except SEA and Node type allow entering a partial match as a regular expression. A leading tilde symbol ~ is required to indicate that a regular expression is being used. Examples:
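As a rough illustration of the tilde convention, the Python sketch below treats a query that starts with ~ as a regular expression and any other query as an exact name. The names and queries are hypothetical, and the details are assumptions: the regular-expression dialect, case sensitivity, and anchoring actually used by the Neighborhood Explorer may differ from Python's `re.search`.

```python
import re


def matches(query, name):
    """Return True if name matches query under the assumed tilde convention."""
    if query.startswith("~"):
        # Assumption: the text after the tilde is an unanchored regular expression.
        return re.search(query[1:], name) is not None
    # Assumption: without a tilde, the whole name must match exactly.
    return query == name


# Hypothetical names used only for illustration.
names = ["Glycolysis", "Gluconeogenesis", "Glycogen metabolism"]

starts_with_glyco = [n for n in names if matches("~^Glyco", n)]
ends_with_genesis = [n for n in names if matches("~genesis$", n)]
plain_partial = [n for n in names if matches("Glyco", n)]
```

In this sketch the partial text "Glyco" matches nothing without the tilde, whereas "~^Glyco" matches the two names that begin with it.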

It is very important to set options before submitting the search, mainly to limit the results to a reasonable number of nodes and edges. Searches that are too broad not only take longer, but may also return results that are impossible to view. A good way to get a feel for a reasonable number of results and the corresponding option settings is to run some of the sample queries.

Checkboxes show/hide parts of the interface:

Clicking Submit initiates the search, which may take several seconds depending on the specific query and option settings.

leaf-node group selected leaf-node group collapsed

Exploring the Network

The resulting network is displayed when the search is complete. The query node is shown with a double border. Most edges are solid lines, except that SEA search compound-protein binding predictions are shown as dashed lines that vary in width by significance value.

General interactions with the network:

In addition to the interactions mentioned above, the Neighborhood Explorer provides several keyboard accelerators to manipulate node and edge selection, move nodes around, and delete nodes:
Key Combination    Action
Del                Deletes the currently selected nodes
↑                  Moves selected nodes up
Shift-↑            Moves selected nodes up a smaller increment
↓                  Moves selected nodes down
Shift-↓            Moves selected nodes down a smaller increment
←                  Moves selected nodes left
Shift-←            Moves selected nodes left a smaller increment
→                  Moves selected nodes right
Shift-→            Moves selected nodes right a smaller increment
Control-6          Selects the first neighbors of the selected nodes
Control-I          Inverts the current node selection
Alt-I              Inverts the current edge selection
Alt-N              Selects all nodes
Alt-Shift-N        Deselects all nodes
Alt-Control-N      Selects nodes connected by selected edges
Alt-E              Selects all edges
Alt-Shift-E        Deselects all edges
Alt-Control-E      Selects edges adjacent to selected nodes
GATA2 gene node selected after extending from GATA2 and redoing layout

When numerous “leaf nodes” (those connected by only one edge) emanate from the same central node, they are grouped into a rectangle and can be collapsed onto that node to simplify the view.
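The grouping of leaf nodes described above can be sketched as follows: find every node with exactly one neighbor and collect such nodes under the node they attach to. This is a minimal illustration with hypothetical node names. The `min_group_size` threshold is an assumption, since the text does not state how many leaf nodes count as "numerous", and the Neighborhood Explorer's actual grouping logic is not shown here.

```python
from collections import defaultdict


def group_leaf_nodes(edges, min_group_size=2):
    """Map each central node to the sorted leaf nodes attached only to it."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    groups = defaultdict(list)
    for node, neighbors in adjacency.items():
        # A leaf is approximated here as a node with exactly one distinct neighbor.
        if len(neighbors) == 1:
            (hub,) = neighbors
            groups[hub].append(node)

    # Keep only groups large enough to be worth drawing as a rectangle.
    return {
        hub: sorted(leaves)
        for hub, leaves in groups.items()
        if len(leaves) >= min_group_size
    }


# Hypothetical toy network.
edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"),
    ("hub", "other"), ("other", "d"),
    ("other", "far"), ("far", "e"), ("far", "f"),
]
leaf_groups = group_leaf_nodes(edges)
```

In this toy network the leaves a, b, and c form one group around "hub", and e and f form another around "far". The single leaf d attached to "other" stays ungrouped under the assumed threshold.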

Clicking the rectangle representing a leaf-node group selects it. A selected group can then be collapsed with the Collapse button (or by double-clicking) or, if it is already in the collapsed state, re-expanded with the Expand button.

Other buttons:


UCSF Resource for Biocomputing, Visualization, and Informatics / October 2021