SPOKE Nodes and Edges
SPOKE
(Scalable Precision Medicine Knowledge Engine)
is a very large network containing multiple types of biological data.
Pooling such diverse data into a single knowledge environment
makes it possible to identify new connections, with implications for
biomedical applications such as personalized medicine,
for example suggesting which drugs may be effective for a specific patient.
An earlier version of the network was used to suggest new uses for existing
drugs (Himmelstein et al., 2017).
SPOKE is a heterogeneous network,
meaning that different nodes (points) within the network
can represent different types of data.
The edges between pairs of nodes represent known connections.
Paths that follow a series of edges may connect nodes not
previously known to be related.
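As a rough illustration of the idea, the sketch below builds a tiny heterogeneous network and searches it for a path between two nodes that share no direct edge. Everything in it is hypothetical: the node identifiers, the edges, and the choice to treat edges as undirected are assumptions made for the example and are not taken from SPOKE itself.

```python
from collections import deque

# Hypothetical toy data: identifiers and edges are illustrative only.
# Each node carries a type, which is what makes the network heterogeneous.
node_types = {
    "compound_A": "Compound",
    "gene_X": "Gene",
    "disease_Y": "Disease",
    "symptom_Z": "Symptom",
}

# Each edge is a known connection between a pair of nodes.
edges = [
    ("compound_A", "gene_X"),
    ("gene_X", "disease_Y"),
    ("disease_Y", "symptom_Z"),
]


def build_adjacency(edge_list):
    """Return a dict mapping each node to the set of its neighbors.

    Assumption: edges are treated as undirected for path finding.
    """
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in sorted(adjacency.get(current, ())):
            if neighbor in previous:
                continue
            previous[neighbor] = current
            if neighbor == goal:
                # Walk back through the predecessors to rebuild the path.
                path = [neighbor]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


adjacency = build_adjacency(edges)
path = shortest_path(adjacency, "compound_A", "disease_Y")
typed_path = [(node, node_types[node]) for node in path]
print(typed_path)
```

In this toy network the compound and the disease have no direct edge, yet a path through a gene node links them. That is the kind of indirect relationship the paragraph above describes.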
Node and edge
types and their data sources are given below. Except as noted,
data are updated weekly on a rotating basis (different types on different days).
= updated currently
= in progress
= present but static
See also:
source licenses
Node Types
-
Anatomy
– anatomical structures; all entries in
Uberon,
along with directed edges
between them to reflect the hierarchy
-
AnatomyCellType
– Anatomy–Cell Type
combinations with expression data in the
Human Protein Atlas
-
BiologicalProcess
– biological process terms from
Gene Ontology
with 2-1000 annotated genes
-
Blend
– dietary supplement ingredients that have a blend flag; ingredient names are recorded from the product label's supplement facts panel. Data from
NHANES
-
CellType
– cell types from Cell Ontology
for AnatomyCellType nodes
-
CellularComponent
– cellular component terms from
Gene Ontology
with 2-1000 annotated genes
-
Compound – the union of the following:
- approved small-molecule compounds
in DrugBank
with documented chemical structures
- all compounds in ChEMBL
- glycans from the Kyoto Encyclopedia of Genes and Genomes (KEGG)
-
DietarySupplement – product data from
NHANES
-
Disease
– all entries in Disease Ontology,
along with directed edges
between them to reflect the hierarchy
-
EC
– Enzyme Commission numbers
from pathway sources
-
Food – from
FoodOn
-
Gene – protein-coding genes
of select organisms from
Entrez Gene
-
Location – US data with ZIP code from
Location
and all countries from
GeoNames
-
MiRNA – miRNA data from
miRDB
-
MolecularFunction
– molecular function terms from
Gene Ontology
with 2-1000 annotated genes
-
Organism
– all taxonomic levels from
NCBI Taxonomy
for Homo sapiens, bacteria with Pathway data,
and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It also contains bacterial strains sourced from
BV-BRC
-
Pathway
– human pathways from:
... human and canonical bacterial pathways from:
... bacterial pathways not represented in KEGG from:
-
PharmacologicClass
– from DrugCentral,
the following annotation types:
- FDA Chemical/Ingredient
- FDA Chemical Structure
- FDA Mechanism of Action
- FDA Physiologic Effect
-
Protein
– proteins of select organisms in
UniProtKB
-
ProteinDomain
– from Pfam,
domains in proteins
-
ProteinFamily
– from Pfam,
families (clans) of protein domains
-
Reaction
– reactions in pathways
-
SARSCov2
– SARS-CoV-2 proteins studied in
Gordon et al., 2020
-
SideEffect
– all entries in SIDER
-
Variants
– variants from ClinVar and the GWAS Catalog, with positions in the hg38 build; population allele dosage information from dbSNP
-
Symptom
– Medical Subject Headings (MeSH) terms from:
- MeSH subtree C23.888
(Diseases / Pathological Conditions, Signs and Symptoms / Signs and Symptoms)
- Human Phenotype Ontology disease-symptom data
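
The "2-1000 annotated genes" limit given for the three Gene Ontology node types (BiologicalProcess, CellularComponent, MolecularFunction) amounts to a size filter on terms. A minimal sketch of such a filter follows. It assumes both bounds are inclusive and that genes are counted once per term, neither of which is stated in the list above, and the function name and toy identifiers are hypothetical.

```python
def filter_terms_by_gene_count(term_to_genes, min_genes=2, max_genes=1000):
    """Keep terms whose number of distinct annotated genes is within bounds.

    Assumption: both bounds are inclusive; the source only says "2-1000".
    """
    kept = {}
    for term, genes in term_to_genes.items():
        distinct = set(genes)  # count each gene once per term
        if min_genes <= len(distinct) <= max_genes:
            kept[term] = distinct
    return kept


# Hypothetical annotations; term and gene identifiers are placeholders.
toy_annotations = {
    "term_small": ["gene_1"],
    "term_ok": ["gene_1", "gene_2", "gene_2"],
    "term_large": [f"gene_{i}" for i in range(1001)],
}

kept_terms = filter_terms_by_gene_count(toy_annotations)
print(sorted(kept_terms))
```

A filter of this kind drops terms annotated to a single gene as well as very broad terms annotated to more than a thousand genes.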
Edge Types