Public Links
Protein Sequence and Function
UniProt
Comprehensive resource for protein sequence, function, annotation, and proteome information.
InterPro
Protein family, domain, and functional site annotation resource with InterProScan support.
NCBI Gene
Gene-centered portal linking sequence records, annotation, variation, and related resources.
UniProt BLAST
Sequence similarity search against UniProt protein datasets.
Protein Structure Resources
RCSB PDB
Primary portal for experimentally determined macromolecular 3D structures.
AlphaFold Protein Structure Database
Large-scale resource of predicted protein structures from AlphaFold.
RCSB Mol* 3D Viewer
Interactive web viewer for exploring and comparing 3D molecular structures.
Genome and Annotation Databases
NCBI Genomes
Genome portal for assemblies, annotation, and large-scale genomics resources.
NCBI Datasets
Download portal for genomes and related sequence data across the tree of life.
Ensembl
Genome browser and annotation resource for vertebrates and other eukaryotes.
Ensembl Plants
Plant-focused genome browser and comparative genomics resource.
GenBank
NIH nucleotide sequence archive and part of the INSDC data exchange network.
NCBI Genome Data Viewer
Interactive genome browser for exploring assemblies and annotations.
Pathways, Systems Biology, and Functional Context
KEGG
Integrated resource for pathways, genes, compounds, drugs, and biological systems.
KEGG PATHWAY
Manually curated pathway maps for metabolism, signaling, and cellular processes.
KEGG GENES
Gene and protein collection linked to KEGG orthology and pathway systems.
KEGG DRUG
Drug knowledgebase linked to therapeutic targets and molecular interaction context.
Small Molecules and Drug Discovery
PubChem
Large open NIH resource for chemical structures, properties, and associated bioactivity information.
PubChem Structure Search
Chemical structure and substructure search entry point for PubChem.
ChEMBL
Curated chemogenomics database linking compounds, bioactivity, and targets.
ChEMBL Compounds
Browse compound records and chemistry-focused entries in ChEMBL.
ChEMBL Drugs
Browse drug entries and related annotations in ChEMBL.
DrugBank
Drug and target knowledgebase spanning drugs, targets, interactions, and clinical context.
Programming, Open Source, and Community Resources
Bioconductor
Major open-source ecosystem for reproducible analysis of biological data in R.
Bioconductor Books
Online books and workflows for modern genomic and transcriptomic data analysis.
GitHub
Software hosting and collaboration platform widely used for open-source scientific software.
GitHub Community
Community discussions, learning resources, and support for GitHub users.
GitHub Open Source
Open-source maintainer and contributor resources hosted by GitHub.
Useful Search and Download Entry Points
NCBI Search
Unified search across NCBI biomedical and genomic databases.
UniProt Downloads
Download help and dataset access for UniProt resources.
RCSB PDB Downloads
Bulk download access for structures and related archive files.
InterPro Downloads
Download portal for InterPro, InterProScan, and related signature databases.
Ensembl Downloads
Download access for Ensembl data, software, and FTP resources.
Plant and Crop Genomics
TAIR
Primary community resource for Arabidopsis thaliana genetics, genomics, and molecular biology.
TAIR Search
Entry point for gene, genome position, polymorphism, and functional searches in TAIR.
Ensembl Plants
Genome browser and annotation resource for plant species of scientific and agricultural interest.
Ensembl Plants Tools
Access BLAST, BioMart, VEP, and related analysis tools for plant genomes.
Gramene
Comparative plant genomics resource covering crops and model plant species.
Plant Reactome
Curated pathway knowledgebase for crops and model plants within the Gramene ecosystem.
Aging and Longevity Resources
Aging Atlas
Multi-omics resource for aging-related transcriptomic, epigenomic, proteomic, and pharmacogenomic datasets.
Aging Atlas Gene Sets
Curated aging-related gene sets collected from the literature and organized by category.
Aging Atlas Help
Documentation and usage guidance for the Aging Atlas resource.
PubMed
Primary literature search resource for aging biology, geroscience, and translational biomedicine.
AI for Computational Biology
AlphaFold Protein Structure Database
Open access database of large-scale predicted protein structures for biological discovery.
AlphaFold DB Downloads
Bulk download access for AlphaFold DB predictions and associated resources.
OpenFold
Trainable, memory-efficient open-source PyTorch reproduction of AlphaFold 2 for protein structure prediction.
Hugging Face
Major platform for sharing and discovering AI models, datasets, and applications.
Hugging Face Models
Searchable index of public models, including biological and biomedical foundation models.
Hugging Face Datasets
Repository of datasets useful for ML, LLM, multimodal, and bioinformatics workflows.
scGPT
Foundation-model framework for single-cell multi-omics analysis using generative AI.
scGPT-spatial
Spatial transcriptomics extension of scGPT for large-scale spatial omics modeling.
scGenePT
Single-cell perturbation modeling framework that augments scGPT with language-derived gene embeddings.
Expression, Perturbation, and Single-Cell Resources
NCBI GEO
Major public repository for functional genomics data, including array- and sequence-based studies with search and download tools.
GEO DataSets
Curated GEO study records for browsing experiments, metadata, and downloadable expression datasets.
GEO Profiles
Gene-centered view of curated expression profiles derived from GEO datasets.
GEO2R
Browser-based differential expression tool for rapid comparisons within GEO studies.
ArrayExpress
EMBL-EBI functional genomics collection for expression and other high-throughput studies, with links to associated sequence archives.
ArrayExpress in BioStudies
Overview of how ArrayExpress content is integrated into BioStudies at EMBL-EBI.
ArrayExpress Studies
Browse released ArrayExpress studies through the BioStudies interface.
NIH LINCS Program
NIH resource for large-scale perturbational datasets spanning small molecules, ligands, genetic perturbations, and multiple assay types.
LINCS L1000 Access Guide
Practical entry point for finding and downloading LINCS L1000 perturbation data.
iLINCS Workflows
Analysis entry point for LINCS transcriptomic and proteomic signatures through iLINCS workflows.
CZ CELLxGENE Discover
Curated platform for finding, downloading, and visually exploring standardized single-cell biology datasets.
CELLxGENE Collections
Browse dataset collections organized for discovery and reuse across tissues, diseases, and organisms.
CELLxGENE Datasets
Searchable dataset index for published single-cell studies in CELLxGENE Discover.
CELLxGENE Gene Expression
Gene-centric exploration view across curated single-cell datasets.
Literature and Preprints
PubMed
Primary literature search resource for biomedical, genomics, and life science research.
PubMed Advanced Search
Fielded query builder for more precise literature searches in PubMed.
PubMed Central (PMC)
Free full-text archive of biomedical and life sciences journal literature.
Europe PMC
Broad life sciences literature platform with abstracts, full text, citations, and linked research outputs.
Europe PMC Advanced Search
Advanced query interface for literature, grants, preprints, and full-text content in Europe PMC.
Europe PMC Downloads
Download access for open-access full text and metadata from Europe PMC.
bioRxiv
Major preprint server for life sciences research prior to peer review.
bioRxiv Latest Articles
Browse recently posted life science preprints on bioRxiv.
bioRxiv Subject Collections
Browse preprints by topic area across the life sciences.
medRxiv
Major preprint server for health sciences and translational biomedical research prior to peer review.
medRxiv Latest Articles
Browse recently posted health science preprints on medRxiv.
medRxiv Subject Areas
Browse medRxiv preprints by clinical and biomedical topic area.
Analysis Tools and Portals
UCSC Genome Browser
Major genome browser for viewing assemblies, annotations, tracks, and comparative genomics data.
UCSC Genome Browser Gateway
Direct entry point for selecting assemblies and navigating genomic regions of interest.
UCSC Genome Browser Training
Tutorials, workshops, and teaching materials for effective use of the UCSC Genome Browser.
UCSC Genome Browser User’s Guide
Documentation for browser navigation, track configuration, and genomic data visualization.
IGV
High-performance interactive viewer for genomic alignments, variants, annotations, and other omics tracks.
IGV Web App
Browser-based version of IGV for quick interactive data exploration and sharable sessions.
IGV Desktop Documentation
Documentation and usage guidance for the desktop Integrative Genomics Viewer.
STRING
Protein association network resource for functional interactions, evidence channels, and enrichment analysis.
STRING API
Programmatic access to STRING interaction networks and supporting evidence.
Reactome
Open, curated, peer-reviewed pathway database supporting browsing, analysis, and systems biology context.
Reactome Pathway Browser
Interactive pathway exploration interface for navigating curated biological pathways.
Reactome Analysis Tools
Pathway analysis entry point for gene lists, expression results, and other molecular datasets.
Reactome Downloads
Download access for pathway data and standard export formats.
cBioPortal
Cancer genomics portal for exploring mutations, copy number changes, expression, and clinical associations across studies.
cBioPortal Studies
Browsable collection of cancer genomics studies and cohorts available in cBioPortal.
Sequence Search and Alignment
NCBI BLAST
Standard sequence similarity search tool for comparing nucleotide or protein sequences against major public databases.
NCBI Nucleotide BLAST
BLAST entry point for nucleotide-versus-nucleotide sequence searches.
NCBI Protein BLAST
BLAST entry point for protein-versus-protein sequence similarity searches.
EMBL-EBI NCBI BLAST
Alternative BLAST interface provided through the EMBL-EBI Job Dispatcher.
HMMER
Sensitive homology search tools for sequence and profile-based searches against protein databases.
EMBL-EBI Job Dispatcher
Portal for sequence search and alignment tools provided by EMBL-EBI through web and programmatic interfaces.
Clustal Omega
Widely used multiple sequence alignment tool for protein and nucleotide sequences.
UniProt Align Tool
Convenient sequence alignment entry point integrated with UniProt and powered by Clustal Omega.
MAFFT
High-quality multiple sequence alignment software with options ranging from fast large-scale alignment to more accurate modes.
MAFFT Documentation
Manual and usage guide for running MAFFT with different alignment strategies and parameters.
Variants and Clinical Interpretation
ClinVar
Public archive of human variants with disease and drug-response interpretations and supporting evidence.
ClinVar Introduction
Overview of ClinVar content, submission model, and interpretation framework.
ClinVar Search Help
Guide for searching ClinVar by gene, HGVS notation, genomic region, and related fields.
ClinVar Advanced Search
Fielded search builder for more precise variant and phenotype queries.
gnomAD
Large population reference database for human genetic variation and allele frequencies.
gnomAD News and Releases
Release notes and updates for new datasets, browser features, and frequency resources.
dbSNP
NCBI archive for single nucleotide polymorphisms and related small genetic variants.
dbSNP Release Updates
NCBI announcements of dbSNP build releases and expanded variant content.
Ensembl VEP
Variant Effect Predictor for annotating SNPs, indels, CNVs, and structural variants against genes, transcripts, proteins, and regulatory regions.
Ensembl VEP Web Interface
Browser-based VEP workflow for interactive variant annotation and filtering.
Ensembl VEP Command Line
Command-line documentation for scalable variant annotation and plugin-based workflows.
Expression and Pathway Enrichment Tools
GSEA / MSigDB
Gene set enrichment analysis platform and home of the Molecular Signatures Database for enrichment-based interpretation of expression results.
MSigDB
Curated collection of human and mouse gene sets for pathway, perturbation, ontology, and cell-state enrichment analyses.
MSigDB Human Collections
Browse hallmark, pathway, perturbation, ontology, and regulatory gene-set collections for human analyses.
GSEA Downloads
Download the GSEA desktop software and related resources for enrichment analysis workflows.
GSEA Documentation
User guide and documentation for running and interpreting GSEA analyses.
DAVID
Widely used functional annotation and enrichment resource for interpreting gene lists with biological context.
DAVID Functional Annotation
Entry point for annotation, clustering, and enrichment-oriented analysis tools in DAVID.
Enrichr
Interactive web tool for gene-list enrichment analysis across many pathway, ontology, and perturbation libraries.
Enrichr Libraries
Overview of available enrichment libraries, including pathways, ontologies, transcription factors, and perturbation signatures.
clusterProfiler
Bioconductor package for statistical enrichment analysis and visualization of functional profiles for genes and gene clusters.
clusterProfiler Documentation
Package landing page with vignettes, reference manual, and installation details.
DOSE
Bioconductor package for Disease Ontology semantic similarity and enrichment analysis, often used alongside clusterProfiler.
Single-Cell and Spatial Analysis Tools
Scanpy
Scalable Python toolkit for preprocessing, visualization, clustering, and downstream analysis of single-cell gene expression data.
Scanpy Tutorials
Curated tutorials covering core workflows, visualization, trajectory analysis, and spatial data analysis with Scanpy.
Seurat
Widely used R toolkit for QC, analysis, integration, and exploration of single-cell transcriptomic data.
Seurat Getting Started
Introductory tutorials and walkthroughs for building standard Seurat-based single-cell workflows.
Seurat Guided Clustering Tutorial
Classic end-to-end tutorial for preprocessing, clustering, and visualizing single-cell RNA-seq data in Seurat.
OSCA: Orchestrating Single-Cell Analysis with Bioconductor
Comprehensive Bioconductor book for processing, analyzing, visualizing, and exploring single-cell RNA-seq data.
SingleCellExperiment
Core Bioconductor data structure for storing expression matrices, cell metadata, and feature annotation in synchronized form.
Bioconductor Single-Cell Installation Guide
Practical setup guide for getting started with single-cell analysis in R and Bioconductor.
Squidpy
Python toolkit for the analysis and visualization of spatial molecular data built on top of Scanpy and AnnData.
Squidpy Tutorials
Tutorial collection demonstrating core Squidpy workflows on diverse spatial datasets.
Squidpy API
Reference documentation for graph, image, and spatial analysis functions in Squidpy.
Visium Analysis with Squidpy
Worked example for analyzing 10x Visium spatial transcriptomics data with tissue-image context.
Cloud and Workflow Platforms
Galaxy Project
Open, web-based platform for accessible, reproducible, and transparent computational biomedical research.
UseGalaxy Servers
Public Galaxy servers with synchronized tools and reference resources for broad community use.
Galaxy Training Network
Extensive tutorials and training materials for learning Galaxy-based workflows and analyses.
Terra
Cloud-native platform for biomedical data access, analysis, and collaboration at scale.
Terra Overview
Getting-started overview of Terra’s data, analysis, and collaborative workflow model.
Terra Interactive Analysis
Entry point for notebooks, cloud environments, and interactive analysis workflows on Terra.
Dockstore
Free and open-source platform for sharing reusable and scalable analytical tools and workflows.
Dockstore Quick Start
Practical introduction to finding, launching, and developing tools and workflows on Dockstore.
Dockstore API / Swagger
Programmatic access to Dockstore resources for integration and workflow automation.
nf-core
Community framework for high-quality open-source Nextflow components and pipelines.
nf-core Pipelines
Browse currently available nf-core pipelines across many common genomics and bioinformatics applications.
nf-core Configs
Collection of institutional and platform-specific configuration profiles for running nf-core workflows.
Snakemake
Widely used workflow system for reproducible and scalable data analysis.
Snakemake Documentation
Documentation for building, scaling, and deploying Snakemake workflows.
ML and AI Toolkits
PyTorch
Widely used open-source deep learning framework for tensor computation, neural networks, and research-oriented model development.
PyTorch Tutorials
Official tutorials for tensors, neural networks, training workflows, and model deployment in PyTorch.
PyTorch Ecosystem
Overview of libraries, tools, and projects built around the PyTorch framework.
TensorFlow
End-to-end machine learning platform for model development, training, and deployment across research and production settings.
TensorFlow Tutorials
Official tutorials covering neural networks, structured data, time series, images, and text.
Keras
High-level deep learning API for building and training neural networks with a streamlined interface.
Keras Examples
Collection of practical examples for deep learning workflows across images, text, time series, graphs, and generative models.
XGBoost
High-performance gradient boosting framework widely used for classification, regression, and ranking on feature-rich tabular data.
XGBoost Python Package
Documentation for building, training, and tuning XGBoost models in Python.
XGBoost R Package
R interface for gradient boosting workflows with XGBoost.
LightGBM
Gradient boosting framework optimized for speed and efficiency on large-scale tabular datasets.
CatBoost
Gradient boosting toolkit designed for strong performance on tabular data, including categorical features.
scikit-learn
Core Python machine learning library for classification, regression, clustering, preprocessing, and model selection.
scikit-learn User Guide
Comprehensive guide to estimators, preprocessing pipelines, evaluation, and common ML workflows.
Hugging Face
Major platform for sharing AI models, datasets, inference tools, and application workflows.
Hugging Face Models
Searchable model hub including language, vision, multimodal, and biology-relevant foundation models.
Hugging Face Datasets
Repository of datasets for machine learning, LLM, multimodal, and computational biology workflows.
Weights & Biases
Experiment tracking, model evaluation, and collaboration platform for machine learning workflows.
MLflow
Open-source platform for experiment tracking, model packaging, and ML workflow management.
ONNX
Open format and ecosystem for interoperable machine learning models across frameworks and runtimes.
tidymodels
R ecosystem for modeling, preprocessing, tuning, workflows, and reproducible machine learning pipelines.
caret
Classic R toolkit for streamlined model training and evaluation across many machine learning methods.
torch for R
Native R package, modeled on PyTorch, for tensors, automatic differentiation, neural networks, and deep learning workflows.
torch for R - Get Started
Introductory documentation and installation guide for deep learning in R with torch.
keras3 for R
R interface for Keras workflows, including deep learning model construction and training.
tensorflow for R
R interface and guides for TensorFlow-based machine learning workflows.
LLMs and Agent Frameworks
LangChain
Framework for building LLM-powered applications and agents with model integrations, memory, tools, and orchestration components.
LangChain Quickstart
Fast entry point for building a first LangChain-based application or agent.
LangChain Learn
Tutorials, conceptual guides, and learning resources for LangChain and LangGraph workflows.
LlamaIndex
Framework for building LLM-powered agents and retrieval workflows over external documents, databases, and private data.
LlamaIndex Quickstart
Getting-started entry point for building LlamaIndex applications in Python or TypeScript.
LlamaIndex Workflows
Guide to building multi-step agent workflows and retrieval pipelines with LlamaIndex.
Ollama
Platform for running and managing open models locally for chat, embeddings, coding, and multimodal workflows.
Ollama Download
Installation entry point for macOS, Linux, and Windows.
Ollama Model Library
Browsable catalog of open models available through Ollama.
vLLM
High-performance inference and serving engine for large language models.
vLLM Quickstart
Quickstart guide for installing and running vLLM for inference and serving.
vLLM Supported Models
Reference list of model architectures supported by vLLM.
Data Science and Visualization
pandas
Core Python library for tabular data manipulation, analysis, and data-frame workflows.
pandas Getting Started
Introductory guides, tutorials, and quickstart material for learning pandas.
pandas User Guide
Topic-based documentation for data wrangling, missing values, joins, time series, and more.
NumPy
Foundational numerical computing library for Python centered on efficient multidimensional arrays.
NumPy User Guide
Overview of core array concepts, indexing, broadcasting, and numerical workflows in NumPy.
NumPy Absolute Beginners Guide
Beginner-friendly introduction to NumPy arrays and basic numerical operations.
SciPy
Scientific computing library for Python with tools for optimization, integration, interpolation, signal processing, and statistics.
SciPy User Guide
Guide to key SciPy concepts and common scientific computing workflows.
SciPy API Reference
Reference documentation for SciPy modules and public functions.
Matplotlib
Comprehensive visualization library for static, animated, and interactive plotting in Python.
Matplotlib Tutorials
Tutorial collection for plotting fundamentals, figures, axes, and customization workflows.
Matplotlib Quick Start
Fast introduction to common plotting patterns and recommended usage.
seaborn
High-level statistical data visualization library built on top of Matplotlib.
seaborn Tutorial
Guided introduction to seaborn plotting functions and statistical graphics workflows.
seaborn Example Gallery
Browsable gallery of common seaborn visualization patterns and examples.
Plotly for Python
Interactive graphing library for building publication-quality figures and dashboards in Python.
Plotly Getting Started
Setup and introductory material for creating interactive figures with Plotly.
Dash
Python framework for building interactive analytical web apps around Plotly-based visualizations.
Altair
Declarative visualization library for Python built on the Vega-Lite grammar.
Altair Getting Started
Introductory documentation for building chart specifications and visual analytics workflows in Altair.
Altair Example Gallery
Collection of example charts illustrating Altair’s declarative plotting model.