GeneTrail RESTful API

GeneTrail is fully scriptable via a RESTful API. This allows our users to process larger enrichment studies in an automated fashion or to integrate GeneTrail into existing tools. As the API relies solely on standard HTTP requests, no special libraries or software are required, and bindings can be created for any programming language. In the following we introduce the basic concepts needed for working with the API. If you are looking for the documentation of all implemented methods, see our API reference.

Introduction

RESTful APIs represent resources as URLs on the server. Actions on these URLs are usually performed using the standard HTTP verbs GET, POST, PUT, and DELETE. GeneTrail, for example, is built around the concepts of Sessions, Jobs, and Resources. A Session is a collection of Resources such as score lists, expression matrices, categories, etc. Resources are usually produced by executing a Job. To obtain a new Session object in GeneTrail, we can use the following request:

GET /api/session

This might produce the following JSON-formatted output:

{ "session": "116654b3-7b2e-489b-ab66-2a377b9928c7" }
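The request above can be issued from Python with the standard library alone. The following is a minimal sketch, not the official client: the server address in BASE_URL and the function names are hypothetical, and only the /api/session path and the "session" key are taken from the example above.

```python
import json
import urllib.request

# Hypothetical server address (assumption); replace it with your GeneTrail instance.
BASE_URL = "https://genetrail.example.org"


def build_session_request(base_url=BASE_URL):
    """Build (but do not send) the GET request that asks for a new session."""
    return urllib.request.Request(base_url + "/api/session", method="GET")


def parse_session_response(body):
    """Extract the session identifier from the JSON response body."""
    return json.loads(body)["session"]


def create_session(base_url=BASE_URL):
    """Send the request and return the identifier of the new session."""
    with urllib.request.urlopen(build_session_request(base_url)) as response:
        return parse_session_response(response.read().decode("utf-8"))
```

Separating request construction from response parsing keeps both steps easy to check without contacting a server.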

Using this session identifier we can now upload files, start computations, and display results. If a session is no longer needed, we can delete it and all its contents by issuing the following request:

DELETE /api/session/116654b3-7b2e-489b-ab66-2a377b9928c7
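In Python, the same DELETE request could be sketched as follows. The server address and the function names are assumptions made for illustration. Because the format of the server's reply is not described here, the sketch only returns the HTTP status code.

```python
import urllib.parse
import urllib.request

# Hypothetical server address (assumption); replace it with your GeneTrail instance.
BASE_URL = "https://genetrail.example.org"


def build_delete_session_request(session_id, base_url=BASE_URL):
    """Build (but do not send) the DELETE request for the given session."""
    # Percent-encode the identifier so it is always a valid path segment.
    path = "/api/session/" + urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(base_url + path, method="DELETE")


def delete_session(session_id, base_url=BASE_URL):
    """Send the DELETE request and return the HTTP status code."""
    request = build_delete_session_request(session_id, base_url)
    with urllib.request.urlopen(request) as response:
        return response.status
```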

For displaying a resource we can use the following request:

GET /api/resource/988?session=b0e9a4aa-345d-45f1-bc37-0e059ccd907c

Here session=b0e9a4aa-345d-45f1-bc37-0e059ccd907c specifies the session from which the resource should be retrieved and 988 is the identifier of the Resource object. The generated response looks like this:

{
  "id": 988,
  "session": "b0e9a4aa-345d-45f1-bc37-0e059ccd907c",
  "createdBy": "max-mean",
  "organism": 9606,
  "comment": "",
  "metadata": {
    "significance": 0.05,
    "input_file": 985,
    "parameters": {
      "significance": "0.05",
      "adjustment": "benjamini_hochberg",
      "minimum": "3",
      "maximum": "500",
      "permutations": "1000000",
      "input_file": "985",
      "adjustSeparately": "true",
      "algorithm": "max-mean"
    },
    "warnings": [],
    "algorithm": "max-mean"
  },
  "shared": false,
  "intermediate": false,
  "normalized": true,
  "displayName": "mRNA - Blastemal vs. Non-Blastemal - Max-Mean",
  "mediaType": "application/zip",
  "type": "Enrichment",
  "creationDate": 1430830959361,
  "identifier": "Gene-Symbol",
  "modificationDate": 1431681088991,
  "algorithm": "max-mean",
  "pipeline": {}
}
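A script will typically build the resource URL and then pick a few fields out of a response like the one above. The sketch below shows one way to do this. The server address and the function names are hypothetical, and interpreting creationDate as milliseconds since the Unix epoch is an assumption based on the magnitude of the value. Note that the entries under "parameters" arrive as strings and have to be converted explicitly.

```python
import json
import urllib.parse
from datetime import datetime, timezone

# Hypothetical server address (assumption); replace it with your GeneTrail instance.
BASE_URL = "https://genetrail.example.org"


def resource_url(resource_id, session_id, base_url=BASE_URL):
    """Build the URL for displaying a resource within a session."""
    query = urllib.parse.urlencode({"session": session_id})
    return f"{base_url}/api/resource/{resource_id}?{query}"


def summarize_resource(body):
    """Pick a few fields out of a resource description (JSON text)."""
    resource = json.loads(body)
    params = resource["metadata"]["parameters"]
    return {
        "name": resource["displayName"],
        "type": resource["type"],
        # Assumption: timestamps are milliseconds since the Unix epoch.
        "created": datetime.fromtimestamp(
            resource["creationDate"] / 1000, tz=timezone.utc
        ),
        # Parameter values are transmitted as strings.
        "significance": float(params["significance"]),
        "permutations": int(params["permutations"]),
    }
```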

Example

So what does a script using the GeneTrail API look like in practice? Suppose you want to compute multiple enrichments from a matrix of gene expression values. To accomplish this, we will write a Python script. We start with the main procedure, which calls some helper functions that do the actual work.

import json

# Load method definitions
from graviton import *
# Load assignment of samples into groups
from dataGroups import *

# Obtain a session
key = getSession()

# Upload the input data to the server
matrixId = uploadFile(key, 'matrix.txt')['id']

# Compute scores for the input data and the
# data groups using the shrinkage-t-test
# The first call will only create the job object
# on the server, but will not yet compute anything.
setupScoring(key, 'independent-shrinkage-t-test',
  file1 = matrixId,
  sg = json.dumps(groups['sg']),
  rg = json.dumps(groups['rg'])
)

# Run the actual computation
scores = runJob(key)['scores']['id']

# Create a list of categories for which we
# want to compute our enrichments
categories = [
  '9606-gene-go-biologicalprocess',
  '9606-gene-kegg-pathways',
  '9606-gene-reactome-pathways',
  '9606-gene-pfam-proteinfamilies',
]

# Create and run the job for the enrichment.
# We use the GSEA algorithm here.
setupEnrichment(key, 'gsea', scores, categories)
result = runJob(key)['enrichment']['id']

# Download and store the results
downloadResult(key, result, 'mrnaAllSamples.gsea.zip')

You can download the example input file here. The dataGroups.py script can be downloaded here.
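For orientation, the script above only requires that dataGroups defines a dictionary named groups with the keys 'sg' and 'rg'. The following sketch shows what such a module might contain. It is an assumption, not the content of the downloadable file: the sample names are hypothetical, and reading 'sg' as the sample group and 'rg' as the reference group is a guess based on the key names.

```python
import json

# Hypothetical sample names; they would have to match the column
# names of the uploaded expression matrix (assumption).
groups = {
    "sg": ["tumor_1", "tumor_2", "tumor_3"],        # presumably the sample group
    "rg": ["control_1", "control_2", "control_3"],  # presumably the reference group
}

# The main script serializes each group to a JSON array before
# passing it on as a job parameter.
sample_group_json = json.dumps(groups["sg"])
reference_group_json = json.dumps(groups["rg"])
```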

The functions getSession, uploadFile, setupScoring, setupEnrichment, runJob, and downloadResult are defined in the graviton module, which is available from GitHub. You can install it via pip install git+git://github.com/dstoeckel/Graviton.py.git.

Bindings

In order to facilitate using GeneTrail from other languages, we created a set of bindings. Currently, bindings are available for: