library(dyno)
library(tidyverse)
dynwrap::infer_trajectory is the main function to infer a trajectory. It requires two things: a dataset wrapped with dynwrap::wrap_expression, and a TI method such as ti_comp1().
dataset <- wrap_expression(
  counts = example_dataset$counts,
  expression = example_dataset$expression
)
model <- infer_trajectory(dataset, ti_comp1())
## Loading required namespace: hdf5r
This model now contains the main information on the trajectory, i.e. the milestone_network and progressions:
model$milestone_network
## from to length directed
## 1 milestone_begin milestone_end 1 FALSE
head(model$progressions, 10)
## cell_id from to percentage
## 1 Cell1 milestone_begin milestone_end 0.25088041
## 2 Cell2 milestone_begin milestone_end 0.36976202
## 3 Cell3 milestone_begin milestone_end 0.56873343
## 4 Cell4 milestone_begin milestone_end 0.91506325
## 5 Cell5 milestone_begin milestone_end 0.21259337
## 6 Cell6 milestone_begin milestone_end 0.91000325
## 7 Cell7 milestone_begin milestone_end 0.90920327
## 8 Cell8 milestone_begin milestone_end 0.64542720
## 9 Cell9 milestone_begin milestone_end 0.60957610
## 10 Cell10 milestone_begin milestone_end 0.06369673
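To make the relationship between the two tables concrete, the following base-R sketch copies the first five rows of the output above by hand and turns each cell's position into a single pseudotime value. It assumes a linear trajectory with one edge, so the position is simply the percentage multiplied by the edge length. The variable names are local to this example and are not part of the dynwrap API.

```r
# Hand-copied from the output above; a real analysis would use
# model$milestone_network and model$progressions directly.
milestone_network <- data.frame(
  from = "milestone_begin",
  to = "milestone_end",
  length = 1,
  directed = FALSE,
  stringsAsFactors = FALSE
)

progressions <- data.frame(
  cell_id = c("Cell1", "Cell2", "Cell3", "Cell4", "Cell5"),
  from = "milestone_begin",
  to = "milestone_end",
  percentage = c(0.25088041, 0.36976202, 0.56873343, 0.91506325, 0.21259337),
  stringsAsFactors = FALSE
)

# Attach the edge length to each cell by matching on the edge (from, to)
merged <- merge(progressions, milestone_network, by = c("from", "to"))

# Assumption: a single edge, so position = percentage * edge length
merged$pseudotime <- merged$percentage * merged$length

# Order cells along the edge. The edge is undirected here, so which end
# counts as the "start" is only a labelling convention.
ordered_cells <- merged$cell_id[order(merged$pseudotime)]
print(ordered_cells)
```

For trajectories with several edges this shortcut no longer holds, because a cell's position then also depends on where its edge sits in the network.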
Running methods inside a Docker or Singularity container reduces dependency problems and makes an analysis more reproducible, but it can also add considerable overhead. We plan to wrap some of the more popular methods directly in R. See https://github.com/dynverse/dynmethods/issues/152 for an overview.
Optionally, you can also pass parameters to the method. These are documented in the help page of each method function in dynmethods (also available in the reference section):
?ti_comp1
## Component 1
##
## Description:
##
## Will generate a trajectory using Component 1.
##
## This method was wrapped inside a container.
##
## Usage:
##
## ti_comp1(dimred = "pca", ndim = 2L, component = 1L)
##
## Arguments:
##
## dimred: Which dimensionality reduction method to use. Domain: pca,
## mds, tsne, ica, lle, landmark_mds, mds_sammon, mds_isomds,
## mds_smacof, umap, dm_diffusionMap. Default: pca. Format:
## character.
##
## ndim: . Domain: U(2, 30). Default: 2. Format: integer.
##
## component: . Domain: U(1, 10). Default: 1. Format: integer.
##
## Value:
##
## A TI method wrapper to be used together with 'infer_trajectory'
model <- infer_trajectory(dataset, ti_comp1(dimred = "landmark_mds"))
The method will raise an error if it requires prior information that is not provided in the dataset. Some methods can also use optional prior information, which you can pass to the method using the give_priors argument:
dataset <- dataset %>% add_prior_information(start_id = example_dataset$prior_information$start_id)
model <- infer_trajectory(dataset, ti_comp1(), give_priors = c("start_id"))
An overview of all possible priors is given in the dynwrap::priors tibble:
dynwrap::priors
prior_id | name | description | type |
---|---|---|---|
start_id | Start cell(s) | One or more start cell identifiers | soft |
end_id | End cell(s) | One or more end cell identifiers | soft |
end_n | # end states | The number of end states | soft |
start_n | # start states | The number of start states | soft |
groups_id | Cell clustering | Named character vector linking the cell identifiers to different states/branches | hard |
groups_n | # states | Number of states/branches, including start, end and intermediary states | soft |
groups_network | State network | Dataframe containing the known network between states/branches. Contains a from and to column | hard |
timecourse_continuous | Time course (continuous) | Named numeric vector linking the cell ids to time points | hard |
timecourse_discrete | Time course (discrete) | Named numeric vector linking the cell ids to time course points | hard |
features_id | Marker genes | Genes/features known to be important in the dynamic process | soft |
dataset | The full dataset | The full dataset, including (if available) the gold standard | hard |
You can find all methods that can use a certain type of prior by filtering the dynmethods::methods tibble. For example, to find all method wrappers that can use timecourse data:
dynmethods::methods %>%
  filter(map_lgl(wrapper_inputs, ~ any(c("timecourse_discrete", "timecourse_continuous") %in% .$input_id))) %>%
  select(method_id, method_name)
## # A tibble: 2 x 2
## method_id method_name
## <chr> <chr>
## 1 grandprix GrandPrix
## 2 scuba SCUBA
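The filter above works because wrapper_inputs is a list column: each row holds its own small table of accepted inputs, and map_lgl reduces each of those tables to a single TRUE or FALSE. The base-R sketch below shows the same pattern on a toy table. The method ids and their inputs are invented for illustration and do not describe any real method in dynmethods; vapply stands in for purrr::map_lgl.

```r
# Toy stand-in for dynmethods::methods (hypothetical ids and inputs)
methods_toy <- data.frame(
  method_id = c("method_a", "method_b", "method_c"),
  stringsAsFactors = FALSE
)

# List column: one data frame of accepted inputs per method
methods_toy$wrapper_inputs <- list(
  data.frame(input_id = "expression", stringsAsFactors = FALSE),
  data.frame(input_id = c("expression", "timecourse_continuous"), stringsAsFactors = FALSE),
  data.frame(input_id = c("expression", "timecourse_discrete"), stringsAsFactors = FALSE)
)

timecourse_priors <- c("timecourse_discrete", "timecourse_continuous")

# One logical per row: does this method accept any timecourse prior?
uses_timecourse <- vapply(
  methods_toy$wrapper_inputs,
  function(inputs) any(timecourse_priors %in% inputs$input_id),
  logical(1)
)

timecourse_methods <- methods_toy$method_id[uses_timecourse]
print(timecourse_methods)
```

Swapping timecourse_priors for another set of prior ids, such as "start_id", gives the equivalent query for a different prior.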
To make the execution of a method reproducible, fix the seed either using set.seed() or through the seed argument of infer_trajectory:
set.seed(1)
model <- dynwrap::infer_trajectory(dataset, ti_comp1())
Often it is useful to run multiple methods and/or use multiple datasets. While you can easily parallelise this yourself, we provide a helper function for this: dynwrap::infer_trajectories(). This function integrates well with future interpretation and plotting functions.
models <- dynwrap::infer_trajectories(
  dataset,
  method = list(ti_comp1(), ti_angle())
)
This function returns a tibble with one row per combination of dataset and method, containing the inferred model and extra information such as a summary.
models
## # A tibble: 2 x 7
## dataset_ix method_ix dataset_id method_id method_name model summary
## <int> <int> <chr> <chr> <chr> <list> <list>
## 1 1 1 20190409_163… comp1 Component 1 <S3: dy… <tibbl…
## 2 1 2 20190409_163… angle Angle <S3: dy… <tibbl…
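Because the models sit in a list column, a single model is retrieved by position or by matching on method_id. The following base-R sketch uses a hand-built stand-in for the table above. The name models_toy and the placeholder model objects are invented for illustration; with real output the same indexing would apply to models$model.

```r
# Toy stand-in for the result of infer_trajectories(): two rows, one per method
models_toy <- data.frame(
  method_id = c("comp1", "angle"),
  method_name = c("Component 1", "Angle"),
  stringsAsFactors = FALSE
)

# Placeholder objects instead of real trajectory models (assumption)
models_toy$model <- list(
  list(label = "placeholder model for comp1"),
  list(label = "placeholder model for angle")
)

# Retrieve one model by method id; [[ ]] unwraps the list element
angle_model <- models_toy$model[[match("angle", models_toy$method_id)]]
print(angle_model$label)
```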
Some methods can generate errors which are beyond our control. To find out what went wrong and where, turn on verbosity. In the example below, the expression matrix is deliberately filled with NA values to trigger an error:
dataset$expression@x[] <- NA
model <- dynwrap::infer_trajectory(dataset, ti_comp1(), give_priors = c("start_id"), verbose = TRUE)
## Error: Error during trajectory inference
## Loading required package: dynutils
## Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x'
## Calls: <Anonymous> ... <Anonymous> -> <Anonymous> -> prcomp.default -> svd
## Execution halted
## Executing 'comp1' on '20190409_163010__data_wrapper__pbfhZY5hhE'
## With parameters: list(dimred = "pca", ndim = 2L, component = 1L),
## inputs: expression, and
## priors :
## Input saved to /tmp/Rtmp50BQLE/file44714125ec76/ti
## Running method using babelwhale
## Running /usr/bin/docker run -e 'TMPDIR=/tmp2' --workdir /ti/workspace -v \
## '/tmp/Rtmp50BQLE/file44714125ec76/ti:/ti' -v \
## '/tmp/Rtmp50BQLE/file447173b4375f/tmp:/tmp2' \
## 'dynverse/ti_comp1:v0.9.9.01' --dataset /ti/input.h5 --output \
## /ti/output.h5
## Loading required package: dynutils
## Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x'
## Calls: <Anonymous> ... <Anonymous> -> <Anonymous> -> prcomp.default -> svd
## Execution halted
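The traceback above ends in prcomp.default and svd, which suggests the failure comes from running a PCA on a matrix with missing values. As a minimal sketch, the same failure can be reproduced and caught in base R without a container. The small dense matrix below is a stand-in for dataset$expression and is an assumption of this example, as is the idea of running a pre-flight check before calling a method.

```r
# Small dense stand-in for the expression matrix (3 cells x 4 features),
# filled with NA like the broken dataset above
expression_toy <- matrix(NA_real_, nrow = 3, ncol = 4)

# Pre-flight check: cheap to run before starting a method
has_missing <- anyNA(expression_toy)

# Reproduce the failure locally and capture the message instead of stopping
error_message <- tryCatch(
  {
    prcomp(expression_toy)
    NA_character_  # only reached if prcomp succeeds
  },
  error = function(e) conditionMessage(e)
)

print(has_missing)
print(error_message)
```

Checking for missing or infinite values up front is usually faster than waiting for a container to start and fail, although it only covers this one class of error.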
It is also possible to run each method in dynmethods from the command line:
docker run dynverse/ti_comp1 --help
## Usage: LOCAL=/path/to/folder; MOUNT=/ti; docker run -v $LOCAL:$MOUNT dynverse/ti_comp1
##
##
## Options:
## -h, --help
## Show this help message and exit
##
## --dataset=DATASET
## Filename of the dataset, example: $MOUNT/dataset.(h5|loom). h5 files can be created by cellranger or dyncli.
##
## --loom_expression_layer=LOOM_EXPRESSION_LAYER
## If available, the name of the loom-layer containing normalised log-transformed data.
##
## --output=OUTPUT
## Filename of the output trajectory data, example: $MOUNT/output.h5.
##
## --parameters=PARAMETERS
## A file containing method-specific parameters, example: $MOUNT/parameters.(h5|yml).
##
## --dimred=DIMRED
## Parameter: Which dimensionality reduction method to use
## Domain: {pca, mds, tsne, ica, lle, landmark_mds, mds_sammon, mds_isomds, mds_smacof, umap, dm_diffusionMap}
## Default: pca
## Format: character
##
## --ndim=NDIM
## Parameter:
## Domain: U(2, 30)
## Default: 2
## Format: integer
##
## --component=COMPONENT
## Parameter:
## Domain: U(1, 10)
## Default: 1
## Format: integer
##
## --verbosity=VERBOSITY
## The verbosity level: 0 => none, 1 => critical (default), 2 => info, 3 => debug.
##
## --seed=SEED
## A seed to be set to ensure reproducability.
This container will output an HDF5 file, which can be read by tools such as dynutils::read_h5().