fit Function
The full keyword list, coupling description, key-based loading, and return values are maintained in the in-source docstring for fit. In the Julia REPL, use ?fit after using StochasticGene. This page is a short summary; details may lag the code.
fit fits steady-state or transient GM/GRSM models to data for a single gene (or coupled units), writes results through finalize, and returns fit results and diagnostics.
For coupled transcribing units, arguments transitions, G, R, S, insertstep, and trace-related settings become tuples of the single-unit type (e.g. two units with G = (2, 3)).
Syntax
```julia
fits, stats, measures, data, model, options = fit(; kwargs...)
```

The keyword form is the usual entry point. Positional overloads exist for advanced / legacy callers; optional inference keywords on the positional path are forwarded into `make_structures` (see the `_MAKE_STRUCTURES_OPTION_KW` constant in `fit.jl`).
Inference methods
Posterior / variational inference is selected with inference_method (also accepted: plain symbols :mh, :nuts, :advi, or the aliases INFERENCE_MH, INFERENCE_NUTS, INFERENCE_ADVI):
| Method | Meaning |
|---|---|
| `:mh` / `INFERENCE_MH` | Metropolis–Hastings MCMC (`run_mh` / `metropolis_hastings`); default. |
| `:nuts` / `INFERENCE_NUTS` | NUTS HMC on the same transformed parameter space as MH (`run_nuts_fit`, AdvancedHMC). |
| `:advi` / `INFERENCE_ADVI` | Mean-field ADVI (`run_advi_fit`); returns a variational approximation (see notes below). |
Shared budget keywords are harmonized in load_options when building option structs:
- `samplesteps`: for MH, the number of posterior samples to collect; for NUTS, `n_samples`; for ADVI, `maxiter`, unless you set `maxiter` explicitly in the run dict.
- `warmupsteps`: for MH, discarded warmup (shares wall time with sampling via `maxtime`); for NUTS, `n_adapts`, unless `n_adapts` is set explicitly; for ADVI, not the same semantic object (use `n_mc` for ELBO Monte Carlo draws).
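The mapping above can be sketched in plain Julia. This is a minimal illustration, not the package's `load_options`: the function name `harmonize_budget`, the use of a `Dict{Symbol,Any}` run dict, and the default of `0` for a missing `warmupsteps` are all assumptions made for the example.

```julia
# Hypothetical sketch: map the shared budget keywords onto method-specific names.
# This is NOT StochasticGene's load_options; only the key names come from the text above.
function harmonize_budget(method::Symbol, spec::Dict{Symbol,Any})
    out = Dict{Symbol,Any}()
    if method === :mh
        out[:samplesteps] = spec[:samplesteps]
        out[:warmupsteps] = get(spec, :warmupsteps, 0)   # default of 0 is an assumption
    elseif method === :nuts
        out[:n_samples] = spec[:samplesteps]
        # An explicit n_adapts takes precedence over warmupsteps.
        out[:n_adapts] = get(spec, :n_adapts, get(spec, :warmupsteps, 0))
    elseif method === :advi
        # An explicit maxiter takes precedence over samplesteps.
        out[:maxiter] = get(spec, :maxiter, spec[:samplesteps])
        # warmupsteps has no ADVI counterpart; n_mc is passed through only if present.
        haskey(spec, :n_mc) && (out[:n_mc] = spec[:n_mc])
    else
        throw(ArgumentError("unknown inference method: $method"))
    end
    return out
end

spec = Dict{Symbol,Any}(:samplesteps => 500, :warmupsteps => 250)
nuts = harmonize_budget(:nuts, spec)
advi = harmonize_budget(:advi, merge(spec, Dict{Symbol,Any}(:maxiter => 2000)))
```

Here `nuts` carries `n_samples` and `n_adapts` taken from the shared keywords, while `advi` keeps the explicit `maxiter` rather than `samplesteps`.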
Other cross-method options stored on MHOptions, NUTSOptions, and ADVIOptions in common.jl:
- `device`: `:cpu` or `:gpu` (GPU paths may error if unsupported for a method).
- `parallel` (alias `parallelism`): `:single`, `:threaded`, or `:distributed`; used with `nchains` for multi-chain NUTS/ADVI dispatch (`run_inference`). MH multi-chain still uses the existing distributed MH chain runner when `nchains > 1`.
- `gradient`: method-specific; e.g. `:finite`, `:ForwardDiff`, `:Zygote` for NUTS/ADVI; `:none` is the default for MH. String values in TOML/JSON-style dicts are coerced when possible.
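To illustrate the string coercion mentioned in the last item, here is a minimal sketch. The helper `coerce_option` and the constant `SYMBOL_OPTIONS` are hypothetical names invented for this example; the package's actual coercion rules may differ.

```julia
# Hypothetical helper, not part of StochasticGene: coerce option values that arrive
# as strings (for example from a TOML/JSON-style dict) into symbols.
const SYMBOL_OPTIONS = (:device, :parallel, :gradient)

function coerce_option(key::Symbol, value)
    if key in SYMBOL_OPTIONS && value isa AbstractString
        # Accept both "threaded" and ":threaded".
        return Symbol(lstrip(value, ':'))
    end
    return value   # everything else passes through unchanged
end

raw = Dict("device" => "cpu", "parallel" => ":threaded",
           "gradient" => "ForwardDiff", "nchains" => 4)
opts = Dict(Symbol(k) => coerce_option(Symbol(k), v) for (k, v) in raw)
```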
make_structures merges _current_run_spec[] (from fit(; key=...)) with explicit samplesteps / warmupsteps / maxtime / temp and any extra kwargs..., then calls load_options on that dict. You can also call load_options directly in scripts or tests.
Arguments (high level)
Basic model parameters
- `G::Int`, `R::Int`, `S::Int`, `insertstep::Int`, `transitions`: Model topology (see the in-source `fit` docstring).
- `coupling`, `grid`, `hierarchical`: Coupled / grid / hierarchical layouts.
Data parameters
- `datatype`: String or symbol for a single legacy datatype, e.g. `"rna"`, `:trace`, `"rnadwelltime"`, `"tracejoint"`. A tuple or vector such as `(:rna, :dwelltime)` requests the v1.10 `CombinedData` path.
- `datapath`: Path, vector/tuple of paths, or for `CombinedData` preferably a modality-keyed `NamedTuple`, e.g. `(rna = "smFISH", dwelltime = ["ON.csv", "OFF.csv"])`.
- `datacond`, `cell`, `gene`, `nalleles`, `trace_specs`, `dwell_specs`, …
Fitting / inference parameters
- `nchains::Int`: Number of parallel chains for `fit(nchains, data, model, …)` dispatch (`run_inference`); for MH this matches the existing pooled-chain behavior; for NUTS/ADVI, multi-chain runs are merged when `nchains > 1` (see `src/inference_common.jl` in the package source).
- `maxtime`: Primary wall budget for MH (warmup + sampling); numeric seconds or strings like `"90m"`, `"2h"` (see `maxtime_seconds`). Interpretation for NUTS/ADVI is method-specific (many NUTS/ADVI controls are in the option structs from `load_options`).
- `samplesteps`, `warmupsteps`, `temp`: As above; `temp` is the MH temperature; NUTS/ADVI paths use a neutral value for finalize when not applicable.
- `propcv`: MH proposal CV / covariance loading (see Package overview).
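As an illustration of the `maxtime` string forms, the sketch below converts `"90m"` and `"2h"` to seconds. The function `parse_maxtime` is a hypothetical stand-in written for this page; it is not the package's `maxtime_seconds`, which may accept other suffixes or formats.

```julia
# Hypothetical parser for the documented forms only: plain numbers (seconds),
# "<n>m" (minutes), and "<n>h" (hours). Not the package's maxtime_seconds.
parse_maxtime(x::Real) = float(x)   # plain numbers are already seconds

function parse_maxtime(s::AbstractString)
    m = match(r"^\s*(\d+(?:\.\d+)?)\s*([mh])\s*$", s)
    m === nothing && throw(ArgumentError("unrecognized maxtime string: $s"))
    value = parse(Float64, m.captures[1])
    scale = m.captures[2] == "m" ? 60.0 : 3600.0
    return value * scale
end

budget_minutes = parse_maxtime("90m")
budget_hours = parse_maxtime("2h")
budget_plain = parse_maxtime(120)
```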
Priors, indices, outputs
- `priormean`, `priorcv`, `noisepriors`, `fittedparam`, `fixedeffects`, `onstates`, `decayrate`, …
- `resultfolder`, `label`, `writesamples`, `burst`, `optimize`, …
v1.10 API changes
- `CombinedData`: New multimodal fits use `datatype = (:rna, :trace)` or `datatype = (:rna, :dwelltime)`. The order is canonicalized, likelihoods are evaluated per modality, scalar likelihoods are summed, and WAIC pointwise predictions are concatenated in canonical modality order.
- Retired legacy input keywords: `infolder` and `inlabel` are no longer part of the public API. Use `root`, `datapath`, `label`, and `resultfolder`. Older key-based run specs may still contain `infolder`; `fit(; key=...)` ignores it during migration.
- Trace and dwell metadata: Prefer `trace_specs` and `dwell_specs`. Legacy `traceinfo` and `dttype` entries may be consumed from old run specs, then are dropped from newly written run specs.
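The per-modality combination described for `CombinedData` can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `CANONICAL_ORDER`, `canonicalize`, and `combine_likelihoods` are hypothetical names, the ordering shown is made up for the example, and the scalar values are assumed to be on the log scale so that summing them is meaningful.

```julia
# Hypothetical sketch of combining per-modality likelihood terms.
# ASSUMPTION: this ordering is illustrative; the package defines its own canonical order.
const CANONICAL_ORDER = (:rna, :trace, :dwelltime)

# Return the requested modalities in canonical order, whatever order they were given in.
canonicalize(modalities) = Tuple(m for m in CANONICAL_ORDER if m in modalities)

function combine_likelihoods(pointwise::Dict{Symbol,Vector{Float64}})
    order = canonicalize(keys(pointwise))
    total = sum(sum(pointwise[m]) for m in order)          # scalar terms are summed
    stacked = reduce(vcat, (pointwise[m] for m in order))  # pointwise terms are concatenated
    return order, total, stacked
end

pointwise = Dict(:dwelltime => [-1.0, -2.0], :rna => [-0.5, -0.25, -0.25])
order, total, stacked = combine_likelihoods(pointwise)
```

Although `:dwelltime` is listed first in the input, the RNA terms come first in `stacked` because the order is canonicalized before concatenation.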
Run specification and key-based naming
- `key = nothing`: When a string, `fit` loads `info_<key>.jld2` (companion to the marker TOML) and merges with explicit keywords (keywords win). Outputs use that stem. See Run specification (info TOML).
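The "keywords win" rule behaves like a right-biased dictionary merge. The sketch below uses plain `Dict`s with made-up values; `saved_spec` and `explicit_kwargs` are hypothetical names, and the package's actual loading code is not shown.

```julia
# Hypothetical illustration of "keywords win"; not the package's loading code.
saved_spec = Dict{Symbol,Any}(:G => 2, :R => 0, :samplesteps => 100_000,
                              :inference_method => :mh)
explicit_kwargs = Dict{Symbol,Any}(:samplesteps => 500, :inference_method => :nuts)

# On key collisions, merge keeps the value from the rightmost collection,
# so explicit keywords override the saved run spec.
effective = merge(saved_spec, explicit_kwargs)
```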
Returns
The top-level fit(; …) return tuple is:
```julia
fits, stats, measures, data, model, options = fit(; ...)
```

- `fits`: a `Fit` holding posterior samples (MH/NUTS) or a variational mean column (ADVI); `ll`, WAIC-related fields, and acceptance summaries depend on the method.
- `stats`, `measures`: `Stats`, `Measures`; comparable structs across methods where meaningful (ADVI may use single-point proxies for some diagnostics; see the `run_advi_fit` docstring in the source).
- `data`, `model`, `options`: The concrete `Options` subtype is `MHOptions`, `NUTSOptions`, or `ADVIOptions` depending on `inference_method`.
Examples
Basic RNA histogram (MH default)
```julia
fits, stats, measures, data, model, options = fit(
    G = 2,
    R = 0,
    transitions = ([1,2], [2,1]),
    datatype = "rna",
    datapath = "data/HCT116_testdata/",
    gene = "MYC",
    datacond = "MOCK",
)
```

NUTS (same keyword surface; different options type)
```julia
fits, stats, measures, data, model, options = fit(
    G = 2,
    R = 0,
    transitions = ([1,2], [2,1]),
    datatype = "rna",
    datapath = "data/HCT116_testdata/",
    gene = "MYC",
    datacond = "MOCK",
    inference_method = :nuts,
    samplesteps = 500,
    warmupsteps = 250,
    parallel = :single,
    gradient = :ForwardDiff,
)
```

Combined RNA + dwell-time data
```julia
fits, stats, measures, data, model, options = fit(
    datatype = (:rna, :dwelltime),
    datapath = (
        rna = "HBEC_smFISH",
        dwelltime = [
            "dwelltime/CANX_ON.csv",
            "dwelltime/CANX_OFF.csv",
            "dwelltime/CANX_ONG.csv",
        ],
    ),
    gene = "CANX",
    cell = "HBEC",
    datacond = "",
    resultfolder = "FISH",
    dwell_specs = [
        (
            unit = 1,
            onstates = [Int[], Int[], [2, 3]],
            dttype = ["ON", "OFF", "ONG"],
        ),
    ],
)
```

For new scripts, the keyed `datapath` form is preferred over the legacy positional `"rnadwelltime"` layout. See v1.10 CombinedData API.
Trace fit with multiple MH chains
```julia
fits, stats, measures, data, model, options = fit(
    G = 3,
    R = 2,
    S = 2,
    insertstep = 1,
    transitions = ([1,2], [2,1], [2,3], [3,1]),
    datatype = "trace",
    datapath = "data/testtraces",
    cell = "TEST",
    gene = "test",
    datacond = "testtrace",
    trace_specs = [(unit = 1, interval = 1.0, start = 1.0, t_end = -1.0, zeromedian = true, active_fraction = 1.0, background = 0.0)],
    noisepriors = [40., 20., 200., 10.],
    nchains = 4,
)
```

Notes
- Rate order: G transitions, R transitions, S if present, decay, noise parameters (see Package overview).
- Convergence: R-hat, ESS, etc. apply to sample-based chains; interpret ADVI diagnostics separately.
- MH proposal covariance and warmup: See Package overview.
- Key-based workflows: Use `key="..."` for reproducible cluster runs and `write_run_spec_preset` / `makeswarmfiles`; include `inference_method` / `parallel` / `gradient` in the saved dict when needed.
- Cluster scripts: `makeswarm` and related helpers can emit `fit(; key=..., inference_method=:nuts, …)` overrides; positional gene/coupled scripts append `; kw=...` suffixes for the same keywords. See Cluster and batch workflows.
- Migration from older inputs: Replace `infolder` with `datapath`; replace `inlabel` usage with `label`; keep output routing in `resultfolder`. Prefer `trace_specs` / `dwell_specs` over `traceinfo` / `dttype`.
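For readers unfamiliar with the R-hat diagnostic named in the convergence note, the sketch below computes the classic (non-split) Gelman-Rubin statistic for a single parameter. It is a generic textbook formula written for illustration; the function name `rhat` and the samples-by-chains matrix layout are assumptions, and the package may compute its diagnostics differently.

```julia
using Statistics

# Classic (non-split) Gelman-Rubin R-hat for one parameter.
# `chains` is an n_samples × n_chains matrix; this layout is an assumption of the sketch.
function rhat(chains::AbstractMatrix{<:Real})
    n = size(chains, 1)
    chain_means = vec(mean(chains; dims = 1))
    chain_vars = vec(var(chains; dims = 1))
    W = mean(chain_vars)                  # within-chain variance
    B = n * var(chain_means)              # between-chain variance
    var_plus = (n - 1) / n * W + B / n    # pooled estimate of the posterior variance
    return sqrt(var_plus / W)
end

separated = [1.0 11.0; 2.0 12.0; 3.0 13.0]   # two chains stuck in different regions
identical = [1.0 1.0; 2.0 2.0; 3.0 3.0]      # two chains with identical draws
r_separated = rhat(separated)
r_identical = rhat(identical)
```

Values well above 1 (as for `separated`) indicate that the chains disagree; this only makes sense for sample-based methods such as MH and NUTS.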