fit Function
Fit steady state or transient GM/GRSM model to RNA data for a single gene, write the result (through function finalize), and return fit results and diagnostics.
For coupled transcribing units, arguments transitions, G, R, S, insertstep, and trace become tuples of the single unit type, e.g. If two types of transcription models are desired with G=2 and G=3 then G = (2,3).
Syntax
fits = fit(; kwargs...)Arguments
Basic Model Parameters
G::Int = 2: Number of gene statesR::Int = 0: Number of pre-RNA stepsS::Int = 0: Number of splice sites (must be ≤ R - insertstep + 1)insertstep::Int = 1: Reporter insertion steptransitions::Tuple: Tuple of vectors specifying state transitions
Data Parameters
datatype: Data type (supported: :rna, :rnacount, :rnaonoff, :rnadwelltime, :dwelltime, :trace, :tracerna, :tracejoint, :tracegrid)datapath: Path to data filesroot: Root directory containing data and results foldersgene: Gene name(s)cell: Cell typedecayrate: mRNA decay rate (if < 0, will be looked up)
Coupled Models
coupling: Coupling structure between unitsnalleles: Number of alleles (automatically set to 1 if coupling is nonempty)
Transition Types
TransitionType: String specifying transition pattern (e.g., "KP", "cyclic")
Returns
- Model parameters
- Posterior distributions
- Predicted data
- Goodness of fit metrics
- Model diagnostics
- Analysis results
Notes
- For coupled models, many parameters become tuples (e.g., G = (2,3) for two units with 2 and 3 gene states)
- The number of splice sites S cannot exceed R - insertstep + 1
- If coupling is used, nalleles is automatically set to 1
- Default transitions can be generated using the
get_transitionsfunction
Data Parameters
datatype::String = "rna": Data type:- "rna": mRNA count distributions
- "trace": Intensity traces
- "rnadwelltime": Combined RNA and dwell time data
- "tracejoint": Simultaneous recorded traces
datapath::String = "": Path to data file or folderdatacond::String = "": Data condition identifiercell::String = "": Cell typegene::String = "MYC": Gene namenalleles::Int = 1: Number of alleles
Fitting Parameters
nchains::Int = 2: Number of MCMC chainsmaxtime::Float64 = 60: Maximum wall time (minutes)samplesteps::Int = 1000000: Number of MCMC sampling stepswarmupsteps::Int = 0: Number of warmup stepspropcv::Float64 = 0.01: Proposal distribution coefficient of variationannealsteps::Int = 0: Number of annealing stepstemp::Float64 = 1.0: MCMC temperature
Prior Parameters
priormean::Vector{Float64} = Float64[]: Mean rates for prior distributionpriorcv::Float64 = 10.0: Prior distribution coefficient of variationnoisepriors::Vector = []: Observation noise priorsfittedparams::Vector{Int} = Int[]: Indices of rates to fitfixedeffects::Tuple = tuple(): Fixed effects specificationhierarchical::Tuple = tuple(): Hierarchical model specification
Trace Parameters (for trace data)
traceinfo::Tuple = (1.0, 1., -1, 1.): Trace parameters:- Frame interval (minutes)
- Starting frame time (minutes)
- Ending frame time (-1 for last)
- Fraction of active traces
datacol::Int = 3: Data column indexprobfn::Function = prob_Gaussian: Observation probability functionnoiseparams::Int = 4: Number of noise parameterszeromedian::Bool = false: Subtract median from traces
Output Parameters
resultfolder::String = "test": Results output folderlabel::String = "": Output file labelinfolder::String = "": Folder for initial parametersinlabel::String = "": Label of initial parameter fileswritesamples::Bool = false: Write MCMC samples
Returns
fits: MCMC fit results containing:- Posterior samples
- Log-likelihoods
- Acceptance rates
Examples
Basic RNA Histogram Fit
fits = fit(
G = 2,
R = 0,
transitions = ([1,2], [2,1]),
datatype = "rna",
datapath = "data/HCT116_testdata/",
gene = "MYC",
datacond = "MOCK"
)Trace Data Fit
fits = fit(
G = 3,
R = 2,
S = 2,
insertstep = 1,
transitions = ([1,2], [2,1], [2,3], [3,1]),
datatype = "trace",
datapath = "data/testtraces",
cell = "TEST",
gene = "test",
datacond = "testtrace",
traceinfo = (1.0, 1., -1, 1.),
noisepriors = [40., 20., 200., 10.],
nchains = 4
)Notes
Rate Order
- G transitions
- R transitions
- S transitions
- Decay
- Noise parameters
MCMC Convergence
- R-hat should be close to 1 (ideally < 1.05)
- Increase
maxtimeorsamplestepsif R-hat is high - Use
warmupstepsto improve proposal distribution