SFrontiers.jl: Stochastic Frontier Model Estimation: Simulation-Based and Analytic Methods

User Manual for Julia Implementation

© Hung-Jen Wang
wangh@ntu.edu.tw


What is it

SFrontiers.jl is a Julia package for flexible estimation of stochastic frontier models. It uses simulated maximum likelihood, automatic differentiation, and GPU acceleration to deliver accurate estimation with practical runtime. The package supports three distributions for the noise term (v), eight for the inefficiency term (u), copula-based dependence, and selected panel-data specifications, all within a unified workflow. Its estimates typically match closed-form MLE benchmarks to several decimal places.


Table of Contents

  1. Introduction
  2. Hardware and Software Requirements
  3. Installation and Dependencies
  4. Quick Start and Reference Example
  5. A Detailed Empirical Example
  6. API Reference
  7. Supported Models
  8. Special Topics
  9. Panel Data Models

1. Introduction

This package provides a unified framework for estimating stochastic frontier (SF) models via simulation-based likelihood evaluation and, for a limited subset of models, analytic maximum likelihood estimation (MLE). The simulation-based methods, including the Maximum Simulated Likelihood Estimation (MSLE) and Monte Carlo Integration (MCI), support a broad class of distributional and dependence specifications. For the classical Normal–half-normal, Normal–truncated-normal, and Normal–Exponential models, closed-form analytic MLE is also available.

To address the traditional drawbacks of simulation-based estimation, the package uses automatic differentiation for numerical accuracy and GPU computing for speed. Monte Carlo evidence in Wang and Cheng (2026) shows that the resulting estimators achieve bias and RMSE comparable to those of analytic maximum likelihood estimators where such benchmarks exist.

For the theoretical background of the methods, see Wang and Cheng (2026).

Key features of the package:


2. Hardware and Software Requirements

2.1 Hardware

2.2 Software


3. Installation and Dependencies

SFrontiers.jl is a registered Julia package. Install it from Julia using the built-in package manager Pkg:

using Pkg
Pkg.add("SFrontiers")

Load SFrontiers.jl and the required package for CPU-only computing:

# CPU-only usage (no CUDA needed):
using SFrontiers
using Optim

Load the packages for GPU computing:

# GPU usage:
using CUDA          # Load CUDA first
using SFrontiers    # Then load SFrontiers
using Optim

Note 1: CUDA.jl is needed only if you want GPU-accelerated estimation via sfmodel_method(method=:MCI, GPU=true) or sfmodel_method(method=:MSLE, GPU=true). MLE estimation does not use GPU.

Note 2: If you plan to use GPU features, CUDA.jl must be loaded before SFrontiers.jl. This is because SFrontiers conditionally detects CUDA at load time and registers GPU function overloads only if CUDA is already available. If CUDA is loaded after SFrontiers, GPU features will not be available and you will need to restart Julia and load them in the correct order.

Note 3: If you want to use the diagnostic plot recipes (e.g., plot(result, :efficiency)), you must also load Plots.jl by using Plots.


4. Quick Start and Reference Example

Here we show a complete five-step estimation example. Some steps are optional and can be skipped; more examples latter.

using CUDA
using SFrontiers
using Optim
using CSV, DataFrames

# Load data. The csv has column names y, x1, x2, and x3 that are used as variable names.
df = CSV.read("demodata.csv", DataFrame)
yvar = df.y
X = hcat(ones(length(y)), df.x1, df.x2)  # Frontier variables (include constant)
Z = hcat(ones(length(y)), df.x3)          # Inefficiency determinants

# Step 1: Specify the model
myspec = sfmodel_spec(
    type = :production,        # for production frontier; alt. :cost
    depvar = yvar,             # dependent variable; a vector
    frontier = X,              # matrix of X vars in frontier
    zvar = Z,                  # matrix of Z vars for inefficiency
    noise = :Normal,           # dist of v
    ineff = :TruncatedNormal,  # dist of u
    hetero = [:mu, :sigma_sq], # heteroscedastic of u's μ & σ²
)

# alternative Step 1: DSL style using DataFrame column names
df._cons = ones(nrow(df))      # create a constant column first
myspec_alt = sfmodel_spec(
    type = :production,
    @useData(df),
    @depvar(y),                # column name in df
    @frontier(_cons, x1, x2),  # column names 
    @zvar(_cons, x3),          # column names
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu, :sigma_sq],
)

# Step 2: Choose the estimation method
mymeth = sfmodel_method(
    method = :MSLE,           # :MLE, :MSLE, or :MCI
    n_draws = 2^12,           # number of Halton draws per obs
    GPU = true,               # use GPU; default is `false` thus CPU
    chunks = 10               # for GPU memory management; default 10
)

# Step 3: Set initial values
myinit = sfmodel_init(
    spec = myspec,              # from sfmodel_spec()
    frontier = X \ yvar,        # OLS estimates
    mu = zeros(size(Z, 2)),     # μ coefficients
    ln_sigma_sq = zeros(size(Z, 2)),  # ln(σ²) coefficients
    ln_sigma_v_sq = [0.0]       # ln(σᵥ²)
)

# Step 4: Configure optimization for Julia's Optim
myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),  # first stage "warm-up" estimation
    warmstart_opt = (iterations = 200, g_abstol = 1e-5),
    main_solver = Newton(),           # 2nd stage "main" estimation
    main_opt = (iterations = 100, g_abstol = 1e-8)
)

# Step 5: Estimate the model
result = sfmodel_fit(
    spec = myspec,         # from sfmodel_spec()
    method = mymeth,       # from sfmodel_method()
    init = myinit,         # from sfmodel_init()
    optim_options = myopt, # from sfmodel_opt()
    marginal = true,       # marginal effects of Z; the default
    show_table = true      # print estimation table; the default
)

# Access results: two complementary styles

# (a) Direct field access on the result object:
println("coefficients:           ", result.coeff)
println("BC efficiency index:    ", result.bc)        # E(exp(-u)|ε), per obs
println("JLMS inefficiency idx:  ", result.jlms)      # E(u|ε), per obs
println("Marginal Effects of Z:  ", result.marginal)

# (b) StatsAPI-conformant accessor functions (work on `result`):
coeftable(result)              # formatted coefficient table (estimates, SE, z, p, CIs)
coef(result)                   # coefficient vector
vcov(result)                   # variance-covariance matrix
stderror(result)               # standard errors
confint(result; level = 0.95)  # Wald confidence intervals
loglikelihood(result)          # maximized log-likelihood
aic(result); bic(result)       # information criteria
nobs(result)                   # number of observations
residuals(result)              # composed errors (also `:u`, `:v`, `:ols` via `type=`)
fitted(result)                 # frontier fitted values x'β̂

# See Section 6 (API Reference) for further diagnostics
# (efficiency_summary, residual_diagnostics, sf_vs_ols, lrtest),
# plotting recipes, and prediction on new data.

For models with Normal noise and half-normal, truncated-normal, or Exponential inefficiency, analytic MLE is available and requires no simulation settings:

mymeth_mle = sfmodel_method(method = :MLE)  # no draws, GPU, or chunks needed
result_mle = sfmodel_fit(spec = myspec, 
                         method = mymeth_mle, 
                         init = myinit, 
                         optim_options = myopt)

5. A Detailed Empirical Example

We replicate the empirical study of Wang (2002) using the same dataset to walk through the full specification and estimation process. The model is a cross-sectional stochastic production frontier with normal noise and truncated-normal inefficiency, where both \(\mu\) and \(\sigma_u^2\) of the inefficiency distribution are parameterized by exogenous determinants.

Although this model admits a closed-form MLE (which was used by Wang 2002), we demonstrate estimation via MSLE to illustrate the simulation-based workflow. The MSLE estimates are extremely close to those of MLE in this example, with most of the estimates typically agreeing to at least 5 decimal places.

Model Setup

The specification is the follows:

\[ \begin{aligned} y_i &= x_i'\beta + \varepsilon_i, \\ \varepsilon_i &= v_i - u_i, \\ v_i &\sim N(0, \sigma_v^2), \\ u_i &\sim N^+(\mu_i, \sigma_{u,i}^2), \\ \mu_i &= z_i'\delta, \quad \sigma_{u,i}^2 = \exp(z_i'\gamma). \end{aligned} \]

Here, \(N^+(\mu, \sigma_u^2)\) denotes a truncated-normal distribution obtained by truncating \(N(\mu, \sigma_u^2)\) from below at 0. The vector \(z_i\) contains exogenous determinants of inefficiency. Wang (2002) parameterizes both \(\mu\) and \(\sigma_u^2\) by the same vector \(z_i\), while Battese and Coelli (1995) parameterize only \(\mu\).

Goals of Estimation

Our goals include:

Step 1: Load Data and Specify the Model

The data is rice farmers’ production in India. The dependent variable \(y\) is annual rice production and \(x\) is a vector of agricultural inputs.

using CUDA
using SFrontiers
using Optim
using DataFrames, CSV

df = CSV.read("sampledata.csv", DataFrame)
df._cons = ones(nrow(df))    # create a column of ones for intercepts

Let’s see what is in the data:

julia> describe(df)
11×7 DataFrame
 Row │ variable  mean       min       median    max       nmissing  eltype
     │ Symbol    Float64    Real      Float64   Real      Int64     DataType
─────┼───────────────────────────────────────────────────────────────────────
   1 │ yvar       7.27812    3.58666   7.28586   9.80335         0  Float64
   2 │ Lland      1.05695   -1.60944   1.14307   3.04309         0  Float64
   3 │ PIland     0.146997   0.0       0.0       1.0             0  Float64
   4 │ Llabor     6.84951    3.2581    6.72263   9.46622         0  Float64
   5 │ Lbull      5.64161    2.07944   5.68358   8.37008         0  Float64
   6 │ Lcost      4.6033     0.0       5.1511    8.73311         0  Float64
   7 │ yr         5.38007    1         5.0      10               0  Int64
   8 │ age       53.8856    26        53.0      90               0  Int64
   9 │ school     2.02583    0         0.0      10               0  Int64
  10 │ yr_1       5.38007    1         5.0      10               0  Int64
  11 │ _cons      1.0        1         1.0       1               0  Int64

Now we specify the model using the DSL macros:

myspec = sfmodel_spec(
    type   = :production,
    @useData(df),
    @depvar(yvar),
    @frontier(_cons, Lland, PIland, Llabor, Lbull, Lcost, yr),
    @zvar(_cons, age, school, yr),
    noise  = :Normal,
    ineff  = :TruncatedNormal,
    hetero = [:mu, :sigma_sq],    # both μ and σ²_u depend on zvar
)

Alternative: using matrix data

If data comes from matrices rather than a DataFrame (common in simulation studies), use keyword arguments directly:

y = df.yvar
X = hcat(ones(nrow(df)), df.Lland, df.PIland, df.Llabor, df.Lbull, df.Lcost, df.yr)
Z = hcat(ones(nrow(df)), df.age, df.school, df.yr)

myspec = sfmodel_spec(
    type     = :production,
    depvar   = y,         # dependent variable vector
    frontier = X,         # covariate matrix (include constant column)
    zvar     = Z,         # covariates for inefficiency determinants
    noise    = :Normal,
    ineff    = :TruncatedNormal,
    hetero   = [:mu, :sigma_sq],
    varnames = ["_cons", "Lland", "PIland", "Llabor", "Lbull", "Lcost", "yr",
                "_cons", "age", "school", "yr",
                "_cons", "age", "school", "yr",
                "_cons"],  # optional; auto-generated if omitted
)

Step 2: Choose the Estimation Method

mymeth = sfmodel_method(
    method  = :MSLE,     # or :MCI, :MLE
    n_draws = 2^13,      # number of Halton draws per observation
    GPU     = true       # set true for GPU acceleration
)

Step 3: Set Initial Values (optional)

myinit = sfmodel_init(
    spec          = myspec,
    # frontier    = ...,          # skip to use OLS-based defaults
    mu            = zeros(4),     # 4 coefficients in μ equation
    ln_sigma_sq   = zeros(4),     # 4 coefficients in ln(σ²_u) equation
    ln_sigma_v_sq = [0.0]         # 1 coefficient in ln(σ²_v)
)

Step 4: Configure the Optimizer (optional)

An effective strategy for challenging problems is a two-stage approach: a gradient-free solver first to find a good neighborhood, then a gradient-based solver for precise convergence.

myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt    = (iterations = 200, g_abstol = 1e-5),
    main_solver      = Newton(),
    main_opt         = (iterations = 100, g_abstol = 1e-8)
)

Step 5: Estimate the Model

result = sfmodel_fit(
    spec          = myspec,     # from sfmodel_spec()
    method        = mymeth,     # from sfmodel_method()
    init          = myinit,     # from sfmodel_init()
    optim_options = myopt,      # from sfmodel_opt()
    marginal      = true,       # compute marginal effects (default)
    jlms_bc_index = true,       # compute efficiency indices (default)
    show_table    = true        # print results to console (default)
)

Results and Post-Estimation Analysis

After estimation, sfmodel_fit() prints a formatted summary to the console. Below is sample output from the paddy-farmer example:

*********************************
      Estimation Results
*********************************
Method: MSLE
Model type: noise=Normal, ineff=TruncatedNormal
Heteroscedastic parameters: [:mu, :sigma_sq]
Number of observations: 271
Number of frontier regressors (K): 7
Number of Z columns (L): 4
Number of draws: 8192
Frontier type: production
GPU computing: true
Number of iterations: 16
Converged: true
Log-likelihood: -82.01844

┌──────────┬────────┬─────────┬──────────┬──────────┬────────┬─────────┬─────────┐
│          │   Var. │   Coef. │ Std.Err. │        z │  P>|z| │ 95%CI_l │ 95%CI_u │
├──────────┼────────┼─────────┼──────────┼──────────┼────────┼─────────┼─────────┤
│ frontier │  _cons │  1.5430 │   0.3578 │   4.3127 │ 0.0000 │  0.8418 │  2.2443 │
│          │  Lland │  0.2582 │   0.0725 │   3.5611 │ 0.0004 │  0.1161 │  0.4004 │
│          │ PIland │  0.1718 │   0.1761 │   0.9753 │ 0.3303 │ -0.1734 │  0.5169 │
│          │ Llabor │  1.1658 │   0.0840 │  13.8805 │ 0.0000 │  1.0012 │  1.3304 │
│          │  Lbull │ -0.4215 │   0.0596 │  -7.0666 │ 0.0000 │ -0.5384 │ -0.3046 │
│          │  Lcost │  0.0142 │   0.0128 │   1.1090 │ 0.2685 │ -0.0109 │  0.0394 │
│          │     yr │  0.0183 │   0.0095 │   1.9226 │ 0.0556 │ -0.0004 │  0.0369 │
│        μ │  _cons │  1.0415 │   0.7284 │   1.4298 │ 0.1540 │ -0.3862 │  2.4691 │
│          │    age │ -0.0479 │   0.0303 │  -1.5804 │ 0.1153 │ -0.1073 │  0.0115 │
│          │ school │ -0.2143 │   0.1712 │  -1.2521 │ 0.2117 │ -0.5497 │  0.1212 │
│          │     yr │  0.1480 │   0.1248 │   1.1854 │ 0.2369 │ -0.0967 │  0.3926 │
│   ln_σᵤ² │  _cons │ -1.1393 │   0.8902 │  -1.2798 │ 0.2018 │ -2.8842 │  0.6055 │
│          │    age │  0.0256 │   0.0096 │   2.6653 │ 0.0082 │  0.0068 │  0.0445 │
│          │ school │  0.1141 │   0.0569 │   2.0054 │ 0.0460 │  0.0026 │  0.2256 │
│          │     yr │ -0.2256 │   0.0496 │  -4.5500 │ 0.0000 │ -0.3228 │ -0.1284 │
│   ln_σᵥ² │ ln_σᵥ² │ -3.2668 │   0.2623 │ -12.4553 │ 0.0000 │ -3.7808 │ -2.7527 │
└──────────┴────────┴─────────┴──────────┴──────────┴────────┴─────────┴─────────┘

Log-parameters converted to original scale (σ² = exp(log_σ²)):
┌─────┬────────┬──────────┐
│     │  Coef. │ Std.Err. │
├─────┼────────┼──────────┤
│ σᵥ² │ 0.0381 │   0.0100 │
└─────┴────────┴──────────┘

Table format: text
***** Additional Information *********
* OLS (frontier-only) log-likelihood: -104.96993
* Skewness of OLS residuals: -0.70351
* The sample mean of the JLMS inefficiency index: 0.33416
* The sample mean of the BC efficiency index: 0.74619

* The sample mean of inefficiency determinants' marginal effects on E(u): (age = -0.00264, school = -0.01197, yr = -0.0265)
* Marginal effects of the inefficiency determinants at the observational level are saved in the return. See the follows.

* Use `name.list` to see saved results (keys and values) where `name` is the return specified in `name = sfmodel_MSLE_fit(..)`. Values may be retrieved using the keys. For instance:
   ** `name.loglikelihood`: the log-likelihood value of the model;
   ** `name.jlms`: Jondrow et al. (1982) inefficiency index;
   ** `name.bc`: Battese and Coelli (1988) efficiency index;
   ** `name.marginal`: a DataFrame with variables' (if any) marginal effects on E(u).
* Use `keys(name.list)` to see available keys.
**************************************

The returned result is a NamedTuple whose fields provide programmatic access to all outputs:

Key fields:

Field Description
result.coeff Full estimated parameter vector
result.std_err Asymptotic standard errors
result.var_cov_mat Variance–covariance matrix
result.table Formatted coefficient table
result.jlms JLMS inefficiency index\(E[u_i \mid \varepsilon_i]\)
result.bc Battese–Coelli efficiency index\(E[\exp(-u_i) \mid \varepsilon_i]\)
result.marginal Observation-level marginal effects (DataFrame)
result.marginal_mean Sample mean of marginal effects
result.loglikelihood Log-likelihood of the estimated model
result.OLS_loglikelihood Log-likelihood of the OLS model
result.OLS_resid_skew Skewness of OLS residuals
result.converged Whether the optimizer converged

Hypothesis Testing

We can test whether the data support the frontier specification against an OLS model using sf_vs_ols(result; dof=...). For the rice-farmer model the result is:

julia> sf_vs_ols(result; dof = 8)   # 4 parameters in μ, 4 in log σ_u²
Likelihood-ratio test (mixed χ̄² reference distribution)
  LR statistic   : 45.902978
  degrees of freedom: 8
  p-value        : 2.4e-07
  log-lik (R, U) : (-104.9699, -82.0184)
  critical values: 10% = 12.737, 5% = 14.853, 2.5% = 16.856, 1% = 19.384

Since the LR statistic (\(45.903\)) is much larger than the critical value at the 1% level (\(19.384\)), we overwhelmingly reject the null hypothesis of an OLS model.

See Section 6.11 ‘Hypothesis Testing’ for the full reference (sf_vs_ols, the general lrtest, and the LRTestResult struct).

Inefficiency and Efficiency Index

The JLMS inefficiency index and the Battese–Coelli efficiency index are computed automatically and stored in the result:

julia> [result.jlms  result.bc]
271×2 CuArray{Float64, 2, CUDA.DeviceMemory}:
 0.571107   0.574412
 0.510025   0.610202
 0.10391    0.904546
 0.287699   0.758798
 0.15192    0.864167
 0.570986   0.574326
 ⋮
 1.17584    0.314265
 0.428374   0.662447
 0.847933   0.436294
 0.109994   0.899461
 0.175169   0.845739
 0.165553   0.853446

The same indices can also be obtained on new data via predict(result; frontier, depvar, zvar, what = :jlms | :bc | :all). See Section 6.13 ‘Prediction on New Data’ for the full reference.

A built-in plot recipe visualizes the JLMS and BC distributions on the rice-farmer fit:

using Plots
plot(result, :efficiency)       # JLMS + BC efficiency histograms side by side
Histograms of the JLMS inefficiency index and the BC efficiency index

See Section 6.12 ‘Plot Recipes’ for the other recipe kinds (:residuals, :marginal, :convergence, and the default 2×2 diagnostic panel).

Marginal Effects

The marginal effects of the inefficiency determinants on \(E(u_i)\) at the observation level are available as a DataFrame:

julia> result.marginal
271×3 DataFrame
│ Row │ marg_age    │ marg_school │ marg_yr     │
│     │ Float64     │ Float64     │ Float64     │
├─────┼─────────────┼─────────────┼─────────────┤
│ 1   │ -0.0052194  │ -0.0234664  │ -0.0134735  │
│ 2   │ -0.00636135 │ -0.028566   │ -0.00756538 │
│ 3   │ -0.00775549 │ -0.0347949  │ -0.00113382 │
│ 4   │ -0.00953588 │ -0.0427526  │ 0.00630896  │
⋮
│ 268 │ 0.00291455  │ 0.0128417   │ -0.0596646  │
│ 269 │ 0.00221272  │ 0.0097239   │ -0.051836   │
│ 270 │ 0.00160268  │ 0.00701424  │ -0.0449243  │
│ 271 │ 0.00107203  │ 0.0046576   │ -0.0388295  │

The marginal effects can be plotted against covariates to reveal non-linear patterns. Using the built-in recipe:

using Plots

plot(result, :marginal)         # grid of all z-variable marginal scatters
plot(result, :marginal, df.age) # marginal effect vs a specific covariate

The built-in recipe automatically adds a dashed zero reference line. A hand-coded equivalent, useful when you want full control over styling:

scatter(df.age, result.marginal[:, :marg_age],
        xlabel="age", ylabel="marginal effect of age on E(u)",
        label=false)
hline!([0.0], label=false, linestyle=:dash)
Marginal effect of age on E(u)

The plot reveals a non-monotonic effect: production inefficiency decreased with age in the early years of the farmer’s life (perhaps due to experience accumulation), but increased with age in later years (perhaps due to deteriorating physical health). Wang’s (2002) model allows this non-monotonic effect by parameterizing both \(\mu\) and \(\sigma_u^2\) by the same vector of inefficiency determinants.

See Section 6.12 ‘Plot Recipes’ for the full reference of the :marginal recipe and the other recipe kinds.

Post-Estimation Diagnostics

efficiency_summary(result) reports the cross-sectional distribution of the JLMS inefficiency index and the Battese–Coelli efficiency index:

julia> es = efficiency_summary(result)
(n = 271,
 mean_jlms = 0.334, std_jlms = 0.274, min_jlms = 0.071, max_jlms = 1.513,
 median_jlms = 0.237,
 quantiles_jlms = (q10 = 0.106, q20 = 0.130, q25 = 0.139, q30 = 0.166,
                   q40 = 0.199, q50 = 0.237, q60 = 0.301, q70 = 0.369,
                   q75 = 0.419, q80 = 0.480, q90 = 0.735),
 mean_bc = 0.746, std_bc = 0.160, min_bc = 0.224, max_bc = 0.933,
 median_bc = 0.797,
 quantiles_bc = (q10 = 0.488, q20 = 0.629, q25 = 0.669, q30 = 0.702,
                 q40 = 0.749, q50 = 0.797, q60 = 0.827, q70 = 0.853,
                 q75 = 0.875, q80 = 0.882, q90 = 0.903))

The median Battese–Coelli efficiency is \(0.797\), meaning the typical farmer operates at about 80% of the production frontier. The dispersion is wide: the bottom decile sits at \(\text{BC}=0.488\) (severe inefficiency, producing less than half of potential output), while the top decile reaches \(0.903\). The JLMS index is right-skewed (mean \(0.334\), median \(0.237\), max \(1.513\)), reflecting a thin tail of highly inefficient farmers. This is consistent with the heteroscedastic specification that lets both \(\mu\) and \(\sigma_u^2\) vary with \(\mathbf{z}_i\).

residual_diagnostics(result) checks the sign and shape of the residual distribution:

julia> rd = residual_diagnostics(result)
(skew_ols = -0.7035, skew_composed = -0.9147,
 expected_skew_sign = -1,
 skew_ols_sign_ok = true, skew_composed_sign_ok = true,
 JarqueBera_stat = 32.6099, JarqueBera_pvalue = 8.30e-8)

Both the OLS and composed residuals are negatively skewed, matching the expected sign for a production frontier (expected_skew_sign = -1); both *_sign_ok flags confirm this. The Jarque–Bera test on OLS residuals rejects normality at any conventional level (\(p \approx 8 \times 10^{-8}\)), corroborating the sf_vs_ols rejection above: the OLS residuals carry the signature of a one-sided inefficiency component pulling output below the frontier.

The composed residuals \(\hat\varepsilon_i = y_i - x_i'\hat\beta\) and frontier fit \(x_i'\hat\beta\) are returned by residuals(result) and fitted(result). Both accept a type keyword to extract alternative quantities such as \(\hat u_i\), \(\hat v_i\), or OLS residuals.

See ‘Diagnostic Functions’ in Section 6.10 for the full reference (residuals, fitted, efficiency_summary, residual_diagnostics), including all keyword options and the complete list of returned NamedTuple fields.

Prediction on New Data

The fitted object can be applied to new observations through predict(result; frontier, zvar, depvar, what). See ‘Prediction on New Data’ in Section 6.13 for the full reference (signature, the what-keyword table, and supported model classes).

Given any new observations \((X_{\text{new}}, Z_{\text{new}}, y_{\text{new}})\), predict() reuses the estimated coefficients — no refitting — to return the same family of quantities available at training time: the frontier fit \(x_i'\hat\beta\), composed residuals, response-scale fit, the JLMS inefficiency index \(E(u_i\mid\hat\varepsilon_i)\), the Battese–Coelli efficiency index \(E(e^{-u_i}\mid\hat\varepsilon_i)\), and marginal effects \(\partial E(u_i)/\partial z_{ij}\). The what keyword selects the target.

This interface supports several post-estimation tasks:

The exact signature, data-requirement table, and currently-supported model classes are documented in Section 6.13 ‘Prediction on New Data’.

Saving Results

The result can be saved for later analysis. Using the JLD2 package for binary storage:

using JLD2

save_object("model1.jld2", result)             # save everything
result_loaded = load_object("model1.jld2")     # load it back

For cross-platform, human-readable text storage:

using CSV

CSV.write("marginal.csv", result.marginal)     # marginal effects (DataFrame)

6. API Reference

The API is grouped in two sets. Part A — Model building and fitting (6.1–6.6) covers the functions used to specify and estimate a model. Part B — Post-estimation accessors (6.7–6.13) covers the methods that operate on the SFResult object returned by sfmodel_fit(), including StatsAPI accessors, diagnostic functions, hypothesis tests, plot recipes, and predict().


Part A: Model building and fitting

6.1 sfmodel_spec()

The sfmodel_spec() function creates a model specification object that encapsulates all data, distributional assumptions, and metadata required for estimation. It is independent of the choice of estimation method.

Syntax

sfmodel_spec() — define the model specification.

Arguments

Argument Type Description Required
depvar Vector Dependent variable. Cross-sectional: \(N\) observations. Panel: \(N \times T\) stacked by firm. Yes
frontier Matrix Covariate matrix for the frontier equation, dimension\(N\times K\). Accepts a Matrix or a list form [v1, v2, ...] that is internally assembled into a matrix. Cross-sectional: include a column of ones (1) for intercept. Panel: do NOT include a constant column (within-demeaning eliminates it). Yes
noise Symbol Distribution of the noise term\(v\). Cross-sectional: :Normal, :StudentT, :Laplace. Panel: :Normal only. See Section 7. Yes
ineff Symbol Distribution of the inefficiency term\(u\). Supported options: :HalfNormal, :TruncatedNormal, :Exponential, :Weibull, :Lognormal, :Lomax, :Rayleigh, and :Gamma. See Section 7. Note: :Gamma is MCI only; method=:MLE supports only :HalfNormal, :TruncatedNormal, :Exponential. Yes
datatype Symbol Data type.:cross_sectional (default), :panel_TFE (Wang and Ho 2010 true fixed-effect), :panel_TFE_CSW (Chen, Schmidt, and Wang 2014, MLE only), or :panel_TRE (true random-effect, MLE only). See Section 9. No
type Symbol Frontier type. Use:production or :prod for production frontier (\(\varepsilon_i = v_i - u_i\)), and :cost for cost frontier (\(\varepsilon_i = v_i + u_i\)). No
zvar Matrix Covariate matrix.
Cross-sectional: for heteroscedasticity equations, dimension \(N\times L\); include a column of ones if an intercept is required. When hetero=:scaling, the zvar matrix supplies the \(\mathbf{z}_i\) variables for the scaling function \(h(\mathbf{z}_i)=\exp(\mathbf{z}_i'\boldsymbol{\delta})\); do NOT include a constant column (for identification).
Panel: for scaling function \(h(z)=\exp(z'\delta)\), dimension \(NT \times L\); do NOT include a constant column. Optional for both.
No
copula Symbol Cross-sectional only. Copula for dependence between \(v\) and \(u\). Options: :None (default), :Gaussian, :Clayton, :Clayton90, :Gumbel. Not available with panel datatypes. No
hetero Vector{Symbol} or Symbol Cross-sectional only. Parameters of the distributional specification that are allowed to be heteroscedastic (e.g., [:mu, :sigma_sq]), or the symbol :scaling to activate the scaling property model. Not available with panel datatypes. See the Hetero Options column in Section 7 and Section 8.5. No
id Vector Panel only. Unit identifier column. Required for all panel datatypes. Data must be grouped by unit (contiguous rows for each firm). For balanced panels, create an id column: e.g., id = repeat(1:N, inner=T). No
varnames Vector{String} Variable names used in output tables. Ifnothing (default), names are generated automatically. No
eqnames Vector{String} Equation block names (e.g.,["frontier", "mu", "ln_sigma_u_sq"]). If nothing (default), names are generated from ineff. No
eq_indices Vector{Int} Equation boundary indices. Ifnothing (default), auto-generated based on the model structure. No

Return Value

Returns a UnifiedSpec{T} struct containing the model specifications internally. For cross-sectional models, it holds MCI, MSLE, and MLE backend specs (as applicable). For panel_TFE, it holds both Panel (MCI/MSLE) and MLE specs. For panel_TFE_CSW and panel_TRE, it holds MLE spec only.

Example: Basic Specification

Using the paddy-farmer data from Section 5:

using CUDA
using SFrontiers, Optim
using DataFrames, CSV

df = CSV.read("sampledata.csv", DataFrame)
df._cons = ones(nrow(df))  # add a column of _cons to the df DataFrame

# --- Keyword form (matrix data) ---

y = df.yvar
X = hcat(df._cons, df.Lland, df.PIland, df.Llabor, df.Lbull, df.Lcost, df.yr)
Z = hcat(df._cons, df.age, df.school, df.yr)

# Homoscedastic (no Z needed)
spec1 = sfmodel_spec(
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :TruncatedNormal,
)

# Heteroscedastic, also with custom variable names
spec2 = sfmodel_spec(
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu],  # or [:sigma_sq], [:mu, :sigma_sq]. See the Hetero Options column in Section 7
    varnames = ["constant", "land", "Iland", "labor", "bull", "cost", "year",  # frontier
                "constant", "age", "schooling", "year",                        # mu
                "constant",                                                    # ln_sigma_u_sq
                "constant"],                                                   # ln_sigma_v_sq
)

# --- DSL form (DataFrame). Instead of passing data vectors and matrices directly, users specify column names of the DataFrame in the macros (`@depvar`, `@frontier`, `@zvar`), and the data is extracted automatically. The macros can appear in any order.

# Heteroscedastic

spec3 = sfmodel_spec(
    @useData(df),             # the DataFrame
    @depvar(yvar),            # arguments are column names, not data
    @frontier(_cons, Lland, PIland, Llabor, Lbull, Lcost, yr),
    @zvar(_cons, age, school, yr),
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu],
)

Example: Scaling Property Model

The scaling property model uses hetero = :scaling. In this specification, zvar provides the environmental variables \(\mathbf{z}_i\) for the scaling function \(h(\mathbf{z}_i) = \exp(\mathbf{z}_i'\boldsymbol{\delta})\), and the inefficiency distribution parameters remain scalar (homoscedastic). The zvar matrix must not contain a constant column (for identification; see Section 8.5).

# Scaling property model (using keyword form as an example)
# Z_nocons must NOT contain a constant column

Z_nocons = hcat(df.age, df.school, df.yr)

spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,           # include constant as usual
    zvar = Z_nocons,        # environmental variables (no constant!)
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling,      # scaling property model
)

# Scaling + copula (allowed)
spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    zvar = Z_nocons,
    noise = :Normal,
    ineff = :Exponential,
    hetero = :scaling,
    copula = :Clayton,
)

Notes on scaling property model:

Example: Panel Data Specification

Assume a panel dataset has been loaded and the vectors/matrices \(y\), \(X\), \(Z\), and \(firm\_id\) are constructed. The \(firm\_id\) is a vector uniquely identifies each firm. We use keyword form in the examples; DSL form is also available.

# Panel TFE — Wang and Ho (2010) true fixed-effect model.
#   method=:MCI, :MSLE for all inefficiency distributions, :MLE only for half-normal/truncated-normal
spec = sfmodel_spec(
    type = :production,     # the default; optional
    datatype = :panel_TFE,  # Wang and Ho 2010 panel model
    depvar = y,             # N*T stacked by firm
    frontier = X,           # NT x K (no constant)
    zvar = Z,               # NT x L (no constant)
    hetero = :scaling,      # optional; :scaling is the default and the only permission option for panel_TFE
    id = firm_id,           # unit individual identifier (required for all panel models)
    noise = :Normal,
    ineff = :TruncatedNormal     
)


# Panel TFE_CSW — Chen, Schmidt, and Wang (2014) fixed-effect model.
#   MLE only, half-normal only
spec = sfmodel_spec(
    datatype = :panel_TFE_CSW,
    depvar = y,
    frontier = X,
    id = firm_id,
    noise = :Normal,
    ineff = :HalfNormal
)


# Panel TRE — True random-effect model 
#   MLE only, half-normal or truncated-normal
spec = sfmodel_spec(
    datatype = :panel_TRE,
    depvar = y,
    frontier = X,
    zvar = Z,
    id = firm_id,
    noise = :Normal,
    ineff = :HalfNormal
)

Notes: - Panel models do not support copula. - For hetero, it is only permissible in panel_TFE and has to be hetero=:scaling which is the default and may be omitted. That is, heterogeneity enters the model through the scaling function \(h(z_{it}) = \exp(z_{it}'\delta)\). The hetero is not available with other panel data models. - The frontier and zvar matrices must NOT include a constant column (except panel_TRE, where a constant in frontier is allowed). See Section 9 for details.


6.2 sfmodel_method()

The sfmodel_method() function specifies the estimation method and its computational settings (number of draws, GPU usage, etc.). This is separate from sfmodel_spec() so that the same model specification can be estimated under different methods.

Syntax

sfmodel_method() — choose the estimation method and simulation settings.

Arguments

Argument Type Description Required
method Symbol Estimation method. Use:MSLE for Maximum Simulated Likelihood Estimation, :MCI for Monte Carlo Integration, or :MLE for analytic Maximum Likelihood Estimation (available for a limited subset of models). MLE does not use any simulation arguments below. Yes
transformation Symbol/Nothing MCI only. Transformation rule for mapping uniform draws to inefficiency values. Options: :expo_rule, :logistic_1_rule, :logistic_2_rule, or nothing for distribution-specific defaults. Ignored with a warning if method=:MSLE. No
n_draws Int Number of Halton draws per observation. Default: 1024 for both MCI and MSLE. Used whendraws is not provided. When multiRand=true, must be \(\leq\) distinct_Halton_length. No
draws Matrix Draws for Monte Carlo integration as a1 x S row matrix. Use reshape(your_draws, 1, length(your_draws)) to convert a vector to matrix. If nothing (default), Halton draws are auto-generated with correct shape. No
multiRand Bool Whether each observation gets different Halton draws.true (default) generates an N x S wrapped Halton matrix where each observation uses different consecutive draws. false uses the original 1 x S shared draws. When true, n_draws must be \(\leq\) distinct_Halton_length. No
GPU Bool Whether to use GPU computing.false (default) uses CPU, and true uses GPU (requires using CUDA). No
chunks Int Number of chunks for splitting data in MCI/MSLE estimation (default: 10). Effective for both CPU and GPU; especially useful for GPU memory management. Not used by MLE. No
distinct_Halton_length Int Maximum length of the distinct Halton sequence generated formultiRand=true mode (default: 2^15-1 = 32767). Increase this if you need n_draws larger than the default limit. See Observation-Specific Halton Draws. No

About GPU and chunks options

The chunks option is effective for simulation-based estimators (:MSLE and :MCI); no use for :MLE. It works for both CPU and GPU computation and is particularly essential for GPU computing (GPU=true).

:MSLE :MCI :MLE
CPU
GPU ✓✓ ✓✓

Simulation-based estimation (MSLE and MCI) requires evaluating the likelihood contribution for every combination of N observations and S draws, forming an N x S matrix. When N or S is large, this matrix can exceed available memory (especially GPU VRAM), creating a bottleneck. The chunks option addresses this by splitting the N observations into smaller batches.

When chunks=1, all N observations are processed at once as a single N x S matrix. Setting chunks to a value greater than 1 (e.g., chunks=10) splits the observations into smaller batches, creating matrices of size (N/chunks) x S. Each batch is processed sequentially while accumulating the log-likelihood. This reduces peak memory usage, allowing larger datasets and n_draws to fit in GPU memory at the expense of slightly increased computation overhead due to the splitting and looping.

In Windows, users may use Task Manager to monitor the memory usage and adjust chunks to avoid bottlenecks.

Transformation Rules (MCI Only)

When method=:MCI, the transformation option controls the change-of-variable mapping from uniform draws \(t \in (0,1)\) to inefficiency values \(u \geq 0\). If nothing, a distribution-specific default is used.

:MSLE :MCI :MLE
CPU
GPU

The following is a table showing the available rules in the package.

Rule Formula Jacobian Default for
:expo_rule \(u = s \cdot (-\ln(1-t))\) \(s/(1-t)\) Exponential, Weibull, Gamma, Rayleigh
:logistic_1_rule \(u = s \cdot t/(1-t)\) \(s/(1-t)^2\) half-normal, truncated-normal, Lognormal, Lomax
:logistic_2_rule \(u = s \cdot (t/(1-t))^2\) \(2st/(1-t)^3\)

Here \(s\) is a scale parameter derived from the inefficiency distribution, determined automatically by the package. When heteroscedasticity is specified, \(s\) becomes observation-specific (\(s_i\)). The scale parameters are:

Distribution Scale\(s\) Meaning
half-normal \(\sigma\) Standard deviation of\(N^+(0, \sigma^2)\)
truncated-normal \(\sigma_u\) Standard deviation of\(N^+(\mu, \sigma_u^2)\)
Exponential \(\sqrt{\lambda}\) \(\sqrt{\lambda}\), where \(\lambda = \text{Var}(u)\)
Weibull \(\lambda\) Scale parameter of\(\text{Weibull}(\lambda, k)\)
Lognormal \(\sigma\) Log-scale standard deviation of\(\text{LogNormal}(\mu, \sigma^2)\)
Lomax \(\lambda\) Scale parameter of\(\text{Lomax}(\alpha, \lambda)\)
Rayleigh \(\sigma\) Scale parameter of\(\text{Rayleigh}(\sigma)\)
Gamma \(\theta\) Scale parameter of\(\text{Gamma}(k, \theta)\)

Return Value

Returns a method specification struct that encodes both the estimation method and computational settings.

Example

# MSLE with default settings
meth1 = sfmodel_method(method = :MSLE)

# MCI with default Halton draws, custom # of draws, and GPU
meth2 = sfmodel_method(
    method = :MCI,
    n_draws = 2^12,    # custom number
    transformation = :logistic_1_rule 
    GPU = true,
    chunks = 10,
)

# Larger Halton pool for multiRand mode
meth3 = sfmodel_method(
    method = :MSLE,
    n_draws = 50000,
    distinct_Halton_length = 2^16 - 1  # 65535
    GPU = true,
    chunks = 20 
)

# User-supplied draws (uniform on (0,1), reshaped to 1×S row matrix)
my_draws = reshape(rand(2^12), 1, :)
meth4 = sfmodel_method(
    method = :MSLE,
    draws = my_draws,  # identical for each obs
    GPU = true,
)

# MLE — analytic, no simulation parameters needed (limited model support)
meth0 = sfmodel_method(method = :MLE)

Note: If simulation arguments (transformation, draws, n_draws, GPU) are passed with method=:MLE, a warning is issued and they are ignored.


6.3 sfmodel_init()

The sfmodel_init() function creates an initial value vector for optimization. The initial-value vector depends only on the model specification (distribution choice), not on the estimation method. It supports two usage modes: full vector mode and component mode.

Syntax

sfmodel_init() — set initial values for optimization.

Arguments

Argument Type Description Required
spec UnifiedSpec Model specification returned bysfmodel_spec(). Yes
init Vector/Tuple Complete initial-value vector (or tuple). Ifinit is provided, all other component initial-value arguments are ignored. No
frontier Vector/Tuple Initial values for the frontier coefficients (\(K\) elements). No
scaling Vector/Tuple Initial values for the scaling function coefficients\(\boldsymbol{\delta}\) (\(L\) elements, one per column of zvar). Used when hetero = :scaling. No
mu Vector/Tuple Initial values for\(\mu\) (used by TruncatedNormal and Lognormal inefficiency specifications). No
ln_sigma_sq Vector/Tuple Initial values for\(\ln(\sigma^2)\) (used by TruncatedNormal, HalfNormal, Lognormal, and Rayleigh inefficiency specifications). No
ln_sigma_v_sq Vector/Tuple Initial values for\(\ln(\sigma_v^2)\) (used when the noise distribution is Normal or StudentT). No
ln_nu_minus_2 Vector/Tuple Initial values for\(\ln(\nu-2)\) (used when the noise distribution is StudentT). No
ln_b Vector/Tuple Initial values for\(\ln(b)\) (used when the noise distribution is Laplace). No
ln_lambda Vector/Tuple Initial values for\(\ln(\lambda)\) (used by Exponential and Weibull inefficiency specifications). No
ln_k Vector/Tuple Initial values for\(\ln(k)\) (used by Weibull and Gamma inefficiency specifications). No
ln_lambda Vector/Tuple Initial values for\(\ln(\lambda)\) (used by Lomax inefficiency specifications). No
ln_alpha Vector/Tuple Initial values for\(\ln(\alpha)\) (used by Lomax inefficiency specifications). No
ln_theta Vector/Tuple Initial values for\(\ln(\theta)\) (used by Gamma inefficiency specifications; MCI only). No
theta_rho Vector/Tuple Initial value for the copula parameter\(\theta_\rho\) (used when copula \(\neq\) :None). No
message Bool Iftrue, print a warning when init overrides the component initial-value arguments. No

Return Value

Returns a Vector{Float64} containing the initial parameter values in the correct order for optimization.

Mode 1: Full Vector Mode

Provide the complete parameter vector directly via the init keyword. The values must follow the exact equation order used internally by the likelihood function. This mode is generally not recommended for first-time use, since you need to know the internal parameter ordering. It is most useful when recycling estimates from a previous round of estimation (e.g., using a converged coefficient vector as a warm start for a re-specification).

myinit = sfmodel_init(
    spec = myspec,
    init = [0.5, 0.3, 0.2, 0.1, 0.1, 0.1, 0.0, 0.0]
)

Mode 2: Component Mode

Specify initial values by parameter group. The required groups depend on the model specification. The ordering of equation blocks is irrelevant; the program maps each block to the correct position in the coefficient vector automatically.

# Normal + truncated-normal with hetero = [:mu]
myinit = sfmodel_init(
    spec = myspec,
    frontier = [0.5, 0.3, 0.2],     # frontier coefficients
    mu = [0.1, 0.1, 0.1],           # mu of truncated normal
    ln_sigma_sq = [0.0],            # log(sigma_u^2) of truncated normal
    ln_sigma_v_sq = [0.0]           # log(sigma_v^2) of normal
)

# alternatively,
myinit2 = sfmodel_init(
    spec = myspec,
    frontier = X \ y,               # OLS coefficients
    mu = 0.1*ones(size(Z, 2)),      # mu of truncated normal
    ln_sigma_sq = [0.0],            # log(sigma_u^2) of truncated normal
    ln_sigma_v_sq = (0.0)           # log(sigma_v^2) of normal
)

Scaling Property Initial Values

When hetero = :scaling, provide the scaling keyword with initial values for the \(\boldsymbol{\delta}\) coefficients (one per column of zvar). All inefficiency distribution parameters are scalar:

# Normal + half-normal with scaling property
myinit = sfmodel_init(
    spec = myspec,
    frontier = X \ y,                   # OLS estimates
    scaling = zeros(size(Z_nocons, 2)), # δ coefficients (one per z-variable)
    ln_sigma_sq = [0.0],               # scalar base parameter
    ln_sigma_v_sq = [0.0]              # noise variance
)

Panel Initial Values

Panel models use the same keywords as cross-sectional (see table above). The only difference is that all inefficiency distribution parameters are scalar (not heteroscedastic).

# Panel TFE: Normal + half-normal
myinit = sfmodel_init(
    spec = myspec,                # assume spec with datatype=:panel_TFE
    frontier = X_tilde \ y_tilde, # OLS on demeaned data (auto-computed if omitted)
    scaling = [0.1, 0.1],        # scaling function coefficients
    ln_sigma_sq = 0.1,           # scalar
    ln_sigma_v_sq = 0.1          # scalar
)

Note: In panel models, all inefficiency distribution parameters are scalar (not heteroscedastic). If frontier and scaling are omitted, OLS-based defaults are used automatically.

Input Format Flexibility

All parameter arguments accept vectors, row vectors, or tuples:

# All equivalent
frontier = [0.5, 0.3, 0.2]    # Vector
frontier = [0.5 0.3 0.2]      # Row vector (1x3 matrix)
frontier = (0.5, 0.3, 0.2)    # Tuple

6.4 sfmodel_opt()

The sfmodel_opt() function specifies the optimization options. Solvers and options are passed directly to Julia’s Optim.jl interface, so any solver and any Optim.Options keyword are permissible. All three estimation methods (MCI, MSLE, and MLE) use the same optimizer interface, so no method argument is needed.

The function supports a two-stage approach where the first-stage result is used as initial values for the second-stage optimization. A practical usage is to use a derivative-free solver (e.g., NelderMead()) in the first stage (warmstart) to quickly and robustly zoom in on a good neighborhood of the optimum, and use a gradient-based solver (e.g., Newton()) in the second stage (main) to achieve precise convergence. This is often very useful for highly nonlinear models where gradient-based solvers alone may fail from poor starting values. An one-stage estimation is possible by omiiting the first stage (warmstart) estimation.

Syntax

sfmodel_opt() — configure optimization solvers and options.

Arguments

Argument Type Description Required
warmstart_solver Solver Warmstart optimizer, e.g.,NelderMead(), BFGS() No
warmstart_opt NamedTuple Warmstart options as a NamedTuple, e.g.,(iterations = 200, g_abstol = 1e-5) No
main_solver Solver Main optimizer, e.g.,Newton(), BFGS() Yes
main_opt NamedTuple Main options as a NamedTuple, e.g.,(iterations = 200, g_abstol = 1e-8) Yes

Common options for warmstart_opt and main_opt:

Parameter Description Typical Value
iterations Maximum iterations 20–1000
g_abstol
g_reltol
Gradient absolute and relative tolerance 1e-4 to 1e-8
f_abstol
f_reltol
Function absolute and relative tolerance 1e-32
x_abstol
x_reltol
Parameter absolute and relative tolerance 1e-32
show_trace Display iteration progress false

Return Value

Returns an optimization specification struct containing the solver and option specifications.

Example: Two-Stage Optimization

myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt = (iterations = 100, g_reltol = 1e-4),
    main_solver = Newton(),
    main_opt = (iterations = 200, g_reltol = 1e-8)
)

Example: Skip Warmstart

If the warmstart solver is not provided, the warmstart stage is skipped:

myopt = sfmodel_opt(
    main_solver = Newton(),
    main_opt = (iterations = 200, g_abstol = 1e-8)
)

Notes

Common Pitfall: Trailing Comma for Single-Element Options

When specifying options with only one element, you must include a trailing comma to create a NamedTuple:

# CORRECT - trailing comma creates a NamedTuple
main_opt = (iterations = 200,)

# WRONG - without comma, this is just the integer 200, not a NamedTuple
main_opt = (iterations = 200)

For two or more elements, the trailing comma is not needed:

# Both work fine
main_opt = (iterations = 200, g_abstol = 1e-8)
main_opt = (iterations = 200, g_abstol = 1e-8,)  # trailing comma optional

If you forget the trailing comma, the function will display an error message:

ERROR: Invalid `main_opt`: expected a NamedTuple, got Int64.
Hint: For single-element options, use a trailing comma:
`main_opt = (iterations = 200,)` not `main_opt = (iterations = 200)`.

6.5 sfmodel_fit()

The sfmodel_fit() function is the main estimation routine. It organizes the entire workflow: optimization, variance-covariance computation, efficiency index calculation, marginal effects, and results presentation.

Syntax

sfmodel_fit() — run estimation and produce results.

Arguments

Argument Type Description Required
spec UnifiedSpec Model specification fromsfmodel_spec(). Yes
method UnifiedMethod Method specification fromsfmodel_method(). Yes
init Vector Initial parameter vector fromsfmodel_init(). If nothing, uses OLS for frontier and 0.1 for other parameters. No
optim_options Optimization options fromsfmodel_opt(). If nothing, uses defaults. No
jlms_bc_index Bool Compute JLMS and BC efficiency indices (default:true). No
marginal Bool Compute marginal effects of\(Z\) on E(u) (default: true). No
show_table Bool Print formatted estimation table (default:true). No
verbose Bool Print detailed progress information (default:true). No

Return Value

Returns a NamedTuple with comprehensive results:

Convergence Information

Field Type Description
converged Bool Whether optimization converged
iter_limit_reached Bool Whether iteration limit was reached
redflag Int Warning flag: 0 = OK, 1 = potential issues

Method Information

Field Type Description
GPU Bool Whether GPU acceleration was used
n_draws Int Actual number of draws per observation (or per firm for panel)
multiRand Bool Whether per-observation/per-firm Halton draws were used
chunks Int Number of chunks for memory management
distinct_Halton_length Int Maximum Halton sequence length for multiRand
estimation_method Symbol Estimation method used (:MCI, :MSLE, :MLE)

Model Results

Field Type Description
n_observations Int Number of observations (N)
loglikelihood Float64 Maximized log-likelihood value
coeff Vector Estimated coefficient vector
std_err Vector Standard errors
var_cov_mat Matrix Variance-covariance matrix
table Matrix Formatted coefficient table

Efficiency Indices

Field Type Description
jlms Vector JLMS inefficiency index\(\text{E}(u \mid \varepsilon)\) for each observation
bc Vector Battese-Coelli efficiency index\(\text{E}(e^{-u} \mid \varepsilon)\) for each observation

Marginal Effects

Field Type Description
marginal DataFrame Observation-level marginal effects on E(u)
marginal_mean NamedTuple Mean marginal effects

For the scaling property model (hetero = :scaling), marginal effects are computed via \(\partial E(u_i) / \partial z_{ij} = \delta_j \cdot h(\mathbf{z}_i) \cdot E(u_i^*)\), using automatic differentiation. The coefficient \(\delta_j\) directly gives the semi-elasticity \(\partial \ln E(u_i) / \partial z_{ij} = \delta_j\). See Section 8.5.

OLS Diagnostics

Field Type Description
OLS_loglikelihood Float64 Log-likelihood from OLS frontier
OLS_resid_skew Float64 Skewness of OLS residuals

Technical Details

Field Type Description
model The internal model specification
Hessian Matrix Numerical Hessian at optimum
gradient_norm Float64 Gradient norm at convergence
actual_iterations Int Total iterations across all stages
warmstart_solver Solver Warmstart algorithm used
warmstart_ini Vector Initial values for warmstart
warmstart_maxIT Int Maximum warmstart iterations
main_solver Solver Main algorithm used
main_ini Vector Initial values for main stage
main_maxIT Int Maximum main iterations
main_tolerance Float64 Convergence tolerance

Parameter Subsets

Individual coefficient vectors are also available:

Dictionary Access

All results are also available through result.list, an OrderedDict:

keys(result.list)    # View all available keys
result.list[:coeff]  # Access specific result
result.coeff         # alternative 

Example: Full Estimation

result = sfmodel_fit(
    spec = myspec,          # from sfmodel_spec()
    method = mymeth,        # from sfmodel_method()
    init = myinit,          # from sfmodel_init()
    optim_options = myopt,  # from sfmodel_opt()
    jlms_bc_index = true,
    marginal = true,
    show_table = true,
    verbose = true
)

# Access results
println("Converged: ", result.converged)
println("Log-likelihood: ", result.loglikelihood)
println("Frontier coefficients: ", result.frontier)
println("Mean efficiency: ", mean(result.bc))
println("Marginal effects: ", result.marginal_mean)

Example: Minimal Call with Defaults

# Uses OLS-based initial values and default optimization
result = sfmodel_fit(spec = myspec, method = mymeth)

Output Display

When show_table = true, the function prints:

  1. Model specification summary (distributions, sample size, etc.)
  2. Warmstart progress (if enabled)
  3. Main optimization progress
  4. Formatted coefficient table with standard errors, z-statistics, p-values, and confidence intervals
  5. Auxiliary table converting log-transformed parameters to original scale
  6. Additional statistics (OLS log-likelihood, skewness, mean efficiency indices, marginal effects)

Convergence Warnings

The function monitors convergence and sets redflag = 1 if:

6.6 sfmodel_MixTable() and sfmodel_ChiSquareTable()

These two utilities print critical values used for manual likelihood-ratio (LR) testing. Use sfmodel_MixTable when the LR test places a parameter on the boundary of the parameter space — the canonical example in SF analysis being \(H_0: \sigma_u^2 = 0\), for which the reference distribution is the mixed \(\bar{\chi}^2\) of Kodde and Palm (1986). Use sfmodel_ChiSquareTable for standard interior LR tests (e.g., \(H_0: \beta_2 = \beta_3 = 0\)), where the usual \(\chi^2\) reference applies.

Syntax

sfmodel_MixTable()             # Full table (dof 1 to 40)
sfmodel_MixTable(dof)          # Row of critical values for a given dof

sfmodel_ChiSquareTable(dof)    # Row of critical values for a given dof

Arguments

Argument Type Description Required
dof Int Degrees of freedom. For sfmodel_MixTable, must be between 1 and 40. Omit dof in sfmodel_MixTable to print the full table. No

Return Value

A row (or, for sfmodel_MixTable() with no argument, a matrix) of critical values at the significance levels 0.10, 0.05, 0.025, and 0.01. The mixed table is reproduced from Table 1 of Kodde and Palm (1986, Econometrica).

The LR statistic to compare against the returned critical values is \(-2(\ln L_R - \ln L_U)\). For sfmodel_MixTable, the reference distribution is a 50:50 mixture of \(\chi^2(p-1)\) and \(\chi^2(p)\), where \(p\) is the number of boundary restrictions. For sfmodel_ChiSquareTable, it is the standard \(\chi^2(p)\).

Example: Testing for Inefficiency

# Estimate unrestricted model (stochastic frontier)
result_sf = sfmodel_fit(spec = myspec, method = mymeth)

# Estimate restricted model (OLS, no inefficiency)
# The OLS log-likelihood is available from the SF estimation output:
ll_ols = result_sf.OLS_loglikelihood
ll_sf  = result_sf.loglikelihood

# Compute LR statistic
LR = -2 * (ll_ols - ll_sf)

# Compare against mixed chi-squared critical values:
# Assuming u~N(0, sigma_u^2) so that there is only
# one additional parameter (sigma_u^2). 
# Thus, dof = 1 (one restriction: sigma_u^2 = 0).
sfmodel_MixTable(1)
# At 5% level, critical value is 2.705
# If LR > 2.705, reject H0 (inefficiency is statistically significant)

Easier alternative: sf_vs_ols(). The package provides a one-call wrapper that automates exactly this test. sf_vs_ols(result; dof = ...) reads the stored OLS log-likelihood from the fitted object, computes the LR statistic, selects the mixed \(\bar{\chi}^2\) reference distribution, and returns an LRTestResult containing the statistic, p-value, and the critical values at the 10%, 5%, 2.5%, and 1% levels. See Section 6.11 ‘Hypothesis Testing’ for the full signature and return-field reference. The sfmodel_MixTable / sfmodel_ChiSquareTable utilities remain useful when you need the raw critical values for a different LR test (e.g., restrictions on frontier coefficients or comparing two nested SF models by hand).


Part B: Post-estimation accessors

The remaining entries document the methods that operate on the SFResult object returned by sfmodel_fit(). They dispatch uniformly across all backends (MLE, MSLE, MCI, and panel) unless noted otherwise.

6.7 Standard Accessors (StatsAPI Interface)

The object returned by sfmodel_fit() is a thin SFResult wrapper around the result NamedTuple. All existing field-access forms (result.coeff, result.var_cov_mat, result.loglikelihood, etc.) continue to work unchanged. In addition, SFrontiers.jl implements the standard StatsAPI.jl interface, so fitted models can be queried with the canonical Julia statistical accessors. These methods dispatch uniformly across all backends (MLE, MSLE, MCI, and panel).

Syntax

# about coefficients
coef(result);   vcov(result);   stderror(result);   confint(result; level = 0.95);   coeftable(result)

# about model fit
loglikelihood(result);   aic(result);   bic(result);   nobs(result);   dof(result);   dof_residual(result)

# display
summary(result);   show(result)

Arguments

Argument Type Description Required
result SFResult A fitted model object returned by sfmodel_fit(). Yes
level Real Confidence level for confint (default 0.95). Applies only to confint. No

Return Value

Each accessor returns the quantity in the third column below; the second column lists the equivalent raw field on the result NamedTuple for reference.

Accessor Field equivalent Returns
coef(r) r.coeff Full coefficient vector
vcov(r) r.var_cov_mat Variance–covariance matrix
stderror(r) r.std_err Asymptotic standard errors
confint(r; level=...) N × 2 matrix of Wald confidence intervals
loglikelihood(r) r.loglikelihood Maximized log-likelihood
nobs(r) r.n_observations Number of observations
dof(r), dof_residual(r) Free parameters; residual degrees of freedom (\(N - k\))
aic(r), bic(r) Akaike and Bayesian information criteria
coeftable(r) r.table (matrix) A StatsBase.CoefTable (printable, exportable)
summary(r), show(r) Compact REPL display; works on reloaded objects without re-fit

Example

result = sfmodel_fit(spec = myspec, method = mymeth)

coef(result)                     # coefficient vector
confint(result; level = 0.95)    # Wald CIs
coeftable(result)                # StatsBase.CoefTable
aic(result);  bic(result)        # information criteria

6.8 Efficiency Indices

Two observation-level indices summarise where each firm stands relative to the frontier: the JLMS inefficiency index \(E(u_i \mid \varepsilon_i)\) (higher means more inefficient) and the Battese–Coelli (BC) efficiency index \(E(e^{-u_i} \mid \varepsilon_i)\) (bounded in \([0,1]\); a BC value of \(0.85\) means the firm produces \(85\%\) of its potential output). Both are stored as fields on the fitted result and are populated automatically when the model is fit with jlms_bc_index = true.

Syntax

result.jlms        # JLMS inefficiency index
result.bc          # Battese–Coelli efficiency index

Fields

Field Type Description
jlms Vector{Float64} (length N) JLMS index \(E(u_i \mid \varepsilon_i)\); higher indicates more inefficiency.
bc Vector{Float64} (length N) Battese–Coelli efficiency \(E(e^{-u_i} \mid \varepsilon_i) \in [0,1]\).

Return Value

Observation-level Vector{Float64} of length \(N\). Both fields are present when sfmodel_fit() is called with jlms_bc_index = true (the default); they are nothing otherwise.

Example

result = sfmodel_fit(spec = myspec, method = mymeth, jlms_bc_index = true)

mean_inefficiency = mean(result.jlms)
mean_efficiency   = mean(result.bc)

println("Firm 1 efficiency:   ", result.bc[1])
println("Firm 1 inefficiency: ", result.jlms[1])

6.9 Marginal Effects

Marginal effects quantify how changes in the Z variables affect expected inefficiency \(E(u)\). A positive marginal effect means that increasing \(z_j\) raises expected inefficiency. Marginal effects are populated on the fitted result when zvar is specified in sfmodel_spec() and sfmodel_fit(..., marginal = true) is used.

Syntax

result.marginal         # observation-level marginal effects (DataFrame)
result.marginal_mean    # sample-mean marginal effects (NamedTuple)

Fields

Field Type Description
marginal DataFrame One row per observation; columns marg_<name> give \(\partial E(u_i) / \partial z_{ij}\).
marginal_mean NamedTuple Sample averages of the observation-level effects, one entry per Z variable.

Return Value

result.marginal is a DataFrame of size N × L, and result.marginal_mean is a NamedTuple with L entries, where L is the number of Z covariates. Both are nothing when marginal = false or when no zvar is specified.

Example

result = sfmodel_fit(spec = myspec, method = mymeth, marginal = true)

println(result.marginal_mean)
# (age = 0.023, school = -0.015, ...)

marginal_df = result.marginal
println(names(marginal_df))                       # ["marg_age", "marg_school", ...]
println("Firm 1 marginal effect of age: ",
        marginal_df[1, :marg_age])

6.10 Diagnostic Functions

Beyond the StatsAPI accessors, the package provides a small family of post-estimation diagnostic methods that dispatch uniformly across all backends. Together they cover the four quantities most often needed after a fit: model residuals, fitted values, an efficiency-distribution summary, and a residual-skew/normality check.

Syntax

residuals(result; type = :composed)
fitted(result;    type = :frontier)

efficiency_summary(result)
residual_diagnostics(result)

Arguments

Argument Type Description Required
result SFResult A fitted model returned by sfmodel_fit(). Yes
type Symbol For residuals: one of :composed (default), :u, :v, :ols. For fitted: :frontier (default) or :response. No

Return Value

residuals and fitted return an observation-level Vector{Float64} of length \(N\), with the quantity selected by type:

Call Returned vector
residuals(result) (default type = :composed) \(\hat\varepsilon_i = y_i - x_i'\hat\beta\) (composed error)
residuals(result; type = :u) \(\hat u_i\), the JLMS inefficiency estimate
residuals(result; type = :v) \(\hat v_i = \hat\varepsilon_i \mp \hat u_i\) (implied noise estimate)
residuals(result; type = :ols) OLS residuals from the auxiliary OLS fit stored on the result
fitted(result) (default type = :frontier) \(x_i'\hat\beta\) (frontier fit)
fitted(result; type = :response) \(x_i'\hat\beta \mp \hat u_i\) (response-scale fit)

efficiency_summary returns a NamedTuple of summary statistics for the JLMS inefficiency and the Battese–Coelli efficiency indices:

Field Meaning
n Number of observations
mean_jlms, std_jlms Mean and standard deviation of JLMS
min_jlms, max_jlms, median_jlms Min, max, and median of JLMS
quantiles_jlms Deciles q10, q20, …, q90 and quartiles q25, q75
mean_bc, std_bc Mean and standard deviation of BC
min_bc, max_bc, median_bc Min, max, and median of BC
quantiles_bc Deciles and quartiles of BC

residual_diagnostics returns a NamedTuple of numeric checks on the composed and OLS residuals:

Field Meaning
skew_ols Sample skewness of OLS residuals (matches result.OLS_resid_skew)
skew_composed Sample skewness of the composed SF residuals \(\hat\varepsilon\)
expected_skew_sign \(-1\) for production, \(+1\) for cost
skew_ols_sign_ok true iff sign(skew_ols) == expected_skew_sign
skew_composed_sign_ok true iff sign(skew_composed) == expected_skew_sign
JarqueBera_stat Jarque–Bera statistic on OLS residuals: \(\mathrm{JB} = \tfrac{n}{6}(s^2 + \tfrac{1}{4}(k-3)^2)\)
JarqueBera_pvalue Upper-tail p-value of \(\mathrm{JB}\) under \(\chi^2_2\)

Under the SF specification, rejection of normality (small JarqueBera_pvalue) is consistent with the OLS residuals containing a one-sided inefficiency component. The expected sign of the skewness is negative for production and positive for cost.

Example

eps  = residuals(result)              # composed residuals
yhat = fitted(result)                 # frontier fit
es   = efficiency_summary(result)
rd   = residual_diagnostics(result)   # includes JB test + sign check

6.11 Hypothesis Testing

Two likelihood-ratio (LR) testing utilities are provided for fitted SF models. sf_vs_ols performs the boundary LR test of the stochastic frontier against its auxiliary OLS baseline (\(H_0: \sigma_u^2 = 0\)), automatically selecting the mixed \(\bar{\chi}^2\) reference distribution (Kodde and Palm 1986). lrtest is a general nested LR test for two fitted SF models, with an optional switch to the boundary reference. Both return a single LRTestResult struct with the statistic, p-value, and pre-computed critical values.

Syntax

sf_vs_ols(result; dof = nothing)
lrtest(restricted, unrestricted; mixed = false, dof = nothing)

Arguments

Argument Type Description Required
result SFResult sf_vs_ols only. The fitted SF model; its stored OLS_loglikelihood supplies the restricted log-likelihood. Yes
restricted SFResult lrtest only. Fitted restricted SF model. Yes
unrestricted SFResult lrtest only. Fitted unrestricted SF model, nested in the restricted one. Yes
dof Int Degrees of freedom for the reference distribution. For sf_vs_ols, the number of extra SF parameters over OLS; for lrtest, inferred from the two fits if omitted. No
mixed Bool lrtest only. true for the boundary \(\bar{\chi}^2\) reference; false (default) for the standard \(\chi^2\). No

Return Value

An LRTestResult struct with the following fields:

Field Meaning
LR Likelihood ratio statistic, \(-2(\ell_R - \ell_U)\)
dof Degrees of freedom used for the reference distribution
pvalue Upper-tail p-value under the chosen reference distribution
mixed true if the mixed \(\bar{\chi}^2\) reference was used
critical_values NamedTuple with fields p10, p05, p025, p01
ll_restricted Restricted-model log-likelihood \(\ell_R\)
ll_unrestricted Unrestricted-model log-likelihood \(\ell_U\)

Example

result_lr = sf_vs_ols(result; dof = 8)                          # SF vs OLS
lrtest(result_restricted, result_unrestricted; mixed = false)   # general nested LR test

6.12 Plot Recipes

The package provides a set of built-in diagnostic plots that you can generate directly from a fitted model with plot(result, ...). To use them, load Plots.jl alongside SFrontiers.jl; see Section 3 Note 3.

Syntax

plot(result)
plot(result, which::Symbol)
plot(result, :marginal, z::AbstractVector)

Arguments

Argument Type Description Required
result SFResult A fitted model object returned by sfmodel_fit(). Yes
which Symbol Selects the recipe: :efficiency, :residuals, :marginal, or :convergence. Omit to get the default 2×2 diagnostic panel. No
z AbstractVector Only with :marginal. A covariate vector against which the marginal effect is plotted. No

Return Value

A Plots.Plot object that can be displayed, saved with savefig, or further composed. What each call produces:

Call What is plotted
plot(result) Default 2×2 diagnostic panel (JLMS and BC histograms, residuals-vs-fitted, plus QQ or marginal effect)
plot(result, :efficiency) JLMS and BC efficiency histograms side by side
plot(result, :residuals) Residuals-vs-fitted scatter, residual histogram, and QQ plot of \(\hat v\)
plot(result, :marginal) Grid of marginal-effect scatters, one per zvar covariate
plot(result, :marginal, z) Marginal effect plotted against a single user-supplied covariate vector z
plot(result, :convergence) Optimizer trace; requires store_trace = true to have been passed to sfmodel_opt() at fit time

The QQ sub-panel of :residuals compares the imputed noise estimate \(\hat v_i = \hat\varepsilon_i \mp \hat u_i\) against \(N(0, \hat\sigma_v^2)\), not the composed residual itself. This is because the model assumes normality of \(v\), not \(\varepsilon\). The composed residual is deliberately non-normal under the SF specification (its skewness is the basis of sf_vs_ols). Since \(\hat v\) is a derived latent-variable quantity, the plot is a soft distributional-misspecification diagnostic rather than a formal normality test.

If using Plots has not been called, plot(result, ...) raises UndefVarError: plot not defined, accompanied by an error hint instructing the user to load Plots.jl.

Example

using Plots

plot(result)                     # 2×2 default diagnostic panel
savefig("diag_panel.png")

plot(result, :residuals)         # three-panel residual diagnostic
plot(result, :marginal, df.age)  # marginal effect against age

6.13 Prediction on New Data

A fitted model can be applied to new observations through a single entry point, predict(). The what keyword selects the prediction target: the frontier value, the composed residual, the response-scale fit, the JLMS or BC efficiency index, the observation-level marginal effects, or all of the above in one call. Calling predict() on the original training data reproduces the cached result.jlms / result.bc bit-for-bit; calling it on new data yields the corresponding out-of-sample quantities.

Syntax

predict(result; frontier, zvar, depvar, id, what, draws)

Arguments

Argument Type Description Required
result SFResult A fitted model returned by sfmodel_fit(). Yes
what Symbol Prediction target: :frontier, :residuals, :response, :jlms, :bc, :marginal, or :all. Yes
frontier Matrix Covariates \(X_{\text{new}}\) for the frontier equation. Needed for every what except :marginal. Target-specific
zvar Matrix Covariates \(Z_{\text{new}}\) for inefficiency determinants. Needed for :response, :jlms, :bc, :marginal, :all, and only when the fitted model uses hetero or :scaling. Target-specific
depvar Vector Response values \(y_{\text{new}}\). Needed for :residuals, :response, :jlms, :bc, :all (the conditional expectations depend on \(\hat\varepsilon_i = y_i - x_i'\hat\beta\)). Target-specific
id Vector Firm identifier for each predict-time observation. Required for panel fits; ignored for cross-sectional fits. Panel only
draws Matrix or Vector Fit-time Halton draws for simulation-based fits (MCI/MSLE). Required when the fit used user-supplied draws= in sfmodel_method(); optional otherwise (Halton fits reconstruct draws deterministically). Shape: \(N_{\text{train}} \times n_{\text{draws}}\) for multiRand=true, \(1 \times n_{\text{draws}}\) (or length-\(n_{\text{draws}}\) vector) for multiRand=false. Backend-specific

Return Value

A Vector{Float64} for scalar targets, a Matrix / DataFrame for :marginal, or a NamedTuple bundling all targets for what = :all. The following table shows the per-target data requirements and the quantity returned:

what Needs \(X_{\text{new}}\) Needs \(Z_{\text{new}}\) Needs \(y_{\text{new}}\) Needs id Output
:frontier \(x_i'\hat\beta\)
:residuals \(y_i - x_i'\hat\beta\)
:response \(x_i'\hat\beta \mp \hat u_i\)
:jlms \(E(u_i\mid\hat\varepsilon_i)\)
:bc \(E(e^{-u_i}\mid\hat\varepsilon_i)\)
:marginal \(\partial E(u_i)/\partial z_{ij}\)
:all NamedTuple packaging all of the above

\(Z_{\text{new}}\) is required only when the fitted model uses exogenous determinants (hetero or :scaling). Inefficiency and efficiency indices additionally require \(y_{\text{new}}\) because they are conditional on \(\hat\varepsilon_i\). ‡ id is required only for panel fits.

Model Coverage

predict() works with every model the package fits. It inspects the fitted object and applies the right computation internally — you do not need to choose between methods or tune anything. On the training data (the same X, Z, y passed at fit time) predict() reproduces the cached result.jlms / result.bc to high precision; on new data it returns the corresponding out-of-sample quantities. Model combinations the package does not fit raise an informative error at call time.

For panel fits (:panel_TFE, :panel_TFE_CSW, :panel_TRE) you must pass id= — a vector of firm identifiers, one per observation, with rows grouped contiguously by firm (the same convention used at fit time). predict() uses it to reconstruct the panel structure and align new-data firms to the training-time firms.

User-Supplied Draws

If the fit used sfmodel_method(..., draws = my_draws) with a custom Halton matrix, the draws themselves are not stored on result. Pass the same matrix back to predict() via the draws keyword to reproduce result.jlms / result.bc; predict will error out informatively if draws is missing in that case. Halton fits (the default) can also pass draws to override the reconstructed matrix — useful for variance exploration.

Example

# In-sample — bit-matches result.jlms and result.bc
u_hat = predict(result; frontier = X, depvar = y, zvar = Z, what = :jlms)
b_hat = predict(result; frontier = X, depvar = y, zvar = Z, what = :bc)

# Out-of-sample
out = predict(result; frontier = Xnew, depvar = ynew, zvar = Znew, what = :all)
out.jlms; out.bc; out.frontier; out.residuals; out.response

# User-supplied draws (for a fit that used custom draws=...)
u_hat = predict(result; frontier = X, depvar = y, zvar = Z,
                what = :jlms, draws = my_draws)

See Section 5 ‘Prediction on New Data’ for a discussion of typical use-case scenarios (held-out scoring, counterfactual analysis, policy evaluation, forecasting).


7. Supported Models

Noise Distributions

Each table below has five columns: Symbol is the keyword used in sfmodel_spec() (e.g., noise=:Normal); Distribution gives the statistical specification; Init Parameters lists the parameter names and their transformations used in sfmodel_init(); Models indicates whether the distribution is available for cross-sectional data, panel data, or both; Methods shows the compatible estimation methods (MLE, MSLE, MCI). The Inefficiency Distributions table has an additional Hetero Options column showing valid symbols for the hetero argument in sfmodel_spec().

Symbol Distribution Init Parameters Models Methods
:Normal \(v \sim N(0, \sigma_v^2)\) ln_sigma_v_sq \(= \log(\sigma_v^2)\) cross, panel MCI, MSLE, MLE
:StudentT \(v \sim t(0, \sigma_v, \nu)\), \(\nu > 2\) ln_sigma_v_sq \(= \log(\sigma_v^2)\)
ln_nu_minus_2 \(= \log(\nu - 2)\)
cross MCI, MSLE
:Laplace \(v \sim \text{Laplace}(0, b)\) ln_b \(= \log(b)\) cross MCI, MSLE

Inefficiency Distributions

Symbol Distribution Init Parameters Models Methods Hetero Options
:HalfNormal \(u \sim N^+(0, \sigma^2)\) ln_sigma_sq \(= \log(\sigma^2)\) cross, panel MCI, MSLE, MLE :sigma_sq for \(\sigma^2\)
:TruncatedNormal \(u \sim N^+(\mu, \sigma_u^2)\) mu \(= \mu\)
ln_sigma_sq \(= \log(\sigma_u^2)\)
cross, panel MCI, MSLE, MLE :mu for \(\mu\)
:sigma_sq for \(\sigma_u^2\)
:Exponential \(u \sim \text{Exp}(\lambda)\), \(\lambda = \text{Var}(u)\) ln_lambda \(= \log(\lambda)\) cross, panel MCI, MSLE, MLE :lambda for \(\lambda\)
:Weibull \(u \sim \text{Weibull}(\lambda, k)\) ln_lambda \(= \log(\lambda)\)
ln_k \(= \log(k)\)
cross, panel MCI, MSLE :lambda for \(\lambda\)
:k for \(k\)
:Lognormal \(u \sim \text{LogNormal}(\mu, \sigma^2)\) mu \(= \mu\)
ln_sigma_sq \(= \log(\sigma^2)\)
cross, panel MCI, MSLE :mu for \(\mu\)
:sigma_sq for \(\sigma^2\)
:Lomax \(u \sim \text{Lomax}(\alpha, \lambda)\) ln_lambda \(= \log(\lambda)\)
ln_alpha \(= \log(\alpha)\)
cross, panel MCI, MSLE :lambda for \(\lambda\)
:alpha for \(\alpha\)
:Rayleigh \(u \sim \text{Rayleigh}(\sigma)\) ln_sigma_sq \(= \log(\sigma^2)\) cross, panel MCI, MSLE :sigma_sq for \(\sigma^2\)
:Gamma \(u \sim \text{Gamma}(k, \theta)\) ln_k \(= \log(k)\) (shape)
ln_theta \(= \log(\theta)\) (scale)
cross, panel MCI :k for \(k\)
:theta for \(\theta\)

Note of Scaling property model alternative: Instead of making individual distribution parameters heteroscedastic (via hetero = [:mu], etc.), you can use hetero = :scaling to model heterogeneity through a single multiplicative scaling function \(u_i = h(\mathbf{z}_i) \cdot u_i^*\). Under scaling, all distribution parameters remain scalar and a separate set of \(\boldsymbol{\delta}\) coefficients is estimated. All 8 inefficiency distributions support the scaling property model. See Section 8.5.

Copula Models

Cross-sectional models only. A copula function models the dependence between the noise term \(v\) and the inefficiency term \(u\). When copula=:None (default), \(v\) and \(u\) are assumed independent. With a copula, the joint density becomes \(f(v,u) = f_v(v) \cdot f_u(u) \cdot c(F_v(v), F_u(u); \rho)\), where \(c\) is the copula density and \(\rho\) is the dependence parameter.

Symbol Copula Parameter Kendall’s \(\tau\) Tail Dependence Init Parameter
:Gaussian Gaussian \(\rho \in (-1,1)\) \((2/\pi)\arcsin(\rho)\) None theta_rho \(= \text{atanh}(\rho)\)
:Clayton Clayton \(\rho > 0\) \(\rho/(2+\rho)\) Lower: \(2^{-1/\rho}\) theta_rho \(= \log(\rho)\)
:Clayton90 Clayton 90° rotation \(\rho > 0\) \(-\rho/(2+\rho)\) Upper-lower: \(2^{-1/\rho}\) theta_rho \(= \log(\rho)\)
:Gumbel Gumbel \(\rho \geq 1\) \(1 - 1/\rho\) Upper: \(2 - 2^{1/\rho}\) theta_rho \(= \log(\rho - 1)\)

Notes:

Parameterization

When a parameter is modeled as heteroscedastic (observation-specific), we use a link function to ensure it stays in the correct domain:

Parameter Vector Length

The total number of parameters depends on heteroscedasticity settings:

# Homoscedastic: scalar parameters
hetero = Symbol[]
# -> Each inefficiency parameter contributes 1 to parameter count

# Heteroscedastic mu: L parameters
hetero = [:mu]
# -> mu contributes L parameters (one for each Z column)

# Fully heteroscedastic truncated-normal
hetero = [:mu, :sigma_sq]
# -> mu contributes L, sigma_u^2 contributes L

# Scaling property model
hetero = :scaling
# -> Adds L_scaling delta coefficients (one per column of zvar)
# -> All inefficiency parameters are scalar (1 each)

Example: Heteroscedastic Model

# Z has 4 columns (including constant)
# hetero = [:mu, :sigma_sq] means both mu and sigma_u^2 depend on Z

spec = sfmodel_spec(
    depvar = y,
    frontier = X,          # K = 3 variables
    zvar = Z,              # L = 4 variables
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu, :sigma_sq]
)

# Parameter vector structure:
# [beta_1, beta_2, beta_3,               <- frontier (K = 3)
#  delta_1, delta_2, delta_3, delta_4,   <- mu (L = 4, heteroscedastic)
#  gamma_1, gamma_2, gamma_3, gamma_4,   <- ln_sigma_u^2 (L = 4, heteroscedastic)
#  ln_sigma_v^2]                         <- noise variance (1)
# Total: 3 + 4 + 4 + 1 = 12 parameters

Example: Scaling Property Model

# Z_nocons has 2 columns (no constant!)
# hetero = :scaling activates the scaling property model

spec = sfmodel_spec(
    depvar = y,
    frontier = X,              # K = 3 variables
    zvar = Z_nocons,           # L = 2 variables (no constant)
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling
)

# Parameter vector structure:
# [beta_1, beta_2, beta_3,    <- frontier (K = 3)
#  delta_1, delta_2,          <- scaling function (L_scaling = 2)
#  ln_sigma^2,                <- inefficiency (scalar, 1)
#  ln_sigma_v^2]              <- noise variance (1)
# Total: 3 + 2 + 1 + 1 = 7 parameters

Distribution Selection Guidance

Inefficiency Distribution

The choice of inefficiency distribution affects the shape of the estimated inefficiency, particularly its mode and tail behavior. The following table summarizes the key properties:

Distribution Parameters Mode at Zero? Tail Recommended Use
half-normal 1 (\(\sigma\)) Yes Light Default starting point; most common in the literature
Exponential 1 (\(\lambda\)) Yes Light Simple alternative to half-normal; monotone decreasing density
truncated-normal 2 (\(\mu\), \(\sigma\)) Not necessarily Light When the mode of inefficiency may be at a positive value
Rayleigh 1 (\(\sigma\)) No (mode > 0) Light One-parameter distribution with mode away from zero
Weibull 2 (\(\lambda\), \(k\)) Flexible Moderate Flexible shape: mode at zero (\(k \le 1\)) or positive (\(k > 1\))
Lognormal 2 (\(\mu\), \(\sigma\)) No (mode > 0) Heavy Right-skewed inefficiency with a heavy right tail
Lomax 2 (\(\alpha\), \(\lambda\)) Yes Heavy Heavy-tailed; useful when a few firms are highly inefficient
Gamma 2 (\(k\), \(\theta\)) Flexible Moderate Maximum flexibility; requires MCI method

Practical workflow:

  1. Start simple. Begin with half-normal or Exponential. These one-parameter distributions are easy to estimate (MLE available) and serve as a baseline.
  2. Allow non-zero mode. If theory suggests that most firms have some positive level of inefficiency (rather than clustering near zero), try truncated-normal or Rayleigh.
  3. Add flexibility. If one- or two-parameter distributions seem restrictive, try Weibull, Lognormal, or Gamma. These can capture a wider variety of shapes but require simulation-based estimation (MCI or MSLE).

Noise Distribution

Distribution When to Use
Normal Default and standard assumption.
Student-t When residuals exhibit excess kurtosis (heavier tails than Normal). The degrees-of-freedom parameter\(\nu\) is estimated.
Laplace When the noise distribution is believed to be double-exponential (sharper peak, heavier tails than Normal).

Note: Student-t and Laplace noise require simulation-based estimation (MCI or MSLE) and are available for cross-sectional models only.


8. Special Topics

Choosing Between MLE, MCI, and MSLE

The package offers three estimation methods. MLE uses analytic (closed-form) log-likelihoods, while MCI and MSLE are simulation-based and use Halton quasi-random draws.

Feature MLE MCI MSLE
Likelihood Analytic (exact) Simulated Simulated
Simulation draws Not needed Halton sequence
or user supplied sequence
Halton sequence
or user supplied sequence
GPU support No Yes Yes
Noise distributions Normal only Normal, StudentT, Laplace Normal, StudentT, Laplace
Ineff distributions half-normal, trunc-normal, Expo All 8 All except Gamma
Copula support None Gaussian, Clayton, Clayton90, Gumbel Gaussian, Clayton, Clayton90, Gumbel
Scaling property Yes (TruncNormal only) All 8 distributions All except Gamma
Heteroscedastic Yes Yes Yes
Panel models TFE, TFE_CSW, TRE TFE only TFE only
Defaultn_draws N/A 1024 1024

When to use MLE: If your model uses Normal noise with half-normal, truncated-normal, or Exponential inefficiency (and no copula), MLE is the natural first choice — it is exact (no simulation error), fast, and does not require tuning the number of draws.

When to use MCI/MSLE: For models that MLE cannot handle (non-Normal noise, copula dependence, Weibull/Lognormal/Lomax/Rayleigh/Gamma inefficiency), the simulation-based methods are required. For models supported by all three methods, MLE and simulation estimates typically agree closely.

For a given model supported by both MCI and MSLE, the two simulation methods typically produce similar estimates. Differences may arise because MCI and MSLE use different likelihood constructions, so they can respond differently to the choice of distribution, sample size, and starting values. In particular, when the data do not conform well to the assumed distributional shape, the finite set of simulation draws may cover the tails unevenly, causing the two methods to weight those observations differently.

GPU Computation

GPU acceleration is available for the simulation-based methods (MCI and MSLE) and is highly recommended when the number of draws or the sample size is large. MLE does not use GPU because its likelihood is analytic.

Why GPU helps. Simulation-based estimation evaluates the likelihood contribution for every combination of \(N\) observations and \(S\) draws, forming an \(N \times S\) matrix. These element-wise operations are massively parallel and map naturally onto GPU architectures. In practice, enabling GPU can reduce estimation time from minutes to seconds.

Requirements:

  1. An NVIDIA GPU with a compatible driver installed on the machine.
  2. The Julia package CUDA.jl (using CUDA). Julia 1.10 or later is required.
  3. CUDA.jl must be loaded before SFrontiers.jl in every session. SFrontiers detects CUDA at load time; if the order is reversed, GPU features will not be available and you must restart Julia (see Section 3).

Usage:

using CUDA          # must come before SFrontiers
using SFrontiers

meth = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12 - 1,
    GPU = true,        # enable GPU acceleration
    chunks = 10        # split data into 10 batches for memory management
)

Managing GPU memory with chunks. When \(N\) or \(S\) is large, the full \(N \times S\) matrix may exceed GPU VRAM. The chunks option splits the \(N\) observations into smaller batches of size \(N/\text{chunks}\), processing each sequentially while accumulating the log-likelihood. This trades a small amount of overhead for significantly lower peak memory usage. Start with the default (chunks = 10) and increase if you encounter out-of-memory errors. On Windows, Task Manager can be used to monitor VRAM usage in real time.

Tip: The chunks option also works on CPU and can help with large datasets even without a GPU.

Copula Models

To model dependence between the noise and inefficiency terms, specify a copula:

# Clayton copula (lower tail dependence)
spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :HalfNormal,
    copula = :Clayton,
)

# Initial values must include theta_rho for the copula parameter
myinit = sfmodel_init(
    spec = spec,
    frontier = X \ y,
    ln_sigma_sq = [0.0],
    ln_sigma_v_sq = [0.0],
    theta_rho = [0.0]       # copula dependence parameter (transformed scale)
)

# Clayton 90° rotated copula (upper-lower tail dependence)
spec_c90 = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :HalfNormal,
    copula = :Clayton90,
)

# Gumbel copula (upper tail dependence)
spec_g = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :Exponential,
    copula = :Gumbel,
)

The estimation output includes a copula auxiliary table showing:

Tail dependence by copula type:

Copula Tail Dependence Interpretation
Gaussian None (symmetric, no tail dependence) Dependence is moderate and evenly spread; no extreme co-movement in the tails
Clayton Lower tail Firms at the low end of \(v\) (e.g., negative shock) and low end of \(u\) (e.g., efficient) move together.
Clayton 90° Upper-lower (rotated) Firms at the high end of \(v\) (e.g., positive shock) and low end of \(u\) (e.g., efficient) move together.
Gumbel Upper tail Firms at the high end of both \(v\) and \(u\) move together.

Note: Copula models are supported only for cross-sectional data. Student-t noise is not compatible with copulas (the copula density requires evaluating the Student-t CDF, to which its standard implementation is not compatible with the automatic differentiation used in the package).

Scaling Property Model (Cross-Sectional)

The scaling property model introduces observation-specific heterogeneity through a single multiplicative scaling function, rather than making individual distributional parameters heteroscedastic:

\[ u_i = h(\mathbf{z}_i) \cdot u_i^*, \qquad h(\mathbf{z}_i) = \exp(\mathbf{z}_i'\boldsymbol{\delta}), \]

where \(u_i^*\) follows a homoscedastic (scalar-parameter) base distribution and \(\mathbf{z}_i\) is a vector of environmental/exogenous variables. The scaling function \(h(\mathbf{z}_i)\) is always positive and rescales the base inefficiency without changing the distributional shape.

Key mathematical properties:

  1. Jacobian cancellation: The change of variable \(u_i = h_i u_i^*\) produces a Jacobian \(h_i\) that cancels with the \(1/h_i\) from the density transformation \(f_u(u_i) = \frac{1}{h_i} f_{u^*}(u_i/h_i)\), leaving \(f_{u^*}(u_i^*)\) directly in all integrals.
  2. JLMS factorization: \(E(u_i \mid \varepsilon_i) = h_i \cdot E(u_i^* \mid \varepsilon_i)\).
  3. BC non-factorization: \(E(e^{-u_i} \mid \varepsilon_i) = E(e^{-h_i u_i^*} \mid \varepsilon_i)\); the \(h_i\) must remain inside the exponential.
  4. Semi-elasticity: \(\partial \ln E(u_i) / \partial z_{ij} = \delta_j\), providing direct interpretation of the scaling coefficients.

Unconditional mean \(E(u_i) = h(\mathbf{z}_i) \cdot E(u_i^*)\), where \(E(u_i^*)\) depends on the base distribution:

Distribution \(E(u_i^*)\)
half-normal \(\sigma\sqrt{2/\pi}\)
truncated-normal \(\sigma_u(\Lambda + \phi(\Lambda)/\Phi(\Lambda))\), \(\Lambda = \mu/\sigma_u\)
Exponential \(\sqrt{\lambda}\)
Weibull \(\lambda\,\Gamma(1 + 1/k)\)
Lognormal \(\exp(\mu + \sigma^2/2)\)
Lomax \(\lambda / (\alpha - 1)\), \(\alpha > 1\)
Rayleigh \(\sigma\sqrt{\pi/2}\)
Gamma \(k\theta\)

Identification constraint: The zvar matrix must not contain a constant column. A constant in \(\mathbf{z}_i\) would create an intercept that is not separately identified from the scale parameter of the base distribution (e.g., \(\sigma\) in half-normal or \(\lambda\) in Exponential).

Supported distributions: All 8 inefficiency distributions are supported. The Gamma distribution requires method = :MCI.

Usage Example

# Step 1: Specify a scaling property model
spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,                 # include constant
    zvar = Z_nocons,              # environmental variables (NO constant!)
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling,            # activates scaling property model
)

# Step 2: Choose estimation method (both MSLE and MCI work)
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12 - 1,
    GPU = true,
    chunks = 10
)

# Step 3: Set initial values — note the "scaling" keyword for δ
init = sfmodel_init(
    spec = spec,
    frontier = X \ y,
    scaling = zeros(size(Z_nocons, 2)),   # δ initial values
    ln_sigma_sq = [0.0],                  # scalar base parameter
    ln_sigma_v_sq = [0.0]
)

# Step 4: Configure optimization
opt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt = (iterations = 200,),
    main_solver = Newton(),
    main_opt = (iterations = 100, g_abstol = 1e-8)
)

# Step 5: Estimate
result = sfmodel_fit(
    spec = spec, method = meth, init = init,
    optim_options = opt, marginal = true, show_table = true
)

# Access scaling coefficients
println("δ coefficients: ", result.delta)

# Marginal effects (computed via ForwardDiff)
println("Mean marginal effects: ", result.marginal_mean)
println("Observation-level marginal effects: ", result.marginal)

Scaling with Copula

The scaling property model can be combined with copula dependence:

spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    zvar = Z_nocons,
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling,
    copula = :Clayton,
)

init = sfmodel_init(
    spec = spec,
    frontier = X \ y,
    scaling = zeros(size(Z_nocons, 2)),
    ln_sigma_sq = [0.0],
    ln_sigma_v_sq = [0.0],
    theta_rho = [0.0]
)

Cost Frontier Models

For cost frontiers, where inefficiency increases costs:

spec = sfmodel_spec(
    type = :cost,           # Cost frontier
    depvar = totalC,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :Exponential,
)

Mathematical Difference

The key difference between production and cost frontiers lies in the sign of the inefficiency term:

The software handles the sign convention internally when type = :cost is set. All post-estimation quantities (JLMS, BC, marginal effects) are computed correctly for cost frontiers without additional user adjustments.

Interpretation Differences

Aspect Production Frontier Cost Frontier
Composed error \(\varepsilon = v - u\) \(\varepsilon = v + u\)
OLS residual skewness Should benegative Should bepositive
JLMS\(E(u \mid \varepsilon)\) Higher = more inefficient Same interpretation
BC\(E(e^{-u} \mid \varepsilon)\) Closer to 1 = more efficient Same interpretation
BC interpretation Firm produces\(\text{BC} \times 100\%\) of potential output Firm’s cost is\(\frac{1}{\text{BC}} \times 100\%\) of the efficient cost level
Frontier coefficients Output elasticities Cost elasticities

Common Pitfall: Wrong-Sign Skewness

Before estimating a stochastic frontier model, check the OLS residual skewness (reported in the sfmodel_fit() output as OLS_resid_skew):

If the skewness has the wrong sign, the data may not support the presence of inefficiency in the assumed direction, and the model may have difficulty converging or produce unreliable estimates.

Choosing the Number of Halton Draws

The number of Monte Carlo draws affects accuracy and computation time. The following is provided only as a rough reference.

n_draws Typical Use Case
127 Quick testing, CPU comfortable
1024 Standard estimation (default)
4095 (\(2^{12}-1\)) Publication-quality results
8191 High precision
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12 - 1  # 4095 draws
)

Observation-Specific Halton Draws (multiRand)

Simulation-based estimation (MCI and MSLE) approximates the likelihood by drawing simulated inefficiency values \(\{u_i^s\}\), \(s = 1, \ldots, S\), for each observation \(i\). These are generated from a set of uniform draws \(\{r_i^s\} \in (0, 1)\) via inverse transform sampling: \(u_i^s = F^{-1}(r_i^s)\), where \(F^{-1}\) is the inverse CDF (quantile function) of the assumed inefficiency distribution. This works because if \(r\) is uniformly distributed on \((0, 1)\), then \(F^{-1}(r)\) follows the distribution \(F\).

Instead of using pseudo-random uniform draws, the package uses the base-2 Halton sequence — a deterministic low-discrepancy sequence that covers the \((0, 1)\) interval more evenly than random sampling, leading to faster convergence of the simulated likelihood. For further variance reduction, the draws \(\{r_i^s\}\) should be fixed across optimization iterations for a given observation \(i\) but vary across different observations. The multiRand=true option (default) achieves this by assigning each observation its own consecutive segment of the Halton sequence. The algorithm, proposed by Chen and Wang (2026), is designed to maximize the number of distinct elements while leveraging the optimal coverage properties of the base-2 Halton sequence.

Algorithm. Let \(S\) denote the number of draws per observation and \(N\) the sample size.

  1. Compute the total number of draws needed: \(Q = N \times S\).
  2. Find the largest integer \(m^*\) such that \(2^{m^*} - 1 \leq Q\), i.e., \(m^* = \lfloor \log_2(Q + 1) \rfloor\). Generate a base-2 Halton sequence of length \(M = 2^{m^*} - 1\).
  3. If \(Q > M\), extend the sequence by wrapping around: append the first \(Q - M\) elements of the sequence to itself, bringing the total length to exactly \(Q\). Since any subset of the Halton sequence retains its low-discrepancy property, the recycled portion preserves the uniformity of the original sequence.

The resulting length-\(Q\) sequence is then divided among observations in consecutive blocks: observation 1 receives elements \([1, \ldots, S]\), observation 2 receives \([S+1, \ldots, 2S]\), and so on.

Why \(2^m - 1\)? The base-2 Halton sequence achieves particularly well-balanced spacing at lengths of \(2^m - 1\). For example, at \(m = 3\) the sequence produces 7 points \(\{0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875\}\), which are evenly distributed over \([0,1)\). Adding an 8th point (\(0.0625\)) would disrupt this balance. The algorithm therefore truncates at the largest such length that fits within \(Q\) to ensure optimal coverage.

Example. Suppose \(N = 200\) and \(S = 60\). Then \(Q = 12{,}000\) and \(m^* = \lfloor \log_2(12{,}001) \rfloor = 13\), so the algorithm generates \(M = 2^{13} - 1 = 8{,}191\) distinct Halton points and recycles the first \(3{,}809\) to fill the remaining slots, for a total of \(12{,}000\). Each observation is assigned a consecutive block of 60 draws. If \(N\) increases to \(5{,}000\) (with the same \(S = 60\)), then \(Q = 300{,}000\), \(m^* = 18\), and \(M = 262{,}143\) distinct elements are generated.

Upper bound on sequence length. The option distinct_Halton_length (default \(2^{15} - 1 = 32{,}767\)) caps the length of the generated sequence to avoid Halton points that lie extremely close to 0 or 1, which can trigger numerical instability through operations like \(\log(t)\) or \(1/(1 - t)\). In practice, the algorithm uses \(\min(M,\, \texttt{distinct\_Halton\_length})\) as the effective sequence length. This cap can be increased via the distinct_Halton_length option in sfmodel_method().

Constraint: When multiRand=true, n_draws must be \(\leq\) distinct_Halton_length (default \(32{,}767\)). To use more draws per observation, either increase distinct_Halton_length or set multiRand=false.

# Default: observation-specific draws (recommended)
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 1024,    # per obs draws; must be <= distinct_Halton_length (default 32767)
    multiRand = true   # Default
)

# Use more draws by increasing distinct_Halton_length
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 50000,
    multiRand = true,
    distinct_Halton_length = 2^16 - 1  # 65535
)

# Legacy mode: all observations share the same draws
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 8191,    # No constraint from distinct_Halton_length
    multiRand = false
)

Custom Draw Sequences

Instead of using the built-in base-2 Halton sequence, users can supply their own uniform draws \(\{r^s\}\), \(s = 1, \ldots, S\), via the draws option in sfmodel_method(). This allows the use of alternative low-discrepancy sequences (e.g., Halton with a different base, Sobol sequences) or pseudo-random numbers. The custom sequence must be a \(1 \times S\) matrix, and the same draws are applied to all observations (multiRand=false is required).

# Example 1: Custom Halton sequence (e.g., different base via HaltonSequences.jl)
using HaltonSequences

halton_vec = make_halton_p(1024; T = Float64)
halton = reshape(halton_vec, 1, length(halton_vec))  # Convert to 1xS

meth = sfmodel_method(
    method = :MSLE,
    draws = halton,     # 1xS matrix
    multiRand = false   # required when providing custom draws
)

# Example 2: Pseudo-random uniform draws
S = 1024
rand_draws = reshape(rand(S), 1, S)  # 1xS matrix of U(0,1) draws

meth = sfmodel_method(
    method = :MSLE,
    draws = rand_draws,
    multiRand = false
)

Handling Convergence Issues

If estimation fails to converge:

  1. Try different initial values

    Convergence failures are often caused by poor starting points. Use sfmodel_init() to supply values closer to the expected estimates — for example, based on OLS residuals or results from a simpler model. Even small changes in initial values can determine whether the optimizer finds the global maximum or gets stuck at a local optimum or saddle point.

  2. Change warmstart iterations

    Note, more warmstart iterations are not always better — an excessive warmstart can overshoot and move the parameters away from the basin of the global optimum, making it harder for the main solver to converge.

    myopt = sfmodel_opt(
        warmstart_solver = NelderMead(),
        warmstart_opt = (iterations = 400,),  # More warmstart iterations
        main_solver = Newton(),
        main_opt = (iterations = 200,)
    )
  3. Use different optimizers

    myopt = sfmodel_opt(
        warmstart_solver = ParticleSwarm(),  # Global search, slow
        warmstart_opt = (iterations = 500,),
        main_solver = BFGS(),  # Alternative to Newton
        main_opt = (iterations = 2000,)
    )
  4. Try a different estimation method

    If one method has convergence difficulties, try another:

    # If MSLE fails, try MCI (or vice versa)
    meth_alt = sfmodel_method(method = :MCI, n_draws = 1024)
    result = sfmodel_fit(spec = spec, method = meth_alt)
    
    # Or try MLE (if the model supports it: Normal noise, no copula,
    # half-normal/truncated-normal/Exponential inefficiency)
    meth_mle = sfmodel_method(method = :MLE)
    result = sfmodel_fit(spec = spec, method = meth_mle)
  5. Check OLS residual skewness

    Before investing effort in convergence tuning, verify that the data supports the presence of inefficiency. If result.OLS_resid_skew has the wrong sign (positive for production, negative for cost), the data may not exhibit the one-sided pattern that stochastic frontier models require. In such cases, even a perfectly converged model may produce meaningless estimates.

  6. Reduce model complexity first

    If a heteroscedastic or copula model fails to converge, first estimate a simpler version:

    • Drop the copula (copula = :None) to establish a baseline.
    • Remove heteroscedasticity (hetero = Symbol[]) and use a homoscedastic specification.
    • Use a simpler inefficiency distribution (e.g., half-normal instead of Lognormal).

    Once the simple model converges, use its estimates as initial values for the more complex specification. |


9. Panel Data Models

The module supports several panel stochastic frontier models for estimating firm-level inefficiency from balanced or unbalanced panel data. Panel estimation is accessed through the same unified API by setting the datatype argument in sfmodel_spec().

Panel Model Types

datatype Model Methods Ineff (MLE)
:panel_TFE Wang and Ho (2010) true fixed-effect MCI, MSLE, MLE half-normal, truncated-normal
:panel_TFE_CSW Chen, Schmidt, and Wang (2014) fixed-effect MLE only half-normal only
:panel_TRE Greene (2005) True random-effect MLE only half-normal, truncated-normal

For panel_TFE, the simulation-based methods (MCI, MSLE) support all 8 inefficiency distributions. MLE support is limited to the distributions listed above. For panel_TFE_CSW and panel_TRE, only MLE is available.

9.1 Theoretical Background

Wang and Ho (2010) True Fixed-Effect (panel_TFE)

The Wang and Ho (2010) model starts from:

\[ y_{it} = \alpha_i + x_{it}'\beta + v_{it} - h(z_{it}) \cdot u_i^* \]

where \(\alpha_i\) is a firm-specific fixed effect, \(v_{it} \sim N(0, \sigma_v^2)\) is noise, \(u_i^* \ge 0\) is inefficiency, and \(h(z_{it}) = \exp(z_{it}'\delta)\) is a scaling function.

To eliminate \(\alpha_i\), a within-group transformation (demeaning) is applied:

\[ \tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{v}_{it} - \tilde{h}_{it} \cdot u_i^* \]

where tildes denote demeaned values. Key point: \(y\) and \(X\) are demeaned, but \(Z\) is NOT demeaned. Instead, \(h(z_{it})\) is computed first, then demeaned to obtain \(\tilde{h}_{it}\).

Because the demeaned noise has a singular covariance \(\Sigma = \sigma_v^2(I - \frac{1}{T}\mathbf{1}\mathbf{1}')\), the quadratic form simplifies to:

\[ \text{pert}' \Sigma^+ \text{pert} = \|\text{pert}\|^2 / \sigma_v^2 \]

This eliminates the need for the pseudo-inverse, keeping computation efficient.

Chen, Schmidt, and Wang (2014) Fixed-Effect (panel_TFE_CSW)

The CSW model (Chen, Schmidt, and Wang 2014, Journal of Econometrics) provides an alternative approach to estimating true fixed-effect stochastic frontier models. After the within-group transformation eliminates the fixed effects \(\alpha_i\), the CSW approach exploits properties of the closed skew-normal (CSN) distribution to derive a closed-form likelihood for the demeaned composed error. This avoids the need for the scaling-property structure required by Wang and Ho (2010).

Key characteristics:

True Random-Effect (panel_TRE)

The TRE model, attributed to Greene (2005), treats the individual effect \(\alpha_i\) as a random draw from \(N(0, \sigma_\alpha^2)\) rather than as a fixed parameter to be eliminated. Because \(\alpha_i\) is random and integrated out of the likelihood, a constant term may be included in the frontier equation (unlike the fixed-effect models where demeaning eliminates constants).

Key characteristics:

When to use TRE vs. TFE:

The choice between TRE and TFE mirrors the classic random-effects vs. fixed-effects tradeoff in panel econometrics:

9.2 Panel Quick Start

Example: Panel TFE with MSLE (simulation-based)

using CUDA
using SFrontiers
using CSV, DataFrames, Optim

# Load panel data (stacked: firm 1 all T periods, firm 2 all T periods, ...)
df = CSV.read("panel_data.csv", DataFrame)
y = df.y
X = hcat(df.x1, df.x2)    # NT x K — no constant column!
Z = hcat(df.z1)            # NT x L — no constant column!

# Step 1: Specify panel model
myspec = sfmodel_spec(
    type = :production,
    datatype = :panel_TFE,    # Wang and Ho 2010 true fixed-effect
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :HalfNormal,
    id = df.firm,             # unit identifier (required for all panel models)
)

# Step 2: Choose estimation method
mymeth = sfmodel_method(method = :MSLE, n_draws = 1024)

# Step 3: Set initial values (panel-specific keywords)
myinit = sfmodel_init(
    spec = myspec,
    scaling = [0.1],          # scaling function coefficient
    ln_sigma_sq = 0.1,       # scalar
    ln_sigma_v_sq = 0.1      # scalar
)

# Step 4: Configure optimization
myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt = (iterations = 200, g_abstol = 1e-3),
    main_solver = Newton(),
    main_opt = (iterations = 200, g_abstol = 1e-8)
)

# Step 5: Estimate
result = sfmodel_fit(
    spec = myspec,
    method = mymeth,
    init = myinit,
    optim_options = myopt,
    show_table = true
)

# Access results
println("Log-likelihood: ", result.loglikelihood)
println("Mean JLMS: ", mean(result.jlms))      # firm-level, N-vector
println("Mean BC: ", mean(result.bc))            # firm-level, N-vector

Example: Panel TFE with MLE (analytic)

# Same spec as above — MLE is available when ineff is half-normal or truncated-normal
mymeth_mle = sfmodel_method(method = :MLE)  # no draws needed
result_mle = sfmodel_fit(spec = myspec, method = mymeth_mle, show_table = true)

Example: Panel TFE_CSW (MLE only)

spec_csw = sfmodel_spec(
    datatype = :panel_TFE_CSW,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :HalfNormal,        # CSW requires half-normal
    id = firm_id
)
meth = sfmodel_method(method = :MLE)
result = sfmodel_fit(spec = spec_csw, method = meth)

Example: Panel TRE (MLE only)

spec_tre = sfmodel_spec(
    datatype = :panel_TRE,
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :HalfNormal,        # or :TruncatedNormal
    id = firm_id
)
meth = sfmodel_method(method = :MLE)
result = sfmodel_fit(spec = spec_tre, method = meth)

9.3 Panel vs. Cross-Sectional Differences

Feature Cross-Sectional Panel (TFE / TFE_CSW / TRE)
datatype :cross_sectional (default) :panel_TFE, :panel_TFE_CSW, :panel_TRE
Noise distributions Normal, Student T, Laplace Normal only
Inefficiency distributions All 8 TFE: all 8 (MCI/MSLE), 2 (MLE); CSW: half-normal; TRE: 2
Methods MCI, MSLE, MLE TFE: all 3; CSW/TRE: MLE only
Copula Gaussian, Clayton, Clayton90, Gumbel Not supported
Heteroscedastic hetero Yes (via Z), or hetero=:scaling for scaling property model supported through scaling function
Scaling property model hetero=:scaling; zvar has no constant; init keyword: scaling Always active; zvar has no constant; init keyword: scaling
Constant in frontier/zvar Required in frontier; required in zvar unless hetero=:scaling Not allowed (within-demeaning eliminates)
JLMS/BC indices Observation-level (\(N\) vector) Observation-level vector
Panel id variable N/A id (required for both balanced and unbalanced)

9.4 No Constant Columns

In the Wang and Ho (2010) panel model (panel_TFE), within-group demeaning eliminates any constant terms. Therefore:


Citation