SFrontiers.jl: Stochastic Frontier Model Estimation: Simulation-Based and Analytic Methods

User Manual for Julia Implementation

© Hung-Jen Wang
wangh@ntu.edu.tw


What is it

SFrontiers.jl is a Julia package for flexible estimation of stochastic frontier models. Instead of limiting users to a small set of models with closed-form likelihoods, it provides simulation-based methods that make it practical to work with a much wider range of distributional assumptions, including three choices for the noise component (v) and eight choices for the inefficiency component (u), as well as copula dependence and selected panel-data settings, all within a unified workflow. It also takes advantage of automatic differentiation and GPU acceleration for accurate and efficient estimation.


Table of Contents

  1. Introduction
  2. Hardware and Software Requirements
  3. Installation and Dependencies
  4. Quick Start and Reference Example
  5. API Reference
  6. Supported Models
  7. Distributions Reference
  8. Working with Results
  9. Advanced Topics
  10. Panel Data Models

1. Introduction

This package provides a unified framework for estimating stochastic frontier (SF) models via simulation-based likelihood evaluation and, for a limited subset of models, analytic maximum likelihood estimation (MLE). The simulation-based methods, Maximum Simulated Likelihood Estimation (MSLE) and Monte Carlo Integration (MCI), support a broad class of distributional and dependence specifications. For the classical Normal–half-normal, Normal–truncated-normal, and Normal–Exponential models, closed-form analytic MLE is also available.

To address the traditional drawbacks of simulation-based estimation, the package uses automatic differentiation for numerical accuracy and GPU computing for speed. Monte Carlo evidence in Wang and Cheng (2026) shows that the resulting estimators achieve bias and RMSE comparable to those of analytic maximum likelihood estimators where such benchmarks exist.

For the theoretical background of the methods, see Wang and Cheng (2026).

Key features of the package:


2. Hardware and Software Requirements

2.1 Hardware

2.2 Software


3. Installation and Dependencies

SFrontiers.jl is a registered Julia package. Install it from Julia using the built-in package manager Pkg:

using Pkg
Pkg.add("SFrontiers")

To load the package:

using SFrontiers
using Optim

GPU Support (Optional)

CUDA.jl is not a required dependency. It is only needed if you want GPU-accelerated estimation via sfmodel_method(method=:MCI, GPU=true) or sfmodel_method(method=:MSLE, GPU=true). MLE estimation does not use GPU.

Important: If you plan to use GPU features, CUDA.jl must be loaded before SFrontiers.jl. This is because SFrontiers conditionally detects CUDA at load time and registers GPU function overloads only if CUDA is already available.

# Correct order for GPU usage:
using CUDA          # Load CUDA first
using SFrontiers    # Then load SFrontiers
using Optim

# CPU-only usage (no CUDA needed):
using SFrontiers
using Optim

If CUDA is loaded after SFrontiers, GPU features will not be available and you will need to restart Julia and load them in the correct order.


4. Quick Start and Reference Example

Here we show a complete five-step estimation example. Some steps are optional and can be skipped; more examples appear later.

using CSV, DataFrames
using CUDA
using SFrontiers
using Optim

# Load data. Assume the csv has column names y, x1, x2, and x3 that are used as variable names.
df = CSV.read("demodata.csv", DataFrame)
y = df.y
X = hcat(ones(length(y)), df.x1, df.x2)  # Frontier variables (include constant)
Z = hcat(ones(length(y)), df.x3)          # Inefficiency determinants

# Step 1: Specify the model
myspec = sfmodel_spec(
    type = :production,        # for production frontier; alt. :cost
    depvar = y,                # dependent variable; a vector
    frontier = X,              # matrix of X vars in frontier
    zvar = Z,                  # matrix of Z vars for inefficiency
    noise = :Normal,           # dist of v
    ineff = :TruncatedNormal,  # dist of u
    hetero = [:mu, :sigma_sq], # make u's μ & σ² heteroscedastic
)

# Step 2: Choose the estimation method
mymeth = sfmodel_method(
    method = :MSLE,           # :MLE, :MSLE, or :MCI
    n_draws = 2^12,           # number of Halton draws per obs
    GPU = true,               # use GPU; default is `false` thus CPU
    chunks = 10               # for GPU memory management; default 10
)

# Step 3: Set initial values
myinit = sfmodel_init(
    spec = myspec,              # from sfmodel_spec()
    frontier = X \ y,           # OLS estimates
    mu = zeros(size(Z, 2)),     # μ coefficients
    ln_sigma_sq = zeros(size(Z, 2)),  # ln(σ²) coefficients
    ln_sigma_v_sq = [0.0]       # ln(σᵥ²)
)

# Step 4: Configure optimization for Julia's Optim
myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),  # first stage "warm-up" estimation
    warmstart_opt    = (iterations = 200, g_abstol = 1e-5),
    main_solver      = Newton(),           # 2nd stage "main" estimation
    main_opt         = (iterations = 100, g_abstol = 1e-8)
)

# Step 5: Estimate the model
result = sfmodel_fit(
    spec = myspec,         # from sfmodel_spec()
    method = mymeth,       # from sfmodel_method()
    init = myinit,         # from sfmodel_init()
    optim_options = myopt, # from sfmodel_opt()
    marginal = true,       # marginal effects of Z; the default
    show_table = true      # print estimation table; the default
)

# Access results
println("coefficients: ", result.coeff)
println("BC efficiency index: ", result.bc)
println("Marginal Effects of Z: ", result.marginal)

MLE alternative: For models with Normal noise and half-normal, truncated-normal, or Exponential inefficiency, analytic MLE is available and requires no simulation settings:

mymeth_mle = sfmodel_method(method = :MLE)  # no draws, GPU, or chunks needed
result_mle = sfmodel_fit(spec = myspec, method = mymeth_mle, init = myinit, optim_options = myopt)

4.1 A Detailed Empirical Example

We replicate the empirical study of Wang (2002) using the same dataset to walk through the full specification and estimation process. The model is a cross-sectional stochastic production frontier with normal noise and truncated-normal inefficiency, where both \(\mu\) and \(\sigma_u^2\) of the inefficiency distribution are parameterized by exogenous determinants.

Although this model admits a closed-form MLE (which was used by Wang 2002), we demonstrate estimation via MSLE to illustrate the simulation-based workflow. The MSLE estimates are extremely close to those of MLE, typically agreeing to at least 5 decimal places.

Model Setup

Consider the following stochastic frontier model:

\[ \begin{aligned} y_i &= x_i'\beta + \varepsilon_i, \\ \varepsilon_i &= v_i - u_i, \\ v_i &\sim N(0, \sigma_v^2), \\ u_i &\sim N^+(\mu_i, \sigma_{u,i}^2), \\ \mu_i &= z_i'\delta, \quad \sigma_{u,i}^2 = \exp(z_i'\gamma). \end{aligned} \]

Here, \(N^+(\mu, \sigma_u^2)\) denotes a truncated-normal distribution obtained by truncating \(N(\mu, \sigma_u^2)\) from below at 0. The vector \(z_i\) contains exogenous determinants of inefficiency. Wang (2002) parameterizes both \(\mu\) and \(\sigma_u^2\) by the same vector \(z_i\), while Battese and Coelli (1995) parameterize only \(\mu\).
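As a quick numeric illustration (a sketch using Distributions.jl, which the estimation itself does not require), \(N^+(\mu, \sigma_u^2)\) can be constructed by truncating a normal distribution from below at zero:

```julia
using Distributions
using Statistics

# N^+(μ, σ_u²): Normal(μ, σ_u) truncated from below at 0.
μ, σu = 0.5, 1.0
u_dist = truncated(Normal(μ, σu), 0.0, Inf)

# Cutting off the negative tail pushes the mean above μ,
# and every draw is nonnegative by construction.
mean(u_dist)                      # > μ
all(rand(u_dist, 1_000) .>= 0)    # true
```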

Goals of Estimation

Our goals include:

Step 1: Load Data and Specify the Model

We use production data from paddy farmers in India. The dependent variable \(y\) is annual rice production and \(x\) is a vector of agricultural inputs.

Using DataFrame with DSL macros:

using CUDA
using SFrontiers
using Optim
using DataFrames, CSV

df = CSV.read("sampledata.csv", DataFrame)
df._cons = ones(nrow(df))    # create a column of ones for intercepts

Let’s see what is in the data:

julia> describe(df)
11×7 DataFrame
 Row │ variable  mean       min       median    max       nmissing  eltype
     │ Symbol    Float64    Real      Float64   Real      Int64     DataType
─────┼───────────────────────────────────────────────────────────────────────
   1 │ yvar       7.27812    3.58666   7.28586   9.80335         0  Float64
   2 │ Lland      1.05695   -1.60944   1.14307   3.04309         0  Float64
   3 │ PIland     0.146997   0.0       0.0       1.0             0  Float64
   4 │ Llabor     6.84951    3.2581    6.72263   9.46622         0  Float64
   5 │ Lbull      5.64161    2.07944   5.68358   8.37008         0  Float64
   6 │ Lcost      4.6033     0.0       5.1511    8.73311         0  Float64
   7 │ yr         5.38007    1         5.0      10               0  Int64
   8 │ age       53.8856    26        53.0      90               0  Int64
   9 │ school     2.02583    0         0.0      10               0  Int64
  10 │ yr_1       5.38007    1         5.0      10               0  Int64
  11 │ _cons      1.0        1         1.0       1               0  Int64

Now we specify the model:

myspec = sfmodel_spec(
    type   = :production,
    @useData(df),
    @depvar(yvar),
    @frontier(_cons, Lland, PIland, Llabor, Lbull, Lcost, yr),
    @zvar(_cons, age, school, yr),
    noise  = :Normal,
    ineff  = :TruncatedNormal,
    hetero = [:mu, :sigma_sq],    # both μ and σ²_u depend on zvar
)

Alternative: using matrix data

If data comes from matrices rather than a DataFrame (common in simulation studies), use keyword arguments directly:

y = df.yvar
X = hcat(ones(nrow(df)), df.Lland, df.PIland, df.Llabor, df.Lbull, df.Lcost, df.yr)
Z = hcat(ones(nrow(df)), df.age, df.school, df.yr)

myspec = sfmodel_spec(
    type     = :production,
    depvar   = y,              # dependent variable vector
    frontier = X,              # covariate matrix (include constant column)
    zvar     = Z,              # covariates for inefficiency determinants
    noise    = :Normal,
    ineff    = :TruncatedNormal,
    hetero   = [:mu, :sigma_sq],
    varnames = ["_cons", "Lland", "PIland", "Llabor", "Lbull", "Lcost", "yr",
                "_cons", "age", "school", "yr",
                "_cons", "age", "school", "yr",
                "_cons"],  # optional; auto-generated if omitted
)

Step 2: Choose the Estimation Method

mymeth = sfmodel_method(
    method  = :MSLE,     # or :MCI, :MLE
    n_draws = 2^13,      # number of Halton draws per observation
    GPU     = true       # set true for GPU acceleration
)

Step 3: Set Initial Values (optional)

myinit = sfmodel_init(
    spec          = myspec,
    # frontier    = ...,          # skip to use OLS-based defaults
    mu            = zeros(4),     # 4 coefficients in μ equation
    ln_sigma_sq   = zeros(4),     # 4 coefficients in ln(σ²_u) equation
    ln_sigma_v_sq = [0.0]         # 1 coefficient in ln(σ²_v)
)

Step 4: Configure the Optimizer (optional)

An effective strategy for challenging problems is a two-stage approach: a gradient-free solver first to find a good neighborhood, then a gradient-based solver for precise convergence.

myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt    = (iterations = 200, g_abstol = 1e-5),
    main_solver      = Newton(),
    main_opt         = (iterations = 100, g_abstol = 1e-8)
)

Step 5: Estimate the Model

result = sfmodel_fit(
    spec          = myspec,
    method        = mymeth,
    init          = myinit,
    optim_options = myopt,
    marginal      = true,       # compute marginal effects (default)
    jlms_bc_index = true,       # compute efficiency indices (default)
    show_table    = true        # print results to console (default)
)

Results and Post-Estimation Analysis

After estimation, sfmodel_fit() prints a formatted summary to the console. Below is sample output from the paddy-farmer example:

*********************************
      Estimation Results
*********************************
Method: MSLE
Model type: noise=Normal, ineff=TruncatedNormal
Heteroscedastic parameters: [:mu, :sigma_sq]
Number of observations: 271
Number of frontier regressors (K): 7
Number of Z columns (L): 4
Number of draws: 8192
Frontier type: production
GPU computing: true
Number of iterations: 16
Converged: true
Log-likelihood: -82.01844

┌──────────┬────────┬─────────┬──────────┬──────────┬────────┬─────────┬─────────┐
│          │   Var. │   Coef. │ Std.Err. │        z │  P>|z| │ 95%CI_l │ 95%CI_u │
├──────────┼────────┼─────────┼──────────┼──────────┼────────┼─────────┼─────────┤
│ frontier │  _cons │  1.5430 │   0.3578 │   4.3127 │ 0.0000 │  0.8418 │  2.2443 │
│          │  Lland │  0.2582 │   0.0725 │   3.5611 │ 0.0004 │  0.1161 │  0.4004 │
│          │ PIland │  0.1718 │   0.1761 │   0.9753 │ 0.3303 │ -0.1734 │  0.5169 │
│          │ Llabor │  1.1658 │   0.0840 │  13.8805 │ 0.0000 │  1.0012 │  1.3304 │
│          │  Lbull │ -0.4215 │   0.0596 │  -7.0666 │ 0.0000 │ -0.5384 │ -0.3046 │
│          │  Lcost │  0.0142 │   0.0128 │   1.1090 │ 0.2685 │ -0.0109 │  0.0394 │
│          │     yr │  0.0183 │   0.0095 │   1.9226 │ 0.0556 │ -0.0004 │  0.0369 │
│        μ │  _cons │  1.0415 │   0.7284 │   1.4298 │ 0.1540 │ -0.3862 │  2.4691 │
│          │    age │ -0.0479 │   0.0303 │  -1.5804 │ 0.1153 │ -0.1073 │  0.0115 │
│          │ school │ -0.2143 │   0.1712 │  -1.2521 │ 0.2117 │ -0.5497 │  0.1212 │
│          │     yr │  0.1480 │   0.1248 │   1.1854 │ 0.2369 │ -0.0967 │  0.3926 │
│   ln_σᵤ² │  _cons │ -1.1393 │   0.8902 │  -1.2798 │ 0.2018 │ -2.8842 │  0.6055 │
│          │    age │  0.0256 │   0.0096 │   2.6653 │ 0.0082 │  0.0068 │  0.0445 │
│          │ school │  0.1141 │   0.0569 │   2.0054 │ 0.0460 │  0.0026 │  0.2256 │
│          │     yr │ -0.2256 │   0.0496 │  -4.5500 │ 0.0000 │ -0.3228 │ -0.1284 │
│   ln_σᵥ² │ ln_σᵥ² │ -3.2668 │   0.2623 │ -12.4553 │ 0.0000 │ -3.7808 │ -2.7527 │
└──────────┴────────┴─────────┴──────────┴──────────┴────────┴─────────┴─────────┘

Log-parameters converted to original scale (σ² = exp(log_σ²)):
┌─────┬────────┬──────────┐
│     │  Coef. │ Std.Err. │
├─────┼────────┼──────────┤
│ σᵥ² │ 0.0381 │   0.0100 │
└─────┴────────┴──────────┘

Table format: text

***** Additional Information *********
* OLS (frontier-only) log-likelihood: -104.96993
* Skewness of OLS residuals: -0.70351
* The sample mean of the JLMS inefficiency index: 0.33416
* The sample mean of the BC efficiency index: 0.74619

* The sample mean of inefficiency determinants' marginal effects on E(u): (age = -0.00264, school = -0.01197, yr = -0.0265)
* Marginal effects of the inefficiency determinants at the observational level are saved in the return. See the follows.

* Use `name.list` to see saved results (keys and values) where `name` is the return specified in `name = sfmodel_MSLE_fit(..)`. Values may be retrieved using the keys. For instance:
   ** `name.loglikelihood`: the log-likelihood value of the model;
   ** `name.jlms`: Jondrow et al. (1982) inefficiency index;
   ** `name.bc`: Battese and Coelli (1988) efficiency index;
   ** `name.marginal`: a DataFrame with variables' (if any) marginal effects on E(u).
* Use `keys(name.list)` to see available keys.
**************************************

The returned result is a NamedTuple whose fields provide programmatic access to all outputs:

Key fields:

result.coeff: Full estimated parameter vector
result.std_err: Asymptotic standard errors
result.var_cov_mat: Variance–covariance matrix
result.table: Formatted coefficient table
result.jlms: JLMS inefficiency index \(E[u_i \mid \varepsilon_i]\)
result.bc: Battese–Coelli efficiency index \(E[\exp(-u_i) \mid \varepsilon_i]\)
result.marginal: Observation-level marginal effects (DataFrame)
result.marginal_mean: Sample mean of marginal effects
result.loglikelihood: Log-likelihood of the estimated model
result.OLS_loglikelihood: Log-likelihood of the OLS model
result.OLS_resid_skew: Skewness of OLS residuals
result.converged: Whether the optimizer converged

Hypothesis Testing

We can test whether the data support the frontier specification against an OLS model using a likelihood ratio (LR) test. The null hypothesis is that inefficiency is absent (\(u_i = 0\)).

julia> LR = -2 * (result.OLS_loglikelihood - result.loglikelihood)
45.90297766691435

Because testing \(u_i = 0\) is on the boundary of the parameter space, the appropriate distribution is the mixed \(\bar{\chi}^2\). Critical values are obtained with sfmodel_MixTable(dof), where dof is the number of parameters involved in \(u_i\) (here, 4 in \(\mu\) + 4 in \(\log \sigma_u^2\) = 8):

julia> sfmodel_MixTable(8)

  * Significance levels and critical values of the mixed χ² distribution
┌─────┬────────┬────────┬────────┬────────┐
│ dof │   0.10 │   0.05 │  0.025 │   0.01 │
├─────┼────────┼────────┼────────┼────────┤
│ 8.0 │ 12.737 │ 14.853 │ 16.856 │ 19.384 │
└─────┴────────┴────────┴────────┴────────┘

source: Table 1, Kodde and Palm (1986, Econometrica).

Since the LR statistic (\(45.903\)) is much larger than the critical value at the 1% level (\(19.384\)), we overwhelmingly reject the null hypothesis of an OLS model.

Inefficiency and Efficiency Index

The JLMS inefficiency index and the Battese–Coelli efficiency index are computed automatically and stored in the result:

julia> [result.jlms  result.bc]
271×2 CuArray{Float64, 2, CUDA.DeviceMemory}:
 0.571107   0.574412
 0.510025   0.610202
 0.10391    0.904546
 0.287699   0.758798
 0.15192    0.864167
 0.570986   0.574326
 ⋮
 1.17584    0.314265
 0.428374   0.662447
 0.847933   0.436294
 0.109994   0.899461
 0.175169   0.845739
 0.165553   0.853446

These can be visualized using standard plotting packages:

using Plots

h1 = histogram(result.jlms, xlabel="JLMS", bins=100, label=false)
h2 = histogram(result.bc, xlabel="BC", bins=50, label=false)
plot(h1, h2, layout=(1, 2), legend=false)
Histograms of the JLMS inefficiency index and the BC efficiency index

Marginal Effects

The marginal effects of the inefficiency determinants on \(E(u_i)\) at the observation level are available as a DataFrame:

julia> result.marginal
271×3 DataFrame
│ Row │ marg_age    │ marg_school │ marg_yr     │
│     │ Float64     │ Float64     │ Float64     │
├─────┼─────────────┼─────────────┼─────────────┤
│ 1   │ -0.0052194  │ -0.0234664  │ -0.0134735  │
│ 2   │ -0.00636135 │ -0.028566   │ -0.00756538 │
│ 3   │ -0.00775549 │ -0.0347949  │ -0.00113382 │
│ 4   │ -0.00953588 │ -0.0427526  │ 0.00630896  │
⋮
│ 268 │ 0.00291455  │ 0.0128417   │ -0.0596646  │
│ 269 │ 0.00221272  │ 0.0097239   │ -0.051836   │
│ 270 │ 0.00160268  │ 0.00701424  │ -0.0449243  │
│ 271 │ 0.00107203  │ 0.0046576   │ -0.0388295  │

The marginal effects can be plotted against covariates to reveal non-linear patterns:

using Plots

scatter(df.age, result.marginal[:, :marg_age],
        xlabel="age", ylabel="marginal effect of age on E(u)",
        label=false)
hline!([0.0], label=false, linestyle=:dash)
Marginal effect of age on E(u)

The plot reveals a non-monotonic effect: production inefficiency decreased with age in the early years of the farmer’s life (perhaps due to experience accumulation), but increased with age in later years (perhaps due to deteriorating physical health). Wang’s (2002) model allows this non-monotonic effect by parameterizing both \(\mu\) and \(\sigma_u^2\) by the same vector of inefficiency determinants.

Saving Results

The result can be saved for later analysis. Using the JLD2 package for binary storage:

using JLD2

save_object("model1.jld2", result)             # save everything
result_loaded = load_object("model1.jld2")     # load it back

For cross-platform, human-readable text storage:

using CSV

CSV.write("marginal.csv", result.marginal)     # marginal effects (DataFrame)

5. API Reference

5.1 sfmodel_spec()

The sfmodel_spec() function creates a model specification object that encapsulates all data, distributional assumptions, and metadata required for estimation. It is independent of the choice of estimation method.

Syntax

sfmodel_spec() — define the model specification.

Required:

Optional:

Arguments

datatype (Symbol): Data type. :cross_sectional (default), :panel_TFE (Wang and Ho 2010 true fixed-effect), :panel_TFE_CSW (Chen, Schmidt, and Wang 2014, MLE only), or :panel_TRE (true random-effect, MLE only). See Section 10.

type (Symbol): Frontier type. Use :production or :prod for a production frontier (\(\varepsilon_i = v_i - u_i\)), and :cost for a cost frontier (\(\varepsilon_i = v_i + u_i\)).

depvar (Vector): Dependent variable. Cross-sectional: \(N\) observations. Panel: \(N \times T\) stacked by firm.

frontier (Matrix): Covariate matrix for the frontier equation, dimension \(N \times K\). Accepts a Matrix or a list form [v1, v2, ...] that is internally assembled into a matrix. Cross-sectional: include a column of ones for the intercept. Panel: do NOT include a constant column (within-demeaning eliminates it).

zvar (Matrix): Covariate matrix. Cross-sectional: for heteroscedasticity equations, dimension \(N \times L\); include a column of ones if an intercept is required. When hetero=:scaling, the zvar matrix supplies the \(\mathbf{z}_i\) variables for the scaling function \(h(\mathbf{z}_i)=\exp(\mathbf{z}_i'\boldsymbol{\delta})\); do NOT include a constant column (for identification). Panel: for the scaling function \(h(z)=\exp(z'\delta)\), dimension \(NT \times L\); do NOT include a constant column. Optional for both.

noise (Symbol): Distribution of the noise term \(v\). Cross-sectional: :Normal, :StudentT, :Laplace. Panel: :Normal only. See Section 6.

ineff (Symbol): Distribution of the inefficiency term \(u\). Supported options: :HalfNormal, :TruncatedNormal, :Exponential, :Weibull, :Lognormal, :Lomax, :Rayleigh, and :Gamma. See Section 6. Note: :Gamma is MCI only; method=:MLE supports only :HalfNormal, :TruncatedNormal, and :Exponential.

copula (Symbol): Cross-sectional only. Copula for dependence between \(v\) and \(u\). Options: :None (default), :Gaussian, :Clayton, :Clayton90, :Gumbel. Not available with panel datatypes.

hetero (Vector{Symbol} or Symbol): Cross-sectional only. Parameters of the distributional specification that are allowed to be heteroscedastic (e.g., [:mu, :sigma_sq]), or the symbol :scaling to activate the scaling property model. Not available with panel datatypes. See Section 7 and Section 9.5.

id (Vector): Panel only. Unit identifier column. Required for all panel datatypes. Data must be grouped by unit (contiguous rows for each firm). For balanced panels, create an id column, e.g., id = repeat(1:N, inner=T).

varnames (Vector{String}): Variable names used in output tables. If nothing (default), names are generated automatically.

eqnames (Vector{String}): Equation block names (e.g., ["frontier", "mu", "ln_sigma_u_sq"]). If nothing (default), names are generated from ineff.

eq_indices (Vector{Int}): Equation boundary indices. If nothing (default), auto-generated based on the model structure.

Return Value

Returns a UnifiedSpec{T} struct containing the model specifications internally. For cross-sectional models, it holds MCI, MSLE, and MLE backend specs (as applicable). For panel_TFE, it holds both Panel (MCI/MSLE) and MLE specs. For panel_TFE_CSW and panel_TRE, it holds MLE spec only.

Example: Basic Specification

Using the paddy-farmer data from Section 4.1:

using SFrontiers, Optim
using DataFrames, CSV

df = CSV.read("sampledata.csv", DataFrame)
df._cons = ones(nrow(df))  # add a column of _cons to the df DataFrame

# --- Keyword form (matrix data) ---

y = df.yvar
X = hcat(df._cons, df.Lland, df.PIland, df.Llabor, df.Lbull, df.Lcost, df.yr)
Z = hcat(df._cons, df.age, df.school, df.yr)

# Homoscedastic (no Z needed)
spec1 = sfmodel_spec(
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :TruncatedNormal,
)

# Heteroscedastic, also with custom variable names
spec2 = sfmodel_spec(
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu],  # or [:sigma_sq], [:mu, :sigma_sq]. See Section 7 for options by distribution
    varnames = ["constant", "land", "Iland", "labor", "bull", "cost", "year",  # frontier
                "constant", "age", "schooling", "year",                        # mu
                "constant",                                                    # ln_sigma_u_sq
                "constant"],                                                   # ln_sigma_v_sq
)

# --- DSL form (DataFrame). Instead of passing data vectors and matrices directly, users specify column names of the DataFrame in the macros (`@depvar`, `@frontier`, `@zvar`), and the data is extracted automatically. The macros can appear in any order.

# Heteroscedastic

spec3 = sfmodel_spec(
    @useData(df),             # the DataFrame
    @depvar(yvar),            # arguments are column names, not data
    @frontier(_cons, Lland, PIland, Llabor, Lbull, Lcost, yr),
    @zvar(_cons, age, school, yr),
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu],
)

Example: Scaling Property Model

The scaling property model uses hetero = :scaling (a Symbol, not a Vector{Symbol}). In this specification, zvar provides the environmental variables \(\mathbf{z}_i\) for the scaling function \(h(\mathbf{z}_i) = \exp(\mathbf{z}_i'\boldsymbol{\delta})\), and the inefficiency distribution parameters remain scalar (homoscedastic). The zvar matrix must not contain a constant column (for identification; see Section 9.5).

# Scaling property model (using keyword form as an example)
# Z_nocons must NOT contain a constant column

Z_nocons = hcat(df.age, df.school, df.yr)

spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,           # include constant as usual
    zvar = Z_nocons,        # environmental variables (no constant!)
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling,      # scaling property model
)

# Scaling + copula (allowed)
spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    zvar = Z_nocons,
    noise = :Normal,
    ineff = :Exponential,
    hetero = :scaling,
    copula = :Clayton,
)

Notes on scaling property model:

Example: Panel Data Specification

Assume a panel dataset has been loaded and the vectors/matrices \(y\), \(X\), \(Z\), and \(firm\_ids\) are constructed. The vector \(firm\_ids\) uniquely identifies each firm. We use the keyword form in the examples; the DSL form is also available.

# Panel TFE — Wang and Ho (2010) true fixed-effect model.
#   method=:MCI, :MSLE for all inefficiency distributions, :MLE only for half-normal/truncated-normal
spec = sfmodel_spec(
    type = :production,     # optional
    datatype = :panel_TFE,  # Wang and Ho 2010 panel model
    depvar = y,             # N*T stacked by firm
    frontier = X,           # NT x K (no constant)
    zvar = Z,               # NT x L (no constant)
    noise = :Normal,
    ineff = :TruncatedNormal,
    id = firm_ids,          # unit individual identifier (required for all panel models)
    hetero = :scaling       # optional; :scaling is the default and the only permissible option
)

# Panel TFE_CSW — Chen, Schmidt, and Wang (2014) fixed-effect model.
#   MLE only, half-normal only
spec = sfmodel_spec(
    datatype = :panel_TFE_CSW,
    depvar = y,
    frontier = X,
    noise = :Normal,
    id = firm_ids,
    ineff = :HalfNormal
)


# Panel TRE — True random-effect model
#   MLE only, half-normal or truncated-normal
spec = sfmodel_spec(
    datatype = :panel_TRE,
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    id = firm_ids,
    ineff = :HalfNormal
)

Notes:


5.2 sfmodel_method()

The sfmodel_method() function specifies the estimation method and its computational settings (number of draws, GPU usage, etc.). This is separate from sfmodel_spec() so that the same model specification can be estimated under different methods.

Syntax

sfmodel_method() — choose the estimation method and simulation settings.

Required:

Optional (MCI/MSLE only — ignored with a warning for MLE):

Arguments

method (Symbol): Estimation method. Use :MSLE for Maximum Simulated Likelihood Estimation, :MCI for Monte Carlo Integration, or :MLE for analytic Maximum Likelihood Estimation (available for a limited subset of models). Required. MLE does not use any of the simulation arguments below.

transformation (Symbol/Nothing): MCI only. Transformation rule for mapping uniform draws to inefficiency values. Options: :expo_rule, :logistic_1_rule, :logistic_2_rule, or nothing for distribution-specific defaults. Ignored with a warning if method=:MSLE.

draws (Matrix): Draws for Monte Carlo integration as a 1 x S row matrix. Use reshape(your_draws, 1, length(your_draws)) to convert a vector to a matrix. If nothing (default), Halton draws are auto-generated with the correct shape.

n_draws (Int): Number of Halton draws per observation. Default: 1024 for both MCI and MSLE. Used only when draws is not provided. When multiRand=true, must be \(\leq\) distinct_Halton_length.

multiRand (Bool): Whether each observation gets different Halton draws. true (default) generates an N x S wrapped Halton matrix where each observation uses different consecutive draws. false uses the original 1 x S shared draws. When true, n_draws must be \(\leq\) distinct_Halton_length.

GPU (Bool): Whether to use GPU computing. false (default) uses CPU; true uses GPU (requires using CUDA).

chunks (Int): Number of chunks for splitting data in MCI/MSLE estimation (default: 10). Effective for both CPU and GPU; especially useful for GPU memory management. Not used by MLE.

distinct_Halton_length (Int): Maximum length of the distinct Halton sequence generated for multiRand=true mode (default: 2^15-1 = 32767). Increase this if you need n_draws larger than the default limit. See Observation-Specific Halton Draws.

About GPU and chunks options

The chunks option is effective for simulation-based estimators (MSLE and MCI). It works for both CPU and GPU computation and is particularly essential for GPU computing (GPU=true). It is ineffective (not needed) for MLE.

Simulation-based estimation (MSLE and MCI) requires evaluating the likelihood contribution for every combination of N observations and S draws, forming an N x S matrix. When N or S is large, this matrix can exceed available memory (especially GPU VRAM), creating a bottleneck. The chunks option addresses this by splitting the N observations into smaller batches.

When chunks=1, all N observations are processed at once as a single N x S matrix. Setting chunks to a value greater than 1 (e.g., chunks=10) splits the observations into smaller batches, creating matrices of size (N/chunks) x S. Each batch is processed sequentially while accumulating the log-likelihood. This reduces peak memory usage, allowing larger datasets and n_draws to fit in GPU memory at the expense of slightly increased computation overhead due to the splitting and looping.

On Windows, users may use Task Manager to monitor memory usage and adjust chunks to avoid bottlenecks.
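The chunking logic can be sketched as follows. This is an illustrative sketch only, not the package's internal code; contrib(i, s) is a hypothetical stand-in for the likelihood contribution of observation i under draw s:

```julia
using Statistics

# Sketch of chunked simulated log-likelihood accumulation:
# instead of one N x S matrix, build (N/chunks) x S matrices in sequence.
function chunked_loglik(contrib::Function, N::Int, S::Int; chunks::Int = 10)
    ll = 0.0
    bounds = round.(Int, range(0, N; length = chunks + 1))
    for c in 1:chunks
        idx = (bounds[c] + 1):bounds[c + 1]         # observations in this chunk
        isempty(idx) && continue
        M = [contrib(i, s) for i in idx, s in 1:S]  # only (N/chunks) x S in memory
        ll += sum(log.(mean(M; dims = 2)))          # average over draws, sum over obs
    end
    return ll
end

# The result is identical regardless of chunks; only peak memory changes.
chunked_loglik((i, s) -> 1.0, 100, 8; chunks = 10)   # = 0.0, since log(1) = 0
```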

Transformation Rules (MCI Only)

When method=:MCI, the transformation option controls the change-of-variable mapping from uniform draws \(t \in (0,1)\) to inefficiency values \(u \geq 0\). If nothing, a distribution-specific default is used.

:expo_rule: \(u = s \cdot (-\ln(1-t))\), Jacobian \(s/(1-t)\). Default for Exponential, Weibull, Gamma, Rayleigh.
:logistic_1_rule: \(u = s \cdot t/(1-t)\), Jacobian \(s/(1-t)^2\). Default for half-normal, truncated-normal, Lognormal, Lomax.
:logistic_2_rule: \(u = s \cdot (t/(1-t))^2\), Jacobian \(2st/(1-t)^3\).

Here \(s\) is a scale parameter derived from the inefficiency distribution, determined automatically by the package. When heteroscedasticity is specified, \(s\) becomes observation-specific (\(s_i\)). The scale parameters are:

Distribution Scale \(s\) Meaning
half-normal \(\sigma\) Standard deviation of \(N^+(0, \sigma^2)\)
truncated-normal \(\sigma_u\) Standard deviation of \(N^+(\mu, \sigma_u^2)\)
Exponential \(\sqrt{\lambda}\) Standard deviation of \(u\), since \(\lambda = \text{Var}(u)\)
Weibull \(\lambda\) Scale parameter of \(\text{Weibull}(\lambda, k)\)
Lognormal \(\sigma\) Log-scale standard deviation of \(\text{LogNormal}(\mu, \sigma^2)\)
Lomax \(\lambda\) Scale parameter of \(\text{Lomax}(\alpha, \lambda)\)
Rayleigh \(\sigma\) Scale parameter of \(\text{Rayleigh}(\sigma)\)
Gamma \(\theta\) Scale parameter of \(\text{Gamma}(k, \theta)\)
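As a concrete illustration, the :expo_rule can be written out explicitly. The sketch below is illustrative (not package internals): it applies the mapping and its Jacobian to approximate \(\int_0^\infty f(u)\,du\) from uniform draws:

```julia
# Sketch of the :expo_rule change of variables used by MCI.
# A uniform draw t in (0,1) maps to u = s * (-log(1 - t)) >= 0
# (the Exponential quantile transform), with Jacobian du/dt = s / (1 - t).

expo_u(t, s)        = s * (-log(1 - t))
expo_jacobian(t, s) = s / (1 - t)

# MCI approximates the integral over u as an average over uniform draws:
#   integral of f(u) du  ~  (1/S) * sum_k f(u(t_k)) * |du/dt|(t_k)
mci_integral(f, ts, s) = sum(f(expo_u(t, s)) * expo_jacobian(t, s) for t in ts) / length(ts)
```

With an Exponential density f(u) = exp(-u/s)/s, the integrand becomes identically 1 after the change of variables, so the approximation is exact for any set of draws; this is why matching the rule to the distribution's shape matters.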

Return Value

Returns a method specification struct that encodes both the estimation method and computational settings.

Example

# MSLE with default settings
meth1 = sfmodel_method(method = :MSLE)

# MCI with custom draws and GPU
meth2 = sfmodel_method(
    method = :MCI,
    n_draws = 2^12,
    GPU = true,
    chunks = 10,
    transformation = :logistic_1_rule
)

# Larger Halton pool for multiRand mode
meth3 = sfmodel_method(
    method = :MSLE,
    n_draws = 50000,
    distinct_Halton_length = 2^16 - 1  # 65535
)

# MLE — analytic, no simulation parameters needed (limited model support)
meth0 = sfmodel_method(method = :MLE)

# GPU with more chunks for memory management
meth4 = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12,
    GPU = true,
    chunks = 20
)

# User-supplied draws (uniform on (0,1), reshaped to 1×S row matrix)
my_draws = reshape(rand(2^12), 1, :)
meth5 = sfmodel_method(
    method = :MSLE,
    draws = my_draws,  # identical for each obs
    GPU = true,
)

Note: If simulation arguments (transformation, draws, n_draws, GPU) are passed with method=:MLE, a warning is issued and they are ignored.


5.3 sfmodel_init()

The sfmodel_init() function creates an initial value vector for optimization. The initial-value vector depends only on the model specification (distribution choice), not on the estimation method. It supports two usage modes: full vector mode and component mode.

Syntax

sfmodel_init() — set initial values for optimization.

Required:

  spec: the model specification returned by sfmodel_spec().

Optional (full vector mode):

  init: the complete initial-value vector (overrides all component arguments).

Optional (component mode):

  frontier, scaling, mu, ln_sigma_sq, ln_sigma_v_sq, ln_nu_minus_2, ln_b, ln_lambda, ln_k, ln_alpha, ln_theta, theta_rho: per-block initial values (see the Arguments table below).

Arguments

Argument Type Description
spec UnifiedSpec Model specification returned by sfmodel_spec(). Required.
init Vector/Tuple Complete initial-value vector (or tuple). If init is provided, all other component initial-value arguments are ignored.
frontier Vector/Tuple Initial values for the frontier coefficients (\(K\) elements).
scaling Vector/Tuple Initial values for the scaling function coefficients \(\boldsymbol{\delta}\) (\(L\) elements, one per column of zvar). Used when hetero = :scaling.
mu Vector/Tuple Initial values for \(\mu\) (used by TruncatedNormal and Lognormal inefficiency specifications).
ln_sigma_sq Vector/Tuple Initial values for \(\ln(\sigma^2)\) (used by TruncatedNormal, HalfNormal, Lognormal, and Rayleigh inefficiency specifications).
ln_sigma_v_sq Vector/Tuple Initial values for \(\ln(\sigma_v^2)\) (used when the noise distribution is Normal or StudentT).
ln_nu_minus_2 Vector/Tuple Initial values for \(\ln(\nu-2)\) (used when the noise distribution is StudentT).
ln_b Vector/Tuple Initial values for \(\ln(b)\) (used when the noise distribution is Laplace).
ln_lambda Vector/Tuple Initial values for \(\ln(\lambda)\) (used by Exponential, Weibull, and Lomax inefficiency specifications).
ln_k Vector/Tuple Initial values for \(\ln(k)\) (used by Weibull and Gamma inefficiency specifications).
ln_alpha Vector/Tuple Initial values for \(\ln(\alpha)\) (used by Lomax inefficiency specifications).
ln_theta Vector/Tuple Initial values for \(\ln(\theta)\) (used by Gamma inefficiency specifications; MCI only).
theta_rho Vector/Tuple Initial value for the copula parameter \(\theta_\rho\) (used when copula \(\neq\) :None).
message Bool If true, print a warning when init overrides the component initial-value arguments.

Return Value

Returns a Vector{Float64} containing the initial parameter values in the correct order for optimization.

Mode 1: Full Vector Mode

Provide the complete parameter vector directly:

myinit = sfmodel_init(
    spec = myspec,
    init = [0.5, 0.3, 0.2, 0.1, 0.1, 0.1, 0.0, 0.0]
)

Mode 2: Component Mode

Specify initial values by parameter group. The required groups depend on the model specification. The ordering of equation blocks is irrelevant; the program maps each block to the correct position in the coefficient vector automatically.

# Normal + truncated-normal with hetero = [:mu]
myinit = sfmodel_init(
    spec = myspec,
    frontier = [0.5, 0.3, 0.2],     # frontier coefficients
    mu = [0.1, 0.1, 0.1],           # mu of truncated normal
    ln_sigma_sq = [0.0],            # log(sigma_u^2) of truncated normal
    ln_sigma_v_sq = [0.0]           # log(sigma_v^2) of normal
)

# alternatively,
myinit2 = sfmodel_init(
    spec = myspec,
    frontier = X \ y,               # OLS coefficients
    mu = 0.1*ones(size(Z, 2)),      # mu of truncated normal
    ln_sigma_sq = [0.0],            # log(sigma_u^2) of truncated normal
    ln_sigma_v_sq = (0.0,)          # log(sigma_v^2) of normal (tuple form)
)

Scaling Property Initial Values

When hetero = :scaling, provide the scaling keyword with initial values for the \(\boldsymbol{\delta}\) coefficients (one per column of zvar). All inefficiency distribution parameters are scalar:

# Normal + half-normal with scaling property
myinit = sfmodel_init(
    spec = myspec,
    frontier = X \ y,                   # OLS estimates
    scaling = zeros(size(Z_nocons, 2)), # δ coefficients (one per z-variable)
    ln_sigma_sq = [0.0],               # scalar base parameter
    ln_sigma_v_sq = [0.0]              # noise variance
)

Panel Initial Values

When spec.datatype == :panel_TFE, the function dispatches to the Panel backend. When spec.datatype is :panel_TFE_CSW or :panel_TRE, it dispatches to the MLE backend. The keyword arguments differ from the cross-sectional case:

Argument Description
frontier Initial values for frontier coefficients (\(K\) elements)
delta Initial values for scaling function\(h(z)\) coefficients (\(L\) elements)
mu truncated-normal, Lognormal: \(\mu\) (scalar)
ln_sigma_u_sq half-normal, truncated-normal: \(\ln(\sigma_u^2)\) (scalar)
ln_sigma_v_sq All distributions: \(\ln(\sigma_v^2)\) (scalar)
ln_lambda Exponential, Weibull: \(\ln(\lambda)\) (scalar)
ln_k Weibull, Gamma: \(\ln(k)\) (scalar)
ln_sigma_sq Lognormal, Rayleigh: \(\ln(\sigma^2)\) (scalar)
ln_lambda Lomax: \(\ln(\lambda)\) (scalar)
ln_alpha Lomax: \(\ln(\alpha)\) (scalar)
ln_theta Gamma: \(\ln(\theta)\) (scalar)
# Panel TFE: Normal + half-normal
myinit = sfmodel_init(
    spec = myspec,                # spec with datatype=:panel_TFE
    frontier = X_tilde \ y_tilde, # OLS on demeaned data (auto-computed if omitted)
    delta = [0.1, 0.1],          # scaling function coefficients
    ln_sigma_u_sq = 0.1,         # scalar
    ln_sigma_v_sq = 0.1          # scalar
)

Note: In panel models, all inefficiency distribution parameters are scalar (not heteroscedastic). If frontier and delta are omitted, OLS-based defaults are used automatically.

Panel TFE + MLE: When datatype=:panel_TFE and method=:MLE, sfmodel_fit automatically uses MLE’s own OLS-based defaults for initial values (the Panel-backend init is not passed to MLE). You may still call sfmodel_init for MCI/MSLE estimation of the same spec.

Input Format Flexibility

All parameter arguments accept vectors, row vectors, or tuples:

# All equivalent
frontier = [0.5, 0.3, 0.2]    # Vector
frontier = [0.5 0.3 0.2]      # Row vector (1x3 matrix)
frontier = (0.5, 0.3, 0.2)    # Tuple

5.4 sfmodel_opt()

The sfmodel_opt() function specifies the optimization options. Solvers and options are passed directly to Julia’s Optim.jl interface, so any solver (e.g., NelderMead(), BFGS(), Newton()) and any Optim.Options keyword (e.g., iterations, g_abstol, show_trace) are permissible. All three estimation methods (MCI, MSLE, and MLE) use the same optimizer interface, so no method argument is needed. The function supports a two-stage approach: a derivative-free solver in the first stage (warmstart) to refine the initial values (which is often very useful for highly nonlinear models), followed by a gradient-based solver in the second stage (main) for accurate convergence.

Syntax

sfmodel_opt() — configure optimization solvers and options.

Required:

  main_solver and main_opt: the main-stage solver and its options.

Optional:

  warmstart_solver and warmstart_opt: the warmstart-stage solver and its options; if warmstart_solver is omitted, the warmstart stage is skipped.

Arguments

Argument Description Required
warmstart_solver Warmstart optimizer, e.g., NelderMead(), BFGS() No
warmstart_opt Warmstart options as a NamedTuple, e.g., (iterations = 400, g_abstol = 1e-5) No
main_solver Main optimizer, e.g., Newton(), BFGS() Yes
main_opt Main options as a NamedTuple, e.g., (iterations = 2000, g_abstol = 1e-8) Yes

Common options for warmstart_opt and main_opt:

Parameter Description Typical Value
iterations Maximum iterations 200–2000
g_abstol, g_reltol Gradient absolute and relative tolerance 1e-5 to 1e-8
f_abstol, f_reltol Function absolute and relative tolerance 1e-32
x_abstol, x_reltol Parameter absolute and relative tolerance 1e-32
show_trace Display iteration progress false

Notes:

Return Value

Returns an optimization specification struct containing the solver and option specifications.

Example: Two-Stage Optimization

myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt = (iterations = 100, g_reltol = 1e-4),
    main_solver = Newton(),
    main_opt = (iterations = 200, g_reltol = 1e-8)
)

Example: Skip Warmstart

If the warmstart solver is not provided, the warmstart stage is skipped:

myopt = sfmodel_opt(
    main_solver = Newton(),
    main_opt = (iterations = 200, g_abstol = 1e-8)
)

Notes

Common Pitfall: Trailing Comma for Single-Element Options

When specifying options with only one element, you must include a trailing comma to create a NamedTuple:

# CORRECT - trailing comma creates a NamedTuple
main_opt = (iterations = 200,)

# WRONG - without comma, this is just the integer 200, not a NamedTuple
main_opt = (iterations = 200)

For two or more elements, the trailing comma is not needed:

# Both work fine
main_opt = (iterations = 200, g_abstol = 1e-8)
main_opt = (iterations = 200, g_abstol = 1e-8,)  # trailing comma optional

If you forget the trailing comma, the function will display a helpful error message:

ERROR: Invalid `main_opt`: expected a NamedTuple, got Int64.
Hint: For single-element options, use a trailing comma:
`main_opt = (iterations = 200,)` not `main_opt = (iterations = 200)`.

5.5 sfmodel_fit()

The sfmodel_fit() function is the main estimation routine. It organizes the entire workflow: optimization, variance-covariance computation, efficiency index calculation, marginal effects, and results presentation.

Syntax

sfmodel_fit() — run estimation and produce results.

Required:

  spec and method: the model and method specifications.

Optional:

  init, optim_options, jlms_bc_index, marginal, show_table, verbose (see the Arguments table below).

Arguments

Argument Type Description
spec UnifiedSpec Model specification from sfmodel_spec() (required).
method UnifiedMethod Method specification from sfmodel_method() (required).
init Vector Initial parameter vector from sfmodel_init(). If nothing, uses OLS for frontier and 0.1 for other parameters.
optim_options Optimization options from sfmodel_opt(). If nothing, uses defaults.
jlms_bc_index Bool Compute JLMS and BC efficiency indices (default: true).
marginal Bool Compute marginal effects of \(Z\) on E(u) (default: true).
show_table Bool Print formatted estimation table (default: true).
verbose Bool Print detailed progress information (default: true).

Return Value

Returns a NamedTuple with comprehensive results:

Convergence Information

Field Type Description
converged Bool Whether optimization converged
iter_limit_reached Bool Whether iteration limit was reached
redflag Int Warning flag: 0 = OK, 1 = potential issues

Method Information

Field Type Description
GPU Bool Whether GPU acceleration was used
n_draws Int Actual number of draws per observation (or per firm for panel)
multiRand Bool Whether per-observation/per-firm Halton draws were used
chunks Int Number of chunks for memory management
distinct_Halton_length Int Maximum Halton sequence length for multiRand
estimation_method Symbol Estimation method used (:MCI, :MSLE, :MLE)

Model Results

Field Type Description
n_observations Int Number of observations (N)
loglikelihood Float64 Maximized log-likelihood value
coeff Vector Estimated coefficient vector
std_err Vector Standard errors
var_cov_mat Matrix Variance-covariance matrix
table Matrix Formatted coefficient table

Efficiency Indices

Field Type Description
jlms Vector JLMS inefficiency index \(\text{E}(u \mid \varepsilon)\) for each observation
bc Vector Battese-Coelli efficiency index \(\text{E}(e^{-u} \mid \varepsilon)\) for each observation

Marginal Effects

Field Type Description
marginal DataFrame Observation-level marginal effects on E(u)
marginal_mean NamedTuple Mean marginal effects

For the scaling property model (hetero = :scaling), marginal effects are computed via \(\partial E(u_i) / \partial z_{ij} = \delta_j \cdot h(\mathbf{z}_i) \cdot E(u_i^*)\), using automatic differentiation. The coefficient \(\delta_j\) directly gives the semi-elasticity \(\partial \ln E(u_i) / \partial z_{ij} = \delta_j\). See Section 9.5.
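For the scaling case this calculation is simple enough to verify by hand; the sketch below uses hypothetical values and is not package code:

```julia
using LinearAlgebra

# Marginal effect of z_j on E(u) under the scaling property:
#   dE(u_i)/dz_ij = delta_j * h(z_i) * E(u*),  with h(z) = exp(z'delta).
marg_eff(j, z_i, delta, Eu_star) = delta[j] * exp(dot(z_i, delta)) * Eu_star
```

Dividing the marginal effect by E(u_i) = h(z_i) E(u*) recovers the semi-elasticity delta_j, matching the formula above.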

OLS Diagnostics

Field Type Description
OLS_loglikelihood Float64 Log-likelihood from OLS frontier
OLS_resid_skew Float64 Skewness of OLS residuals

Technical Details

Field Type Description
model The internal model specification
Hessian Matrix Numerical Hessian at optimum
gradient_norm Float64 Gradient norm at convergence
actual_iterations Int Total iterations across all stages
warmstart_solver Solver Warmstart algorithm used
warmstart_ini Vector Initial values for warmstart
warmstart_maxIT Int Maximum warmstart iterations
main_solver Solver Main algorithm used
main_ini Vector Initial values for main stage
main_maxIT Int Maximum main iterations
main_tolerance Float64 Convergence tolerance

Parameter Subsets

Individual coefficient vectors are also available as fields named after the parameter groups, e.g., result.frontier, result.mu, result.ln_sigma_sq, and result.ln_sigma_v_sq, depending on the model.

Dictionary Access

All results are also available through result.list, an OrderedDict:

keys(result.list)    # View all available keys
result.list[:coeff]  # Access specific result
result.coeff         # alternative 

Example: Full Estimation

result = sfmodel_fit(
    spec = myspec,
    method = mymeth,
    init = myinit,
    optim_options = myopt,
    jlms_bc_index = true,
    marginal = true,
    show_table = true,
    verbose = true
)

# Access results
println("Converged: ", result.converged)
println("Log-likelihood: ", result.loglikelihood)
println("Frontier coefficients: ", result.frontier)
println("Mean efficiency: ", mean(result.bc))
println("Marginal effects: ", result.marginal_mean)

Example: Minimal Call with Defaults

# Uses OLS-based initial values and default optimization
result = sfmodel_fit(spec = myspec, method = mymeth)

Output Display

When show_table = true, the function prints:

  1. Model specification summary (distributions, sample size, etc.)
  2. Warmstart progress (if enabled)
  3. Main optimization progress
  4. Formatted coefficient table with standard errors, z-statistics, p-values, and confidence intervals
  5. Auxiliary table converting log-transformed parameters to original scale
  6. Additional statistics (OLS log-likelihood, skewness, mean efficiency indices, marginal effects)

Convergence Warnings

The function monitors convergence and sets redflag = 1 if the gradient norm at the optimum is large, the iteration limit is reached, or the Hessian has non-positive diagonal elements.

5.6 sfmodel_MixTable() and sfmodel_ChiSquareTable()

These utility functions print critical values for hypothesis testing. They are available from the MLE backend.

sfmodel_MixTable(dof)

Prints the critical values of the mixed chi-squared distribution \(\bar{\chi}^2\) for a given number of degrees of freedom (1 to 40). The mixed chi-squared distribution is a 50:50 mixture of \(\chi^2(p-1)\) and \(\chi^2(p)\), where \(p\) is the number of restrictions.

When to use: The mixed chi-squared test is required for likelihood ratio (LR) tests where the null hypothesis places a parameter on the boundary of the parameter space. The classic example in stochastic frontier analysis is testing for the absence of inefficiency:

\[ H_0: \sigma_u^2 = 0 \quad \text{vs.} \quad H_1: \sigma_u^2 > 0. \]

Because \(\sigma_u^2 \ge 0\), the null value is on the boundary, and the standard \(\chi^2\) distribution does not apply. The LR statistic \(-2(\ln L_R - \ln L_U)\) should be compared against the mixed \(\bar{\chi}^2\) critical values instead.
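The mixed chi-squared p-value can also be computed directly. The sketch below assumes the Distributions.jl package is available; `mixed_chisq_pvalue` is an illustrative helper, not a package function:

```julia
using Distributions

# p-value of the LR statistic under the 50:50 mixture
#   0.5 * chisq(p-1) + 0.5 * chisq(p).
# For p = 1, chisq(0) is a point mass at zero and contributes
# nothing to the upper tail when LR > 0.
function mixed_chisq_pvalue(LR, p)
    tail_lower = p == 1 ? 0.0 : ccdf(Chisq(p - 1), LR)
    tail_upper = ccdf(Chisq(p), LR)
    return 0.5 * tail_lower + 0.5 * tail_upper
end
```

Evaluating this at the tabulated 5% critical values (e.g., 2.706 for one restriction) returns a p-value of approximately 0.05, consistent with the Kodde and Palm table.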

Syntax:

sfmodel_MixTable()       # Print the full table (dof 1 to 40)
sfmodel_MixTable(3)      # Print critical values for dof = 3

Return: A row (or matrix) of critical values at significance levels 0.10, 0.05, 0.025, and 0.01.

Source: Table 1, Kodde and Palm (1986, Econometrica).

sfmodel_ChiSquareTable(dof)

Prints the critical values of the standard chi-squared distribution \(\chi^2\) for a given number of degrees of freedom.

When to use: For standard LR tests where the null hypothesis does not place parameters on the boundary. For example, testing a subset of frontier coefficients:

\[ H_0: \beta_2 = \beta_3 = 0 \quad \text{(interior restriction)}. \]

Syntax:

sfmodel_ChiSquareTable(2)   # Critical values for dof = 2

Return: A row of critical values at significance levels 0.10, 0.05, 0.025, and 0.01.

Example: Testing for Inefficiency

# Estimate unrestricted model (stochastic frontier)
result_sf = sfmodel_fit(spec = myspec, method = mymeth)

# Estimate restricted model (OLS, no inefficiency)
# The OLS log-likelihood is available from the SF estimation output:
ll_ols = result_sf.OLS_loglikelihood
ll_sf  = result_sf.loglikelihood

# Compute LR statistic
LR = -2 * (ll_ols - ll_sf)

# Compare against mixed chi-squared critical values
# dof = 1 (one restriction: sigma_u^2 = 0)
sfmodel_MixTable(1)
# At 5% level, critical value is 2.705
# If LR > 2.705, reject H0 (inefficiency is statistically significant)

6. Supported Models

Noise Distributions

Symbol Distribution Init Parameters Models Methods
:Normal \(v \sim N(0, \sigma_v^2)\) ln_sigma_v_sq \(= \log(\sigma_v^2)\) cross, panel MCI, MSLE, MLE
:StudentT \(v \sim t(0, \sigma_v, \nu)\), \(\nu > 2\) ln_sigma_v_sq \(= \log(\sigma_v^2)\), ln_nu_minus_2 \(= \log(\nu - 2)\) cross MCI, MSLE
:Laplace \(v \sim \text{Laplace}(0, b)\) ln_b \(= \log(b)\) cross MCI, MSLE

Inefficiency Distributions

Symbol Distribution Init Parameters Models Methods
:HalfNormal \(u \sim N^+(0, \sigma^2)\) ln_sigma_sq \(= \log(\sigma^2)\) cross, panel MCI, MSLE, MLE
:TruncatedNormal \(u \sim N^+(\mu, \sigma_u^2)\) mu \(= \mu\), ln_sigma_sq \(= \log(\sigma_u^2)\) cross, panel MCI, MSLE, MLE
:Exponential \(u \sim \text{Exp}(\lambda)\), \(\lambda = \text{Var}(u)\) ln_lambda \(= \log(\lambda)\) cross, panel MCI, MSLE, MLE
:Weibull \(u \sim \text{Weibull}(\lambda, k)\) ln_lambda \(= \log(\lambda)\), ln_k \(= \log(k)\) cross, panel MCI, MSLE
:Lognormal \(u \sim \text{LogNormal}(\mu, \sigma^2)\) mu \(= \mu\), ln_sigma_sq \(= \log(\sigma^2)\) cross, panel MCI, MSLE
:Lomax \(u \sim \text{Lomax}(\alpha, \lambda)\) ln_lambda \(= \log(\lambda)\), ln_alpha \(= \log(\alpha)\) cross, panel MCI, MSLE
:Rayleigh \(u \sim \text{Rayleigh}(\sigma)\) ln_sigma_sq \(= \log(\sigma^2)\) cross, panel MCI, MSLE
:Gamma \(u \sim \text{Gamma}(k, \theta)\) ln_k \(= \log(k)\) (shape), ln_theta \(= \log(\theta)\) (scale) cross, panel MCI

Copula Models

Cross-sectional models only. A copula models the dependence between the noise term \(v\) and the inefficiency term \(u\). When copula=:None (default), \(v\) and \(u\) are assumed independent. With a copula, the joint density becomes \(f(v,u) = f_v(v) \cdot f_u(u) \cdot c(F_v(v), F_u(u); \rho)\), where \(c\) is the copula density and \(\rho\) is the dependence parameter.
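As an illustration of the joint-density formula, the Gaussian copula case can be written out as follows; this is a sketch using Distributions.jl, not the package's internal implementation:

```julia
using Distributions

# Sketch of the copula-based joint density
#   f(v, u) = f_v(v) * f_u(u) * c(F_v(v), F_u(u); rho)
# for the Gaussian copula (illustrative only).

function gaussian_copula_density(w1, w2, rho)
    x1, x2 = quantile(Normal(), w1), quantile(Normal(), w2)
    expo = -(rho^2 * (x1^2 + x2^2) - 2 * rho * x1 * x2) / (2 * (1 - rho^2))
    return exp(expo) / sqrt(1 - rho^2)
end

# Joint density of (v, u) under Normal noise and half-normal inefficiency
function joint_density(v, u, sigma_v, sigma_u, rho)
    dv = Normal(0, sigma_v)
    du = truncated(Normal(0, sigma_u), 0, Inf)   # half-normal
    return pdf(dv, v) * pdf(du, u) * gaussian_copula_density(cdf(dv, v), cdf(du, u), rho)
end
```

At rho = 0 the copula density is identically 1, so the joint density collapses to the independent case f_v(v) f_u(u).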

Symbol Copula Domain Kendall’s \(\tau\) Tail Dependence Init Parameter
:Gaussian Gaussian \(\rho \in (-1,1)\) \((2/\pi)\arcsin(\rho)\) None theta_rho \(= \text{atanh}(\rho/0.999)\)
:Clayton Clayton \(\rho > 0\) \(\rho/(2+\rho)\) Lower: \(2^{-1/\rho}\) theta_rho \(= \log(\rho - 10^{-6})\)
:Clayton90 Clayton 90° rotation \(\rho > 0\) \(-\rho/(2+\rho)\) Upper-lower: \(2^{-1/\rho}\) theta_rho \(= \log(\rho - 10^{-6})\)
:Gumbel Gumbel \(\rho \geq 1\) \(1 - 1/\rho\) Upper: \(2 - 2^{1/\rho}\) theta_rho \(= \log(\rho - 1)\)

Notes:


7. Distributions Reference

Noise Distributions

Distribution Specification Required Init Parameters
Normal \(v \sim N(0, \sigma_v^2)\) ln_sigma_v_sq = \(\log(\sigma_v^2)\)
StudentT \(v \sim t(0, \sigma_v, \nu)\) with \(\nu > 2\) ln_sigma_v_sq = \(\log(\sigma_v^2)\), ln_nu_minus_2 = \(\log(\nu-2)\)
Laplace \(v \sim \text{Laplace}(0, b)\) ln_b = \(\log(b)\)

Inefficiency Distributions

Distribution Specification Required Init Parameters Hetero Options
truncated-normal \(u \sim N^+(\mu, \sigma_u^2)\) mu = \(\mu\), ln_sigma_sq = \(\log(\sigma_u^2)\) :mu for \(\mu\), :sigma_sq for \(\sigma_u^2\)
half-normal \(u \sim N^+(0, \sigma^2)\) ln_sigma_sq = \(\log(\sigma^2)\) :sigma_sq for \(\sigma^2\)
Exponential \(u \sim \text{Exp}(\lambda)\), \(\lambda = \text{Var}(u)\) ln_lambda = \(\log(\lambda)\) :lambda for \(\lambda\)
Weibull \(u \sim \text{Weibull}(\lambda, k)\) ln_lambda = \(\log(\lambda)\), ln_k = \(\log(k)\) :lambda for \(\lambda\), :k for \(k\)
Lognormal \(u \sim \text{LogNormal}(\mu, \sigma^2)\) mu = \(\mu\), ln_sigma_sq = \(\log(\sigma^2)\) :mu for \(\mu\), :sigma_sq for \(\sigma^2\)
Lomax \(u \sim \text{Lomax}(\alpha, \lambda)\) ln_lambda = \(\log(\lambda)\), ln_alpha = \(\log(\alpha)\) :lambda for \(\lambda\), :alpha for \(\alpha\)
Rayleigh \(u \sim \text{Rayleigh}(\sigma)\) ln_sigma_sq = \(\log(\sigma^2)\) :sigma_sq for \(\sigma^2\)
Gamma \(u \sim \text{Gamma}(k, \theta)\) (MCI only) ln_k = \(\log(k)\), ln_theta = \(\log(\theta)\) :k for \(k\), :theta for \(\theta\)

Notes:

Parameterization

When a parameter is modeled as heteroscedastic (observation-specific), a link function keeps it in the correct domain. Positive parameters use the log link: for example, with hetero = [:sigma_sq] the model sets \(\ln(\sigma_{u,i}^2) = \mathbf{z}_i'\boldsymbol{\gamma}\), so that \(\sigma_{u,i}^2 = \exp(\mathbf{z}_i'\boldsymbol{\gamma}) > 0\) for every observation. This is why the init keywords carry the ln_ prefix.
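A minimal sketch of the log link implied by the ln_-prefixed keywords (illustrative, not package internals):

```julia
using LinearAlgebra

# Log link for a heteroscedastic positive parameter.
# With hetero = [:sigma_sq], the model sets ln(sigma_u^2)_i = z_i' * gamma,
# so sigma_u^2_i = exp(z_i' * gamma) is positive automatically.
sigma_u_sq(z_i, gamma) = exp(dot(z_i, gamma))
```

In the homoscedastic case, z_i reduces to a constant term and the parameter collapses to a single scalar exp(gamma_1).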

Parameter Vector Length

The total number of parameters depends on heteroscedasticity settings:

# Homoscedastic: scalar parameters
hetero = Symbol[]
# -> Each inefficiency parameter contributes 1 to parameter count

# Heteroscedastic mu: L parameters
hetero = [:mu]
# -> mu contributes L parameters (one for each Z column)

# Fully heteroscedastic truncated-normal
hetero = [:mu, :sigma_sq]
# -> mu contributes L, sigma_u^2 contributes L

# Scaling property model
hetero = :scaling
# -> Adds L_scaling delta coefficients (one per column of zvar)
# -> All inefficiency parameters are scalar (1 each)

Example: Heteroscedastic Model

# Z has 4 columns (including constant)
# hetero = [:mu, :sigma_sq] means both mu and sigma_u^2 depend on Z

spec = sfmodel_spec(
    depvar = y,
    frontier = X,          # K = 3 variables
    zvar = Z,              # L = 4 variables
    noise = :Normal,
    ineff = :TruncatedNormal,
    hetero = [:mu, :sigma_sq]
)

# Parameter vector structure:
# [beta_1, beta_2, beta_3,           <- frontier (K = 3)
#  delta_1, delta_2, delta_3, delta_4,     <- mu (L = 4, heteroscedastic)
#  gamma_1, gamma_2, gamma_3, gamma_4,     <- ln_sigma_u^2 (L = 4, heteroscedastic)
#  ln_sigma_v^2]                           <- noise variance (1)
# Total: 3 + 4 + 4 + 1 = 12 parameters

Example: Scaling Property Model

# Z_nocons has 2 columns (no constant!)
# hetero = :scaling activates the scaling property model

spec = sfmodel_spec(
    depvar = y,
    frontier = X,              # K = 3 variables
    zvar = Z_nocons,           # L = 2 variables (no constant)
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling
)

# Parameter vector structure:
# [beta_1, beta_2, beta_3,    <- frontier (K = 3)
#  delta_1, delta_2,           <- scaling function (L_scaling = 2)
#  ln_sigma^2,                 <- inefficiency (scalar, 1)
#  ln_sigma_v^2]               <- noise variance (1)
# Total: 3 + 2 + 1 + 1 = 7 parameters

7.1 Distribution Selection Guidance

Inefficiency Distribution

The choice of inefficiency distribution affects the shape of the estimated inefficiency, particularly its mode and tail behavior. The following table summarizes the key properties:

Distribution Parameters Mode at Zero? Tail Recommended Use
half-normal 1 (\(\sigma\)) Yes Light Default starting point; most common in the literature
Exponential 1 (\(\lambda\)) Yes Light Simple alternative to half-normal; monotone decreasing density
truncated-normal 2 (\(\mu\), \(\sigma\)) Not necessarily Light When the mode of inefficiency may be at a positive value
Rayleigh 1 (\(\sigma\)) No (mode > 0) Light One-parameter distribution with mode away from zero
Weibull 2 (\(\lambda\), \(k\)) Flexible Moderate Flexible shape: mode at zero (\(k \le 1\)) or positive (\(k > 1\))
Lognormal 2 (\(\mu\), \(\sigma\)) No (mode > 0) Heavy Right-skewed inefficiency with a heavy right tail
Lomax 2 (\(\alpha\), \(\lambda\)) Yes Heavy Heavy-tailed; useful when a few firms are highly inefficient
Gamma 2 (\(k\), \(\theta\)) Flexible Moderate Maximum flexibility; requires MCI method

Practical workflow:

  1. Start simple. Begin with half-normal or Exponential. These one-parameter distributions are easy to estimate (MLE available) and serve as a baseline.
  2. Allow non-zero mode. If theory suggests that most firms have some positive level of inefficiency (rather than clustering near zero), try truncated-normal or Rayleigh.
  3. Add flexibility. If one- or two-parameter distributions seem restrictive, try Weibull, Lognormal, or Gamma. These can capture a wider variety of shapes but require simulation-based estimation (MCI or MSLE).
  4. Compare models. Compare log-likelihood values across specifications: a higher log-likelihood indicates a better in-sample fit. For nested models, use a likelihood ratio test (see Section 5.6 for critical values).

Noise Distribution

Distribution When to Use
Normal Default and standard assumption. Use unless there is specific evidence otherwise.
Student-t When residuals exhibit excess kurtosis (heavier tails than Normal). The degrees-of-freedom parameter \(\nu\) is estimated.
Laplace When the noise distribution is believed to be double-exponential (sharper peak, heavier tails than Normal).

Note: Student-t and Laplace noise require simulation-based estimation (MCI or MSLE) and are available for cross-sectional models only.


8. Working with Results

Accessing Efficiency Indices

result = sfmodel_fit(spec = myspec, method = mymeth, jlms_bc_index = true)

# JLMS inefficiency index: E(u|epsilon)
jlms = result.jlms
mean_inefficiency = mean(jlms)

# Battese-Coelli efficiency: E(exp(-u)|epsilon)
bc = result.bc
mean_efficiency = mean(bc)

# Efficiency scores are observation-specific
println("Firm 1 efficiency: ", bc[1])
println("Firm 1 inefficiency: ", jlms[1])

Working with Marginal Effects

Marginal effects measure how changes in Z variables affect expected inefficiency E(u):

result = sfmodel_fit(spec = myspec, method = mymeth, marginal = true)

# Mean marginal effects
println(result.marginal_mean)
# Output: (age = 0.023, school = -0.015, ...)

# Observation-level marginal effects (DataFrame)
marginal_df = result.marginal
println(names(marginal_df))  # ["marg_age", "marg_school", ...]

# Access specific firm's marginal effects
println("Firm 1 marginal effect of age: ", marginal_df[1, :marg_age])

Variance-Covariance Matrix

# Full variance-covariance matrix
vcov = result.var_cov_mat

# Standard errors (square root of diagonal)
se = result.std_err

# Compute confidence interval for coefficient i
coef_i = result.coeff[i]
se_i = result.std_err[i]
ci_lower = coef_i - 1.96 * se_i
ci_upper = coef_i + 1.96 * se_i

Extracting Specific Parameter Groups

# Frontier coefficients
beta = result.frontier

# Inefficiency parameters (varies by model)
if haskey(result.list, :mu)
    mu_coef = result.mu
end

# Noise variance
if haskey(result.list, :ln_sigma_v_sq)
    sigma_v_sq = exp(result.ln_sigma_v_sq)
end

8.1 Interpreting Estimation Results

Efficiency Indices

Marginal Effects

When zvar is specified and marginal = true in sfmodel_fit():

Log-Transformed Parameters

Many parameters are estimated on a log-transformed scale (e.g., ln_sigma_sq, ln_sigma_v_sq, ln_lambda). To recover the original-scale value, take the exponential:

sigma_u_sq = exp(result.ln_sigma_sq)    # σ_u²
sigma_v_sq = exp(result.ln_sigma_v_sq)  # σ_v²

The auxiliary table printed by sfmodel_fit() (when show_table = true) already reports the original-scale values alongside the log-transformed estimates.
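If you need a standard error for an original-scale parameter yourself, the standard delta-method calculation is a one-liner. This is a sketch, not a package function:

```julia
# Delta method for theta = exp(eta) with eta = ln(sigma^2):
#   se(theta) ~ |d theta / d eta| * se(eta) = exp(eta) * se(eta)
delta_se_exp(ln_est, ln_se) = exp(ln_est) * ln_se

# e.g., ln(sigma_u^2) = -1.0 with se 0.2 gives
# sigma_u^2 ~ 0.368 with se ~ 0.074
```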

Red Flags

Watch for these warning signs in the estimation output:

Indicator What It Means
redflag = 1 Convergence issues detected (large gradient, iteration limit, or non-positive Hessian diagonal)
Wrong-sign OLS_resid_skew Data may not support the presence of inefficiency in the assumed direction (see Cost Frontier Interpretation)
BC values near 1.0 for all firms Inefficiency may be negligible; consider whether a frontier model is appropriate
Very large standard errors Possible multicollinearity in covariates or near-boundary parameter estimates
Gradient norm > 0.1 Optimization did not reach a stationary point; results may be unreliable

9. Advanced Topics

Choosing Between MLE, MCI, and MSLE

The package offers three estimation methods. MLE uses analytic (closed-form) log-likelihoods, while MCI and MSLE are simulation-based and use Halton quasi-random draws.

Feature MLE MCI MSLE
Likelihood Analytic (exact) Simulated Simulated
Simulation draws Not needed Halton QMC Halton QMC
GPU support No Yes Yes
Noise distributions Normal only Normal, StudentT, Laplace Normal, StudentT, Laplace
Ineff distributions half-normal, trunc-normal, Expo All 8 All except Gamma
Copula support None Gaussian, Clayton, Clayton90, Gumbel Gaussian, Clayton, Clayton90, Gumbel
Scaling property Yes (TruncNormal only) All 8 distributions All except Gamma
Heteroscedastic Yes Yes Yes
Panel models TFE, TFE_CSW, TRE TFE only TFE only
Default n_draws N/A 1024 1024

When to use MLE: If your model uses Normal noise with half-normal, truncated-normal, or Exponential inefficiency (and no copula), MLE is the natural first choice — it is exact (no simulation error), fast, and does not require tuning the number of draws.

When to use MCI/MSLE: For models that MLE cannot handle (non-Normal noise, copula dependence, Weibull/Lognormal/Lomax/Rayleigh/Gamma inefficiency), the simulation-based methods are required. For models supported by all three methods, MLE and simulation estimates typically agree closely.

For a given model supported by both MCI and MSLE, the two simulation methods typically produce similar estimates. Differences may arise because MCI and MSLE use different likelihood constructions, so they can respond differently to the choice of distribution, sample size, and starting values. In particular, when the data do not conform well to the assumed distributional shape, the finite set of simulation draws may cover the tails unevenly, causing the two methods to weight those observations differently.

GPU Computation

For large datasets, GPU computation can significantly speed up estimation:

using CUDA

meth = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12 - 1,
    GPU = true,        # Enable GPU
    chunks = 10        # Split computation into 10 chunks for memory management
)

Requirements: a CUDA-capable NVIDIA GPU with the CUDA.jl package installed, and enough VRAM for the N x S likelihood matrices (use the chunks option to reduce peak memory if needed).

Copula Models

To model dependence between the noise and inefficiency terms, specify a copula:

# Clayton copula (lower tail dependence)
spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :HalfNormal,
    copula = :Clayton,
)

# Initial values must include theta_rho for the copula parameter
myinit = sfmodel_init(
    spec = spec,
    frontier = X \ y,
    ln_sigma_sq = [0.0],
    ln_sigma_v_sq = [0.0],
    theta_rho = [0.0]       # copula dependence parameter (transformed scale)
)

# Clayton 90° rotated copula (upper-lower tail dependence)
spec_c90 = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :HalfNormal,
    copula = :Clayton90,
)

# Gumbel copula (upper tail dependence)
spec_g = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    noise = :Normal,
    ineff = :Exponential,
    copula = :Gumbel,
)

The estimation output includes a copula auxiliary table showing the estimated dependence parameter \(\hat{\rho}\) on its original scale, its standard error and z-statistic, and the implied Kendall's \(\tau\).

Copula Interpretation Guide

Interpreting Kendall’s \(\tau\):

Kendall’s \(\tau\) measures the degree of concordance (rank correlation) between the noise term \(v\) and the inefficiency term \(u\). A positive \(\tau\) means that firms with larger noise shocks tend to have larger inefficiency (and vice versa). A negative \(\tau\) indicates the opposite pattern. The magnitude of \(\tau\) reflects the strength of the dependence.

Tail dependence by copula type:

Copula Tail Dependence Interpretation
Gaussian None (symmetric, no tail dependence) Dependence is moderate and evenly spread; no extreme co-movement in the tails
Clayton Lower tail Firms at the low end of both \(v\) and \(u\) tend to co-move; captures “floor effects” where small disturbances coincide with low inefficiency
Clayton 90° Upper-lower (rotated) Mixed tail dependence pattern; useful when dependence is asymmetric in a non-standard direction
Gumbel Upper tail Firms at the high end of both \(v\) and \(u\) tend to co-move; captures “ceiling effects” where large shocks coincide with high inefficiency

When to choose which copula:

  1. Start with copula = :None (independence). Estimate the model without copula dependence as a baseline.
  2. Try Gaussian if you suspect general dependence but have no strong prior on tail behavior. It is the most flexible symmetric option.
  3. Try Clayton or Gumbel if you expect that dependence is stronger for extreme values (e.g., very efficient or very inefficient firms behave differently from the rest).
  4. Compare models using the log-likelihood values across specifications. Also check whether the estimated \(\hat{\rho}\) (copula parameter) is statistically significant — its standard error and z-statistic are reported in the estimation output.

Note: Copula models are supported only for cross-sectional data. Student-t noise is not compatible with copulas: the copula density requires evaluating the Student-t CDF, whose standard implementation is not compatible with the automatic differentiation used in the package.

Scaling Property Model (Cross-Sectional)

The scaling property model introduces observation-specific heterogeneity through a single multiplicative scaling function, rather than making individual distributional parameters heteroscedastic:

\[ u_i = h(\mathbf{z}_i) \cdot u_i^*, \qquad h(\mathbf{z}_i) = \exp(\mathbf{z}_i'\boldsymbol{\delta}), \]

where \(u_i^*\) follows a homoscedastic (scalar-parameter) base distribution and \(\mathbf{z}_i\) is a vector of environmental/exogenous variables. The scaling function \(h(\mathbf{z}_i)\) is always positive and rescales the base inefficiency without changing the distributional shape.

Key mathematical properties:

  1. Jacobian cancellation: The change of variable \(u_i = h_i u_i^*\) produces a Jacobian \(h_i\) that cancels with the \(1/h_i\) from the density transformation \(f_u(u_i) = \frac{1}{h_i} f_{u^*}(u_i/h_i)\), leaving \(f_{u^*}(u_i^*)\) directly in all integrals.
  2. JLMS factorization: \(E(u_i \mid \varepsilon_i) = h_i \cdot E(u_i^* \mid \varepsilon_i)\).
  3. BC non-factorization: \(E(e^{-u_i} \mid \varepsilon_i) = E(e^{-h_i u_i^*} \mid \varepsilon_i)\); the \(h_i\) must remain inside the exponential.
  4. Semi-elasticity: \(\partial \ln E(u_i) / \partial z_{ij} = \delta_j\), providing direct interpretation of the scaling coefficients.

The unconditional mean is \(E(u_i) = h(\mathbf{z}_i) \cdot E(u_i^*)\), where \(E(u_i^*)\) depends on the base distribution:

Distribution \(E(u_i^*)\)
half-normal \(\sigma\sqrt{2/\pi}\)
truncated-normal \(\sigma_u(\Lambda + \phi(\Lambda)/\Phi(\Lambda))\), \(\Lambda = \mu/\sigma_u\)
Exponential \(\sqrt{\lambda}\)
Weibull \(\lambda\,\Gamma(1 + 1/k)\)
Lognormal \(\exp(\mu + \sigma^2/2)\)
Lomax \(\lambda / (\alpha - 1)\), \(\alpha > 1\)
Rayleigh \(\sigma\sqrt{\pi/2}\)
Gamma \(k\theta\)
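For instance, with a half-normal base the table above gives \(E(u_i)\) directly. A minimal sketch with illustrative values for \(\delta\), \(\mathbf{z}_i\), and \(\sigma\), which also checks the semi-elasticity property \(\partial \ln E(u_i)/\partial z_{ij} = \delta_j\) by finite differences:

```julia
# E(u_i) = h(z_i) * E(u*) with a half-normal base (illustrative values)
δ  = [0.5, -0.2]
zi = [1.0, 2.0]
σ  = 0.8
h(z) = exp(z' * δ)                  # scaling function, always positive
Eu_star = σ * sqrt(2 / π)           # half-normal base mean
Eu(z) = h(z) * Eu_star

# Semi-elasticity check: ∂ ln E(u)/∂z_1 should equal δ[1] = 0.5
ε = 1e-6
semi = (log(Eu(zi + [ε, 0.0])) - log(Eu(zi))) / ε
```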

Identification constraint: The zvar matrix must not contain a constant column. A constant in \(\mathbf{z}_i\) would create an intercept \(\delta_0\) that is not separately identified from the scale parameter of the base distribution (e.g., \(\sigma\) in half-normal or \(\lambda\) in Exponential).
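A quick pre-flight check can catch an accidental constant column in zvar before estimation. The helper below is hypothetical (not part of SFrontiers.jl):

```julia
# Returns true if any column of Z is constant — such a column would be
# unidentified against the base scale parameter. Hypothetical helper.
has_constant_col(Z) = any(j -> all(==(Z[1, j]), view(Z, :, j)), axes(Z, 2))

has_constant_col([1.0 2.0; 1.0 3.0])   # → true  (first column is constant)
has_constant_col([1.0 2.0; 2.0 3.0])   # → false
```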

Supported distributions: All 8 inefficiency distributions are supported. The Gamma distribution requires method = :MCI because MSLE relies on the quantile function (inverse CDF), but the Gamma distribution lacks a closed-form inverse CDF — computing it numerically is too expensive for MSLE's massively parallel evaluations. All other distributions work with both :MSLE and :MCI.

Usage Example

# Step 1: Specify a scaling property model
spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,                 # include constant
    zvar = Z_nocons,              # environmental variables (NO constant!)
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling,            # activates scaling property model
)

# Step 2: Choose estimation method (both MSLE and MCI work)
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12,
    GPU = true,
    chunks = 10
)

# Step 3: Set initial values — note the "scaling" keyword for δ
init = sfmodel_init(
    spec = spec,
    frontier = X \ y,
    scaling = zeros(size(Z_nocons, 2)),   # δ initial values
    ln_sigma_sq = [0.0],                  # scalar base parameter
    ln_sigma_v_sq = [0.0]
)

# Step 4: Configure optimization
opt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt = (iterations = 200,),
    main_solver = Newton(),
    main_opt = (iterations = 100, g_abstol = 1e-8)
)

# Step 5: Estimate
result = sfmodel_fit(
    spec = spec, method = meth, init = init,
    optim_options = opt, marginal = true, show_table = true
)

# Access scaling coefficients
println("δ coefficients: ", result.delta)

# Marginal effects (computed via ForwardDiff)
println("Mean marginal effects: ", result.marginal_mean)
println("Observation-level marginal effects: ", result.marginal)

Scaling with Copula

The scaling property model can be combined with copula dependence:

spec = sfmodel_spec(
    type = :production,
    depvar = y,
    frontier = X,
    zvar = Z_nocons,
    noise = :Normal,
    ineff = :HalfNormal,
    hetero = :scaling,
    copula = :Clayton,
)

init = sfmodel_init(
    spec = spec,
    frontier = X \ y,
    scaling = zeros(size(Z_nocons, 2)),
    ln_sigma_sq = [0.0],
    ln_sigma_v_sq = [0.0],
    theta_rho = [0.0]
)

Cost Frontier Models

For cost frontiers, where inefficiency increases costs:

spec = sfmodel_spec(
    type = :cost,           # Cost frontier
    depvar = totalC,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :Exponential,
)

Mathematical Difference

The key difference between production and cost frontiers lies in the sign of the inefficiency term: production frontiers use the composed error \(\varepsilon = v - u\), while cost frontiers use \(\varepsilon = v + u\).

The software handles the sign convention internally when type = :cost is set. All post-estimation quantities (JLMS, BC, marginal effects) are computed correctly for cost frontiers without additional user adjustments.

Interpretation Differences

Aspect Production Frontier Cost Frontier
Composed error \(\varepsilon = v - u\) \(\varepsilon = v + u\)
OLS residual skewness Should be negative Should be positive
JLMS \(E(u \mid \varepsilon)\) Higher = more inefficient Same interpretation
BC \(E(e^{-u} \mid \varepsilon)\) Closer to 1 = more efficient Same interpretation
BC interpretation Firm produces \(\text{BC} \times 100\%\) of potential output Firm’s cost is \(\frac{1}{\text{BC}} \times 100\%\) of the efficient cost level
Frontier coefficients Output elasticities Cost elasticities
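To make the BC interpretation row concrete, a quick calculation with an illustrative BC score of 0.9:

```julia
# Interpreting BC = 0.9 under the two frontier types (illustrative)
bc = 0.9
output_share = bc * 100        # production: firm produces 90% of potential output
cost_ratio   = 100 / bc        # cost: firm's cost is ≈111.1% of the efficient level
```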

Common Pitfall: Wrong-Sign Skewness

Before estimating a stochastic frontier model, check the OLS residual skewness (reported in the sfmodel_fit() output as OLS_resid_skew):

• Production frontier (\(\varepsilon = v - u\)): the OLS residuals should be negatively skewed.
• Cost frontier (\(\varepsilon = v + u\)): the OLS residuals should be positively skewed.

If the skewness has the wrong sign, the data may not support the presence of inefficiency in the assumed direction, and the model may have difficulty converging or produce unreliable estimates.
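The same check can be run by hand before calling sfmodel_fit(). A self-contained sketch (the skewness helper and the simulated data are illustrative; the package reports this quantity as OLS_resid_skew):

```julia
using Statistics, Random

Random.seed!(1)

# Sample skewness of a vector (illustrative helper)
skewness(e) = mean(((e .- mean(e)) ./ std(e; corrected = false)) .^ 3)

# Simulated production data: the one-sided inefficiency term u makes
# the OLS residuals negatively skewed
X = hcat(ones(200), collect(range(0, 1; length = 200)))
u = abs.(randn(200))                        # half-normal inefficiency
y = X * [1.0, 0.5] .+ 0.1 .* randn(200) .- u
e = y .- X * (X \ y)                        # OLS residuals

skewness(e)   # negative, consistent with a production frontier
```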

Choosing the Number of Halton Draws

The number of QMC draws affects accuracy and computation time:

n_draws Typical Use Case
127 Quick testing
1024 Standard estimation (default)
4095 (\(2^{12}-1\)) Publication-quality results
8191 High precision
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 2^12 - 1  # 4095 draws
)

Observation-Specific Halton Draws (multiRand)

By default (multiRand=true), each observation receives a different consecutive segment of the Halton sequence, providing better quasi-Monte Carlo coverage for the model:

The mechanism generates a single equidistributed Halton sequence of length \(L\) and assigns it to the \(N\) observations in consecutive blocks of \(S\) draws, recycling from the beginning of the sequence if necessary to fill all \(N \times S\) elements.

We impose a default upper bound of \(L=2^{15}-1=32767\) (controlled by distinct_Halton_length) to avoid generating points that lie extremely close to 0 or 1, which can trigger numerical instability in floating-point arithmetic (e.g., through \(\log(t)\), \(\log(1-t)\), or divisions by \(t\) or \((1-t)\)). This bound can be increased via the distinct_Halton_length option in sfmodel_method().

Ideally \(L \leq N \times S\) so that the full sequence is consumed and recycled if necessary. When \(N \times S < L\), the length-\(L\) sequence cannot be fully utilized. In this case, the program automatically selects the longest \(L'\) satisfying \(L' \leq N \times S\) and uses it as described.
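The assignment logic can be sketched as follows (illustrative, not the package's internal code): cut the length-\(L\) sequence into consecutive blocks of \(S\) draws, wrapping around to the start when the sequence is exhausted:

```julia
# Assign a length-L sequence to N observations in consecutive blocks
# of S draws, recycling from the start when N*S > L (illustrative).
function assign_blocks(seq::AbstractVector, N::Integer, S::Integer)
    L = length(seq)
    [seq[mod1((i - 1) * S + s, L)] for i in 1:N, s in 1:S]
end

assign_blocks(collect(1:5), 2, 3)
# observation 1 gets draws 1–3; observation 2 gets draws 4, 5, then 1 again
```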

Constraint: When multiRand=true, n_draws (the number of draws per observation) must be \(\leq\) distinct_Halton_length (default \(2^{15} - 1 = 32767\)). To use more draws, either increase distinct_Halton_length or set multiRand=false.

# Default: observation-specific draws (recommended)
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 1024,    # Must be <= distinct_Halton_length (default 32767)
    multiRand = true   # Default
)

# Use more draws by increasing distinct_Halton_length
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 50000,
    multiRand = true,
    distinct_Halton_length = 2^16 - 1  # 65535
)

# Legacy mode: all observations share the same draws
meth = sfmodel_method(
    method = :MSLE,
    n_draws = 8191,    # No constraint from distinct_Halton_length
    multiRand = false
)

Custom Halton Sequences

For reproducibility or special requirements when using multiRand=false:

using HaltonSequences

# Generate custom Halton sequence as 1xS matrix (required format for multiRand=false)
halton_vec = make_halton_p(1024; T = Float64)
halton = reshape(halton_vec, 1, length(halton_vec))  # Convert to 1xS

meth = sfmodel_method(
    method = :MSLE,
    draws = halton,     # Pre-reshaped 1xS matrix
    multiRand = false   # Required when providing custom 1xS draws
)

Handling Convergence Issues

If estimation fails to converge:

  1. Try different initial values

    # Use grid search for starting values
    for sigma_init in [-2.0, -1.0, 0.0, 1.0]
        init = sfmodel_init(spec=spec, ..., ln_sigma_sq=[sigma_init])
        result = sfmodel_fit(spec=spec, method=meth, init=init)
        if result.converged
            break
        end
    end
  2. Increase warmstart iterations

    myopt = sfmodel_opt(
        warmstart_solver = NelderMead(),
        warmstart_opt = (iterations = 400,),  # More warmstart iterations
        main_solver = Newton(),
        main_opt = (iterations = 200,)
    )
  3. Use different optimizers

    myopt = sfmodel_opt(
        warmstart_solver = ParticleSwarm(),  # Global search, slow
        warmstart_opt = (iterations = 500,),
        main_solver = BFGS(),  # Alternative to Newton
        main_opt = (iterations = 2000,)
    )
  4. Check the gradient norm

    if result.gradient_norm > 0.1
        println("Warning: Large gradient norm indicates poor convergence")
    end
  5. Try a different estimation method

    If one method has convergence difficulties, try another:

    # If MSLE fails, try MCI (or vice versa)
    meth_alt = sfmodel_method(method = :MCI, n_draws = 1024)
    result = sfmodel_fit(spec = spec, method = meth_alt)
    
    # Or try MLE (if the model supports it: Normal noise, no copula,
    # half-normal/truncated-normal/Exponential inefficiency)
    meth_mle = sfmodel_method(method = :MLE)
    result = sfmodel_fit(spec = spec, method = meth_mle)
  6. Check OLS residual skewness

    Before investing effort in convergence tuning, verify that the data supports the presence of inefficiency. If result.OLS_resid_skew has the wrong sign (positive for production, negative for cost), the data may not exhibit the one-sided pattern that stochastic frontier models require. In such cases, even a perfectly converged model may produce meaningless estimates.

  7. Reduce model complexity first

    If a heteroscedastic or copula model fails to converge, first estimate a simpler version:

    • Drop the copula (copula = :None) to establish a baseline.
    • Remove heteroscedasticity (hetero = Symbol[]) and use a homoscedastic specification.
    • Use a simpler inefficiency distribution (e.g., half-normal instead of Lognormal).

    Once the simple model converges, use its estimates as initial values for the more complex specification.

  8. Diagnostic decision tree

    Symptom Likely Cause Action
    Warmstart did not converge Poor starting region Increase warmstart iterations (e.g., 500+) or switch to ParticleSwarm() for global search
    Main stage did not converge Starting values still far from optimum Use warmstart estimates as init, increase main iterations, or try BFGS() instead of Newton()
    Hessian non-invertible Model overparameterized or data insufficient Reduce model complexity; check for multicollinearity in X or Z
    Very large standard errors Near-boundary parameters or collinear covariates Check correlation among Z variables; consider dropping redundant covariates
    Gradient norm between 0.01 and 0.1 Near convergence but not quite Increase iterations or loosen g_abstol slightly
    Gradient norm > 1 Far from optimum Completely different initial values needed; try grid search

10. Panel Data Models

The module supports several panel stochastic frontier models for estimating persistent firm-level inefficiency from balanced or unbalanced panel data. Panel estimation is accessed through the same unified API by setting the datatype argument in sfmodel_spec().

Panel Model Types

datatype Model Methods Ineff (MLE)
:panel_TFE Wang and Ho (2010) true fixed-effect MCI, MSLE, MLE half-normal, truncated-normal
:panel_TFE_CSW Chen, Schmidt, and Wang (2014) fixed-effect MLE only half-normal only
:panel_TRE True random-effect MLE only half-normal, truncated-normal

For panel_TFE, the simulation-based methods (MCI, MSLE) support all 8 inefficiency distributions. MLE support is limited to the distributions listed above. For panel_TFE_CSW and panel_TRE, only MLE is available.

10.1 Theoretical Background

Wang and Ho (2010) True Fixed-Effect (panel_TFE)

The Wang and Ho (2010) model starts from:

\[ y_{it} = \alpha_i + x_{it}'\beta + v_{it} - h(z_{it}) \cdot u_i^* \]

where \(\alpha_i\) is a firm-specific fixed effect, \(v_{it} \sim N(0, \sigma_v^2)\) is noise, \(u_i^* \ge 0\) is persistent inefficiency, and \(h(z_{it}) = \exp(z_{it}'\delta)\) is a scaling function.

To eliminate \(\alpha_i\), a within-group transformation (demeaning) is applied:

\[ \tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{v}_{it} - \tilde{h}_{it} \cdot u_i^* \]

where tildes denote demeaned values. Key point: \(y\) and \(X\) are demeaned, but \(Z\) is NOT demeaned. Instead, \(h(z_{it})\) is computed first, then demeaned to obtain \(\tilde{h}_{it}\).
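The order of operations for one firm can be sketched as follows (illustrative values; demean is a hypothetical helper, not a package function):

```julia
using Statistics

# Hypothetical within-transformation helper for one firm's block of rows
demean(A) = A .- mean(A; dims = 1)

δ = [0.3]
y = [1.0, 1.4, 1.9]                  # one firm observed for T = 3
X = reshape([0.2, 0.5, 0.9], 3, 1)
Z = reshape([0.1, 0.4, 0.7], 3, 1)

y_til = demean(y)     # y is demeaned directly
X_til = demean(X)     # X is demeaned directly
h     = exp.(Z * δ)   # Z is NOT demeaned: compute h(z_it) first...
h_til = demean(h)     # ...then demean h itself
```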

Because the demeaned noise has a singular covariance \(\Sigma = \sigma_v^2(I - \frac{1}{T}\mathbf{1}\mathbf{1}')\), the quadratic form simplifies to:

\[ \text{pert}' \Sigma^+ \text{pert} = \|\text{pert}\|^2 / \sigma_v^2 \]

This eliminates the need for the pseudo-inverse, keeping computation efficient.
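The simplification can be verified numerically. The sketch below evaluates the pseudo-inverse directly, which is exactly the computation the package avoids:

```julia
using LinearAlgebra

# Check pert' Σ⁺ pert = ||pert||² / σ_v² for a demeaned vector pert
T, sigma_v2 = 5, 0.7
M = I - fill(1 / T, T, T)       # within (demeaning) projection, idempotent
Σ = sigma_v2 * M                # singular covariance of the demeaned noise
pert = M * randn(T)             # an arbitrary demeaned vector

lhs = pert' * pinv(Σ) * pert
rhs = dot(pert, pert) / sigma_v2
```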

Chen, Schmidt, and Wang (2014) Fixed-Effect (panel_TFE_CSW)

The CSW model (Chen, Schmidt, and Wang 2014, Journal of Econometrics) provides an alternative approach to estimating true fixed-effect stochastic frontier models. After the within-group transformation eliminates the fixed effects \(\alpha_i\), the CSW approach exploits properties of the closed skew-normal (CSN) distribution to derive a closed-form likelihood for the demeaned composed error. This avoids the need for the scaling-property structure required by Wang and Ho (2010).

Key characteristics:

When to use CSW vs. Wang-Ho TFE:

True Random-Effect (panel_TRE)

The TRE model, attributed to Greene (2005), treats the individual effect \(\alpha_i\) as a random draw from \(N(0, \sigma_\alpha^2)\) rather than as a fixed parameter to be eliminated. Because \(\alpha_i\) is random and integrated out of the likelihood, a constant term may be included in the frontier equation (unlike the fixed-effect models where demeaning eliminates constants).

Key characteristics:

When to use TRE vs. TFE:

The choice between TRE and TFE mirrors the classic random-effects vs. fixed-effects tradeoff in panel econometrics:

10.2 Panel Quick Start

Example: Panel TFE with MSLE (simulation-based)

using CSV, DataFrames, Optim
using CUDA
using SFrontiers

# Load panel data (stacked: firm 1 all T periods, firm 2 all T periods, ...)
df = CSV.read("panel_data.csv", DataFrame)
y = df.y
X = hcat(df.x1, df.x2)    # NT x K — no constant column!
Z = hcat(df.z1)            # NT x L — no constant column!

# Step 1: Specify panel model
myspec = sfmodel_spec(
    type = :production,
    datatype = :panel_TFE,    # Wang and Ho 2010 true fixed-effect
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    ineff = :HalfNormal,
    id = df.firm,             # unit identifier (required for all panel models)
)

# Step 2: Choose estimation method
mymeth = sfmodel_method(method = :MSLE, n_draws = 1024)

# Step 3: Set initial values (panel-specific keywords)
myinit = sfmodel_init(
    spec = myspec,
    delta = [0.1],            # scaling function coefficient
    ln_sigma_u_sq = 0.1,     # scalar
    ln_sigma_v_sq = 0.1      # scalar
)

# Step 4: Configure optimization
myopt = sfmodel_opt(
    warmstart_solver = NelderMead(),
    warmstart_opt = (iterations = 200, g_abstol = 1e-3),
    main_solver = Newton(),
    main_opt = (iterations = 200, g_abstol = 1e-8)
)

# Step 5: Estimate
result = sfmodel_fit(
    spec = myspec,
    method = mymeth,
    init = myinit,
    optim_options = myopt,
    show_table = true
)

# Access results
println("Log-likelihood: ", result.loglikelihood)
println("Mean JLMS: ", mean(result.jlms))      # firm-level, N-vector
println("Mean BC: ", mean(result.bc))            # firm-level, N-vector

Example: Panel TFE with MLE (analytic)

# Same spec as above — MLE is available when ineff is half-normal or truncated-normal
mymeth_mle = sfmodel_method(method = :MLE)  # no draws needed
result_mle = sfmodel_fit(spec = myspec, method = mymeth_mle, show_table = true)

Example: Panel TFE_CSW (MLE only)

spec_csw = sfmodel_spec(
    datatype = :panel_TFE_CSW,
    depvar = y,
    frontier = X,
    noise = :Normal,
    id = firm_ids,
    ineff = :HalfNormal         # CSW requires half-normal
)
meth = sfmodel_method(method = :MLE)
result = sfmodel_fit(spec = spec_csw, method = meth)

Example: Panel TRE (MLE only)

spec_tre = sfmodel_spec(
    datatype = :panel_TRE,
    depvar = y,
    frontier = X,
    zvar = Z,
    noise = :Normal,
    id = firm_ids,
    ineff = :HalfNormal         # or :TruncatedNormal
)
meth = sfmodel_method(method = :MLE)
result = sfmodel_fit(spec = spec_tre, method = meth)

10.3 Panel vs. Cross-Sectional Differences

Feature Cross-Sectional Panel (TFE / TFE_CSW / TRE)
datatype :cross_sectional (default) :panel_TFE, :panel_TFE_CSW, :panel_TRE
Noise distributions Normal, Student-t, Laplace Normal only
Inefficiency distributions All 8 TFE: all 8 (MCI/MSLE), 2 (MLE); CSW: half-normal; TRE: 2
Methods MCI, MSLE, MLE TFE: all 3; CSW/TRE: MLE only
Copula Gaussian, Clayton, Clayton90, Gumbel Not supported
Heteroscedasticity (hetero) Yes (via Z), or hetero=:scaling for scaling property model Not supported; use \(h(z)\) scaling
Scaling property model hetero=:scaling; zvar has no constant; init keyword: scaling Always active; zvar has no constant; init keyword: delta
Constant in frontier/zvar Required in frontier; required in zvar unless hetero=:scaling Not allowed (within-group demeaning eliminates it)
JLMS/BC indices Observation-level (\(N\) vector) Firm-level (\(N_{\text{firms}}\) vector)
Panel structure N/A id (required for both balanced and unbalanced)
Init: scaling coefficients scaling keyword (when hetero=:scaling) delta keyword
Marginal effects Yes (heteroscedastic and scaling) N/A

10.4 Supported Panel Inefficiency Distributions

Panel TFE (Wang and Ho 2010)

All eight inefficiency distributions are supported via MCI/MSLE. MLE supports a subset:

Distribution Parameters (scalar) MLE MSLE MCI
:HalfNormal ln_sigma_u_sq Yes Yes Yes
:TruncatedNormal mu, ln_sigma_u_sq Yes Yes Yes
:Exponential ln_lambda No Yes Yes
:Weibull ln_lambda, ln_k No Yes Yes
:Lognormal mu, ln_sigma_sq No Yes Yes
:Lomax ln_lambda, ln_alpha No Yes Yes
:Rayleigh ln_sigma_sq No Yes Yes
:Gamma ln_k, ln_theta No No Yes

Panel TFE_CSW (Chen, Schmidt, and Wang 2014)

Distribution MLE
:HalfNormal Yes

Panel TRE (True Random-Effect)

Distribution MLE
:HalfNormal Yes
:TruncatedNormal Yes

Important: In all panel models, inefficiency distribution parameters are scalar (not heteroscedastic via Z). Heterogeneity enters only through the scaling function \(h(z_{it}) = \exp(z_{it}'\delta)\).

10.5 Balanced vs. Unbalanced Panels

Both balanced and unbalanced panels use the id keyword to specify the panel structure. Data must be grouped by firm (contiguous rows for each unit). The number of periods per firm is inferred from the id column.

# The same call handles balanced panels (all firms share the same T)
# and unbalanced panels (firms may have different T); the number of
# periods per firm is inferred from the id column
spec = sfmodel_spec(datatype = :panel_TFE,
    depvar = y, frontier = X, zvar = Z,
    noise = :Normal, ineff = :HalfNormal, id = firm_ids)

For balanced panels where the data does not already have a firm identifier column, one can be created: id = repeat(1:N, inner=T), where N is the number of firms and T is the number of time periods.
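For example, with N = 3 firms observed for T = 2 periods:

```julia
# id column for a balanced panel with N = 3 firms, T = 2 periods
id = repeat(1:3, inner = 2)
# → [1, 1, 2, 2, 3, 3]
```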

10.6 No Constant Columns

In the Wang and Ho (2010) panel model (panel_TFE), within-group demeaning eliminates any constant terms. Therefore:

• Do not include a constant column in frontier (X).
• Do not include a constant column in zvar (Z).

10.7 Panel DSL Macros

In addition to @useData, @depvar, @frontier, and @zvar, panel models require the @id macro to specify the panel structure:

# Panel TFE (with @zvar)
spec = sfmodel_spec(datatype = :panel_TFE,
    @useData(df), @depvar(y), @frontier(x1, x2), @zvar(z1), @id(firm),
    noise = :Normal, ineff = :HalfNormal)

# Panel TFE (without @zvar)
spec = sfmodel_spec(datatype = :panel_TFE,
    @useData(df), @depvar(y), @frontier(x1, x2), @id(firm),
    noise = :Normal, ineff = :HalfNormal)

# Panel TRE
spec = sfmodel_spec(datatype = :panel_TRE,
    @useData(df), @depvar(y), @frontier(x1, x2), @id(firm),
    noise = :Normal, ineff = :HalfNormal)

10.8 Error Messages for Unsupported Configurations

When method=:MLE is used with a configuration that MLE does not support, sfmodel_fit issues an informative error listing the reason and the supported alternatives.

10.9 Standalone Panel API (Backward Compatible)

For users who prefer the standalone panel API (without the unified sfmodel_spec interface), the following functions are available and delegate directly to the Panel backend (Wang and Ho 2010 model only):

sfmodel_panel_spec(; kwargs...)      # Panel model specification
sfmodel_panel_method(; kwargs...)    # Panel method specification
sfmodel_panel_init(; kwargs...)      # Panel initial values
sfmodel_panel_opt(; kwargs...)       # Panel optimization options
sfmodel_panel_fit(; kwargs...)       # Panel estimation

These are retained for backward compatibility with existing panel scripts.


Citation