Functions of JCAMPDXir

JCAMPDXir.JCAMPDXir - Module

The JCAMP-DX (infrared) file format was developed for the exchange of infrared spectra between different laboratories. For a general description of the format, refer to the PDF document at IUPAC.ORG. In addition to the spectra themselves, the file also stores metadata with the units of measurement and the conditions under which the spectra were acquired.

JCAMP file content example:

        ##TITLE=1 
        ##JCAMP-DX=4.24
        ##DATATYPE=INFRARED SPECTRUM
        ##XUNITS=1/CM
        ##YUNITS=TRANSMITTANCE
        ##YFACTOR=0.00699183
        ##FIRSTY=11746.3412893072
        ##NPOINTS=16384
        ##XYDATA=(X++(Y..Y))
        0 1680010 821286 2148133 1505245 1537124 1367661 1147725 1134981
        7.71603 1166853 1213186 1029828 1067595 1135904 1195128 1157134 1150556
        15.4321 1266743 1164401 1014224 1022338 999780 1138781 1208950 1161258
        .
        .
        ##END=

According to the JCAMP specification, there are several formats for data compression:

        - simple integer: 7.71603 1166853 5213186 -1029828 -1067595 (y-data are decoded as integer numbers)
        - PAC: 7.71603+1166853+5213186-1029828-1067595 (signs are used as delimiters)
        - SQZ: 7.71603A166853E213186a029828a067595 (the sign and the leading digit of each value are replaced by a letter according to [`SQZ_digits`](@ref))
        - DIF: 7.71603 1166853 M046333 o243014 l7767 (every data chunk except the first one stores the
               shift of the value relative to the previous one)
        - DUP: in this mode repeated y-values are replaced with a single letter according to [`DUP_digits`](@ref);
               the letter gives the number of repetitions. DUP mode can be combined with DIF mode
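
The SQZ and DIF lines in the list above can be decoded by hand with a short, self-contained sketch. The lookup tables follow the usual JCAMP-DX pseudo-digit convention (`@`, `A`-`I`, `a`-`i` for SQZ and `%`, `J`-`R`, `j`-`r` for DIF). The names `SQZ_TABLE`, `DIF_TABLE`, `pseudo_value`, `decode_sqz` and `decode_dif` are hypothetical and are not part of the package, whose own tables are `SQZ_digits` and `DUP_digits`.

```julia
# Hypothetical pseudo-digit tables (illustration only, not the package's tables).
const SQZ_TABLE = merge(Dict(zip("@ABCDEFGHI", 0:9)), Dict(zip("abcdefghi", -1:-1:-9)))
const DIF_TABLE = merge(Dict(zip("%JKLMNOPQR", 0:9)), Dict(zip("jklmnopqr", -1:-1:-9)))

# The leading pseudo-digit carries both the sign and the first decimal digit.
function pseudo_value(token::AbstractString, table::Dict{Char,Int})
    d = table[first(token)]
    magnitude = parse(Int, string(abs(d), token[2:end]))
    return d < 0 ? -magnitude : magnitude
end

# SQZ: every token is an absolute value.
decode_sqz(s) = [pseudo_value(m.match, SQZ_TABLE) for m in eachmatch(r"[@A-Ia-i][0-9]*", s)]

# DIF: every token is a difference that is added to the previously decoded value.
decode_dif(first_y::Int, tokens) =
    cumsum(vcat(first_y, [pseudo_value(t, DIF_TABLE) for t in tokens]))

y_sqz = decode_sqz("A166853E213186a029828a067595")
y_dif = decode_dif(1166853, ["M046333", "o243014", "l7767"])
```

Both calls recover the y-values of the "simple integer" line: 1166853, 5213186, -1029828, -1067595.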

Core.Type - Method
(::Type{SQZ})(s::AbstractString)

All decoding types are callable on a string and use string conversion to add delimiters between the data chunks. When the line decoding is Unspecified_Line, an additional operation determines the decoding type of the string at run time before the string is converted.

Core.Type - Method
(::Type{T})(s::AbstractString) where T<:Decoding

By default, chunk and line decoding types leave the string unchanged.

JCAMPDXir.Decoding - Type
Types for decoding both lines and chunks. Line and chunk types are used
for dispatch during parsing.

JCAMPDXir.addline! - Method
addline!(           jdx::JDXblock, 
                    data_buffer::DataBuffer{DataLineType,B,LD,ChunkDecoding},
                    current_line::String; # index of current data chunk
                    delimiter=DEFAULT_DELIMITER(DataLineType),
                    validate_data::Bool=true) where {DataLineType<:XYYline,B,LD,ChunkDecoding}

This function parses the `current_line` string of the file, fills the data buffer `data_buffer` by calling `fill_data_buffer!`, and copies the buffer content to the x- and y-vectors of the `jdx` object (JDXblock).

- the number of data points (excluding the x-coordinate) in the line
- `delimiter` - data point delimiter used in the split function
- `validate_data` - if false, the parser ignores all data validations

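As an illustration of what parsing one (X++(Y..Y)) line involves, here is a minimal sketch. It assumes whitespace-delimited, uncompressed chunks, a known uniform x-step and a y-scaling factor taken from the headers. The helper `append_xyy_line!` and its keywords `delta_x` and `yfactor` are hypothetical and do not reproduce the package's `addline!`.

```julia
# Hypothetical helper (illustration only, not the package's addline!).
function append_xyy_line!(x::Vector{Float64}, y::Vector{Float64},
                          line::AbstractString; delta_x::Float64, yfactor::Float64=1.0)
    chunks = split(line)                  # whitespace-delimited data chunks
    x_start = parse(Float64, chunks[1])   # first chunk: abscissa of the first y-value
    for (k, chunk) in enumerate(chunks[2:end])
        push!(x, x_start + (k - 1) * delta_x)      # x advances by a fixed step
        push!(y, yfactor * parse(Float64, chunk))  # scale the raw ordinate
    end
    return length(chunks) - 1             # number of y-values read from the line
end

x = Float64[]
y = Float64[]
n = append_xyy_line!(x, y, "7.71603 1166853 1213186 1029828 1067595";
                     delta_x=0.5, yfactor=2.0)
```

The step of 0.5 and the factor of 2.0 are arbitrary values chosen to keep the arithmetic easy to follow.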
JCAMPDXir.addline! - Method
addline!(jdx::JDXblock, 
                    data_buffer::DataBuffer{DataLineType},
                    current_line::String; # index of current data chunk
                    delimiter=r"[,;]",
                    validate_data::Bool=true) where DataLineType<:XYXYline

Adds a line to the XY...XY data.

JCAMPDXir.check_data_point! - Method
check_data_point!(dv::DataValidation,
                                value_checker::T,
                                value_real::T,
                                field::Symbol,
                                point_index::Int=0,
                                line_index::Int=0) where T

Checks a parsed value against a reference value. Input arguments:

- dv - [`DataValidation`](@ref) object
- value_checker - reference value (usually taken from the file headers)
- value_real - value parsed from the file
- field - must be :x or :y (for data vector validation) or one of the [`VIOLATION_CRITERIA`](@ref) keys
- point_index - index of the data point (used to form the message if the validation fails)
- line_index - index of the line in the file (used to form the message if the validation fails)

If `value_checker ≈ value_real` does not hold, the validation fails.

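To see why an approximate comparison is used rather than strict equality, consider the `##FIRSTY` and `##YFACTOR` headers of the example file together with its first raw y-value. The sketch below uses Julia's standard `isapprox`. The tolerance of `1e-6` is an assumption for illustration; the tolerance used by the package is not stated here.

```julia
# Values taken from the example file header and its first data line.
yfactor = 0.00699183
firsty_header = 11746.3412893072     # ##FIRSTY
firsty_parsed = yfactor * 1680010    # first raw ordinate scaled by ##YFACTOR

# Hypothetical tolerance, chosen only for this illustration.
passes_loose = isapprox(firsty_header, firsty_parsed; rtol=1e-6)

# Default relative tolerance of isapprox for Float64 is sqrt(eps(Float64)).
passes_strict = isapprox(firsty_header, firsty_parsed)
```

Because `YFACTOR` is stored with a limited number of digits, the two values differ in about the seventh significant digit, so the loose check passes while the default one does not.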
JCAMPDXir.convert! - Method
convert!(x::AbstractArray,::sUnits{T},::sUnits{T}) where T

This function can be used to convert x and y units.

Supported x-unit names: NANOMETERS, NM, CM-1, MKM, MICROMETERS, 1/CM, WAVELENGTH (NM), CM^-1, WAVELENGTH (UM)

Supported y-unit names: ABSORBANCE, A, T, A.U., TRANSMITTANCE, R, KUBELKA-MUNK, ARBITRARY UNITS, REFLECTANCE

Example

julia> convert!([1.0,2.0,3.0],xUnits("MKM"),xUnits("1/cm"))

JCAMPDXir.fill_data_buffer! - Method
fill_data_buffer!(data_buffer::DataBuffer{D,B,LineDecodingType,ChunkType},
                                        current_line,
                                        delimiter, 
                                        chunk_counter=1) where {D,B,
                                        LineDecodingType,
                                        ChunkType}

Fills the data buffer `data_buffer` from `current_line`. All data chunks in `current_line` should be separated by the delimiter. Returns the number of data chunks added to the buffer.

JCAMPDXir.find_blocks - Method
find_blocks(file_name)

Counts the blocks in the file `file_name`, records the start and end line of each block, and returns a vector of JDXblock objects or a single block.

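The idea of locating blocks by line index can be sketched as follows, assuming that every block opens with a `##TITLE=` line and closes with a `##END=` line and that blocks are not nested. The helper `block_ranges` is hypothetical and works on lines already read into memory; it is not the package's `find_blocks`.

```julia
# Hypothetical helper: returns the (start, finish) line indices of each block.
function block_ranges(lines::Vector{String})
    ranges = Tuple{Int,Int}[]
    start = 0                              # 0 means "not inside a block"
    for (i, line) in enumerate(lines)
        if startswith(line, "##TITLE=")
            start = i                      # a block opens here
        elseif startswith(line, "##END=") && start > 0
            push!(ranges, (start, i))      # a block closes here
            start = 0
        end
    end
    return ranges
end

lines = ["##TITLE=1", "##XUNITS=1/CM", "##END=",
         "##TITLE=2", "##YUNITS=TRANSMITTANCE", "##END="]
ranges = block_ranges(lines)
```

For the six lines above the result is two ranges, lines 1 to 3 and lines 4 to 6.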
JCAMPDXir.parse_headers! - Method
parse_headers!(jdx::JDXblock,headers::Vector{String})

Internal function; fills the headers dictionary from a vector of strings.

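A minimal sketch of turning `##KEY=value` lines into a dictionary is shown below. It assumes that a value which parses as a floating-point number is stored as a number and everything else as a string. The helper `headers_dict` is hypothetical, and the package's `parse_headers!` may treat keys and values differently.

```julia
# Hypothetical helper (illustration only, not the package's parse_headers!).
function headers_dict(header_lines::Vector{String})
    headers = Dict{String,Any}()
    for line in header_lines
        m = match(r"^##([^=]+)=(.*)$", line)
        m === nothing && continue          # skip lines that are not labelled records
        key = String(strip(m.captures[1]))
        raw = strip(m.captures[2])
        number = tryparse(Float64, raw)    # `nothing` when the value is not numeric
        headers[key] = number === nothing ? String(raw) : number
    end
    return headers
end

h = headers_dict(["##XUNITS=1/CM", "##YUNITS=TRANSMITTANCE",
                  "##YFACTOR=0.00699183", "##NPOINTS=16384"])
```

With this approach `h["YFACTOR"]` is a `Float64`, while `h["XUNITS"]` stays a string.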
JCAMPDXir.read! - Method
read!(jdx::JDXblock; delimiter=nothing,
                           only_headers::Bool=false,
                           fixed_columns_number::Bool=true,
                           fixed_line_decoding::Bool = false,
                           fixed_chunk_decoding::Bool = false,
                           validate_data::Bool = true)

Fills a pre-created JDXblock object (see JDXblock).

- `delimiter` - data chunk delimiter
- `only_headers` - if true, parses only the block headers
- `fixed_columns_number` - if true, all lines are supposed to have the same number of chunks
- `fixed_line_decoding` - if true, the data line decoding (No_Line_Decoding, SQZ, PAC) is supposed to be the same for all lines (the parser obtains the decoding from the first line of data)
- `fixed_chunk_decoding` - if true, the data chunk decoding (No_Chunk_Decoding, DIF, DUP) is supposed to be the same for all lines (the parser obtains the decoding from the first line of data)
- `validate_data` - if false, ignores all data validations specified by the JCAMP format

JCAMPDXir.read_jdx_file - Method
read_jdx_file(file_name::String;
            fixed_columns_number::Bool=false,
            delimiter = nothing,
            fixed_line_decoding::Bool = false,
            fixed_chunk_decoding::Bool = false,
            validate_data::Bool=true)

Reads the JCAMP-DX file `file_name`.

Input arguments:

- `file_name` - full file name

Optional keyword arguments:

- `fixed_columns_number` - if it is known that each line in the file has the same number of data chunks (coded numbers), this flag can be set to true
- `delimiter` - data chunk delimiter (default value is space)
- `fixed_line_decoding` - if `true`, the line decoding type (PAC, SQZ or no line decoding) is taken only once from the first line of data; otherwise a new type is obtained for each line
- `fixed_chunk_decoding` - if `true`, the chunk decoding type (DIF, DUP, mixed DIFDUP or no chunk decoding) is taken from the first line of data; otherwise a new type is obtained for each line
- `validate_data` - turns on internal data checks

Setting these flags usually speeds up data loading, for example by reducing allocations. If file loading speed is not important, the optional flags can be left at their defaults.

The output is a named tuple (or a vector of named tuples in the case of multiple blocks)

with fields:

x - coordinate (wavelength, wavenumber or other)

y - data 

headers - dictionary in "String => value" format with headers values, values can be both numbers and strings

data_validation - structure of [`DataValidation`](@ref) type

JCAMPDXir.write_jdx_file - Function
write_jdx_file(file_name,x::Vector{Float64},y::Vector{Float64},x_units::String="1/CM",
    y_units::String="TRANSMITTANCE"; kwargs...)

Saves an infrared spectrum given as two vectors x and y to the JCAMP-DX file file_name. The input vectors should be of the same size. Currently the package supports only the (X++(Y..Y)) table data format. JCAMP-DX 4.24 demands 80 symbols per file line of Y-data and 88 total symbols per line. The y-vector is stored in eight columns, thus the total number of points should be a multiple of eight. The last mod(length(y),8) points of the y data are not written to the file. It is preferable that the x-data are sorted in ascending order and spaced uniformly. If this is not the case, or if a units conversion is involved, the function automatically sorts the data and interpolates them onto a uniformly spaced grid.

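The effect of the eight-column layout on the number of stored points can be checked with a few lines of arithmetic. This is only an illustration of the `mod(length(y),8)` rule stated above and not the code the package uses for writing.

```julia
y = collect(1.0:19.0)                       # 19 points: not a multiple of eight
n_written = length(y) - mod(length(y), 8)   # points that fill complete lines
n_dropped = length(y) - n_written           # trailing points that are not stored

# Each column of `rows` holds the eight y-values of one file line.
rows = reshape(y[1:n_written], 8, :)
```

Padding or resampling the spectrum to a multiple of eight points before writing avoids losing the trailing values.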
x_units - units of x data, must be one of NANOMETERS, NM, CM-1, MKM, MICROMETERS, 1/CM, WAVELENGTH (NM), CM^-1, WAVELENGTH (UM)

y_units - units of y data, must be one of ABSORBANCE, A, T, A.U., TRANSMITTANCE, R, KUBELKA-MUNK, ARBITRARY UNITS, REFLECTANCE

Further, any keyword arguments can be provided; all of them are written to the head of the file. All keyword arguments appear in the file in uppercase.

The most important keywords are

- TITLE - the title of the file (it is always at the top of the file)
- XUNITS - x data units saved to the file, must be one of NANOMETERS, NM, CM-1, MKM, MICROMETERS, 1/CM, WAVELENGTH (NM), CM^-1, WAVELENGTH (UM)
- YUNITS - y data units saved to the file, must be one of ABSORBANCE, A, T, A.U., TRANSMITTANCE, R, KUBELKA-MUNK, ARBITRARY UNITS, REFLECTANCE

If x_units (the function's fourth argument) is not equal to the keyword argument XUNITS, then the function converts the x-values before saving to the file, see xconvert!

If y_units (the function's fifth argument) is not equal to the keyword argument YUNITS, then the function converts the y-values before saving to the file, see yconvert!

Example

julia> using JCAMPDXir
julia> filename = joinpath(@__DIR__,"test.jdx")
julia> write_jdx_file(filename,[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0],rand(8),"MKM","T",title = "new file",XUNITS="1/CM",YUNITS="KUBELKA-MUNK")

JCAMPDXir.xconvert! - Method
xconvert!(x::AbstractArray,input_units::String,output_units::String)

Converts the values of `x` from `input_units` to `output_units`.
Supported x-unit names: NANOMETERS, NM, CM-1, MKM, MICROMETERS, 1/CM, WAVELENGTH (NM), CM^-1, WAVELENGTH (UM)

Example

julia> xconvert!([1.0,2.0,3.0],"MKM","1/cm")
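
For reference, the physical relation behind a micrometers-to-wavenumbers conversion is ν [1/cm] = 10^4 / λ [µm]. The sketch below applies it in place. The helper `micrometers_to_wavenumbers!` is hypothetical and only illustrates the formula; `xconvert!` itself may be implemented differently.

```julia
# Hypothetical in-place helper (illustration only).
function micrometers_to_wavenumbers!(x::AbstractVector{<:AbstractFloat})
    x .= 1e4 ./ x        # wavenumber in 1/cm from wavelength in micrometers
    return x
end

x = [1.0, 2.0, 4.0]      # wavelengths in micrometers
micrometers_to_wavenumbers!(x)
```

After the call `x` holds 10000.0, 5000.0 and 2500.0. Ascending wavelengths become descending wavenumbers, so the data may need re-sorting after such a conversion. A floating-point vector is used because an integer vector cannot store non-integer results in place.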

JCAMPDXir.yconvert! - Method
yconvert!(y::AbstractArray,input_units::String,output_units::String)

Converts the values of y from input_units to output_units. Supported y-unit names: ABSORBANCE, A, T, A.U., TRANSMITTANCE, R, KUBELKA-MUNK, ARBITRARY UNITS, REFLECTANCE. All units can be written in either lowercase or uppercase; T, R and A are shorthand for TRANSMITTANCE, REFLECTANCE and ABSORBANCE.

Example

julia> yconvert!([0.2,0.5,0.8],"R","KUBELKA-MUNK")
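
Two textbook relations that such y conversions are usually based on are the Kubelka-Munk function F(R) = (1 - R)^2 / (2R) and the absorbance A = -log10(T). The sketch below assumes that R and T are given as fractions between 0 and 1. The helpers `kubelka_munk` and `absorbance` are hypothetical, and the scaling conventions used by `yconvert!` are not confirmed here.

```julia
# Hypothetical helpers (illustration only); R and T are fractions in (0, 1].
kubelka_munk(R) = (1 - R)^2 / (2R)     # Kubelka-Munk function F(R)
absorbance(T) = -log10(T)              # decadic absorbance from transmittance

km = kubelka_munk.([0.5, 1.0])
a = absorbance.([1.0, 0.1, 0.01])
```

A perfectly reflecting sample (R = 1) gives F(R) = 0, and each tenfold drop in transmittance adds one absorbance unit.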