Functions of JCAMPDXir
JCAMPDXir.JCAMPDXir — Module
The JCAMP-DX (infrared) file format was developed for the exchange of infrared spectra between different laboratories. For a general description of the format, refer to the IUPAC.ORG pdf-file. In addition to the spectra themselves, the file also stores metadata containing information about the units of measurement and the conditions under which the spectra were acquired.
JCAMP file content example:
##TITLE=1
##JCAMP-DX=4.24
##DATATYPE=INFRARED SPECTRUM
##XUNITS=1/CM
##YUNITS=TRANSMITTANCE
##YFACTOR=0.00699183
##FIRSTY=11746.3412893072
##NPOINTS=16384
##XYDATA=(X++(Y..Y))
0 1680010 821286 2148133 1505245 1537124 1367661 1147725 1134981
7.71603 1166853 1213186 1029828 1067595 1135904 1195128 1157134 1150556
15.4321 1266743 1164401 1014224 1022338 999780 1138781 1208950 1161258
.
.
##END=
According to the JCAMP specification, there are several formats for data compression:
- simple integer: 7.71603 1166853 5213186 -1029828 -1067595 (y-data decoded as integer numbers)
- PAC: 7.71603+1166853+5213186-1029828-1067595 (signs are used as delimiters)
- SQZ: 7.71603A166853E213186a029828a067595 (the sign and the leading digit of each value are replaced by a single character according to [`SQZ_digits`](@ref))
- DIF: 7.71603 1166853 M046333 o243014 l7767 (every data chunk except the first one represents the relative shift of the value with respect to the previous one)
- DUP: in this mode, all duplicated y-values are replaced with a single letter according to [`DUP_digits`](@ref); these letters show the number of duplications. DUP mode can be combined with DIF mode.
Core.Type — Method
(::Type{SQZ})(s::AbstractString)
All types are callable on a string and use string conversion to add delimiters between the data chunks. When the line decoding is Unspecified_Line, there is an additional operation before converting the string to determine the line type at run time.
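To make the compressed forms listed above concrete, here is a minimal, self-contained sketch that decodes the y-part of the example lines (the text after the x-value). It assumes the standard JCAMP-DX character tables (SQZ: `@` is 0, `A`-`I` are +1..+9, `a`-`i` are -1..-9; DIF: `%` is 0, `J`-`R` are +1..+9, `j`-`r` are -1..-9). The names `prefix`, `sqz_prefix`, `dif_prefix` and `decode_y` are hypothetical and are not part of JCAMPDXir; DUP handling is left out.

```julia
# Illustrative only: hypothetical helpers, not the package parser.

# Sign and leading digit encoded by one character, returned as a string prefix.
function prefix(c::Char, z::Char, pos::Char, neg::Char)
    c == z && return "0"
    pos <= c <= pos + 8 && return string(c - pos + 1)      # e.g. 'A' => "1"
    neg <= c <= neg + 8 && return string(-(c - neg + 1))   # e.g. 'a' => "-1"
    return nothing
end

sqz_prefix(c) = prefix(c, '@', 'A', 'a')   # chunk holds an absolute value
dif_prefix(c) = prefix(c, '%', 'J', 'j')   # chunk holds a difference

# Decodes plain-integer, PAC, SQZ and DIF chunks into a vector of integers.
function decode_y(s::AbstractString)
    ys = Int[]
    for m in eachmatch(r"[@A-Ia-i%J-Rj-r][0-9]*|[+-]?[0-9]+", s)
        tok = m.match
        sq = sqz_prefix(tok[1])
        df = dif_prefix(tok[1])
        if sq !== nothing
            push!(ys, parse(Int, sq * tok[2:end]))
        elseif df !== nothing
            # a difference is added to the previously decoded value
            push!(ys, ys[end] + parse(Int, df * tok[2:end]))
        else
            push!(ys, parse(Int, tok))
        end
    end
    return ys
end

plain = decode_y("1166853 5213186 -1029828 -1067595")
pac   = decode_y("+1166853+5213186-1029828-1067595")
sqz   = decode_y("A166853E213186a029828a067595")
dif   = decode_y("1166853 M046333 o243014 l7767")
```

All four strings are the y-parts of the examples in the list, and under the assumed tables they decode to the same four integers.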
Core.Type — Method
(::Type{T})(s::AbstractString) where T<:Decoding
By default, all chunk and line decodings do nothing to the string line
JCAMPDXir.DATAline — Type
The DATAline type specifies the data organization pattern, viz. XY...XY or XY...Y
JCAMPDXir.DataBuffer — Type
DataBuffer is an intermediate container for data parsed from each string
JCAMPDXir.Decoding — Type
Types for decoding of both lines and chunks; line and chunk types are used to dispatch during parsing
JCAMPDXir.JDXblock — Type
Stores parsed data. Must be filled using the read! function
JCAMPDXir.JDXblock — Method
JDXblock(file_name::String)
Creates an empty JDXblock object from the full file name
JCAMPDXir.LineDecoding — Type
Types for decoding line and chunk strings
JCAMPDXir.No_Line_Decoding — Type
Used when there is no line decoding at all
JCAMPDXir.Unspecified_Line — Type
When the Unspecified_Line type is used, the parser looks for the actual line type for each line
JCAMPDXir.ValidationPoint — Type
Stores validation points
JCAMPDXir.addline! — Method
addline!(jdx::JDXblock,
data_buffer::DataBuffer{DataLineType,B,LD,ChunkDecoding},
current_line::String; # index of current data chunk
delimiter=DEFAULT_DELIMITER(DataLineType),
validate_data::Bool=true) where {DataLineType<:XYYline,B,LD,ChunkDecoding}
This function parses the current_line string of the file, fills the data buffer data_buffer by calling fill_data_buffer!, and copies the buffer content to the x- and y-vectors of the jdx object.
- JDXblock - the number of data points (excluding the x-coordinate) in the line
- delimiter - data points delimiter used in the split function
- validate_data - if false, the parser ignores all data validations
JCAMPDXir.addline! — Method
addline!(jdx::JDXblock,
data_buffer::DataBuffer{DataLineType},
current_line::String; # index of current data chunk
delimiter=r"[,;]",
validate_data::Bool=true) where DataLineType<:XYXYline
Adds a line to XY...XY data
JCAMPDXir.check_data_point! — Method
check_data_point!(dv::DataValidation,
value_checker::T,
value_real::T,
field::Symbol,
point_index::Int=0,
line_index::Int=0) where T
Function used to check the value. Input args:
- dv - [`DataValidation`](@ref) object
- value_checker - checker value (Usually this value is taken from file headers)
- value_real - value parsed from file
- field must be :x , :y (for data vector validation) or one of [`VIOLATION_CRITERIA`](@ref) keys
- point_index - index of data point (used to form the message if the validation fails)
- line_index - index of line in file (used to form the message if the validation fails)
Checks whether value_checker ≈ value_real is fulfilled
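As a rough illustration of this kind of check, the sketch below compares the FIRSTY header from the JCAMP file example above with the first y-value of the data table scaled by YFACTOR. It uses Base's `isapprox` (the `≈` operator) only; the tolerance applied by the package is not stated here, so `rtol = 1e-6` is purely an assumption.

```julia
# Header values and first y-integer taken from the JCAMP file example above.
yfactor = 0.00699183
firsty  = 11746.3412893072
y_raw   = 1680010

y_scaled = y_raw * yfactor     # value reconstructed from the data line

# The default `≈` uses a relative tolerance of sqrt(eps(Float64)), about 1.5e-8,
# which is stricter than the precision of the rounded YFACTOR header.
strict_ok = y_scaled ≈ firsty

# Assumed, looser tolerance (not the package's documented criterion).
loose_ok = isapprox(y_scaled, firsty; rtol = 1e-6)
```

With these numbers the strict comparison fails and the looser one passes, which shows why a header check of this sort depends on the tolerance chosen.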
JCAMPDXir.convert! — Method
convert!(x::AbstractArray,::sUnits{T},::sUnits{T}) where T
This function can be used to convert y and x units
supported x-units names: NANOMETERS,NM,CM-1,MKM,MICROMETERS,1/CM,WAVELENGTH (NM)),CM^-1,WAVELENGTH (UM))
supported y-units names: ABSORBANCE,A,T,A.U.,TRANSMITTANCE,R,KUBELKA-MUNK,ARBITRARY UNITS,REFLECTANCE
Example
julia> convert!([1.0,2.0,3.0],xUnits("MKM"),xUnits("1/cm"))
JCAMPDXir.fill_data_buffer! — Method
fill_data_buffer!(data_buffer::DataBuffer{D,B,LineDecodingType,ChunkType},
current_line,
delimiter,
chunk_counter=1) where {D,B,
LineDecodingType,
ChunkType}
Fills the data buffer data_buffer from current_line. In current_line, all data chunks should be separated by the delimiter. Returns the number of data chunks added to the buffer
JCAMPDXir.find_blocks — Method
find_blocks(file_name)
Counts blocks in the file file_name, fills the coordinates of each block's start and finish in file lines, and returns a vector of JDXblock blocks or a single block.
JCAMPDXir.generateVectors! — Method
generateVectors!(::JDXblock,::Type{<:DATAline})
Generates the x-vector (XY...Y data format) and precreates y-vectors
JCAMPDXir.get_line_decoding — Method
get_line_decoding(s::AbstractString)
Returns the line type by searching for specific symbols
JCAMPDXir.has_violations — Method
has_violations(dv::DataValidation)
Checks if there are any data violations
JCAMPDXir.parse_chunk — Function
parse_chunk(::Type{<:ChunkDecoding},s::AbstractString)
Functions to parse a data chunk
JCAMPDXir.parse_headers! — Method
parse_headers!(jdx::JDXblock,headers::Vector{String})
Internal function; fills the headers dictionary from a vector of strings
JCAMPDXir.parse_headers — Method
parse_headers(file::String)
Parses headers from a JCAMP-DX file, returns a dictionary with file headers
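For orientation, a header line such as `##YFACTOR=0.00699183` can be split into a key and a value along the following lines. This is only a sketch of the general idea: `split_header` is a hypothetical name, and the rule of trying a numeric parse first is an assumption, not a description of how parse_headers is implemented.

```julia
# Hypothetical helper: splits "##KEY=value" into a Pair, or returns nothing
# when the line is not a header (for example a data line).
function split_header(line::AbstractString)
    m = match(r"^##([^=]+)=(.*)$", line)
    m === nothing && return nothing
    key = String(strip(m.captures[1]))
    raw = strip(m.captures[2])
    num = tryparse(Float64, raw)            # numeric headers become numbers
    return key => (num === nothing ? String(raw) : num)
end

lines = ["##XUNITS=1/CM", "##YFACTOR=0.00699183", "0 1680010 821286"]

# Collect only the lines that look like headers into a dictionary.
headers = Dict{String,Any}()
for l in lines
    p = split_header(l)
    p === nothing || push!(headers, p)
end
```

Note that with this simple rule a purely numeric title such as `##TITLE=1` would also be stored as a number.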
JCAMPDXir.prepare_jdx_data — Function
prepare_jdx_data(x::Vector{Float64},y::Vector{Float64},x_units::String="1/CM",
y_units::String="TRANSMITTANCE"; kwargs...)
Prepares data to be written using write_jdx_file
JCAMPDXir.read! — Method
read!(jdx::JDXblock; delimiter=nothing,
only_headers::Bool=false,
fixed_columns_number::Bool=true,
fixed_line_decoding::Bool = false,
fixed_chunk_decoding::Bool = false,
validate_data::Bool = true)
Fills a precreated JDXblock object (see JDXblock).
`delimiter` - data chunk delimiter
`only_headers` - if true, parses only block headers
`fixed_columns_number` - if true, all lines are supposed to have the same number of chunks
`fixed_line_decoding` - if true, data line decoding (No_Line_Decoding, SQZ, PAC) is supposed to be the same for all lines (the parser obtains the decoding from the first line of data)
`fixed_chunk_decoding` - if true, data chunk decoding (No_Chunk_Decoding, DIF, DUP) is supposed to be the same for all lines (the parser obtains the decoding from the first line of data)
`validate_data` - if false, ignores all data validations specified by the JCAMP format
JCAMPDXir.read_jdx_file — Method
read_jdx_file(file_name::String;
fixed_columns_number::Bool=false,
delimiter = nothing,
fixed_line_decoding::Bool = false,
fixed_chunk_decoding::Bool = false,
validate_data::Bool=true)
Reads the JCAMP-DX file file_name
Input arguments:
`file_name` - full file name
(optional keyword args)
`fixed_columns_number` - if it is known that each line in the file has the same number of data chunks (coded numbers), this flag can be set to true
`delimiter` - data chunks delimiter (default value is space)
`fixed_line_decoding` - if `true`, the line decoding type (PAC, SQZ or no line decoding) is taken only once from the first line of data; otherwise a new type is obtained for each line
`fixed_chunk_decoding` - if `true`, the chunk decoding type (DIF, DUP, mixed DIFDUP or no chunk decoding) is taken from the first line of data; otherwise a new type is obtained for each line
`validate_data` - turns on internal data checks
Setting these flags usually speeds up data loading, for example by reducing allocations. If file loading speed is not important, the optional flags can be left at their defaults.
The output is a named tuple (or a vector of named tuples in the case of multiple blocks)
with fields:
x - coordinate (wavelength, wavenumber or other)
y - data
headers - dictionary in "String => value" format with headers values, values can be both numbers and strings
data_validation - structure of [`DataValidation`](@ref) type
JCAMPDXir.write_jdx_file — Function
write_jdx_file(file_name,x::Vector{Float64},y::Vector{Float64},x_units::String="1/CM",
y_units::String="TRANSMITTANCE"; kwargs...)
Saves an infrared spectrum given as two vectors x and y to the JCAMP-DX file file_name. The input vectors should be of the same size. Currently the package supports only the (X++(Y..Y)) table data format. JCAMP-DX 4.24 demands 80 symbols per file line of Y-data and 88 total symbols per line. The y-vector is stored in eight columns, thus the total number of points should be a multiple of eight. The last mod(length(y),8) points of the y data will not be written to the file. It is preferable that all x-data is sorted in ascending order and spaced uniformly. If this is not the case, or if units conversion is involved, the function will automatically interpolate and sort the data on a uniformly spaced grid.
x_units - units of x data, must be one of NANOMETERS,NM,CM-1,MKM,MICROMETERS,1/CM,WAVELENGTH (NM)),CM^-1,WAVELENGTH (UM))
y_units - units of y data, must be one of ABSORBANCE,A,T,A.U.,TRANSMITTANCE,R,KUBELKA-MUNK,ARBITRARY UNITS,REFLECTANCE
Any further keyword arguments can be provided; all of them will be written to the head of the file. All keyword arguments appear in the file in uppercase.
The most important keywords are
TITLE - the title of the file (it is always on top of the file)
XUNITS - x data units saved to file, must be one of NANOMETERS,NM,CM-1,MKM,MICROMETERS,1/CM,WAVELENGTH (NM)),CM^-1,WAVELENGTH (UM))
YUNITS - y data units saved to file, must be one of ABSORBANCE,A,T,A.U.,TRANSMITTANCE,R,KUBELKA-MUNK,ARBITRARY UNITS,REFLECTANCE
If x_units (the function's fourth argument) is not equal to the keyword argument XUNITS, then the function converts x-values before saving to file (see xconvert!)
If y_units (the function's fifth argument) is not equal to the keyword argument YUNITS, then the function converts y-values before saving to file (see yconvert!)
Example
julia> using JCAMPDXir
julia> filename = joinpath(@__DIR__,"test.jdx")
julia> write_jdx_file(filename,[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0],rand(8),"MKM","T",title = "new file",XUNITS="1/CM",YUNITS="KUBELKA-MUNK")
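The eight-column rule described for write_jdx_file can be checked before writing. The sketch below only illustrates the arithmetic of that rule (how many trailing points would be skipped and how the remaining points group into rows); it does not reproduce the scaling or formatting done by write_jdx_file, and the variable names are illustrative.

```julia
y = collect(1.0:19.0)                 # 19 points: not a multiple of eight

n_dropped = mod(length(y), 8)         # trailing points that would not be written
n_written = length(y) - n_dropped     # points that fit into full rows

# Eight y-values per data line, as in the (X++(Y..Y)) layout.
rows = [y[i:i+7] for i in 1:8:n_written]
```

If losing the trailing points matters, the input can be trimmed or resampled to a multiple of eight before calling the writer.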
JCAMPDXir.write_jdx_file — Method
write_jdx_file(file_name,jdx::JDXblock; kwargs...)
Writes a JDXblock object to file
JCAMPDXir.write_jdx_file — Method
write_jdx_file(jdx::JDXblock; kwargs...)
Writes a JDXblock object to file
JCAMPDXir.xconvert! — Method
xconvert!(x::AbstractArray,input_units::String,output_units::String)
Converts the values of `x` from `input_units` to `output_units`.
supported x-units names: NANOMETERS,NM,CM-1,MKM,MICROMETERS,1/CM,WAVELENGTH (NM)),CM^-1,WAVELENGTH (UM))
Example
julia> xconvert!([1.0,2.0,3.0],"MKM","1/cm")
JCAMPDXir.yconvert! — Method
yconvert!(y::AbstractArray,input_units::String,output_units::String)
Converts the values of y from input_units to output_units.
supported y-units names: ABSORBANCE,A,T,A.U.,TRANSMITTANCE,R,KUBELKA-MUNK,ARBITRARY UNITS,REFLECTANCE
All units can be written both in lower- and in uppercase; T, R and A are shorthand for TRANSMITTANCE, REFLECTANCE and ABSORBANCE
Example
julia> yconvert!([0.2,0.5,0.8],"R","KUBELKA-MUNK")
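For reference, the textbook relations behind conversions of this kind can be written out directly. The sketch below is not the package's implementation: the function names are hypothetical, and it assumes wavelengths in micrometers (MKM) and reflectance or transmittance given as fractions between 0 and 1. Whether the package expects fractions or percent is not stated here.

```julia
# Hypothetical stand-alone helpers (not JCAMPDXir functions).
um_to_wavenumber(wl) = 1.0e4 / wl          # micrometers -> 1/cm
kubelka_munk(R) = (1 - R)^2 / (2R)         # reflectance -> Kubelka-Munk function
absorbance(T) = -log10(T)                  # transmittance -> absorbance

x  = [2.5, 5.0, 10.0]                      # wavelengths in micrometers
nu = um_to_wavenumber.(x)                  # wavenumbers in 1/cm

km = kubelka_munk.([0.2, 0.5, 0.8])        # same inputs as the example above
a  = absorbance.([1.0, 0.1])
```

Unlike xconvert! and yconvert!, these helpers return new arrays instead of modifying their input in place.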