Database.jl
This module implements tools for working with EEG databases, in particular BCI databases in NY format; see the BCI Databases Overview.
To learn how to use BCI databases, see Tutorial ML 1.
Methods
| Function | Description |
|---|---|
| Eegle.Database.infoDB | immutable structure holding the information summarizing an EEG database |
| Eegle.Database.loadNYdb | return a list of .npz files in a directory (this is considered a database) |
| Eegle.Database.infoNYdb | print, save and return metadata about a database |
| Eegle.Database.selectDB | select databases and sessions based on inclusion criteria |
| Eegle.Database.weightsDB | get weights for each session of a database for statistical analysis |
📖
Eegle.Database.infoDB — Type
struct infoDB
dbName :: String
condition :: String
paradigm :: String
files :: Vector{String}
nSessions :: Vector{Int}
nTrials :: Dict{String, Vector{Int}}
nSubjects :: Int
nSensors :: Int
sensors :: Vector{String}
sensorType :: String
nClasses :: Int
cLabels :: Vector{String}
sr :: Int
wl :: Int
offset :: Int
filter :: String
doi :: String
hardware :: String
software :: String
reference :: String
ground :: String
place :: String
investigators :: String
repository :: String
description :: String
timestamp :: Int
formatVersion :: String
end
Immutable structure holding the summary information and metadata of an EEG database (DB) in NY format.
It is created by functions infoNYdb and selectDB.
Fields
- files: list of .npz files, each corresponding to a session in the database. The length of files is equal to the total number of sessions
- nSessions: vector holding the number of sessions per subject
- nTrials: a dictionary mapping each class label to a vector containing the number of trials per session for that class. For example, nTrials["left_hand"] returns a vector with the number of trials for "left_hand" across all sessions.
The following fields are assumed constant across all sessions of the database. This is checked by Eegle when a database is read.
- dbName: name or identifier of the database
- condition: experimental condition under which the DB has been recorded
- paradigm: for BCI data, this may be :P300, :ERP or :MI; see BCI paradigm
- nSubjects: total number of subjects composing the DB; see subject
- nSensors: number of sensors composing the recordings (e.g., EEG electrodes)
- sensors: list of sensor labels (e.g., [Fz, Cz, ..., Oz])
- sensorType: type of sensors (wet, dry, Ag/AgCl, ...)
- nClasses: number of classes for which labels are available
- cLabels: list of class labels
- sr: sampling rate of the recordings (in samples per second)
- wl: for BCI, this is the duration of trials (in samples)
- offset: shift to be applied to markers in order to determine the trial onset (in samples)
- filter: temporal filter that has been applied to the data
- hardware: equipment used to obtain the recordings (typically, the EEG amplifier)
- software: software used to obtain the recordings
- reference: label of the reference electrode for EEG differential amplifiers
- ground: label of the electrical ground electrode
- doi: digital object identifier (DOI) of the database
- place: place where the recordings have been obtained
- investigators: investigator(s) who obtained the recordings
- repository: public repository where the DB has been made accessible
- description: general description of the DB
- timestamp: date of the publication of the DB
- formatVersion: version of the NY format in which the recordings have been stored.
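The relations among files, nSessions and nTrials described above can be sketched with a stand-in object. The NamedTuple below is not an Eegle infoDB; its file names and trial counts are made up for illustration, and the consistency checks are only what the field descriptions seem to imply.

```julia
# Hypothetical stand-in for an infoDB (a plain NamedTuple with invented values).
db = (
    files     = ["s01_ses1.npz", "s01_ses2.npz", "s02_ses1.npz"],
    nSubjects = 2,
    nSessions = [2, 1],                            # sessions per subject
    nTrials   = Dict("left_hand"  => [40, 38, 42], # trials per session, by class
                     "right_hand" => [40, 41, 39]),
    cLabels   = ["left_hand", "right_hand"],
    nClasses  = 2,
)

totalSessions = length(db.files)

# One entry per subject, summing to the total number of sessions.
consistentSessions = length(db.nSessions) == db.nSubjects &&
                     sum(db.nSessions) == totalSessions

# One trial count per session for every class label.
consistentTrials = all(length(db.nTrials[c]) == totalSessions for c in db.cLabels)

# Total number of trials in each session, summed over classes.
trialsPerSession = [sum(db.nTrials[c][i] for c in db.cLabels) for i in 1:totalSessions]
```

With a real infoDB returned by infoNYdb or selectDB, the same field names would be accessed in the same way.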
Eegle.Database.loadNYdb — Function
function loadNYdb(dbDir::AbstractString; isin::String="")
Return a list of the complete paths of all .npz files found in the directory given as argument dbDir. For each NPZ file, there must be a corresponding YAML metadata file with the same name and extension .yml; otherwise the file is not included in the list.
If a string is provided as kwarg isin, only the files whose name contains the string will be included.
See Also
infoNYdb, FileSystem.getFilesInDir
Examples xxx
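The file selection rule described for loadNYdb can be illustrated with a minimal sketch built only on Julia's standard filesystem functions. The function name listNYfiles and the demo file names are hypothetical; this is a re-creation of the documented behavior under those assumptions, not Eegle's implementation.

```julia
# Illustrative only: list .npz files that have a matching .yml file,
# optionally keeping only names that contain the string `isin`.
function listNYfiles(dbDir::AbstractString; isin::String="")
    paths = String[]
    for name in readdir(dbDir)                       # readdir returns sorted names
        base, ext = splitext(name)
        ext == ".npz" || continue                    # data files only
        isfile(joinpath(dbDir, base * ".yml")) || continue  # YAML metadata required
        isempty(isin) || occursin(isin, name) || continue   # optional name filter
        push!(paths, joinpath(dbDir, name))
    end
    return paths
end

# Try it on a throwaway directory with made-up file names.
demoDir = mktempdir()
for f in ("subject_01_session_01.npz", "subject_01_session_01.yml",
          "subject_02_session_01.npz", "subject_02_session_01.yml",
          "subject_03_session_01.npz")               # no .yml: will be skipped
    touch(joinpath(demoDir, f))
end

allFiles = listNYfiles(demoDir)
onlyS02  = listNYfiles(demoDir; isin="subject_02")
```

Here allFiles holds the two sessions that come with metadata, and onlyS02 holds the single file whose name contains "subject_02".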
Eegle.Database.infoNYdb — Function
Eegle.Database.selectDB — Function
function selectDB(rootDir :: String,
                  paradigm :: Symbol;
                  classes :: Union{Vector{String}, Nothing} =
                      paradigm == :P300 ? ["target", "nontarget"] : nothing,
                  minTrials :: Union{Int, Nothing} = nothing,
                  summarize :: Bool = true)
Select BCI databases pertaining to the given BCI paradigm. Optionally, each session of the selected databases is scrutinized to meet the provided inclusion criteria.
Return the selected databases as a list of infoDB structures, wherein, if inclusion criteria are provided, the infoDB.files field lists the included sessions only.
Arguments
- rootDir: the directory on the local computer where to start the search. Any folder in this directory is a candidate database to be selected
- paradigm: the BCI paradigm to be used. Supported paradigms at this time are: :P300, :ERP or :MI.
If a folder with the same name as the paradigm (for example: "MI") is found in rootDir, the search starts therein and not in rootDir.
Optional Keyword Arguments
- classes: the labels of the classes the databases must include:
  - for the P300 paradigm the default classes are ["target", "nontarget"], as in the FII corpus
  - for the MI and ERP paradigms there is no inclusion criterion based on class labels by default.
  In the FII corpus, available MI class labels are: "left_hand", "right_hand", "feet", "rest", "both_hands", and "tongue".
- minTrials: the minimum number of trials for all classes in the sessions to be included
- summarize: if true (default) a summary table of the selected databases is printed in the REPL.
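How the classes and minTrials criteria could act on individual sessions can be sketched as follows. The helper sessionsMeeting is hypothetical and not part of Eegle. It assumes that trial counts are laid out like the nTrials field of infoDB (class label mapped to trials per session), that a class counts as present in a session when it has at least one trial, and that minTrials applies to the requested classes (or to all classes when none are requested).

```julia
# Hypothetical helper: return the indices of the sessions that satisfy
# the inclusion criteria. Not Eegle's code.
function sessionsMeeting(nTrials::AbstractDict;
                         classes::Union{Vector{String}, Nothing} = nothing,
                         minTrials::Union{Int, Nothing} = nothing)
    nSess  = length(first(values(nTrials)))
    labels = classes === nothing ? collect(keys(nTrials)) : classes
    keep   = Int[]
    for i in 1:nSess
        # number of trials of class c in session i (0 if the class is unknown)
        trials(c) = haskey(nTrials, c) ? nTrials[c][i] : 0
        hasClasses = classes === nothing || all(c -> trials(c) > 0, classes)
        hasEnough  = minTrials === nothing || all(c -> trials(c) >= minTrials, labels)
        hasClasses && hasEnough && push!(keep, i)
    end
    return keep
end

# Made-up trial counts for three sessions.
nTrials = Dict("left_hand"  => [60, 45, 80],
               "right_hand" => [60, 55, 0],
               "feet"       => [60, 50, 75])

noCriteria = sessionsMeeting(nTrials)
twoClasses = sessionsMeeting(nTrials; classes=["left_hand", "right_hand"])
strict     = sessionsMeeting(nTrials; classes=["left_hand", "right_hand"], minTrials=50)
```

In this toy data the third session lacks "right_hand" trials and the second has fewer than 50 "left_hand" trials, so the stricter the criteria, the fewer sessions remain.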
Examples
selectedDB = selectDB(.../directory_to_start_searching/, :P300)
selectedDB = selectDB(.../directory_to_start_searching/, :MI;
classes = ["left_hand", "right_hand"])
selectedDB = selectDB(.../directory_to_start_searching/, :MI;
classes = ["rest", "both_hands", "feet"],
minTrials = 50,
summarize = false)
Eegle.Database.weightsDB — Function
function weightsDB(files)
Given a database provided by argument files as a list of .npz files, compute a weight for each session to be used in statistical analysis when merging the classification performance or any other relevant index across databases.
The goal of the weighting is to balance the contribution of different databases and the different subjects therein, considering both the number of unique subjects in each database and the fact that the number of sessions for each subject may be different.
The weight assigned to each session is inversely proportional to the square root of the number of unique subjects in the database and to the square root of the number of sessions available for the same subject.
Let $s_m$ be one of the $S_m$ sessions of unique subject $m$. The weight $w_{m,s_m}$ for session $s_m$ is given by:
\[ w_{m,s_m} = \frac{\sqrt{M} \cdot \sqrt{S_m}}{N}\]
where $M$ is the number of unique subjects in the database and $N$ is the total number of sessions (i.e., length(files)).
This weighting ensures that the sum of the weights for each subject is proportional to
\[\sqrt{M} \cdot \sqrt{S_m}\]
For example,
- if the database has $M = 100$ subjects and each has 1 session ($N = 100$), each session has weight $\frac{\sqrt{100} \cdot \sqrt{1}}{100} = 0.1$ and the weights of the whole database sum to $100 \cdot 0.1 = 10$
- if each of the 100 subjects has 4 sessions ($N = 400$), each session has weight $\frac{\sqrt{100} \cdot \sqrt{4}}{400} = 0.05$ and the weights of the whole database sum to $400 \cdot 0.05 = 20$.
This is a compromise between two extreme strategies commonly used when merging indices across databases, which are both inadequate:
- Uniform per-session weights (i.e., all sessions contribute equally), which favors larger databases or those with many sessions
- Uniform per-database weights (i.e., all databases contribute equally), which overemphasizes small databases.
Once the weights for several databases have been obtained, they can be globally normalized in any desired way.
Return
- weights: a vector of length $N$, containing the weight for each session in files
- schedule: an $N × 2$ matrix of integers where:
  - the first column contains the index of the subject to which the session belongs
  - the second column contains the number of sessions for that subject.
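The weight formula and the two returned objects can be illustrated with a minimal sketch that applies $w_{m,s_m} = \sqrt{M} \cdot \sqrt{S_m} / N$ literally. The function sessionWeights is hypothetical and takes the subject index of each session directly as input; how Eegle obtains the subject of each session from the files is not covered here, and this is not Eegle's implementation.

```julia
# Literal application of w = sqrt(M) * sqrt(S_m) / N (illustrative, not Eegle's code).
# subjectOfSession[i] is the subject index of session i.
function sessionWeights(subjectOfSession::AbstractVector{<:Integer})
    N = length(subjectOfSession)                    # total number of sessions
    subjects = unique(subjectOfSession)
    M = length(subjects)                            # number of unique subjects
    # S[m]: number of sessions available for subject m
    S = Dict(m => count(==(m), subjectOfSession) for m in subjects)
    weights  = [sqrt(M) * sqrt(S[m]) / N for m in subjectOfSession]
    # column 1: subject index, column 2: number of sessions of that subject
    schedule = hcat(collect(subjectOfSession), [S[m] for m in subjectOfSession])
    return weights, schedule
end

w1, sched1 = sessionWeights(collect(1:100))            # 100 subjects, 1 session each
w4, sched4 = sessionWeights(repeat(1:100, inner=4))    # 100 subjects, 4 sessions each

total1 = sum(w1)    # weights summed over the whole database
total4 = sum(w4)
```

Under this reading of the formula, the weights of the first database sum to about 10 and those of the second to about 20, so a database with four times as many sessions counts twice as much rather than four times.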
Examples
w, schedule = weightsDB(files)
Tutorials xxx