Multidimensional Dictionary - Demo
File demonstrating some of the basic capabilities of MDD on a sample neuroscience dataset. For full instructions on using MDD, see tutorial_MDD.m
Contents
- Overview
- Set up paths and formatting
- Load a sample dataset and build an MDD object
- What is an MDD object and why use one?
- MDD subscripts and indexing
- Advanced indexing methods
- Running functions on MDD objects
- Plotting 3D data
- Modularization of plotting
- Merging two MDD objects
- Packing MDD dimensions (analogous to cell2mat)
- Unpacking MDD dimensions
- Next steps
- Summary so far
Overview
MDD is a MATLAB tool for managing the high-dimensional data that often arises in scientific data analysis. It can be thought of as a MATLAB cell array (or matrix) with some additional functionality.
Set up paths and formatting
% Format
format compact
format short g

% Check if in MDD folder
if ~exist(fullfile('.','data','sample_data.mat'), 'file')
    error('Should be in MDD folder to run this code.')
end

% Add MDD toolbox to Matlab path if needed
if ~exist('MDD','class')
    addpath(genpath(pwd));
end
Load a sample dataset and build an MDD object
% Load some sample simulated data
load('sample_data.mat');
load('sample_data_meta2.mat');
The file sample_data.mat contains dat, a 4-dimensional cell array:
whos dat
  Name      Size               Bytes  Class    Attributes
  dat       3x3x2x8         21633984  cell
Each cell in dat contains some neural time series data. For example
dat(1,1,2,1)
ans =
  1×1 cell array
    {1001×20 single}
Use this to build an MDD object.
% Construct MDD object
mdd = MDD(dat,axis_vals,axis_names);
Note that axis_vals and axis_names contain useful metadata, but their details are not important here.
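For orientation only, the sketch below shows what metadata of this kind could look like for a 3x3x2x8 cell array. The variable names example_names and example_vals are hypothetical, and the assumption that the metadata is organized as one cell entry per dimension is ours; the real axis_vals and axis_names are loaded from sample_data_meta2.mat and may be organized differently.

```matlab
% Hypothetical metadata for a 3x3x2x8 cell array (illustrative only;
% the real values are loaded from sample_data_meta2.mat).
example_names = {'param1', 'param2', 'populations', 'variables'};
example_vals  = {[0 10 20], [5 10 15], {'E','I'}, ...
                 {'v','iNa_m','iNa_h','iK_n','I_iGABAa_s','E_iAMPA_s', ...
                  'I_iGABAa_ISYN','E_iAMPA_ISYN'}};

% One entry per dimension; each entry holds one label per element
% along that dimension, so the label counts reproduce the array size.
example_size = cellfun(@numel, example_vals);
disp(example_size)
```

The point of the sketch is simply that each dimension carries a name and a list of labels whose length matches the size of that dimension.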
What is an MDD object and why use one?
When working with high-dimensional data, like the cell array dat, it's often difficult to keep track of what each dimension represents. MDD objects provide a way of organizing and manipulating this information.
Let's print a summary of the data contained within mdd:
% Print object summary
mdd.printAxisInfo
Axis Size: [3 3 2 8]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10, 15
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> v, iNa_m, iNa_h, iK_n, I_iGABAa_s, E_iAMPA_s, I_iGABAa_ISYN, E_iAMPA_ISYN
This tells us several things about our MDD object:
- It is a 4-dimensional object with mdd.size = [3,3,2,8]
- The four axes are titled: param1, param2, populations, and variables
- This also summarizes the values that each axis takes on, and the corresponding data type of those values (numeric or cellstr). For example, the 'populations' axis takes on values 'E' and 'I' (for excitatory and inhibitory cells, respectively).
An MDD object can be thought of in several ways:
- A MATLAB matrix or cell array that can be indexed by using strings and regular expressions
- An N-dimensional table (a table is equivalent to an MDD object in 2-dimensions)
- A map/dictionary that associates multiple keys with a single value
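The map/dictionary view can be illustrated with built-in MATLAB alone. The following is a plain-MATLAB analogy, not MDD code: it uses containers.Map with a composite string key built from one label per axis. The helper makeKey and the placeholder values are hypothetical.

```matlab
% Plain-MATLAB analogy (not MDD code): several keys map to one value.
lookup  = containers.Map('KeyType', 'char', 'ValueType', 'any');

% Hypothetical helper: join one label per axis into a single key.
makeKey = @(p1, p2, pop, var) sprintf('%g|%g|%s|%s', p1, p2, pop, var);

% Placeholder data standing in for the time series in each cell.
lookup(makeKey(0, 5, 'E', 'v')) = zeros(1001, 80, 'single');
lookup(makeKey(0, 5, 'I', 'v')) = zeros(1001, 20, 'single');

% Retrieve an entry by its axis labels rather than by position.
trace = lookup(makeKey(0, 5, 'I', 'v'));
disp(size(trace))
```

Unlike this flat map, an MDD object keeps the grid structure of the data, so whole ranges along an axis can be selected at once.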
MDD subscripts and indexing
MDD contains several options for how to index data. The most basic method is to just index MDD objects like you would index normal Matlab matrices and cells. For example:
% Basic indexing
mdd3D = mdd(:,:,1,2:3);
mdd3D.printAxisInfo
Axis Size: [3 3 1 2]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10, 15
Axis 3: populations (cellstr) -> E
Axis 4: variables (cellstr) -> iNa_m, iNa_h
This selects everything from dimensions 1 and 2, the 1st element from dimension 3, and the 2nd and 3rd elements from dimension 4. Another way to do this, however, is to reference the axis values directly. For example:
% Indexing by string query
mdd3D = mdd(:,:,'E','iNa');
mdd3D.printAxisInfo
Axis Size: [3 3 1 2]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10, 15
Axis 3: populations (cellstr) -> E
Axis 4: variables (cellstr) -> iNa_m, iNa_h
Note that substring matching was used here to select both iNa_m and iNa_h; an exact comparison such as Matlab's strcmp would not match 'iNa' against either name. Regular expressions can be used as well. Enclosing the query in "/" characters marks it as a regular expression. The following command picks out all variable names beginning with an "I".
% Indexing by regular expressions
mdd3D = mdd(:,:,'E','/^I/');
mdd3D.printAxisInfo
Axis Size: [3 3 1 2]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10, 15
Axis 3: populations (cellstr) -> E
Axis 4: variables (cellstr) -> I_iGABAa_s, I_iGABAa_ISYN
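To see why the two queries above select different labels, the matching can be reproduced on the axis labels with built-in MATLAB functions. This is only a sketch of the matching behaviour using strcmp, strfind, and regexp; it is not taken from MDD's source, and MDD's internal implementation may differ.

```matlab
labels = {'v','iNa_m','iNa_h','iK_n','I_iGABAa_s','E_iAMPA_s', ...
          'I_iGABAa_ISYN','E_iAMPA_ISYN'};

% Exact comparison: 'iNa' is not one of the labels, so nothing matches.
exactHits = strcmp(labels, 'iNa');

% Substring search: true wherever the label contains 'iNa'.
substrHits = ~cellfun(@isempty, strfind(labels, 'iNa'));

% Regular expression anchored at the start of the label (case-sensitive).
regexHits = ~cellfun(@isempty, regexp(labels, '^I', 'once'));

disp(labels(substrHits))
disp(labels(regexHits))
```

The substring search returns iNa_m and iNa_h, and the anchored regular expression returns I_iGABAa_s and I_iGABAa_ISYN, matching the axis summaries printed above.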
If we only want to specify the values for a single axis, we can use axisSubset. This indexes the 'variables' axis for all values containing 'iNa'.
% Indexing by axis-value pairs
mdd3D = mdd.axisSubset('variables', 'iNa');
mdd3D.printAxisInfo
Axis Size: [3 3 2 2]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10, 15
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> iNa_m, iNa_h
All standard forms of Matlab indexing work as well, including indexing by position, linear indexing, and logical indexing.
% Method 1 - Indexing by position
foo1 = mdd(1:3,3,2,8);

% Method 2 - Linear indexing
foo2 = mdd(142:144);          % Take the last 3 entries in the data.

% Method 3 - Logical indexing
ind = false(1,144);
ind(142:144) = true;
foo3 = mdd(ind);              % Using logical indexing also produces the same result.

disp(isequal(foo1,foo2,foo3));
clear foo1 foo2 foo3
1
The results of all three indexing methods above are the same. For more details about indexing methods in Matlab, see: https://www.mathworks.com/help/matlab/math/array-indexing.html
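The equivalence between the positional and linear forms can be checked with MATLAB's sub2ind on an array of the same size. This is a small self-contained sketch that does not require MDD.

```matlab
sz = [3 3 2 8];

% Linear indices of the subscripts (1:3, 3, 2, 8). The scalar subscripts
% are written out as vectors so that all inputs have the same length.
lin = sub2ind(sz, 1:3, [3 3 3], [2 2 2], [8 8 8]);
disp(lin)

% The equivalent logical mask over all prod(sz) = 144 elements.
mask = false(1, prod(sz));
mask(lin) = true;
```

Because MATLAB stores arrays in column-major order, these three positions are the last three of the 144 elements, which is why 142:144 selects the same entries.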
Advanced indexing methods
Finally, MDD supports inline queries using the valSubset method. Here are two examples.
% Select axis1 values greater than 5, and axis2 equal to 10
mdd3D = mdd.valSubset('>5','==10',:,:);
mdd3D.printAxisInfo
Axis Size: [2 1 2 8]
Axis 1: param1 (numeric) -> 10, 20
Axis 2: param2 (numeric) -> 10
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> v, iNa_m, iNa_h, iK_n, I_iGABAa_s, E_iAMPA_s, I_iGABAa_ISYN, E_iAMPA_ISYN
% Select axis1 values between 8 and 11, and axis2 values between 3 and 11
mdd3D = mdd.valSubset('8 < x < 11','3 < x <= 11',:,:);
mdd3D.printAxisInfo
Axis Size: [1 2 2 8]
Axis 1: param1 (numeric) -> 10
Axis 2: param2 (numeric) -> 5, 10
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> v, iNa_m, iNa_h, iK_n, I_iGABAa_s, E_iAMPA_s, I_iGABAa_ISYN, E_iAMPA_ISYN
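The range queries above correspond to ordinary logical masks on the numeric axis values. The sketch below reproduces the second selection in plain MATLAB; how valSubset parses its query strings is not shown here, so this illustrates only the intended result, not MDD's implementation.

```matlab
param1 = [0 10 20];
param2 = [5 10 15];

% '8 < x < 11' written as two explicit comparisons.
keep1 = param1 > 8 & param1 < 11;

% '3 < x <= 11' written the same way.
keep2 = param2 > 3 & param2 <= 11;

disp(param1(keep1))
disp(param2(keep2))

% Caution: the literal MATLAB expression below is evaluated left to
% right as (8 < param1) < 11, which is true for every element.
naive = 8 < param1 < 11;
```

This is one reason a string query such as '8 < x < 11' is convenient: the chained form cannot be written directly as a MATLAB expression with the same meaning.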
Running functions on MDD objects
The primary purpose of MDD is to make working with high-dimensional data more convenient. Traditionally, if you are working with an N-dimensional matrix/cell array, you would write a function that takes in that variable and knows what to do with each of the N dimensions (usually involving nested for loops). This function would not work on data with N-1 or N+1 dimensions.
MDD takes a different approach. With MDD, you specify function handles that operate on lower dimensional data (usually 1D, 2D, or 3D) and "assign" these to specific dimensions in your higher dimensional MDD object. The MDD method recursiveFunc is used for this. The examples below show how recursiveFunc can be used for plotting, but it is not limited to this.
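The idea of chaining low-dimensional handlers can be sketched without MDD. In the toy example below, each handler deals with one level of the data and delegates the rest to the handler it is given. All names (summarizeCell, summarizeRow, summarizeGrid) are hypothetical, and this is not how recursiveFunc is implemented; it only illustrates the delegation pattern.

```matlab
% Toy data: 3x2 cell array, each cell a 5x4 matrix filled with its
% own linear index k.
toy = cell(3, 2);
for k = 1:numel(toy)
    toy{k} = k * ones(5, 4);
end

% Innermost handler: works on the contents of a single cell.
summarizeCell = @(x) mean(x(:));

% Middle handler: works on a 1D slice of cells and delegates each
% cell to the inner handler it receives.
summarizeRow = @(cells, inner) cellfun(inner, cells);

% Outer handler: splits the 2D array by rows and delegates each row.
summarizeGrid = @(grid, mid, inner) ...
    cell2mat(arrayfun(@(r) mid(grid(r, :), inner), ...
                      (1:size(grid, 1))', 'UniformOutput', false));

result = summarizeGrid(toy, summarizeRow, summarizeCell);
disp(result)
```

Swapping summarizeCell for another handle changes what is computed per cell without touching the code that walks the outer dimensions, which is the kind of modularity recursiveFunc aims for.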
Plotting setup
This section can be safely ignored. Here I'm just adding a metadata structure to our mdd object, which will be used by some of the following plotting commands. This stores some additional metadata about the time series in mdd. I'm also defining a data structure defining how large to make the figures.
meta = struct;
meta.datainfo(1:2) = MDDAxis;
meta.datainfo(1).name = 'time(ms)';
meta.datainfo(1).values = time;
meta.datainfo(2).name = 'cells';
meta.datainfo(2).values = [];
mdd.meta = meta;
clear meta

op.figwidth=0.4;
op.figheight=0.3;
Plotting 3D data
The workhorse of performing functions on MDD objects is recursiveFunc. Here is an example of how it is used to plot our data.
First, we will pull out a 3D subset of the data. Our data consists of 9 simulations of a neural network, in the form of a 3x3 parameter sweep (axes 1 and 2). For each simulation, we have data for the two different types of neurons simulated, excitatory (E) and inhibitory (I) cells. We also have the variables associated with each neuron (for example, membrane voltage, ionic currents, etc.). Here, we will only look at the voltage of the neurons.
% Pull out a 3D MDD object with only membrane voltage (both E and I cells)
mdd3D = mdd(:,1:2,:,'v');
mdd3D.printAxisInfo
Axis Size: [3 2 2 1]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> v
Now, let's run the plot:
% Set up plotting arguments
function_handles = {@xp_handles_fignew,@xp_subplot_grid,@xp_matrix_basicplot};
dimensions = {{'populations'},{'param1','param2'},{'data'}};
function_arguments = {{op},{},{}};

% Run the plot. Note the "+" icons next to each plot allow zooming.
close all
mdd3D.recursiveFunc(function_handles,dimensions,function_arguments);
To break down what's going on here:
- function_handles - A cell array of function handles that specify what to do with specific dimensions of the data.
- dimensions - A cell array that tells which dimensions of the data should be assigned to which function handles.
- function_arguments - A cell array that tells what arguments to pass to each of the function handles. Here we pass op, which tells xp_handles_fignew how big to make the figure.
- There should be 1 entry in dimensions and function_arguments for every function handle supplied (i.e., all three cell arrays should have the same length).
In this example, the 'populations' dimension of our data (either E cells or I cells) is handled by the function xp_handles_fignew, which creates a new figure for each value along this dimension. The dimensions 'param1' and 'param2' are handled by xp_subplot_grid, which creates a grid of subplots for the parameter sweep. Note that xp_subplot_grid operates on 2D data. Finally, xp_matrix_basicplot creates the actual plots, operating on the time series data contained in mdd.data.
Here is the result:
Note that each of the functions passed to function_handles is essentially acting on lower-dimensional MDD objects. For example, xp_matrix_basicplot operates on 0D data (i.e., a single cell).
% Call xp_matrix_basicplot directly with 0D data
foo = mdd(1,1,2,1);
close all
figure; xp_matrix_basicplot(foo);
Similarly, xp_subplot_grid operates on 2D data, but it only sets up the subplots and cannot plot anything itself. It needs recursiveFunc to chain everything together.
Modularization of plotting
The advantage of recursiveFunc is that the function handles we use can be recycled and applied in different contexts. This allows us to work with data of different dimensionality and to slice the data in different ways. Below are two additional examples of this.
2D plot, sliced to visualize populations vs variables
Here we will use similar methods to look at our data in a different way. We will look at a single simulation (one pair of param values) and compare both E and I cells and their underlying variables.
% Pull out a 2D MDD object (variables axis is indexed using regular expressions)
mdd2D = mdd(2,2,:,'/v|iNa|iK/');
mdd2D.printAxisInfo

% Set up plotting arguments
function_handles = {@xp_subplot_grid,@xp_matrix_basicplot};
dimensions = {{'variables','populations'},{'data'}};
function_arguments = {{},{}};

% Run the plot
close all
figure; mdd2D.recursiveFunc(function_handles,dimensions,function_arguments);
Axis Size: [1 1 2 4]
Axis 1: param1 (numeric) -> 10
Axis 2: param2 (numeric) -> 10
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> v, iNa_m, iNa_h, iK_n
2D plot, sliced to visualize populations vs param1
Here we will repeat our above 3D plot, but instead group populations and param1 into the same set of subplots, and ignore param2. We also swapped in xp_matrix_imagesc for xp_matrix_basicplot, which represents the data using imagesc plots.
% Pull out a 2D MDD object
mdd2D = mdd(:,2,:,'v');
mdd2D.printAxisInfo

% Set up plotting arguments
function_handles = {@xp_subplot_grid,@xp_matrix_imagesc};
dimensions = {{'populations','param1'},{'data'}};
function_arguments = {{},{}};

% Run the plot
close all
figure; mdd2D.recursiveFunc(function_handles,dimensions,function_arguments);
Axis Size: [3 1 2 1]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 10
Axis 3: populations (cellstr) -> E, I
Axis 4: variables (cellstr) -> v
Merging two MDD objects
MDD also contains methods to perform operations on MDD objects. Here is an example of how two MDD objects can be merged together. For more details on methods like merge, see tutorial_MDD.m
% Slice the dataset one way
close all
mdd1 = mdd(2,:,'E','v');

% Slice it another way
mdd2 = mdd(:,3,'E','v');
Notice that mdd1 and mdd2 overlap at (2,3,1,1). Hence, when we merge, we will set forceMergeBool to true. Without this, merge will throw a warning.
% Merge with overwrite
mdd_merged = merge(mdd1,mdd2,true);
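The effect of merging two overlapping slices can be pictured with ordinary cell arrays. The sketch below is a plain-MATLAB analogy, not the merge method itself. Which input takes precedence at the overlap in MDD is not stated here; in this sketch the second slice wins, purely by assumption.

```matlab
% Two overlapping slices of a 3x3 grid, held in otherwise empty cell arrays.
sliceA = cell(3, 3);  sliceA(2, :) = {'A'};   % row 2
sliceB = cell(3, 3);  sliceB(:, 3) = {'B'};   % column 3

filledA = ~cellfun(@isempty, sliceA);
filledB = ~cellfun(@isempty, sliceB);
overlap = filledA & filledB;                  % true only at (2,3)

% Merge with overwrite: entries from sliceB replace those from sliceA
% wherever both are filled (an assumption made for this sketch).
merged = sliceA;
merged(filledB) = sliceB(filledB);
```

The merged grid has five filled positions (row 2 plus column 3, sharing one corner), and the remaining four stay empty.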
Now, plot the merged data
dimensions = {[1,2],0};
close all; figure; recursiveFunc(mdd_merged,{@xp_subplot_grid,@xp_matrix_imagesc},dimensions);
Compare to the original dataset
close all; figure; recursiveFunc(mdd(:,:,'E','v'),{@xp_subplot_grid,@xp_matrix_imagesc},dimensions);
Packing MDD dimensions (analogous to cell2mat)
MDD object properties include the raw data (usually in the form of a cell array) plus associated metadata. The raw data is an accessible property of mdd:
% Take a slice of mdd
mdd2 = mdd(:,:,:,1);

% View MDD data
mdd2.data
  3×3×2 cell array
ans(:,:,1) =
    {1001×80 single}    {1001×80 single}    {1001×80 single}
    {1001×80 single}    {1001×80 single}    {1001×80 single}
    {1001×80 single}    {1001×80 single}    {1001×80 single}
ans(:,:,2) =
    {1001×20 single}    {1001×20 single}    {1001×20 single}
    {1001×20 single}    {1001×20 single}    {1001×20 single}
    {1001×20 single}    {1001×20 single}    {1001×20 single}
There are several basic MDD operations for acting on data within an MDD object. packDim takes a dimension from the MDD object and pushes it into the internal data property. Here we will use it to pack the two cell populations (E and I) into mdd.data, so that they can be plotted simultaneously.
% First, let's average dimension 2 for every cell in mdd2.data. This
% averages all the traces together.
mdd2.data = cellfun(@(x) squeeze(mean(x,2)), mdd2.data, 'UniformOutput', false);
mdd2.data
  3×3×2 cell array
ans(:,:,1) =
    {1001×1 single}    {1001×1 single}    {1001×1 single}
    {1001×1 single}    {1001×1 single}    {1001×1 single}
    {1001×1 single}    {1001×1 single}    {1001×1 single}
ans(:,:,2) =
    {1001×1 single}    {1001×1 single}    {1001×1 single}
    {1001×1 single}    {1001×1 single}    {1001×1 single}
    {1001×1 single}    {1001×1 single}    {1001×1 single}
Now, pack the 'populations' dimension into mdd2.data.
% Pack populations into mdd2.data
mdd2 = mdd2.packDim('populations');
mdd2.printAxisInfo
Axis Size: [3 3 1 1]
Axis 1: param1 (numeric) -> 0, 10, 20
Axis 2: param2 (numeric) -> 5, 10, 15
Axis 3: Dim 3 (numeric) -> 1
Axis 4: variables (cellstr) -> v
The populations axis is no longer present in mdd2 (axis 3 now appears as the placeholder 'Dim 3'). Instead, mdd2.data contains this information: each cell is now 1001x2 instead of 1001x1.
% View the contents of mdd2.data
mdd2.data
ans =
  3×3 cell array
    {1001×2 single}    {1001×2 single}    {1001×2 single}
    {1001×2 single}    {1001×2 single}    {1001×2 single}
    {1001×2 single}    {1001×2 single}    {1001×2 single}
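The reshaping performed by packing can be mimicked on a plain cell array, which may help clarify the cell2mat analogy. The sketch below uses placeholder zeros in place of the averaged traces and is not the packDim implementation.

```matlab
% Toy stand-in for mdd2.data after averaging: 3x3x2 cells of 1001x1 vectors.
cells = repmat({zeros(1001, 1, 'single')}, [3 3 2]);

% Pack dimension 3 into the matrices: concatenate the two populations
% column-wise within each (param1, param2) position.
packed = cell(3, 3);
for i = 1:3
    for j = 1:3
        packed{i, j} = cat(2, cells{i, j, :});
    end
end

disp(size(packed{1, 1}))
```

The outer cell array loses its third dimension, and each cell gains a second column, mirroring the change from 1001x1 to 1001x2 shown above.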
Plot the result
close all
figure; mdd2.recursiveFunc({@xp_subplot_grid,@xp_matrix_basicplot},{{'param1','param2'},{'data'}});
Here, red traces are "I" cells and blue traces are "E" cells.
Unpacking MDD dimensions
Conversely, MDD dimensions can also be unpacked. Here, we will use the unpackDim method to pull out individual traces and select a few to examine more closely.
% Take a slice of mdd (variables axis is indexed using regular expressions)
mdd2 = mdd(2,2,'E','/v|iNa|iK/');

% Unpack the 2nd dimension from mdd.data (corresponding to individual
% traces) and create a new, 5th dimension of mdd to store this.
src = 2;
dest = 5;
mdd2 = mdd2.unpackDim(src, dest);
mdd2.printAxisInfo
Axis Size: [1 1 1 4 80]
Axis 1: param1 (numeric) -> 10
Axis 2: param2 (numeric) -> 10
Axis 3: populations (cellstr) -> E
Axis 4: variables (cellstr) -> v, iNa_m, iNa_h, iK_n
Axis 5: matrix_dim_2 (numeric) -> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
We now have a 5th dimension of mdd2 corresponding to each of the 80 traces that were originally in mdd2.data. Each of these traces corresponds to a single cell in the simulation, so we can look at these more closely. Let's select a few.
% Select a few cells. Note we are also renaming axis5.
mdd2.axis(5).name = 'Cell Number';
mdd2 = mdd2(:,:,:,:,[1,3,6,9]);
mdd2.printAxisInfo
For more details on how to rename axes, see tutorial_MDD.m
Axis Size: [1 1 1 4 4]
Axis 1: param1 (numeric) -> 10
Axis 2: param2 (numeric) -> 10
Axis 3: populations (cellstr) -> E
Axis 4: variables (cellstr) -> v, iNa_m, iNa_h, iK_n
Axis 5: Cell Number (numeric) -> 1, 3, 6, 9
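Unpacking is the reverse of packing: a dimension inside each matrix becomes a dimension of the container. A rough plain-MATLAB counterpart uses num2cell; this is only an analogy with placeholder data, not the unpackDim implementation.

```matlab
% Toy stand-in for one cell of mdd2.data: 1001 time points x 80 traces.
x = zeros(1001, 80, 'single');
x(:, 3) = 1;                      % mark trace 3 so it can be recognized later

% Unpack dimension 2: each column becomes its own 1001x1 cell.
traces = num2cell(x, 1);          % 1x80 cell array

% Keep a few traces, as when indexing along the new axis.
selected = traces([1 3 6 9]);
```

Here selected{2} is the marked trace 3, showing that the original trace numbers are preserved by position, much as the 'Cell Number' axis above keeps the values 1, 3, 6, 9.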
Finally, plot the result
% Plot using recursiveFunc
close all
figure; mdd2.recursiveFunc({@xp_subplot_grid,@xp_matrix_basicplot},{{'variables','Cell Number'},{'data'}});
Next steps
To start using MDD yourself, see `tutorial_MDD.m`. In particular, this will provide additional details on how to build your own MDD objects and advanced methods to manipulate them.
% Run the following
edit tutorial_MDD.m
Summary so far
At its core, MDD extends the way cells and matrices are indexed by allowing string labels to be assigned to each dimension of the cell array, similar to how row and column names are assigned to a table (e.g., in Pandas or SQL). MDD objects can then be indexed, sorted, merged, and manipulated according to these labels. (Section: MDD subscripts and indexing)
Additionally, MDD includes methods for performing operations on high dimensional data. The goal is to modularize the process of working with high dimensional data. Within MDD, functions designed to work on low dimensional data (1 or 2 dimensions) can each be assigned to operate on different dimensions of a higher dimensional object. Chaining several of these functions together allows the entire high dimensional object to be processed. The advantage of this modular approach is that functions can be easily assigned to other dimensions or swapped out entirely, without necessitating substantial code rewrites. (Section: Running functions on MDD objects)