
1 Software Architecture


3 Toolbox Components

Figure 1: Schematic overview of the database software.

An overview of the toolbox functionality is shown in Figure 1. In the figure, boxes represent objects that can be created with the toolbox. Flow starts from the dataset object on the top left, which represents the collection of raw data files. The raw data is loaded using information in the dataset to create intermediate objects that, for instance, contain data traces. These objects define electrophysiological measurements to be entered into the data matrix of the database object on the top right. The database object allows filtering and querying to refine its contents. From the database object, one can always go back to the dataset and find the raw data behind the rows returned by a query. The arrows going to the bottom objects and their corresponding plots show the types of analyses that can be done on a database object. These analyses typically display statistical information. The red arrow is a special analysis for searching and matching rows between different databases. The match is done by taking a row from a database created with data from real neurons and finding the best-matching model neurons in a simulation database.

The objects in the figure are instances of classes that define their properties in the object-oriented framework. Each class comes with a hierarchy of subclasses specialized for specific functions. Subsequent sections describe each of the class hierarchies that make up the main components of the toolbox.

3.1 Databases hold all the information

The database object is at the center of this toolbox (see Figure 1). It holds a data matrix with rows as observations and columns as attributes. The rows would normally correspond to results from individual data traces, or simply neurons. The columns hold values of separate measurements, statistical data, or parameter values.

Figure 2: Database class hierarchy.

A database object can be created from any of the classes in the hierarchy of Figure 2. The top-level database class is tests_db which contains a two-dimensional data matrix of real numbers and some metadata. The metadata consists of column labels (e.g., measure names), a dataset label, and data properties (e.g., time resolution). The subclasses are specialized for different tasks.
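The row and column layout described above can be pictured with a plain MATLAB matrix and a cell array of column labels. This is only a sketch of the idea and does not use the toolbox classes. The column names and values are made up for illustration.

```matlab
% Plain-MATLAB stand-in for a database: a numeric matrix plus column labels.
% All names and values here are hypothetical, not taken from the toolbox.
col_names = {'pAcip', 'spike_rate', 'spike_amp'};
data = [0 4.0 62.1; 100 12.5 60.3; 200 21.0 58.7];  % one row per observation

% Look up a column by its label instead of by position.
rate_col = find(strcmp(col_names, 'spike_rate'));

% Query: keep only the rows whose spike rate exceeds 10.
selected = data(data(:, rate_col) > 10, :);
```

Addressing columns by label rather than by position is what lets a query stay valid when columns are added or reordered.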

If the database object is created using a dataset object, this maintains a connection from the elements of the database (e.g., neurons) to the raw data. This allows raw data associated with database contents to be visualized during analysis. However, a database can be created from any data matrix given in the proper format.

Some specialized subclasses of tests_db are as follows:

params_tests_db
The first num_params columns are reserved for parameters that were changed between different rows. It contains methods that treat these columns specially. Parameters can be simulation parameters, or pharmacological applications to experiments.
tests_3D_db
Contains a three-dimensional data matrix that has an additional dimension for pages of information. This is mainly used to look at how measurements change with a parameter, using the invarParam method of params_tests_db. Three-dimensional databases can be useful for other purposes as well.
stats_db
Contains a few rows that describe statistics, possibly obtained from another database. It can contain the mean and standard deviation or error, or in some cases, the minimal and maximal values of columns in a database. It contains special plotting functions. There are methods that use the statistics collected by this class.
ranked_db
Contains distances that resulted from a comparison of a database with a criterion. Its rows are ranked and sorted according to this distance value. Each row points to a row index in the original database. Contains methods to generate reports from information about matching neurons.
spikes_db
Contains results from individual spike shapes of a trace object. It can be obtained using the trace/analyzeSpikesInPeriod method.
histogram_db
Each row corresponds to a histogram bin. Contains plotting methods.
corrcoefs_db
Each row corresponds to a correlation coefficient. Contains plotting methods.
cluster_db
Each row corresponds to a cluster centroid. Contains plotting methods.
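The ideas behind params_tests_db, stats_db, and ranked_db can be sketched with plain matrices. The following does not call any toolbox method. The data, the criterion row, and the distance measure (Euclidean distance on differences normalized by the standard deviation) are all assumptions chosen for illustration.

```matlab
% Made-up database: the first num_params columns are parameters,
% the remaining columns are measurements.
num_params = 1;
data = [0 4.0 62.0; 100 12.0 60.0; 200 20.0 58.0];
measures = data(:, num_params+1:end);

% Statistics rows, in the spirit of stats_db: mean and standard deviation.
stats = [mean(measures, 1); std(measures, 0, 1)];

% Rank rows by distance to a criterion row, in the spirit of ranked_db.
% Assumed distance: Euclidean norm of std-normalized differences.
criterion = [13 59];
num_rows = size(measures, 1);
diffs = (measures - repmat(criterion, num_rows, 1)) ./ ...
        repmat(stats(2, :), num_rows, 1);
distances = sqrt(sum(diffs .^ 2, 2));
[sorted_dist, row_index] = sort(distances);

% Each ranked row keeps its distance and an index into the original matrix.
ranked = [sorted_dist, row_index];
```

Keeping row_index alongside the distances is what allows a ranked row to be traced back to the original database, and from there to the raw data.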

3.2 Datasets create the databases

The dataset object is responsible for creating the database objects (see Figure 1). It defines where the raw data is stored and what parameters are used to load and analyze it. It knows which parameters are associated with individual raw data traces, and which measures will be generated and how. This information is used to automatically generate a database from the dataset. It also allows reaching back to the raw data from rows of an analyzed database.

Figure 3: Dataset class hierarchy.

Figure 3 shows the hierarchy for the dataset classes. The top-level dataset class is params_tests_dataset, which is an incomplete class. That is, this class defines general utilities that can work for a variety of dataset subclasses, but one cannot make an object from the params_tests_dataset class directly. Instead, one of its subclasses must be chosen and used. Some of these specialized subclasses are as follows:

params_tests_fileset
This class assumes each raw data item resides in a file and all of these files are in the same directory. The parameter names and values are obtained from each file name itself. This class is mostly useful for simulation filesets.
params_cip_trace_fileset
This is a subclass of params_tests_fileset and therefore inherits the notion of one file per data item. The files must come from current-pulse injection experiments and have a starting time and duration for the pulses. The pulse magnitude is read from the pAcip parameter. This class is mostly useful for simulation filesets.
physiol_cip_traceset
This is a subclass of params_tests_dataset. It is designed to load a set of physiology traces from a single file generated by the PCDX stimulation and acquisition software.
physiol_cip_traceset_fileset
This is a subclass of params_tests_dataset. It is designed to load traces from multiple PCDX data files. It uses the physiol_cip_traceset class for this purpose.
cip_traces_dataset, cip_traceset, cip_traceset_dataset
These are obsolete classes that allow loading physiology traces from older MATLAB formatted objects.
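To illustrate how parameter names and values might be recovered from a file name, as params_tests_fileset is described as doing, here is a minimal sketch. The naming convention (name_value pairs separated by underscores) and the file name itself are assumptions. The convention the toolbox actually expects is not specified here.

```matlab
% Hypothetical file name encoding three parameters as name_value pairs.
fname = 'NaF_100_KDr_2.5_pAcip_-100.bin';

% Drop the directory and extension, then split on underscores.
[~, base] = fileparts(fname);
parts = strsplit(base, '_');

% Odd-numbered entries are parameter names, even-numbered ones are values.
param_names = parts(1:2:end);
param_vals = str2double(parts(2:2:end));
```

The resulting values could then fill the leading parameter columns of a database row for that file.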

3.3 Bundling the database and dataset together

Figure 4: Bundle class hierarchy.

Since dataset and database objects are related and work together for some operations, it is convenient to have another object that bundles them together. There are several analysis routines that start from the database, retrieve raw data traces and other related information from the dataset and create a result. For instance, matching neurons from one database to another requires first comparing the measurements to find match candidates, and then comparing raw traces to visually represent the match quality.

The top-level dataset_db_bundle class in Figure 4 fulfills this purpose by bundling a dataset with the raw database, db, created from it, and with the reduced database, joined_db, that contains a one-row-per-neuron representation. Although it is a virtual class that cannot be instantiated, it contains general methods and prototype methods that must be implemented in subclasses. This way, it provides guidelines for defining subclasses. Its two subclasses provide specialized methods for model and physiology databases, respectively.

model_ct_bundle
Contains methods to name and visualize neurons in the model database. It has methods to compare real neurons to model neurons to find best matching candidates.
physiol_bundle
Contains methods to name and visualize neurons in the physiology database. It contains a new attribute, joined_control_db, that holds only the neurons recorded without any pharmacological treatments.

3.4 Wrapper classes hold raw data

Wrapper classes are designed to hold data and provide simple methods that operate on them. They can hold either raw data or intermediate processed forms of data that are byproducts of analysis routines. In the overall schema of Figure 1, the raw traces obtained from the dataset object are kept in data wrapper objects.

Figure 5: Data wrapper class hierarchy.

Figure 5 shows the hierarchy for the data wrapper classes. The most basic data wrapper class in this toolbox is the trace class, which holds raw voltage or current traces. The spikes object contains the spike times obtained by analyzing a trace object.

A data wrapper class does more than just hold the data. It defines a set of operations, in the form of methods, that work on the data held by the class. As a rule of thumb, if one needs to add new functionality to the toolbox, it should be added as a method of the class holding the data on which it operates.

Some of the data wrapper classes are as follows:

trace
Generic object that holds a vector of data that changes over time. It has a time resolution and y-axis resolution. Contains simple analysis routines such as finding average values within different periods, or finding spikes given a threshold.
cip_trace
A subclass of trace class for current-injection recording protocols. It defines an initial spontaneous period, followed by a current-injection period, and final recovery period. It contains period-specific analyses that apply to the experimental protocol.
spike_shape
A subclass of trace that holds the shape of a single spike. It contains spike shape measurements.
spikes
A generic class to hold the event times for spikes. It contains methods for making measurements based on spike times, such as rate and ISI calculations.
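The kind of analysis these wrappers provide (finding spikes from a threshold, then computing ISIs and rate from the spike times) can be sketched in plain MATLAB as follows. This is not the toolbox's own spike-finding algorithm. The synthetic trace, the threshold value, and the upward-crossing rule are assumptions for illustration.

```matlab
% Synthetic trace: 1 s at rest with five 1-ms rectangular "spikes".
dt = 1e-4;                               % time resolution [s]
v = -60 * ones(1, 10000);                % resting potential [mV]
spike_idx = [1000 3000 5000 7000 9000];  % sample index where each spike starts
for k = spike_idx
  v(k:k+9) = 20;
end

% Detect upward threshold crossings (assumed detection rule).
threshold = -20;                         % [mV]
above = double(v > threshold);
onsets = find(diff(above) == 1) + 1;     % first sample above threshold

% Measurements based on spike times.
spike_times = (onsets - 1) * dt;         % [s]
isi = diff(spike_times);                 % inter-spike intervals [s]
rate = numel(onsets) / (numel(v) * dt);  % mean firing rate [Hz]
```

In the toolbox, operations of this kind are methods of the object that holds the data, following the rule of thumb stated above.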

3.5 Profiles hold results of measurements

Figure 6: Profile class hierarchy.

Profile classes are designed to hold results of analysis and measurements on the data wrapper or database objects. The data and results are separated into different classes for added flexibility of saving data and results separately. Yet, the profiles normally keep a copy of the data wrapper object from which they obtained the measurements. The intention is to save the measurement results for possible visualization or later inspection, without having to repeat the analyses.

In Figure 6, the top-level results_profile class contains a simple MATLAB structure variable, results, that holds a set of name-value pairs. These are names of measurements and their corresponding values. Most of the subclasses are simplistic, and they exist only for organizational reasons. Some of them may implement specialized plotting methods that make use of the saved measurements. These subclasses can be briefly described as follows:

trace_profile
Holds measurements from a trace object. It contains the trace object, the spikes found in it, and an averaged spike_shape object.
cip_trace_profile
Holds measurements from a cip_trace object with a current-injection period. It contains the original cip_trace object and the spikes found in it. In addition, it holds averaged spike_shape objects from the spontaneous and current-injection periods.
cip_trace_allspikes_profile
Extended version of cip_trace_profile. Instead of single averaged spike shapes, it contains spike databases from the spontaneous, current-injection and recovery periods. These databases only retain measurements made from individual spikes, but not their shapes.
spike_shape_profile
Holds measurements made from a spike_shape object.
params_tests_profile
Holds analysis results from a params_tests_db object.
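The results structure of name-value pairs mentioned above maps naturally onto one database row with matching column labels. The sketch below uses only built-in MATLAB functions. The measurement names and values are hypothetical, and the conversion shown is an illustration rather than the toolbox's own routine.

```matlab
% Hypothetical results structure holding named measurements.
results = struct('spike_rate', 5, 'mean_isi', 0.2, 'spike_amp', 60);

% Field names become column labels; field values become one row of data.
col_names = fieldnames(results)';
row = cell2mat(struct2cell(results))';
```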

3.6 Integrated plotting for easy visualization

Figure 7: Plot classes hierarchy.

To integrate visualization into each class, common MATLAB plotting features are implemented in the supporting classes seen in Figure 7. These bring an object-oriented approach to plot generation in MATLAB. Plots can be generated as objects, saved, modified and included as subplots in larger plots.

The main plotting classes are plot_abstract, plot_superpose, and plot_stack. The most general plotting template class, and the top-level class in the hierarchy, is plot_abstract, which plots an axis using a single MATLAB command, like plot or bar. Multiple plot_abstract objects that use the same command can be superposed and still act as a single plot_abstract object. If they require different plotting commands (e.g., mixing plot and text labels), a plot_superpose object must be used that is composed of an array of plot_abstract objects. Multiple plot_abstract objects or any of the subclass objects can be composed together in a horizontal or vertical stack using the plot_stack class. Since plot_stack is itself a subclass of plot_abstract, it can be stacked as well. This allows creating virtually any complex structured figure using the three classes. Each of these classes has several properties that control the layout and the details of placement and appearance.

The rest of the classes in the hierarchy create typical types of plots for convenience:

plot_bars
Multi-axis bar plot with extending errorbars using a combination of the bar, errorbar, and text commands.
plot_errorbar
Single-axis errorbar plot using the errorbar command.
plot_errorbars
Multi-axis errorbar plot using the errorbar command.
plot_simple
Simplified single-axis, single command plot.

3.7 Miscellaneous classes

These are miscellaneous classes that do not fit into any of the above categories:

period
Defines a period composed of a start and end time for operations on traces, etc.
script_array
Defines a looping construct that can be extended. It defines an initialization routine, a job that needs to be repeated, and a finalization routine.
script_array_for_cluster
Subclass of script_array that can submit the array job to run in parallel on a computing cluster that supports the Sun Grid Engine (SGE) commands.
script_factory
Factory class to generate an enumerated array of script files to be distributed on several machines and run in parallel. It also defines a final function to gather results. It is recommended to use script_array_for_cluster instead.
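The three-stage pattern described for script_array (initialization, a repeated job, and finalization) can be pictured with function handles. This is only a sketch of the pattern. The handle names and the placeholder job are hypothetical and do not reflect the class's actual interface.

```matlab
% Hypothetical stand-ins for the three stages of an array job.
num_runs = 4;
init_fn = @() zeros(1, num_runs);   % initialization: allocate the results
job_fn = @(k) k ^ 2;                % repeated job: placeholder computation
final_fn = @(r) sum(r);             % finalization: gather the results

results = init_fn();
for k = 1:num_runs
  results(k) = job_fn(k);           % each iteration is independent
end
total = final_fn(results);
```

Because each iteration depends only on its index, iterations of this form are the kind that can be distributed across cluster nodes.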

4 Programming Conventions

4.1 Using property structures for passing optional arguments to methods

For flexibility in passing optional arguments to methods, this toolbox adopts property structures. A MATLAB structure, usually called props, is passed to a method as the last argument:

>> props.optionalParam = 1

>> myFunc('hello', props)

Each method defines a list of accepted arguments that can be given as fields in the structure, but should be able to execute without them by substituting defaults. Using a property structure is advantageous over using the varargin keyword for a variable number of arguments, because properties allow adding and deleting arguments without requiring changes elsewhere in the method. Since arguments are addressed by name rather than by position, omitting one argument does not affect the others.
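One way a method might substitute defaults for missing fields is sketched below. The field names optionalParam and quiet and the default values are hypothetical. The toolbox methods may handle defaults differently.

```matlab
% Properties supplied by the caller (only one of the two options is set).
props = struct('optionalParam', 1);

% Defaults that the method would fall back on (hypothetical values).
defaults = struct('optionalParam', 0, 'quiet', false);

% Fill in every field the caller left out; leave supplied fields untouched.
names = fieldnames(defaults);
for k = 1:numel(names)
  if ~isfield(props, names{k})
    props.(names{k}) = defaults.(names{k});
  end
end
```

After this step the method can read props.optionalParam and props.quiet without checking for their existence again.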

Most objects keep a property structure that defines custom attributes passed at the time of construction. These classes define a setProp method to modify properties after the object is created.

4.2 Overloaded operators for transparent access to object contents

The simplistic implementation of object-oriented programming features in MATLAB imposes several strict limitations. MATLAB's powerful and flexible operator overloading feature helps overcome these limitations.

The PANDORA Toolbox uses MATLAB operator overloading to facilitate manipulation of local and parent object fields. In MATLAB, object fields can only be accessed from within the object's own methods. This means one cannot access the object fields from outside the class using the dot operator. To give an example, the trace object has a dt field for time resolution. The following command fails:

>> mytrace.dt = 1e-4;

??? Object fields can only be accessed within methods.

Every time object contents need to be addressed, a method must be called. The recommended way to do this is to define separate getter/setter methods for each field of the object, for instance, getDt and setDt methods for accessing the dt field. This creates a considerable burden for the programmer, not just when creating a class but also when maintaining it later. Although this was probably intended to enforce strictness in building object-oriented constructs, it is highly inconvenient for command-line manipulations. Therefore, our toolbox objects offer generic get and set methods that can read or write the value of any of their fields:

>> mytrace = set(mytrace, 'dt', 1e-4)

>> get(mytrace, 'dt')

ans = 1e-04

These methods are almost identical across different classes. In addition, defining the special subsref method for objects allows overloading the dot (.), parenthesis (()), and curly brace ({}) operators. Most of the objects in the toolbox allow using the dot operator to read or write fields. Overloading these operators also helps with the limitation of accessing parent object fields, a problem not found in other object-oriented languages such as Java. For example, without any overloading, from the subclass cip_trace one needs to first address the parent class name, and then dt:

>> myciptrace.trace.dt

ans = 1e-4

After defining the overloaded operators that call parent methods, one can reach dt directly:

>> myciptrace.dt

ans = 1e-4

Some classes overload indexing operators to allow access to special functions. For instance, the main database class, tests_db, overloads parenthesized indexing to access cells in the database matrix. Some classes define the special subsasgn method to overload assignment operations when the object is on the left-hand side of the operation. This allows the command:

>> mytrace.dt = 1e-4;

which would otherwise need to be done the following way:

>> mytrace = set(mytrace, 'dt', 1e-4);
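To see what an overloaded subsref or subsasgn method receives, it can help to build the index description by hand with the built-in substruct function. In the sketch below a plain structure stands in for an object, so the built-in versions of subsref and subsasgn are called. The nested field layout is an assumption that mimics the cip_trace example above.

```matlab
% A plain struct stands in for a cip_trace object (hypothetical layout).
myciptrace = struct('trace', struct('dt', 1e-4));

% The expression myciptrace.trace.dt corresponds to this index description,
% which is what MATLAB passes as the second argument of subsref.
idx = substruct('.', 'trace', '.', 'dt');
dt_value = subsref(myciptrace, idx);

% The assignment myciptrace.trace.dt = 2e-4 corresponds to a subsasgn call.
myciptrace = subsasgn(myciptrace, idx, 2e-4);
new_dt = subsref(myciptrace, idx);
```

An overloaded subsref in a class can inspect the type and subs fields of idx to decide whether to return a local field or forward the request to the parent object.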

4.3 Troubleshooting errors

For debugging problems with methods, one can turn on the verbosity of information display during execution with:

>> warning on verbose

>> warning on backtrace

4.4 Creating a new class

To get the benefit of overloading, the top-level class must have the generic subsref and subsasgn methods. These methods can be copied from any of the other top-level classes. Any subclasses should have the generic get and set methods in place.


Cengiz Gunay 2008-10-13