AmPtool

AmPtool contains Matlab functions for analysis of patterns in (functions of) parameter values of the Add-my-Pet (AmP) collection. It is an extension of DEBtool_M, and makes frequent use of it. This page describes how to use this tool for analysis of data in 4 stacked steps: tree, selections, legends and plotting. Don't forget to set paths to AmPtool and DEBtool_M in Matlab, including subdirectories. Subdirectory "curation" of AmPtool is only meant for curators to maintain the collection.
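The path setup can be scripted at the start of a session. The following is a minimal sketch, assuming both toolboxes were downloaded to the hypothetical folders named in the code (replace them with your own locations); genpath is used so that all subdirectories are included.

```matlab
% Hypothetical install locations; replace with your own folders.
toolboxDirs = {'C:/toolboxes/AmPtool', 'C:/toolboxes/DEBtool_M'};

for i = 1:numel(toolboxDirs)
    d = toolboxDirs{i};
    if exist(d, 'dir') == 7
        addpath(genpath(d));   % genpath includes all subdirectories
    else
        warning('Folder not found, path not set: %s', d);
    end
end
```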

Data

All data (i.e. meta-data, parameters, meta-parameters and implied properties, but no empirical data) of all entries are collected by function write_allStat into a Matlab-structure allStat.mat. Only curators who have all entries locally can run this function, but the result is available to anybody. To read data directly, use "load allStat", or better read_allStat for all entries, or read_stat for specified entries. The first-level field names of the structure allStat.mat are the names of all entries, as specified by select (see below). As the AmP collection grows, the lists-of-lists change (see below), and so does allStat.mat; these two form a couple that should not be uncoupled. All rates and times that are not primary parameters are given at temperature T_typical, which is entry-specific; use the temperature correction factor c_T to convert them to the reference temperature, and the temperature parameters to convert them to other temperatures. The primary parameters are given at T_ref = 20 C, but only those with time in their dimensions depend on temperature. All functions that analyse data read in allStat.mat, using function read_allStat or read_stat.
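As an illustration of reading statistics and applying the temperature correction, here is a hedged sketch. It assumes that read_allStat accepts several names and returns one column per name (in line with the read_stat call shown further down this page), that 'r_B' is available as a statistic name in allStat.mat, and that c_T is the ratio of a rate at T_typical to the same rate at T_ref. Check these assumptions against the function help before relying on the result.

```matlab
% Requires AmPtool and DEBtool_M on the Matlab path.
% Assumption: read_allStat accepts several names, giving one column per name.
cT_rB = read_allStat('c_T', 'r_B');   % 'r_B' is an assumed statistic name
c_T = cT_rB(:,1);                     % temperature correction factor per entry
r_B_typ = cT_rB(:,2);                 % rate at entry-specific T_typical

% Assumption: c_T = rate at T_typical / rate at T_ref, so divide rates
% (and multiply times) by c_T to express them at T_ref = 20 C.
r_B_ref = r_B_typ ./ c_T;
```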

Apart from allStat.mat, the Matlab-structure allEco.mat exists for codes that specify climate, ecozone, habitat, migration/torpor and food for each entry. The codes are explained in AmPeco and assigned by function get_eco; the Matlab-structure is written by write_allEco. All functions that analyse data and read in allEco.mat use function read_allEco or read_eco.

Taxonomic tree: lists-of-lists

Entries are organised according to the taxonomic position of the taxa that they represent. This position is determined in lists-of-lists; the taxonomic info in the mydata-files is only used for presentation in the species-list, for the default value of the water content by function get_d_V and for the default nitrogen waste by function get_N_waste. A list is a simple text-file in subdirectory taxa. Several functions link these lists into a tree. The tree has a root, here called Animalia, nodes, which are names of taxa, and leaves, which are names of entries. Most entries represent a species, but some species have multiple entries, such as geographical races. Each node occurs once in a list and once as the name of a list; the root occurs only once, as the name of a list. All entry (= leaf) names contain an underscore, while no node name does. The last node (= list name) in a tree-branch contains only leaves and is a genus, which is part of the name of the entries it contains. No other node contains leaves. Function list_taxa returns a list of all nodes. The web-pages species-list and species-tree on the AmP website are composed from this tree. Function treeview_taxa allows you to compose your own interactive tree with any node as root, including pictures on the nodes and links on the leaves, if you are web-connected.

The tree can be read in the direction from leaves to root with the function lineage, and in the direction from root to leaves with the function pedigree. The default input of pedigree is the root Animalia, but can also be any node, which becomes the root of the output-tree. The (character) string produced by pedigree can directly be printed to the screen, which is useful for small trees. The tree can be used to identify useful taxa for analysis.
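The two reading directions can be tried on a small part of the tree. This sketch assumes that lineage takes an entry name and returns the taxa on the way to the root, and that pedigree with one input and one output returns the printable string; the genus node Lemmus is taken from the entry name Lemmus_trimucronatus used elsewhere on this page.

```matlab
% Requires AmPtool on the Matlab path.
path_to_root = lineage('Lemmus_trimucronatus');  % taxa between this leaf and the root
tree = pedigree('Lemmus');                       % 'Lemmus' becomes root of the output-tree
disp(tree)                                       % print the small tree to screen
```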

Tree topology: Sampled at 1012 entries, the tree has 1998 nodes, which are the handles for selection. The left figure shows the survivor function of the number of branches per node and the right one that of the number of nodes between leaves and root. For comparison, a binary tree has 2 branches per node (by definition), 1012-1=1011 nodes and a mean of log2(1012)=9.98 nodes between leaves and root.
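The binary-tree numbers used for comparison can be reproduced with a few lines of plain Matlab. The sketch assumes a balanced binary tree, for which the number of nodes between leaves and root equals log2 of the number of leaves; the variable names are illustrative only.

```matlab
n_leaves = 1012;                 % number of entries (leaves)
n_nodes_bin = n_leaves - 1;      % branching nodes of a binary tree with n leaves
depth_bin = log2(n_leaves);      % nodes between leaves and root, if balanced
fprintf('binary tree: %d nodes, depth %.2f\n', n_nodes_bin, depth_bin);
% prints: binary tree: 1011 nodes, depth 9.98
```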

Selection of entries

Selection of entries via the tree is done with the functions select and select_01. Function select returns a cell-string with the names of the selected entries; select_01 returns a vector of booleans and a cell-string with the names of all entries. Notice that allStat.mat, allEco.mat and the lists-of-lists change continuously, and so do the results of select and select_01.
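The difference between the two outputs can be seen by calling both functions for the same taxon. This is a sketch that assumes both functions read the same lists-of-lists, so that the number of selected names equals the number of true booleans.

```matlab
% Requires AmPtool on the Matlab path.
nm_mam = select('Mammalia');            % cell-string of selected entry names
[sel, nm_all] = select_01('Mammalia');  % booleans plus names of all entries

% Assumption: both functions use the same lists-of-lists, so the counts agree.
fprintf('%d of %d entries selected\n', sum(sel), numel(nm_all));
n_same = numel(nm_mam) == sum(sel);
```

The boolean form is convenient when several criteria have to be combined, because the vectors can be joined with & and |.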

Function clade finds the lowest taxon (node) that contains a set of specified taxa, and all its members. It combines functions lineage and pedigree and can also be used to find the closest relatives of a single specified taxon. If a species is not found in the AmP collection, it searches the Catalog of Life and the Taxonomicon for lineages, with functions lineage_CoL and lineage_Taxo and presents the AmP species that are most related.

Print (compound) parameters or statistics of selected entries to screen with prtStat, or, including the tree-structure, with pedigree. Use clade to select related entries and pass the result to prtStat, e.g. prtStat(clade('Lemmus_trimucronatus'),'p_M');.
Include the tree as well by e.g. [~, taxon] = clade('Lemmus_trimucronatus'); pedigree(taxon,'p_M').

Entries can be selected via the tree, but also via data types. Entries with a particular combination of zero-variate and uni-variate data can be selected with function select_data. This selection can be restricted to particular typified models, which can be handy for preparing a predict-file for a new species, and for linking parameter values to source data types. The Matlab expression prtStat(select_data({'t-Le','Wwb'},'std'),'v'); prints the entry names and their values for the energy conductance at 20 C for all entries with standard (std) models that have both time-length data for embryos and wet weight at birth.

A general multi-step way of selecting entries on the basis of several criteria is to combine boolean vectors. For example, select mammals that have a COMPLETE score larger than 2.6 with: [s1,nm]=select_01('Mammalia');s2=read_allStat('COMPLETE')>2.6;nm=nm(s1&s2). For those entries, read e.g. specific somatic maintenance and energy conductance with pM_v=read_stat(nm,'p_M','v') and plot energy conductance as function of specific somatic maintenance with: Hfig=figure(1);plot(pM_v(:,1),pM_v(:,2),'or');. See entry names by clicking on points in this figure with: h=datacursormode(Hfig);h.UpdateFcn=@(obj,event_obj)xylabels(obj,event_obj,nm,pM_v);datacursormode on.
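The combination of criteria relies on boolean vectors that have one element per entry, in the same order as the list of names. The sketch below shows this mechanism with made-up names and scores (not AmP data), so that it runs without the toolbox; all variable contents are hypothetical.

```matlab
% Made-up names and scores, for illustration only (not AmP data).
nm = {'Aaa_one'; 'Aaa_two'; 'Bbb_one'; 'Bbb_two'};
s1 = logical([1; 1; 0; 1]);          % e.g. membership of a taxon
score = [2.5; 3.0; 3.5; 2.7];        % e.g. a completeness-like score
s2 = score > 2.6;                    % second criterion
sel = nm(s1 & s2);                   % entries meeting both criteria
disp(sel)
```

Here only Aaa_two and Bbb_two satisfy both criteria; further criteria can be added by extending the logical expression.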

Legends exploit selections

Spotting patterns in (functions of) parameters of entries starts with plot function shstat (see below; the name stands for "show statistics"), which has inputs data and legend (and optional further inputs).

A (marker) legend is a (n,2)-array of cells specifying markers and taxa (= nodes and/or leaves). A line legend, called llegend, does this for lines and taxa; it is used for 1-variate data, e.g. survivor functions. Several legends are available as input-free functions that output the required cell-array, such as legend_RSED and legend_fish. Customised legends can be composed with functions select_legend and select_llegend. The choice of possible taxa is restricted to the ones present in the lists-of-lists. Legends can be shown in a figure with DEBtool_M functions shlegend and shllegend. Please notice that the sequence of rows of marker legends matters, see shstat; this is a consequence of the fact that one taxon can contain another one.
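The structure of a ready-made legend can be inspected before using it. This sketch assumes that the first column of the cell-array holds the marker specifications and the second column the taxa, and that shlegend accepts the legend as its only input; the variable is named leg to avoid shadowing the Matlab function legend.

```matlab
% Requires AmPtool and DEBtool_M on the Matlab path.
leg = legend_RSED;     % input-free function returning an (n,2) cell array
size(leg)              % n rows: one marker specification and one taxon per row
leg(:,2)               % second column: the taxa (assumed column order)
shlegend(leg);         % show the legend in a figure
```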

Spotting patterns in data with legends

Function shstat can be used in symbolic mode for 1-, 2- and 3-variate data, as given in allStat.mat. In this mode, shstat uses read_allStat to read in allStat.mat; a large number of symbols for (functions of) parameters is available, following DEB notation. Functions of parameters that do not depend on food, called compound parameters, were computed with DEBtool_M function parscomp_st, and those that do depend on food with statistics_st. These functions briefly describe the various variables, which are presented in context in the DEB book.

Function shstat can also be used in numerical mode in the case that computations are required, e.g. for functions of parameters that are not already in allStat.mat. In this case, shstat does not read in allStat, but still links data to entries via legends.
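The two modes might be called as sketched below. The call forms are assumptions: a cell-string of symbols for the symbolic mode, and a numerical matrix with one row per entry, in the order of allStat.mat, for the numerical mode. The script mydata_shstat and the help of shstat are the authoritative sources for the actual inputs and options.

```matlab
% Requires AmPtool and DEBtool_M on the Matlab path.
leg = legend_RSED;                        % one of the ready-made legends

% Symbolic mode (assumed call form): variables named by their symbols.
shstat({'p_M', 'v'}, leg);

% Numerical mode (assumed call form): an (n,2) matrix computed beforehand,
% with one row per entry in the order of allStat (assumption).
pM_v = read_allStat('p_M', 'v');
ratio = pM_v(:,1) ./ pM_v(:,2);           % an arbitrary computed quantity
shstat([pM_v(:,2), ratio], leg);
```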

Markers in plots can be clicked to show the names of the corresponding entries. The script mydata_shstat gives examples of use of shstat and shows how items can be added to figures that have been produced by shstat. If markers in 3D plots do not have color specifications, the third variable is used to set the colors in the lava color scheme.

Get a rapid overview of distributions of a number of (compound) parameters or statistics for selected taxa with compare_taxa. Plot (compound) parameters or statistics as function of normalised taxonomic distance with function shstat_taxa.