A MEASURE OF THE INFORMATION CONTENT OF NEURAL SPIKE TRAINS

Miguel A. Jiménez-Montaño, Thorsten Pöschel* and Paul E. Rapp**

Departamento de Física y Matemáticas, Universidad de las Américas/Puebla, Sta. Catarina Mártir, 72820, México.
* Humboldt-Universität zu Berlin, Institut für Physik, Invalidenstrasse 110, D-10115 Berlin, Germany. http://summa.physik.hu-berlin.de:80/~thorsten
** Department of Physiology, The Medical College of Pennsylvania, Philadelphia, PA, USA.
After a short review of some informational and grammatical concepts, and of a former algorithm to evaluate the complexity of neural spike trains, a new algorithm to build a short context-free grammar (also called a program or description) that generates a given sequence is introduced. It follows the general lines of the first algorithm, but it optimizes the information content instead of the grammar complexity that was used in the previous work. It is implemented by means of the program SYNTAX and applied to estimate the information content of neural spike trains obtained from a sample of seven neurons, before and after penicillin treatment. A comparison of the sequences (encoding the inter-spike intervals) according to their information content, grammar complexity, and block-entropies shows that the three context-dependent measures of complexity give similar results for categorizing the neurons with respect to their structure or randomness, before and after the application of penicillin.

Introduction
The determination of block-entropies is a well established method for the investigation of discrete data, also called symbols (Herzel et al., 1995). In a recent doctoral dissertation, Schmitt (1995) calculated the block-entropies of the digitized neural spike trains previously obtained by Rapp's group (Rapp et al., 1994). His results are consistent with the results in the mentioned paper, which deals with the increase of the algorithmic complexity during focal seizures. In that article the measure of algorithmic complexity employed was the grammar complexity (Ebeling & Jiménez-Montaño, 1980). It was estimated by means of the program NVOGRAMM (Quintana-López, 1993), which builds a short context-free grammar (also called a program or description) that generates the given sequence; it follows the general lines of a former algorithm (Chavoya et al., 1992). In this paper we propose a new algorithm to build the grammar by minimizing the information content of the sequence, instead of the grammar complexity. In the new algorithm the block-entropies of the letters of the alphabet (ordinary Shannon entropy) and of symbol pairs are used to weight the letters and the syntactic categories, respectively, as explained below.
The problem with the grammar-complexity based algorithm is that the letters and the syntactic categories (runs of binary symbols in the present application) are treated on the same footing (i.e., with the same weight). Since the self-information of the letters and that of the syntactic categories are different, this fact should be taken into account in the calculation of the information content. Before one can apply the algorithm to interspike interval data it is necessary to reduce these data to a sequence of symbols. As in the former paper (Rapp et al., 1994), a partition about the median is employed, with the same experimental data, to estimate the information content of neural spike trains.
The main purpose of the present communication is to make a comparison of the ranking of the sequences according to three measures of complexity:
i) the information content,
ii) the grammar complexity, and
iii) the block-entropies,
to check the consistency of the three approaches. Besides, to have a self-contained presentation, we recall the definitions of the measures employed and their main properties. Last but not least, we recall our former algorithm to estimate the grammar complexity, and give a brief description of the program SYNTAX (Jiménez-Montaño et al., 1995), employed to estimate the information content of a binary sequence. This last program is written in FORTRAN. It is available upon request.

Materials and Methods
For the sake of completeness, and in order to fix the notation, we first recall some well-known concepts from information and formal language theories.
1. Entropy-like Measures of Sequence Structure
Symbol sequences are composed of symbols (letters) from an alphabet of λ letters (e.g., for λ = 4, {A,C,G,T} is the DNA alphabet; for λ = 2, {0,1} is the binary alphabet, etc.).
Substrings of n letters are termed n-words. If stationarity is assumed, i.e., if any word i can be expected to occur at any arbitrary site with a well-defined probability pi(n), then the n-word entropies (block-entropies or higher order entropies) are given by

  Hn = − Σi pi(n) log2 pi(n) .

The summation has to be carried out over all words with pi(n) > 0. The total number of words is λn, so there is a dramatic increase of the number of possible words with respect to n, which makes the estimation of higher-order entropies a difficult task (Schmitt, 1995). The entropies Hn measure the average amount of information contained in a word of length n. Defining the self-information of a word of length n as

  Ii(n) = − log2 pi(n) ,

Hn = ⟨ In ⟩ is the expected value of In. The conditional entropies

  hn = Hn+1 − Hn
give the new information of the (n+1)-th symbol given the preceding n symbols. The entropy of the source,

  h = lim n→∞ hn = lim n→∞ Hn / n ,

quantifies the information content per symbol, and the decay of the hn measures correlations within the sequence. Hn and hn are good candidates to detect structure in symbolic sequences since they respond to any deviation from statistical independence. In a random sequence with equidistributed probabilities, pi(n) = 1/λn, one obtains Hn = n log2 λ; for binary sequences λ = 2, and Hn = n bits. Hn also exhibits a linear scaling, Hn = n H1, for a random non-equidistributed process (Schmitt, 1995), the coefficient being the ordinary Shannon entropy

  H1 = − Σi=1..λ pi log2 pi .
Mostly, the entropies Hn are estimated from the normalized frequencies of occurrence, pi(obs) = ki/N, which yield the "observed entropies" (respectively, the observed self-informations)

  Ii(obs) = − log2 (ki / N) ,        [8]

where N denotes the total number of words in the sequence and ki is the number of occurrences of a certain word i. As was shown by Ebeling et al. (1987) and Herzel (1988), in general the naive estimation of the probabilities by means of pi(obs) = ki/N leads to a finite sample effect, i.e., a deviation of Hn from its true value as n increases (Pöschel et al., 1995). However, these effects will not be considered in the present paper; we refer the reader to the mentioned works for further details about this point. In contrast to the informational quantities defined before, which refer to an ensemble of sequences, the observed quantities refer to individual sequences. In the following all entropies will be observed entropies (with the superscript "obs" suppressed for convenience).
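For illustration, a minimal sketch in Python (not the authors' FORTRAN or C programs) of the observed block-entropy computation is the following; the function name and the example sequence are ours:

from collections import Counter
from math import log2

def block_entropy(sequence, n):
    """Observed n-word entropy Hn of a symbol sequence, in bits.

    Words are read from overlapping windows of length n; the probability
    of word i is estimated naively as ki/N (finite-sample effects are
    ignored, as in the text above).
    """
    words = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    counts = Counter(words)
    N = len(words)
    return -sum((k / N) * log2(k / N) for k in counts.values())

# Example: for an equidistributed random binary sequence one expects
# Hn to be close to n bits.
w = "01101000110101100100"
for n in (1, 2, 3):
    print(n, block_entropy(w, n))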
2. Grammar Complexity

Grammar complexity, as introduced by Ebeling and Jiménez-Montaño (1980), constitutes an attempt to determine the algorithmic complexity of a sequence. The essence of this concept is to compress a sequence by introducing new variables (syntactic categories). The length of the compressed sequence is then taken as a measure of the complexity of the sequence. However, there are different ways to measure the length of the compressed sequence; in the original paper (Ebeling and Jiménez-Montaño, 1980) the number of characters of the compressed sequence was used (counting repeated characters logarithmically). We recall this approach next. Another alternative is the information content, employed in the new algorithm introduced below. Further possibilities will not be discussed here.
The set of all finite strings (words) formed from the members of the alphabet X is called the free semigroup generated by X, denoted X*. A (formal) language is a subset of X*. If p and q are words from X*, then their concatenated product pq is also a member of X*.
A context-free grammar is a quadruple G = { N, T, P, S }, where:
(1) N is a finite set of elements called non-terminals (syntactic categories), including the start symbol S;
(2) T is a finite set of elements, called terminal symbols (letters of the alphabet);
(3) P is a finite set of ordered pairs A → q, called production rules, such that q ∈ (N ∪ T)* and A is a member of N.
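As a toy illustration (our own, hypothetical example), such a quadruple can be written down directly, here in Python, for a grammar whose language is the single binary word w = 01100110:

G = {
    "N": {"S", "A"},                   # non-terminals, including the start symbol S
    "T": {"0", "1"},                   # terminal symbols (letters of the alphabet)
    "P": {"S": ["A", "A"],             # production rules A -> q,
          "A": ["0", "1", "1", "0"]},  # with q in (N U T)*
    "S": "S",                          # start symbol
}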
Let us consider a grammar G such that L(G) = {w}; i.e., the language generated by G consists of the single sequence w. These grammars are called "programs" or "descriptions" of the word w. The complexity of such a grammar is defined as follows. The complexity of a production rule A → q is defined by an estimation of the complexity of its right-hand side, written as a product of maximal runs, q = b1^n1 b2^n2 ... bm^nm, with bj ∈ (N ∪ T), for all j = 1, ..., m:

  K(A → q) = Σj=1..m ( 1 + [log2 nj] ) .        [9]

Therefore, in this definition terminals (letters of the alphabet) and non-terminals (syntactic categories; sub-words) are treated on the same footing (i.e., with the same weight). Here [x] denotes the integral part of a real number.
The complexity K(G) of a grammar G is obtained by adding the complexities of the individual rules. Finally, the complexity of the original sequence is

  K(w) = K(G(w)) = min { K(G) | G → w }.
This quantity, which is a particular realization of the algorithmic complexity introduced by Solomonoff (1964), Chaitin (1990) and Kolmogorov (1958), refers to an individual sequence, in contrast to the Shannonian measures, which are related to the sequence source.
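A minimal Python sketch of the rule complexity [9], as reconstructed above, may be helpful; the example grammar is hypothetical, and this is an illustration of the definition, not the NVOGRAMM implementation:

from itertools import groupby
from math import floor, log2

def rule_complexity(rhs):
    """K(A -> rhs): decompose rhs into maximal runs b^n and count
    1 + [log2 n] characters for each run."""
    return sum(1 + floor(log2(len(list(run)))) for _, run in groupby(rhs))

def grammar_complexity(rules):
    """K(G): sum of the complexities of the individual production rules."""
    return sum(rule_complexity(rhs) for rhs in rules.values())

print(rule_complexity("aaaab"))   # run a^4 gives 1 + 2, run b gives 1 + 0
print(grammar_complexity({"S": "AAB", "A": "0110", "B": "111"}))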
3. Algorithm to Estimate the Grammar Complexity
In former papers (Ebeling & Jiménez-Montaño, 1980; Jiménez-Montaño et al., 1987) an algorithm to estimate the grammar complexity of a sequence was described and applied to the estimation of the complexity of biosequences (DNA, RNA and proteins). Independently, Wolff (1975 & 1982) introduced a similar algorithm and applied it to the discovery of phrase structure in natural language. Briefly, our former procedure is the following:

INPUT: A sequence q, with characters from alphabet T = {a1, a2, ..., aλ}.
OUTPUT: A short context-free grammar G such that L(G) = {q}.
PROCEDURE:
1) WHILE there are pairs of contiguous characters (in the sequence q) which occur more than two times DO
replace the most frequent pair by a new symbol and introduce the corresponding production rule;
2) WHILE there are in q two equal sub-words of more than two characters DO
replace each repeated sub-word αi by a new symbol Ai, and introduce the corresponding production rule Ai → αi.
The resulting grammar has T = (a1, a2, ..., aλ), N = (A1, A2, ..., Ar, S), and P: S → q', Ai → αi (i = 1, ..., r), where q' denotes the compressed sequence and the sub-words are ordered so that l(αm) ≤ l(αk) if k < m, where l(x) is the length (number of characters) of x.
K(G(q)) gives a good estimation of the grammar complexity, K(q).
END.
This procedure has been implemented in the programs GRAMMAR.C (Chavoya et al., 1992) and NVOGRAMM (Quintana, 1993). In both implementations the heuristic employed was a hill-climbing optimization procedure, which seeks to minimize the grammar complexity at each step. Therefore, it is not guaranteed that the grammar found is really the shortest one; that is why one obtains only an estimation of K(q).
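The following Python sketch conveys the flavor of step 1) of the procedure (greedy pair substitution); it is an illustration only, and GRAMMAR.C and NVOGRAMM include further heuristics (in particular step 2), on repeated sub-words) that are not reproduced here. Symbol names are ours:

from collections import Counter

def compress(q):
    """Repeatedly replace the most frequent pair of contiguous symbols
    by a new non-terminal, while some pair occurs more than two times."""
    rules = {}
    symbols = list(q)
    next_id = 0
    while True:
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count <= 2:                      # stop: no pair occurs > 2 times
            break
        nonterminal = "A%d" % next_id
        next_id += 1
        rules[nonterminal] = "".join(pair)
        rewritten, i = [], 0
        while i < len(symbols):             # non-overlapping replacement
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                rewritten.append(nonterminal)
                i += 2
            else:
                rewritten.append(symbols[i])
                i += 1
        symbols = rewritten
    return " ".join(symbols), rules

print(compress("01010101010101100110"))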
4. Algorithm to Estimate the Information Content
To estimate the information content of a sequence w, I(w), all one needs to do is to replace K(w) by I(w) in the above optimization procedure. To evaluate I(w) one proceeds as follows. For each production rule, instead of the complexity of a rule defined in [9], one introduces the information content of a rule, defined as

  I(A → q) = Σj=1..m Ij ,

where the quantities Ij (j = 1, ..., m), given by Iobs = −log2(ki/N) (see equation [8]), are the weights of the terminals (for j = 1) and non-terminals (for j > 1) from which q is composed. The information content of a grammar G, I(G), is obtained by adding the information contents of the individual rules. As the number of rules increases, I(G) may increase or diminish. If, after the introduction of new rules, I(G) does not diminish, the process stops. Therefore, the estimation of I(w) is

  I(w) = I(G(w)) = min { I(G) | G → w }.
This algorithm was implemented in the program SYNTAX (Jiménez-Montaño et al., 1995). If S → w is the trivial grammar that generates the sequence w, the information content of w, as estimated from this grammar, would be Hn = n H1, with H1 the Shannon entropy estimated from the letter-composition of the sequence (the numbers of zeros and ones, for binary sequences). For example, for binary sequences of length 1000 with equal numbers of zeros and ones, Hn = 1000 bits. In contrast to this quantity, I(w) is an estimation of the information content of the individual sequence w (see Table 1). While the former quantity cannot distinguish among different sequences of the same composition, the latter one can.
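A minimal Python sketch of one plausible reading of this weighting (an illustration only; SYNTAX itself is a FORTRAN program) is the following: each symbol on the right-hand side of a rule is weighted by the observed self-information of the sub-word of w it expands to, so that letters receive single-symbol weights and syntactic categories receive block weights. The example grammar is hypothetical:

from math import log2

def expand(symbol, rules):
    """Fully expand a symbol into the sub-word of w that it generates."""
    if symbol not in rules:
        return symbol                        # terminal letter
    return "".join(expand(s, rules) for s in rules[symbol])

def self_information(word, w):
    """Iobs = -log2(k/N) of `word` among the overlapping blocks of w."""
    n = len(word)
    blocks = [w[i:i + n] for i in range(len(w) - n + 1)]
    return -log2(blocks.count(word) / len(blocks))

def grammar_information(rules, w):
    """I(G): add, over all rules, the weights of the right-hand sides."""
    return sum(self_information(expand(s, rules), w)
               for rhs in rules.values() for s in rhs)

w = "01100110"
rules = {"S": ["A", "A"], "A": ["0", "1", "1", "0"]}   # L(G) = {w}
print(grammar_information(rules, w))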
As mentioned in the introduction, in a former paper (Rapp et al., 1994) it was shown that the algorithmic complexity, as estimated with the help of the program NVOGRAMM (Quintana, 1993), increases during focal seizures. The same experimental data reported in that article (seven single-unit records obtained from cortical neurons of the rat, before and after the application of penicillin) were used for the present work. Therefore, to save space, we refer the reader to the mentioned paper for the details of the experimental methods employed. The purpose of the present study is to compare the results obtained by means of the information content, calculated with the help of the program SYNTAX, with our previous results (Rapp et al., 1994), and with the values of the block-entropies of the same seven sequences in the sample, calculated by Schmitt (1995). As explained in our former paper, before the complexity of neural spike trains can be estimated it is necessary to reduce the neural data to a sequence of symbols. The usefulness of information content calculations depends crucially on the procedure used to partition the data among a finite alphabet of symbols.
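As a concrete sketch of this reduction (the partition about the median used below; the variable names and interval values are ours), in Python:

import statistics

def symbolize(intervals):
    """Reduce inter-spike intervals to a binary sequence by partitioning
    about the median: intervals below the median map to 0, the rest to 1.
    By construction the two symbols are (nearly) equiprobable, so H1 = 1."""
    med = statistics.median(intervals)
    return "".join("0" if x < med else "1" for x in intervals)

print(symbolize([12.1, 8.4, 15.0, 9.9, 22.3, 7.6]))   # made-up intervals (ms)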
Table 1. Average information content of 1000-point data sets (N = 5)
Calculations with messages obtained by reducing random numbers to binary symbol sequences establish the upper bound of the information content that can be obtained with 1000 elements. The results are reported with SDs obtained from calculations using five data sets for each distribution.
In Table 1 the average information content of 1000-point artificial data sets, with four distributions, is displayed. These results were obtained after the data were reduced to a sequence by partitioning about the mean, the median and the midpoint, as explained in (Rapp et al., 1994). The experimental binary symbol sequences employed in this paper were constructed by partitioning inter-spike intervals about the median. This choice entails that H1 = 1, of course. Table 1 is analogous to the corresponding one in (Rapp et al., 1994) for the grammar complexity. The results of both tables are consistent, except that the information content for the Poisson distribution is slightly different from the other three distributions. On the contrary, the grammar complexity is completely insensitive to the distribution for partitions about the median.

Table 2. Statistical properties of the order-sensitive measures C2 and I, 1000 events in each dataset
Significance testing (spontaneous vs. penicillin-treated), C2: t = 2.277.
The values of the binary complexity, C2, and of the information content, I, obtained from 1000-element spike trains partitioned about the median are displayed. In the last two columns the information content of randomly shuffled sequences, for the spontaneous and the penicillin-treated cases, is included for control purposes. A paired t-test was used to compare the values obtained in the two conditions. The neurons are ordered according to I(Spon).
Table 2 contains the main results of the present communication. In this table the values of the grammar complexity (Rapp et al., 1994) and of the information content of the seven neurons, before and after the application of penicillin, are shown. The values for randomly shuffled sequences of the same composition are also included for comparison. The results obtained by the three measures of sequence structure, i.e., grammar complexity, information content and block-entropies (the last one calculated by Schmitt (1995), but not reproduced here), are quite consistent. According to the three methods, neurons 1, 5 and 6 have a significant structure before the penicillin treatment; and, for the same condition, neurons 2 and 4 have spike trains which are not too different from those of a perfectly random equidistribution (Hn = n). Neurons 3 and 7 are at the border of randomness, and penicillin produces no significant effect on these neurons. It is also clear that, with both measures, the average values increase after the penicillin treatment and the SD values decrease. However, these two sets of values still differ from the extremes found for randomly shuffled sequences.
Table 3. Statistical properties of the order-sensitive rates RC2 and RI, 1000 events in each dataset
The values of the rates of increase of the binary complexity, C2, and of the information content, I, obtained from 1000-element spike trains partitioned about the median are displayed. As in Table 2, the neurons are ordered according to I(Spon).
As mentioned in our former publication, because the mean firing rate differs from neuron to neuron, the records employed cover very different time intervals. Since, as argued in (Rapp et al., 1994), timely responsiveness is an essential property of any successful biological system, the information content generated per unit time may give additional information for the problem at hand. It is calculated by dividing the information content by the corresponding time required by that neuron to generate the record. It represents the rate at which the neuron is increasing its information content and, therefore, losing its structure. In Table 3 these rates are displayed, together with the corresponding values for the grammar complexity rates obtained in the former publication (Rapp et al., 1994). From the displayed values it is clear that, with both measures, the different neurons are becoming disorganized at higher rates after the treatment. However, the increase in firing rate in the untreated condition seems to be uncorrelated with the degree of structure. Neurons 5, 7 and 2 have low rates, while the others have somewhat greater rates of disorganization. For the treated condition only neuron 2 presents an anomalously low rate.

Final Remarks
In this communication we have shown that three different measures of complexity produce consistent results about the degree of structure of neural spike trains. It is important to notice, as we did in the former publication (Rapp et al., 1994), that the classification of neurons made in this and in the former publication could not have been made on the basis of distribution-determined measures. The patterns observed in the spike trains appear to be genuine and not due to chance variations.
Acknowledgement. The authors thank Miguel A. Hernández Morales for his help with the computer calculations.

References
Chaitin G.J. (1990) Information, Randomness & Incompleteness (Papers on Algorithmic Information Theory). World Scientific, Singapore.
Chavoya …, Jiménez-Montaño M.A. (1992) Programa para estimar la complejidad gramatical de una secuencia. Memorias IX Reunión Nacional de Inteligencia Artificial, pp. 243-254. Megabyte, Grupo Noriega Editores, México.
Ebeling W., Feistel R. and Herzel H. (1987) Phys. Scr. 35: 761.
Ebeling W. and Jiménez-Montaño M.A. (1980) Math. Biosc. 52: 53-71.
Herzel H. (1988) Sys. Anal. Mod. Sim. 5: 435.
Herzel H., Ebeling W., Schmitt A.O., Jiménez-Montaño M.A. (1995) Entropies and Lexicographic Analysis of Biosequences. In: Müller A., Dress A. & Vögtle F. (Eds.), From Simplicity to Complexity in Chemistry, and Beyond. Vieweg, Braunschweig, pp. 7-26.
Jiménez-Montaño M.A., Ebeling W., Pöschel T. (1995) SYNTAX: A Computer Program to Compress a Sequence and to Estimate Its Information Content. Presented at the Complex Systems and Binary Networks Conference, Guanajuato, Mexico, Jan. 16-22, 1995. No proceedings were published.
Jiménez-Montaño M.A., Zamora L., Trejo J. (1987) Aportaciones Matemáticas, Comunicaciones 5: 31-52.
Kolmogorov A.N. (1958) Dokl. Akad. Nauk SSSR 119: 861.
Pöschel T., Ebeling W., Rose H. (1995) J. Stat. Phys. 80: 1443-1452.
Quintana-López M. (1993) Análisis Sintáctico de Biosecuencias. M.Sc. Thesis, Instituto Tecnológico de Estudios Superiores de Monterrey, campus Edo. de México, México.
Rapp P.E., Zimmerman I.D., Vining E.P., Cohen N., Albano A.M., Jiménez-Montaño M.A. (1994) The Journal of Neuroscience 14(8): 4731-4739.
Schmitt A. (1995) Structural Analysis of DNA Sequences. Doctoral dissertation.
Solomonoff R.J. (1964) Inform. & Control 7: 1-22.
Wolff J.G. (1975) Br. J. Psychol. 66: 79-90.
Wolff J.G. (1982) Language & Communication 2(1): 57-89.