************ k2d.read.me ******************

  Documentation file for the k2d program
for protein secondary structure prediction.

Last update: 14/4/96

*******************************************


1.- FILES SUPPLIED

k2d.read.me    - This file

k2d.zip        - A zip compressed file containing
                  * weights.dat  - a file containing the weights of
                                   100 trainings
                  * k2d.c        - the k2d program written in C
                  * k2d.exe      - the executable k2d program for PC
                  * gd.dat       - CD sample
                                   (glyceraldehyde-3-phosphate dehydrogenase)

k2d.SUN.tar.Z  - A compressed tar file containing
                  * weights.dat  - a file containing the weights of
                                   100 trainings
                  * k2d.c        - the k2d program written in C
                  * k2d          - the executable k2d program for SUN
                  * gd.dat       - CD sample
                                   (glyceraldehyde-3-phosphate dehydrogenase)



2.- HOW TO USE k2d?

- Place the executable program and the weights.dat file in the same
  directory.

- Create a file containing the CD spectrum to be analysed. It must
  contain 41 CD values, one per nanometre from 200 nm to 240 nm.  You
  can also experiment with the example supplied (gd.dat).
  The CD values must be given in deg cm^2 dmol^-1 multiplied by 0.001.

- Run the k2d program following the instructions on the screen.

- The program generates two files:
     * The output CD file has three columns.
       The 1st has the wavelength values, the 2nd is the CD spectrum of the
       sample, and the 3rd has the mean CD spectrum of the winning
       neuron over the 100 sets of training weights contained in
       weights.dat.
     * The percentage file gives the predicted alpha, beta and
       random coil values. Additionally, it gives the square of
       the Euclidean distance between the sample CD spectrum and that of
       the winning neuron and, according to this distance, an estimate of
       the mean error in the prediction of the three secondary structure
       values. If the distance is too large, the prediction may not be
       reliable, and the program cannot give an error estimate. In this
       case the predicted values should be disregarded.
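
To make the quantities above concrete, here is a minimal C sketch of the
unit scaling required for the input values and of the squared Euclidean
distance reported in the percentage file. It is not taken from k2d.c: the
names K2D_POINTS, scale_cd and squared_distance are hypothetical, and the
sketch assumes the raw values are in deg cm^2 dmol^-1 and that both
spectra are sampled at the same wavelengths.

```c
#include <stddef.h>

/* Number of CD values k2d expects, as stated in this file. */
#define K2D_POINTS 41

/* Scale raw CD values (assumed to be in deg cm^2 dmol^-1) by 0.001,
   the scaling this file asks for. Hypothetical helper, not from k2d.c. */
void scale_cd(const double *raw, double *scaled, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        scaled[i] = raw[i] * 0.001;
}

/* Square of the Euclidean distance between two spectra of n points:
   the sum over all points of the squared difference. */
double squared_distance(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```

Calling squared_distance(sample, winner, K2D_POINTS) on the 2nd and 3rd
columns of the output CD file should, under these assumptions, give a
value comparable to the distance reported by the program.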


3.- AN EXAMPLE

- Run the k2d program.

- Use the 'gd.dat' file as input file.

- Generate a CD file and a percentage file.

- You can display the CD file using your favourite graphics tool
  to see how the computed spectrum mimics the sample spectrum.

- The resulting percentage values (alpha, beta, random coil) are 0.30,
  0.12 and 0.58. The square of the distance between the two spectra is
  32.20.  According to this distance, the program gives a maximum mean
  error of 0.080. This means that the sum of the errors in the predicted
  alpha, beta and random coil values, divided by three, is expected
  to be less than 0.080.

  The secondary structure values of glyceraldehyde-3-phosphate
  dehydrogenase are 0.30, 0.22 and 0.48, so the sum of the absolute
  errors in the three predicted values is 0.00+0.10+0.10=0.20.
  Since 0.20/3=0.067 < 0.080, the error is below the maximal error
  given by the program.
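
The error check worked through in this example can be sketched in C as
follows. This is only an illustration of the arithmetic, not code from
k2d.c; the function name mean_abs_error is hypothetical, and the three
values are assumed to be given in the order alpha, beta, random coil.

```c
/* Mean absolute error over the three secondary structure values
   (alpha, beta, random coil): the sum of the absolute differences
   between predicted and known values, divided by three. */
double mean_abs_error(const double predicted[3], const double known[3])
{
    double sum = 0.0;
    int i;
    for (i = 0; i < 3; i++) {
        double d = predicted[i] - known[i];
        sum += (d < 0.0) ? -d : d;   /* absolute value of the difference */
    }
    return sum / 3.0;
}
```

With predicted = {0.30, 0.12, 0.58} and known = {0.30, 0.22, 0.48}, the
function returns about 0.067, which is below the maximum mean error of
0.080 given by the program for this sample.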


4.- THE ALGORITHM

- The algorithm has been published in:

  M.A. Andrade, P. Chacon, J.J. Merelo and F. Moran. (1993)
  "Evaluation of secondary structure of proteins from UV circular
  dichroism spectra using an unsupervised learning neural network".
  Protein Engineering. 6: 383-390 

  Merelo, J.J., M.A. Andrade, A. Prieto and F. Moran. (1994)
  "Proteinotopic Feature Maps". Neurocomputing. 6: 443-454


5.- SEND A MAIL

- If you get the program, please let us know your e-mail address by
  sending an e-mail to andrade@embl-heidelberg.de
  We will inform you about future versions of the program.

