Download files (WinZipped)
Download files (gzipped tar)
Download 'Unique' set: WinZipped, gzipped tar
Download 'Known' set: WinZipped, gzipped tar
N.B. Each file in each archive uses a FASTA-style format with four lines of information. For example, the file 1atx00 contains the lines:
1. >1atx00
2. GAAaLbKSDGPNTRGNSMSGTIWVFGcPSGWNNbEGRAIIGYacKQ
3. EEE TTS S TTSSEEEEEESS TT EEE SSSSSEEEE
4. CEEEEEHHECEEEECCCECEEEECCCEECCEECEEECCEECEEEEC
The four lines are:
1. CATH domain name (four-character PDB code followed by chain identifier and domain number)
2. DSSP amino acid sequence (lowercase letters are disulfide-bonded Cys residues; bonded partners share a letter)
3. DSSP-assigned secondary structure (Kabsch and Sander, 1983)
4. Secondary structure assigned from backbone dihedral angles (Przytycka et al., 1999)
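As a rough illustration of the layout described above, the following Python sketch reads one four-line file into a small record. The names `DomainRecord` and `parse_record` are hypothetical and not part of the distributed archives. The sketch assumes that the numbering shown above is not present in the files themselves, that the chain identifier is a single character, and that blanks in line 3 are significant (so that line is padded rather than stripped). The record used in the example is made up for illustration.

```python
from typing import NamedTuple


class DomainRecord(NamedTuple):
    pdb_code: str   # four-character PDB code, e.g. "1atx"
    chain: str      # chain identifier (assumed to be one character)
    domain: str     # domain number (assumed to be the rest of the name)
    sequence: str   # line 2: DSSP amino acid sequence
    dssp: str       # line 3: DSSP secondary structure
    dihedral: str   # line 4: dihedral-angle secondary structure


def parse_record(text: str) -> DomainRecord:
    """Parse the contents of one four-line domain file."""
    lines = text.splitlines()
    # Ignore blank lines at the end of the file only.
    while lines and not lines[-1].strip():
        lines.pop()
    if len(lines) != 4 or not lines[0].startswith(">"):
        raise ValueError("expected a '>' header line followed by three data lines")
    name = lines[0][1:].strip()
    if len(name) < 6:
        raise ValueError(f"domain name too short: {name!r}")
    sequence = lines[1].strip()
    # Pad instead of stripping: blanks in the DSSP line are treated as data.
    dssp = lines[2].ljust(len(sequence))
    dihedral = lines[3].strip()
    return DomainRecord(name[:4], name[4], name[5:], sequence, dssp, dihedral)


# Made-up record, for illustration only (not taken from the archives).
example = ">1abc00\nACDEFGH\n EEE TT\nCEEECCC\n"
record = parse_record(example)
print(record.pdb_code, record.chain, record.domain)
```

Keeping the three per-residue strings the same length means that position i in the sequence lines up with position i in both secondary structure assignments.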
When interpreting lines 3 and 4, we recommend treating a run of three or more consecutive Es as a strand and a run of five or more consecutive Hs as a helix.
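The recommended interpretation can be sketched in Python with a regular expression that matches runs of a minimum length. The function names `find_segments`, `strands` and `helices` are hypothetical. Positions are returned as zero-based, half-open (start, end) pairs, which is a choice made for this sketch rather than a convention of the data set.

```python
import re
from typing import List, Tuple


def find_segments(ss: str, state: str, min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) spans of runs of `state` at least `min_len` long."""
    pattern = re.compile(f"{re.escape(state)}{{{min_len},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(ss)]


def strands(ss: str) -> List[Tuple[int, int]]:
    # A strand is three or more consecutive Es.
    return find_segments(ss, "E", 3)


def helices(ss: str) -> List[Tuple[int, int]]:
    # A helix is five or more consecutive Hs.
    return find_segments(ss, "H", 5)


# Line 4 of the 1atx00 example shown above.
dihedral = "CEEEEEHHECEEEECCCECEEEECCCEECCEECEEECCEECEEEEC"
print(strands(dihedral))  # [(1, 6), (10, 14), (19, 23), (33, 36), (41, 45)]
print(helices(dihedral))  # []
```

Under this rule the two-residue HH run and the isolated E and EE runs in the example are not counted as helices or strands.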
Please see the following references for more details:
Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.
Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B. and Thornton, J.M. (1997) CATH - A Hierarchic Classification of Protein Domain Structures. Structure, 5(8), 1093-1108.
Przytycka, T., Aurora, R. and Rose, G. (1999) A protein taxonomy based on secondary structure. Nature Struct. Biol., 6(7), 672-682.