Cathepsins are papain-family cysteine proteases that represent a major component of the lysosomal proteolytic system. They are generally synthesized with a signal sequence, followed by a propeptide and then a catalytically active mature region. The unusually long proregion of the cathepsin F precursor (251 amino acid residues) consists of a C-terminal domain similar to the pro-segment of cathepsin L-like enzymes, a 50-residue flexible linker peptide, and an N-terminal domain predicted to adopt a cystatin-like fold. This additional N-terminal segment makes the cathepsin F proregion unique among papain-family cysteine proteases: it is predicted to share structural similarities with cysteine protease inhibitors of the cystatin superfamily, and it contains some of the elements known to be important for inhibitory activity. CTSF encodes a predicted protein of 484 amino acids, including a 19-residue signal peptide. Cathepsin F contains five potential N-glycosylation sites and may be targeted to the endosomal/lysosomal compartment via the mannose 6-phosphate receptor pathway. The cathepsin F gene is ubiquitously expressed and maps to chromosome 11q13, close to the gene encoding cathepsin W.