The SET domain is a conserved C-terminal domain that characterizes proteins of the MLL family, including MLL2. The MLL SET domain is a histone H3 Lys4 (K4)-specific methyltransferase whose activity is stimulated by acetylated H3 peptides. The gene for MLL2 encodes a 5,262-amino acid protein containing a SET domain, 5 PHD fingers, potential zinc fingers, and a long run of glutamines interrupted by hydrophobic residues (mostly leucine). An alternatively spliced form was also detected; it encodes 4,957 amino acids and lacks an N-terminal zinc finger and PHD finger. By analysis of rodent/human hybrid cells and of the Genebridge radiation hybrid panel, the gene was mapped to the 12p13.1-qter region. The 12q12-q13 region is involved in duplications and translocations associated with cancer. By database searching, Karlin et al. (2002) identified 192 human protein sequences that have multiple amino acid runs, many of which are associated with disease, including cancer. Karlin et al. (2002) found that a key aspect of 82 of these protein sequences is their role in transcription, translation, and developmental regulation. MLL2 is a striking example of a protein with multiple amino acid runs, containing 22 glutamine runs. Genes encoding a significant number of long amino acid runs are potentially associated with diseases, such as cancer.
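
As an illustration of what counting amino acid runs involves, the following is a minimal sketch that scans a protein sequence for contiguous stretches of a single residue. It is not the procedure used by Karlin et al. (2002): the function name `find_runs`, the minimum run length of 5, and the toy sequence are all assumptions made for this example, and the sketch does not handle runs interrupted by other residues.

```python
from itertools import groupby


def find_runs(sequence, residue="Q", min_len=5):
    """Return (start, length) for each contiguous run of `residue`.

    `min_len` is a hypothetical threshold chosen for illustration;
    it is not the criterion used in the cited study.
    Start positions are 0-based.
    """
    runs = []
    pos = 0
    # groupby yields one group per stretch of identical consecutive residues
    for aa, group in groupby(sequence.upper()):
        length = sum(1 for _ in group)
        if aa == residue and length >= min_len:
            runs.append((pos, length))
        pos += length
    return runs


# Toy sequence (not MLL2): glutamine stretches of 6, 3, and 5 residues
toy = "MK" + "Q" * 6 + "LA" + "Q" * 3 + "LL" + "Q" * 5 + "PS"
print(find_runs(toy))  # [(2, 6), (15, 5)]
```

Applied to a real sequence, the number of reported runs depends strongly on the chosen threshold, so any count should be quoted together with the run definition used.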