Release 201307.31, July 2013

ABOUT SelTarbase
SelTarbase is a comprehensive mononucleotide repeat (MNR) mutation database. The primary data are derived from investigations of human tumors of different organs that show high-level microsatellite instability (MSI-H).

SelTarbase provides up-to-date information on a large and growing number of genes and the mononucleotide repeats they contain. In addition, an up-to-date, tissue-specific regression analysis helps to decide which mutation frequencies appear elevated or reduced, and can help focus further investigation on promising candidate genes of MSI-H tumorigenesis. Furthermore, SelTarbase allows users to upload new (anonymized) data and to recalculate the regression analysis with these data included, giving them a new perspective on their own research results.

NAR 2010 Database Issue
Deficient DNA mismatch repair (MMR deficiency) results in deletions or insertions in small repetitive DNA elements known as microsatellites, whose repeat units consist of one, two or more nucleotides. Microsatellite instability (MSI) occurs in more than 90% of tumors in patients with hereditary non-polyposis colorectal cancer (HNPCC/Lynch syndrome, OMIM #120435), and also arises in sporadic carcinomas of the colon, endometrium, and stomach, albeit at a lower frequency (up to about 15%). The mutability of a microsatellite depends mainly on its length and additionally on other biochemical and biological attributes whose impact has not yet been conclusively determined. Most repeat tracts, and the genes containing them, that lack physiological relevance will show a similar mutation rate. However, some mutations confer a positive or negative selective effect on the affected cell clones, leading to increased or reduced mutation frequencies. The observed mutation rate may therefore vary over a wide range.

We have proposed a statistical model based on sigmoid regression analysis that aims to identify genes relevant to MSI-driven carcinogenesis by their mutation frequency in relation to repeat tract length. Datasets are included on the basis of an extensive literature review. Each dataset records a specific mononucleotide repeat tract (MNR) in the human genome, the number of analyzed MSI-H tumors of a certain tissue type (e.g. colon, stomach, or endometrium), and the number of those tumors showing mutations within this MNR.
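The idea of relating mutation frequency to repeat tract length can be illustrated with a minimal Python sketch. It is not the SelTarbase model itself: the two-parameter logistic curve, the unweighted least-squares criterion, the brute-force parameter grid, and all data values are assumptions chosen purely for illustration.

```python
import math

def sigmoid(length, midpoint, slope):
    """Expected mutation frequency (0..1) for a repeat tract of the given length."""
    return 1.0 / (1.0 + math.exp(-slope * (length - midpoint)))

def fit_sigmoid(points, midpoints, slopes):
    """Brute-force least-squares fit over a grid of candidate parameters.

    points: list of (repeat_length, mutated_tumors, analyzed_tumors).
    Returns the (midpoint, slope) pair with the smallest sum of squared errors.
    """
    best = None
    for m in midpoints:
        for s in slopes:
            sse = sum(
                (mutated / analyzed - sigmoid(length, m, s)) ** 2
                for length, mutated, analyzed in points
            )
            if best is None or sse < best[0]:
                best = (sse, m, s)
    return best[1], best[2]

# Invented example data: (repeat length, tumors with mutation, tumors analyzed).
observations = [
    (6, 2, 100), (7, 5, 100), (8, 18, 100), (9, 50, 100),
    (10, 82, 100), (11, 95, 100), (12, 98, 100),
]

# Hypothetical search grid: midpoints 6.0..12.0, slopes 0.1..3.0.
midpoints = [x / 10 for x in range(60, 121)]
slopes = [x / 10 for x in range(1, 31)]
midpoint, slope = fit_sigmoid(observations, midpoints, slopes)

def residual(length, mutated, analyzed):
    """Observed minus predicted frequency; positive means above the fitted curve."""
    return mutated / analyzed - sigmoid(length, midpoint, slope)
```

Under these assumptions, a repeat tract whose observed frequency lies well above the fitted curve, for example `residual(9, 90, 100)`, would be a candidate for positive selection, and one lying well below it a candidate for negative selection.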

Mutational data from the literature are collected and stored in a MySQL database. Tissue-specific regression analyses are performed with R and nls2. All steps, from database query and R calculation to complete web page presentation, are carried out by a number of Perl scripts.
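The database query step can be sketched as follows. This is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database as a stand-in for MySQL and Perl, and the table layout, column names, gene names, and reference labels are all hypothetical. Pooling observations from several references by summing counts is likewise an assumption, not a documented SelTarbase procedure.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE observation (
           reference     TEXT,     -- hypothetical literature reference label
           gene          TEXT,     -- hypothetical gene name
           repeat_length INTEGER,  -- length of the mononucleotide repeat
           tissue        TEXT,     -- e.g. colon, stomach, endometrium
           analyzed      INTEGER,  -- MSI-H tumors analyzed
           mutated       INTEGER   -- tumors with a mutation in this MNR
       )"""
)

# Invented example rows.
rows = [
    ("ref_A", "GENE1", 10, "colon", 40, 30),
    ("ref_B", "GENE1", 10, "colon", 60, 50),
    ("ref_A", "GENE2", 8, "colon", 50, 5),
    ("ref_C", "GENE1", 10, "stomach", 20, 12),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?, ?, ?)", rows)

def pooled_frequencies(connection, tissue):
    """Pool counts across references and return (gene, length, frequency) per MNR."""
    query = """SELECT gene, repeat_length, SUM(mutated), SUM(analyzed)
               FROM observation
               WHERE tissue = ?
               GROUP BY gene, repeat_length
               ORDER BY gene"""
    return [
        (gene, length, mutated / analyzed)
        for gene, length, mutated, analyzed in connection.execute(query, (tissue,))
    ]

colon = pooled_frequencies(conn, "colon")
```

The resulting per-tissue list of repeat lengths and mutation frequencies is the kind of input a tissue-specific regression would take.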


The following table summarizes the contents of the selected and the previous release.

release                   last                        latest
date                      201306                      201307

references analyzed       817                         822
references included       611                         616
  colon                   394                         397
  stomach                 165                         167
  endometrium             96                          96
  colonculture            105                         108
genes analyzed            737                         739
genes included            696                         698
  colon                   569                         571
  stomach                 218                         220
  endometrium             105                         105
  colonculture            396                         396
MNRs analyzed             5266                        5271
MNRs included             4428 (3756 c., 646 nc.)     4433 (3761 c., 646 nc.)
  colon                   2322 (1830 c., 484 nc.)     2327 (1835 c., 484 nc.)
  stomach                 1097 (1021 c., 69 nc.)      1102 (1026 c., 69 nc.)
  endometrium             1110 (1076 c., 34 nc.)      1110 (1076 c., 34 nc.)
  colonculture            2589 (2309 c., 280 nc.)     2593 (2313 c., 280 nc.)
observations included
  colon                   182147                      182525
  stomach                 45355                       45509
  endometrium             60263                       60263
  colonculture            29616                       30614

SelTarbase version latest, release 201307, last updated 20130701.

