Sequences Analysed

by Data Source

abYsis loads sequences from a number of Data Sources, which are listed in column 1 of the table. Total Entries (column 2) gives the total number of entries for each Data Source; note that some of these may not be antibodies.

abYsis reads every entry and determines a subset of Qualifying Entries (column 3) for further analysis. The aim is to select every entry that contains immunoglobulin sequence data. The selection is based on textual annotations such as keywords and descriptions.
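As a rough illustration of annotation-based selection, the sketch below keeps an entry when a search term appears in its keywords or description. The term list, the entry layout and the function name are hypothetical; the actual rules abYsis applies are not described here.

```python
# Hypothetical search terms; the terms abYsis actually matches are an assumption.
IG_TERMS = ("immunoglobulin", "antibody")


def is_qualifying(entry, terms=IG_TERMS):
    """Return True if any term occurs in the entry's keywords or description."""
    text = " ".join(entry.get("keywords", [])) + " " + entry.get("description", "")
    text = text.lower()
    return any(term in text for term in terms)


# Made-up entries, for illustration only.
entries = [
    {"id": "X1", "keywords": ["Immunoglobulin"], "description": "heavy chain variable region"},
    {"id": "X2", "keywords": ["kinase"], "description": "serine/threonine protein kinase"},
]

qualifying = [e["id"] for e in entries if is_qualifying(e)]
print(qualifying)
```

A purely textual filter like this can miss poorly annotated entries, which is one reason a similarity-based data source can complement it.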

There may be multiple DNA and/or protein sequences for each Qualifying Entry. Both DNA and protein sequences (as available) are parsed and carried forward for further analysis in abYsis. For both DNA Sequences and Protein Sequences, the Total number of sequences of each type is shown (columns 4 and 6 respectively), as well as the number of Non-identical sequences (columns 5 and 7 respectively). Note that because each entry may contain more than one sequence, the total number of sequences may be greater than the number of qualifying entries.
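The relationship between the Total and Non-identical counts can be sketched as follows. This is a minimal example that assumes "non-identical" means distinct after ignoring letter case; the function name and the sample data are made up for illustration.

```python
def count_sequences(entries):
    """Return (total, non_identical) for a mapping of entry ID -> list of sequences.

    Assumption: two sequences are identical if they match exactly,
    ignoring letter case.
    """
    all_seqs = [seq.upper() for seqs in entries.values() for seq in seqs]
    return len(all_seqs), len(set(all_seqs))


# Three qualifying entries holding five sequences in total.
dna_by_entry = {
    "E1": ["ACGT", "acgt"],
    "E2": ["GGCC"],
    "E3": ["ACGT", "TTAA"],
}

total, non_identical = count_sequences(dna_by_entry)
print(total, non_identical)
```

Here three entries yield five sequences, of which three are distinct, so the total exceeds the number of entries while the non-identical count never exceeds the total.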

Note: The abYsis-EMBL-IG data source is compiled automatically based on sequence similarity to known, curated immunoglobulin sequences. Several thousand representative immunoglobulin query sequences have been used to search the EMBL database. The abYsis-EMBL-IG set combines the results of these searches.

by Species

The lower table shows the top 15 Species represented in abYsis, ranked by the number of chains.
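A ranking of this kind can be sketched with a simple tally. The example assumes one species label per chain; the function name and the sample labels are illustrative, not taken from abYsis.

```python
from collections import Counter


def top_species(chain_species, n=15):
    """Rank species by number of chains and return the top n as (species, count) pairs."""
    return Counter(chain_species).most_common(n)


# One species label per chain (made-up data).
labels = ["Homo sapiens", "Mus musculus", "Homo sapiens", "Oryctolagus cuniculus"]
ranking = top_species(labels, n=2)
print(ranking)
```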

The Chain is the basic data entity in abYsis. A chain may have a DNA sequence, a protein sequence, or both associated with it.
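One way to picture a chain is as a record with two optional sequence fields. The sketch below is only a mental model: the class, its field names (including `species`) and the rule that at least one sequence must be present are assumptions, not the abYsis schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chain:
    """Hypothetical model of a chain; not the actual abYsis data structure."""
    species: str
    dna: Optional[str] = None
    protein: Optional[str] = None

    def __post_init__(self):
        # Assumption: a chain must carry at least one of the two sequence types.
        if self.dna is None and self.protein is None:
            raise ValueError("a chain needs a DNA sequence, a protein sequence, or both")


protein_only = Chain(species="Homo sapiens", protein="EVQLVESGG")
both = Chain(species="Mus musculus", dna="GAGGTGCAG", protein="EVQ")
```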