WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

EBRCN General Meeting, Paris, 28-29/11/2002 1

WP4: Analysis of non-EBRCN databases and network services of interest to BRCs

Current status

Paolo Romano

WP4: databases of interest

Short delay: approx. 1 month

· Definition of a list of databases and services that could be of interest to BRCs: done

· Selection of a subset of those databases and services: done

WP4: identifiers and methods

· Selection of information of interest to BRCs within selected databases: ongoing, done for Medline & EMBL

· Analysis of identifiers and information and of methods for linking: ongoing, done for Medline

WP4: Pubmed IDs

· CABRI catalogue production guidelines update: ongoing, done for Literature in animal and human cells

· Retrieval of the PubMed IDs needed for linking: ongoing, done for ICLC, BCCM/LMBP and NCCB plasmids, with support from DSMZ (Kracht) and BCCM (Guissart)

WP4: structure and syntax

· Catalogue structures update: ongoing, done for Literature in animal and human cells

· SRS structure and syntax files: ongoing, depending on deadlines for submission of catalogues, done for ICLC

WP4: catalogue updates

Catalogue updates:

ICLC: November 2002 (done)

Plasmids and cell lines: January 2003

“Other catalogues”: February 2003

Bacteria: March 2003

Fungi and Yeasts: May 2003

WP4: EMBL links

• EMBL Data Library is the European database for DNA sequences

• It is updated daily, and coordination with NCBI and DDBJ ensures its completeness

• It is offered at EBI by means of SRS

WP4: EMBL links

• Tests have been conducted to identify how to link to the EMBL Data Library through SRS, without IDs

• Tests performed on:
    • Bacteria and Archaea
    • Animal and Human Cell Lines
    • Fungi and Yeasts
    • Plasmids
    • Viruses

WP4: EMBL links variability

• Links are different for different materials

• Links can use various EMBL fields:
    • All-text (not very useful)
    • Organism (for micro-organisms)
    • Division (useful for viruses and plasmids)
    • Feature Table data (allows for a correct definition of a source through Key, Qualifier, Description)

WP4: EMBL links variability

• Example search: CBS 100.20 in CBS_FIL

• Fields and values:
    • Organism: fungi
    • Ft-Key: source
    • Ft-Qualifier: strain
    • Ft-Description: "cbs 100.20"

WP4: EMBL links variability

• Annotation problems:
    • CBS 100.20 can be annotated as CBS 100.20 or CBS100.20
    • CBS 112345 can be annotated as CBS12345

• Indexing problems:
    • CBS 100.20 is indexed as CBS, 100 and 20
    • The dot is not included and is used as a space
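The indexing behaviour described above can be mimicked with a small sketch. It is only an illustration: the function name `index_tokens` is hypothetical, and lower-casing is an assumption based on the lower-case values used in the queries on these slides, not a documented description of how SRS builds its index.

```python
import re


def index_tokens(designation: str) -> list[str]:
    """Split a strain designation into index-like tokens.

    Whitespace and the dot both act as separators, as the slide
    describes; lower-casing is an assumption made for this sketch.
    """
    return [tok for tok in re.split(r"[\s.]+", designation.lower()) if tok]


# The two annotation variants of the same strain yield different tokens,
# which is why a single-term search cannot match both.
print(index_tokens("CBS 100.20"))  # ['cbs', '100', '20']
print(index_tokens("CBS100.20"))   # ['cbs100', '20']
```

A link built for this strain therefore has to cover both token sets.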

WP4: EMBL links variability

Examples of searches:

• Query: Bacteria & source & cip*
    ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & [emblrelease-FtDescription:cip*]) < [emblrelease-Organism:bacteria*] )

• Query: Cell line & source & dsm*
    ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:cell_line] & [emblrelease-FtDescription:dsm*]) < [emblrelease-Organism:mammalia*] )
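Both queries above follow the same pattern, so they could be generated from three values: organism, feature qualifier and collection acronym. The following is a minimal sketch under that assumption; `source_query` and its parameters are hypothetical names, and the bracket and operator syntax is simply copied from the two example queries rather than taken from SRS documentation.

```python
def source_query(organism: str, qualifier: str, acronym: str,
                 db: str = "emblrelease") -> str:
    """Assemble a query string in the same shape as the slide examples.

    organism  -- value for the Organism field (a trailing * is appended)
    qualifier -- feature qualifier, e.g. "strain" or "cell_line"
    acronym   -- collection acronym searched as a prefix in FtDescription
    """
    source = (
        f"[{db}-FtKey:source] & "
        f"[{db}-FtQualifier:{qualifier}] & "
        f"[{db}-FtDescription:{acronym.lower()}*]"
    )
    return f"( ({source}) < [{db}-Organism:{organism.lower()}*] )"


print(source_query("bacteria", "strain", "CIP"))
print(source_query("mammalia", "cell_line", "DSM"))
```

The two calls are meant to reproduce the two example queries for CIP bacterial strains and DSM cell lines.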

WP4: EMBL links variability

Example of search:

• Query: Fungi & source & cbs 100.20
    ( ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] ) | [emblrelease-FtDescription:cbs100] ) & [emblrelease-FtDescription:20]) ) < [emblrelease-Organism:fungi*] )
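The query for CBS 100.20 matches the acronym and first number either as two tokens or as one fused token, then requires the remaining number. A sketch of how such a query might be derived from a designation is given below. The names `designation_clause` and `strain_query` are hypothetical, and the parsing rule (letters, optional space, dot-separated digits) is an assumption that fits the CBS examples on these slides but not necessarily other collections. The output is logically equivalent to the query above but omits its redundant outer parentheses.

```python
import re


def designation_clause(designation: str, db: str = "emblrelease") -> str:
    """Build the FtDescription part for one strain designation.

    Accepts both "CBS 100.20" and "CBS100.20" and returns the same
    clause for either, covering both annotation variants.
    """
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*([\d.]+)\s*", designation)
    if match is None:
        raise ValueError(f"unsupported designation: {designation!r}")
    acronym = match.group(1).lower()
    numbers = [n for n in match.group(2).split(".") if n]
    if not numbers:
        raise ValueError(f"no number found in: {designation!r}")

    def field(value: str) -> str:
        return f"[{db}-FtDescription:{value}]"

    first, rest = numbers[0], numbers[1:]
    # Either "cbs" and "100" as separate tokens, or the fused "cbs100".
    clause = f"( ( {field(acronym)} & {field(first)} ) | {field(acronym + first)} )"
    for number in rest:
        clause += f" & {field(number)}"
    return clause


def strain_query(designation: str, organism: str, db: str = "emblrelease") -> str:
    """Wrap the clause in the source/strain/organism pattern from the slides."""
    return (
        f"( ([{db}-FtKey:source] & [{db}-FtQualifier:strain] & "
        f"{designation_clause(designation, db)}) < "
        f"[{db}-Organism:{organism.lower()}*] )"
    )


print(strain_query("CBS 100.20", "fungi"))
```

Because the designation is parsed before the clause is built, the spaced and unspaced spellings of the same strain produce an identical query.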

WP4: extracted databases

Extracted databases

• Selection of a meaningful subset of information (strain identification) for each material, including links to external databases and services: ongoing, a proposal will be sent to the collections next month