WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

EBRCN General Meeting, Paris, 28-29/11/2002

Paolo Romano

Page 1: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

EBRCN General Meeting, Paris, 28-29/11/2002 1

WP4 Analysis of non-EBRCN databases and network services of interest to BRCs

Current status

Paolo Romano

Page 2: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: databases of interest

Short delay: about 1 month

· Definition of a list of databases and services that could be of interest to BRCs: done

· Selection of a subset of those databases and services: done

Page 3: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: identifiers and methods

· Selection of information of interest to BRCs within selected databases: ongoing, done for Medline & EMBL

· Analysis of identifiers and information, and of methods for linking: ongoing, done for Medline

Page 4: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: Pubmed IDs

· CABRI catalogue production guidelines update: ongoing, done for Literature in animal and human cells

· Retrieval of needed PubMed IDs for linking: ongoing, done for ICLC, BCCM/LMBP, NCCB plasmids; support from DSMZ (Kracht) and BCCM (Guissart)

Page 5: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: structure and syntax

· Catalogue structures update: ongoing, done for Literature in animal and human cells

· SRS structure and syntax files: ongoing, depending on deadlines for submission of catalogues, done for ICLC

Page 6: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: catalogues updates

Catalogue updates:

· ICLC: November 2002 (done)

· Plasmids and cell lines: January 2003

· “Other catalogues”: February 2003

· Bacteria: March 2003

· Fungi and Yeasts: May 2003

Page 7: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links

• EMBL Data Library is the European database for DNA sequences

• It is updated daily, and coordination with NCBI and DDBJ ensures its completeness

• It is offered at EBI by means of SRS

Page 8: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links

• Tests have been conducted to identify how to link to the EMBL Data Library through SRS, without IDs

• Tests performed on:
  • Bacteria and Archaea
  • Animal and Human Cell Lines
  • Fungi and Yeasts
  • Plasmids
  • Viruses

Page 9: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links variability

• Links are different for different materials

• Links can use various EMBL fields:
  • All-text (not very useful)
  • Organism (for micro-organisms)
  • Division (useful for viruses and plasmids)
  • Feature Table data (allow for a correct definition of a source through Key, Qualifier, Description)

Page 10: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links variability

• Example search: CBS 100.20 in CBS_FIL

• Fields and values:
  • Organism: fungi
  • Ft-Key: source
  • Ft-Qualifier: strain
  • Ft-Description: "cbs 100.20"

Page 11: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links variability

• Annotation problems:
  • CBS 100.20 can be annotated as CBS 100.20 or CBS100.20
  • CBS 112345 can be annotated as CBS12345

• Indexing problems:
  • CBS 100.20 is indexed as CBS, 100 and 20
  • The dot is not indexed; it is treated as a space
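The indexing behaviour described above can be illustrated with a small sketch. It is not the SRS indexer itself: it assumes that indexing lowercases the text and splits on every non-alphanumeric character, which matches the "CBS, 100 and 20" example but is otherwise a guess. The function name `index_tokens` is hypothetical.

```python
import re


def index_tokens(description: str) -> list[str]:
    """Approximate how a strain description might be split into index tokens.

    Assumption: text is lowercased and any run of non-alphanumeric
    characters (space, dot, ...) acts as a separator.
    """
    return [t for t in re.split(r"[^0-9a-z]+", description.lower()) if t]


# The two annotation variants of the same strain yield different tokens,
# which is why a single search term cannot match both.
print(index_tokens("CBS 100.20"))  # ['cbs', '100', '20']
print(index_tokens("CBS100.20"))   # ['cbs100', '20']
```

Because the two annotation styles produce different token sets, a query has to cover both forms explicitly.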

Page 12: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links variability

Examples of searches:

• Query: Bacteria & source & cip*
  ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & [emblrelease-FtDescription:cip*]) < [emblrelease-Organism:bacteria*] )

• Query: Cell line & source & dsm*
  ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:cell_line] & [emblrelease-FtDescription:dsm*]) < [emblrelease-Organism:mammalia*] )
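Both example queries share the same shape, so they could be generated from three values. The following is a minimal sketch that only assembles the query string in the form shown on the slide; the function name `srs_source_query` and its parameters are hypothetical, and nothing here contacts an SRS server.

```python
def srs_source_query(organism: str, qualifier: str, description: str,
                     db: str = "emblrelease") -> str:
    """Build a query string of the form used in the examples above.

    The feature-table part restricts hits to 'source' features with the
    given qualifier and description; it is then combined with the
    organism restriction using the '<' operator, as on the slide.
    """
    feature = (f"[{db}-FtKey:source] & [{db}-FtQualifier:{qualifier}] "
               f"& [{db}-FtDescription:{description}]")
    return f"( ({feature}) < [{db}-Organism:{organism}] )"


# Reproduces the two slide examples.
print(srs_source_query("bacteria*", "strain", "cip*"))
print(srs_source_query("mammalia*", "cell_line", "dsm*"))
```

Only the qualifier (`strain` or `cell_line`), the collection acronym pattern and the organism group change between materials.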

Page 13: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: EMBL links variability

Example of search:

• Query: Fungi & source & cbs 100.20
  ( ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] ) | [emblrelease-FtDescription:cbs100] ) & [emblrelease-FtDescription:20]) ) < [emblrelease-Organism:fungi*] )
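The description part of this query covers both annotation variants (acronym and number separated, or written together). A possible way to derive that clause from a catalogue identifier is sketched below. It assumes identifiers consist of an acronym followed by numeric parts; the function name `description_clause` is hypothetical, and the output is only a string in the slide's syntax.

```python
import re


def description_clause(identifier: str, db: str = "emblrelease") -> str:
    """Build the FtDescription clause for one strain identifier.

    Assumption: the identifier is an acronym followed by one or more
    numeric parts, e.g. 'CBS 100.20'. The clause matches the acronym and
    first number either as separate tokens or joined into one token;
    any remaining parts must also be present.
    """
    # Split into alternating letter and digit runs, ignoring punctuation,
    # so 'CBS 100.20' and 'CBS100.20' give the same parts.
    tokens = re.findall(r"[a-z]+|[0-9]+", identifier.lower())
    if len(tokens) < 2:
        raise ValueError("expected an acronym followed by a number")

    def field(value: str) -> str:
        return f"[{db}-FtDescription:{value}]"

    acronym, first, rest = tokens[0], tokens[1], tokens[2:]
    clause = f"( ( {field(acronym)} & {field(first)} ) | {field(acronym + first)} )"
    for token in rest:
        clause += f" & {field(token)}"
    return clause


print(description_clause("CBS 100.20"))
```

For `CBS 100.20` this yields the same description clause as in the example query; the clause would still need to be wrapped with the FtKey, FtQualifier and Organism conditions.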

Page 14: WP4 Analysis of non-EBRCN databases and network services of interest to BRCs Current status

WP4: extracted databases

Extracted databases

• Selection of a meaningful subset of information (strain identification) for each material, including links to external databases/services: ongoing; a proposal will be sent to collections next month