WP4 Current status

EBRCN General Meeting, Genova, 26-28/03/2003 1 WP4 Current status Paolo Romano & WP4 group


Transcript of WP4 Current status

Page 1: WP4 Current status

WP4

Current status

Paolo Romano & WP4 group

Page 2: WP4 Current status

WP4: Linking to Medline (i)

Retrieval of PubMed IDs still ongoing

• MK amending DSMZ literature db (40% - 60%)

• FG supporting BCCM, manual insertion after restructuring of dbs

• GS retrieving PMIDs for CBS catalogues

• PR retrieving PMIDs for NCCB catalogues

• Approaching CIP, CABI, NCIMB

• ECACC?

Page 3: WP4 Current status

WP4: Linking to Medline (ii)

Catalogue structures

• ongoing (collections’ tasks)

• done for ICLC, BCCM/LMBP, DSMZ (maybe others as well)

SRS structure and syntax files

• ongoing, based on Catalogue Production guidelines

• completed during conversion to SRS 7

• done for ICLC and BCCM/LMBP

Page 4: WP4 Current status

WP4: Linking to Medline (iii)

Catalogues updated:

ICLC: November 2002

BCCM/LMBP: March 2003

DSMZ Literature db: May 2003 (plan)

Other BCCM, NCCB plasmids: at next catalogue update

Page 5: WP4 Current status

WP4: Linking to other Literature

Some literature refs are missing from Medline

• Biosis & Embase available for downloading by BD

• CABI abstracts

• check how many recent literature refs are not in Medline, and are either in Biosis or Embase or CABI Abstracts
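The check proposed in the last bullet amounts to a few set operations. The sketch below is only an illustration: it assumes each reference has already been reduced to a comparable key (how that is done is not specified in the slides), and the function name and report keys are hypothetical.

```python
from typing import Dict, Set


def coverage_outside_medline(
    refs: Set[str], medline: Set[str], others: Dict[str, Set[str]]
) -> Dict[str, int]:
    """Count catalogue references missing from Medline but present elsewhere.

    refs    -- keys of the recent literature references in the catalogues
    medline -- keys of the references found in Medline
    others  -- keys found in each alternative source (e.g. Biosis, Embase)
    All keys are hypothetical normalised citation strings.
    """
    missing = refs - medline
    report = {"missing_from_medline": len(missing)}
    covered: Set[str] = set()
    for name, ids in others.items():
        hits = missing & ids          # missing refs that this source holds
        report[name] = len(hits)
        covered |= hits
    report["covered_by_any"] = len(covered)
    report["not_found_anywhere"] = len(missing - covered)
    return report


if __name__ == "__main__":
    # Made-up keys, for illustration only.
    print(coverage_outside_medline(
        {"ref-a", "ref-b", "ref-c", "ref-d"},
        {"ref-a"},
        {"Biosis": {"ref-b", "ref-c"}, "Embase": {"ref-c"}, "CABI Abstracts": set()},
    ))
```

A reference held by several sources is counted once in `covered_by_any`, so the per-source counts may add up to more than that total.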

Page 6: WP4 Current status

Linking to EMBL (i)

• Test for linking to EMBL Data Library through SRS, without IDs, gave negative results:

• Links are different for different materials and can use various EMBL fields: Organism (micro-organisms), Division (viruses and plasmids), Feature Table (definition of the source through Key, Qualifier, Description)

• Annotation problems (e.g., missing spaces)

• Indexing problems (e.g., use of dots)

Page 7: WP4 Current status

Linking to EMBL (ii)

Examples of search:

• Query: Fungi & source & cbs 100.20

( ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] ) | [emblrelease-FtDescription:cbs100] ) & [emblrelease-FtDescription:20]) ) < [emblrelease-Organism:fungi*] )
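The query above appears to follow a regular pattern: the strain number is split at the dot (the indexing problem), and the collection acronym is accepted either as a separate token or fused with the first number (the missing-space annotation problem). A minimal sketch of generating such a query string for other strains is given below. The function name is hypothetical, only the field names come from the slide, and the result has not been validated against an SRS server.

```python
def build_embl_strain_query(acronym: str, strain_number: str, organism: str) -> str:
    """Build an SRS query string shaped like the slide's example.

    acronym, strain_number and organism are illustrative inputs,
    e.g. ("CBS", "100.20", "Fungi").
    """
    field = "emblrelease-FtDescription"
    acronym = acronym.lower()
    # Dots are treated as token separators, so "100.20" becomes "100" and "20".
    parts = [p for p in strain_number.split(".") if p]
    if not parts:
        raise ValueError("strain_number must contain at least one token")
    first, rest = parts[0], parts[1:]
    # Accept both "cbs 100" (two tokens) and "cbs100" (space missing in the annotation).
    head = (
        f"( ( [{field}:{acronym}] & [{field}:{first}] ) "
        f"| [{field}:{acronym}{first}] )"
    )
    terms = ["[emblrelease-FtKey:source]", "[emblrelease-FtQualifier:strain]", head]
    terms += [f"[{field}:{p}]" for p in rest]
    feature = "(" + " & ".join(terms) + ")"
    # "<" links the feature-table hits back to entries of the given organism.
    return f"( {feature} < [emblrelease-Organism:{organism.lower()}*] )"


if __name__ == "__main__":
    print(build_embl_strain_query("CBS", "100.20", "Fungi"))
```

The generated string drops one redundant pair of parentheses from the slide's version but keeps the same terms and operators.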

Page 8: WP4 Current status

Linking to EMBL (iii)

• Identify ID-based cross-references for linking from CABRI catalogues to EMBL (and vice versa)

• A huge number of EMBL records could be linked to a single CABRI item

• Add links in EMBL and use these links when linking from CABRI (search by SRS)

• ID based links to CABRI included in EMBL data library and distributed with it

Page 9: WP4 Current status

Linking to EMBL (iv)

• Agreement with EBI (list of cross-references)

• Work to be done after uploading to EBI of CABRI extracted catalogues: end of June 2003

• Table of cross-references returned to collections

• Possible well-known “wrong” EMBL sequences removed from table

• Links from plasmid catalogues to EMBL managed differently (using current remarks)
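The cross-reference table described here can be pictured as a one-to-many mapping from a CABRI item to EMBL entries, with known wrong sequences filtered out before the table goes back to the collections. The sketch below assumes the input is a list of (CABRI ID, EMBL accession) pairs. The function name, that input shape and all identifiers are hypothetical, since the slides do not describe the actual table format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_crossref_table(
    pairs: Iterable[Tuple[str, str]], known_wrong: Set[str]
) -> Dict[str, List[str]]:
    """Group EMBL accessions by CABRI item, skipping known wrong sequences."""
    table: Dict[str, List[str]] = defaultdict(list)
    for cabri_id, embl_acc in pairs:
        if embl_acc in known_wrong:
            continue  # sequence flagged as wrong by the collection
        if embl_acc not in table[cabri_id]:
            table[cabri_id].append(embl_acc)  # keep first-seen order, no duplicates
    return dict(table)


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    pairs = [
        ("ITEM-1", "ACC001"),
        ("ITEM-1", "ACC002"),
        ("ITEM-1", "ACC001"),
        ("ITEM-2", "ACC003"),
    ]
    print(build_crossref_table(pairs, known_wrong={"ACC002"}))
```

An item whose accessions are all flagged as wrong does not appear in the result at all.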

Page 10: WP4 Current status

Linking to other sources

• Links to maps of BCCM/LMBP plasmids added in a new field: next catalogue update

• Images of micro-organisms (CBS & BCCM) added in a new field: starting from next updates

• Further links (nomenclature, acronyms, genes) under analysis

• D10: integrated Biological Resource database

• Interconnected vs integrated

Page 11: WP4 Current status

Extracted databases

• Selected meaningful subset of information: MDS+link to main CABRI site

• Established agreement with EBI

• Preparation of catalogues, using SRS: end of April 2003

• Setting up of a dedicated FTP site: end of April 2003

• Upload of catalogues to EBI: end of May 2003

• Automatic updating by EBI by FTP through SRS Prisma

Page 12: WP4 Current status

Inventory of data usage and sets

M12: “Inventory of BRC data usage and data sets completed to enable the identification of data sources needed to provide a comprehensive information centre on BRCs and their holdings”

• Methodologies and Protocols

• Reference lists (acronyms, countries, bacterial nomenclature)

• Internal papers/theses

• CABRI guidelines

Page 13: WP4 Current status

Inventory of data usage and sets

• List of data sets, by category, with links to information sources

• Searchable database with links to information sources

• GlobalSearch applied to partners’ web sites

• ht://Dig can be used to index all partners’ sites and search their contents in a single step (only static files, not searchable archives/databases)