Page 1: WP4 Current status

EBRCN General Meeting, Genova, 26-28/03/2003

WP4

Current status

Paolo Romano & WP4 group

Page 2: WP4 Current status

WP4: Linking to Medline (i)

Retrieval of PUBMED IDs still ongoing

• MK amending DSMZ literature db (40% - 60%)

• FG supporting BCCM, manual insertion after restructuring of dbs

• GS retrieving PMIDs for CBS catalogues

• PR retrieving PMIDs for NCCB catalogues

• Approaching CIP, CABI, NCIMB

• ECACC?

Page 3: WP4 Current status

WP4: Linking to Medline (ii)

Catalogue structures

• ongoing (collections’ tasks)

• done for ICLC, BCCM/LMBP, DSMZ (maybe others as well)

SRS structure and syntax files

• ongoing, based on Catalogue Production guidelines

• completed during conversion to SRS 7

• done for ICLC and BCCM/LMBP

Page 4: WP4 Current status

WP4: Linking to Medline (iii)

Catalogues updated:

ICLC: November 2002

BCCM/LMBP: March 2003

DSMZ Literature db: May 2003 (plan)

Other BCCM, NCCB plasmids: at next catalogue update

Page 5: WP4 Current status

WP4: Linking to other Literature

Some literature refs are missing from Medline

• Biosis & Embase available for downloading by BD

• CABI abstracts

• check how many recent literature refs are not in Medline, and are either in Biosis or Embase or CABI Abstracts

Page 6: WP4 Current status

Linking to EMBL (i)

• Test for linking to EMBL Data Library through SRS, without IDs, gave negative results:

• Links are different for different materials and can use various EMBL fields: Organism (micro-organisms), Division (viruses and plasmids), Feature Table (definition of the source through Key, Qualifier, Description)

• Annotation problems (e.g., missing spaces)

• Indexing problems (e.g., use of dots)

Page 7: WP4 Current status

Linking to EMBL (ii)

Examples of search:

• Query: Fungi & source & cbs 100.20

( ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] ) | [emblrelease-FtDescription:cbs100] ) & [emblrelease-FtDescription:20]) ) < [emblrelease-Organism:fungi*] )
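
The example query works around the annotation and indexing problems noted earlier: it accepts the strain designation with or without a space after the acronym ("cbs" and "100", or "cbs100"), and it searches the part after the dot ("20") as a separate term. A minimal Python sketch of how such a query string could be assembled is given here. The function name `build_embl_strain_query`, its parameters, and the generalisation to other strain numbers are assumptions made for illustration, not part of the WP4 work; only the field names and operators are taken from the slide.

```python
import re


def build_embl_strain_query(acronym, number, organism, databank="emblrelease"):
    """Assemble an SRS query string shaped like the slide's example.

    Hypothetical helper: the slide shows only one hand-written query.
    """

    def field(name, value):
        return f"[{databank}-{name}:{value}]"

    acronym = acronym.lower()
    # Dots are treated as token separators, so "100.20" becomes "100" and "20".
    parts = [p for p in re.split(r"[.\s]+", number) if p]
    if not parts:
        raise ValueError("strain number must contain at least one token")
    first, rest = parts[0], parts[1:]

    # Annotation may lack the space ("CBS100.20"): accept both spellings.
    spaced = f"( {field('FtDescription', acronym)} & {field('FtDescription', first)} )"
    fused = field("FtDescription", acronym + first)

    terms = [
        field("FtKey", "source"),
        field("FtQualifier", "strain"),
        f"( {spaced} | {fused} )",
    ]
    terms += [field("FtDescription", p) for p in rest]
    feature = " & ".join(terms)

    # Restrict the feature-table hits to entries of the given organism group.
    return f"( ( {feature} ) < {field('Organism', organism.lower() + '*')} )"


if __name__ == "__main__":
    print(build_embl_strain_query("CBS", "100.20", "Fungi"))
```

For "CBS", "100.20" and "Fungi" the sketch produces the same terms and operators as the query on the slide, with one redundant pair of parentheses left out.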

Page 8: WP4 Current status

Linking to EMBL (iii)

• Identify ID-based cross-references for linking from CABRI catalogues to EMBL (and vice versa)

• A huge number of EMBL records could be linked to a single CABRI item

• Add links in EMBL and use these links when linking from CABRI (search by SRS)

• ID-based links to CABRI included in EMBL data library and distributed with it

Page 9: WP4 Current status

Linking to EMBL (iv)

• Agreement with EBI (list of cross-references)

• Work to be done after uploading of CABRI extracted catalogues to EBI: end of June 2003

• Table of cross-references returned to collections

• Possible well-known “wrong” EMBL sequences removed from table

• Links from plasmid catalogues to EMBL managed differently (using current remarks)
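
As an illustration of the cross-reference table described above, the following Python sketch groups EMBL accessions by CABRI item (one item may map to many EMBL records) and drops accessions flagged as known to be wrong. The function name, the input layout, and the accession values are hypothetical placeholders; the slides do not specify the table's actual format.

```python
from collections import defaultdict


def build_crossref_table(pairs, known_wrong=()):
    """Group (cabri_id, embl_accession) pairs by CABRI item.

    Accessions listed in known_wrong are skipped, and duplicates are kept once.
    """
    wrong = set(known_wrong)
    table = defaultdict(list)
    for cabri_id, accession in pairs:
        if accession in wrong:
            continue  # flagged as a known "wrong" sequence
        if accession not in table[cabri_id]:
            table[cabri_id].append(accession)
    return dict(table)


if __name__ == "__main__":
    # Placeholder accessions, for illustration only.
    pairs = [
        ("CBS 100.20", "ACC0001"),
        ("CBS 100.20", "ACC0002"),
        ("CBS 100.20", "ACC0001"),
        ("CBS 100.21", "ACC0003"),
    ]
    print(build_crossref_table(pairs, known_wrong=["ACC0003"]))
```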

Page 10: WP4 Current status

Linking to other sources

• Links to maps of BCCM/LMBP plasmids added in a new field: next catalogue update

• Images of micro-organisms (CBS & BCCM) added in a new field: starting from next updates

• Further links (nomenclature, acronyms, genes) under analysis

• D10: integrated Biological Resource database

• Interconnected vs integrated

Page 11: WP4 Current status

Extracted databases

• Selected meaningful subset of information: MDS+link to main CABRI site

• Established agreement with EBI

• Preparation of catalogues, using SRS: end of April 2003

• Setting up of a dedicated FTP site: end of April 2003

• Upload of catalogues to EBI: end of May 2003

• Automatic updating by EBI by FTP through SRS Prisma

Page 12: WP4 Current status

Inventory of data usage and sets

M12: “Inventory of BRC data usage and data sets completed to enable the identification of data sources needed to provide a comprehensive information centre on BRCs and their holdings”

• Methodologies and Protocols

• Reference lists (acronyms, countries, bacterial nomenclature)

• Internal papers/theses

• CABRI guidelines

Page 13: WP4 Current status

Inventory of data usage and sets

• List of data sets, by category, with links to information sources

• Searchable database with links to information sources

• GlobalSearch applied to partners’ web sites

• ht://Dig can be used to index all partners’ sites and search their contents in a single step (only static files, not searchable archives/databases)