WP4 Current status
EBRCN General Meeting, Genova, 26-28/03/2003
WP4
Current status
Paolo Romano & WP4 group
WP4: Linking to Medline (i)
Retrieval of PubMed IDs still ongoing
• MK amending DSMZ literature db (40% - 60%)
• FG supporting BCCM, manual insertion after restructuring of dbs
• GS retrieving PMIDs for CBS catalogues
• PR retrieving PMIDs for NCCB catalogues
• Approaching CIP, CABI, NCIMB
• ECACC?
WP4: Linking to Medline (ii)
Catalogue structures
• ongoing (collections’ tasks)
• done for ICLC, BCCM/LMBP, DSMZ (maybe others as well)
SRS structure and syntax files
• ongoing, based on Catalogue Production guidelines
• completed during conversion to SRS 7
• done for ICLC and BCCM/LMBP
WP4: Linking to Medline (iii)
Catalogues updated:
ICLC: November 2002
BCCM/LMBP: March 2003
DSMZ Literature db: May 2003 (plan)
Other BCCM, NCCB plasmids: at next catalogue update
WP4: Linking to other Literature
Some literature refs are missing from Medline
• Biosis & Embase available for downloading by BD
• CABI Abstracts
• Check how many recent literature refs are not in Medline, and are either in Biosis or Embase or CABI Abstracts
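The check in the last bullet can be sketched with plain set operations. This is only an illustration: it assumes each literature reference has already been reduced to a comparable key, and the function name, the keys, and the sample data are all hypothetical. Restricting the input to "recent" references is left to the caller.

```python
def coverage_report(refs, medline, other_sources):
    """Report which references are missing from Medline and where else they appear.

    refs          -- iterable of reference keys taken from the catalogues (hypothetical keys)
    medline       -- set of keys known to be in Medline
    other_sources -- dict mapping a source name to the set of keys it contains
    """
    refs = set(refs)
    missing = refs - medline
    # For every alternative source, keep only the references Medline lacks.
    found_elsewhere = {name: missing & keys for name, keys in other_sources.items()}
    covered = set().union(*found_elsewhere.values())
    return {
        "missing_from_medline": missing,
        "found_elsewhere": found_elsewhere,
        "not_found_anywhere": missing - covered,
    }


# Made-up sample data, for illustration only.
report = coverage_report(
    refs={"ref-a", "ref-b", "ref-c", "ref-d"},
    medline={"ref-a"},
    other_sources={
        "Biosis": {"ref-b"},
        "Embase": {"ref-b", "ref-c"},
        "CABI Abstracts": set(),
    },
)
print(len(report["missing_from_medline"]), "refs missing from Medline")
```

The size of `not_found_anywhere` indicates how many references would remain unlinked even after adding the three extra sources.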
Linking to EMBL (i)
• Test for linking to EMBL Data Library through SRS, without IDs, gave negative results:
  • Links are different for different materials and can use various EMBL fields:
    • Organism (micro-organisms), Division (viruses and plasmids), Feature Table (definition of the source through Key, Qualifier, Description)
  • Annotation problems (e.g., missing spaces)
  • Indexing problems (e.g., use of dots)
Linking to EMBL (ii)
Examples of search:
• Query: Fungi & source & cbs 100.20
( ( ([emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain] & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] ) | [emblrelease-FtDescription:cbs100] ) & [emblrelease-FtDescription:20]) ) < [emblrelease-Organism:fungi*] )
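The query above follows a pattern that could be generated from a collection acronym and a strain number. The sketch below only does string assembly that reproduces the example. The function name and its parameters are hypothetical, and it assumes that the index splits the strain number at dots and that a missing space may fuse the acronym with the first number part.

```python
def build_srs_query(acronym, number, organism="fungi*", lib="emblrelease"):
    """Assemble an SRS query string for a strain such as 'cbs 100.20'."""
    acronym = acronym.lower()
    # Assumption: dots are token separators in the index, so '100.20' -> '100', '20'.
    first, *rest = number.split(".")

    def desc(token):
        return f"[{lib}-FtDescription:{token}]"

    # The acronym and the first number part may be indexed separately
    # ('cbs', '100') or fused by a missing space ('cbs100').
    head = f"( ( {desc(acronym)} & {desc(first)} ) | {desc(acronym + first)} )"
    terms = [f"[{lib}-FtKey:source]", f"[{lib}-FtQualifier:strain]", head]
    terms += [desc(part) for part in rest]
    feature = " & ".join(terms)
    # Keep only feature matches that belong to entries of the wanted organism group.
    return f"( ( ({feature}) ) < [{lib}-Organism:{organism}] )"


query = build_srs_query("CBS", "100.20")
print(query)
```

Called with `"CBS"` and `"100.20"`, the function returns the same string as the query shown on the slide.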
Linking to EMBL (iii)
• Identify ID-based cross-references for linking from CABRI catalogues to EMBL (and vice versa)
• A huge number of EMBL records could be linked to a single CABRI item
• Add links in EMBL and use these links when linking from CABRI (search by SRS)
• ID-based links to CABRI included in EMBL data library and distributed with it
Linking to EMBL (iv)
• Agreement with EBI (list of cross-refs)
• Work to be done after uploading to EBI of CABRI extracted catalogues: end of June 2003
• Table of cross-references returned to collections
• Possible well-known “wrong” EMBL sequences removed from table
• Links from plasmid catalogues to EMBL managed differently (using current remarks)
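The cross-reference table step can be illustrated as follows. This is a sketch under assumptions, not the procedure agreed with EBI: the input format (pairs of CABRI item and EMBL accession), the function name, and the accession values are hypothetical placeholders.

```python
from collections import defaultdict


def build_crossref_table(pairs, known_wrong=frozenset()):
    """Group EMBL accessions by CABRI item, dropping known wrong sequences.

    pairs       -- iterable of (cabri_item, embl_accession) tuples
    known_wrong -- accessions that collections have flagged as wrong
    """
    table = defaultdict(list)
    for cabri_item, accession in pairs:
        if accession in known_wrong:
            continue  # flagged by a collection, so removed from the table
        if accession not in table[cabri_item]:
            table[cabri_item].append(accession)  # skip duplicates, keep order
    return dict(table)


# Placeholder accessions, not real EMBL records.
table = build_crossref_table(
    pairs=[
        ("CBS 100.20", "ACC0001"),
        ("CBS 100.20", "ACC0002"),
        ("CBS 100.20", "ACC0001"),
        ("ITEM-2", "ACC0003"),
    ],
    known_wrong={"ACC0002"},
)
print(table)
```

Because one CABRI item may map to a very large number of EMBL records, each item holds a list of accessions rather than a single value.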
Linking to other sources
• Links to maps of BCCM/LMBP plasmids added in a new field: next catalogue update
• Images of micro-organisms (CBS & BCCM) added in a new field: starting from next updates
• Further links (nomenclature, acronyms, genes) under analysis
• D10: integrated Biological Resource database
• Interconnected vs integrated
Extracted databases
• Selected meaningful subset of information: MDS+link to main CABRI site
• Established agreement with EBI
• Preparation of catalogues, using SRS: end of April 2003
• Setting up of a purpose FTP site: end of April 2003
• Upload of catalogues to EBI: end of May 2003
• Automatic updating by EBI by FTP through SRS Prisma
Inventory of data usage and sets
M12: “Inventory of BRC data usage and data sets completed to enable the identification of data sources needed to provide a comprehensive information centre on BRCs and their holdings”
• Methodologies and Protocols
• Reference lists (acronyms, countries, bacterial nomenclature)
• Internal papers/theses
• CABRI guidelines
Inventory of data usage and sets
• List of data sets, by category, with links to information sources
• Searchable database with links to information sources
• GlobalSearch applied to partners’ web sites
• ht://Dig can be used to index all partners’ sites and search their contents in a single step (only static files, not searchable archives/databases)