OSTIblog article by David Wojick on Tue, 1 Apr, 2008
When it comes to science and technology development, OSTI
people are writing one of the biggest Internet success stories. Everyone talks
about how the Internet is changing science, but OSTI is making it happen, and
doing it on a shoestring budget.
The reason is simple: what OSTI makes happen goes to the
heart of what science does, which is to share and combine thinking. Science is
a colossal exercise in thought sharing, and has been for 400 years. Every
achievement is incremental. Thus scientific communication is essential for
scientific progress.
That the Internet greatly increases the potential for
communication is well known. What often goes unrecognized is the great gap
between raw Internet accessibility and actual communication. The missing
element that bridges this gap is something we call "findability." If
something is available via the Internet, can it be found with reasonable
effort? If not, then it might as well not be there.
OSTI is leading a revolution in findability. OSTI does not
create new content; rather it creates portals and search engines that find vast
quantities of hard-to-find scientific and technological content that already
exists. This is extremely important to science because general-purpose
search engines like Google rarely find scholarly content.
In some cases OSTI works alone but in many cases it
collaborates with other national and international organizations. Sometimes
OSTI crawls the surface Web but in many cases OSTI has led the application of
federation to deep Web databases. In all cases the goal is the same: to make
important scholarly content findable by those who need it.
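To make the idea of federation concrete, here is a minimal sketch in Python. It is not OSTI's implementation: the two source functions, their records, and the relevance scores are hypothetical stand-ins for deep Web databases. The sketch sends one query to every source in parallel, merges the hits, and ranks them.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for deep Web databases. Each takes a query and
# returns a list of (title, relevance score) pairs. Real sources would be
# remote search interfaces, not in-memory dictionaries.
def search_reports(query):
    records = {"solar": [("Solar cell efficiency report", 0.9)]}
    return records.get(query, [])

def search_preprints(query):
    records = {"solar": [("Preprint on solar storage", 0.7),
                         ("Solar cell efficiency report", 0.6)]}
    return records.get(query, [])

def federated_search(query, sources):
    """Send one query to every source in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda source: source(query), sources))
    best = {}
    for results in result_lists:
        for title, score in results:
            # Keep the highest score when the same title comes back twice.
            best[title] = max(score, best.get(title, 0.0))
    # Rank the merged hits, highest score first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

hits = federated_search("solar", [search_reports, search_preprints])
```

The point of the design is that the user issues a single query and never needs to know which database held the answer. A crawler, by contrast, would need each page to be reachable by following links.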
The various portals that OSTI either owns or operates form a
rough hierarchy. That is, some are more general than others and in many cases
the narrower, more specialized portals are incorporated into the more general
ones to some degree. This architecture reflects the interlocking nature of
scientific activity.
A few of OSTI's many search tools are described below, from
narrow to broad. Each is a technical tool that has to be understood to be
properly used. None is simple. Also, each is relatively crude. Google spends
over $4 billion a year, including $500 million on R&D. The National Library
of Medicine spends around $100 million on R&D. OSTI's total budget, not
just R&D, is just $9 million, so there are few bells and whistles. But there
are over 200,000,000 pages of findable research results and technical material
on OSTI portals, with more every day. Collectively this is by far the largest
source of Web-based scholarly science and technology content available, an
astounding feat for such a small agency.
Some special OSTI collections
Information Bridge
This is OSTI's foundation collection, the filing cabinet of
all DOE research reports for the last decade. Tens of billions of dollars' worth
of research is documented here, much of it power-related. It has 165,000 fully
searchable full-text documents, each with extensive bibliographic information.
This makes it possible to do complex advanced searches using different metadata
fields in the document database.
A powerful and independently useful feature in the advanced
search function of Information Bridge is the subject "select" button.
This brings up a very large semantic structure or word-word link system that is
designed to help users find the best technical search terms. The system
combines a taxonomy of energy-related words with what is called a thesaurus.
The thesaurus does not provide synonyms, but rather clusters of terms that are
closely related from a scientific or engineering point of view. The system
includes 30,000 words, about 200,000 word-word relations, and 45,000 taxonomic
pathways from broader to narrower concepts. The system is useful in
understanding the concept structure of energy science and engineering.
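The combination of a taxonomy and a thesaurus can be pictured with a small Python sketch. This is an illustration only, not the actual Information Bridge system: the terms, the two dictionaries, and both function names are hypothetical, and a pathway is assumed here to run from a starting term down to a concept with nothing narrower beneath it.

```python
# Hypothetical miniature of the structure: a taxonomy of broader -> narrower
# terms, plus a thesaurus of related (not synonymous) terms.
narrower = {
    "energy sources": ["solar energy", "wind energy"],
    "solar energy": ["photovoltaic cells"],
}
related = {
    "photovoltaic cells": ["semiconductor materials"],
}

def taxonomic_pathways(term, path=()):
    """Return every path from term down to a concept with no narrower terms."""
    path = path + (term,)
    children = narrower.get(term, [])
    if not children:
        return [path]
    pathways = []
    for child in children:
        pathways.extend(taxonomic_pathways(child, path))
    return pathways

def suggest_terms(term):
    """Suggest search terms: narrower concepts first, then related terms."""
    return narrower.get(term, []) + related.get(term, [])

paths = taxonomic_pathways("energy sources")
```

Here a user who starts with "energy sources" can walk down to "photovoltaic cells" and then pick up "semiconductor materials" as a related term, which is the kind of help in choosing technical search terms that the select button is meant to give.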
E-print Network
This is a federated and crawled collection of about 5
million scholarly articles and related materials found in databases and on the
web. It includes what are called preprints, which include articles that have not
yet appeared in scholarly journals. It also includes the publication web pages
of over 28,000 university faculty, mostly in science and research engineering
departments. This makes it easy to go from a single paper to the whole body of
a researcher's related work.
Science and Engineering Conference Proceedings
Conference proceedings often precede publication of research
results by a year or more, and this collection federates 26 large databases.
There are hundreds of thousands of papers and presentations, many from
professional societies.
OSTI-wide search
Science Accelerator
The Science Accelerator searches ten major OSTI collections,
including Information Bridge, E-print Network, and the Conferences portal,
described above. It also searches R&D project descriptions, the Energy
Citations Database, DOE R&D Accomplishments, DOE-sponsored patents, and
EnergyFiles, a collection of energy-related databases and websites.
Government-wide search
Federal R&D Project Summaries
This is a federated gateway to individual project summaries
from six of the largest research funding agencies. In many cases the search
results include recent awards, which may precede research reports or
publications by several years.
Science.gov
Science.gov is a search engine for government science
information and research results. Currently in its fourth generation,
Science.gov provides search of more than 50 million pages of science
information with just one query, and is a gateway to over 1,800 authoritative
scientific Web sites and over 30 large scientific databases.
Worldwide search
Worldwidescience.org
Whereas Science.gov federates the US Government science and
engineering databases and websites, the idea behind Worldwidescience.org is to
combine similar resources from many different countries. While still very new,
WWS.org already includes major collections from 44 different countries, on
every inhabited continent. Science.gov is the major US contribution.
Taken together, this is an impressive list of integrated
science and technology portals. But believe it or not, there is a lot more
coming.
David Wojick, Ph.D.
Senior consultant for Innovation
OSTI