Saturday, September 6, 2014

DOE says OA publishers must join CHORUS

Here is an article that I recently wrote for my newsletter "Inside Public Access."

DOE says OA publishers must join CHORUS
From Inside Public Access A weekly newsletter
September 2, 2014

 By David Wojick, Ph.D.

Synopsis: The US Energy Department says that if publishers want PAGES-based public access for their OA articles, they must join CHORUS to get it. This is ironic, as the OA community has been generally negative about CHORUS. Non-CHORUS publishers may want to ask DOE to reconsider this policy. However, a limited form of immediate public access may be available for those publishers who do not join CHORUS. (Note: for background information see below.)

How PAGES works

This picture is a bit complicated so bear with me. The US Energy Department (DOE) funds several billion dollars in research every year. DOE will make the journal articles based on this research publicly accessible via the PAGES system, which presently operates in beta mode. PAGES also includes a lot of human activity and document processing.

The PAGES system uses a tiered approach to provide what DOE calls the "best available version" of each article. The highest priority is given to the version of record posted on the publisher's website. Second highest goes to an accepted manuscript housed in a repository. This may also include an accepted manuscript posted on the publisher's website, but whether it does or not has yet to be determined. The lowest priority, or fallback position, is for DOE to post an accepted manuscript itself.
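This tiered fallback can be sketched as a simple selection function. The field names and record structure below are illustrative assumptions, not DOE's actual schema:

```python
def best_available_version(record):
    """Pick the 'best available version' of an article using the
    tiered priority described above. Dict keys are invented for
    illustration, not taken from DOE's actual system."""
    # Tier 1: version of record on the publisher's website
    if record.get("publisher_vor_url"):
        return record["publisher_vor_url"]
    # Tier 2: accepted manuscript housed in a repository
    if record.get("repository_manuscript_url"):
        return record["repository_manuscript_url"]
    # Fallback: an accepted manuscript posted by DOE itself
    return record.get("doe_hosted_manuscript_url")

# When no publisher link exists, the repository copy is chosen
record = {"repository_manuscript_url": "https://repo.example/ms/123"}
print(best_available_version(record))
```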

How DOE gets the links to the off-site articles, on publisher websites and in repositories, is important here. For every article, at least one author is supposed to report the event of publication and supply certain metadata for that article, as well as the accepted manuscript. (Technically, every author who used DOE funding probably has to do this.)
This metadata may include a DOI for the published article on the publisher's website, even when that article is behind a paywall. DOE will make this metadata available immediately, even though it may not make the article itself available until the end of the embargo period.
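The immediate-metadata, embargoed-article behavior can be sketched as follows; the record fields and dates are invented for illustration:

```python
from datetime import date

def visible_fields(record, today):
    """Return what PAGES-style access exposes on a given day:
    metadata immediately, the article link only once the embargo
    period ends. Field names are illustrative assumptions."""
    meta = {k: v for k, v in record.items() if k != "full_text_url"}
    if today >= record["embargo_end"]:
        meta["full_text_url"] = record["full_text_url"]
    return meta

record = {
    "doi": "10.1234/example",
    "embargo_end": date(2015, 9, 1),
    "full_text_url": "https://publisher.example/article",
}
print(visible_fields(record, date(2014, 9, 2)))  # DOI only, article still dark
print(visible_fields(record, date(2015, 9, 1)))  # full-text link now included
```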

The role of CHORUS in PAGES

CHORUS is going to supply DOE with links to the funding-related articles on the websites of its publisher members. As part of joining CHORUS, each member publisher agrees to make all articles related to US federal funding open no later than the end of the federal embargo period. However, some articles may become open sooner, and PAGES will provide access to them at this earlier date. Thus all CHORUS-based OA articles will be made immediately accessible via PAGES.

Note that being PAGES accessible involves more than merely having a DOI link posted in the PAGES metadata. It means, among other things, being indexed by PAGES and thus included in the PAGES search system.

OA publishers who are not in CHORUS

The question thus arises: what happens to DOE funding-related OA articles on the websites of publishers who are not members of CHORUS? The version of record (VoR) is immediately available for public access, as far as the publisher is concerned. According to DOE, however, these articles will not be made fully PAGES-available immediately. Their availability will be limited to having a DOI listed in the article's metadata, just as though the article were behind a paywall.

It appears, moreover, that the metadata will not even indicate that such articles are not behind paywalls. Thus there is a big difference between being available via a metadata link and being fully PAGES-available. Full PAGES availability includes a lot of discovery support that a metadata link simply does not provide.

CHORUS is required for immediate PAGES access

In short, the only way an OA publisher can get its articles made fully PAGES-available upon publication is by joining CHORUS. This requirement does not seem to be spelled out anywhere in the DOE Public Access plan.

Here is what the DOE plan says about CHORUS: "The publishing community is developing a multi-publisher portal, the Clearinghouse for Open Research of the United States (CHORUS), to provide access to journal articles resulting from government funding. Such an activity offers considerable economies in the integration of article metadata and links for publishers who want to participate in DOE’s public access efforts. PAGES, however, can operate successfully independent of CHORUS." (Page 8)

The last sentence above says that PAGES can operate successfully without CHORUS. It appears, however, that OA publishers wanting immediate PAGES access cannot get it without CHORUS. This is nowhere mentioned in the DOE plan, but it appears to be a major requirement. The key is that the plan says nothing about publishers submitting links to their articles other than via CHORUS. In effect, this omission constitutes the rule in question.

Moreover, DOE has confirmed this interpretation. In our correspondence, DOE summed up their policy with this statement: "At this time, DOE's engagement with the publishing community is through CHORUS."

The CHORUS membership requirement does make sense from an administrative standpoint. Former OSTI director Walt Warnick points out that CHORUS saves DOE the considerable effort of dealing with each publisher independently, one at a time. This effort is a significant cost for PubMed Central.

However, OA publishers may not be thrilled with having to join CHORUS in order to get immediate, full PAGES access for their articles. There is also the question as to whether DOE can show this sort of favoritism. The DOE quotation above suggests that the present situation may change in the future. Perhaps the non-CHORUS publishers will raise this issue with DOE.


That only CHORUS members will get immediate, full PAGES access for OA articles is a surprising policy, and a questionable one on DOE's part. This policy seems not to have been publicly disclosed, but it should be thoroughly discussed before it becomes final. CHORUS is a great idea, but that may not justify making CHORUS membership a requirement for full PAGES access.

Inside Public Access is published weekly. For subscription information, contact the author below. Single issues may be purchased separately.

We also do confidential consulting.

For more information contact David Wojick
(540) 358-1080

Sunday, March 9, 2014

Engineer Tackles Regulatory Confusion

ENR (Engineering News Record) cover story
April 3, 1980

Inside title:
Logician shears woolly regulations
Blueprints untangle complex rules

Several times a week, David E. Wojick drives from his Revolutionary War-era estate in Orange, Va., to the nation's capital to work on a revolution of his own in a field he dubs "regulation engineering." Armed with a technique for simplifying complex issues, Wojick says he can make regulations systematic, coherent and efficient. According to clients, Wojick's four-year-old consulting firm has scored victories with dozens of major regulations — both in critiquing them for industry and in rewriting them for government agencies.

Regulation writing should be a design science based on principles of efficiency, not a political process, contends Wojick, a professional engineer with a doctorate in logic and the philosophy of science. "A regulation is every bit as complex as a major structure. It requires the same care in construction. No one would let a committee of lawyers design an office building or a nuclear power plant, but the regulatory programs for all things are designed by committees of lawyers," Wojick says. "As a result, regulations read like insurance policies, and regulatory programs proceed like lawsuits."

Wojick says the 90,000 pages of government regulations now in force are among the most complex structures ever fabricated. "Today a 100-page regulation is small, 500 pages is not unusual, and the 10,000 pages of federal income tax regulations are a wonder of the world," he says. Because regulation writing is dominated by lawyers, regulations today are powerful and respond to popular concerns, but, Wojick claims, they are generally costly and incoherent.

Counting kinds of confusion.

Wojick's firm, Adams & Wojick Associates, has developed a matrix identifying 126 kinds of confusion in regulations. It first classifies six aspects common to any law or regulation — concepts, rules, procedures, text, structure and logic. Then it lists 21 kinds of faults, such as being ambiguous, overly complex, or ineffective. The matrix yields 126 combinations. Typical examples Wojick cites are an Environmental Protection Agency regulation with more than 3,000 exceptions and nuclear power plant quality assurance regulations that have ambiguous rules and vague procedures.
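The arithmetic of the matrix is easy to reproduce. The six aspects are named above, but only three of the 21 fault kinds are, so the rest are padded with placeholders in this sketch:

```python
from itertools import product

# The six aspects come from the article; only three of the 21
# fault kinds are named, so the remainder are placeholders here.
aspects = ["concepts", "rules", "procedures", "text", "structure", "logic"]
faults = ["ambiguous", "overly complex", "ineffective"] + [
    f"fault_{i}" for i in range(4, 22)
]

# Every (aspect, fault) pairing is one kind of confusion
matrix = list(product(aspects, faults))
print(len(matrix))  # 126
```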

"We've been successful," Wojick says, "because everybody sees the problem, but nobody's been able to put a finger on it." When Wojick goes to an agency and says a regulation is confusing for three or five specific reasons, he says, the common reaction he gets from officials is, "You're right."

The foundation of Wojick's ability to pinpoint these problems is a technique that applies the idea of a blueprint — a visual picture of a structure — to the structure of an idea. He discovered that through the blueprints, any discussion or piece of text can be broken down and all the individual ideas can be laid out so the relationships become visible.

The development of the technique springs from the unusual combination of engineering and logic in Wojick's background. Shortly after he graduated from Carnegie Institute of Technology in 1964 with a B.S. in civil engineering, Wojick went to work designing dams for the Pittsburgh district of the Corps of Engineers. His exposure there to environmental controversies, watching people "getting lost in complex issues," stirred a longstanding interest in reasoning. He began studying logic and philosophy at the University of Pittsburgh.

Wojick left the Corps in 1970 to work on his dissertation and began teaching at Carnegie Mellon University, where he helped found a Department of Engineering and Public Policy. At Carnegie, he was influenced by the research on human problem solving done by his colleague, Herbert A. Simon, who in 1978 won the Nobel Prize in economics.

Mapping the structure of ideas.

Wojick realized that all issues have a basic underlying structure, one that can be mapped out like an engineering drawing. Using existing theories of conceptual analysis, Wojick "atomized" issues (and later texts) into basic elements. His discovery, he explains, "was that the ideas are held together by unspoken questions. What point is this sentence making? What point is it responding to?"

He was surprised to find that the thousands of pieces of a complex issue fit together in a simple scheme with a logical pattern. Because the kind of hierarchical structure developed is called a "tree" in mathematics, Wojick calls the structures "issue trees". His first practical application, in 1975, was an analysis of the interaction between environmental, energy and economic issues for the Pennsylvania governor's science advisory committee.

A year later, Wojick left the university to devote full time to issue analysis, going into business with his wife and partner, Diane W. Adams. As chief executive officer, Adams manages the finances of the firm and now oversees billings of more than $500,000 a year. She navigated the group's recent move to a 176-acre estate in Virginia, reputed to be the birthplace of President Zachary Taylor. Adams says they chose the property, which includes a house built in 1790, offices and a horse farm, for its hour-and-a-half proximity to Washington, D.C.

Devil in the details.

A third key member of the seven-person firm is John E. DeFazio, a chemical engineer who now handles a lot of the analytical work while Wojick hits Washington looking for complicated issues that involve a lot of money. Not all prospective clients can afford the firm's services, because "it takes several person-months to tree out a major regulation," Wojick says. "On the other hand, that's why the process is so powerful. It has the same power that detailed drawings give in constructing a building. We can make hundreds, sometimes thousands, of improvements."

The firm's early jobs included writing compliance manuals on regulations for industry. It wrote a quality assurance manual for Levinson Steel Co., Pittsburgh, for example, setting up the structural steel fabricator's working program for compliance with Nuclear Regulatory Commission standards on nuclear power plant fabrication. Alvin Stein, Levinson's quality assurance director, calls Wojick a "wizard" because the program set up in 1976 in "untested waters" is still working successfully. And it has been flexible enough to allow the company to satisfy the differing regulatory interpretations of different designers.

Next, Wojick landed jobs critiquing regulations for industry. PPG Industries, Inc., Pittsburgh, hired the firm to do a coherence analysis of EPA's proposed premanufacturing regulations under the Toxic Substances Control Act. "The goal of the law is to prevent chemical catastrophes," Wojick explains, "but EPA takes the meat-ax approach of trying to find out everything there is to know about all the chemicals in existence, and then they're going to sort through and find the problems." "We suggest techniques for identifying lines to follow that are most likely to be fruitful," says Wojick. The analysis also found parts of the regulations so unreadable that most accepted scales of readability could not measure them.

Teaching regulators logic.

EPA acknowledged the value of the critique by hiring Wojick to teach EPA regulators how to write logically coherent regs. The firm is also negotiating a contract to rewrite EPA's dredge and fill permit regulations. Wojick has already rewritten regulations for the Water Resources Council. WRC first hired the firm to critique its proposed rules for evaluating the costs and benefits of water projects, then asked the firm for a complete rewrite. DeFazio says, "We threw away 70% of the text, and the other 30% we completely restructured -- all without losing any of the basic ideas." They pared down the roughly 350-page draft to about 80 pages. The firm also rewrote the Council's principles and standards for planning water resource projects.

Adams & Wojick is working with the Department of Commerce and the Office of Management and Budget on a study of information collection burdens. Regulators have a tendency to treat information collection as if it were free, Wojick says, but its costs mount up. "We're working on a computer search program that uses key words like 'document' and 'record' to spot these hidden burdens -- the molasses in the system."

In Occupational Safety and Health Administration regulations, for example, Wojick finds that many added costs of compliance are hidden in inspectors' manuals and other appendixes. OSHA has regulations for worker exposure to more than 500 chemicals, he says, and the regulations say only that exposure levels must stay under certain numbers of parts per million. "The costly record-keeping requirements are in the attachments," he says.

The key to the firm's approach, Wojick says, is that "we're not institutional players. If you're locked into the system, you can't jump on people and make noise. I'm free to offend anybody, and I do." Wojick will tear apart regulations for industry or work for the government writing them. "We're not for one side or the other, we think they're all making mistakes. Our interest is clarity and sound design," he says.

Regulations writing is just the beginning for Wojick. In the future, he says, "We want to design laws for Congress."

Monday, March 3, 2014

Making the Web work for Science

OSTIblog article by David Wojick on Mon, 21 April, 2008

If I had to describe the OSTI revolution in ten words or less it would be "OSTI is making the Web work for science."

It is a colossal irony that the Web does not work for science. The World Wide Web was developed by high energy physicists at CERN, for the purpose of sharing scientific papers. HTML is basically very simple, with features that were specifically designed to display scientific writings.

But the Web quickly transitioned into popular culture, becoming a revolutionary new medium of global communication. Simple HTML has been tricked into producing complex, magazine-like displays and much more.

This trickery is a technological marvel in its own right. Web pages now number in the billions and the pace is quickening, not slowing, as video, blog and personal mega-sites take off. Social, consumer and popular content dominates these huge numbers.

In the process science got left behind. As the Web exploded in size it quickly came to pass that the Web only works where search works, and ordinary Web search does not work for science. The ordinary search engines do not find science, for several reasons.
First and foremost, most science is in the deep Web. This is explained in detail in other OSTIblog articles. Second, scientific content on the surface Web is swamped by non-scientific content using the same language, especially news and consumer or company information. It is interesting to take a scientific article on a scientist's Web site and see what it takes to get Google to return it as a top hit. If you know the exact title it can be done; otherwise it is not likely.

A number of attempts have been made to solve this problem and make the Web work for science. Unfortunately most of these have focused on subscription journal articles, which are not in the public domain. There are several projects which provide abstracts in large numbers, but no actual content. These may be very useful for certain purposes, but do not solve the basic problem. Others search large numbers of articles but most of the results are only available on a pay-per-paper basis. This is a good way to buy articles, but browsing is prohibitively expensive in most cases.

OSTI's revolutionary approach has been to focus on full text scientific content in the public domain. Search results include specialized surface Web content, but especially deep Web content. This too is explained in detail in other OSTIblog articles. An estimated 200 million pages of scientific content are now available through specialized search.

Has OSTI made the Web work for science? Yes and no. Yes, because the OSTI owned or operated portals now provide huge amounts of content for science. (Also for technology, which has similar problems.) No, because this is still just a small fraction of the total that is Web accessible. So the OSTI revolution is just beginning, because the Web will really only work for science when everything that is Web accessible is also findable.

David Wojick
Senior consultant for innovation


OSTI's Revolution: "Findability" in Science and Technology

OSTIblog article by David Wojick on Tue, 1 Apr, 2008

When it comes to science and technology development, OSTI people are writing one of the biggest Internet success stories. Everyone talks about how the Internet is changing science but OSTI is making it happen, and doing it on a shoestring budget.

The reason is simple: what OSTI makes happen goes to the heart of what science does, which is to share and combine thinking. Science is a colossal exercise in thought sharing, and has been for 400 years. Every achievement is incremental. Thus scientific communication is essential for scientific progress.

That the Internet greatly increases the potential for communication is well known. What often goes unrecognized is the great gap between raw Internet accessibility and actual communication. The missing element that bridges this gap is something we call "findability." If something is available via the Internet, can it be found with reasonable effort? If not then it might as well not be there.

OSTI is leading a revolution in findability. OSTI does not create new content; rather it creates portals and search engines that find vast quantities of hard-to-find scientific and technological content that already exists. This is extremely important to science because the general purpose search engines like Google rarely find scholarly content.

In some cases OSTI works alone but in many cases it collaborates with other national and international organizations. Sometimes OSTI crawls the surface Web but in many cases OSTI has led the application of federation to deep Web databases. In all cases the goal is the same, to make important scholarly content findable by those who need it.

The various portals that OSTI either owns or operates form a rough hierarchy. That is, some are more general than others and in many cases the narrower, more specialized portals are incorporated into the more general ones to some degree. This architecture reflects the interlocking nature of scientific activity.

A few of OSTI's many search tools are described below, from narrow to broad. Each is a technical tool that has to be understood to be properly used. None is simple. Also, each is relatively crude. Google spends over $4 billion a year, including $500 million on R&D. The National Library of Medicine spends around $100 million on R&D. OSTI's total budget, not just R&D, is just $9 million so there are few bells and whistles. But there are over 200,000,000 pages of findable research results and technical material on OSTI portals, with more every day. Collectively this is by far the largest source of Web-based, scholarly science and technology available. An astounding feat for such a small agency.

Some special OSTI collections

Information Bridge
This is OSTI's foundation collection, the filing cabinet of all DOE research reports for the last decade. Tens of billions of dollars worth of research are documented here, much of it power related. It has 165,000 fully searchable full-text documents, each with extensive bibliographic information. This makes it possible to do complex advanced searches using different metadata fields in the document database.

A powerful and independently useful feature in the advanced search function of Information Bridge is the subject "select" button. This brings up a very large semantic structure or word-word link system that is designed to help users find the best technical search terms. The system combines a taxonomy of energy related words with what is called a thesaurus. The thesaurus does not provide synonyms, but rather clusters of terms that are closely related from a scientific or engineering point of view. The system includes 30,000 words, about 200,000 word-word relations, and 45,000 taxonomic pathways from broader to narrower concepts. The system is useful in understanding the concept structure of energy science and engineering.
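A minimal sketch of this two-part structure, with invented terms standing in for the actual OSTI vocabulary: a taxonomy of broader-to-narrower pathways, plus a thesaurus of related-term clusters (not synonyms).

```python
# Invented example terms, not entries from the actual OSTI system
taxonomy = {
    "energy conversion": ["combustion", "photovoltaics"],
    "combustion": ["fluidized bed combustion"],
}
thesaurus = {
    # clusters of closely related terms, not synonyms
    "combustion": {"flue gas", "burners", "ignition"},
}

def narrower_pathways(term, path=()):
    """Enumerate taxonomic pathways from a broader term down to
    its narrowest concepts."""
    path = path + (term,)
    children = taxonomy.get(term, [])
    if not children:
        return [path]
    paths = []
    for child in children:
        paths.extend(narrower_pathways(child, path))
    return paths

for p in narrower_pathways("energy conversion"):
    print(" > ".join(p))
```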

E-print Network

This is a federated and crawled collection of about 5 million scholarly articles and related materials found in databases and on the web. It includes preprints, articles that have not yet appeared in scholarly journals. It also includes the publication web pages of over 28,000 university faculty, mostly in science and engineering research departments. This makes it easy to go from a single paper to the whole body of a researcher's related work.

Science and Engineering Conference Proceedings

Conference proceedings often precede publication of research results by a year or more and this collection federates 26 large databases. There are hundreds of thousands of papers and presentations, many from professional societies.

OSTI wide search

Science Accelerator

The Science Accelerator searches ten major OSTI collections, including Information Bridge, E-print Network, and the Conferences portal, described above. It also searches R&D project descriptions, the Energy Citations Database, DOE R&D Accomplishments, DOE-sponsored patents, and EnergyFiles, a collection of energy-related databases and websites.

Government wide search

Federal R&D Project Summaries

This is a federated gateway to individual project summaries from six of the largest research funding agencies. In many cases the search results include recent awards, which may precede research reports or publications by several years.

Science.gov

Science.gov is a search engine for government science information and research results. Currently in its fourth generation, Science.gov provides search of more than 50 million pages of science information with just one query, and is a gateway to over 1,800 authoritative scientific Web sites and over 30 large scientific databases.

World wide search

Whereas Science.gov federates the US Government science and engineering databases and websites, the idea behind WorldWideScience.org is to combine similar resources from many different countries. While still very new, WorldWideScience.org already includes major collections from 44 different countries, on every inhabited continent. Science.gov is the major US contribution.

Taken together this is an impressive list of integrated science and technology portals. But believe it or not, there is a lot more coming.

David Wojick, Ph.D.
Senior consultant for Innovation

OSTI versus Google: Different content, different uses

OSTIblog article by David Wojick on Wed, 5 March, 2008

Part of OSTI's R&D aims at understanding how scientists use information. This goal was originally articulated by OSTI's Thurman Whitson, who has since retired. To that end we have begun to look at the different kinds of information provided by the different Web-based science resources. Different kinds of information imply different uses. It is not that one resource is better than another overall; it is that they are very different and support different uses.

Below are four initial results that show clearly that Google tends to return lay information, while Science.gov returns scholarly information. (Note: Google Scholar also returns scholarly information, but very little is free, so that is a different issue.)
For the purpose of this analysis, we postulate two search categories: Layman's Level and Scholar's Level.

LL = Layman's Level. Includes news, magazines, blogs, educational material, product or company information, etc.
Education grade level is basic undergraduate or lower, often high school level.

SL = Scholar's Level. Includes journal articles, research reports, conference proceedings, etc. Education level is advanced undergraduate or higher.

Hit counts are based on the first 20 hits. Note this is rough data, based on the snippets. We have not examined every hit. But the differences are so dramatic that the results will not change much with refined analysis or more cases.

Hits on "biofuel": Google LL = 20, SL = 0; Science.gov LL = 1, SL = 19
Hits on "nanocatalysis": Google LL = 17, SL = 3; Science.gov LL = 1, SL = 19
Hits on "Higgs boson": Google LL = 20, SL = 0; Science.gov LL = 1, SL = 19
Hits on "quantum dot": Google LL = 20, SL = 0; Science.gov LL = 0, SL = 20
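The tallies above reduce to a trivial counting step once each of the first 20 hits has been hand-labeled LL or SL from its snippet; the labels below are invented stand-ins, not the real data.

```python
from collections import Counter

def tally(labels):
    """Count Layman's Level vs Scholar's Level labels for the
    first 20 hits of one query on one search engine."""
    counts = Counter(labels)
    return counts["LL"], counts["SL"]

# Stand-in label lists mimicking the "biofuel" pattern above
google_hits = ["LL"] * 20
sciencegov_hits = ["LL"] * 1 + ["SL"] * 19
print(tally(google_hits))      # (20, 0)
print(tally(sciencegov_hits))  # (1, 19)
```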

Conclusion: Science.gov and Google are very different tools, and so support different uses. Google returns lay information, while Science.gov returns scholarly information. Moreover, all of OSTI's Science Accelerator resources follow the pattern. Each is a far better source of scholarly information than Google.

The technology behind Science.gov (i.e., federated search) allows information product designers to discriminate among information resources, while the technology behind Google (i.e., crawling) does not lend itself to such discrimination. In the case of Science.gov, federated search enables product designers to focus on R&D results, which are typically scholarly in nature.
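A minimal sketch of the federated approach, with invented collections and documents: one query is fanned out to curated sources and the labeled results merged, which is what lets a product designer discriminate among resources.

```python
# Invented curated collections standing in for federated databases
collections = {
    "reports": ["coal gasification pilot report", "turbine corrosion study"],
    "preprints": ["quantum dot synthesis preprint"],
}

def federated_search(query):
    """Fan one query out to every curated collection and merge the
    hits, keeping the source label on each result."""
    hits = []
    for name, docs in collections.items():
        hits.extend((name, d) for d in docs if query in d)
    return hits

print(federated_search("quantum dot"))
```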

Possible next steps:

1. The LL and SL categories can each be refined and related to different uses. For example, some LL hits are simple news items while others are research center home pages. The research centers usually provide linkage to scholarly publications, while the news items usually do not. Likewise, some SL hits are just abstracts while others are full text articles. Also some LL and SL hits are more technical than others in the same category. A simple, faceted taxonomy to capture these differences should be easy to build.

2. This LL/SL analysis can be extended to the Science Accelerator, as well as individual OSTI products like Information Bridge, E-Print Network, or to other content search systems, such as World Wide Science or the Web of Science. How the content of these differ should be useful for users to know.

3. The different uses each of these different content systems might support can also be described. Sometimes scientists need lay information, sometimes scholarly, in different combinations depending on what they are looking for. Cases range from seeking broad understanding to looking for a specific document or problem.

David Wojick
Senior consultant for innovation


Thursday, February 27, 2014

Mapping Technology Chaos

By David E. Wojick, Ph.D.
March 2014

(This paper was accepted and distributed at the March 6, 2014 US Library of Congress conference on "Analytical Methods for Technology Forecasting.")

Today’s engineers are under pressure to develop and deploy new technologies at an ever-quickening pace. Thus the world of engineering might best be described as one of technology chaos. In the midst of all this uproar, engineers are supposed to know what is going on and coming on, but the projects and possibilities are legion. Clearly, engineers have a lot of homework to do for their own projects. They also have to avoid being blindsided in a meeting by the latest whiz-bang Wall Street Journal article. Technology assessment in the face of this kind of chaos may seem impossible, but it is also mandatory. Why is it all so complicated?

Diffusion confusion

How does an engineer find out what’s happening that’s related to his or her field or project? Why is it so hard? The basic technology assessment problem can be put in one word—diffusion. The flow of knowledge in science and technology is an extremely complex diffusion process. Much of the complexity is due to two simple processes that overlay one another—convergence and divergence.

The accompanying figure is a simplified map of science and technology diffusion. The sheer complexity of the interrelationships is what makes the concept so hard to grasp, even though the individual relationships may be clear. Not only is the pattern complex, but it represents many possible combinations of divergent and convergent flow over time. The possibilities are not endless, but they are many. They may even be well defined, but the array is structurally complex and virtually impossible to visualize mentally. As with many engineering problems, mapping the complexity is the answer.

Figure: Science and technology diffusion

The flow of knowledge is like a complex diffusion process. Each lettered box represents a project; links indicate the potential flow of results from one project to another at a later stage of development, or from left to right.

Each lettered box represents a project currently being developed. Projects span the spectrum of development, from basic, cutting-edge research to fielding established technologies. The projects included depend on the scale of interest. A project’s focus might be narrow, like corrosion on turbine blades, or very broad, like clean coal technology. On a broader scale, the projects might represent entire research communities.

Links run from left to right, from an earlier stage of development to a later one. For simplicity's sake, each project is shown feeding just three downstream projects. Not shown is the fact that new projects may come into being, and existing ones may disappear, as time goes by.

There are a great many link-by-link paths between distant projects. That’s the consequence of divergence and convergence. Results from A may diverge, working their way to C or D, or both. Yet C and D may be very different technologically. Likewise, results from A or B, or both, may converge on C. It is the combination of paths that creates the complexity and confusion, not the individual paths themselves.
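The path combinatorics can be made concrete with a toy map. The letters and links below are invented, but they mirror the divergence and convergence just described:

```python
from functools import lru_cache

# Toy diffusion map: letters are projects, edges run left to right
# (earlier stage -> later stage). Layout invented for illustration.
links = {
    "A": ["C", "D"],   # results from A diverge toward C and D
    "B": ["C"],        # A and B both converge on C
    "C": ["E"],
    "D": ["E"],        # C and D reconverge on E
    "E": [],
}

@lru_cache(maxsize=None)
def count_paths(src, dst):
    """Count the distinct link-by-link paths from src to dst."""
    if src == dst:
        return 1
    return sum(count_paths(nxt, dst) for nxt in links[src])

print(count_paths("A", "E"))  # 2: A > C > E and A > D > E
```

Even in this five-project map there are already two distinct paths from A to E; in a real map with hundreds of projects, the path count explodes, which is the confusion described above.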

People working on operational-level technologies, on the right side of the map, tend to look for convergence. But for those seeking to understand how a new basic technology will change the status quo, divergence is mostly what they look at. Trying to look at both at the same time is already very hard, and it’s getting harder.
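The two viewpoints can be sketched as two graph traversals: divergence looks downstream from a project, while convergence looks upstream toward a target. The mini-map and project labels below are hypothetical, purely to show the mechanics.

```python
# Hypothetical mini-map: each project lists the projects it may feed.
links = {
    "A": ["C", "D"], "B": ["C"],
    "C": ["E", "F"], "D": ["F"],
    "E": [], "F": [],
}

def downstream(graph, start):
    """Divergence: every project reachable from start, left to right."""
    reached, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in reached:
                reached.add(nxt)
                stack.append(nxt)
    return reached

def upstream(graph, target):
    """Convergence: every project whose results can reach the target."""
    reverse = {}
    for src, dsts in graph.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    return downstream(reverse, target)

print(sorted(downstream(links, "A")))  # what A's results may touch
print(sorted(upstream(links, "F")))    # everything feeding F
```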

Bounding the Problem

In fact, comprehending everything in any technology forecasting case is a daunting challenge. A few Google searches no longer suffice. But hope is not lost. Understanding the diffusion of the science and technology related to a specific engineering issue is like any other engineering problem. You have to scale it to your resources and use the proper tools. In other words, you have to bound the problem or it cannot be solved.

A science and technology assessment problem might include any of the following. The point is that these are very different problems.

1. The convergence of diffusion pathways to a given technology, problem, or application.

2. The divergence of a particular breakthrough.

3. The neighborhood or cluster of activities related to a specific technology at a specific stage of development.

4. A single transition pathway from a specific project to a specific application.

Other combinations of projects and pathways also are possible. In every case, it is critical to limit the search to just those projects and links that can be feasibly assessed. The feasibility of understanding is the key concept here. One cannot examine everything.

Diffusion Distance

It is particularly important to distinguish basic research, pilot tests, and the like from real-world applications. This involves what I call the "diffusion distance," or the number of projects and links between projects at different stages of development. Speculation about new science and technology in the general press often misunderstands and understates the great diffusion distance from basic research to actual application. Every stage of development normally takes several years to work through. In particular, basic research breakthroughs often take 10 to 30 years to become useful. Even proven pilot technologies may be a long way from actual application. A feasible assessment may simply have to ignore distant diffusion.
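As a sketch, diffusion distance can be computed as the fewest links between two projects in such a map, using a breadth-first search. The map and project labels below are made up for illustration.

```python
from collections import deque

# Hypothetical map: each project lists the downstream projects it feeds.
feeds = {
    "A": ["C", "D"], "B": ["C"],
    "C": ["E"], "D": ["F"],
    "E": ["G"], "F": ["G"], "G": [],
}

def diffusion_distance(graph, start, goal):
    """Minimum number of links from start to goal, or None when no
    transition path exists at all."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

print(diffusion_distance(feeds, "A", "G"))  # → 3
print(diffusion_distance(feeds, "B", "F"))  # → None
```

A distance of None is itself informative: no amount of waiting will carry results from one project to the other along the mapped links.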

On the other hand, if one has a very specific technical problem, the solution may already exist in a distant research community. Other things being equal, the narrower the technical problem, the more distant the search can be. Once again it is a matter of knowing what can and cannot be done.

Case study: Naval R&D

Some time ago I tested this diffusion mapping concept for the US Chief of Naval Research. The Navy clusters its R&D projects into an ascending series of seven so-called “budget activities.” Category 1 is basic scientific research, while category 7 is operational systems development. We began with categories 2 and 3, respectively called “applied research” and “advanced technology development,” as these are the earliest technology development categories. Our primary data were project descriptions, especially what are called Research and Development Descriptive Summaries (RDDS).

In every case we were able to identify clear transition paths from projects in categories 2 and 3 to category 7, via projects in the intermediate categories. Interestingly, this typically involved something we call semantic overlap, or linking. That is, the language used in a given project typically overlaps the language used in the projects adjacent to it along a given transition path. Thus we were able to find the transition paths using semantic search tools.
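The semantic overlap idea can be sketched with a simple vocabulary comparison. A real system would use proper semantic search tools; the Jaccard word overlap measure and the invented project descriptions below are illustrative assumptions, not the Navy study's actual method or data.

```python
import re

# Invented project descriptions, roughly one per development category.
descriptions = {
    "applied":   "High temperature coating chemistry for turbine blade corrosion",
    "advanced":  "Turbine blade coating demonstration under corrosion stress",
    "systems":   "Fleet deployment of corrosion resistant turbine blade coating",
    "unrelated": "Acoustic signal processing for sonar arrays",
}

def vocabulary(text):
    """The set of lowercase words in a description."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap(a, b):
    """Jaccard similarity of two descriptions' vocabularies."""
    va, vb = vocabulary(a), vocabulary(b)
    return len(va & vb) / len(va | vb)

# Adjacent projects on a genuine transition path should overlap far
# more than unrelated ones, which is what makes the paths findable.
for a, b in [("applied", "advanced"), ("advanced", "systems"),
             ("applied", "unrelated")]:
    print(f"{a} -> {b}: {overlap(descriptions[a], descriptions[b]):.2f}")
```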

Welcome to my stuff

I am using this blog to collect stuff I have written that lies scattered all over the Web, as well as new stuff of course. A lot of it has to do with the science of confusion due to complex situations, a field which I have sort of pioneered. As a consultant I help organizations work through complex situations.

I also blog a bit elsewhere: on scientific publishing, and on animal cognition (especially horses, but not just them).