Knowledge production pipeline

There is an emerging problem within the scientific community: its rapid growth, together with the advancement of technology, is increasing the information flow to unbearable levels. Under these conditions the existing system for research evaluation is becoming increasingly obsolete, creating a bottleneck in the “knowledge production pipeline” and resulting in the loss of a considerable amount of data. To illustrate this point, “knowledge production” can be formalized as follows:


1) Science is a Machine operating the “knowledge production pipeline”. It extracts, filters and refines information from the Universe.
2) The end product of the pipeline is structured Knowledge, ready to be “consumed” by society, industry, or the Science Machine itself (knowledge reuse creates positive feedback).
3) The pipeline consists of four operational segments, represented by the corresponding sets of tools: extraction of information, documentation, evaluation, and knowledge dissemination.
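The segmented model above can be sketched as a simple composition of four steps. This is purely an illustration of the structure – all function names and data here are hypothetical, not part of the original formalization:

```python
# Toy sketch of the four pipeline segments as composable steps.
# Every name and data shape below is illustrative only.

def extract(universe):
    """Segment 1: pull raw observations out of the 'Universe'."""
    return list(universe)

def document(observations):
    """Segment 2: record observations so they can be shared and evaluated."""
    return [{"data": obs, "documented": True} for obs in observations]

def evaluate(records):
    """Segment 3: the manual bottleneck – only evaluated records move on."""
    return [r for r in records if r["documented"]]

def disseminate(evaluated):
    """Segment 4: publish structured Knowledge; its output can feed
    back into extraction (the positive-feedback loop)."""
    return {"knowledge": evaluated}

# Running the whole pipeline end to end:
knowledge = disseminate(evaluate(document(extract(["obs1", "obs2"]))))
```

The point of the composition is that each segment's throughput limits the whole chain: speeding up `extract` and `document` without touching `evaluate` only makes the queue in front of the bottleneck longer.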

The main bottleneck arises in the research evaluation segment. Scientists (the operators of the pipeline) still perform this process manually. Increased automation of the preceding “extraction machinery” and the shift to electronic “documentation machinery” have created a disproportion between the number of operators and the amount of information to be handled. In other words, while we keep perfecting tools for knowledge extraction (thus increasing the information flow), the machinery required for knowledge evaluation is worn out, losing its ability to handle this flow.

Another problem concerns the inefficiency of the existing documentation tools: paper laboratory journals and local electronic data storage provide very limited access. In fact, this problem also points to the obsolescence of current evaluation practices: since no credit is given for knowledge that bypasses the peer-reviewed publishing process, scientists lack any incentive to make negative results and “raw” data publicly available.

So essentially everything boils down to the need to deliver new metrics for research activities and to partially automate the evaluation process. This does not mean eliminating peer review, but making it more efficient.
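As one toy illustration of what such partial automation might look like – matching a submission to reviewers by topical overlap. All names, keywords, and the choice of Jaccard similarity are hypothetical, not something the discussion above prescribes:

```python
# Illustrative sketch: rank candidate reviewers by keyword overlap
# with a submission, using Jaccard similarity. All data is made up.

def jaccard(a, b):
    """Jaccard similarity between two keyword collections: |A∩B| / |A∪B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_reviewers(paper_keywords, reviewers):
    """Return (reviewer, score) pairs sorted by topical overlap, best first."""
    scored = [(name, jaccard(paper_keywords, kws))
              for name, kws in reviewers.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

reviewers = {
    "Reviewer A": ["genomics", "sequencing", "statistics"],
    "Reviewer B": ["optics", "lasers"],
    "Reviewer C": ["genomics", "data-mining"],
}
ranking = rank_reviewers(["genomics", "data-mining", "statistics"], reviewers)
```

A real system would of course use richer signals (citation graphs, full-text similarity, conflict-of-interest filters), but even this crude matching shows how the system could shortlist reviewers automatically while leaving the actual judgement to humans.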

And, above all else, academia has to start giving credit to researchers performing peer review and make its results publicly available, thereby encouraging this type of scientific activity and removing bias and injustice from the process.


5 responses to “Knowledge production pipeline”

  1. Not all peers are equal – the scientist you want reviewing work is the one who tried to duplicate it (or ran a very similar experiment). Nobody else is really motivated enough to contribute a meaningful review. That means that if we start making research results available as we get them, it is not practical to expect them to be properly reviewed. This is why I think that raw data is much more important than perfunctory “peer review” – especially for “failed” experiments.

    Peer review does have a role in publishing traditional articles but it happens at a much higher level of analysis than the raw experimental results.

  2. Totally on your side – peer review is required to certify research, not data. And in any case one does not need to “peer-review” the raw data, since only specialists from the same field can truly benefit from sharing this kind of information (plus data mining later on). But this could be the key to automation – let the system find the most appropriate peer reviewers for your research, thus providing the best quality/timing ratio for the peer-review process.

  3. Pingback: Michael Nielsen » Biweekly links for 06/27/2008

  4. Ciao Jareg. I have a question about your “Knowledge production pipeline”. It is not about the main idea of the article but about the processing of data by scientists. Do you really exclude all conscious data production from your pipeline model? I mean, the raw data can come spontaneously, but it can also be the result of a planned action based on our previous knowledge, forming a knowledge multiplication circle. In fact it is an open system that does not have a linear mechanism of data processing. What do you think?

  5. The raw data obviously does not form spontaneously – researchers have to extract it first somehow. Apologies for the confusion, but in the scheme above “raw data” refers to the natural world/universe around us, rather than digital data in raw format.
    As you mention, the system is indeed a circle – produced knowledge feeds back into the pipeline (see the dissemination step)… However, it is a bit more convenient to illustrate the process as a [linear] sequence of segments in the context of the current discussion 😉
