How Do We Gather Scientific Knowledge
In an enlightening talk, Victoria Stodden takes us on a walk
through the scientific method and how we gather scientific
knowledge.
The full talk is available at:
http://www.youtube.com/watch?v=nYWormzn1Mc
Stodden introduces the principles of scientific research
by going back to Roger Bacon in 1267 and his concepts of:
- Verification of conclusions by Direct Experiment
- Importance of Independent Verification
- Recording experiments with enough detail that others could reproduce the work
She continues with Francis Bacon in 1620, who introduced the idea
of inductive reasoning (going from experimental observations to
generalizations), and with the influence of this philosophy on the
Royal Society of London, around 1660, where the first scientific
journal, “Philosophical Transactions”, was created in 1665.
Stodden then brings us back to the present to make the points that:
“Scientific Computation is becoming central to the Scientific Method
- Changing how research is conducted in many fields
- Changing the nature of how we learn about our world”
and shares her conjecture that:
“Today’s academic scientist probably has more in common
with a large corporation’s information technology manager
than with a Philosophy or English professor at the same
University”
She then points out that the pervasive use of computation in
scientific research is, unfortunately, not being accompanied
by an equal effort to make available the source code and
materials that were used during the research process.
In particular, there is a lag in making scientific data publicly
available under the terms of Open Data.
She then brings our attention to a significant contemporary problem:
“Relaxed practices regarding the communication of
computational details is creating a credibility crisis
in computational science, not only among scientists
but as a basis for policy decision and in the public mind.”
Questions are also raised about whether modern peer-reviewed
journals really provide an effective platform for scientific
discussion.
As an example, she presents the case of the cancellation of
clinical trials at Duke, and how the deficiencies in the
computational practices of the original papers were not
detected during peer review, due to the superficial way in
which peer review is currently conducted.
The emergence of Computational Research as a third approach
to the scientific process (besides inductive and deductive reasoning)
is challenged by the lack of open sharing of data and source code.
Therefore most published computational results are
“…simply impossible to replicate…”.
Stodden surveyed the reasons behind the lack of willingness to
share data and code on the part of a community of authors and
found them to include:
- Time required to clean and document
- Time required to deal with questions from users
- Concern about not receiving attribution
- Possibility of pursuing patents
- Legal barriers (e.g. Copyrights)
- Potential loss of future publications
- Competitors may gain an advantage
- Web / Disk space
- and…The Pursuit of Tenure…
while the top reasons to share were:
- Encourage Scientific Advancement
- Encourage sharing with others
- Be a good community member
- Set a standard for the field
- Improve the caliber of research
- Get others to work on the problem
- Increase in publicity
- Opportunity for feedback
- Finding collaborators
She closes with a discussion on:
- How do we deal with large bodies of source code?
- How do we deal with massive data?
- When we share software, who will maintain it?
- The need for tools on data provenance.
- How to train users on the proper use of shared code?
- The fragility of software
A very interesting talk for anyone involved
in the practice of scientific research:
http://www.youtube.com/watch?v=nYWormzn1Mc