Open Letter (more on S. Towers’ paper)

The following was sent to Sherry Towers as an e-mail this morning:

Dear Dr. Towers,

This is in response to your web post titled “Hatred”. You may or may not have noticed my own review of your article (by the way, yes, I have read your article, and found it generally interesting and stimulating). If you have, I hope you have appreciated the fact that my entry does not contain a single personal attack, toward you or anyone else; I do not think it can, in fairness, be called “hateful” by any standard.
I completely agree with you that name-calling does not belong in a civil and useful discourse, and that any possible personal stake or involvement that you may or may not have in the matter is immaterial to the scientific case that you expound.
I do, however, have some reservations about your conclusions; it is possible, and indeed likely, that they are correct, but I respectfully submit to you that they do not come out convincingly enough from the numbers that you quote in your paper.

I would be very interested in running my own statistical analysis, for which I need raw data (i.e., what you call “productivity” and number of conference presentations for each member of your sample). You are obviously right in stating that I, like anyone else, have full access to the database from which you took them, and it is fully understood that the burden is on me to obtain them independently. However, as you certainly understand, my task would be much simpler if you would kindly share your data with me, or anyone else who may wish to analyze them. I wish to point out to you that it is not just a matter of saving time; you have selected the members of your sample based on specific criteria (e.g., race), and it may not be easy for others to identify the very same 57 people.
The easiest way for you to share the data is (probably) to add them to your preprint, as another appendix. I think your case will be greatly strengthened.

Best regards,

Massimo Boninsegni “Okham”

Update of April 28, 2008, 15:14 CDT
Sherry has kindly replied to my e-mail. Much of what she wrote me appears also in this blog entry.
As for me, I think I am done discussing gender issues (make that “gender anything”, actually), at least for the time being (after all I have broken my personal record of eight comments for a blog entry, something which seemed unthinkable not so long ago… there’s no telling where I could get next… fifteen? twenty? Who knows… all I can say is: watch out, Pharyngula, you have got some serious competition, here…).
I would like to address, instead, the broader issue of scientific trust. Although Sherry and I have had a very civil, even amicable exchange (obviously we disagree on the specifics but hey, that’s life), it is clear that she feels that her scientific competence and integrity are being called into question, at least to some extent, by my request to see her data. This is how she puts it:
“I also have many, many physics publications to my name. Not in a single instance during peer review was I ever asked for dis-aggregated physics data […]
“I am not saying that the person who asked me for my data is necessarily questioning my scientific integrity […] but it certainly is very, very odd to have someone express some skepticism of your analysis until they see the dis-aggregated data themselves. It does, in my opinion, subtly imply at least a little bit that perhaps I have either fudged the data or manipulated it in some unethical way.”

I think this is something worth talking about because
a) it is an interesting subject, in a geeky kind of way (but hey, this is what this blog is all about)
b) much of the hostility that characterizes the blogosphere debate on Towers’ paper may originate from this perceived “lack of trust”.
I can certainly see how a request for data may come across as insolent, aggressive, even insulting. Maybe it is different in different areas of science, but at least in my field (condensed matter theory), the practice of requesting data is not unusual in the least. It is, actually, the norm [0].
I have both asked and been asked to provide numerical data, as well as other details of all sorts, including initial conditions, values of parameters (even for calculations whose results do not depend on them), entire data sets (we are talking gigabytes worth of numbers) and even the whole damn code, with precise instructions on how to reproduce the results published in the paper. It happens all the time, and for the most part these requests are seen as reasonable, and honored (to the extent that it is possible). Only when they come a long time after the work is published does the possibility exist that data may have been erased, either accidentally or simply for lack of disk space and/or interest in keeping them.

(In my case, I do not keep data around for long because I expect that I shall be able to reproduce them later on, otherwise they are probably wrong, and need to be rectified, which in turn means… another paper. See? A win-win situation…).

So, why is it done? Why would I write Joe Scientist an e-mail saying “Nice paper, dude. Please send me your data because I would like to run my analysis and see whether your conclusions stand”? And why would Joe Scientist usually reply by sending me the data, instead of a one-liner with a suggestion for something that I (and maybe the horse I rode in on) should do?
Based on my experience: a request for data is practically never malicious, or motivated by lack of respect for the scientist who produced the research. In fact, no judgment whatsoever is implied [1].
I can think of three main reasons for asking that data be shared:
1) First and foremost, genuine curiosity. In the case of Towers’ paper, I would love to put her numbers in my computer and see what comes out of the simulation. If her conclusions are quantitatively correct, I would like to arrive at the very same conclusions independently, using my own methodology. Now, for me it’s borderline video game addiction, but there is something really satisfying and empowering in testing one’s model, or hypothesis, or calculation, against someone else’s data.
2) Then there is the issue of providing an independent check. People make mistakes; even competent and bright researchers are human, thus fallible. Every field of science abounds with wrong theories. One of the things that I like so much about science is that practitioners are forced to be humble, and allow their work to be scrutinized down to the smallest detail.
3) Last but not least: narcissism. The author, the colleague, the competitor, may have done a perfectly valid job, but it is almost inevitable for another researcher with blood flowing in his/her veins to think that (s)he may have a great idea of which the author has not thought, this incredibly deep piece of insight that the other person, hell, that nobody else has. So, it is not about proving them wrong, it’s about showing how smart we are.

As for the reason why this paper by S. Towers is under the microscope more than others she has published in the past (I assume most of them as a high energy physicist)… well, I hate to suggest it, but I am afraid we are very much in the same boat when it comes to this… let me just say that my most ridiculous blog entry has probably been read by more people than my best physics paper…
OK, I think I am going to stop here. I know you all are dying to start commenting, and I am not a cruel man.

[0] Obviously I am discussing cases where no human subjects are involved and/or no other issue of privacy or confidentiality exists.
[1] In those rare cases when it is, it is invariably because the professional and personal relationships between individuals have deteriorated to a point that a request of this type will elicit no answer, or one filled with four-letter words.

