Given society-wide soul-searching over issues of race and gender today, many scientific institutions are conducting surveys to assess the prevalence of harassment, bullying and discrimination within their fields. In principle this is a good thing, since scientific institutions should, of course, be open to all, and the work environment should feel welcoming. But we also need a degree of rigour when designing and interpreting such surveys.
Nature is the world’s leading scientific magazine (though being a commercial product it also has a predilection for click-bait). A recent headline claims that science is “plagued” by discrimination.
The survey that led to this conclusion is summarised in this table.
So two thirds of scientists respond that they have neither experienced nor seen bullying, harassment or discrimination (not even one incident) in the whole of their current job, typically a period of two to eight years. Does that amount to their institutions being “plagued” by such behaviour?
The survey does not distinguish between observing one such incident in, say, three years, and bullying being a weekly occurrence, yet that would make a vast difference to the experience of the working environment. Note also that the survey is worldwide: most respondents are scientists from Western countries, but many are from other countries with different cultures, which complicates any interpretation.
OK, one might reply, but one-third of scientists have experienced at least one such incident, and surely that’s too many?
Yes, but we can then ask how we define “bullying” and “harassment”: what is the threshold? Where is the line between a fair-enough critical remark from a line manager, and “bullying”? Where is the line between a line manager reminding someone about a task, and “harassment”?
In the above survey the threshold is entirely up to the respondent. And that’s a problem since it makes the reporting very subjective. In the same way that the number of drivers exceeding a speed limit by 3 mph will vastly outnumber those exceeding it by 30 mph, the number of incidents that only just cross the threshold will vastly outnumber the very serious incidents. So the rate of “bullying and harassment” will depend a lot on the adopted threshold. Without any attempt to define and standardise a threshold, such surveys lack value.
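To put rough numbers on the speeding analogy, here is a minimal sketch. It assumes, purely for illustration, that incident severity follows an exponential distribution, so that mild incidents are far more common than serious ones. The function name `reported_rate`, the `scale` parameter and the threshold values are all hypothetical; none of them come from the surveys discussed here.

```python
import math

def reported_rate(threshold, scale=1.0):
    """Fraction of incidents whose severity exceeds `threshold`.

    Assumes severities are exponentially distributed with mean `scale`.
    This is an illustrative assumption, not something measured by any survey.
    """
    return math.exp(-threshold / scale)

# Hypothetical thresholds, in arbitrary "severity" units.
for threshold in (0.5, 1.0, 1.5, 2.0, 3.0):
    rate = reported_rate(threshold)
    print(f"threshold {threshold:.1f}: {rate:.1%} of incidents count")
```

Under this assumed distribution, moving the threshold from 1.0 to 2.0 cuts the share of incidents that count from about 37% to about 14%. The real severity distribution is unknown, so the numbers show only that the reported rate is a function of the threshold, not what the true rate is.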
That is not to deny that, in some institutions, there have been very serious cultures of bullying and harassment that have gone on for years. But that is different from occasional minor clashes that are inevitable when bringing humans together in a work environment, and we need to be clear which of these we are discussing.
One might reply that, if someone perceives an incident as bullying and harassment, then that means that it is. That’s a fair point, but in making policies about such matters we also need to be concerned with what is reasonable, and so cannot just take personal perception as the only criterion.
How people perceive interactions in day-to-day life can vary a lot from person to person, and — importantly — can tend to vary across different groups and different cultures.
The Royal Astronomical Society (the UK society for professional astronomers, of which I am a member) produced such a survey earlier this year. In reporting that bullying and harassment are “rife” in UK astronomy, Nature summarised the survey with this table:
Worryingly, this suggests that minority groups have it much worse, and that was certainly the tenor of the RAS’s own interpretation.
And yet, again, this survey is based on perceptions, and might there be systematic differences in how members of the different groups perceive things? Could members of the different groups tend to adopt a different threshold as to what counts as “harassment”? As the above speeding analogy illustrates, even a small difference in threshold would have a large effect on the reported rate.
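The concern can be made concrete with a toy simulation. In this sketch two hypothetical groups experience exactly the same incidents (the same random draws from an assumed exponential severity distribution) and differ only in the threshold at which they label an incident as harassment. The group labels, thresholds, sample size and seed are all invented for illustration. The sketch does not show that real groups differ in threshold, only that a survey of this kind could not tell the difference if they did.

```python
import random

def simulate_reported_rate(threshold, n_people=10_000, seed=0):
    """Share of simulated respondents whose incident exceeds `threshold`.

    Every respondent draws one severity from the same exponential
    distribution (mean 1.0), which is an illustrative assumption.
    The fixed seed gives both groups identical underlying experiences.
    """
    rng = random.Random(seed)
    reported = sum(1 for _ in range(n_people) if rng.expovariate(1.0) > threshold)
    return reported / n_people

# Same experiences, different (hypothetical) labelling thresholds.
group_a = simulate_reported_rate(threshold=1.0)
group_b = simulate_reported_rate(threshold=1.4)
print(f"group A reports {group_a:.1%}, group B reports {group_b:.1%}")
```

Because both calls use the same seed, any gap between the two printed rates comes entirely from the difference in threshold.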
Given the current “mood music” regarding matters of race and sex in STEM and in wider society, it would not be surprising if, to some extent, narrative-based expectations then fed into perceptions.
Yet such questions are not being asked, either in the design or the interpretation of such surveys. It’s just taken as given that all perceptions and reporting are accurate and unbiased, such that the above tables are faithful representations of how things actually are.
The assumption is that if an incident was perceived as “bullying” then it was bullying, and if one group reports a higher rate then they are indeed being bullied more often. Such assumptions accord with the primacy nowadays granted to “lived experience”, and yet we know that human perception is often hugely unreliable and biased. That’s why scientific trials adopt, for example, control samples and double-blind procedures in order to minimise subjective human evaluations.
No-one would take someone’s self-reports of their own likability, agreeableness, leadership capabilities, sense of humour, alcohol intake, or charitable giving as reliable guides. And, in a personality clash or a minor dispute at work, both parties will regard themselves as being the one in the right, with the other party being the unreasonable one. This is just an inevitable feature of human interactions.
One of the few to begin querying the assumptions behind such surveys is Wilfred Reilly, a Professor of Political Science at Kentucky State University. He first asks his students at what rate they experience mildly-negative interactions with other people. Interestingly, he finds that white students and black students (this is the American South) report much the same rate. He then asks them what fraction of such incidents they think resulted from racial bias on the part of the other person. As expected the white students say few, but the black students say about half. Which means that black students are perceiving a racial element where — given that the overall rate is the same — there cannot be any. In short, black students are perceiving as racial “micro-aggressions” incidents that are just normal human interactions that happen just as much to whites.
The suggestion is backed up by personal testimony, for example a (black) South African who lived in the US reflects on her past attitudes:
The worldview that I had assumed was awfully cynical, as I filtered all my daily interactions through the lens of racial power dynamics. Any mistreatment that I perceived from strangers or friends I often interpreted as a diluted form of racism called a microaggression.
Thus we cannot assume, in surveys such as the above table from the Royal Astronomical Society, that “people of colour” tend, on average, to adopt the same threshold for labelling an incident as “bullying, harassment or discrimination” as white people. Nor can we assume that men in general are adopting the same threshold as women, or LGBT+ people (again on average) as straight people. And if we can’t assume that then we can’t treat the reported rates as being comparable. Though I’m sure that even raising this issue will be treated by some as heresy, an improper questioning of people’s “lived experience”.
Professor Reilly’s studies concerned race; a recent study from the University of California, San Diego, concerns sexual harassment. The authors (Rupa Jose, James H Fowler & Anita Raj) find that the rate at which women report sexual harassment depends on whether they are politically conservative or politically liberal. But do conservative women actually suffer less harassment, or is that because they tend to adopt a different threshold for what constitutes “harassment”? We don’t know. The study concludes: “Research is needed to determine if political differences are due to reporting biases or differential vulnerabilities”.
All of which means that we need a lot more thought and academic rigour in surveys of harassment and bullying. One can fairly reply that such surveys are relatively new and are a good first step. Yes, that’s true, but if we treat the outcome of such surveys as mattering — which we should — then we need to do them well.