The probability of having one out of twenty-six participants at a scientific meeting be female
Jonathan A. Eisen
University of California, Davis
Scientific conferences have key participants, which I define as the speakers and the organizers. Such key participants can be divided into two main classes based on gender: male and female, which I denote here as M and F, respectively (I realize there are other gender classes, and I regretfully am not including them here). The number of key participants (which I denote as KP) varies significantly between conferences. For this analysis I focused on meetings with KP = 26. This value was selected for multiple reasons, including (a) that it is the number of letters in the English alphabet, (b) that its factors include the number 13, which I like, and (c) that in the email announcements for this meeting KP = 26. I sought to answer a relatively simple question: what is the probability that, for a meeting with KP = 26, F = 1? I chose this because it seemed extreme and because F = 1 in the email announcements for this meeting. Using the binomial probability mass function:
$$\Pr(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$$
which becomes
$$\Pr(X = k) = \frac{n!}{k!\,(n - k)!}\, p^{k} (1 - p)^{n - k}$$
n = KP = number of key participants
k = F = the number that are female
p = proportion of F in the population being sampled
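For the case considered here (k = 1, n = 26), the formula simplifies considerably. The reduction below is a short worked step, assuming the formula intended above is the standard binomial probability mass function, with n, k, and p as just defined:

```latex
\Pr(F = 1) = \binom{26}{1}\, p^{1} (1 - p)^{25} = 26\, p\, (1 - p)^{25}
```

The probability therefore depends on p only through the product p(1 - p)^25, which is largest at p = 1/26 and falls off quickly as p grows.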
I have calculated Pr(F=1) for KP = 26. Assuming for the moment that p = 0.5 (i.e., that the population to be sampled is 50:50 male vs. female), Pr(F=1) = 3.8743E-07. This is highly unlikely by chance alone. However, the assumption of p = 0.5 is certainly off in some fields. I therefore calculated Pr(F=1) for different frequencies of F in the population (i.e., for different expected proportions of females in the population being sampled).
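The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the original analysis: the function name `prob_exactly_k` and the grid of p values are assumptions made here, and the model is the standard binomial one (independent draws from a population with female frequency p).

```python
from math import comb


def prob_exactly_k(n: int, k: int, p: float) -> float:
    """Binomial probability of exactly k successes in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


KP = 26  # number of key participants, as in the text
F = 1    # number of female key participants, as in the text

# p = 0.5 corresponds to a 50:50 population.
print(f"p = 0.50: Pr(F=1) = {prob_exactly_k(KP, F, 0.5):.4e}")

# Hypothetical grid of population frequencies; the 5% step is an arbitrary choice.
for percent in range(5, 55, 5):
    p = percent / 100
    print(f"p = {p:.2f}: Pr(F=1) = {prob_exactly_k(KP, F, p):.4e}")
```

The first printed line should agree with the value quoted for p = 0.5; the remaining lines show how quickly the probability falls as p increases.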
Thus for a meeting with KP = 26, only when the frequency of F drops to ~0.16 does Pr(F=1) exceed 0.05. The question, then, is what to use for p for this meeting. An informal survey (John Hogenesch, posted to Facebook at https://www.facebook.com/jonathaneisen/posts/10151208978630767?comment_id=24634832&offset=0&total_comments=15 ) suggests that in qBio the percentage is about 20%. However, that may not be ideal: the meeting is specifically about synthetic biology, and I do not have an estimate of p for this field. Examination of key meetings in the field (e.g., see http://syntheticbiology.org/Conferences.html for a list) suggests a somewhat higher percentage. For example, at SB5 it was about 35%. I conclude that it is likely that p > 20% in synthetic biology. Given that Pr(F=1) < 0.05 for p = 0.2, I conclude that the null hypothesis (that having one female out of 26 key participants is due to chance alone) can be rejected, and that this meeting has a biased ratio of males to females.
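The threshold and the two field estimates mentioned above can be checked with a small script. This is a sketch under the same binomial assumption; the helper `prob_exactly_k`, the 0.001 grid spacing, and the variable names are illustrative choices made here, not part of the original analysis.

```python
from math import comb


def prob_exactly_k(n: int, k: int, p: float) -> float:
    """Binomial probability of exactly k successes in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


KP, F, ALPHA = 26, 1, 0.05

# Largest p on a 0.001 grid (an arbitrary resolution) at which Pr(F=1)
# still exceeds ALPHA. Only this upper crossing matters for the argument,
# since the field estimates in the text are all 20% or higher.
grid = [i / 1000 for i in range(1, 1000)]
upper = max(p for p in grid if prob_exactly_k(KP, F, p) > ALPHA)
print(f"Pr(F=1) > {ALPHA} up to about p = {upper:.3f}")

# Frequencies quoted in the text: ~20% (informal qBio survey), ~35% (SB5).
for label, p in [("qBio survey", 0.20), ("SB5", 0.35)]:
    print(f"{label}: p = {p:.2f}, Pr(F=1) = {prob_exactly_k(KP, F, p):.4e}")
```

Under these assumptions, both quoted frequencies give probabilities below 0.05, and the higher SB5 estimate gives a much smaller probability than the 20% estimate.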