I wanted to understand the impact of the CHI Program Committee meeting on the final choice of papers. For context, there are 126 ACs listed on the website. So that's an average of about 2.4 papers accepted per AC and 8.3 papers rejected per AC, adding up to an average of 10.7 papers per AC. What's the impact on the conference of flying all these people to Atlanta, putting them up in a hotel for a few nights, and having them sit in rooms together for a day and a half?
We could imagine that an alternative might be to make decisions based entirely on review scores. What would that look like? Let's assume we would accept the same number of papers. So to look at the impact of the meeting, let's make a prediction that accepts the same number of papers (302), and compare that to reality. Practically, this means taking a December 2 dump of the Precision Conference database (to avoid the changes made over the course of the meeting), sorting it by score and taking the first 302 papers. This is equivalent (to within about 2 or 3 papers) to accepting all papers with a score >=3.42. Comparing that to what really happened gives:
no difference in 1226 out of 1342 papers
papers predicted as rejects (i.e. score <3.42) that were accepted: 56
papers predicted as accepts (i.e. score >=3.42) that were rejected: 59
for a total number of changes as a result of the meeting of 115
or, in other words, about 57 papers moved in each direction (accept to reject, and reject to accept) because of the meeting
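For anyone who wants to repeat this kind of comparison on their own data, here is a minimal Python sketch of the procedure: rank papers by score, treat the top N as predicted accepts, and count where the real outcome differs. The function name `compare_prediction`, the `(score, accepted)` layout and the toy data are my own illustrative assumptions; this is not the actual Precision Conference export or the script behind the numbers above.

```python
from typing import Dict, List, Tuple


def compare_prediction(papers: List[Tuple[float, bool]], n_accept: int) -> Dict[str, int]:
    """Compare a score-only prediction against the real decisions.

    papers   -- hypothetical layout: (pre-meeting score, was the paper actually accepted?)
    n_accept -- how many papers the prediction should accept (302 in the post)
    """
    # Rank paper indices by score, highest first; the top n_accept are predicted accepts.
    ranked = sorted(range(len(papers)), key=lambda i: papers[i][0], reverse=True)
    predicted_accept = set(ranked[:n_accept])

    # Predicted rejects that were accepted anyway.
    rescued = sum(
        1 for i, (_, accepted) in enumerate(papers)
        if accepted and i not in predicted_accept
    )
    # Predicted accepts that were rejected anyway.
    dropped = sum(
        1 for i, (_, accepted) in enumerate(papers)
        if not accepted and i in predicted_accept
    )
    unchanged = len(papers) - rescued - dropped
    return {"unchanged": unchanged, "rescued": rescued, "dropped": dropped}


# Toy data, invented purely for illustration.
toy = [(4.5, True), (4.0, False), (3.5, True), (3.0, True), (2.0, False), (1.0, False)]
result = compare_prediction(toy, n_accept=3)
print(result)
```

The same function works for the 1AC-only comparison if the first element of each pair is the 1AC score rather than the overall score. Note that ties at the cutoff are broken by input order here, which is one reason a fixed threshold and a top-N cut can disagree by a few papers.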
Another way might be to look at the 1AC score alone. That would mean that ACs had more influence on the final decision about a paper than the reviewers. So what if we just accepted the decisions of the ACs? Let's do the same math; it turns out that the top 302 papers are (as it happens) also those with a 1AC score >=3.42:
no difference in 1199 out of 1342 papers
papers predicted as rejects that were accepted: 31 (i.e. if the AC doesn't like it, it's pretty unlikely it'll get in)
papers predicted as accepts that were rejected: 111 (i.e. even if the AC does like it, that doesn't mean it will get in)
So in conclusion: a) the PC meeting, while elaborate and expensive, does make a substantial difference to the shape of the conference, and b) ACs do not have excessive power to get papers they are (at least somewhat) enthusiastic about into the conference.
The second point I'd like to make involves ACs' ability to reject papers that are clearly unsuited for the conference without involving three reviewers in the process. There were 300 papers this year with an average score <= 2.00, and 100 papers with an average score <= 1.00. Or, in other words, that's 1200 reviews written for papers that had little chance of getting in and 400 reviews written for papers that had no chance of getting in. There's some debate about this, because there is a sense of obligation to new researchers who may be trying to enter the field for the first time. However, the amount of time that currently needs to be invested in these works seems to be out of proportion to the amount the authors do or will contribute to the field. In my experience, the short version of the advice to all such authors is "Go read some papers that have been accepted in the past. Now make your paper more like them." There may be some ways to extend this - suggesting two or three papers that are particularly relevant, or adding a few specific details ("Please pick only one topic that you are trying to talk about in the limited space available to you", or "Please have your paper edited by a native English speaker") - but this seems like an opportunity to reduce overhead. (James makes a similar point, I note: http://palblog.fxpal.com/?p=2425#comments)