There was another thread about the reason for the jump in quality this year. This thread is about whether such a jump exists. Namely, I don't actually see one (I wish I did, and wishing makes me see hints of it where the data are probably fairly inconclusive). Sorry for the length; it would almost be better to make this a separate webpage, but if I did, I'd allow myself to go on even longer, and that doesn't seem like a great idea just yet.
I'm not super good at statistics, so can I ask whether my analysis is right, and whether there is anything else I should be doing? Also, am I setting up and interpreting the dummy variables correctly?
My data set is here; note that the data will change as it is updated to include more posted results and to fix my own errors. If you would like to use the data, feel free (I haven't formally done so, but I intend to put everything I have under some sort of CC license that would require that I be credited for my work...). I code acceptance as 1, rejection as 0, pending/waitlist as -1, and no data as -99. Everything else should be self-explanatory, but feel free to ask. If you do use it, I would appreciate being told how it is being used, and whether you learn anything interesting.
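For anyone loading the data, here is a minimal Python sketch of how the outcome codes above could be turned into a 0/1 response for one school. The fits in this post were done in R; the function name and the example column are made up for illustration, and dropping pending and missing entries before fitting is an assumption, not something the data set enforces.

```python
# Outcome codes as described in the post.
ACCEPT, REJECT, PENDING, NO_DATA = 1, 0, -1, -99

def usable_outcomes(codes):
    """Keep only decided applications (accept or reject) for a logit fit."""
    return [c for c in codes if c in (ACCEPT, REJECT)]

# Hypothetical column for one school, not taken from the real data set.
example_column = [1, 0, -1, -99, 0, 1, -99]
decided = usable_outcomes(example_column)

print(decided)                      # [1, 0, 0, 1]
print(sum(decided) / len(decided))  # raw acceptance rate: 0.5
```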
So, I fit a logit model where I regress acceptance/rejection at school X on GREQ (centered by subtracting the mean, 788.7, from each score), GPA (centered by subtracting the mean, 3.734, from each score), and two dummy variables, y2008d and y2009d, based on the year of the application. Am I correct that the intercept cannot exactly be thought of as the coefficient on y2007d (which isn't included due to perfect multicollinearity, which, if I hadn't known before, I would have figured out from the R output)? And if so, what can it be thought of as? Here is the output for Penn:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)  -0.47513     0.74847   -0.635    0.5256
GREQ          0.01746     0.04917    0.355    0.7226
GPA           5.34905     2.77505    1.928    0.0539 .
y2008d        0.52080     0.82979    0.628    0.5302
y2009d       -1.56755     0.87218   -1.797    0.0723 .
The y2009d coefficient of -1.568 is significant only at the 10% level, which isn't bad for an n as small as ours (44).
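As a cross-check on the significance level, the z value and two-sided p-value in the y2009d row can be recomputed from the estimate and standard error alone. This is a small Python sketch (not the R code behind the table), assuming the usual standard normal reference for a Wald test; `wald_test` is a name chosen here for illustration.

```python
import math

def wald_test(estimate, std_error):
    """Return (z, two-sided p-value) using a standard normal reference."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# y2009d row of the Penn output.
z, p = wald_test(-1.56755, 0.87218)
print(round(z, 3), round(p, 4))
```

The result should agree with the z value and Pr(>|z|) columns for y2009d up to rounding, that is, significant at 10% but not at 5%.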
Anyway, can I interpret that -1.568 as an indicator of how much harder it was to get into Penn this year? Not quite, I think. If I use the "divide by 4 rule" (Gelman talks about this in arm), applying in 2008 increases the average applicant's chance of an acceptance to Penn by something less than 13 percentage points, while applying in 2009 decreases it by something less than 39 percentage points. Applying the inverse logit, we can calculate the probability that the average applicant gets in: Pr(acc to Penn in 2007) = invlogit(-0.47513) = 38%, Pr(acc to Penn in 2008) = invlogit(-0.47513 + 0.52080) = 51%, and Pr(acc to Penn in 2009) = invlogit(-0.47513 - 1.56755) = 11%.
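The arithmetic above can be reproduced with a few lines of Python (a sketch, not the original R session). It assumes the coefficients exactly as printed in the Penn table, and that because GREQ and GPA are centered, the "average applicant" has both centered predictors equal to zero, so the intercept alone is the 2007 log-odds. `p_accept` and `YEAR_EFFECT` are names made up for this example.

```python
import math

def invlogit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Coefficients copied from the Penn output. 2007 is the omitted baseline year.
INTERCEPT = -0.47513
B_GREQ, B_GPA = 0.01746, 5.34905
YEAR_EFFECT = {2007: 0.0, 2008: 0.52080, 2009: -1.56755}

def p_accept(year, greq_centered=0.0, gpa_centered=0.0):
    """Fitted acceptance probability; the defaults describe the average applicant."""
    eta = INTERCEPT + B_GREQ * greq_centered + B_GPA * gpa_centered + YEAR_EFFECT[year]
    return invlogit(eta)

for year in (2007, 2008, 2009):
    print(year, round(p_accept(year), 2))  # 0.38, 0.51, 0.11

# Divide-by-4 rule: |coefficient| / 4 is an upper bound on the change in probability.
bound_2008 = abs(YEAR_EFFECT[2008]) / 4
bound_2009 = abs(YEAR_EFFECT[2009]) / 4
change_2008 = p_accept(2008) - p_accept(2007)
change_2009 = p_accept(2009) - p_accept(2007)
print(round(bound_2008, 2), round(bound_2009, 2))
```

The actual changes for the average applicant (`change_2008`, `change_2009`) come out smaller in size than the divide-by-4 bounds, which is what "something less than" means here.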
Now, that last number looks about right, but the other two probabilities look way too high. The standard error for both dummies and the intercept is around 0.8, and the actual probabilities are likely to be within the confidence intervals, but it's a bit depressing to have such high standard errors. Also, I chose to show Penn because, while nearly all schools had negative signs on y2009d, not many are significant at any level.
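To see how wide those intervals are, here is a rough Python sketch for the 2007 baseline only. It assumes a normal approximation on the log-odds scale (estimate plus or minus 1.96 standard errors) and then maps both ends through the inverse logit. Intervals for 2008 and 2009 would also need the covariance between the intercept and each dummy, which is not in the printed output, so they are not attempted here. `baseline_interval` is a hypothetical helper name.

```python
import math

def invlogit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def baseline_interval(estimate, std_error, z_crit=1.96):
    """Approximate 95% interval for the baseline probability."""
    lower = invlogit(estimate - z_crit * std_error)
    upper = invlogit(estimate + z_crit * std_error)
    return lower, upper

# Intercept row of the Penn output (2007, average GREQ and GPA).
low, high = baseline_interval(-0.47513, 0.74847)
print(round(low, 2), round(high, 2))
```

Under these assumptions the interval for the average 2007 applicant runs from roughly 13% to 73%, so a true probability well below the 38% point estimate is entirely compatible with the fit.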
In fact, Cornell gives a probability of 5% in 2007, 41% in 2008, and 47% in 2009 (that is, the sign on y2009d was positive; indeed, it was equal to 2.76 with standard error 1.13). I think Cornell was thrown off by two applied admits and wind up bird's admit, all with GREQ below 800, while all the rejections had 800 GREQs.
Anyway, the takeaway is that I am not seeing a large effect of y2009d (although there does seem to be a small one). If instead of dummies we include a year trend, the coefficient is usually negative, but this doesn't make anything much clearer.
We do have a few other pieces of data. One of note is jlist's statement about Chicago, basically that this year was harder than average. Indeed, it was harder than 2008, but only barely harder than 2007. To compare 2009 and 2007 for Chicago, we look at a model with y2007d and y2009d (so 2008 is the baseline), and we get:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   0.33146     0.53573    0.619    0.5361
GREQ          0.06039     0.03977    1.519    0.1289
GPA           3.03195     2.14808    1.411    0.1581
y2007d       -1.35248     0.77709   -1.740    0.0818 .
y2009d       -1.39423     0.75848   -1.838    0.0660 .
Using invlogit, we can calculate a probability of acceptance of 36% in 2009 versus 38% in 2007 (which seems a bit high) for tmers applying to Chicago.
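The same invlogit calculation can be run on the Chicago table as a check. This Python sketch assumes the coefficients exactly as printed, with 2008 as the omitted baseline year and an applicant at the mean GREQ and GPA; `YEAR_EFFECT` and `probs` are illustrative names.

```python
import math

def invlogit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Coefficients copied from the Chicago output. 2008 is the omitted baseline year.
INTERCEPT = 0.33146
YEAR_EFFECT = {2007: -1.35248, 2008: 0.0, 2009: -1.39423}

# Fitted probability for the average applicant in each year.
probs = {year: invlogit(INTERCEPT + effect) for year, effect in YEAR_EFFECT.items()}
for year in sorted(probs):
    print(year, round(probs[year], 2))
```

With the printed coefficients this gives about 58% for 2008 and about 26% for both 2007 and 2009, with 2009 very slightly lower. The ordering matches the comparison above, but the levels are lower than the 36% and 38% quoted, which may reflect a different version of the data; either way, 2009 looks barely harder than 2007.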
The other piece of data we drop is applicants' acceptances to other schools. The holy grail for this data set is to somehow incorporate that data, but I have yet to figure out how...
Other notes: there is some selectivity in reporting. I don't think I am the only person who did not bother to put an MIT rejection in the admissions and rejections thread. I think people who don't get any top 10 admits generally don't bother to post all of their top 5 results; it's a bit embarrassing to say that I thought there was even a chance (my excuse was that everyone is supposed to apply for NSF, and if you get NSF, you might get into MIT). Further selectivity comes from the fact that the average tmer is possibly above average overall. In fact, I think seeing more than 50% of tmers get into Chicago last year makes that pretty clear.
Also, because n is small, small fluctuations, such as someone with low numbers getting into top 5 schools, can mess things up (in a good way).
Today it looks like Princeton is rejecting many of its waitlisted candidates. Since the 2009 numbers don't include waitlists, and most waitlists are eventually rejected (especially at top schools), this will change things a bit, too. I checked the numbers under the assumption that all waitlists in 2009 are rejected, and it didn't change much (although the changes were significant for those involved; I hope that trying to look at these statistics doesn't hurt anyone on a personal level).
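The waitlist check described above amounts to recoding the 2009 pending/waitlist entries (-1) as rejections (0) before refitting. Here is a minimal Python sketch of that recoding step only; the records are hypothetical (year, outcome) pairs rather than real data, and the function name is made up for illustration.

```python
PENDING, REJECT = -1, 0

def waitlists_as_rejections(records, year=2009):
    """Recode pending/waitlist outcomes in `year` as rejections; leave the rest alone."""
    return [
        (app_year, REJECT if (app_year == year and outcome == PENDING) else outcome)
        for app_year, outcome in records
    ]

# Hypothetical (year, outcome) pairs for one school, not real data.
records = [(2008, 1), (2008, -1), (2009, -1), (2009, 1), (2009, 0), (2009, -99)]
recoded = waitlists_as_rejections(records)
print(recoded)
```

Only the 2009 waitlist entry changes; earlier years and missing entries (-99) are left untouched, so the refit differs from the original fit only through the added 2009 rejections.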

