Of Tampered Results and Cyber Security

It was the first week of June 2013, and the name Debarghya Das was all over the news. Be it digital, print or electronic media, Debarghya Das was the news. The reason? Das, a student of Indian origin and a Cornell University graduate, had hacked into the CISCE website and single-handedly procured the results of around 1,50,000 ICSE and 65,000 ISC students! Oh wait! That’s just the tip of the iceberg. The fact that an ordinary college student could gain access to such confidential information with minimal technical knowledge raised many a concern; the results themselves, meanwhile, were shocking!

Das chose some of the most popular ICSE subjects, namely English, History-Civics-Geography, Computer Applications, Hindi, Science and Maths, and plotted graphs of marks against the number of students. Surprisingly, all the graphs peaked at more or less the same intervals! When the distributions were plotted together on a single graph, all the peaks aligned and the same values were missing for every subject!

33 numbers, namely 36, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 56, 57, 59, 61, 63, 65, 67, 68, 70, 71, 73, 75, 77, 79, 81, 82, 84, 85, 87, 89, 91 and 93, were missing from the mark lists of 1,50,000 students, which is statistically impossible!
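
To make the idea of "missing" marks concrete, here is a minimal Python sketch of how such gaps could be detected in a list of scores. It is not Das' actual code; the function name `missing_marks` and the tiny sample list are made up purely for illustration.

```python
from collections import Counter


def missing_marks(marks, low=None, high=None):
    """Return the integer marks in [low, high] that no student received.

    If low/high are not given, the lowest and highest marks
    actually present in the data are used as the range.
    """
    counts = Counter(marks)
    if not counts:
        return []
    low = min(counts) if low is None else low
    high = max(counts) if high is None else high
    # A mark is "missing" when its count in the data is zero.
    return [m for m in range(low, high + 1) if counts[m] == 0]


# Tiny made-up sample, NOT real ICSE data.
sample = [35, 38, 38, 40, 42, 44, 44, 46]
print(missing_marks(sample))  # [36, 37, 39, 41, 43, 45]
```

Run on the full mark lists of each subject, a check like this would list every value that never appears, which is the kind of gap the graphs made visible.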

What made these numbers unattainable? Grace marks? Sheer coincidence?
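
How unlikely is sheer coincidence? A rough back-of-the-envelope sketch in Python gives a feel for it. The per-student probability below is an assumption chosen only for illustration (it does not come from the real data), and the students' marks are assumed to be independent of each other.

```python
import math


def log10_prob_never_seen(p, n):
    """log10 of the chance that a mark is absent from all n students.

    Assumes each student independently receives the mark with
    probability p, so the chance of it never appearing is (1 - p) ** n.
    Working in logarithms avoids floating-point underflow.
    """
    return n * math.log1p(-p) / math.log(10)


n_students = 150_000    # roughly the number of ICSE students in the article
p_hypothetical = 0.001  # ASSUMED: 1 student in 1,000 would score this mark

exponent = log10_prob_never_seen(p_hypothetical, n_students)
print(f"Chance of the mark never appearing: about 10^{exponent:.0f}")
```

Under these assumptions the chance of a single mark going missing works out to roughly 10^-65, and that is before asking for 33 different marks to be missing at the same time.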

Here’s an excerpt from his blog:

If you’re a skeptic and still don’t believe when I say the absence of a 93 is statistically impossible, read on. One’s total ICSE score is broadly gauged by one of 3 metrics – Overall Average, Best 5 subjects, and Best 4 subjects, plus English. Statistics says that if you take enough samples of data, regardless of the distribution, it will average out into a Normal distribution. When I plot the distribution of these metrics, voila!

What was initially a jagged mess has all of a sudden become a refined slightly askew bell. Statistics magically transformed that jagged mess into a nice curve. It is the same statistical theory that says that it is not possible for that 93 and those 32 other numbers to be absent from the previous distribution.

Well, the bottom line is that there is no plausible explanation for the missing numbers other than the possibility of students being granted grace marks. The marks have clearly been tampered with, and all it took to discover the truth was some basic statistical and technical knowledge!

Source of graphs:
Debarghya Das’ TEDx talk on the same

