I've been considering how I would run the RankManiac 2010 contest if I were in charge. One of the pitfalls of the contest is that it leverages existing social connections: well-connected students have a clear advantage over less well-connected ones, and students willing to exploit their connections will beat those who are unwilling to do so. The question, of course, is how to fix this.
In theory, the professor and TAs could set up a dummy internet with its own mini-Google. Each student could then place pages on the dummy internet and see whose pages rank highest. But it seems like this would devolve into a contest of who can add the most pages, or else the number of pages would have to be capped; either way, the assignment becomes simplistic.
I think a big part of this assignment's worth lies in its interaction with the "real" Internet. Like it or not, the Internet is the first and last word in modern information. Search engines are a big part of that, and understanding the algorithms they use to produce good search results is key to understanding the underlying structure of the Internet. For example, did you know that Google serves CAPTCHA tests to traffic from known Tor exit nodes?
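To make the "algorithms they use" part concrete, here's a minimal sketch of PageRank, the algorithm at the heart of Google's original ranking, computed by power iteration over a tiny hand-built link graph. The graph and function names are illustrative, not anything from Google.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each node to its list of outbound links."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start uniform
    for _ in range(iterations):
        # Every node keeps the "random jump" baseline...
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outbound in links.items():
            if not outbound:
                # Dangling node: spread its rank evenly over everyone.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
            else:
                # ...plus an equal share of each inbound neighbor's rank.
                share = damping * rank[node] / len(outbound)
                for target in outbound:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# Ranks sum to ~1.0; "c", with two inbound links, ends up highest.
```

The intuition matters for the contest: a page's rank comes from who links to it, which is exactly why existing social connections are such an advantage.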
The existing social networks within the Internet are also key to understanding how and why it works. Systems like Reddit or Slashdot can give some insight into distributed computing problems. For example, in a distributed network of volunteer sensors working on some project, a trust system like Reddit's might be the best way to stop new users from biasing the results one way or the other.
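The trust-system idea can be sketched as a weighted aggregate: each contributor's reading is weighted by an earned trust score, so a flood of brand-new accounts moves the result very little. This is my own illustration of the concept, not any real Reddit or Slashdot mechanism; all the names and numbers are made up.

```python
def weighted_average(readings, trust):
    """readings: {user: value}; trust: {user: nonnegative weight}."""
    total_weight = sum(trust.get(u, 0.0) for u in readings)
    if total_weight == 0:
        raise ValueError("no trusted contributors")
    return sum(v * trust.get(u, 0.0) for u, v in readings.items()) / total_weight

# Two established users with earned trust, fifty fresh sockpuppets.
trust = {"alice": 10.0, "bob": 8.0}
trust.update({f"sock{i}": 0.01 for i in range(50)})  # low starting trust

readings = {"alice": 20.0, "bob": 21.0}
readings.update({f"sock{i}": 100.0 for i in range(50)})  # coordinated bias

print(weighted_average(readings, trust))  # stays near 20, not 100
```

An unweighted mean of the same readings would land near 97; the trust weighting keeps the aggregate close to what the established contributors report until the newcomers earn weight of their own.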
These interactions have a complexity that would simply be impossible to recreate in the mini-Google environment mentioned earlier. I don't really see a way to get around using the Internet itself as a sandbox for this kind of assignment.
I'll leave you with this picture of the Internet (from the Opte Project at http://www.opte.org/). It's a big, big place.