On my page The Gnutella Network: Why it sucks, I explained why Gnutella sucks and why everyone should adopt 'my way' of doing things. There were some good things and some bad things in that particular rant. There have been some changes (improvements) to Gnutella since, but it's still not quite what you want. Meanwhile, everyone who is using Windows is using KaZaA, a similar network which is fully encrypted and supports downloading a file through multiple streams, which speeds up downloads considerably. It is really a great tool, although it does seem to have a minor problem with resuming downloads if you have been disconnected for a while (like overnight).
This rant is not about Gnutella or KaZaA; it is about file sharing, and about doing it in a different way than I described before. When I was thinking about it, I figured that this could be the way KaZaA works, although I honestly don't know, and there is no way to check because its design is not open to the community. On the other hand, KaZaA talks to Grokster, and Grokster works much like Gnutella (or so I'm told).
So, what's new? I figured I just wanted to advertise my shared files on a network, and
allow people to download my shared stuff. I could tell a friend 'the file you want is
here' and he could pick it up. That's possible with FTP, nothing new there.
But what if I didn't know where to get a particular file? If I need information on a
particular subject, I put it into Google (a web search engine), and numerous locations
pop up. A file sharing tool could work the same way: you advertise your files to an
indexing service, and anyone can query that service to find out where a file is.
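To make the advertise/query idea a bit more concrete, here is a minimal sketch of such an index in Python. It is only an illustration: the class name IndexServer, the methods advertise and query, the word-splitting rule and the peer addresses are all made up for this example, and everything lives in memory with no networking at all.

```python
from collections import defaultdict


class IndexServer:
    """Toy in-memory index: maps filename words to (peer, filename) entries.

    All names here are hypothetical; a real index would speak some
    network protocol instead of being called directly.
    """

    def __init__(self):
        self._by_word = defaultdict(set)

    def advertise(self, peer, filename):
        # Split the filename into lowercase words so it can be found by keyword.
        words = filename.lower().replace(".", " ").replace("_", " ").split()
        for word in words:
            self._by_word[word].add((peer, filename))

    def query(self, keyword):
        # Return every (peer, filename) pair advertised under this keyword.
        return sorted(self._by_word.get(keyword.lower(), set()))


index = IndexServer()
index.advertise("10.0.0.5:6346", "linux_kernel_howto.txt")
index.advertise("10.0.0.9:6346", "Kernel Notes.txt")
print(index.query("kernel"))
```

The query returns locations, not file contents; the actual transfer still happens directly between the two peers.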
The indexing service must be decentralized. The index servers can form their own
Gnutella-like network. An index server that cannot answer a query can propagate
it to the next index server. Instead of routing the answer back through the
network, I suggest sending the answer straight back to the originator. Another way
would be to tell the querying client which index servers there are, and let it query
multiple servers itself, instead of letting the servers query each other. This creates more
traffic on the client side, but simplifies the design and implementation of the index
server.
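The second variant, where the client does the walking, could look roughly like the sketch below. This is an assumption-laden toy: the IndexServer class, its lookup and known_servers methods, the client_query function and the max_servers limit are invented for illustration, and the "servers" are plain objects rather than hosts on a network.

```python
class IndexServer:
    """Hypothetical index server that only answers from its own table."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = dict(entries)  # filename -> list of peers
        self.neighbours = []          # other index servers it knows about

    def lookup(self, filename):
        return list(self.entries.get(filename, []))

    def known_servers(self):
        return list(self.neighbours)


def client_query(start, filename, max_servers=10):
    """Ask servers one by one, learning about new servers along the way."""
    seen, queue, hits = set(), [start], []
    while queue and len(seen) < max_servers:
        server = queue.pop(0)
        if server.name in seen:
            continue  # never ask the same server twice
        seen.add(server.name)
        hits.extend(server.lookup(filename))
        queue.extend(server.known_servers())
    return hits


a = IndexServer("a", {"song.ogg": ["peer1"]})
b = IndexServer("b", {"song.ogg": ["peer2"], "doc.txt": ["peer3"]})
c = IndexServer("c", {})
a.neighbours = [b, c]
b.neighbours = [a]

print(client_query(a, "song.ogg"))
print(client_query(a, "doc.txt"))
```

Note that the servers never talk to each other here, which is exactly what keeps them simple; the price is that the client makes one request per server.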
The client picks the index server to connect to from a dynamic list of well-known
index servers that are usually running.
Because the index servers are on a relatively small network, it is possible to count the
number of files shared and the total number of bytes.
All files that are shared have a checksum. The SHA-1 algorithm is currently in common
use on peer-to-peer networks, but in fact, any good checksumming algorithm will do.
Computing the checksum takes some time. I suggest not computing the checksum until
someone actually requests the file, and then saving it locally as a cached copy.
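A lazy, cached checksum could be sketched as follows, using SHA-1 from Python's standard hashlib module. The function name sha1_of_file and the cache layout are my own invention here, and the cache is a plain in-memory dictionary; a real tool would presumably write it to disk so it survives a restart. Keying the cache on size and modification time is an assumption too, so that a changed file gets hashed again.

```python
import hashlib
import os
import tempfile

# Hypothetical cache: (absolute path, size, mtime) -> hex digest.
_cache = {}


def sha1_of_file(path, chunk_size=64 * 1024):
    """Compute the SHA-1 of a file on first request, then reuse the result."""
    st = os.stat(path)
    key = (os.path.abspath(path), st.st_size, st.st_mtime_ns)
    if key not in _cache:
        digest = hashlib.sha1()
        with open(path, "rb") as f:
            # Read in chunks so large files do not have to fit in memory.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        _cache[key] = digest.hexdigest()
    return _cache[key]


# Small demonstration on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

first = sha1_of_file(demo_path)   # computed now
second = sha1_of_file(demo_path)  # served from the cache
cached_entries = len(_cache)
os.remove(demo_path)
print(first)
```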
If the index server finds a file of the same size and with the same checksum on
another client machine, it is probably the exact same file that is being shared.
This enables someone else to download the file from multiple sites at once, hence speeding
up the download.
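As a rough sketch of how that could work: group the advertised copies by (size, checksum), then hand each source its own byte range of the file. The function names group_sources and split_ranges, the peer names and the checksum strings are placeholders for illustration, and the even split is just the simplest possible policy, not a claim about how any existing tool does it.

```python
from collections import defaultdict


def group_sources(adverts):
    """Group (peer, size, checksum) adverts into {(size, checksum): [peers]}."""
    groups = defaultdict(list)
    for peer, size, checksum in adverts:
        groups[(size, checksum)].append(peer)
    return dict(groups)


def split_ranges(size, peers):
    """Give each peer one contiguous byte range [start, end) of the file."""
    base, extra = divmod(size, len(peers))
    ranges, start = [], 0
    for i, peer in enumerate(peers):
        # The first `extra` peers get one byte more so the ranges cover the file.
        length = base + (1 if i < extra else 0)
        ranges.append((peer, start, start + length))
        start += length
    return ranges


adverts = [
    ("peerA", 1000, "abc"),
    ("peerB", 1000, "abc"),
    ("peerC", 1000, "zzz"),  # same size, different content
]
sources = group_sources(adverts)
print(sources[(1000, "abc")])
print(split_ranges(1000, sources[(1000, "abc")]))
```

A smarter client would give faster peers bigger ranges, or use many small ranges and hand them out as peers finish.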
I figured this is probably much like the way KaZaA works, but then again, there is no
way to be sure. I do know that this design would make for an excellent file sharing network.
Something I really miss in the current version of KaZaA is bandwidth regulation. I would like
to be able to say "do not use more than 50 Kbps" or "do not use more than 80% of my total
bandwidth". I'm told that some other peer-to-peer download tools do have this option.
I would also like to be able to say "if the download rate drops below 15 Kbps, go search for
more download sources". Or what about "automatically queue downloads if the line is already
congested"?
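One common way to implement such a cap is a token bucket; the sketch below shows the idea together with a trivial "too slow, find more sources" check. The class TokenBucket, the function needs_more_sources and all parameter names are hypothetical, and the rates are in bytes per second for simplicity. The caller passes in the current time (in a real program that would come from something like time.monotonic()), which keeps the example deterministic.

```python
class TokenBucket:
    """Allow at most `rate` bytes per second on average, with bursts up to `capacity`."""

    def __init__(self, rate_bytes_per_s, capacity_bytes, now=0.0):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # start with a full bucket
        self.last = now

    def allow(self, nbytes, now):
        """Return True if nbytes may be transferred at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, but never above the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


def needs_more_sources(bytes_received, seconds, min_rate_bytes_per_s):
    """True if the measured download rate is below the wanted minimum."""
    return seconds > 0 and bytes_received / seconds < min_rate_bytes_per_s


bucket = TokenBucket(rate_bytes_per_s=50_000, capacity_bytes=50_000)
print(bucket.allow(50_000, now=0.0))  # uses up the whole bucket
print(bucket.allow(1, now=0.0))       # nothing left yet
print(bucket.allow(25_000, now=0.5))  # half a second refills 25,000 bytes
print(needs_more_sources(10_000, 1.0, 15_000))
```

When allow returns False, the tool would simply wait a bit before sending or reading more data.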
There are lots of peer-to-peer networks nowadays and plenty of download tools to choose from.
There is still some room for improvement though, if only the developers used their imagination.
Personally, I would like to see an implementation that sees the network as a globally shared
filesystem, which you can mount and use like any other filesystem. This would require a
completely different setup...
Anyway, I think it would still be fun to write my own implementation of a peer-to-peer download
tool some day, when I have nothing useful to do ;-)