Peer-to-Peer overlay networks offers features which weigh differently according to situation. As the authors mentioned in their paper, these are:
- robust wide-area routing architecture
- efficient search of data items
- selection of nearby peers
- redundant storage
- hierarchical naming
- trust and authentication
- massive scalability
- fault tolerance
The authors also mentioned two classifications of P2P overlay networks, namely, structured and unstructured.
On Structured P2P Overlay Networks
Structured P2P overlays take advantage of determinism to increase efficiency of queries within the network. Structured P2P networks make use of Distributed Hash Table for mapping data objects to peers within the network using key-value pairs, in which keys are mapped to unique peers. A look-up table is also maintained by each peer for keeping track of its neighboring peers all the time. Note that this requires a periodic updating of each peer’s look-up table to ensure efficiency of query propagation across the network. It is worth noting that in theory, DHT-based P2P overlays ensure a O(log N) in average query roundtrip time in hops for any data object within the network. The physical topology of the underlying network may differ from what is theoretically assumed though. This may cause in increase in latency within the overlay network which may greatly affect the performance of the applications running on the upper tier of the infrastructure. An advantage of the determinism of structured P2P overlay networks is its efficiency in locating “rare” items within the network. Unstructured networks do not scale well in locating this kind of items within the network due to flooding of query. On the other hand, unstructured networks do work well with high distribution of same items in the network.
Content Addressable Network (CAN) is a highly structured P2P overlay network infrastructure. CAN efficiently facilitates query routing through its mode of identifier assignment to the peers of the network. The identifier space (both for data objects and nodes) is represented by a d-dimensional coordinate space divided into sectors or zones which are assigned to peers. As common to structured P2P overlay networks, peers also keep a look-up table of neighbors to facilitate efficient routing of requests across the network. Introducing multiple d-dimensional coordinate spaces will expand the zone of specific peers thereby also allowing scalability within the network. In contrast to CAN, Chord uses a ring structure to facilitate mapping of peers to identifiers in the identifier space. Peers also maintain a lookup table termed as finger table to keep track of neighbor peers. In Chord, traversal of the network and assignment of keys to peers uses a notion of peer successor which is identified by a factor of modulo 2m from an origin peer during queries. Both CAN and Chord provides facility for handling “insertion” of new peers and “deletion” of current peers into the network. CAN handles this with splitting and joining of zones while Chord does reassignment of successor pointers. Other structured P2P overlay networks discussed by the authors are Tapestry, Pastry, and Kademlia.
Note that in each of these structured infrastructures some of the issues commonly addressed are the query request and response time, keeping track of neighbor peers to facilitate efficient propagation of requests, and joining and leaving of peers resulting to dynamic assignment of data objects within the network. The application performance benefits well from the structure of the these network infrastructures in terms of query propagation due to determinism of the assignment of identifiers in the network. The investment in computing space of structured P2P overlay networks results to gain in speed of propagation of request and response across the network.
On Unstructured P2P Overlay Networks
In contrast to structured P2P overlay networks, unstructured overlay networks do not depend on the topology of the overlay network in providing identifier names for data items and peers. In unstructured networks, there are no visualization of d-dimension zones or rings. The topology of the network evolves freely as peers join and leave the network.
Common to some infrastructures under this category is the use of super or ultra-peers. As the name suggests, these are peers with high bandwidth, large disk space, and high processing power. Gnutella protocol and FastTrack file-sharing system share this commonality. Peers publish their files list to these ultra-peers as meta-info to facilitate more efficient querying of data items. Ultra-peers act as directories wherein ordinary peers could lookup data item information, pointing to the peer/s hosting the queried file, if there are any. Queries are directed to the ultra-peers (they are capable of processing queries due to their high capacity) and are propagated using the common flooding approach.
Decentralized unstructured P2P overlay networks, like Freenet network and networks built on top of the Gnutella protocol, have the advantage of high reliability and fault-tolerance as compared to centralized ones like those built on top of the BitTorrent protocol, which has a single point of failure. On the other hand the details of the design of the BitTorrent protocol justifies the choice of centralization for more efficient file-sharing. A tracker is employed to keep track of the activities and the files available for sharing across the network. The protocol employs measures to ensure fair exchange of bits between peers. Among these are pipelining and choking.
Note that any protocol which requires caching or keeping of meta-info in some part/s of the network requires refreshing either in a periodic or aperiodic manner. In centralized networks like those built on BitTorrent, this is most likely done from all peers propagating their files lists to the central management node. This could be done either periodic (peers republishes their files lists) or in a native manner already built into the design of the protocol (refreshes are incorporated in queries and responses). This incurs overhead space cost in maintaining the list in exchange for faster request-response time. This is addressed by some protocols and systems by using high-capacity peers. Space cost is also starting to cease as an issue due to the enormous advancements in hardware research. Time cost usually still comes first as an issue as compared to space. It may vary though depending on the situation and nature of application of the overlay network.
Weight of the integrity and quality of data received by requesting peers may also vary depending on the situation. Downloaders do not really mind much usually about the quality of movie files they are receiving as long as they get a usable copy. Security, as always, remains to be a major concern in P2P file-sharing systems though. Peers are not assured of this when engaging with file-sharing systems in the Internet. They are only assured, usually, of the availability of data they are looking for.
Violation of copyright laws is also a common issue among file-sharing networks. It is desirable to incorporate in the design of these protocols the filtering of data being shared across the network. It is a difficult task to do so though due to volatility of data in the network and the frequent evolution (maybe revolution should be the more correct term) of topologies of networks. Identifiers for the classification of content could be standardized (but by whom?) but this is a far cry to solving this issue though.