If you have multiple offices or users dialing in from home, or if you are making use of outsourced or offshore development, then you have a distributed development environment. You also likely wish that your network applications worked faster on your limited network connection. In this article, we look at the challenges of distributed version control, what we can do about each challenge, and how we can balance conflicting goals to achieve an effective result.
Challenge 1: A slow network connection. Whether you connect through a cable modem, T1, or other wide area network (WAN) link, or dial up at 56k or worse, a remote connection is significantly slower than a local area network (LAN). A slow connection can bring a version control tool to its knees and make it unusable. To understand how connection performance affects application performance, we look at two metrics: the round-trip (or ping) time and the throughput.
The Ping Time: The ping time measures the latency of the network: how long it takes a tiny message to travel through the network to a remote system and back to the originating system. Delays arise from the speed-of-light limit on signal propagation, network switches that may temporarily store and forward a message, network card and operating system overhead on each end, and TCP or other network protocol overhead. Ping times on a LAN are typically under a millisecond, while on a WAN they can reach hundreds of milliseconds or, in the case of interplanetary spacecraft, many minutes. Long ping times especially penalize protocols that make many small, synchronous requests of a remote server, because each request must wait for the previous reply before it can be sent. For example, while one might easily make 200 such requests per second over a LAN, with a WAN ping time of 100 ms, at most 10 synchronous requests per second can be performed.
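To make the arithmetic concrete, the short Python sketch below computes the ceiling on synchronous requests for a given per-request wait. The function names are illustrative only, and the 5 ms LAN figure is an assumed per-request cost (round trip plus server work) chosen to match the 200 requests per second mentioned above.

```python
def max_sync_requests_per_second(per_request_ms: float) -> float:
    """Upper bound on back-to-back synchronous requests.

    Each request must wait for the previous reply, so the rate cannot
    exceed one request per `per_request_ms` milliseconds.
    """
    return 1000.0 / per_request_ms


def seconds_waiting(request_count: int, per_request_ms: float) -> float:
    """Total time spent waiting when requests are issued one after another."""
    return request_count * per_request_ms / 1000.0


# Assumed figures: 5 ms per request on a LAN, 100 ms ping time on a WAN.
lan_rate = max_sync_requests_per_second(5)      # 200 requests per second
wan_rate = max_sync_requests_per_second(100)    # 10 requests per second
wan_wait = seconds_waiting(500, 100)            # 500 small requests over the WAN
```

Under these assumptions, an operation that issues 500 small requests waits 50 seconds on the WAN for latency alone, before any data transfer time is counted.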
One way to achieve effective performance on a high-latency network is to make fewer requests, perhaps by packing more data into each request and reply. An example might be to get information on all files in a directory in one request, rather than performing a separate request per file. If the client keeps a cache of the directory entries, the application's many small requests generate only a few large ones.
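A minimal sketch of this idea in Python follows. It is not taken from any particular version control tool: the `DirectoryCache` class, the `fake_server_list_directory` function, and the entry fields are all hypothetical, and the fake server simply stands in for one network round trip.

```python
from typing import Callable, Dict

Entries = Dict[str, dict]


def fake_server_list_directory(directory: str) -> Entries:
    """Hypothetical stand-in for ONE round trip that returns every entry."""
    return {
        "main.c": {"revision": 12, "locked": False},
        "util.c": {"revision": 7, "locked": True},
        "util.h": {"revision": 3, "locked": False},
    }


class DirectoryCache:
    """Answers per-file queries from a cached whole-directory listing."""

    def __init__(self, fetch_directory: Callable[[str], Entries]):
        self._fetch_directory = fetch_directory
        self._cache: Dict[str, Entries] = {}
        self.round_trips = 0  # counts how often we actually hit the server

    def file_info(self, directory: str, name: str) -> dict:
        # Only the first query for a directory costs a round trip.
        if directory not in self._cache:
            self._cache[directory] = self._fetch_directory(directory)
            self.round_trips += 1
        return self._cache[directory][name]


cache = DirectoryCache(fake_server_list_directory)
for file_name in ("main.c", "util.c", "util.h"):
    cache.file_info("src", file_name)
```

After the loop, three per-file queries have cost a single round trip. A real client would also need some way to invalidate or refresh the cache when the server's contents change, which this sketch leaves out.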
Another solution is to perform asynchronous requests. For example, one might structure an application so that multiple requests can be streamed to the server, with the client handling each reply as it comes in later. This strategy has limited use when later queries need the results of earlier queries, but it can help significantly in some situations.
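The difference can be illustrated with Python's `asyncio`. In this sketch the network is simulated: `fake_request` is a hypothetical function that just sleeps for one round-trip time, so the code shows only the shape of the technique, not a real protocol.

```python
import asyncio
from typing import List


async def fake_request(name: str, round_trip_s: float) -> str:
    """Hypothetical request: waits one simulated round trip, then replies."""
    await asyncio.sleep(round_trip_s)
    return f"status of {name}"


async def fetch_sequential(names: List[str], round_trip_s: float) -> List[str]:
    # Synchronous style: each request waits for the previous reply.
    return [await fake_request(name, round_trip_s) for name in names]


async def fetch_pipelined(names: List[str], round_trip_s: float) -> List[str]:
    # Asynchronous style: all requests are in flight at once, and the
    # replies are collected in request order as they arrive.
    replies = await asyncio.gather(
        *(fake_request(name, round_trip_s) for name in names)
    )
    return list(replies)
```

With five independent requests and a simulated 100 ms round trip, `asyncio.run(fetch_sequential(names, 0.1))` waits roughly five round trips, while `asyncio.run(fetch_pipelined(names, 0.1))` waits roughly one. The gain disappears when each request depends on the previous reply.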
Throughput: The second WAN performance limiter is throughput, the rate at which we can send data through the network. A 100BASE-T LAN can achieve throughputs exceeding 10 MB per second. Dial-up, cable, DSL, ISDN, T1, and other connections provide different speeds, but all are slower than the typical LAN.
One solution to a network throughput bottleneck is to compress the data stream. Compression is relatively straightforward and typically can be isolated in the code near the point where data is sent and received. The basic tradeoff is that the time used to compress and decompress the data must be less than the network time saved by sending less data.
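The tradeoff can be written as a simple inequality: compression helps when codec time plus the time to send the compressed bytes is less than the time to send the raw bytes. The Python sketch below expresses that check and measures it with the standard `zlib` module. The function names and the link speeds used in the usage note are assumptions for illustration.

```python
import time
import zlib


def transfer_seconds(num_bytes: int, bytes_per_second: float) -> float:
    """Time to push `num_bytes` through a link of the given throughput."""
    return num_bytes / bytes_per_second


def compression_pays_off(raw_size: int, compressed_size: int,
                         codec_seconds: float, bytes_per_second: float) -> bool:
    """True when compress + decompress + smaller transfer beats a raw transfer."""
    with_compression = codec_seconds + transfer_seconds(compressed_size, bytes_per_second)
    without_compression = transfer_seconds(raw_size, bytes_per_second)
    return with_compression < without_compression


def measure(payload: bytes, bytes_per_second: float) -> bool:
    """Compress and decompress `payload` locally, then apply the check."""
    start = time.perf_counter()
    compressed = zlib.compress(payload)
    zlib.decompress(compressed)
    codec_seconds = time.perf_counter() - start
    return compression_pays_off(len(payload), len(compressed),
                                codec_seconds, bytes_per_second)
```

For example, with an assumed 1 MB payload that compresses to 250 KB in 0.05 seconds, `compression_pays_off(1_000_000, 250_000, 0.05, 7_000)` is true on a roughly 56k dial-up link (about 7,000 bytes per second), while `compression_pays_off(1_000_000, 250_000, 0.5, 10_000_000)` is false on a fast LAN where the codec takes longer than the transfer it saves.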
Another solution is to send only the data that is strictly required. This strategy is usually harder to implement, because it can degenerate into lots of little requests, where the ping time will dominate performance.
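One possible way to send less data without multiplying requests is to compare content digests: the client obtains a manifest of digests in a single request, works out locally which files differ, and then asks for only those files in one batched request. The Python sketch below shows the local comparison step. The manifest format and function names are hypothetical, and this is one approach among several, not a description of any specific product.

```python
import hashlib
from typing import Dict, List


def digest(content: bytes) -> str:
    """Content fingerprint used to detect changes without sending the file."""
    return hashlib.sha256(content).hexdigest()


def files_to_fetch(local_files: Dict[str, bytes],
                   remote_manifest: Dict[str, str]) -> List[str]:
    """Names that are missing locally or whose content differs from the server.

    `remote_manifest` maps file name to digest and is assumed to arrive in
    a single request, so the comparison itself costs no extra round trips.
    """
    return sorted(
        name
        for name, remote_digest in remote_manifest.items()
        if name not in local_files or digest(local_files[name]) != remote_digest
    )


local_files = {"a.c": b"unchanged", "b.c": b"old contents"}
remote_manifest = {
    "a.c": digest(b"unchanged"),
    "b.c": digest(b"new contents"),
    "c.c": digest(b"brand new file"),
}
needed = files_to_fetch(local_files, remote_manifest)
```

Here `needed` names only the changed file and the new file, so the unchanged file is never transferred, and the whole exchange takes two round trips regardless of how many files the directory holds.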
Long ping times and low throughput require specific and careful design to achieve working performance over a WAN. Carefully chosen request sizes and effective data stream compression can go a long way toward improving a WAN's effectiveness. Moreover, a close match between the specific information the client applications need and what the server can provide will enable optimal performance.
Challenge 2: Communication about shared work objects. When you can't drop in to see what another user is working on, or when that user's office hours or schedule differ from yours, coordinating changes can be a challenge. With the increasing reliance on distributed and offshore development, geographically distributed teams are becoming common, and coordination becomes correspondingly harder.
An effective version control solution will help users understand work in progress and assist in the coordination of their changes. File locking is one of the most basic means of coordination. Lock comments allow communication of intent to remote users. History shows what has been changed and why. Up-to-date file status and graphical displays facilitate understanding and quick sharing of completed changes. Private branches enable parallel work for a period of time. All these benefits are simply aids that enable team members to understand each other and to work more effectively together.
How does your solution compare? Do you dread or avoid using your version control over a WAN because of performance? Do you wish it were better?
SnapshotCM delivers a WAN-optimized version control solution. Many of our customers are happily using SnapshotCM over a WAN each day. Check us out by taking advantage of our free evaluation. Go to www.truebluesoftware.com for all the details.