The Fast File Transfer Recipes
Globally fast: a critical USP for the secure file sharing service FTP-Stream. But why are some links slow in the first place, and what are the techniques for accelerating file transfer? If you’ve tried sending large files to China you’ll be familiar with excruciatingly slow file transfer speeds, timeouts and failed transfers, even though both the sending and receiving parties are on fast links to the Internet. So what’s the underlying problem, and what are the available solutions?
Well, the heart of the problem is the Transmission Control Protocol (TCP). Common Internet applications such as the Web, email and FTP rely on TCP for controlling the flow of data across the local network and the Internet.
TCP – a quick look at the culprit
When data is transferred over the Internet it’s the job of IP (that’s Internet Protocol) to send packets, but without a control mechanism it’s not reliable. TCP was conceived in 1974 to control the flow of packets and is optimised for accurate delivery rather than fast delivery. TCP guarantees that all bytes are received and in the correct order. The sender keeps a record of each packet sent and expects to receive a timely acknowledgement, and this is where the problem arises.
Over long or high loss links the acknowledgement can take so long to arrive that the sender assumes the packet is lost and stops sending in order to retransmit the ‘lost’ packet. This accounts for the jerky stop-start file transfers we’ve all seen over long or flaky links.
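To put a rough number on the waiting: a TCP sender can only have one window of unacknowledged data in flight at a time, so a single connection can never go faster than the window size divided by the round-trip time. Here is a minimal Python sketch of that ceiling. The 64 KiB window and the round-trip times are illustrative assumptions, not measurements from FTP-Stream, and real connections with window scaling or packet loss will behave differently.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput, in bits per second.

    At most one window of data can be sent per round trip, so the
    ceiling is window / RTT regardless of how fast the link itself is.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


# Assumption: a fixed 64 KiB window (no window scaling).
WINDOW_BYTES = 64 * 1024

# Assumption: illustrative round-trip times, not measured values.
EXAMPLE_LINKS = [
    ("same city", 0.010),
    ("transatlantic", 0.080),
    ("intercontinental", 0.250),
]

for label, rtt in EXAMPLE_LINKS:
    mbps = max_tcp_throughput_bps(WINDOW_BYTES, rtt) / 1e6
    print(f"{label:>16}: RTT {rtt * 1000:.0f} ms, at most {mbps:.1f} Mbit/s")
```

The point is that the ceiling falls as the round trip gets longer, however fast the links at each end are.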
WAN optimization
Of course there’s stuff you can do to mitigate these effects. See this Wikipedia article for the tech details on WAN optimization, but for a hosted FTP service like FTP-Stream the options are limited as we only control one end of the link.
UDP based solutions
The User Datagram Protocol (UDP) was designed in 1980 and, unlike TCP, does not require any dialogue between the sender and receiver. The source just pumps packets at the target in a stream with no control. That’s great if you’re streaming a movie, where the overwhelming need is to maintain the flow and the odd lost packet doesn’t much matter, but for most file transfers uncontrolled UDP is clearly unsuitable.
However, products like Aspera exploit the faster transfer potential of UDP by replacing TCP with a proprietary control mechanism. It does work. The problem, however, is that it’s hardware based and you need the hardware at both ends of the link. So that’s OK for static connections between two parties, but it’s not suitable for a cloud large file transfer service like FTP-Stream, where most users will not have the specialist hardware.
The FTP-Stream Solution
As a managed file transfer solution with millions of users worldwide we need to support standard access methods such as FTP, SFTP, and secure Web, which all rely on TCP. So FTP-Stream uses two techniques: multi-path and parallelization. Our proprietary systems will break a large file transfer into small chunks and send them in parallel over SFTP for security. Although the transfer speed of each chunk is still slow, by sending many in parallel we get huge increases in speed.
Furthermore we exploit our global network and intelligent routing software to probe for faster, less congested routes. The receiving end of the link assembles the constituent parts and performs an integrity check.
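The general idea of chunking, parallel sending, reassembly and an integrity check can be sketched in a few lines of Python. This is only an illustration of the technique, not FTP-Stream's proprietary implementation. The function names, the tiny chunk size, the SHA-256 check and the `send_chunk` stand-in (which just sleeps to imitate a slow connection) are all assumptions made for the example.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data: bytes, chunk_size: int) -> list[tuple[int, bytes]]:
    """Cut data into (index, chunk) pairs so the order can be restored later."""
    return [
        (offset // chunk_size, data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def send_chunk(indexed_chunk: tuple[int, bytes]) -> tuple[int, bytes]:
    """Hypothetical stand-in for sending one chunk over its own slow connection."""
    index, chunk = indexed_chunk
    time.sleep(0.01)  # pretend each chunk crawls across a high latency link
    return index, chunk


def parallel_transfer(data: bytes, chunk_size: int, workers: int) -> bytes:
    """Send all chunks concurrently, reassemble them, then verify a checksum."""
    expected_digest = hashlib.sha256(data).hexdigest()
    chunks = split_into_chunks(data, chunk_size)

    # Each worker thread handles one chunk at a time; many run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        received = dict(pool.map(send_chunk, chunks))

    # Receiving end: put the parts back in order and check integrity.
    reassembled = b"".join(received[i] for i in range(len(chunks)))
    if hashlib.sha256(reassembled).hexdigest() != expected_digest:
        raise ValueError("integrity check failed")
    return reassembled


payload = b"a large file, in miniature " * 8
result = parallel_transfer(payload, chunk_size=16, workers=8)
print("transfer intact:", result == payload)
```

Each individual chunk is no faster than before, but with several in flight at once the waits overlap instead of adding up.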
The results are dramatic: for example, a 15 gig file transfer from New York to a destination in South-East Asia, which would take days over the public Internet, reaches its destination in just two to three hours.
If you have any comments or questions about large file transfer over high latency links, you can mail me at antony@maytech.net. We like nothing better than talking about large file sharing.
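For a sense of scale, those durations can be turned into average throughput with some quick arithmetic. The sketch below assumes a gig means 10^9 bytes and, purely for illustration, takes "days" to mean two days. Neither figure is a measurement.

```python
def average_mbps(size_bytes: float, duration_seconds: float) -> float:
    """Average throughput in megabits per second for a completed transfer."""
    return size_bytes * 8 / duration_seconds / 1e6


FILE_BYTES = 15e9  # assumption: 15 gig taken as 15 x 10^9 bytes
HOUR = 3600

print(f"two hours:  {average_mbps(FILE_BYTES, 2 * HOUR):.1f} Mbit/s")
print(f"three hours: {average_mbps(FILE_BYTES, 3 * HOUR):.1f} Mbit/s")
# Assumption: 'days' taken as two days, for comparison only.
print(f"two days:   {average_mbps(FILE_BYTES, 48 * HOUR):.2f} Mbit/s")
```

Under those assumptions the accelerated transfer averages roughly 11 to 17 Mbit/s, against well under 1 Mbit/s for the two-day case.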