In my experience, I see a lot of architecture diagrams that make extensive use of FTP as a medium for linking architectural components.

As someone who doesn't make architectural decisions but tends to look at architecture diagrams, could anyone explain what the value of using FTP is, where it's appropriate, and when transferring data as files is a good idea?

I get that there are often legacy systems that just need to work that way, although any historical insight would be interesting too.

I can see the attraction of transferring files (especially if files are what needs to be transferred) because of the simplicity and familiarity, and I wonder if the reasoning goes beyond this.

Edit: Thanks to those pointing out that SFTP is preferable; however, my question is broader than wanting a recommendation for a file transfer protocol. Sorry for the confusion.

A: 

I imagine security and disconnected networks or network segments may come into play. I've had various projects where someone needs to import data from a different system, and FTP is an easy/secure way to get the data out through the firewall. Generally you can schedule it to run automatically, and most network security guys will be fine with FTP ports being open.
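
For illustration, a minimal sketch of that kind of scheduled pull using Python's standard ftplib; the host, credentials, and paths are hypothetical placeholders, and (per the comments below) SFTP or FTPS would be the safer choice:

    # Minimal sketch of a scheduled FTP pull; host, credentials, and paths are
    # hypothetical. Note the cleartext credentials -- the weakness the comments
    # below call out.
    from ftplib import FTP

    def pull_daily_export(local_path="daily_export.csv"):
        with FTP("ftp.partner.example.com") as ftp:      # hypothetical partner host
            ftp.login(user="importer", passwd="secret")  # FTP sends these in the clear
            with open(local_path, "wb") as f:
                # RETR streams the remote file into the local copy
                ftp.retrbinary("RETR exports/daily_export.csv", f.write)

    if __name__ == "__main__":
        # Usually run by a scheduler (cron, Task Scheduler) rather than by hand.
        pull_daily_export()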

Thyamine
-1 for saying FTP is a secure way.
Dave Markle
I think most *good* network security guys would be fine with having SFTP ports being open, but not OK with FTP ports being open.
tster
Yes, I should have been more specific about that.
Thyamine
The big issue with FTP is that it uses network ports “strangely”. Other, more recent, protocols (e.g., SFTP, HTTP) pass the data and the control metadata over a single port, and so are much more friendly to network managers and their firewalls.
Donal Fellows
+7  A: 

When is it good to use FTP?

Prior to the invention of SFTP.


Addressing the edit (i.e., the broader question being asked)

It all comes down to intended usage. Look at your situation and determine:

  • What data am I moving?
  • What format is it natively generated in? (PDFs on disk, textual output from web server scripts, etc.)
  • How is the data consumed?
  • When is the data consumed? (Instantly on arrival? Scheduled batch jobs?)
  • What connectivity medium connects the data generator and data consumer?

For example:

A process generates PDF documents by writing them to a local RAID array. Another PC, connected over a local Gigabit LAN, is dedicated to printing all the PDFs generated by a multitude of servers, via a cron job scheduled to run at midnight.

Given that the data would most likely be too large to all sit in RAM on the Print Server, it makes sense to use SFTP to transfer the PDFs so they can be grabbed from disk as they are printed.

Another example:

A machine needs to grab large numbers of small files from another machine in an ad-hoc manner, parse them, and store the results in a database. In this case, using SFTP to move them from one disk to another disk, only to be immediately read back and pushed into a DB, is just silly. There is no reason the small files wouldn't fit in RAM until being parsed and pushed into the database, hence SFTP probably isn't the best solution.
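
A rough sketch of that second scenario, using the third-party paramiko library, might look like this; the host, credentials, and paths are hypothetical, and the point is simply that the small files are held in RAM rather than staged on disk:

    # Pull small files straight into memory and parse them there, instead of
    # copying them to a local disk first. Host, user, and paths are hypothetical.
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect("files.internal.example", username="ingest")  # hypothetical host/user
    sftp = client.open_sftp()

    for name in sftp.listdir("/var/exports"):            # hypothetical remote directory
        remote_file = sftp.open(f"/var/exports/{name}")
        data = remote_file.read()                         # small file: held entirely in RAM
        remote_file.close()
        rows = data.decode("utf-8").splitlines()
        # ... parse `rows` and insert the results into the database here ...

    sftp.close()
    client.close()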

Dan McGrath
+2  A: 

Some legacy systems use folders to transfer data as XML or CSV etc.; in those cases files need to be written to disk. If integrating with another system outside the network/on the internet, it makes sense to make these available on an FTP site. Newer systems may use web services or other "over the wire" technologies to reduce saving to disk. It's possible that if these files are very large, FTP may be a better solution.

In some industries, e.g. the printing industry, large PDF files are routed through various workflows where they are processed, manipulated, etc. In the printing industry, using folders (and in turn FTP) is commonplace, and these are usually referred to as "hot folders".
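
As a rough sketch of such a hand-off (directory names, host, and credentials are hypothetical), a simple poller might push newly arrived PDFs from a hot folder to the next stage over FTP:

    # Poll a hot folder and push any new PDFs to an FTP site for the next stage
    # of the workflow. Folder names, host, and credentials are hypothetical.
    import time
    from pathlib import Path
    from ftplib import FTP

    HOT_FOLDER = Path("/jobs/prepress/hot")
    SENT_FOLDER = Path("/jobs/prepress/sent")

    def push_new_jobs():
        with FTP("ftp.printshop.example") as ftp:        # hypothetical print-shop host
            ftp.login(user="workflow", passwd="secret")
            for pdf in HOT_FOLDER.glob("*.pdf"):
                with pdf.open("rb") as f:
                    ftp.storbinary(f"STOR {pdf.name}", f)
                pdf.rename(SENT_FOLDER / pdf.name)       # move out of the hot folder once sent

    if __name__ == "__main__":
        while True:
            push_new_jobs()
            time.sleep(60)   # naive polling; dedicated workflow tools watch folders more efficiently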

Mark Redman
-1 for recommending FTP instead of SFTP, and implying that FTP is a reliable way to transfer files.
Dave Markle
@Dave Markle: That's a little harsh considering the question is about why FTP is used in a general sense. In my experience, when talking about FTP, it's referred to as FTP even if SFTP is used.
Mark Redman
@Dave Markle: haha, and I didn't mention "reliable"... you did...
Mark Redman
"If you have a reliable connection" != "FTP is a reliable way to send files".
Dave Markle
+1  A: 

Far better to use scp or sftp with host keys, as ftp is insecure (it passes the password and file contents in the clear over the network).

A: 

FTP is a simple, cross-platform way to transfer files if you have a reliable connection and need absolutely no security whatsoever (don't be fooled by it asking you about passwords -- there's no real security there).

A lot of times, people actually need security, but they make the mistake of using FTP because they just believe that's what's done. The better way to go about it is usually to use SFTP (I like OpenSSH's implementation), or pass data using a secure web service.

Of course, a proper implementation of SFTP means that implementers will have to properly generate, store, and exchange their keys, and understand how trust works. Oftentimes this is too much effort for them, so people will tend to just go the easy route and use FTP. Kind of sad if you ask me.
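
For a sense of what "doing it properly" involves, here is a sketch using the third-party paramiko library; the host, key paths, and file names are hypothetical, and the point is that trust is established through stored, verified keys rather than a password sent over a cleartext channel:

    # SFTP download with explicit host-key verification. Host, key paths, and
    # file names are hypothetical.
    import paramiko

    client = paramiko.SSHClient()
    # Only trust hosts whose keys we have already recorded -- no blind auto-accept.
    client.load_host_keys("/home/transfer/.ssh/known_hosts")
    client.set_missing_host_key_policy(paramiko.RejectPolicy())
    client.connect(
        "sftp.partner.example",                          # hypothetical host
        username="transfer",
        key_filename="/home/transfer/.ssh/id_ed25519",   # our private key, exchanged out of band
    )
    sftp = client.open_sftp()
    sftp.get("outbox/report.csv", "report.csv")          # hypothetical remote/local paths
    sftp.close()
    client.close()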

Dave Markle
+2  A: 

If you need to send a physical letter to the most backwater locale, it is hard to beat a 2000 year old postal service. If you need to send a file to a place of backwater capabilities, it is hard to beat a 40 year old Postel service.

msw
+1 for the pun as well as for the "universal compatibility" point.
mpez0
I can't in good conscience upvote a pun :)
Dave Markle
I know, I'd downvote myself if I could. :)
msw
+2  A: 

If security doesn't matter, then FTP can be useful.

However, given modern options, I would probably never use it, opting instead for SFTP/SCP/rsync or HTTP (possibly with WebDAV). For one thing, those protocols all have options for better security (HTTP at least via SSL). Also, they are simpler protocols. FTP has the nasty gotcha that the actual data is transferred over a separate connection from the control commands, which makes firewalling more difficult. Also, in non-passive mode, that connection is opened from server to client, making firewalling nearly a nightmare. If there are legacy interop needs, it might be useful, but HTTP client programs and libraries are readily available, so I'd just use that these days.
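
To illustrate the passive/active distinction with Python's standard ftplib (the host is hypothetical):

    from ftplib import FTP

    with FTP("ftp.example.org") as ftp:   # hypothetical host
        ftp.login()                       # anonymous login
        ftp.set_pasv(True)    # passive: the client opens the data connection -- firewall-friendly
        print(ftp.nlst())     # the directory listing travels over that separate data connection
        ftp.set_pasv(False)   # active: the server connects back to the client -- often blocked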

Michael E
A: 

File-based communication (like via FTP, SFTP, SCP...) is good for

  • transferring large amounts of data
  • batch operation scenarios
  • asynchronous communication

There is nothing wrong with using files. It is a well-understood, mature technology that is easy to apply, easy to monitor, and easy to debug.
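
As a sketch of why this style is easy to monitor and debug (directory names and the parsing step are hypothetical placeholders), a typical batch consumer just sweeps an inbox and files each item under "processed" or "failed":

    # Batch consumer for file-based hand-off: every input ends up in either
    # "processed" or "failed", so the state of the exchange is visible on disk.
    from pathlib import Path

    INBOX = Path("/data/exchange/inbox")
    PROCESSED = Path("/data/exchange/processed")
    FAILED = Path("/data/exchange/failed")

    def run_batch():
        for path in sorted(INBOX.glob("*.csv")):
            try:
                rows = path.read_text(encoding="utf-8").splitlines()
                # ... hand `rows` to the downstream system here ...
                path.rename(PROCESSED / path.name)
            except Exception:
                # The failing input stays on disk, ready to be inspected and replayed.
                path.rename(FAILED / path.name)

    if __name__ == "__main__":
        run_batch()   # typically invoked by a scheduler in batch scenarios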

Bernd