I've been a MySQL user, never tried Postgres.
But MySQL has a bottleneck in full-text search when the data set is huge.
You know there is only one answer to this, and that is: you're just going to have to try it with your own dataset.
Personally, I'd be surprised if there is a significant difference; I'd suspect the performance of this sort of thing is limited by I/O bandwidth.
While it's unlikely that you'll find a significant benefit in Postgres over MySQL, it can't hurt to test. However, your main issue, full-text search, is better solved with something like Sphinx or Lucene. I have used Sphinx at work and found it vastly superior to MySQL's built-in full-text search. It is also quite easy to integrate into existing systems.
See also http://stackoverflow.com/questions/942810/php-mysql-fulltext-search-lucene-sphinx-or, my original question (including references) about the different full-text search options.
As has been mentioned before, it differs a lot between datasets, workloads, and how you set things up.
For example, GIN-based full-text indexes are very fast to search but very slow to insert/update, while GiST-based indexes are slower to search (though still pretty fast) but much faster to insert/update.
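To make the trade-off concrete, here is a minimal sketch of how you would set up each index type in PostgreSQL. The table and column names are illustrative, not from the original posts:

```sql
-- Hypothetical table; a tsvector column holds the preprocessed document.
CREATE TABLE posts (
    id   serial PRIMARY KEY,
    body text,
    tsv  tsvector
);

-- GIN: fast lookups, slower inserts/updates.
CREATE INDEX posts_tsv_gin ON posts USING gin (tsv);

-- GiST: faster inserts/updates, somewhat slower lookups.
-- CREATE INDEX posts_tsv_gist ON posts USING gist (tsv);

-- Either index serves the same query form:
SELECT id FROM posts
WHERE tsv @@ to_tsquery('english', 'database & search');
```

You would normally pick one of the two indexes per column depending on whether your workload is read-heavy (GIN) or write-heavy (GiST).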
If you don't need database functionality, I would also probably look at Sphinx or Lucene for raw performance. The biggest advantage of the integrated full-text search in PostgreSQL is that it is just that: integrated. It has transaction support, recovery support, snapshot support; all the things that are of vital importance to a database. But if you don't need the db functionality, a solution that drops those requirements is likely to be faster.
I ran benchmarks a few years ago on large datasets and found the following:
MySQL fulltext is pretty slow. Another drawback is that it forces MyISAM on you, which brings a lot of problems. Index updates are also quite slow once the index reaches a certain size: when you insert a new row, a substantial part of the index is regenerated; sometimes a few hundred megabytes of index are rewritten just to insert a forum post. In other words, it's OK for a small forum with a few megabytes of posts, but there is a reason Wikipedia doesn't use it...
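For reference, this is the setup being criticized. On the MySQL versions discussed here, a FULLTEXT index requires the MyISAM engine; the table and query are illustrative:

```sql
-- FULLTEXT forces MyISAM on the MySQL versions discussed in this thread.
CREATE TABLE posts (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    body TEXT,
    FULLTEXT (body)
) ENGINE=MyISAM;

SELECT id FROM posts
WHERE MATCH (body) AGAINST ('database search' IN NATURAL LANGUAGE MODE);
```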
Postgres fulltext is about 10-100x faster than MySQL fulltext, and a lot more powerful. GiST is fast on inserts/updates, and there are no problems with locks; in other words, it's a totally decent solution.
However, searches get slow when the data set is larger than RAM: because of MVCC, Postgres needs to check the visibility of rows by hitting the heap. (Note this may change in a future version.) If your query returns 10 rows, no problem. But if you want to SELECT ... WHERE (fulltext query) ORDER BY date LIMIT 10 and the fulltext query matches 10,000 rows, it can get pretty slow. Still faster than MySQL, but not the performance you'd want.
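The problematic pattern looks something like this (names are illustrative, assuming a `tsv` tsvector column). Every matching row must be heap-checked for visibility and sorted before the LIMIT applies:

```sql
-- If the fulltext condition matches 10,000 rows, all of them are
-- visibility-checked and sorted just to return the top 10.
SELECT id, date
FROM posts
WHERE tsv @@ to_tsquery('english', 'common & words')
ORDER BY date DESC
LIMIT 10;
```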
Xapian does not have to conform to the same constraints as a database, so it can make a lot more optimizations. For instance, it uses a single-writer, multiple-reader concurrency model, so you'll need some sort of update queue to update your index in the background. It also has its own on-disk format. The result is that it is incredibly fast, even when the dataset is much larger than RAM, and especially on complicated queries matching lots of rows, with sorts, returning only the most relevant results.
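The update-queue idea above can be sketched in a few lines of Python. This is not Xapian code; a plain dict stands in for the index, and all names are illustrative. The point is the structure: any thread may enqueue a write, but exactly one background thread ever touches the index:

```python
import queue
import threading

class SingleWriterIndex:
    """Single-writer, multiple-reader pattern: writes go through a queue."""

    def __init__(self):
        self.index = {}               # doc_id -> text; stand-in for the real index
        self.updates = queue.Queue()  # pending writes from any thread
        self.writer = threading.Thread(target=self._drain, daemon=True)
        self.writer.start()

    def submit(self, doc_id, text):
        # Called from any thread; never touches the index directly.
        self.updates.put((doc_id, text))

    def _drain(self):
        # The only thread that writes, mirroring the single-writer model.
        while True:
            item = self.updates.get()
            if item is None:          # sentinel: stop the writer
                break
            doc_id, text = item
            self.index[doc_id] = text
            self.updates.task_done()

    def close(self):
        self.updates.put(None)
        self.writer.join()

idx = SingleWriterIndex()
idx.submit(1, "postgres fulltext benchmark")
idx.submit(2, "xapian stays fast on huge datasets")
idx.updates.join()        # wait until pending writes are applied
print(len(idx.index))     # -> 2
idx.close()
```

Readers query `idx.index` freely while writes trickle in through the queue, which is exactly the shape a background indexer for Xapian tends to take.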
The index is huge too; it probably contains a lot of duplicated data. The upside is that it doesn't need to seek much to retrieve that data.
Basically, once Postgres started to hit the I/O-seek wall, MySQL was long dead, and Xapian stayed blazing fast.
But Xapian is not as nicely integrated into the database, so it is more work to use, and it is only worth it if you have a huge dataset. If that is your case, try it; it's amazing. If your dataset fits in RAM, Postgres will just work with a lot less hassle. And if you want to combine fulltext and database queries, integration becomes important.