Hi again,
I am currently analyzing a Wikipedia dump file; I am extracting a bunch of data from it using Python and persisting it into a PostgreSQL db. I am always trying to make things go faster, because this file is huge (18GB). To interface with PostgreSQL I am using psycopg2, but this module seems to mimic many other such DBAPIs...
Hi people,
Does anyone know of a situation where a PostgreSQL HASH index should be used instead of a B-TREE? It seems to me that these things are a trap. They take way more time to CREATE or maintain than a B-TREE (at least 10 times more), and they also take more space (for one of my table.columns, a B-TREE takes up 240 MB, while a HASH would...
the scenario:
Two database servers (each has a database named, for example, testdb):
MS SQL Server 2000
PostgreSQL 8.3
I need to synchronize these two testdbs; the direction is from SQL Server to PostgreSQL.
The structure of testdb on SQL Server may change occasionally.
I only need the tables and data of testdb synchronized, exclude ind...
Hi,
I can see in the PostgreSQL logs that certain simple queries (no joins, and using only match conditions that use indexes) take anywhere from 1 to 3 seconds to execute. I only log queries that take longer than a second, so similar queries that execute in under a second don't get reported.
When I try the same quer...
I'm currently developing a pretty big project, and I'm considering open source databases to use.
One of the main factors I consider is support, and it seems like there is not much support/community for PostgreSQL compared to MySQL, even though MySQL seems like a much less fully featured product than PostgreSQL.
Should this fact shape m...
I want to be able to pass an "array" of values to my stored procedure, instead of calling "Add value" procedure serially.
Can anyone suggest a way to do it? Am I missing something here?
Edit: I will be using PostgreSQL / MySQL, I haven't decided yet.
...
How do I put my whole PostgreSQL database into RAM for faster access? I have 8 GB of memory and I want to dedicate 2 GB to the DB. I have read about the shared buffers setting, but it just caches the most accessed fragment of the database. I need a solution where the whole DB is put into RAM and any read would happen from the R...
I have a database populated with 1 million objects. Each object has a 'tags' field - set of integers.
For example:
object1: tags(1,3,4)
object2: tags(2)
object3: tags(3,4)
object4: tags(5)
and so on.
The query parameter is a set of integers; let's try q(3,4,5)
object1 does not match ('1' not in '3,4,5')
object2 does not match ('2' not i...
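One way to read the matching rule in the example above, assuming (from the first two cases only) that an object matches when every one of its tags appears in the query set, is a subset test. The sketch below is plain Python over in-memory data; the names `objects` and `matches` are hypothetical, and it says nothing about how to store or index the tags in a database.

```python
# Hypothetical in-memory stand-in for the 'tags' field of each object.
objects = {
    "object1": {1, 3, 4},
    "object2": {2},
    "object3": {3, 4},
    "object4": {5},
}

def matches(tags, query):
    """Return True when every tag of the object is contained in the query set."""
    return tags <= query

query = {3, 4, 5}
matching = sorted(name for name, tags in objects.items() if matches(tags, query))
print(matching)
```

Under this assumed rule, object3 and object4 are the only matches for q(3,4,5).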
Hi again,
I just finished transferring as much link-structure data concerning Wikipedia (English) as I could. Basically, I downloaded a bunch of SQL dumps from Wikipedia's latest dump repository. Since I am using PostgreSQL instead of MySQL, I decided to load all these dumps into my db using pipeline shell commands.
Anyway, one of thes...
Hi again,
I am trying to execute this SQL command:
SELECT page.page_namespace, pagelinks.pl_namespace, COUNT(*)
FROM page, pagelinks
WHERE
(page.page_namespace <=3 OR page.page_namespace = 12
OR page.page_namespace = 13
)
AND
(pagelinks.pl_namespace <=3 OR pagelinks.pl_namespace ...
In the output of the EXPLAIN command I found two terms, 'Seq Scan' and 'Bitmap Heap Scan'. Can somebody tell me the difference between these two types of scan? (I am using PostgreSQL.)
...
Hi again,
I have been playing around with the postgresql.conf file for a couple days now. I was wondering what variables you guys like customizing and why?
Here is a sample of the file:
# - TCP Keepalives -
# see "man 7 tcp" for details
#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tc...
What is the difference between these two APIs?
Which one is faster and more reliable with the Python DB API?
Upd:
I see two PostgreSQL drivers for Django. The first one is psycopg2.
What is the second one? PyGreSQL?
...
I have a table representing values of source file metrics across project revisions, like the following:
Revision  FileA  FileB  FileC  FileD  FileE  ...
1         45     3      12     123    124
2         45     3      12     123    124
3         45     3      12     123    124
4         48     3      12     123    124
5         48     3      12     123    ...
I am using Python, the Django framework, and PostgreSQL. I am using the full text search of PostgreSQL (8.3) and am overriding the default filter function of the Manager class in Django. Using the link http://barryp.org/blog/entries/postgresql-full-text-search-django/ I was able to configure my full text search, as mentioned in t...
I am receiving a "value out of range: underflow" error from PostgreSQL in a query that uses the EXP(x) function. What values of x trigger this? How do I prevent or detect it?
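As a rough illustration of where such a limit comes from, the sketch below assumes x is a double precision (IEEE 754) value. It computes the exponent below which exp(x) no longer fits in a double and guards against it on the client side. `safe_exp` and `UNDERFLOW_LIMIT` are hypothetical names, and the exact point at which the server raises the error is an assumption not confirmed here.

```python
import math
import sys

# Smallest positive (subnormal) double: min normal * machine epsilon = 2**-1074.
SMALLEST_SUBNORMAL = sys.float_info.min * sys.float_info.epsilon

# Below roughly this exponent, exp(x) is too small to represent as a double.
UNDERFLOW_LIMIT = math.log(SMALLEST_SUBNORMAL)  # about -744.44

def safe_exp(x):
    """Return exp(x), treating results below the double range as 0.0."""
    if x < UNDERFLOW_LIMIT:
        return 0.0
    return math.exp(x)

print(UNDERFLOW_LIMIT)
print(safe_exp(-1000.0))
```

The same kind of guard can be written in the query itself with a CASE expression around EXP(x), if returning 0 for very negative inputs is acceptable.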
...
Hi again,
I am currently working with a larger Wikipedia-dump derived PostgreSQL database; it contains about 40 GB of data. The database is running on an HP Proliant ML370 G5 server with Suse Linux Enterprise Server 10; I am querying it from my laptop over a private network managed by a simple D-Link router. I assigned static DHCP (private...
In Oracle's PL/SQL I can create a session-based global variable with the package definition. With PostgreSQL's PL/pgSQL, it doesn't seem possible since there are no packages, only independent procedures and functions.
Here is the syntax for PL/SQL to declare g_spool_key as a global...
CREATE OR REPLACE PACKAGE tox IS
g_spool_key ...
I have a big list of hexadecimal numbers I'd like to insert into a PostgreSQL table. I tried something like this:
INSERT INTO foo (i)
VALUES (0x1234);
...but that didn't work. Is this possible?
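One possible workaround, sketched in Python: convert each hexadecimal string to an integer on the client and pass it as an ordinary query parameter. `hex_values` is made-up sample data, `hex_to_int` is a hypothetical helper, and the commented-out `executemany` call assumes an open psycopg2 cursor named `cur` plus the table foo (i) from the question. PostgreSQL also accepts bit-string literals that can be cast, such as x'1234'::int.

```python
# Made-up sample input; the real list would come from your own data source.
hex_values = ["0x1234", "ff", "0xBEEF"]

def hex_to_int(text):
    """Parse a hexadecimal string, with or without a leading '0x'."""
    return int(text, 16)

# One single-element tuple per row, the shape DB-API executemany() expects.
rows = [(hex_to_int(value),) for value in hex_values]
print(rows)

# With an open psycopg2 cursor `cur` (assumed, not created here):
# cur.executemany("INSERT INTO foo (i) VALUES (%s)", rows)
```

Values above 2147483647 would not fit an integer column and would need bigint instead.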
...
I've got a production DB with, say, ten million rows. I'd like to extract the 10,000 or so rows from the past hour off of production and copy them to my local box. How do I do that?
Let's say the query is:
SELECT * FROM mytable WHERE date > '2009-01-05 12:00:00';
How do I take the output, export it to some sort of dump file, and then...