views:

427

answers:

9

This problem struck me as a bit odd. I'm curious how you could represent a list of prime numbers in a database. I do not know of a single datatype that would be able to acuratly and consistently store a large amount of prime numbers. My concern is that when the prime numbers are starting to contain 1000s of digits, that it might be a bit difficult to reference form the database. Is there a way to represent a large set of primes in a DB? I'm quite sure that this has topic has been approached before.

One of the issues about this that makes it difficult is that prime numbers can not be broken down into factors. If they could this problem would be much easier.

+5  A: 

This is a bit inefficient, but you could store them as strings.

Dan Williams
I don't like this solution [see the main comments], however: it does allow for Mersalt primes to be listed as well.
monksy
Check constraints can enforce that only actual numbers get stored in your table. And what will you use them for? For example - 1000 decimal digits is plenty for open crypto.
jva
+2  A: 

If you are not going to use database-side calculations with these numbers, just store them as bit sequences of their binary representation (BLOB, VARBINARY etc.)

Quassnoi
Where do you get the bit sequences from?
anon
simply read it as a base 2 number...
dionadar
From the prime numbers values, of course. Everything that goes into a computer should be a bit sequence.
Quassnoi
Read it from where? and into what? Weare talking (presumably) about 1000+ digit numbers here.
anon
@Quasino I think we have different understanding of the term "bit sequence". Or we mean the same thing. A string stored in a blob is of course a bit sequence. But then so is everything else, so the use of the term becomes pointless.
anon
`@Neil Butterworth`: well, I don't know where are the prime numbers are taken from and in which form do they go. If they will be later used for the digital processing (like cryptography), they better be stored as a bit sequence, since that's what a computer (especially cryptohgraphic application) operates with.
Quassnoi
By a "bit sequence" I mean the binary representation of the number stored as a `BLOB`. `127` would be store as a one byte `BLOB` containing this value: `01111111`. Endianness should be selected according to the nature of the software or a device these numbers will later go to.
Quassnoi
The point is that the database stores the number to be accessed later. I like this solution because its easier to parse back into the number value that you need.
monksy
Actually, parsing decimal digits from a string is just as easy, or even easier.
anon
`@Neil Butterworth`: how do you know the numbers are stored in decimal in the first place?
Quassnoi
anon
+4  A: 

You could store them as binary data. They won't be human readable straight from the database, but that shouldn't be a problem.

nickf
Agreed, I think this is probably the only way to go.
Jess
+3  A: 

Databases (depending on which) can routinely store numbers up to 38-39 digits accurately. That gets you reasonably far.

Beyond that you won't be doing arithmetic operations on them (accurately) in databases (barring arbitrary-precision modules that may exist for your particular database). But numbers can be stored as text up to several thousand digits. Beyond that you can use CLOB type fields to store millions of digits.

Also, it's worth nothing that if you're storing sequences of prime numbers and your interest is in space-compression of that sequence you could start by storing the difference between one number and the next rather than the whole number.

cletus
That makes sense. I didn't think about taking a differential approach. I like this, as that you can create relationships, and you would be able to find ranges REALLY quickly.
monksy
Well... it probably won't save *that* much space, particularly on large numbers as primes tend to get pretty spaced out. It'd be interesting to see how it does do though. Finding ranges quickly probably wouldn't be possible for the same reason you couldn't do arithmetic operations: the database just ins't capable of dealing with numbers that large.
cletus
You're right... I thought about that a few minutes after agreeing with the statement.
monksy
A: 

I think you're best off using a BLOB. How the data is stored in your BLOB depends on your intended use of the numbers. If you want to use them in calculations I think you'll need to create a class or type to store the values as some variety of ordered binary value and allow them to be treated as numbers, etc. If you just need to display them then storing them as a sequence of characters would be sufficient, and would eliminate the need to convert your calculatable values to something displayable, which can be very time consuming for large values.

Share and enjoy.

Bob Jarvis
A: 

Probably not brilliant, but what if you stored them in some recursive data structure. You could store it as an int, it's exponent, and a reference to the lower bit numbers.

Like the string idea, it probably wouldn't be very good for memory considerations. And query time would be increased due to the recursive nature of the query.

L. Moser
+1  A: 

It all depends on what kinds of operations you want to do with the numbers. If just store and lookup, then just use strings and use a check constraint / domain datatype to enforce that they are numbers. If you want more control, then PostgreSQL will let you define custom datatypes and functions. You can for instance interface with the GMP library to have correct ordering and arithmetic for arbitrary precision integers. Using such a library will even let you implement a check constraint that uses the probabilistic primality test to check if the numbers really are prime.

The real question is actually whether a relational database is the correct tool for the job.

Ants Aasma
+2  A: 

Hi

Here's my 2 cents worth. If you want to store them as numbers in a database then you'll be constrained by the maximum size of integer that your database can handle. You'd probably want a 2 column table, with the prime number in one column and it's sequence number in the other. Then you'd want some indexes to make finding the stored values quick.

But you don't really want to do that do you, you want to store humongous (sp?) primes way beyond any integer datatype you've even though of yet. And you say that you are averse to strings so it's binary data for you. (It would be for me too.) Yes, you could store them in a BLOB in a database but what sort of facilities will the DBMS offer you for finding the n-th prime or checking the primeness of a candidate integer ?

How to design a suitable file structure ? This is the best I could come up with after about 5 minutes thinking:

1 Set a counter to 2.
2 Write the two-bits which represent the first prime number.
3 Write them again, to mark the end of the section containing the 2-bit primes.
4 Set the counter to counter+1
5 Write the 3-bit primes in order. ( I think there are two: 5 and 7)
6 Write the last of the 3-bit primes again to mark the end of the section containing the 3-bit primes.
7 Go back to 4 and carry on mutatis mutandis.

The point about writing the last n-bit prime twice is to provide you with a means to identify the end of the part of the file with n-bit primes in it when you come to read the file.

As you write the file, you'll probably also want to make note of the offsets into the files at various points, perhaps the start of each section containing n-bit primes.

I think this would work, and it would handle primes up to 2^(the largest unsigned integer you can represent). I guess it would be easy enough to find code for translating a 325467-bit (say) value into a big integer.

Sure, you could store this file as a BLOB but I'm not sure why you'd bother.

Regards

Mark

High Performance Mark
The storage would be in the DB for lookup and sharing, not for insuring they are prime... its an assumption going in. I was looking at applying the sieve of erothsophes [sp] on a grid through the use of a database [it handles all of my condictions for sharing]
monksy
That would be Eratosthenes
High Performance Mark
Yep it is......
monksy
+7  A: 

If you really want to store primes as numbers and one of questions, stopping you is "prime numbers can not be broken down into factors", there are another thing: store it in list of modulus of any number ordered by sequence.

Small example:

2831781 == 2*100^3 + 83*100^2 + 17*100^1 + 81*100^0

List is:

81, 17, 83, 2

In real application is useful to split by modulus of 2^32 (32-bits integers), specially if prime numbers in processing application stored as byte arrays.

Storage in DB:

create table PRIMES
(
  PRIME_ID         NUMBER not null,
  PART_ORDER       NUMBER(20) not null,
  PRIME_PART_VALUE NUMBER not null
);

alter table PRIMES 
add constraint PRIMES_PK primary key (PRIME_ID, PART_ORDER) using index;

insert for example above (1647 is for example only):

insert into primes(PRIME_ID, PART_ORDER, PRIME_PART_VALUE) values (1647, 0, 81);
insert into primes(PRIME_ID, PART_ORDER, PRIME_PART_VALUE) values (1647, 1, 17);
insert into primes(PRIME_ID, PART_ORDER, PRIME_PART_VALUE) values (1647, 2, 83);
insert into primes(PRIME_ID, PART_ORDER, PRIME_PART_VALUE) values (1647, 3, 82);

prime_id value can be assigned from oracle sequence ...

create sequence seq_primes start with 1 increment by 1;

Get ID of next prime number to insert:

select seq_primes.nextval from dual;

select prime number content with specified id:

select PART_ORDER, PRIME_PART_VALUE 
from primes where prime_id = 1647 
order by part_order
ThinkJet
Its not the cleanest solution there is, but I really like it.
monksy