The world's first illegal prime number?

Side note: As my girlfriend says, my wording might be quite good, but my spelling is attrocious. You've been warned - and note that my typing is inaccurate too! So far I've been approached from about 10 different directions for more information, and it only makes sense to try to get some sort of "press release" together. This is it, I'm afraid.

Or not!

There is nothing particularly innovative about the principles behind what I've done, and to claim otherwise would be wrong. I simply tied known things together. Pretty much anyone could have done it; I just happened to try it first.

The story...

The "prehistory"

As far as I'm concerned, the history lies with the 2600 case and the Jon Johanson case, information on which can be found on the The Electronic Frontier Foundation's page on DeCSS cases. All roads lead to Rome, and all HREFs lead to Dave Touretzky's DeCSS web page for far more background information.

The plan

My original plan was to make sure that the "bits", the raw data, of the DeCSS code was archived somewhere where it should be beyond the reach of the law. Somewhere where the number would be allowed to be printed because it had some property that made it publishable, independent of whether it was "illegal" or not. I needed to find a presentation of the data such that it had an intrinsicly archivable quality.

I have been a large prime prover (parse that either way!) for quite a while now (I refer you to, for example, the "Near Repdigit" record on Professor Caldwell's Prime Pages), and I tried to think how I could come up with a number archivable on those pages. The Prime Pages is the world's definitive resource for large prime number archiving and information; Professor Caldwell, who manages it, has put an awfully large amount of effort into deveoloping those pages, and into administering the archival of large primes, and has put a lot of thought into the criteria for their archivability.

The only thing I could think of was to try to find one of the largest ever numbers tested with an algorithm called ECPP. It's an algorithm usable on numbers with no particular form (N±1 is not easily factorable to any great extent), and is based on Elliptic Curves (which is quite a nice link to cryptography, and the fact the CSS was supposed to be a "cryptographic" algorithm).

The wrong prime!

It is important to note that the 1401-digit prime as listed on Dave Touretzky's page isn't large enough to be archivable. In fact mathematically there's little particularly interesting about the prime. However it has become "interesting" enough to be one of the curios on the Prime Curios site, which is part of the Prime Pages site, and is for the more "fun" aspects of prime numbers.

So my real archivable discovery is that of a 1905-digit prime, which shares the same initial bytes as the gzip file which forms the start of the 1401-digit prime) but has lots more 0s appended to it, and then an extra byte (99, I seem to remember) to make it prime. This is the 10th largest number ever proven using ECPP. It will therefore remain archived purely because it is the top 20. The reason it is archived is not that it is supposedly illegal, but its size, nothing more. This can be verified by doing a search of the whole database of large primes with the text search key "ECPP". (The server seems to be down at the the time of writing (the slashdot effect?), but if you follow the "Prime Lists" and the "Search whole database" links you'll get there in the end!). The above search will show you the value and sizes of all the ECPP-proven primes.

I had some trouble with the proving software - it's a hard job proving these things sometimes - and so "hedged my bets" and actually have 2 huge primes (the other is 1920 digits or more, and isn't yet submitted to the above database).

OK, with that out of the way, the questions that still remain are "how did I find it?" and "why did I bother doing this at all?".

Finding the Prime

The first question above is an easy one. I used a couple of wonderful pieces of software written by other people!

Firstly, I thought of the expression that I wished the prime to satisfy.
For the 1401-digit prime this was

~~bignum representing the .gz file~~ * 256 + X
where X is an odd number between 1 and 255. i.e I wanted to append just one byte to the .gz file. However, this search failed, so I changed the target to be 2 extra bytes:
~~bignum representing the .gz file~~ * 65536 + X, for X in 1 to 65535

For the archivable prime, I found out how many extra zeroes would be needed in order to pad the .gz file to be big enough to be archivable, and guessed it was about 156. So my search was
~~bignum representing the .gz file~~ *256^(156+Y) + X   
for Y in 0 to 300, and from X in 1 to 255

OK, with those in mind I created a configuration file for an excellent program called OpenPFGW (or just PFGW) which permits one to search for Probable Primes, or PrPs of any form. (It also lets you prove primes which have special properties, but my primes didn't have these properties). Information about OpenPFGW can be got from (maybe via a redirect) Chris Nash's Primeform Web page or via the links on the Prime Pages. (The program used to be called PrimeForm but changed its name when it went open source.)

What PFGW does is first trial factor the numbers (it tries up to a few million or maybe a billion, just to quickly reject obvious non-primes). After that it performs a Probable Primality test, which only takes a few seconds. As the search space I was using was much wider than it needed to be, I found many probable primes. My chances were roughly 1:1600 for finding a small one (1400/1401 digits) and 1:2100 for the larger prime. (It is proportional to ln(x) where x is the number in question, i.e. it is proportional to the number of digits/bits in the number.)

Now to a mathematician probable primes just aren't good enough. They are certainly not archivable on the Prime Pages until a deterministic proof is obtained. So the third part was a deterministic proof, using ECPP. The running time of ECPP deterministic proofs is O(N^6), where N is the number of bits in the probable prime, and so the proofs for very large primes (over 2000 digits) are very long. Remember that the probabilistic test took only 1 second, well the ECPP test would take many days! The software I used was Titanix, written by Marcel Martin, which is about as fast as you can get for ECPP proving. More information is available at Marcel Martin's web site.

If you have any further questions about the mathematics, please get back to me. The username "fatphil" and the domain name "yahoo.co.uk" normally work to this end.

Why?

In simple terms I believe that source code, which is pure information to my way of thinking, cannot intrinsically be illegal. So the banning of the Copyleft T-shirts, for example, I find to be absolutely bizarre. I therefore just wanted to have the data replicated in another form, but specifically a form which could not sensibly be considered illegal. The larger number(s) is/are record (top20) ECPP primes. You can't sensibly remove them from the archive, because the numbers are still prime and still proved using ECPP whether the MPAA likes what they stand for or not.

I am very grateful to Professor Caldwell for permitting me to put his site at risk by my action. If things do turn sour then of course my stubborn attitude must be ignored, and the numbers should be replaced with some place-holder indicating there used to be a prime number there. There is enough publicity surrounding these numbers for it to be impossible to remove every trace of them from the world. Even if they do, they'll still be primes; that's an intrinsic property of the numbers which cannot be removed.

Phil Carmody,
Kivenlahti, nr. Helsinki 2001/03/19

Links/Greets/Gibber

The Sequel

Yes, there's more, an executable prime which is allegedly illegal. Chapter Two is here.

This page © 2001 Phil Carmody