There's a long boring history of some old stuff I did for the original Ballinger/Keller Sierpinski project here.
After a turbulent change-over, about which I've already made my views clear, the project is now being attacked by a massive distributed computing project called Seventeen or Bust.
A sieve which, until I spilt the beans about my algorithms, benchmarked faster on my machines than Paul Jobling's SoBSieve! Paul's taken the lead again though, so if you are running Windoze, you'll probably want his SoBSieve 1.22.
Choose your architecture from:
Note - the *BSD ones were built using gcc 2.95 and so are probably less optimised than the Linux and Windows versions.
The screen output of the above is merely informational; the actual factors found are written to a file whose contents (apart from CRLF issues) should be identical to SoBSieve's, and should be pastable straight into the SoB sieve submission form.
The Sierpinski project is _different_ from normal prime finding.
For normal prime finding you sieve until you are rejecting composites no faster than a PRP test would. The Sierpinski project is different for three reasons:
(1) can be immediately resolved by considering the speed of the fastest PRP test, i.e. the smallest candidate remaining. (3) simply means that you have to be prepared to take an average. (2) is the trickier one to evaluate.
If the Sierpinski primes were denser, then you would predict that you'll only need to test a fraction of the range, and so you would scale down the sieving effectiveness by the fraction of the range you expect to test. Sierpinski candidates are sparse, however, so for most of the k values you do expect to test the whole range. That said, I believe that about half of the k values are expected to crack before the n=20M range. That means that half of them, "on average" (which means nothing), are expected to crack at 10M, and therefore for half of the k values sieving is expected to be only half as useful.
With the above in mind, on the whole I'd say don't sieve much beyond the point where the sieve's removal rate drops to 50% of the rate at which PRP testing removes candidates. Any machine that can't PRP test effectively, such as one with a slow cache/memory architecture, should certainly be sieving.
Predicting when that rate cutoff will be reached will require some data gathering. However, assuming the remaining numbers now have a flat distribution of prime factors (fair once they've been sieved beyond a certain point), the number of candidates remaining scales inversely with the log of the prime sieved to; extending the sieve from p1 to p2 leaves about a fraction ln(p1)/ln(p2) of them.
So if there are 1000 candidates after sieving to 4G (2^32, e^22), then you'd expect:
p | ln(p) | candidates (c) | rate (c removed per G) |
---|---|---|---|
4G | 22.2 | 1000 | - |
10G | 23.0 | 963 | 6.12 |
20G | 23.7 | 935 | 2.82 |
50G | 24.6 | 900 | 1.16 |
100G | 25.3 | 875 | 0.49 |
200G | 26.0 | 852 | 0.233 |
500G | 26.9 | 823 | 0.096 |
1T | 27.6 | 802 | 0.041 |
2T | 28.3 | 783 | 0.0196 |
5T | 29.2 | 759 | 0.0082 |
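If you want to reproduce or extend these figures, here's a minimal Python sketch of the model described above (candidates remaining scaling as 1/ln(p)). The function names are my own, and the output only approximately matches the table because of rounding and the exact choice of interval endpoints.

```python
# A minimal sketch of the extrapolation model above, assuming the number of
# candidates remaining scales as 1/ln(p).  The "rate" is candidates removed
# per further G (1e9) of sieving depth.  Names are mine, not any tool's.
from math import log

def remaining(c0, p0, p):
    """Expected candidates left after extending the sieve from depth p0 to p."""
    return c0 * log(p0) / log(p)

def removal_rate(c0, p0, p1, p2):
    """Candidates removed per G of sieving over the interval [p1, p2]."""
    return (remaining(c0, p0, p1) - remaining(c0, p0, p2)) / ((p2 - p1) / 1e9)

if __name__ == "__main__":
    c0, p0 = 1000, 2.0**32      # initial conditions of the first table
    depths = [p0, 10e9, 20e9, 50e9, 100e9, 200e9, 500e9, 1e12, 2e12, 5e12]
    for p1, p2 in zip(depths, depths[1:]):
        print(f"p={p2:8.0e}  ln(p)={log(p2):4.1f}  "
              f"c={remaining(c0, p0, p2):6.0f}  "
              f"rate={removal_rate(c0, p0, p1, p2):8.3f}/G")
```

Changing `c0` and `p0` to 825000 and 5.7e9 roughly reproduces the second table below.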
Hmmm, more of the same, but with the initial conditions matching "smh"'s SoB information:
p | ln(p) | candidates (c) | rate (c removed per G) |
---|---|---|---|
5.7G | 22.5 | 825000 | - |
10G | 23.0 | 804860 | 4684 |
20G | 23.7 | 781340 | 2352 |
50G | 24.6 | 752280 | 969 |
100G | 25.3 | 731690 | 412 |
200G | 26.0 | 712200 | 195 |
500G | 26.9 | 687970 | 81 |
1T | 27.6 | 670720 | 34 |
2T | 28.3 | 654300 | 16 |
5T | 29.2 | 633800 | 7 |
10T | 29.9 | 619000 | 2.9 |
20T | 30.6 | 605000 | 1.4 |
50T | 31.5 | 588000 | 0.6 |
100T | 32.2 | 575000 | 0.25 |
200T | 32.9 | 563000 | 0.12 |
Someone said that it takes 1 day to do a PRP test, and he can sieve 6G per day. Therefore the 'assume all sieving is useful' cutoff would be rate=1/6. Assuming only 3/4 of sieving is useful (half of half will be unneeded), that becomes rate=2/9. i.e. the target for P is initially 100T given the above data. I'd like to see a real datapoint above 5.7G to help keep the momentous extrapolation on track.
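To make that break-even arithmetic explicit, here's a rough sketch using the figures quoted above (1 day per PRP test, 6G sieved per day); the function and parameter names are my own illustration.

```python
# Break-even sieving rate: sieving pays off while
#   useful_fraction * (candidates removed per day) > (PRP tests done per day).
# With 1-day PRP tests and 6G sieved per day, this gives the 1/6 and 2/9
# cutoffs quoted above.
def cutoff_rate(prp_days=1.0, sieve_g_per_day=6.0, useful_fraction=1.0):
    """Removal rate (candidates per G) below which sieving loses to PRP testing."""
    return 1.0 / (prp_days * sieve_g_per_day * useful_fraction)

print(cutoff_rate())                      # all sieving useful -> 0.1667 = 1/6
print(cutoff_rate(useful_fraction=0.75))  # 3/4 useful         -> 0.2222 = 2/9
```

Reading 2/9 ≈ 0.22 against the rate column of the second table is what puts that first target at about 100T.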
Now as the candidates being tested grow, the time taken to test them grows with the square of the exponent. Therefore by the time you're at n=6M, you're looking at nearly 1/10th of the speed, and the sieving depth should by then be about 500T. (A factor of 10 change in sieving depth gives more than a factor of 10 decrease in sieving efficiency, and only 13/16 of the sieving effort is relevant any more, because 3-6M have been tested.)
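The same arithmetic with this paragraph's adjustments is sketched below, taking the "nearly 1/10th of the speed" remark as roughly 10-day PRP tests and 13/16 of the removals still being relevant (my reading of the figures above, not anything more precise).

```python
# Same break-even calculation once PRP tests are ~10x slower (test time grows
# with the square of the exponent) and only 13/16 of the sieved range remains
# untested.  A sketch under the paragraph's stated assumptions.
def cutoff_rate(prp_days, sieve_g_per_day=6.0, useful_fraction=1.0):
    return 1.0 / (prp_days * sieve_g_per_day * useful_fraction)

print(cutoff_rate(prp_days=10.0, useful_fraction=13.0 / 16.0))  # ~0.021 per G
```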
Note well - this means that while PRP-ing is taking place, sieving should also take place, in parallel. It will become less useful as time goes on, but at that stage, the project ought to be considering getting 20-100M sieved on whatever k candidates remain.
I've not worked out how much of an issue the different densities of the different k values make. I'm sure it's not huge, but it could skew things noticeably. You see, the k values with the fewest candidates are the ones most likely to be tested all the way to 20M, as their density is so sparse. Similarly, the ones with the most candidates are the ones most likely not to need testing the whole way, as some should be expected to crack at 5M or 10M, or wherever. That means that the majority of the candidates we're removing are more likely to be ones that we wouldn't have tested anyway. That decreases sieving effectiveness, maybe by a factor of up to 2 or even more (I've not compared the real densities, but I'm guessing there's a spread over several binary orders of magnitude).
Wild stab in the dark:
p=50T by the time you're testing n=3000000
p=300T by the time you're testing n=6000000
These figures aren't worth trusting with any certainty, as they're wild extrapolations. By the time we've sieved to 1T or 10T, it should be _much_ clearer where the real target is, and I'll re-do the predictions then. I'll also see if I can get Jack Brennen and/or Yves Gallot to refine the predictions; their insight into this is undoubtedly greater than anyone else's.
Any problems, mail me:
[email address given as obfuscated ASCII art]
Another hastily constructed page by Phil Carmody