Fast or Accurate Applications?

It is a common topic between software developers whether an application should be fast or accurate. I’ve always chosen accuracy over speed until now, like when I was solving problem 7 on project euler, which sounded very simple: find the 10001st prime number. My algorithm for finding prime numbers used Wilson’s theorem:

def is_prime?(number)
  (Math::Factorial.get(number-1)+1) % number == 0
end

And it was just fine for problem 3, where I had to find the prime factors of a certain number. There, the highest prime number I had to deal with was 6857 which is the 882nd prime number. Then I started to look for the 10001st prime, and my algorithm managed to find the 3000th after 2 hours.

I’ve observed before that my algorithm is not really fast, but why change something until it is not necessary, right? Wilson’s theorem is slow, because of the factorisation. Here is the factorial of 100 (100!):

>> (1..100).to_a.inject(:*)
=> -10384284350875615835879868794784780563948700682078952346418475
948982003462663882896989074486814358339575949818878047438776630877
0729164800000000000000000000
>>

It takes quite a long time and you can imagine how long it takes to calculate the factorial of 1000, or the factorial of the 10001st prime. So I needed something faster, and I found Fermat’s little theorem (setting a = 2):

def is_prime?(number)
   a = 2
   (a ** (number - 1) - 1) % number == 0
end

And it was faster, way faster (~86 seconds with IO operations, because I’m storing the primes in a file cache). But, Project Euler didn’t accept my solution. I thought I had screwed up the indexes and had submitted the 10000th prime instead of 10001st. So, I submitted primes in the range of +/-2, but none of them was good.

Since it would take ages to guess the right number, I took a different approach. I downloaded the first 1000 primes from the internet and compared them to my primes and there were differences. The first was 341. As it turned out, it is a pseudoprime. A pseudoprime looks like a prime according to Fermat’s little theorem, but actually it’s not: 11 * 31 = 341. The application became faster but less accurate. I didn’t go back to Wilson’s theorem, but changed the code quite a bit:

def is_prime?(number)
  a = 2
  if (a ** (number - 1) - 1) % number == 0
    if is_pseudoprime?(number)
      return false
    end
    return true
  end
  return  false
end

protected
def is_pseudoprime?(number)
  number_sqrt = Math.sqrt(number).to_i
  divisors = (1..number_sqrt).to_a.delete_if do |n|
    is_composite_of?(n, [2, 3, 5, 7, 11])
  end

  divisors.each do |n|
    # n > 1 makes sure that it works for actual primes in the recursion
    if number % n == 0 && n > 1
      return true
    end
  end
  return false
end

def is_composite_of?(n, base_primes)
  base_primes.each do |d|
    if (n % d == 0) && ((n / d) > 1)
      return true
    end
  end
  false
end

It is a bit slower, but more accurate (~89 seconds with IO). I think, it can be made faster (and cleaner), but for the moment I have enough primes and therefore I don’t need a different solution. Nevertheless, I learned a couple of things. First of all, playing safe and going for accuracy wasn’t a good choice because I didn’t improve and learn new things. Second, faster algorithms mea faster turnover times and faster development. And third, things like pseudoprimes finally make sense after 10 years I finished the University. Nevertheless, it was fun, solving Project Euler puzzles is fun, too. Give it a try, you’ll like it.


comments powered by Disqus