# Factoring large semi-primes

tl;dr a friend of mine mentioned this problem over beers, and neglected to tell me that it’s considered impossible (thanks). Me being me, I gave it a shot, and appear to have devised a legitimate attack on large semi-prime that is not dependent on their size.

Long story short, it’s easy to multiply x * y, but harder to get x and y from back out of the answer. At it’s most basic level, 7 * 5 = 35. Conversely 35 = x * y, solve for y. There can only be one valid answer if both factors are primes, and when both factors are primes the product is called a ‘semiprime

It’s asymptotic complexity. In short, it sucks to brute force. It’s easy for a 2 digit number, not so much for a 200 digit number. That ‘s why no one uses 2 bit RSA keys. This mathematical asymmetry  is the entire basis of RSA encryption.

Before we continue, let’s declare a few variables. p is the first prime, q is the second prime, pq is their product (and the semi-prime).

The most obvious speed-up is limiting the search space.

• The p and q will never be even numbers.
• p will always be greater than the square root of pq, and q will always be smaller.

At this point, the search gets a bit smaller. But it’s still huge. Taking some time to refactor the formula (and I skipped a step or two, I’ve lost my original notes though will try to recreate them at some point)

x = sqrt(pq+(n^2))
p = x – n
q = x + n

We can simply increment x until p*q == pq.

A few examples;

x = sqrt(35+(1^2)) = 6
p = 6 – 1 = 5
q = 6 + 1 =  7
Then confirm the results.. (5*7 == 35) and we’re done!

x = sqrt(77 + (1^2)) // is not an integer. Stop.
x = sqrt(77+(2^2)) = 9
p = 9 – 2 = 7
q = 9 + 2 =  11
Then confirm the results.. (7*11 == 77) and we’re done!

We’re at this point effectively looping through half of the difference of the primes, which is a lot smaller space than every number ever. This algorithm can factor a 50 digit prime in milliseconds on a netbook. This algorithm is fastest when the numbers are closest together.

Since we know both x and n are integers, we can do the search the other way, as x changes slower than n, by reversing it we get a speedup.

We can rearrange this as

x = ceil(sqrt(pq))
n = sqrt(x^2 – pq)
n = sqrt(6^2 – pq)
n = sqrt(36-35)
n = sqrt(1)
n = 1, x = 6.

p = 6+/- 1 = 5, 7.

Obviously the bigger the numbers are, the greater the difference between them. Ironically, it’s recommended to use values of p and q which are relatively close together. Thus this may be a valid attack against an RSA key

It’s non-deterministic, and can be parallelized (give each machine a 100,000,000,000,000,000 digit range to search

The only further speedup I see is if there’s a way to solve for the next integer value of n. The limiting factor here is that it’s effectively approaching a linear search as x approaches infinity. It’s extremely fast until the slope straightens out.

I’ve yet to come up with something so throwing this algorithm out there to see if anyone else can take it that last step.

UPDATE: Looking into it more, it’s a massive speedup in a small field, but not a large field, as it approaches a linear search of the difference between primes as x approaches infinity. This is fine for a 50 digit semiprime, but not a 200.