Hacking Cars With Z3
A modern car has many many computing parts which are responsible of various functions of the car.
Background about Cars and ECUs
A modern car has many many computing parts which are responsible of various functions of the car. When we say “modern” we mean every car that was manufactured in the last 30 years. These computing parts are called ECU – Electronic Control Unit. And they are all over the place. Even in the simplest car you have over a dozen of those – Engine Control, ABS, Battery Management and many others.
All ECUs in a car are connected to the Controller Area Network (CAN) bus. This is the network over which the ECUs send messages to each other and the car technician connects to when she needs to diagnose and or calibrate the car. You can typically find a port and connect to this network via OBD2 connector which is usually found close to the steering wheel.
These are our main subjects of research – ECUs and how can they be attacked or secured.
Unlocking ECUs
In traditional cyber attacks, once an attacker gains control over a system, he tries to preserve it – i.e. put some malware which grants him access or alter the system’s functionality without a need to attack the system again. In our research, we try to translate this idea to the automotive world of ECUs and CAN. The most straight forward way to put malware in an ECU is to re-program it. To re-flash it with a new firmware specially crafted by the attacker. However, it turns out that ECUs won’t happily agree to re-flash their firmware without any verification. When an ECU boots, it starts in a “locked” state. In locked state the ECU only responds to a subset of low-privileged messages such as status inquiry. To unlock the ECU and perform privileged operations one must go through a challenge-response authentication process. The unlocking process is as follows: 1. A devices sends “unlcok request” to ECU 2. ECU responds with a seed (which is usually a 32 bit value) 3. The device calculates a key using a secret “calculateKey” function and sends it to the ECU 4. The ECU replies with OK and change its state to “unlocked”
The heart of this process is the mysterious “calculateKey” function. This is what stands between us and a successful re-flashing. Let’s explore it!
Secret Functions and Where to Find Them
Technically, this function can be anywhere in the infrastructure used for updating ECU’s firmware. The ECU and car manufacturers obviously have it, as they need it when manufacturing the car. But we don’t have friends in such places yet and we don’t think they just give it away very easily. Fortunately, ECUs are also updated after the car has been manufactured. Sometimes bugs are found in an ECU and the vendor needs to re-flash it. Whenever you put your car in the garage the technician connects it to the vendor’s diagnostics tool and, if you are lucky, installs firmware updates. So it’s possible these “calculateKey” functions are somewhere in the diagnostics tools software or hardware! It is also possible that the vendor keeps these functions hidden somewhere in the backbone of the infrastructure and the diagnostics tool uses some API to invoke them. The latter design requires the diagnostics tool to be connected to the vendor whenever it needs to re-program an ECU. It’s not an unreasonable demand in our always connected world, but keep in mind these infrastructures were designed a long time ago. The newer diagnostics tools actually do require you to have an active account and be online when performing privileged operations, but older ones don’t. To sum, if you hunt for secret functions, your best chance is to reverse engineer old diagnostics tools. And this is exactly where we found them.
We reverse engineered an old diagnostics tool and found out it holds a few dozens of JavaScript files which contain a “calculateKey” function. Each file holds a function for a specific ECU model. But will it work with any of our ECUs? Quite some time ago, when we started our research, we sniffed our own CAN bus when it was in the garage. We were lucky to capture a firmware update session, so we have in our possession a seed/key pair for one of our ECUs. Unfortunately though, none of the functions we found produced the correct key given our seed. Time to go deeper!
What’s in the Box
So far, we didn’t even look at the functions, we just used them. But maybe it’s time to study them. Maybe we can find a pattern to these functions which will enable us to guess the correct function for one of our ECUs.
When we opened the files, we saw that about 70% of them are using the same algorithm with different constants. Here is the general algorithm translated to Python:
def scr1(seed, key_constant_1, key_constant_2):
tmpSeed = 0
tmpSeed |= ((seed >> 16) & 0xff) << 24;
tmpSeed |= ((seed >> 24) & 0xff) << 16;
tmpSeed |= ((seed >> 8) & 0xff) << 0;
tmpSeed |= ((seed >> 0) & 0xff) << 8;
shiftSeed = ((tmpSeed << 11) + (tmpSeed >> 22)) & 0xffffffff
return shiftSeed ^ key_constant_1 ^ (seed & key_constant_2)
The difference between the functions was the values of key_constant_1
and key_constant_2
which are hard-coded in the file (the seed
is the challenge from the ECU).
This algorithm is quite simple, it has only a few dozen bit-wise operations and each bit of the input influences only a few bits of the output. Definitely not a cryptographically secure hash function. We named this function, in lack of a better name, Simple Challenge Response 1 – SCR1. Seeing SCR1 is so popular, we figured it’s reasonable to assume that at least one of our ECUs uses it. We only need to find one and figure out the constants for the algorithm.
Laziness is a Virtue
We already know how to capture privileged sessions in our car, so we can get as many seed/key pairs for ECUs in our own car. We simply take it to the garage, ask the mechanic to forcibly update all our ECUs, and there we go. So, theoretically, we have a bunch of seed/key pairs. The last step is to write some code which process these pairs and spits out the constants for SCR1 and then we can try to unlock the ECU ourselves back home. We are not experts in crypto (as in cryptography) but this whole task smells like cryptanalysis. Unfortunately, none of our crypto friends were available, so we had to come up with something ourselves. Then, an idea popped in our head, maybe the future is already here and we can replace manual labor workers with machines! Maybe it’s time to use an automated mathematical analysis. The best automatic mathematician we are familiar with is the Z3 theorem prover. The Z3 engine (probably the best product ever developed in Microsoft) is a machine you can feed constraints to (axioms) and ask if there is a way to satisfy (prove) them. It has some nice API for common types such as 32 integers (Bit Vectors
) and other logic and arithmetic operations. Using Z3, we can invoke scr1
with the constants unbounded and add the constraint that its output for a specific seed must be the corresponding key. After feeding all these constraints, we can simply ask Z3 to check if the system can be satisfied – i.e. if exist constants which produces the correct keys for the given seeds. We can even ask what these constants are. Here is the code:
import z3
def find_kc(seed_key_arr):
kc1, kc2 = z3.BitVecs('kc1 kc2', 33)
s = z3.Solver()
## constrain the values to 32-bit range, i.e. MSB == 0
s.add(kc1 & 2 ** 32 == 0)
s.add(kc2 & 2 ** 32 == 0)
for seed, key in seed_key_arr:
s.add(z3.BitVecVal(key, 33) == scr1(z3.BitVecVal(seed, 33), kc1, kc2))
if s.check() != z3.sat:
return False
m = s.model()
found_kc1, found_kc2 = m[kc1].as_long(), m[kc2].as_long()
print "found {:#x}, {:#x}".format(found_kc1, found_kc1)
s.add(z3.Or(found_kc1 != kc1, found_kc2 != kc2))
if s.check() != z3.unsat:
return False
return found_kc1, found_kc2
The find_kc
function first defines two symbolic variables kc1
and kc2
of a 32-bit vector type. Then, it receives an array of seed/key pairs and adds each as a constraint, like we said. The constraint added to the system literally says “the output of scr1
, given the seed
, kc1
and kc2
should be the key”. Then we check if the system is satisfiable. If it’s not, we know for sure these pairs were not generated by SCR1. If it is satisfiable, it does not mean we found the correct value. It is possible that multiple different constants can satisfy the system. So, we do a trick. We query the model satisfying the system for the values of kc1
and kc2
and then add another constraint – that the values of kc1
and kc2
must not be the values we found. If the solver can satisfy this additional constraint – it means that multiple constants are possible and we don’t have a unique result. However, if it can’t satisfy this one last constraint – we know for certain we found the only possible constants.
Note: We use 33 bit values and constrain them to the 32-bit range because of an annoying subtlety of the Z3 API. The shift-right operator
>>
in Z3 is an arithmetic (signed) shift and not a logical shift which means “The empty position in the most significant bit is filled with a copy of the original MSB” (Wikipedia). While it’s possible to usez3.LShR
, it complicates the code and prevents passing normal Python types toscr1
. Constraining 33 bit vectors to the 32 bit range ensures the MSB is zero and everything works as expected. (Yes, it took us an hour to debug it.)
We tried this code with some randomly generated values and it works quite well. Crypto friends, it’s time to find a new job, have you ever considered professional driving?
One More Thing
We haven’t tried it on a real ECU yet, but before we do that, there is an important question to ask. How many sessions should we capture before we can derive the constants? How many seed/key pairs should we have before we can re-flash an ECU? The answer, my friend, is quite surprising. Our intuition was it’s more or less like a linear equation system. We have two variables, we probably need two pairs (i.e two equations). But intuition and crypto don’t go hand in hand. So we decided to check it with random values. We wrote a short main which fixes two 32 bit constants to randomly and then generate seed/key pairs until the find_kc
function finds them. Here is the main:
import random
kc1 = random.randint(0 , 2**32)
kc2 = random.randint(0 , 2**32)
sk_arr = []
while not find_kc(sk_arr):
seed = random.randint(0,2**32)
sk_arr.append((seed, scr1(seed, kc1, kc2)))
assert kc1, kc2 == find_kc(sk_arr)
print 'require {} smaples for an accurate finding'.format(len(sk_arr))
To our surprise, it usually takes about 7 iterations, much more than two. In some cases, it took 11 iterations.
Goodbye
This is it for now. We hope you enjoyed it. Hopefully we will be able to verify our speculations next week and see if our guesses were true or not. In the meantime, drive safe 🙂