Personal Page of DM3MAT

Reverse engineering of the code plug format

Once the communication protocol is known, it is time to focus on the code plug format. Depending on the code plug complexity, this can be a very cumbersome task. Moreover, depending on the device, it can be pretty hard to access the binary representation of the configuration (code plug) that is actually written onto the device. In general, one cannot expect, that the binary file, created by the manufacturer, does relate to the binary code plug written onto the device at all.

To figure the relationship out, between the code plug file and the actual code plug written onto the device, the code plug must first be extracted from the Wireshark capture. As we already know the protocol, this can be done easily using Python and pyshark. When doing so, be careful with the addresses, data is written to or read from. The binary code plug might not be written sequentially and with gaps. A best practice is to extract the data read or written along with its addresses and sort it with respect to the address. Then dump everything as a hex-dump.

Once the code plug is extracted from the Wireshark capture, compare it to the corresponding code plug file created by the CPS. If you are lucky (usually for very cheap DMR radios), the CPS code plug file is simply the binary code plug written to the device (e.g., for GD77, RD5R). Then it is very easy to reverse engineer the code plug format, as you can study changes to the CPS code plug file.

If they are not identical, you will need to somehow extract the binary code plug written to the device from the manufacturer CPS. If the protocol is based on Serial-over-USB, you may be able to write an emulator for the device, knowing the communication protocol. This then allows to trick the CPS to talk to the emulator instead of the device and dump the code plug. If not, you are likely stuck with writing a lot of code plugs to the real device and extracting the code plug from the Wireshark captures. This can be considered the worst-case scenario.

Differential analysis

Irrespective on how you gain access to the binary code plug written onto the device, the analysis of this code plug is always the same. This is called differential analysis. The idea is, to change a single option in the CPS and compare the resulting binary code plugs bit by bit. Ideally, only a single place in the binary code plug will change too and this change then reflects the encoding of the particular setting touched. This is then repeated for all possible settings in the CPS until the entire code plug structure is known.

Note

There will be some bits and bytes in the code plug that will never changed. These are usually reserved bits/bytes. There may also be some undocumented and hidden options in the code plug, that cannot be set via the (normal) CPS.

Before you can start reverse engineering the code plug, you need to create a base code plug. That is, one small simple code plug, that touches every feature of the radio. It should not be overly complex with hundreds of channels, but should contain everything the radio provides. For example, it should contain the APRS settings, if the radio has GPS or a roaming zone if the radio supports roaming. You should also create two of every kind. That is, at least two channels, contacts, group lists, zones and scan lists. Otherwise, it will be hard to figure the index scheme out.

Once you created the base-code plug, get its binary representation (e.g., from file, emulation or Wireshark capture). Change something simple. For example the name of the first contact. Then get the binary representation of this modified code plug and compare the two binary code plugs.

As an example, consider the following difference between two hex-dumps of the binary code plugs for a BTech DMR-6X2UV Pro. Here, the name of the first contact was changed.

Example 8.3. Difference between two binary code plugs for a DMR-6X2UV Pro. Only the name of the first contact was changed.

< 02680000 : 01 43 6f 6e 74 61 63 74 20 31 00 00 00 00 00 00 | .Contact 1......
< 02680010 : 00 00 00 55 6e 6b 6e 6f 77 6e 00 00 00 00 00 00 | ...Unknown......
> 02680000 : 01 56 65 72 79 20 6c 6f 6e 67 20 6e 61 6d 65 2e | .Very long name.
> 02680010 : 2e 00 00 55 6e 6b 6e 6f 77 6e 00 00 00 00 00 00 | ...Unknown......

This difference already shows some very important information about the code plug encoding. First, the list of contacts appears to start at address 2680000h. This is very important for the coarse structure of the binary code plug. We also learn about the structure of the encoded contacts, that the name starts at offset 01h. So, the first byte is still unknown. The length of the name is limited to 16 bytes and is terminated/padded with 0-bytes. However, we still don't know how large the encoded contact element in the code plug actually is.

To figure out the size of the contact element, one can use the fact, that all elements very likely have identical sizes. Hence the offset from one name to the name of the next contact will be the size of the contact element. We already know, that the contact list starts at the address 2680000h. Hence, we can look in the code plug directly at that address.

Example 8.4. Codeplug hex-dump at the contact-list address.

02680000 : 01 43 6f 6e 74 61 63 74 20 31 00 00 00 00 00 00 | .Contact 1......
02680010 : 00 00 00 55 6e 6b 6e 6f 77 6e 00 00 00 00 00 00 | ...Unknown......
02680020 : 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 | ................
02680030 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
02680040 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
02680050 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
02680060 : 00 00 00 00 01 43 6f 6e 74 61 63 74 20 32 00 00 | .....Contact 2..
02680070 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
02680080 : 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 | ................
02680090 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
026800A0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
026800B0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
026800C0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................

The offset between the two names Contact 1 and Contact 2 is exactly 100 bytes (64h). With this, we know the exact size of the contact element. However, wo only know what 16 of these 100 bytes mean. So there is a lot to figure out. The differential analysis, however, makes it quiet easy. First we may change the call type of the first contact. From group call to private call and study the differences in the code plug.

Example 8.5. 

< 02680000 : 01 43 6f 6e 74 61 63 74 20 31 00 00 00 00 00 00 | .Contact 1......
> 02680000 : 00 43 6f 6e 74 61 63 74 20 31 00 00 00 00 00 00 | .Contact 1......

We see, that only one byte has changed. The one at offset 0. It changed from 01h to 00h. We may conclude that this byte encodes the contact type, where 00h means private call and 01h means group call. Analogously, we find out that 03h means all-call.

The next step is to find the encoding and offset of the contact DMR ID. To find it, we reload the base code plug and change the DMR ID of the first contact from 1 to something more complex. E.g., to 1234567 and compare the binary .

Example 8.6. 

< 02680020 : 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 | ................
> 02680020 : 00 00 00 01 23 45 67 00 00 00 00 00 00 00 00 00 | ....#Eg.........

Here we find some changes at the offset 23h. The content changed from 00000001h to 01234567h. This is somewhat weird, as the binary representation of 1234567 in hex is actually 12d687h. However, for DMR code plugs, it is quiet common to store integers in so-called BCD. Here each digit is stored in 4bits. This has the consequence, that the decimal digits are readable in hex. But this storage is inefficient, as the DMR ID is actually a 24bit number, that can be stored in 3 bytes. Its BCD encoding requires 4 bytes.

The last setting for contacts remaining in the CPS is the call-alert. This setting is only valid for private calls. Changing the contact 1 from group to private call, changing the call alert from None to Ring and comparing the code plugs, one gets:

Example 8.7. 

< 02680000 : 01 43 6f 6e 74 61 63 74 20 31 00 00 00 00 00 00 | .Contact 1......
> 02680000 : 00 43 6f 6e 74 61 63 74 20 31 00 00 00 00 00 00 | .Contact 1......
< 02680020 : 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 | ................
> 02680020 : 00 00 00 00 00 00 01 01 00 00 00 00 00 00 00 00 | ................

Of cause, we find two changes. The change of the first byte, indicating the change from group to private call, as well as one byte changing at offset 27h from 00h to 01h. This is the only unexplained change and thus must reflect the alert-type setting. Consequently, the alert type None is represented by 00h and Ring by 01h. Analogously, we find that alert-type Online is encoded as 02h.

Obviously, we still don't know the majority of the 100 bytes of the contact element. Lets have a look at the structure of the contact element.

     0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
00 |TYP| Name, 16 x ASCII, 0-pad                                ...
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
10     |Pad| Unknown 16 bytes, ASCII?                           ...
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
20  ...    |Pad|DMR ID, BCD, BE|Rng| Unused, filled with 00h    ...
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
60                 |
   +---+---+---+---+

There are some additional Pad bytes (00h) that terminate ASCII strings. This is nothing, we found by the differential analysis, but it is some common scheme, found in many binary code plugs. Obviously, the firmware developer did not knew the strnlen C function.

The remaining bytes of the contact element appears to be empty, unused space. This is also quiet frequent in code plugs. They contain a lot of unused bytes that might be reserved for future uses. This allows the firmware developer to extend the code plug, while maintaining backward compatibility with older versions of the CPS.

Having reverse engineered the contacts, does not mean that you are done. There are a lot more elements to decipher. Did I mentioned that it is a lot of work? But the general scheme of things remain the same. Change a single setting in the CPS and compare the resulting binary code plugs. Document your findings. Step by step, you will figure out the entire code-plug structure.