Base64 encoding is a technique for converting binary data into ASCII symbols. It is not an encryption technique; it is about representing data safely while it flows through a network channel. We will Base64 encode different strings step by step in the examples below.
Base64 Encode “Get” String
1. Convert Character to Decimal
G = 71 , e = 101 , t = 116
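As a quick check of this step, here is a minimal Python sketch. It relies on the built-in ord function; the variable names are only illustrative.

```python
# Look up the decimal ASCII code of each character in "Get".
text = "Get"
codes = [ord(ch) for ch in text]
print(codes)  # [71, 101, 116]
```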
2. Convert Decimal to 8 Bit Binary
Please check this link for converting decimal to 8 bit binary format
071 = 01000111
101 = 01100101
116 = 01110100
Note: 1 byte is 8 bits, so we add an extra 0 at the front of each 7-bit value.
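The same conversion can be sketched in Python with the built-in format function, where the "08b" format pads each value to 8 binary digits. The variable names are assumptions made for this example.

```python
# Write each decimal code as an 8-bit binary string, padded with leading zeros.
codes = [71, 101, 116]
bytes_as_bits = [format(code, "08b") for code in codes]
print(bytes_as_bits)  # ['01000111', '01100101', '01110100']
```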
3. Convert Binary to Base64
3.1 Concatenate Binaries
We will concatenate the above 3 binary numbers:
010001110110010101110100 ( 24 bits )
3.2 Split Binaries to 6 Bits
From the name Base64, we get a clue there is something related to the number 64.
As we have already seen: G = 071 = 1000111
Note: ASCII characters are 7 bits long, and "G" is an ASCII character, so storing or transmitting G takes 7 bits. Because ASCII characters are 7 bits long, the ASCII table has 2 ^ 7 = 128 entries (please refer to the above ASCII table); in other words, 7 bits give 128 different combinations of characters.
In the same way, 64 is 2 ^ 6, so a Base64 character carries 6 bits, and Base64 encoding uses 64 different characters. Base64 has its own table, which contains only these 64 characters. The characters are ASCII symbols, but don’t confuse their positions in the Base64 table with their ASCII values.
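To make the two tables concrete, here is a small Python sketch. It assumes the standard Base64 alphabet (A-Z, a-z, 0-9, "+" and "/"); the name BASE64_ALPHABET is made up for this example.

```python
import string

# Standard Base64 alphabet: position in this string = 6-bit value.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

print(2 ** 7, 2 ** 6)         # 128 64
print(len(BASE64_ALPHABET))   # 64

# "A" sits at position 0 in the Base64 table but has ASCII code 65.
print(BASE64_ALPHABET.index("A"), ord("A"))  # 0 65
```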
Base64 A is not equal to ASCII A. In the first step, we concatenated 8 * 3 = 24 bits. Since a Base64 character holds only 6 bits, we split those 24 bits into 4 groups of 6 bits each (6 * 4 = 24):
010001 110110 010101 110100
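Steps 3.1 and 3.2 can be sketched together in Python: join the three 8-bit strings, then slice the result every 6 characters. This is only an illustration working on text strings of "0" and "1", not on real bits.

```python
# Concatenate the 8-bit strings and regroup them into 6-bit chunks.
bytes_as_bits = ["01000111", "01100101", "01110100"]
bit_string = "".join(bytes_as_bits)
print(bit_string, len(bit_string))  # 010001110110010101110100 24

groups = [bit_string[i:i + 6] for i in range(0, len(bit_string), 6)]
print(groups)  # ['010001', '110110', '010101', '110100']
```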
3.3 Convert Binary to Decimal
Please check this link for converting binary to decimal
010001 = 17
110110 = 54
010101 = 21
110100 = 52
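In Python, the built-in int function with base 2 does this conversion. A minimal sketch:

```python
# Read each 6-bit group as a base-2 number.
groups = ["010001", "110110", "010101", "110100"]
indices = [int(group, 2) for group in groups]
print(indices)  # [17, 54, 21, 52]
```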
3.4 Convert Decimal to Base64
Base64 encodes the decimals using the above Base64 chart
17 = R
54 = 2
21 = V
52 = 0
Hence, Get is converted to R2V0
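The lookup in step 3.4 can be sketched in Python as an index into the alphabet string, and the result can be cross-checked against the standard library's base64 module. The ALPHABET name is an assumption for this example; it holds the standard Base64 alphabet.

```python
import base64
import string

# Standard Base64 alphabet: index 0 is "A", index 63 is "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

indices = [17, 54, 21, 52]
encoded = "".join(ALPHABET[i] for i in indices)
print(encoded)  # R2V0

# Cross-check with Python's built-in encoder.
print(base64.b64encode(b"Get").decode("ascii"))  # R2V0
```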
Base64 Encode “Java” String
1. Convert Character to Decimal
J = 74 , a = 97 , v = 118 , a = 97 ( refer to the above ASCII table )
2. Convert Decimal to 8 Bit Binary
Please check this link for converting decimal to 8 bit binary format
074 = 01001010
097 = 01100001
118 = 01110110
097 = 01100001
3. Convert Binary to Base64
3.1 Concatenate Binaries
Concatenate the above 4 binary numbers :
01001010 01100001 01110110 01100001
3.2 Split Binaries to 6 Bits
Base64 works on 6 bits at a time, so we split the bits into groups of 6:
010010 100110 000101 110110 011000 01
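The last group has only 2 bits, so we fill it up with 4 extra bits, “0000”:
010010 100110 000101 110110 011000 010000==
The two equal signs at the end are padding. Base64 output is written in blocks of 4 characters (one block per 24 input bits), and the 6 groups above fill only 6 of the 8 positions in two blocks, so “==” fills the last two. Whenever a single byte is left over for the final block, as here, 4 zero bits are added and the output ends in “==”.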
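The padding arithmetic can be sketched in Python as follows. The variable names are illustrative; the modulo expressions simply count how many zero bits complete the last 6-bit group and how many "=" characters complete the last 4-character block.

```python
data = b"Java"
bit_string = "".join(format(byte, "08b") for byte in data)
print(len(bit_string))  # 32

# Zero bits needed to reach a multiple of 6.
pad_bits = -len(bit_string) % 6
padded = bit_string + "0" * pad_bits
groups = [padded[i:i + 6] for i in range(0, len(padded), 6)]
print(pad_bits, groups)

# "=" characters needed to reach a multiple of 4 output characters.
pad_chars = -len(groups) % 4
print(pad_chars)  # 2
```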
3.3 Convert Binary to Decimal
Please check this link for converting binary to decimal
010010 = 18
100110 = 38
000101 = 5
110110 = 54
011000 = 24
010000 = 16
3.4 Convert Decimal to Base64
Base64 encodes the decimals using the above Base64 chart
18 = S
38 = m
5 = F
54 = 2
24 = Y
16 = Q
Hence, Java is converted to SmF2YQ==
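Putting all the steps together, here is a minimal Python sketch of the whole procedure. The function name encode_base64 is hypothetical, and the code works on strings of "0" and "1" for readability rather than speed; real code would normally just call base64.b64encode.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_base64(data: bytes) -> str:
    """Encode bytes by following the manual steps from this article."""
    # Steps 1-2 and 3.1: every byte as 8 bits, all concatenated.
    bits = "".join(format(byte, "08b") for byte in data)
    # Step 3.2: fill the last group with zeros, then split into 6-bit groups.
    bits += "0" * (-len(bits) % 6)
    # Steps 3.3-3.4: each group becomes a number, then a Base64 character.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Pad the output with "=" up to a multiple of 4 characters.
    return "".join(chars) + "=" * (-len(chars) % 4)

print(encode_base64(b"Get"))   # R2V0
print(encode_base64(b"Java"))  # SmF2YQ==
print(encode_base64(b"Java") == base64.b64encode(b"Java").decode("ascii"))  # True
```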
Reason for using 6 bits
A few network devices, such as some routers or modems, can transmit only 7 bits of data per byte. By now we know ASCII characters are 7 bits, so these devices can transmit ASCII symbols easily. But what happens when a byte uses all 8 bits?
Many symbols, non-English characters, and emojis are stored in bytes that use all 8 bits. If such 8-bit data is passed through a 7-bit communication channel, some of it will be lost and the receiver will not get the data the sender actually sent, which is not acceptable. This is where Base64 encoding comes into the picture: it regroups the 8-bit data into 6-bit values, each written as a plain ASCII character, and since the network devices handle 7 bits, they can transport these 6-bit values without loss.
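The data loss can be illustrated with a small Python sketch. The function through_7bit_channel is hypothetical: it only imitates a 7-bit channel by clearing the top bit of every byte, which is an assumption about how such a channel might behave.

```python
import base64

def through_7bit_channel(data: bytes) -> bytes:
    """Imitate a 7-bit channel by dropping the highest bit of each byte."""
    return bytes(byte & 0x7F for byte in data)

raw = "\u00e9".encode("utf-8")  # the character e-acute as two 8-bit bytes

# Sent as-is, the bytes arrive damaged.
print(through_7bit_channel(raw) == raw)  # False

# Sent as Base64, every byte is plain ASCII and survives unchanged.
encoded = base64.b64encode(raw)
received = through_7bit_channel(encoded)
print(received == encoded)                # True
print(base64.b64decode(received) == raw)  # True
```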
If the network devices transport 7 bits of data, then why are we encoding using 6 bits? Why not 7 bits?
To be frank, I am not sure about it either, but the internet community must have thought it through before landing on this 6-bit approach. I have thought about it quite a bit and come up with some conclusions. I have documented them below for my own understanding; I hope they make sense to you as well.
Note: please don’t treat what is written below as fact; it is only meant for understanding at a high level.
Suppose we have to transmit 8-bit characters. As the network devices transfer only 7 bits at a time, we will try to split the bits into chunks of at most 7 bits. Each chunk travels in its own 7-bit unit, and the unused bits of that unit are wasted:
1 ) 8 = 4 + 4 ( 3 + 3 = 6 bits wasted )
2 ) 16 = 4 + 4 + 4 + 4 ( 12 bits wasted, so 18 bits will be wasted on 24 bits )
3 ) 16 = 6 + 6 + 4 ( 5 bits wasted on 16 bits, so about 8 bits will be wasted on 24 bits )
4 ) 16 = 7 + 7 + 2 ( 5 bits wasted on 16 bits, so about 8 bits will be wasted on 24 bits )
5 ) 24 = 6 + 6 + 6 + 6 ( only 4 bits wasted on 24 bits )
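The counts in the list above can be reproduced with a short Python sketch. It follows the same informal model, where every chunk is sent in its own 7-bit unit; UNIT_BITS and wasted_bits are names invented for this example.

```python
UNIT_BITS = 7  # assumed width of one transmitted unit

def wasted_bits(chunks):
    """Unused bits when each chunk is sent in its own 7-bit unit."""
    return sum(UNIT_BITS - size for size in chunks)

splits = {
    "8 = 4 + 4": [4, 4],
    "16 = 4 + 4 + 4 + 4": [4, 4, 4, 4],
    "16 = 6 + 6 + 4": [6, 6, 4],
    "16 = 7 + 7 + 2": [7, 7, 2],
    "24 = 6 + 6 + 6 + 6": [6, 6, 6, 6],
}
for label, chunks in splits.items():
    print(label, "->", wasted_bits(chunks), "bits wasted")
```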
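In the 5th case, only 4 bits are wasted for 24 bits of data, so we can consider it the most efficient of these splits. Also, every chunk has the same size, so managing the bits is much easier. One more practical point: a 7-bit scheme would need 128 different printable symbols, while ASCII has only 95 printable characters, so 64 is the largest power of two that fits. So this was all about Base64 encoding.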
If you like this article please share it. If you want to add anything extra please comment below and write to me if I have gone wrong anywhere.