Skip to content

Encoding and Decoding

Explore the essentials of encoding and decoding data using Base64, Caesar Cipher, and URL transformations for secure and efficient data representation.
Written by: Brianna Laird
Last updated: December 2024


Encoding and decoding play a crucial role in the way data is represented, stored, and transmitted across systems. Whether it’s transforming binary data into a readable format, encoding a URL for safe web usage, or encrypting messages using a simple cipher, these operations enable secure and efficient communication. In this guide, we’ll explore how to implement popular encoding and decoding techniques like Base64, Caesar Cipher, and URL transformations. These foundational skills not only enhance your understanding of data manipulation but also pave the way for integrating advanced cryptographic solutions into your projects.

Base64

Base64 encoding is a method used to convert binary data into an ASCII string format by translating it into a radix-64 representation. This is particularly useful for encoding data that needs to be stored and transferred over media that are designed to deal with textual data. Base64 ensures that the data remains intact without modification during transport.

When using Base64, you need to define the characters used in the encoding process. This can be done with:

const string BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

Encoding with Base64

Encoding data with Base64 involves converting the input data into a sequence of characters from the Base64 character set. This process ensures that the data can be safely transmitted over text-based protocols such as HTTP or email. The following code shows how you can implement Base64 encoding to convert a message into a Base64-encoded string.

string base64_encode(const string &input)
{
string encoded_string;
int value = 0, bits = -6;
const unsigned int base64_mask = 0x3F;
for (unsigned char character : input)
{
value = (value << 8) + character;
bits += 8;
while (bits >= 0)
{
encoded_string.push_back(BASE64_CHARS[(value >> bits) & base64_mask]);
bits -= 6;
}
}
if (bits > -6)
{
encoded_string.push_back(BASE64_CHARS[((value << 8) >> (bits + 8)) & base64_mask]);
}
while (encoded_string.size() % 4)
{
encoded_string.push_back('='); // Padding to make the length a multiple of 4
}
return encoded_string;
}

Using this function

Here is an example of how you can use the base64_encode function to encode a message:

string message = "Hello, World!";
string encoded_message = base64_encode(message);
std::cout << "Original: " << message << std::endl;
std::cout << "Encoded: " << encoded_message << std::endl;

Expected Output

The encoded message should be a Base64 representation of the original message.

Original: "Hello, World!"
Encoded: "SGVsbG8sIFdvcmxkIQ=="

Decoding with Base64

Decoding Base64 data involves converting the encoded string back into its original binary form. This process is essential for retrieving the original data after it has been encoded using Base64. The following code shows how you can implement Base64 decoding to convert a Base64-encoded string back into its original form.

string base64_decode(const string &input)
{
string decoded_string;
int value = 0, bits = -8;
for (unsigned char character : input)
{
if (BASE64_CHARS.find(character) == string::npos)
{
if (character == '=')
break; // Padding character, stop decoding
continue; // Ignore any characters not in Base64 alphabet
}
value = (value << 6) + BASE64_CHARS.find(character);
bits += 6;
if (bits >= 0)
{
decoded_string.push_back(char((value >> bits) & 0xFF));
bits -= 8;
}
}
return decoded_string;
}

Using this function

Here is an example of how you can use the base64_decode function to decode a message:

string encoded_message = "SGVsbG8sIFdvcmxkIQ==";
string decoded_message = base64_decode(encoded_message);
std::cout << "Encoded: " << encoded_message << std::endl;
std::cout << "Decoded: " << decoded_message << std::endl;

Expected Output

The decoded message should match the original message that was encoded.

Encoded: "SGVsbG8sIFdvcmxkIQ=="
Decoded: "Hello, World!"

Caesar Cipher

The Caesar Cipher is a simple substitution cipher named after Julius Caesar, who used it to encrypt his private correspondence. It is a type of symmetric encryption, meaning the same key is used for both encryption and decryption. In the Caesar Cipher, each letter in the plaintext is shifted a fixed number of positions down or up the alphabet. For example, with a shift of 3:

  • A becomes D
  • B becomes E
  • C becomes F

The process wraps around at the end of the alphabet, so X becomes A, Y becomes B, and Z becomes C. Numbers and punctuation are typically left unchanged, but this can vary based on implementation.

The encryption function can be described mathematically as:

where:

  • is the encrypted letter,
  • is the position of the plaintext letter in the alphabet ( for , for , …, for ),
  • is the number of positions to shift (the key),
  • denotes the modulo operation.

For decryption, the function is:

where:

  • is the decrypted letter,
  • is the position of the encrypted letter in the alphabet,
  • is the number of positions to shift (the key).

Encoding with the Caesar Cipher

Encoding data with the Caesar Cipher involves shifting the characters in the input message by a fixed number of positions. This process ensures that the message is encrypted and can only be decrypted using the same shift value. The following code shows how you can implement the Caesar Cipher to encode a message by shifting the characters by a specified number of positions.

string caesar_cipher_encode(const string &input, int shift)
{
string encoded_string;
for (char character : input)
{
if (isalpha(character))
{
char base = islower(character) ? 'a' : 'A';
encoded_string += (character - base + shift) % 26 + base;
}
else
{
encoded_string += character; // Non-alphabet characters remain unchanged
}
}
return encoded_string;
}

Using this function

Here is an example of how you can use the caesar_cipher_encode function to encode a message:

string message = "Hello, World!";
int shift = 3;
string encoded_message = caesar_cipher_encode(message, shift);
std::cout << "Original: " << message << std::endl;
std::cout << "Encoded: " << encoded_message << std::endl;

Expected Output

The encoded message should be the original message with each character shifted by the specified number of positions.

Original: "Hello, World!"
Encoded: "Khoor, Zruog!"

Decoding with the Caesar Cipher

Decoding data with the Caesar Cipher involves shifting the characters in the input message by the reverse of the encoding shift value. This process ensures that the message is decrypted and can be read in its original form. The following code shows how you can implement the Caesar Cipher to decode a message by shifting the characters back by the specified number of positions.

string caesar_cipher_decode(const string &input, int shift)
{
return caesar_cipher_encode(input, 26 - (shift % 26)); // Reverse the shift
}

Using this function

The following code demonstrates how you can use the caesar_cipher_decode function to decode a message:

string encoded_message = "Khoor, Zruog!";
int shift = 3;
std::cout << "Encoded: " << encoded_message << std::endl;
std::cout << "Decoded: " << caesar_cipher_decode(encoded_message, shift) << std::endl;

Expected Output

The decoded message should match the original message that was encoded.

Encoded: "Khoor, Zruog!"
Decoded: "Hello, World!"

Brute-Forcing the Caesar Cipher

In some cases, the shift value used in the Caesar Cipher may not be known. To decrypt the message without the key, you can use a brute-force approach to try all possible shift values and find the one that produces a readable message. As there is only 26 possible shift values in the Caesar Cipher, this method is feasible and can be implemented efficiently. Here is how you can brute-force the Caesar Cipher to decrypt a message:

void caesar_cipher_brute_force(const string &input)
{
for (int shift = 0; shift < 26; shift++)
{
std::cout << "Shift " << shift << "=> " << caesar_cipher_decode(input, shift) << std::endl;
}
}

Using this function

Here is an example of how you can use the caesar_cipher_brute_force function to decrypt a message without knowing the shift value:

std::cout << "Enter the encoded message: ";
string encoded_message;
std::getline(std::cin, encoded_message);
caesar_cipher_brute_force(encoded_message);

Expected Output

The brute-force method should display all possible decrypted messages for each shift value.

Enter the encoded message:
Khoor Zruog!
Shift 0=> Khoor Zruog!
Shift 1=> Jgnnq Yqtnf!
Shift 2=> Ifmmp Xpsme!
Shift 3=> Hello World!
Shift 4=> Gdkkn Vnqkc!
Shift 5=> Fcjjm Umpjb!
Shift 6=> Ebiil Tloia!
Shift 7=> Dahhk Sknhz!
Shift 8=> Czggj Rjmgy!
Shift 9=> Byffi Qilfx!
Shift 10=> Axeeh Phkew!
Shift 11=> Zwddg Ogjdv!
Shift 12=> Yvccf Nficu!
Shift 13=> Xubbe Mehbt!
Shift 14=> Wtaad Ldgas!
Shift 15=> Vszzc Kcfzr!
Shift 16=> Uryyb Jbeyq!
Shift 17=> Tqxxa Iadxp!
Shift 18=> Spwwz Hzcwo!
Shift 19=> Rovvy Gybvn!
Shift 20=> Qnuux Fxaum!
Shift 21=> Pmttw Ewztl!
Shift 22=> Olssv Dvysk!
Shift 23=> Nkrru Cuxrj!
Shift 24=> Mjqqt Btwqi!
Shift 25=> Lipps Asvph!

Vigenère Cipher

The Vigenère cipher is a method of encrypting alphabetic text using a form of polyalphabetic substitution, which applies multiple Caesar Ciphers based on a keyword. Each letter in the plaintext is shifted by an amount determined by the corresponding letter in the keyword, creating a more secure encryption than a single-shift cipher. This method is more resistant to brute force attacks because the pattern of shifts varies based on the keyword.

Mathematically, the Vigenère cipher relies on modular arithmetic to compute shifts. For encoding, each letter in the plaintext is shifted according to the corresponding letter in the keyword:

where:

  • is the encoded letter,
  • is the position of the plaintext letter in the alphabet ( for , for , …, for ),
  • is the position of the keyword letter in the alphabet.

For decoding, the process is reversed:

where:

  • is the decoded letter,
  • is the position of the encoded letter in the alphabet,
  • is the position of the keyword letter in the alphabet.

This structured and repeatable process ensures that both encryption and decryption can be performed accurately using the same keyword.

Encoding with the Vigenère Cipher

Encoding a message with the Vigenère cipher involves applying different shifts to each letter in the plaintext, determined by the corresponding letter in the repeated keyword. This ensures that each character is encrypted with a different substitution, making the resulting ciphertext more complex and secure. The process involves converting characters to numeric values, applying modular arithmetic for the shifts, and converting the results back to alphabetic text.

string vigenere_cipher_encode(const string &input, const string &keyword)
{
string encoded_string;
for (size_t i = 0; i < input.size(); i++)
{
char character = input[i];
char key = keyword[i % keyword.size()];
if (isalpha(character))
{
char base = islower(character) ? 'a' : 'A';
encoded_string += (character - base + key - 'A') % 26 + base;
}
else
{
encoded_string += character; // Non-alphabet characters remain unchanged
}
}
return encoded_string;
}

Using this function

Here is an example of how you can use the vigenere_cipher_encode function to encode a message using a keyword:

string message = "Hello, World!";
string keyword = "KEY";
string encoded_message = vigenere_cipher_encode(message, keyword);
std::cout << "Original: " << message << std::endl;
std::cout << "Encoded: " << encoded_message << std::endl;

Expected Output

The encoded message should be the original message with each character shifted by the corresponding letter in the keyword.

Original: "Hello, World!"
Encoded: "Rijvs, Uyvvn!"

Decoding with the Vigenère Cipher

Decoding a message encrypted with the Vigenère cipher involves reversing the shifts applied during encryption using the same keyword. By applying the modular arithmetic in reverse, each character is shifted back to its original position, reconstructing the plaintext. This process relies on the keyword to match the shifts used during encryption, ensuring that the message can only be decrypted by someone with the correct key.

string vigenere_cipher_decode(const string &input, const string &keyword)
{
string decoded_string;
for (size_t i = 0; i < input.size(); i++)
{
char character = input[i];
char key = keyword[i % keyword.size()];
if (isalpha(character))
{
char base = islower(character) ? 'a' : 'A';
decoded_string += (character - base - key + 'A' + 26) % 26 + base;
}
else
{
decoded_string += character; // Non-alphabet characters remain unchanged
}
}
return decoded_string;
}

Using this function

Here is an example of how you can use the vigenere_cipher_decode function to decode a message using a keyword:

string encoded_message = "Rijvs, Uyvvn!";
string keyword = "KEY";
string decoded_message = vigenere_cipher_decode(encoded_message, keyword);
std::cout << "Encoded: " << encoded_message << std::endl;
std::cout << "Decoded: " << decoded_message << std::endl;

Expected Output

The decoded message should match the original message that was encoded.

Encoded: "Rijvs, Uyvvn!"
Decoded: "Hello, World!"

URL Transformations

URL transformations are used to encode and decode URLs to ensure that they are safe for use in web applications. URLs may contain special characters that need to be encoded to prevent errors and security vulnerabilities. The most common encoding scheme used for URLs is percent-encoding, which replaces special characters with a percent sign followed by two hexadecimal digits. This ensures that the URL is correctly interpreted by web browsers and servers.

Encoding URLs

Encoding URLs involves replacing special characters with their percent-encoded equivalents. This process ensures that the URL is safe for use in web applications and can be correctly interpreted by browsers and servers. The following code shows how you can implement URL encoding to convert a URL into a percent-encoded string.

// Function to convert a number to its hexadecimal representation
string to_hex(char num)
{
const char hex_chars[] = "0123456789ABCDEF";
return string(1, hex_chars[num & 0xF]);
}
// Function to encode a URL
string url_encode(const string &input)
{
string encoded_string;
for (char character : input)
{
if (isalnum(character) || character == '-' || character == '_' || character == '.' || character == '~')
{
encoded_string += character;
}
else
{
encoded_string += '%' + to_hex(character >> 4) + to_hex(character & 0xF);
}
}
return encoded_string;
}

Using this function

Here is an example of how you can use the url_encode function to encode a URL:

string url = "https://www.example.com/search?q=hello world";
string encoded_url = url_encode(url);
std::cout << "URL Encode (" << url << ")=> " << url_encode(url) << std::endl;

Expected Output

The encoded URL should replace special characters with their percent-encoded equivalents.

URL Encode (https://www.example.com/search?q=hello world)=> https%3A%2F%2Fwww.example.com%2Fsearch%3Fq%3Dhello%20world

Decoding URLs

Decoding URLs involves converting percent-encoded characters back to their original form. This process ensures that the URL is correctly interpreted by web browsers and servers. The following code shows how you can implement URL decoding to convert a percent-encoded string back into its original URL.

// Function to convert a hexadecimal character to its decimal representation
int from_hex(char hex)
{
if (hex >= '0' && hex <= '9') return hex - '0';
if (hex >= 'A' && hex <= 'F') return hex - 'A' + 10;
if (hex >= 'a' && hex <= 'f') return hex - 'a' + 10;
return -1; // Invalid hex character
}
// Function to decode a URL-encoded string
string UrlDecode(string input)
{
string decodedString = "";
for (int i = 0; i < input.Length; i++)
{
if (input[i] == '%')
{
if (i + 2 < input.Length)
{
char hex1 = input[i + 1];
char hex2 = input[i + 2];
int decodedChar = (FromHex(hex1) << 4) | FromHex(hex2);
if (decodedChar >= 0) // Only append if valid hex
{
decodedString += (char)decodedChar;
i += 2; // Skip the two hex characters
}
else
{
// Handle invalid hex gracefully (optional)
decodedString += '%';
}
}
else
{
// Handle incomplete encoding gracefully
decodedString += '%';
}
}
else if (input[i] == '+')
{
decodedString += ' '; // '+' is interpreted as space in URL encoding
}
else
{
decodedString += input[i];
}
}
return decodedString;
}

Using this function

string encoded_url = "https%3A%2F%2Fwww.example.com%2Fsearch%3Fq%3Dhello%20world";
std::cout << "URL Decode (" << encoded_url << ")=> " << url_decode(encoded_url) << std::endl;

Expected Output

The decoded URL should match the original URL that was encoded.

URL Decode (https%3A%2F%2Fwww.example.com%2Fsearch%3Fq%3Dhello%20world)=> https://www.example.com/search?q=hello world