R package StegosauR - part 2 (Steganography)

[back to intro] - [to part 1] - [to part 3] - [practical example] - [Capabilities]

Now comes the part where this sequence of numbers is transferred into an image. As an example, I picked the reconstruction of a Stegosaurus skeleton published in 1891 by O.C. Marsh in his paper “Restoration of Stegosaurus” (American Journal of Science, series 3, vol. 42).

[Image: Marsh_1891_Stegosaurus]

#load image
library(jpeg)
img <- readJPEG("./images/StegosauR/Marsh_1891_StegosauRus.jpg")

The function readJPEG from package jpeg converts the image into an array in which every pixel is split into its fundamental color components (channels). Our image has 404x250 pixels and three channels; the array dimensions are ordered as rows (height), columns (width), channels.

#show image dimensions

dims <- dim(img)

dims

## [1] 250 404   3
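
To make the index order concrete, here is a minimal sketch on a made-up array. The object `toy` is hypothetical and not part of the StegosauR example; it only mimics the rows, columns, channels layout shown above.

```r
# toy image: 2 rows (height), 3 columns (width), 3 channels (R, G, B), all black
toy <- array(0, dim = c(2, 3, 3))

# make the pixel in row 1, column 3 pure red
toy[1, 3, ] <- c(1, 0, 0)

dim(toy)      # rows, columns, channels
toy[1, 3, ]   # all three channels of a single pixel
toy[, , 1]    # the red channel as a 2x3 matrix
```

Indexing the real image works the same way: `img[y, x, c]` picks row `y`, column `x` and channel `c`.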

The three channels represent the RGB values of each pixel, each one ranging from 0 to 1. If all three channels are equal to 1, the pixel is white; if they are all equal to 0, the pixel is black. In our case, the values of the top left pixel are:

#red
img[1,1,1]

## [1] 0.9960784

#green
img[1,1,2]

## [1] 0.9960784

#blue
img[1,1,3]

## [1] 0.9960784
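
The value 0.9960784 is what an 8-bit intensity of 254 looks like once it is divided by 255. That the source file stores 8 bits per channel is an inference from the printed value, not something stated above, so take the following as a small sanity check rather than a description of the file.

```r
# an 8-bit intensity of 254, rescaled to the 0-1 range
val <- 254 / 255

round(val, 7)      # 0.9960784, the value printed for the top left pixel
round(val * 255)   # 254, back on the 0-255 scale
```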

The very basic approach of StegosauR consists of taking one of these numbers and replacing one of its decimal digits with a single “code number” of our hidden message. A “code number” is one digit of our message after its transformation from text into numerical values. Let’s take our “This is StegosauR” example from part 1.

v.collapsed <- "2222333222223442222234432222352522222422222234432222352522222422222235252222353222223433222234352222345522223525222234232222353322223324"

#split it into single code numbers
v.split <- strsplit(v.collapsed, "")[[1L]]

counter <- 1


#cycle through every pixel in order, starting from [1,1,1]
for (x in seq_len(dims[2])) {
  for (y in seq_len(dims[1])) {
    for (ch in seq_len(dims[3])) { #ch is the channel index
      if (counter <= length(v.split)) {

        old_val <- img[y,x,ch]

        #get the initial 4 characters (counting also the decimal separator), e.g. "0.99"
        #formatC() pads with zeros, so values such as 1 or 0.5 also yield 4 characters
        val_head <- substr(formatC(old_val, format="f", digits=7), 1, 4)

        #update value: the code number becomes the third decimal digit
        img[y,x,ch] <- as.numeric(paste0(val_head, v.split[counter]))
        #print(paste("x: ", x," - y: ", y," - c:", ch," - old_val: ", old_val,
        #            " - val_head: ", val_head, " - new_val: ", img[y,x,ch], " - counter: ", counter, sep=""))
        counter <- counter+1

      }
    }
  }
}

The old pixel values have been successfully overwritten: each of the first length(v.split) values now carries a digit of our secret message in its third decimal position. For example, the first digit of our message is now coded into the first channel of the first pixel.

v.split[1] #this is the first digit of our coded message

## [1] "2"

img[1,1,1]

## [1] 0.992
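
The same replacement can be followed step by step on a single number. This is only a sketch: it uses 254/255 as a stand-in for the top left pixel, so it runs without the image file.

```r
old_val <- 254 / 255                              # 0.9960784..., like the top left pixel

val_head <- substr(as.character(old_val), 1, 4)   # "0.99"
new_val  <- as.numeric(paste0(val_head, "2"))     # 0.992

substr(as.character(new_val), 5, 5)               # "2", the embedded code number
abs(new_val - old_val)                            # how much the intensity changed
```

Because only the third decimal and everything after it is touched, a value can never change by more than 0.01, which is why the altered pixels are hard to spot by eye.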

All code numbers can be recovered through the same loop.

message_length <- length(v.split)

counter <- 1

code <- character() #the extracted digits are characters, like v.split

for (x in seq_len(dims[2])) {
  for (y in seq_len(dims[1])) {
    for (ch in seq_len(dims[3])) { #ch is the channel index
      if (counter <= message_length) {

        val <- img[y,x,ch]

        get_code <- substr(val,5,5) #extract the third decimal digit

        code <- c(code, get_code)

        counter <- counter+1

      }
    }
  }
}

The recovered code is identical to our starting message:

paste(v.split, collapse = "") == paste(code, collapse = "")

## [1] TRUE
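
For longer messages, the three nested loops can be replaced by a vectorised version. The sketch below is not how StegosauR itself is implemented: the helper names `embed_digits()` and `extract_digits()` are made up for illustration, and a synthetic array stands in for the image. It relies on `aperm()` to reorder the array so that its linear order matches the loop order used above (channel fastest, then row, then column).

```r
# Hypothetical helper: write the digits into the third decimal of the first values
embed_digits <- function(img, digits) {
  p <- aperm(img, c(3, 1, 2))                  # dims are now channel, row, column
  idx <- seq_along(digits)
  val_head <- substr(formatC(p[idx], format = "f", digits = 7), 1, 4)
  p[idx] <- as.numeric(paste0(val_head, digits))
  aperm(p, c(2, 3, 1))                         # back to row, column, channel
}

# Hypothetical helper: read the third decimal of the first n values
extract_digits <- function(img, n) {
  p <- aperm(img, c(3, 1, 2))
  substr(as.character(p[seq_len(n)]), 5, 5)
}

# synthetic 4x5 image filled with the same value as the top left pixel
toy_img <- array(254 / 255, dim = c(4, 5, 3))
digits  <- strsplit("2345234", "")[[1]]

out <- embed_digits(toy_img, digits)
out[1, 1, ]                                    # first three digits sit in pixel [1,1]
identical(extract_digits(out, length(digits)), digits)
```

As with the loop, this only works for code numbers that are not 0, since a trailing zero would be dropped when the value is converted back to text.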

Note that it is necessary to know exactly how long the message is, otherwise the loop would run across every pixel without knowing where to stop. For this reason, StegosauR automatically adds information on message length to the message itself. The number of characters in the message is converted into the usual twelve-digit format and prepended to the rest of the code. When decoding, StegosauR first reads the first 12 grid cells, extracts the total message length, and then loops across a known number of positions.
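
The idea of a length header can be sketched on plain digit vectors. Everything here is an assumption made for illustration: the helpers `add_length_header()` and `read_with_header()` are hypothetical, the header is a zero-padded decimal count of code numbers, and StegosauR's actual twelve-digit format (described in part 1) counts characters and may be encoded differently.

```r
# Hypothetical helper: prepend a fixed-width header holding the number of digits
add_length_header <- function(digits, width = 12) {
  header <- formatC(length(digits), width = width, flag = "0", format = "d")
  c(strsplit(header, "")[[1]], digits)
}

# Hypothetical helper: read the header first, then exactly that many digits
read_with_header <- function(stream, width = 12) {
  n <- as.integer(paste(stream[seq_len(width)], collapse = ""))
  stream[width + seq_len(n)]
}

msg <- strsplit("2222333222223442", "")[[1]]

# the trailing values stand in for pixels that were never overwritten
stream <- c(add_length_header(msg), rep("5", 20))

recovered <- read_with_header(stream)
identical(recovered, msg)
```

The reader never needs to know the message length in advance; it only needs to know the fixed width of the header.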

Even when TIFF files are saved without compression, there is always a bit of rounding going on. The message is encoded in the third decimal position because there it is relatively safe from image compression, at least judging from my tests. For the moment, StegosauR has been tested with 16-bit TIFFs. Using a 32-bit format (maybe in future versions) should allow storing more data, although at the cost of a larger file size.
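
One plausible source of this rounding is the conversion of 0-1 values to integer samples when the file is written. The sketch below simulates that step under two assumptions that do not come from StegosauR itself: values are scaled to 2^bits - 1 levels and rounded, and the digit is read back by rounding to three decimals rather than by cutting the text. With 16 bits the error is at most about 0.0000076, far below 0.0005, so the third decimal survives; with 8 bits the error can reach about 0.002 and the digit can change.

```r
# simulate storing a 0-1 value as an integer sample with the given bit depth
quantize <- function(v, bits) {
  levels <- 2^bits - 1
  round(v * levels) / levels
}

# third decimal digit after rounding to three decimals
third_decimal <- function(v) round(v * 1000) %% 10

v <- 0.995
third_decimal(quantize(v, 16))   # 5, the digit survives
third_decimal(quantize(v, 8))    # 6, the digit is lost

# every three-decimal value between 0.002 and 0.995 survives 16-bit storage
vals <- (2:995) / 1000
all(third_decimal(quantize(vals, 16)) == third_decimal(vals))   # TRUE
```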


last updated September 2019