The current version of StegosauR, v.0.1.0, was developed
under R 3.5.1. To operate, it requires a series of additional R packages
(see installation instructions on GitHub). Reliability
tests were performed with the following versions:
## [1] "jpeg 0.1-8" "tiff 0.1-5" "png 0.1-7" "Unicode 12.0.0-1"
## [5] "dplyr 0.8.3" "openssl 1.4.1" "magrittr 1.5"
So far, I have tested StegosauR on four different
systems: three running Windows and one running Ubuntu. I ran my tests
using a completely white 100x100 pixel .jpg file. Having a completely
white image helps to highlight how the file looks after a message is
encrypted. My test message was a randomly generated Lorem Ipsum text,
444 characters long. All machines were able to encode and decode the
message correctly. The most recent Windows computer (number 1 in tab. 1)
took about 2.5 seconds to accomplish each task, while the other two
Windows machines took about 5 seconds. The Ubuntu machine is a
10-year-old Samsung netbook. It took a bit longer than the others to run
StegosauR, but it did its job flawlessly nonetheless. A file encoded
with one machine could be decoded correctly by all the other three.
machine | OS version | R version | RAM | CPU | encryption time (s) | decryption time (s) |
---|---|---|---|---|---|---|
1 | Windows 10 Pro (x64) | 3.5.0 (x64) | 8GB | Intel i5-6500 3.2GHz | 2.5 | 2.4 |
2 | Windows 10 Home (x64) | 3.5.1 (x64) | 16GB | AMD FX-8320 3.5GHz | 4.8 | 4.9 |
3 | Windows 7 Enterprise (x64) | 3.4.2 (x64) | 8GB | Intel i5-2520M 2.5GHz | 5.2 | 5.1 |
4 | Ubuntu 18.04 LTS (i686) | 3.4.4 (i686) | 2GB | Intel Atom N270 1.6GHz | 32.7 | 34.9 |
For reference, storing a 5000-word piece of text (33000+ characters) in a 1000x1000 pixel image took 230 seconds with machine number 1.
I also ran some additional tests to understand how many characters
can be stored in a given image. These tests were performed on a series
of .jpg files ranging from 50x50 to 1000x1000 pixels. Every image is in
RGB format, so it has three color channels. The maximum number of
available places to encode a message is given by image width x height x
number of channels. This means that a 100x100 pixel RGB image has a
total of 30000 potential encoding slots. As shown in previous sections,
StegosauR needs 12 slots to encode every single character
(spaces, new line separators, etc. included), plus 12 additional slots
to store the information on message length. This means that our 100x100
image could theoretically store 2499 characters (the original 30000,
minus the necessary 12 for message length, all divided by 12).
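The arithmetic above can be written as a small helper. This is only a minimal sketch: the function name theoretical_capacity and its arguments are made up for illustration and are not part of StegosauR. Only the constants (12 slots per character, 12 slots for the message length) come from the text.

```r
# Theoretical number of characters that fit in an image.
# Hypothetical helper, not a StegosauR function.
theoretical_capacity <- function(width, height, channels = 3,
                                 slots_per_char = 12, length_slots = 12) {
  total_slots <- width * height * channels   # one slot per pixel and channel
  # reserve the slots that store the message length, then count characters
  (total_slots - length_slots) %/% slots_per_char
}

theoretical_capacity(100, 100)
## [1] 2499
theoretical_capacity(c(50, 1000), c(50, 1000))
## [1]    624 249999
```

The function is vectorised, so a whole series of image sizes can be checked in one call.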
Still, it is not really possible to fit a 2499-character text into a
100x100 pixel image. The pseudo-random generator employed by
StegosauR has its limits, and it would require an
impractical amount of computing time to generate all the unique
coordinate sets within an image (i.e. unique combinations of width,
height and channel). For this reason, the maximum message length for a
100x100 image is actually in the range of 600 characters. Future
versions of StegosauR might include an option to sacrifice
speed in order to get more storage space, but this would add yet another
parameter that needs to be communicated to the decoding user (besides the
passwords and the image itself), reducing the overall ease of use of
this application.
The pseudo-random generator produces a unique sequence of numbers based on the user-defined passwords (secret and salt, if any) and on message length. Different combinations of these parameters can therefore yield longer or shorter sequences of unique coordinate sets and, as a result, more or fewer valid slots for character encoding. Given all these variables, the storage test proceeded as follows: I attempted to store increasingly long character sequences within each test image. Each character sequence was tested with 25 password combinations (5 different secrets and 5 different salts).
The 5 different secrets and salts include the StegosauR
default settings (secret=“StegosauR password”, salt=NA), as well as
single words, multiple random words, single characters and sequences of
random characters.
# different secrets and salts used in the storage test
secret <- c("StegosauR password", "secret", "!", ";>Tr8b]v{uD-!bF+", "multiply short post stick throw")
salt <- c(NA, "salt", "!", ";>Tr8b]v{uD-!bF+", "multiply short post stick throw")
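One way to enumerate the 25 password combinations is base R's expand.grid(). The sketch below only builds the table of secret/salt pairs. How each pair was then passed to StegosauR is not shown, and the variable name combos is an assumption made for illustration.

```r
# Secrets and salts as listed in the storage test
secret <- c("StegosauR password", "secret", "!", ";>Tr8b]v{uD-!bF+",
            "multiply short post stick throw")
salt <- c(NA, "salt", "!", ";>Tr8b]v{uD-!bF+",
          "multiply short post stick throw")

# Pair every secret with every salt: 5 x 5 = 25 rows
combos <- expand.grid(secret = secret, salt = salt,
                      stringsAsFactors = FALSE)
nrow(combos)
## [1] 25

# The five rows with salt = NA correspond to "no salt"
sum(is.na(combos$salt))
## [1] 5
```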
Text storage was tested at increments of 10 characters for 50x50 and 100x100 pixel images, 100 characters for 200x200 and 300x300 images, 500 characters for 400x400 and 500x500 images, 1000 characters for 600x600 and 700x700 images, 2000 for the 800x800 image, 2500 for the 900x900 image and finally 5000 for the 1000x1000 image. These different steps are needed because testing large images with small increments would probably have taken a few years of uninterrupted computation. Here are the results.
image size (pixels) | max. theoretical capacity (characters) | min. tested capacity | max. tested capacity | average tested capacity | average efficiency (avg. tested capacity vs. max. theoretical capacity) |
---|---|---|---|---|---|
50x50 | 624 | 70 | 260 | 140 | 22% |
100x100 | 2499 | 240 | 1130 | 590 | 23% |
200x200 | 9999 | 500 | 4300 | 2080 | 20% |
300x300 | 22499 | 1000 | 7400 | 4300 | 19% |
400x400 | 39999 | 2500 | 13000 | 6960 | 17% |
500x500 | 62499 | 3500 | 19500 | 10240 | 16% |
600x600 | 89999 | 4000 | 27000 | 13680 | 15% |
700x700 | 122499 | 5000 | 36000 | 17800 | 14% |
800x800 | 159999 | 6000 | 44000 | 21840 | 13% |
900x900 | 202499 | 5000 | 55000 | 25500 | 12% |
1000x1000 | 249999 | 5000 | 65000 | 29000 | 11% |
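The efficiency column can be recomputed from the theoretical and average capacities. The sketch below hard-codes the values from the table, and the variable names are only illustrative. The published percentages match the ratio truncated to a whole number with floor(). Using round() instead would give slightly higher values in some rows (for example 24% instead of 23% for the 100x100 image).

```r
# Values copied from the table above
theoretical <- c(624, 2499, 9999, 22499, 39999, 62499,
                 89999, 122499, 159999, 202499, 249999)
average     <- c(140, 590, 2080, 4300, 6960, 10240,
                 13680, 17800, 21840, 25500, 29000)

# Efficiency in percent, truncated to a whole number
efficiency <- floor(100 * average / theoretical)
efficiency
## [1] 22 23 20 19 17 16 15 14 13 12 11
```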
And here are the same values in graphical format.
These tests were carried out with machine number 2 in table 1. Processing time increases linearly with message length.
So, well, StegosauR is not a terribly efficient
application. I am sure that the random number generator could be tweaked
to produce a higher number of unique combinations, but at this stage I
am quite happy with the overall results.
As illustrated in the previous sections, StegosauR
encodes messages by slightly altering the color of multiple pixels
within a digital image. But is this color change visible? Hiding some
text in a perfectly white image does result in some visible artifacts.
In this .tiff image, some pixels are noticeably darker
than others. Before encryption, the image was completely white, so it’s
easy to spot which pixels contain hidden information. Below is an
attempt at showing this coloring effect using a screenshot of the
original .tiff. The arrows point at some of the slightly darker pixels
(admittedly, it is much clearer on the zoomed-in original .tiff).
Using any non-monochromatic image, like a normal photograph, helps to hide these anomalous pixels among the rest of the noise. Probably, a dedicated piece of software would still be able to detect weird noise patterns and separate images that contain hidden messages from those that don’t, especially in the case of very long messages that alter a high number of pixels.
StegosauR offers an option to include random noise
across all pixels prior to encryption. It is just necessary to add
noise = TRUE while using encosaur(). This
functionality is quite aggressive though, and can cause some very
visible artifacts. It can also slow down the encoding process quite a
lot since it cycles through all pixels and all channels.
In theory, StegosauR can process any Unicode character.
In practice, it depends on the capabilities of the underlying package
Unicode. Let’s take “Ä” as an example. This character can
be converted into its corresponding Unicode value and back without any
issue.
library(Unicode)
char <- "Ä"
# convert to UTF-8 (enc2utf8() also works when the encoding is marked "unknown")
char.utf8 <- enc2utf8(char)
# convert to Unicode
as.u_char(utf8ToInt(char.utf8))
## [1] U+00C4
# back to the original character
intToUtf8(as.u_char(utf8ToInt(char.utf8)))
## [1] "Ä"
Other characters can be more problematic. The typographic apostrophe “’” is one example.
library(Unicode)
char <- "’"
# convert to UTF-8 (enc2utf8() also works when the encoding is marked "unknown")
char.utf8 <- enc2utf8(char)
# convert to Unicode
as.u_char(utf8ToInt(char.utf8))
## [1] U+2019
# back to the original character
intToUtf8(as.u_char(utf8ToInt(char.utf8)))
## [1] "’"
In this situation, it is just better to remove the problematic characters or replace them with similar-looking ones.
char <- "’"
#replace with gsub
gsub("’", "'", char)
## [1] "'"
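Before encoding a longer text, it can help to list which characters might need replacing. The following sketch uses base R only. The helper name find_non_ascii is hypothetical and not part of StegosauR or the Unicode package, and treating every non-ASCII character as a candidate is an assumption: many of them (like “Ä”) convert without any issue.

```r
# List the distinct non-ASCII characters in a message.
# Hypothetical helper, not part of StegosauR or the Unicode package.
find_non_ascii <- function(msg) {
  chars <- strsplit(enc2utf8(msg), "")[[1]]   # one element per character
  codes <- vapply(chars, utf8ToInt, integer(1), USE.NAMES = FALSE)
  unique(chars[codes > 127L])                 # code points above the ASCII range
}

# \u escapes keep this script itself ASCII-only
msg <- "It\u2019s a caf\u00e9"
find_non_ascii(msg)

# Replace the typographic apostrophe with a plain one, then check again
cleaned <- gsub("\u2019", "'", msg, fixed = TRUE)
find_non_ascii(cleaned)
```

The first call lists the typographic apostrophe and the “é”. After the replacement only the “é” remains, and it is up to the user to decide whether it needs attention.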
All physical and digital devices able to carry information should
always be treated with due caution. Do not open suspicious email
attachments. Do not plug USB drives found on the ground into your
computer. Images containing hidden messages are no exception. In the
unlikely event that you receive an image encoded with StegosauR with
instructions on how to decode it, don’t do it unless you trust the source.
It is possible to hide R code (or any code, probably) in
an image using StegosauR, although so far I have not found a
way to execute that code immediately upon decryption. External help,
such as running decosaur() inside an eval(parse()) sequence, seems to
be needed to achieve this purpose. I can’t exclude that more
knowledgeable people might still find a way to skip this step, though.
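The difference between decoding text and executing it can be illustrated with base R alone. In the sketch below, the variable decoded is a stand-in for whatever a decoding step returns. It is assumed here to be a plain character string, and no StegosauR function is called.

```r
# Stand-in for decoded text (assumption: decoding yields a character string)
decoded <- "1 + 1"

# At this point the message is just text; nothing has been executed
class(decoded)
## [1] "character"

# The code runs only when it is explicitly parsed and evaluated
result <- eval(parse(text = decoded))
result
## [1] 2
```

This is why the advice above matters mostly for the explicit eval(parse()) step: never evaluate decoded text from a source you do not trust.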