Watermarking images on Django sites

Techniques to generate visible and invisible watermarks using Pillow and django-imagekit

Posted by Agustín Bartó 4 months ago

Have you ever noticed how stock photography sites add watermarks to the images shown in their catalogs? They do that to make sure people don’t just take the free samples and use them without proper licensing. Google Maps does it for its satellite imagery as well. It turns out this is pretty easy to do, and we’ll show you how to do it for Django sites using Pillow and django-imagekit.

All the code is available on GitHub, and a Vagrantfile is provided if you want to try the live Django demo for yourself.

Watermarking images

First we’ll show you three simple watermarking techniques that can be used in any Python project that works with images, and then we’ll use what we’ve built to add watermarks to images on a sample Django site.

Text overlay

The first thing we’ll do is add a semi-transparent text legend in the center of an image. We’ll use Pillow’s Image, ImageDraw and ImageFont modules.

The following function adds a text overlay in the center of the supplied image:

# processors.py

from PIL import Image, ImageDraw, ImageFont


_default_font = ImageFont.truetype('/usr/share/fonts/dejavu/DejaVuSans-Bold.ttf', 24)


def add_text_overlay(image, text, font=_default_font):
    rgba_image = image.convert('RGBA')
    text_overlay = Image.new('RGBA', rgba_image.size, (255, 255, 255, 0))
    image_draw = ImageDraw.Draw(text_overlay)
    # ImageDraw.textsize was removed in Pillow 10; measure the text with textbbox instead.
    left, top, right, bottom = image_draw.textbbox((0, 0), text, font=font)
    text_xy = ((rgba_image.size[0] - (right - left)) / 2 - left, (rgba_image.size[1] - (bottom - top)) / 2 - top)
    image_draw.text(text_xy, text, font=font, fill=(255, 255, 255, 128))
    image_with_text_overlay = Image.alpha_composite(rgba_image, text_overlay)

    return image_with_text_overlay

We create a new, fully transparent Image with the same dimensions as the original image, and draw the text on it. To do that we create an ImageDraw.Draw instance and call ImageDraw.Draw.text (we also use ImageDraw.Draw.textbbox to measure the text, which we need in order to know where to place it).

We also need an ImageFont instance that can be created from an existing TrueType font (using ImageFont.truetype).

Finally we use Image.alpha_composite to combine the original image (converted to RGBA mode) and the text overlay.
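The centering arithmetic can be looked at in isolation. The following is only a sketch, assuming the text has been measured as a (left, top, right, bottom) bounding box anchored at (0, 0), which is what ImageDraw.textbbox returns in recent Pillow versions. The helper name centered_origin is ours and is not part of Pillow or of the demo project.

```python
def centered_origin(image_size, text_bbox):
    """Return the (x, y) at which to draw text so that it ends up centered.

    image_size is (width, height); text_bbox is (left, top, right, bottom),
    measured with the text anchored at (0, 0).
    """
    image_width, image_height = image_size
    left, top, right, bottom = text_bbox
    text_width, text_height = right - left, bottom - top
    # Center the box, then compensate for the offset of the box itself.
    x = (image_width - text_width) / 2 - left
    y = (image_height - text_height) / 2 - top
    return x, y


# Hypothetical numbers: an 800x600 image and a 200x24 text box that starts 5px down.
print(centered_origin((800, 600), (0, 5, 200, 29)))  # (300.0, 283.0)
```

Pillow accepts float coordinates when drawing text, so there is no need to round the result.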

/static/media/uploads/watermarking-images-django/screenshot-text-overlay.jpg

Watermark

The next technique is superimposing a semi-transparent watermark on top of the image. The code is similar to the one used for text overlays:

# processors.py

from PIL import Image, ImageDraw


def add_watermark(image, watermark):
    rgba_image = image.convert('RGBA')
    rgba_watermark = watermark.convert('RGBA')

    image_x, image_y = rgba_image.size
    watermark_x, watermark_y = rgba_watermark.size

    watermark_scale = max(image_x / (2.0 * watermark_x), image_y / (2.0 * watermark_y))
    new_size = (int(watermark_x * watermark_scale), int(watermark_y * watermark_scale))
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
    rgba_watermark = rgba_watermark.resize(new_size, resample=Image.LANCZOS)

    rgba_watermark_mask = rgba_watermark.convert("L").point(lambda x: min(x, 25))
    rgba_watermark.putalpha(rgba_watermark_mask)

    watermark_x, watermark_y = rgba_watermark.size
    rgba_image.paste(rgba_watermark, ((image_x - watermark_x) // 2, (image_y - watermark_y) // 2), rgba_watermark_mask)

    return rgba_image

Before we can combine the original image with the watermark we have to scale the watermark relative to the image. We compute a scale factor that makes the watermark span at least half of the image’s width and half of its height, and pass the resulting size to Image.resize.
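To make the scaling concrete, here is a minimal sketch of the size computation on plain tuples, without touching any image. The function name scaled_watermark_size and its pick parameter are hypothetical; pick only exists so we can compare max (what add_watermark uses) with min.

```python
def scaled_watermark_size(image_size, watermark_size, pick=max):
    """Compute the resized watermark dimensions the way add_watermark does."""
    image_x, image_y = image_size
    watermark_x, watermark_y = watermark_size
    # One candidate scale per axis: each makes the watermark half as wide
    # (or half as tall) as the image.
    scale = pick(image_x / (2.0 * watermark_x), image_y / (2.0 * watermark_y))
    return int(watermark_x * scale), int(watermark_y * scale)


# Same aspect ratio: both candidates agree.
print(scaled_watermark_size((1000, 500), (200, 100)))       # (500, 250)
# Square watermark on a wide image: max fills the whole height.
print(scaled_watermark_size((1000, 500), (100, 100)))       # (500, 500)
# min keeps the watermark within half of both dimensions.
print(scaled_watermark_size((1000, 500), (100, 100), min))  # (250, 250)
```

With max, a watermark whose aspect ratio differs a lot from the image’s can end up larger than the image in one dimension, in which case the paste simply crops it. If that is not what you want, min is one possible alternative.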

We then create a mask from the resized watermark: we convert it to grayscale with Image.convert, cap every pixel value at 25 with Image.point (so the watermark ends up at most about 10% opaque), and finally replace the alpha channel of the watermark image with the mask using Image.putalpha.

The last step is to combine the original image with the watermark with Image.paste.
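As a rough illustration of what the mask does during the paste, the sketch below blends single channel values by hand. It assumes the usual linear mix (mask 0 keeps the background, mask 255 takes the pasted pixel); Pillow’s exact rounding may differ slightly, and the helper names cap_alpha and blend are ours.

```python
def cap_alpha(gray_value, limit=25):
    """Mirror the point(lambda x: min(x, 25)) step used to build the mask."""
    return min(gray_value, limit)


def blend(background, foreground, alpha):
    """Mix two 0-255 channel values according to a 0-255 mask value."""
    return (foreground * alpha + background * (255 - alpha)) // 255


# A pure white watermark pixel gets a mask value of 25, roughly 10% opacity.
alpha = cap_alpha(255)
print(alpha)                   # 25
print(blend(100, 255, alpha))  # 115: the background is only slightly brightened
# A black watermark pixel gets a mask value of 0 and leaves the image untouched.
print(blend(100, 0, cap_alpha(0)))  # 100
```

This is why the watermark shows up as a faint highlight: bright areas of the watermark lighten the image a little, and dark areas do nothing at all.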

/static/media/uploads/watermarking-images-django/screenshot-watermark.jpg

Invisible Watermark

The last technique we’ll show is encoding arbitrary data into the image without affecting its appearance significantly (a form of steganography). We’ll do this by storing the individual bits of our data package in the least significant bit (LSB) of each pixel’s red channel. It is fairly rudimentary, and easy to detect and defeat, but it’ll give you an idea of what we can do with Pillow.

# processors.py

import numpy as np

from io import BytesIO
from pickle import dump

from PIL import Image, ImageMath


def lsb_encode(data, image):
    bytes_io = BytesIO()
    dump(data, file=bytes_io)
    data_bytes = bytes_io.getvalue()
    data_bytes_array = np.fromiter(data_bytes, dtype=np.uint8)
    data_bits_list = np.unpackbits(data_bytes_array).tolist()
    data_bits_list += [0] * (image.size[0] * image.size[1] - len(data_bits_list))
    watermark = Image.frombytes(data=bytes(data_bits_list), size=image.size, mode='L')
    red, green, blue = image.split()
    watermarked_red = ImageMath.eval("convert(a&0xFE|b&0x1,'L')", a=red, b=watermark)
    watermarked_image = Image.merge("RGB", (watermarked_red, green, blue))
    return watermarked_image

Before we can do anything with the image, we need to convert the data package to bits. We do this by pickling the data into a BytesIO, turning the resulting bytes into bits with NumPy’s fromiter and unpackbits, and finally padding the result with zeros so it matches the number of pixels in the original image.
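For readers who want to see the conversion without NumPy, here is a pure-Python sketch of the same idea. It emits the most significant bit of each byte first, which is also the default order of np.unpackbits. The function name data_to_padded_bits and the explicit capacity check are our additions; the original function does not check whether the payload fits.

```python
from io import BytesIO
from pickle import dump


def data_to_padded_bits(data, capacity):
    """Pickle data and return its bits, zero-padded to exactly capacity entries."""
    bytes_io = BytesIO()
    dump(data, bytes_io)
    bits = []
    for byte in bytes_io.getvalue():
        # Most significant bit first.
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    if len(bits) > capacity:
        raise ValueError(
            'payload needs %d bits but only %d pixels are available' % (len(bits), capacity)
        )
    return bits + [0] * (capacity - len(bits))


# A 32x32 image has room for 1024 bits, one per pixel.
bits = data_to_padded_bits('hi', capacity=32 * 32)
print(len(bits))  # 1024
```

Since each pixel carries a single bit, the payload is limited to width * height / 8 bytes, pickle overhead included.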

We could have used struct instead of pickle, but the latter is a lot easier to use.

We then create a grayscale image using Image.frombytes and a bytes instance from the bits array we got before.

Finally we split the original image into its red, green and blue channels, replace the least significant bit of each byte of the red channel with the corresponding bit from our watermark using PIL.ImageMath.eval, and rebuild the image with Image.merge from the new red channel and the untouched green and blue ones.
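The bit manipulation itself is tiny. As a sketch, here is the same a&0xFE|b&0x1 expression applied to a plain list of red channel values instead of a Pillow image (the name embed_bits is ours):

```python
def embed_bits(channel_values, bits):
    """Replace the least significant bit of each value with the matching bit."""
    return [(value & 0xFE) | (bit & 0x1) for value, bit in zip(channel_values, bits)]


red = [200, 201, 54, 255]
bits = [1, 0, 1, 0]

watermarked = embed_bits(red, bits)
print(watermarked)                           # [201, 200, 55, 254]
print([value & 1 for value in watermarked])  # [1, 0, 1, 0]
```

No value changes by more than 1 out of 255, which is why the watermark is invisible to the eye.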

The problem with this approach is that even a minor change to the image will mess with the information stored in it. There are more robust techniques we could have used, but the LSB is one of the easiest to understand and implement.
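One everyday example of such a change is saving the image in a lossy format. The sketch below, which assumes Pillow is installed and uses a synthetic gradient in place of a real watermarked image, saves the same pixels as PNG and as JPEG and checks whether they come back unchanged. The roundtrip helper is ours.

```python
from io import BytesIO

from PIL import Image


def roundtrip(image, image_format):
    """Save the image to memory in the given format and load it back."""
    buffer = BytesIO()
    image.save(buffer, format=image_format)
    buffer.seek(0)
    reloaded = Image.open(buffer)
    reloaded.load()  # decode the pixel data right away
    return reloaded


# A synthetic 64x64 RGB gradient standing in for a watermarked image.
pixels = bytes(
    value
    for y in range(64)
    for x in range(64)
    for value in (x * 4, y * 4, (x + y) * 2)
)
original = Image.frombytes('RGB', (64, 64), pixels)

png_intact = roundtrip(original, 'PNG').tobytes() == original.tobytes()
jpeg_intact = roundtrip(original, 'JPEG').tobytes() == original.tobytes()
print(png_intact, jpeg_intact)
```

PNG is lossless, so its copy is byte-for-byte identical. JPEG compression generally alters pixel values by small amounts, which is enough to scramble the least significant bits.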

Assuming the image remains intact (which requires the use of a lossless format), how do we get the data out of it?

# processors.py

import numpy as np

from io import BytesIO
from pickle import load, UnpicklingError

from PIL import Image, ImageMath


def lsb_decode(image):
    try:
        red, green, blue = image.split()
        watermark = ImageMath.eval("(a&0x1)*0x01", a=red)
        watermark = watermark.convert('L')
        watermark_bytes = bytes(watermark.getdata())
        watermark_bits_array = np.fromiter(watermark_bytes, dtype=np.uint8)
        watermark_bytes_array = np.packbits(watermark_bits_array)
        watermark_bytes = bytes(watermark_bytes_array)
        bytes_io = BytesIO(watermark_bytes)
        return load(bytes_io)
    except UnpicklingError:
        return ''

We split the watermarked image into its red, green and blue channels, filter out the original image’s information from the red channel using PIL.ImageMath.eval, and convert the result to grayscale.

We then use NumPy’s fromiter and packbits on the modified red channel’s data to get the bytes of our original data package, and we unpickle the result.
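The padding zeros added during encoding are not a problem when decoding, because unpickling stops at the end of the first pickled object and ignores whatever follows. Here is a pure-Python sketch of the packing step and the round trip; bits_to_bytes is our stand-in for np.packbits and, unlike it, drops an incomplete trailing byte instead of padding it.

```python
from io import BytesIO
from pickle import dumps, load


def bits_to_bytes(bits):
    """Pack a flat list of 0/1 values into bytes, most significant bit first."""
    packed = bytearray()
    # Only whole groups of 8 bits are packed.
    for start in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


# Simulate what comes out of the red channel: payload bits followed by padding.
payload = dumps('django-watermark-images')
bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
bits += [0] * 64

recovered = load(BytesIO(bits_to_bytes(bits)))
print(recovered)  # django-watermark-images
```

Keep in mind that unpickling runs whatever the pickle stream describes, so this approach is only appropriate for images you produced yourself.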

Watermarks in Django

We can apply what we showed you to a Django site using django-imagekit. It provides template tags and model tools to apply arbitrary processing to Django images.

Everything in django-imagekit is done through Specs. Specs apply processors to images to generate new ones. They can be used in templates, or in models using ImageSpecField to generate images dynamically, or ProcessedImageField to save the generated images alongside the model.

We’ll show you how to use the specs with templates and later we’ll mention briefly the other methods.

# processors.py

from django.conf import settings
from imagekit import ImageSpec, register

class TextOverlayProcessor(object):
    font = ImageFont.truetype('/usr/share/fonts/dejavu/DejaVuSans-Bold.ttf', 36)

    def process(self, image):
        return add_text_overlay(image, 'django-watermark-images', font=self.font)


class WatermarkProcessor(object):
    watermark = Image.open(settings.WATERMARK_IMAGE)

    def process(self, image):
        return add_watermark(image, self.watermark)


class HiddenWatermarkProcessor(object):
    def process(self, image):
        return lsb_encode('django-watermark-images', image)


class TextOverlay(ImageSpec):
    processors = [TextOverlayProcessor()]
    format = 'JPEG'
    options = {'quality': 75}


class Watermark(ImageSpec):
    processors = [WatermarkProcessor()]
    format = 'JPEG'
    options = {'quality': 75}


class HiddenWatermark(ImageSpec):
    processors = [HiddenWatermarkProcessor()]
    format = 'PNG'


register.generator('items:text-overlay', TextOverlay)
register.generator('items:watermark', Watermark)
register.generator('items:hidden-watermark', HiddenWatermark)

We’ve integrated the watermarking techniques into processors, which in turn are used by custom ImageSpec implementations. We then register said specs as generators so they can be used with the generateimage template tag:

{% load imagekit %}

...

<div class="row item-row">
    <div class="col-lg-2 item-label">Image</div>
    <div class="col-lg-10"><img class="img-responsive item-image" src="{{ object.image.url }}"></div>
</div>
<div class="row item-row">
    <div class="col-lg-2 item-label">Text Overlay</div>
    <div class="col-lg-10">{% generateimage 'items:text-overlay' source=object.image -- class="img-responsive item-image" %}</div>
</div>
<div class="row item-row">
    <div class="col-lg-2 item-label">Watermark</div>
    <div class="col-lg-10">{% generateimage 'items:watermark' source=object.image -- class="img-responsive item-image" %}</div>
</div>
<div class="row item-row">
    <div class="col-lg-2 item-label">Hidden Watermark</div>
    <div class="col-lg-10">{% generateimage 'items:hidden-watermark' source=object.image -- class="img-responsive item-image" %}</div>
</div>

The snippet above is part of a template to display the following model:

# models.py

import uuid

from django.db import models
from django.urls import reverse_lazy
# ugettext_lazy was removed in Django 4.0; gettext_lazy is the equivalent on Python 3.
from django.utils.translation import gettext_lazy as _

from django_extensions.db.models import TitleDescriptionModel, TimeStampedModel


def image_upload_to(instance, filename):
    return 'original_image/{uuid}/{filename}'.format(uuid=uuid.uuid4().hex, filename=filename)


class Item(TitleDescriptionModel, TimeStampedModel):
    image = models.ImageField(_('original image'), upload_to=image_upload_to)

That’s pretty much all you need. If you want to use an ImageSpecField or a ProcessedImageField instead of generating the images in the template, you need to change the model:

# models.py

from imagekit.models import ImageSpecField

from .processors import TextOverlayProcessor, WatermarkProcessor, HiddenWatermarkProcessor


class Item(TitleDescriptionModel, TimeStampedModel):
    image = models.ImageField(_('original image'))
    # The processors argument expects processor instances, not ImageSpec subclasses.
    text_overlay_image = ImageSpecField(source='image', processors=[TextOverlayProcessor()], format='JPEG')
    watermark_image = ImageSpecField(source='image', processors=[WatermarkProcessor()], format='JPEG')
    hidden_watermark_image = ImageSpecField(source='image', processors=[HiddenWatermarkProcessor()], format='PNG')

and later reference the new fields in templates:

<div class="row item-row">
    <div class="col-lg-2 item-label">Image</div>
    <div class="col-lg-10"><img class="img-responsive item-image" src="{{ object.image.url }}"></div>
</div>
<div class="row item-row">
    <div class="col-lg-2 item-label">Text Overlay</div>
    <div class="col-lg-10"><img class="img-responsive item-image" src="{{ object.text_overlay_image.url }}"></div>
</div>
<div class="row item-row">
    <div class="col-lg-2 item-label">Watermark</div>
    <div class="col-lg-10"><img class="img-responsive item-image" src="{{ object.watermark_image.url }}"></div>
</div>
<div class="row item-row">
    <div class="col-lg-2 item-label">Hidden Watermark</div>
    <div class="col-lg-10"><img class="img-responsive item-image" src="{{ object.hidden_watermark_image.url }}"></div>
</div>

Conclusions

As you can see, all we needed was a little bit of knowledge of Pillow’s API, and the rest of the work was handled for us by django-imagekit. There are a lot of other things these two libraries can do for us, like generating thumbnails or changing formats.

One thing you should notice is that all these methods work on images during a request and, depending on the complexity of the task, might introduce an unacceptable delay in giving the response back to the user. We usually offload this work (generally using Celery) and respond to the client as soon as possible.
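As a pointer only, django-imagekit can hand image generation over to a task queue through its cache file backend setting. The fragment below is a sketch of what that might look like in settings.py; it assumes a working Celery setup and that the installed django-imagekit version ships a Celery backend under this dotted path, so check the library’s caching documentation before relying on it.

```python
# settings.py (fragment)

# Assumption: django-imagekit provides a Celery-based cache file backend at
# this dotted path. Verify it against the documentation of your version.
IMAGEKIT_DEFAULT_CACHEFILE_BACKEND = 'imagekit.cachefiles.backends.Celery'
```

With an asynchronous backend the request no longer waits for the watermark to be generated, so templates need to cope with images that do not exist yet.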

We’ve only shown you a dead simple steganography method. If you want to see more advanced techniques, check out Stéganô.

Feedback

As usual, comments, suggestions and pull requests are more than welcome.

