waterdown

Detecting, reconstructing, and masking image watermarks with numpy

Requirements:

  • numpy
  • matplotlib
  • imageio
  • scipy
  • cv2 (I'm using OpenCV 4.0)
  • skimage

Project goals

  • Use numpy to estimate the original watermark (and produce output from which to load it for future use)
  • Inpaint/offset the watermark region so as to unmark the image
  • Calculate the alpha opacity of the watermark
    • 24/255, or around 9.4%
  • Apply to a gif (of a different size to the still images)

Demo

  • Part 1 of this demo explains how watermarks are estimated (extracted) from images
  • Part 2 uses the estimated watermark [image gradient] to detect the watermark in new images

1: Extracting watermarks

So far all I've done is obtain some still/animated images (from Kirby Of The Stars!) and focus on the watermark region in question.

The file kirby003_01a.png can be used to extract the binary watermark, since it falls in a region of black screen fill.

img = read_image('kirby003_01a.png')
watered = img[6:20, 9:109]
gr_wm = rgb2grey(watered)

The values in gr_wm are a greyscale equivalent to the RGB(A) values given by imageio.imread (i.e., the watermark is white with low opacity, so there's no point representing it as 3 colours).

An example value can be shown to be simply a decimal interpretation of RGB:

  • watered[10,10] → array([24, 24, 24, 255], dtype=uint8)
  • gr_wm[10,10] → 0.09411764705882353
  • 24/255 → 0.09411764705882353
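The rgb2grey helper isn't reproduced here; a minimal sketch of what it's assumed to do (drop any alpha channel and scale uint8 RGB down to [0, 1]; for a pure grey watermark with R == G == B the channel weighting is irrelevant, so a plain mean suffices):

```python
import numpy as np

def rgb2grey(img):
    # Drop any alpha channel, scale uint8 RGB to [0, 1],
    # then average the channels to a single grey value
    rgb = img[..., :3].astype(float) / 255
    return rgb.mean(axis=-1)

# A 1x1 RGBA image matching the watered[10,10] value above
pixel = np.array([[[24, 24, 24, 255]]], dtype=np.uint8)
print(rgb2grey(pixel)[0, 0])  # 0.09411764705882353
```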

Watermark removal should then just be a matter of offsetting the value stored above in the variable watered...

normed = gr_wm * (1/np.max(gr_wm))
plt.imshow(normed, cmap=plt.get_cmap('gray'))
plt.show()
# fig = plt.figure(figsize=(6,2))
# plt.imshow(normed, cmap=plt.get_cmap('gray'))
# fig.savefig('../img/doc/wm_greyscale.png')

Then for greater accuracy, do it twice more:

img2 = read_image('kirby003_01b.png')
img3 = read_image('kirby003_01c.png')
gr_wm2 = rgb2grey(img2[6:20, 9:109])
gr_wm3 = rgb2grey(img3[6:20, 9:109])
normed2 = gr_wm2 * (1/np.max(gr_wm2))
normed3 = gr_wm3 * (1/np.max(gr_wm3))
# assert np.min(normed2) == np.min(normed3) == 0
# assert np.max(normed2) == np.max(normed3) == 1

fig = plt.figure(figsize=(6, 4))
fig.add_subplot(3,1,1)
plt.imshow(normed, cmap=plt.get_cmap('gray'))
fig.add_subplot(3,1,2)
plt.imshow(normed2, cmap=plt.get_cmap('gray'))
fig.add_subplot(3,1,3)
plt.imshow(normed3, cmap=plt.get_cmap('gray'))
fig.savefig('../img/doc/multi_wm_greyscale.png')

...and as animated GIF:

fig = plt.figure(figsize=(6,2))
plt.imshow(normed2, cmap=plt.get_cmap('gray'))
fig.savefig('../img/doc/wm2_greyscale.png')

fig = plt.figure(figsize=(6,2))
plt.imshow(normed3, cmap=plt.get_cmap('gray'))
fig.savefig('../img/doc/wm3_greyscale.png')

from subprocess import call
call(['convert', '../img/doc/wm*_grey*.png', '../img/doc/wm_greyscale.gif'])

...for good measure, do the same for a couple of images with dark grey (non-black) backgrounds (kirby003_03a.png and kirby003_03b.png), along with a black-background fade-out screen (kirby003_04.png).

img4 = read_image('kirby003_03a.png')
img5 = read_image('kirby003_03b.png')
img6 = read_image('kirby003_04.png')
gr_wm4 = rgb2grey(img4[6:20, 9:109])
gr_wm5 = rgb2grey(img5[6:20, 9:109])
gr_wm6 = rgb2grey(img6[6:20, 9:109])
prenorm4 = gr_wm4 - (np.min(gr_wm4) * 1.4)
prenorm5 = gr_wm5 - (np.min(gr_wm5) * 1.4)
# Increase the background minimisation by a factor of 40%,
# clipping any values that dip below zero (a_max=None as not needed)
# Otherwise 4 and 5 end up with a light grey watermark background
normed4 = np.clip(prenorm4, 0, None) * (1/np.max(prenorm4))
normed5 = np.clip(prenorm5, 0, None) * (1/np.max(prenorm5))
normed6 = gr_wm6 * (1/np.max(gr_wm6))
# assert np.min(normed4) == np.min(normed5) == np.min(normed6) == 0
# assert np.max(normed4) == np.max(normed5) == np.max(normed6) == 1

...and an animation with all 6:

call(['convert', '../img/doc/wm*_grey*.png', '../img/doc/wm_greyscale_all.gif'])

However there needs to be a single consensus watermark, using these sampled images. This can then be used across images to mask the watermark, as well as reloaded from a single file.

Google Research published a 2017 CVPR paper on this topic (see the project site), using hundreds of samples (at higher resolution) with excellent results, and went with the median.

  • That paper is really worth reading, and presents this as a 'multi-image matting' optimisation problem.
  • Unlike their paper, I have a blank backgrounded watermarked image, so can avoid the 'chicken and egg' problem of simultaneous watermark estimation and detection (which they resolve by iterated rounds of estimation/detection)
  • Their paper doesn't describe how they get the image gradient (e.g. Sobel vs. Scharr derivative). I opt to convolve a [2D] Sobel operator horizontally and vertically, then take the hypotenuse to get the magnitude (as here).
    • Element-wise, the hypotenuse is equal to the square root of squared dx plus squared dy (see numpy.hypot docs/the OpenCV Canny edge detection tutorial for more info)
    • I note that the GR team's method calculates median of the 2 directions independently, then takes the magnitude (rather than taking the median of 2D Sobels per image, i.e. mag = np.hypot(median_dx, median_dy)).
  • Their paper specifies a "0.4 threshold" for the Canny edge detection used to find a bounding box on the watermark, despite the Canny algorithm (to the best of my understanding) taking 2 threshold parameters (min and max)
    • I think this actually refers to the sigma value demonstrated in this blog post, i.e. upper/lower are set at +/- 40% of the median of the single channel pixel intensities (the post notes that 33% is often optimal, which makes 0.4 a reasonable choice).
  • The image gradients are used to obtain the "initial matted watermark" by "Poisson reconstruction", though the paper doesn't say precisely which technique this refers to.
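Under that reading of the "0.4 threshold", the threshold selection can be sketched as follows (canny_thresholds is a hypothetical helper name; the auto_canny helper used later is assumed to pick its two Canny thresholds this way):

```python
import numpy as np

def canny_thresholds(img, sigma=0.4):
    # Set the Canny lower/upper thresholds at -/+ sigma (40%)
    # around the median single-channel pixel intensity,
    # clamped to the valid uint8 range [0, 255]
    med = np.median(img)
    lower = int(max(0, (1.0 - sigma) * med))
    upper = int(min(255, (1.0 + sigma) * med))
    return lower, upper

# e.g. a mid-grey image whose median intensity is 100:
print(canny_thresholds(np.full((10, 10), 100, dtype=np.uint8)))  # (60, 140)
```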

N.B. - get_grads returns a tuple (dx, dy), whereas get_grad (singular) takes their magnitude (the hypotenuse); below, the grads variable is a list of six (dx, dy) tuples, from which independent medians are taken.

imgs = [normed, normed2, normed3, normed4, normed5, normed6]
grads = [get_grads(i) for i in imgs]
med_dx = np.median([m[0] for m in grads], 0)
med_dy = np.median([m[1] for m in grads], 0)
med_mag = np.hypot(med_dx, med_dy)
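The get_grads/get_grad helpers aren't shown here; a minimal sketch under the Sobel assumption described above, using scipy.ndimage (scipy is already in the requirements):

```python
import numpy as np
from scipy.ndimage import sobel

def get_grads(img):
    # Convolve the 2D Sobel operator along each axis
    dx = sobel(img, axis=1)  # horizontal derivative
    dy = sobel(img, axis=0)  # vertical derivative
    return dx, dy

def get_grad(img):
    # Gradient magnitude: element-wise hypotenuse of (dx, dy)
    return np.hypot(*get_grads(img))

# On a horizontal ramp, interior dx is constant and dy is zero
ramp = np.tile(np.arange(5.0), (5, 1))
print(get_grad(ramp)[2, 2])  # 8.0
```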

For now at least, I'm just going to take this as the estimate (since I can't figure out how to do "Poisson reconstruction", and the Google researchers don't cite an implementation to look up).

Saving this estimate in the data directory now; note that read/write modes for pickle load/dump must be 'rb'/'wb', not just 'r'/'w' (the b stands for binary):

import pickle
with open('../data/est_grad_wmark.p', 'wb') as pickle_file:
    pickle.dump(med_mag, pickle_file)
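A quick round-trip sanity check, using an in-memory buffer to show why the stream has to be binary:

```python
import io
import pickle

import numpy as np

# pickle writes bytes, hence the binary 'wb'/'rb' file modes
buf = io.BytesIO()
pickle.dump(np.arange(3.0), buf)
buf.seek(0)
restored = pickle.load(buf)
print(restored)  # [0. 1. 2.]
```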

2: Watermark detection

Following the guidance of Dekel et al. still, the next step is as follows:

Given the current estimate ∇Ŵm, we detect the watermark in each of the images using Chamfer distance commonly used for template matching in object detection and recognition. Specifically, for a given watermarked image, we obtain a verbose edge map (using Canny edge detector), and compute its Euclidean distance transform, which is then convolved with ∇Ŵm (flipped horizontally and vertically) to get the Chamfer distance from each pixel to the closest edge. Lastly, the watermark position is taken to be the pixel with minimum distance in the map. We found this detection method to be very robust, providing high detection rates for diverse watermarks and different opacity levels.

So firstly, we run Canny edge detection (N.B. OpenCV 4.0 was released in late 2018, and OpenCV recently became pip installable, so it's no trouble to pip install it within a virtualenv).

# read in image (with black b/g behind watermark) as greyscale & uint8 dtype
kirby = read_image('../img/kirby003_01a.png', grey=True, uint8=True)
edges = auto_canny(kirby)
save_image(edges, (30,17), '../img/doc/canny_demo.png')

Next compute its Euclidean distance transform and convolve this with the estimated watermark (flipped horizontally and vertically). The watermark position is the pixel with minimum distance in the map.
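A sketch of that detection step (chamfer_detect is a hypothetical helper name; scipy provides both the Euclidean distance transform and an FFT-based convolution):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from scipy.signal import fftconvolve

def chamfer_detect(edge_map, grad_wm):
    # Euclidean distance from each pixel to its nearest edge pixel
    dist = distance_transform_edt(edge_map == 0)
    # Convolving with the flipped template is cross-correlation:
    # each 'valid' placement gets a summed (Chamfer) distance score
    score = fftconvolve(dist, np.flip(grad_wm), mode='valid')
    # The watermark position is the placement with minimum distance
    ij = np.unravel_index(np.argmin(score), score.shape)
    return tuple(int(v) for v in ij)

# e.g. a lone 3x3 cluster of edges is found at its top-left corner:
edges = np.zeros((20, 20))
edges[5:8, 5:8] = 1
print(chamfer_detect(edges, np.ones((3, 3))))  # (5, 5)
```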

  • It was helpfully explained here that this convolution amounts to cross-correlation with the image
  • At first I understood from my reading of the paper that this was supposed to be the edge map of the gradient, not the edge map of the watermarked image:

    We crop ∇Ŵm to remove boundary regions by computing the magnitude of ∇Ŵm(p) and taking the bounding box of its edge map (using Canny with 0.4 threshold).

  • ...but the result of that was clearly wrong, whereas the edge map of the median watermarked image looked about right. Code for this is as follows (I scale up to 255 to avoid the decimal values from 0-1 being rounded down to 0 when casting to uint8):
med_img = pickle.load(open('../data/med_wmark.p', 'rb'))
med_img = med_img * (255 / np.max(med_img))
med_img = np.uint8(med_img) # Must be uint8 for Canny edge detection to run on it
edged_med_img = auto_canny(med_img)

The bounding box is just the min/max along the x and y axes of the non-zero values, which can be found with np.where(img != 0):

bb = bbox(edged_med_img)
show_image(edged_med_img[bb[0]:bb[1]+1, bb[2]:bb[3]+1], True)
show_image(med_mag[bb[0]:bb[1]+1, bb[2]:bb[3]+1], True)
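The bbox helper isn't shown; a minimal sketch of that np.where approach, returning (row_min, row_max, col_min, col_max) inclusive, to match the bb[0]:bb[1]+1, bb[2]:bb[3]+1 slicing above:

```python
import numpy as np

def bbox(img):
    # Bounding box of the non-zero pixels:
    # (row_min, row_max, col_min, col_max), inclusive
    ys, xs = np.where(img != 0)
    return int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max())

demo = np.zeros((10, 10))
demo[2, 3] = demo[5, 7] = 1
print(bbox(demo))  # (2, 5, 3, 7)
```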

However, setting this range to have a green overlay in the original shows that the boundaries are still watermarked, so this doesn't quite work for my case (my watermarks are probably much smaller and lower quality than Google's examples, hence blurry edges that shouldn't be cropped away).

green = to_rgb(np.copy(med_img))
green[bb[0]:bb[1]+1, bb[2]:bb[3]+1] = [0, 200, 0]
plt.imshow(med_img, cmap=plt.get_cmap('gray'))
plt.imshow(green, alpha=0.7)
plt.show()
# fig = plt.figure(figsize=(6,2))
# plt.xticks([]), plt.yticks([])
# plt.tight_layout()
# plt.imshow(med_img, cmap=plt.get_cmap('gray'))
# plt.imshow(green, alpha=0.7)
# fig.savefig('../img/doc/green_bbox.png')
