Today we are fortunate to have Prof. Joseph O. Ruanaidh give us a guest lecture on digital image watermarking. He is an assistant professor at the University of Geneva in Switzerland. This is a one-hour lecture in Wean 4623, and it is open to the general public.
A watermark is a (semi-) invisible mark placed on an image that can be detected when the image is compared with the original. This mark is designed to identify both the source of an image and its intended recipient. The mark should tolerate lossy compression of the image at reasonable quality, whether by transform coding or by vector quantization [KRDB96a].
The speaker will talk about watermarking digital images only. In practice, people are also concerned with other content types such as text, software programs, and audio and video streams. The speaker believes that digital images are the easiest content type in which to embed robust watermarks. The problem with text is that it is hard to hide watermarks without their being detected, so an adversary can systematically remove them. Alternatively, if the watermarks are encoded in the structure of the text document (such as an MSWord document), an adversary can simply print out the text and OCR it to remove the watermarks. The problem with software programs is similar: it is hard to hide watermarks without their being detected. Audio streams are harder than images because human hearing is much more sensitive than human vision (a side note on the evolutionary reason: the ability to hear a tiger approaching matters more than the ability to enjoy the texture of a tiger's skin). Embedding robust watermarks in video streams may not be hard in itself. However, a video stream is typically much larger than a still image and is therefore harder to manipulate and analyze. Not much work has been done in that area yet, and it is currently another hot area of research.
It is important to know that human vision is very selective in terms of which features of a still image it detects. A good watermarking algorithm should exploit the fact that, to the human eye, certain modifications of the original image are less distracting than others.
The speaker is involved in Project Krypict. Krypict is a software environment for copyrighting, authenticating, archiving and retrieving pictorial documents in multimedia databases. With the explosive growth in popularity of the Internet, distributing documents with copyright protection has become critical. Project Krypict develops copyright enforcement and authentication methods for image databases, based on image watermarking, to protect these documents. There are three major tasks:
The speaker showed us two slides filled with the names of companies and research projects working on watermarks. The moral is that this is a hot area of research and many people are working on digital watermarks. One of the bigger competitors he mentioned is Digimarc, a company that sells digital watermarking software as a Photoshop plugin. The company has strong venture capital backing to support a large staff of researchers and programmers, whereas his group is quite limited in funding and human resources.
A watermarked image may be distorted for various reasons, such as (real-time) network transfer, compression, normal image manipulation, or a malicious attempt to confuse or erase the watermarks. The speaker categorizes the distortion into six levels:
The speaker remarked that most existing schemes have mechanisms that can deal with distortion up to level 2. His approach can tolerate distortion up to level 4.
People have come up with various ways to compensate for distortions. For example, to resist rotation, Digimarc embeds watermarks on a ring, which is quite interesting. The speaker will present a more mathematical way to deal with level 4 distortion using the Fourier-Mellin transform.
By the way, "Holy Grail" is the speaker's expression for robust watermarking.
There are several approaches to watermarking: primitive, image transforms, invariants, and spread spectrum. The speaker considers spread spectrum the best approach. Spread spectrum operates in the frequency domain rather than the spatial domain. It has the nice properties that the watermark becomes harder to tamper with while the image quality does not degrade very much. The speaker showed that when an image is watermarked with spread spectrum, edges become less crisp and sometimes look like shadows.
The private key in spread spectrum is the seed used to generate the random sequence (note: having a bad random seed generator may compromise security).
In spread spectrum, most of the robustness of the watermark comes from using good digital communication techniques. The invisibility of the watermark comes from keeping the amplitude low and marking the right frequencies. At high frequencies, the noise is less perceptible but the energy is lower (energy refers to how strong the watermarked signal is in the frequency domain). At low frequencies, the noise is more perceptible but the energy is higher. The compromise is to use a broad spectrum of frequencies; in experiments, such a watermark survives JPEG compression at a 10% quality factor.
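As a rough illustration of the ideas above, the following Python sketch embeds a keyed pseudo-random sequence in a band of DFT coefficients and detects it by correlation. It is a simplified stand-in, not the speaker's algorithm: the function names, the band limits (0.1 to 0.4 cycles per sample), the strength value, and the use of NumPy's generator as the keyed sequence source are all assumptions made for this example, and the detector needs the original image.

```python
import numpy as np

def band_mask(shape, lo=0.1, hi=0.4):
    """Select a band of rfft2 coefficients (band limits are illustrative)."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]    # vertical frequency, cycles/sample
    fx = np.fft.rfftfreq(cols)[None, :]   # horizontal frequency, cycles/sample
    radius = np.hypot(fy, fx)
    mask = (radius >= lo) & (radius <= hi)
    mask[:, 0] = False                    # these columns carry symmetry constraints
    if cols % 2 == 0:
        mask[:, -1] = False
    return mask

def key_sequence(seed, n):
    """The private key is the seed; the sequence is +/-1 pseudo-noise."""
    return np.random.default_rng(seed).choice([-1.0, 1.0], size=n)

def embed(image, seed, strength=50.0):
    """Add low-amplitude keyed noise to the selected frequency band."""
    spectrum = np.fft.rfft2(image)
    mask = band_mask(image.shape)
    spectrum[mask] += strength * key_sequence(seed, int(mask.sum()))
    return np.fft.irfft2(spectrum, s=image.shape)

def detect(marked, original, seed):
    """Non-blind detector: correlate the spectrum difference with the key sequence."""
    mask = band_mask(original.shape)
    diff = (np.fft.rfft2(marked) - np.fft.rfft2(original))[mask].real
    w = key_sequence(seed, diff.size)
    return float(diff @ w / (np.linalg.norm(diff) * np.linalg.norm(w) + 1e-12))
```

With the correct seed the normalised correlation is close to 1; with any other seed it stays near 0, which is why the seed acts as the key. A real system would also have to survive quantization and compression, which this sketch does not model.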
Another challenge is that, with spread spectrum watermarking, the original image is needed to extract the watermark. The speaker co-developed a technique that allows a verifier to extract the watermark without the original image.
(Note: The content of this section was extracted directly from the paper [RP97] for your convenience. The speaker briefly mentioned a subset of the equations in class.)
Transform invariant watermarking can extract the watermark without the help of the original image while remaining invariant to translation, rotation, and scaling. The mathematics is based on the Fourier-Mellin transform.
Let the image be a real-valued continuous function f(x,y) defined on an integer-valued Cartesian grid 0 <= x < M, 0 <= y < N. Let the two-dimensional Discrete Fourier Transform (DFT) F(k,l), where 0 <= k < M, 0 <= l < N, be defined in the frequency domain. The translation property (where an image is translated by an offset (a,b)) is:
F(k,l) exp[-j (a*k + b*l)] <--> f(x+a, y+b)
The scaling property (where an image is expanded by factor s) is:
1/s F(k/s, l/s) <--> f(s*x, s*y)
The rotation property (where an image is rotated with an angle p) is:
F(k*cos p - l*sin p, k*sin p + l*cos p) <--> f(x*cos p - y*sin p, x*sin p + y*cos p)
From the translation property of the Fourier transform we see that spatial shifts affect only the phase representation of an image. This leads to the well-known fact that the DFT magnitude is invariant to circular translation. An ordinary translation can be represented as a cropped circular translation.
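The circular-translation invariance of the DFT magnitude is easy to check numerically. The short Python check below uses a random array as a stand-in image; the array size and shift amounts are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 48))                            # stand-in image
shifted = np.roll(image, shift=(5, -7), axis=(0, 1))    # circular translation

spectrum = np.fft.fft2(image)
spectrum_shifted = np.fft.fft2(shifted)

# The complex spectra differ (the phase changes) ...
phase_changed = not np.allclose(spectrum, spectrum_shifted)
# ... but the magnitudes agree up to floating-point error.
magnitude_unchanged = np.allclose(np.abs(spectrum), np.abs(spectrum_shifted))
```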
Consider a point (x,y) in R^2 and define x = e^u * cos p, y = e^u * sin p by means of a Log-Polar Mapping (LPM), where u is in R and 0 <= p < 2*pi. We can see that for every point (x,y) other than the origin there is a point (u,p) that uniquely corresponds to it. Note that in the new coordinate system scaling and rotation are converted to translations of the u and p coordinates respectively. At this stage one can implement a rotation and scale invariant by applying a translation invariant in the log-polar coordinate system. Taking the Fourier transform of a log-polar map is equivalent to computing the Fourier-Mellin transform.
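To make the claim about scaling and rotation concrete, here is a small Python sketch of the coordinate change for a single point. The helper names are invented for this example; only the mapping itself comes from the text.

```python
import math

def to_log_polar(x, y):
    """Map (x, y) != (0, 0) to (u, p) with x = e^u cos p, y = e^u sin p, 0 <= p < 2*pi."""
    u = math.log(math.hypot(x, y))
    p = math.atan2(y, x) % (2 * math.pi)
    return u, p

def scale_rotate(x, y, s, angle):
    """Rotate (x, y) by `angle` radians about the origin, then scale by s."""
    c, sn = math.cos(angle), math.sin(angle)
    return s * (x * c - y * sn), s * (x * sn + y * c)

u0, p0 = to_log_polar(3.0, 4.0)
u1, p1 = to_log_polar(*scale_rotate(3.0, 4.0, s=2.0, angle=0.5))
# Scaling by s shifts u by log(s); rotating by `angle` shifts p by `angle`.
du, dp = u1 - u0, (p1 - p0) % (2 * math.pi)
```

Here `du` equals log(2) and `dp` equals 0.5 up to rounding, so both transformations appear as plain shifts in the (u,p) plane.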
Consider two invariant operators, G which extracts the modulus of the Fourier transform, and H which extracts the modulus of the Fourier-Mellin transform. Applying the hybrid operator H o G to an image f(x,y) we can get:
I = [H o G] f(x,y)
Let us also apply this operator to an image that has been translated, rotated and scaled:
J = [H o G o R(p) o S(s) o T(a,b)] f(x,y)
= [H o R(p) o G o S(s) o T(a,b)] f(x,y)
= [H o R(p) o S(1/s) o G o T(a,b)] f(x,y)
= [H o R(p) o S(1/s) o G] f(x,y)
= [H o G] f(x,y)
= I
Hence J = I, and the representation is rotation, scaling, and translation invariant. The invariant is also order independent; that is, applying rotation, scaling, and translation in any order does not change the invariant.
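The operator chain I = [H o G] f(x,y) can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the log-polar resampling uses nearest-neighbour lookup, and the grid sizes `n_u` and `n_p` and all function names are choices made for this example rather than details from the paper. On a discrete image the result is exactly invariant only to circular translation; rotation and scaling invariance hold only approximately because of resampling.

```python
import numpy as np

def fourier_modulus(image):
    """G: modulus of the 2-D DFT, with zero frequency moved to the centre."""
    return np.fft.fftshift(np.abs(np.fft.fft2(image)))

def log_polar_map(spectrum, n_u=64, n_p=64):
    """Resample a centred spectrum on a log-polar grid (nearest neighbour)."""
    rows, cols = spectrum.shape
    cy, cx = rows // 2, cols // 2
    max_radius = min(cy, cx) - 1
    u = np.linspace(0.0, np.log(max_radius), n_u)           # log-radius axis
    p = np.linspace(0.0, 2 * np.pi, n_p, endpoint=False)    # angle axis
    radius = np.exp(u)[:, None]
    yy = np.rint(cy + radius * np.sin(p)[None, :]).astype(int)
    xx = np.rint(cx + radius * np.cos(p)[None, :]).astype(int)
    return spectrum[yy, xx]

def rst_invariant(image):
    """I = [H o G] f: modulus of the DFT of the log-polar map of the DFT modulus."""
    return np.abs(np.fft.fft2(log_polar_map(fourier_modulus(image))))
```

Calling `rst_invariant` on an image and on a circularly shifted copy (for example via `np.roll`) gives the same array up to floating-point error, which mirrors the step in the derivation where G removes T(a,b).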
In short, the speaker uses a Perceptually Adaptive Spread Spectrum (PASS) approach which combines image transformations such as the Fast Fourier Transform (FFT) and Log Polar Map (LPM) with spread spectrum communications techniques.
The diagram illustrates a scheme that forms the basis of the above algorithm for embedding watermarks specifically designed to withstand translation, rotation, and scaling. A spread spectrum signal is embedded in a domain that is invariant to these transformations.
The speaker gave an honorable mention to our classmate, Adrian Perrig, for his critical role in the successful design and implementation of their watermarking system.
There are two types of messages the image author may want to embed in the watermark: a private message and a public message. Everyone can decode the public message, but only people with the correct private key (the seed used to generate the random sequence) can see the private message. The private and public messages can coexist simultaneously and independently.
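One way to see why two messages can coexist is that pseudo-random sequences generated from different seeds are nearly uncorrelated. The Python sketch below embeds one bit under a "public" seed and one under a "private" seed in the same coefficient vector. Everything here is hypothetical: the seeds, the vector length, the noise model standing in for host coefficients, and the one-bit payloads are chosen only to illustrate the principle, not to describe the speaker's system.

```python
import numpy as np

PUBLIC_SEED = 2024      # hypothetical, published to everyone
PRIVATE_SEED = 987654   # hypothetical, known only to the owner

def carrier(seed, n):
    """Keyed +/-1 pseudo-noise sequence."""
    return np.random.default_rng(seed).choice([-1.0, 1.0], size=n)

def embed_bit(coeffs, seed, bit, strength=1.0):
    """Add the carrier with sign +1 for bit 1 and -1 for bit 0."""
    sign = 1.0 if bit else -1.0
    return coeffs + strength * sign * carrier(seed, coeffs.size)

def read_bit(coeffs, seed):
    """Recover the bit from the sign of the correlation with the carrier."""
    return bool(coeffs @ carrier(seed, coeffs.size) > 0)

# Stand-in for host coefficients, modelled as noise for this example.
host = np.random.default_rng(0).normal(0.0, 3.0, size=4096)
marked = embed_bit(embed_bit(host, PUBLIC_SEED, bit=1), PRIVATE_SEED, bit=0)
public_bit = read_bit(marked, PUBLIC_SEED)
private_bit = read_bit(marked, PRIVATE_SEED)
```

Each reader sees only the bit tied to its own seed; the other carrier contributes a small cross-correlation term that does not flip the sign.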
The speaker suggested 7 pieces of information to go into the private message: private serial number, copyright data, ownership evolution (version number), escrowed serial number, alteration detection (checksum), tracking codes, and private notes such as timestamp, f-stop, or film exposure (in photography).
The speaker suggested 8 pieces of information to go into the public message: ownership information, contact information, usage rights, protection bits, (public) serial number, URL, ciphertext, and captions.
The simple act of creating a work does not legally entitle an author to its copyright. Thus, before owning the copyright, it is necessary to register with a trusted third party. In a copyright dispute, the trusted third party can act as a witness. In most countries (including the U.S.), the trusted third party is the copyright office run by the government.
Copyright laws differ from country to country. Most countries base their copyright laws on the Berne Convention and the Universal Copyright Convention, which address copyright issues in the international market.
The Berne Convention, in its 1971 Paris revision, is an international convention for the protection of literary and artistic works. In short, the Berne Convention provides a minimum of 25 years of protection for photographic works. Member states may provide additional protection. In general, copyright persists for one hundred years after the date of creation. To learn more about the Berne Convention, please see
http://www.law.cornell.edu/treaties/berne/overview.html
For more information about the Universal Copyright Convention, please see
http://www.tufts.edu/departments/fletcher/multi/texts/UNTS13444.txt
To register a copyright, the following items are required in the application:
In the U.S., most art works, including photographs, cartoon strips, artistic drawings, and sculptures, cost $20 per piece to register, charged as a non-refundable filing fee.
Question: if a one-way hash algorithm is used for registering digital images, can a malicious person register lots of random numbers in the hope that he or she can later claim other people's copyright?
Answer (by Doug): you have to pay a fee to get an image registered ($20 in the U.S.). The hit ratio is low enough that the attack is not economical.
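The economics in the answer can be made concrete with a back-of-the-envelope Python sketch. The lecture did not name a hash function; SHA-256, the one-billion-image figure, and both function names are assumptions made for this example.

```python
import hashlib

def registration_digest(image_bytes: bytes) -> str:
    """Digest an author would register instead of the image itself (SHA-256 assumed)."""
    return hashlib.sha256(image_bytes).hexdigest()

def expected_cost_of_one_hit(fee_usd=20, digest_bits=256, target_images=10**9):
    """Expected fees paid before one random digest matches any of the target images."""
    expected_guesses = 2**digest_bits // target_images
    return fee_usd * expected_guesses

digest = registration_digest(b"example image bytes")
cost = expected_cost_of_one_hit()
```

Even with a billion images to collide with, the expected cost is astronomically larger than any realistic budget, which matches the point that the attack is not economical.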
The speaker presented current digital image watermarking techniques and his innovation, which eliminates the need for the original image when retrieving the watermark. He also touched on which information should go into the watermark, the current state of copyright law, and how to register a copyright in today's world.
[RP98] J.J.K. Ó Ruanaidh and T. Pun, "Rotation, Translation and Scale Invariant Digital Image Watermarking", Signal Processing, Special Issue on Copyright Protection and Control, 1998.
[RP97] J.J.K. Ó Ruanaidh and T. Pun, "Rotation, Translation and Scale Invariant Digital Image Watermarking", IEEE ICIP, Int. Conf. Image Processing, Santa Barbara, California, Oct. 26-29, 1997.
[HPR97] A. Herrigel, A. Perrig and J.J.K. Ó Ruanaidh, "A copyright protection environment for digital images", VIS 97, Verlassliche Informationssysteme, Gesellschaft fuer Informatik, Freiburg, Germany, September 1997.
[RRP97] C. Rauber, J. Ó Ruanaidh and T. Pun, "Secure distribution of watermarked images for a digital library of ancient papers", Second ACM Conf. on Digital Libraries, Philadelphia, PA, July 23-26, 1997.
[KRDB96a] J.J.K. Ó Ruanaidh, W.J. Dowling and F.M. Boland, "Phase Watermarking of Digital Images" IEEE Int. Conf. Image Processing, Vol III pp 239-241, Lausanne, Switzerland, Sept 1996. Oral presentation.
[KRDB96b] J.J.K. Ó Ruanaidh, W.J. Dowling and F.M. Boland, "Watermarking Digital Images for Copyright Protection" IEEE Proceedings on Vision, Signal and Image Processing, August 1996, Vol 143, No. 4, pp 250-256.
[KRDB96c] J.J.K. Ó Ruanaidh, W.J. Dowling and F.M. Boland, "Electronic Watermarking of Digital Images", The Irish DSP and Control Conference, Trinity College, University of Dublin, July 1996. Oral presentation.
[KRBS96] J.J.K. Ó Ruanaidh, F.M. Boland and O. Sinnen, "Watermarking Digital Images for Copyright Protection" Electronic Imaging and the Visual Arts 1996, Florence, Italy, February 1996. Oral presentation.