September 03, 2007
Optimizing Seam Carving

After my post about the various content-aware image resizing projects I couldn't resist to take a closer look at Joa Ebert's ImageResizing class and take him on his word "If you can improve something do not be shy and post it". I just love code optimizing and saw several points to attack.

First I replaced the loop that moved the pixels via getPixel()/setPixel() by a displacementMap filter. The additional benefit of this is that I can use the same displacement map on the grayscale map which is used for the energy calculation so it just needs to get calculated once. It is even possible to apply it to the energy map, too but that map should only be used for a few carves and then get recalculated. Otherwise artifacts will show up quickly.

Another part I changed is the energy function itself. I'm not sure if that improves the speed, but I think it improves the seam selection quality. Instead of a convolution filter I blur the image first and then draw() it onto itself offset by a few pixels with difference mode. The idea behind this is that this will keep strong edges but does not emphasize noise and small patterns as much.

Here is the optimized ImageResizing AS source. And here's a demo of the optimized version:




Click on it to reset it and press any key (whilst it has the focus) to switch between horizontal and vertical shrinking.

For comparison check out the original version here.

The longest time is spend in the search for the seam with the lowest energy. So that's where the most optimization can be done. And indeed there were several easy victims to be found:

This is the original part of the inner loop where a seam gets built - it works by looking at the current pixel, checking its three neighbors to the right and then walking to the one of the three with the lowest energy. For this demonstration I have condensed the code a bit and removed the checks for vertical swipes:


for ( var p: int = 1; p < end; p++ )
{
point = point.next = new SeamPoint;
energy = energyMap.getPixel( o, p - 1 ) & 0xff;

e0 = energyMap.getPixel( o - 1, p ) & 0xff;
e1 = energyMap.getPixel( o , p ) & 0xff;
e2 = energyMap.getPixel( o + 1, p ) & 0xff;

d0 = Math.abs( energy - e0 );
d1 = Math.abs( energy - e1 );
d2 = Math.abs( energy - e2 );

min = Math.min( d0, d1, d2 );

if ( min == d0 ) o--;
else if ( min == d2 ) o++;

if ( o < 0 ) o = 0;
if ( o == endO ) o = endO - 1;

point.x = o;
point.y = p;

differenceTotal += min;
}

_energy = ( differenceTotal / end ) / 255;

The first thing that falls into my eye is that it is not necessary to read the "energy " pixel each time anew. We have that information already because it one of the three neighbors. So let's move that outside the loop - voilá - one getPixel less.


energy = energyMap.getPixel( o, 0 ) & 0xff;

for ( var p: int = 1; p < end; p++ )
{
point = point.next = new SeamPoint;

e0 = energyMap.getPixel( o - 1, p ) & 0xff;
e1 = energyMap.getPixel( o , p ) & 0xff;
e2 = energyMap.getPixel( o + 1, p ) & 0xff;

d0 = Math.abs( energy - e0 );
d1 = Math.abs( energy - e1 );
d2 = Math.abs( energy - e2 );

min = Math.min( d0, d1, d2 );

if ( min == d0 ) {
o--;
energy = e0;
} else if ( min == d2 ) {
o++;
energy = e2;
} else {
energy = e1
}

if ( o < 0 ) o = 0;
if ( o == endO ) o = endO - 1;

point.x = o;
point.y = p;

differenceTotal += min;
}

_energy = ( differenceTotal / end ) / 255;

The next thing is those Math.abs(). Whenever I see a Math.XY I look for ways to replace them with something faster. Ideally with bit arithmetics. And indeed there is a nice alternative possible here. All the pixels we read will be in the range of 0-255 aka one byte. And what we are calculating is the absolute difference between two bytes. This is a case for the XOR operator ^.

This:


d0 = Math.abs( energy - e0 );
d1 = Math.abs( energy - e1 );
d2 = Math.abs( energy - e2 );

becomes this:


d0 = energy ^ e0;
d1 = energy ^ e1;
d2 = energy ^ e2;

But there is another Math in here - Math.min( d0, d1, d2 ). Well of course this looks very nice and clean, but by making it a bit less nice and clean we can speed it up considerably:


energy = energyMap.getPixel( o, 0 ) & 0xff;

for ( var p: int = 1; p < end; p++ )
{
point = point.next = new SeamPoint;

e0 = energyMap.getPixel( o - 1, p ) & 0xff;
e1 = energyMap.getPixel( o , p ) & 0xff;
e2 = energyMap.getPixel( o + 1, p ) & 0xff;

d0 = energy ^ e0;
d1 = energy ^ e1;
d2 = energy ^ e2;

if ( d0 < d1 )
{
if ( d0 < d2 )
{
o--;
energy = e0;
if ( o < 0 ) o = 0;
min = d0;
} else {
o++;
energy = e2;
if ( o > endO ) o = endO ;
min = d2;
}
} else {
if ( d1 <= d2 )
{
energy = e1;
min = d1;
} else {
o++;
energy = e2;
if ( o > endO ) o = endO;
min = d2;
}
}

point.x = o;
point.y = p;

differenceTotal += min;
}

_energy = ( differenceTotal / end ) / 255;

There are some minor cleanups. For example by not creating a grayscale energy map but rather just storing the energy in the blue channel I can get rid of the &ff after the getPixels(). For the energy calculation it is also not necessary to normalize it - so the division at the end can go. One issue I had was with the border pixels - in the original implementation those non-existing pixels were simply read - Flash will return 0 in that case. I think that's not ideal so I made those pixels very expensive instead by adding 0x100 to them:


e0 = o > 0 ? energyMap.getPixel( o - 1, p ) : 0x100 | energy;
e1 = energyMap.getPixel( o, p );
e2 = o < endO ? energyMap.getPixel( o + 1, p ) : 0x100 | energy;

One very important optimization though is simply not not calculate values that are not needed. In our case we don't need to keep following a seam if it is more energy expensive than the best previously found. So by letting the seam finder routine know what is the best level yet it can stop searching once it is worse than that.


_energy += min;
if (_energy> bestEnergy)
{
_energy = Number.MAX_VALUE;
return;
}

So putting it all together it looks like this:


_energy = 0;
energy = energyMap.getPixel( o, 0 ) & 0xff;

for ( var p: int = 1; p < end; p++ )
{
e0 = o > 0 ? energyMap.getPixel( o - 1, p ) : 0x100 | energy;
e1 = energyMap.getPixel( o, p );
e2 = o < endO ? energyMap.getPixel( o + 1, p ) : 0x100 | energy;

d0 = energy ^ e0;
d1 = energy ^ e1;
d2 = energy ^ e2;

if ( d0 < d1 )
{
if ( d0 < d2 )
{
o--;
energy = e0;
if ( o < 0 ) o = 0;
min = d0;
} else {
o++;
energy = e2;
if ( o > endO ) o = endO ;
min = d2;
}
} else {
if ( d1 <= d2 )
{
energy = e1;
min = d1;
} else {
o++;
energy = e2;
if ( o > endO ) o = endO;
min = d2;
}
}

point.x = o;
point.y = p;

_energy += min;
if (_energy> bestEnergy)
{
_energy = Number.MAX_VALUE;
return;
}
}

I have still some ideas for possible improvements and I'm closely working together with Joa on this. The next step is to grow images. And of course in the end the whole thing has to be packaged and documented properly so it can be used conveniently.

Posted at September 03, 2007 12:45 PM | Further reading
Comments

You get much better results IMHO if you picked the best of random N (I used 10,000) paths from one side to another. The dynamic programming solution gets stuck in local minima. Also for certain images the histogram of oriented gaussians energy metric is better. I would link the algorithm but the cgi script is preventing me so oh well. I guess you could manually go to hectorgon dot b l o g s p o t dot com.

Posted by: Hector Yee on September 3, 2007 07:30 PM

Hector - It would be great if you could publish the source codes for your solution. I must admit that since I'm not a mathematician I have no idea what "the histogram of oriented gaussians energy metric" looks like or how it is calculated. Whilst if I see the code I usually understand what is happening behind the scenes.

Oh and sorry for my overstrict spam protection but I'm still running on a very old blog system and had to take measures against the dozens of bot comments that I had to remove manually every day.

Posted by: Mario Klingemann on September 3, 2007 08:10 PM

Everything is mostly the same but instead of dynamic programming for the path just generate a random seam from top to bottom (or left to right) keeping track of the energy of the path. At every step keep track of the path with the smallest energy. i.e. Starting at the top, pick a random pixel y = 0, x = random() % width. Then pick a random pixel y = 1, x = x + random() % 3 - 1 (this gives equal probabiltiy for x-1, x and x+1). Keep on going until the end. This generats a random seam from top to bottom with some energy E1. Do this 10, 000 times and the path with the lowest energy is your seam as per normal.

I find that it tends to perform better if your energy function (in this case edge detection) doesn't have as wide a kernel as the histogram of gradients as described in the paper.

This should give you better results.

Posted by: Hector Yee on September 4, 2007 04:40 AM

Hi Hector,

even if I assume that this would give a better result with a small kernel edge detection it seems to be to hard for Flash. In Flash the problems of the seam carving are the getPixel calls to find a seam. If you shoot now 1000 times random vs. width times you might get a result that is acceptable but takes less getPixel calls.

A bigger kernel could give us already better results if I understand you correct? I was searching on the HOG calculation but unfortunately I'm missing the right Google keywords to find something useful.

Posted by: Joa Ebert on September 4, 2007 11:50 AM

I'd say that the blur + difference draw method I use to find edges is like using a bigger kernel. As for finding the seam with the lowest energy I still want to try out a modified A* algorithm.

Posted by: Mario Klingemann on September 4, 2007 12:05 PM

I wonder - is that technique rather called "Histogram of oriented gradients"? I've found a paper here:
http://lear.inrialpes.fr/pubs/2005/DT05/cvpr2005_talk.pdf
And some source code:
http://tinyurl.com/yrmvdg

Posted by: Mario Klingemann on September 4, 2007 12:16 PM

Re: HOG

I think the authors refer to this paper "Histograms of Oriented Gradients for Human Detection" by Navneet Dalal and Bill Triggs. Blur + Difference sounds like "Laplacian of Gaussians" another nice edge detect method. Since Gaussians are separable you can convolve each dimension separately. Or there was a recent paper on using state machines to do it really fast. Google "state machine" and "gaussian" for the algorithm.

Re: Finding the minimum path
I actually didn't use 1000 paths * width, I used 10, 000 total and it seemed to work fine.

Modified A* sounds like a good idea! Do let me know what comes out of it.

hectorgon dot blogspot dot com

Posted by: Hector Yee on September 5, 2007 08:04 AM

Oh yeah I didn't notice the typo, its Histogram of Oriented Gradients, not Histogram of Oriented Gaussians. Sorry :P. Labor day weekend late nights do that :P.

Posted by: Hector Yee on September 5, 2007 09:53 AM

All right, i'm new to this technique and i don't have any experience on ActionScript programming.
But from your optimized source code, i can tell that, you repeatadly call stepX or stepY; in which you iterate through all possible seams each time, which seems to inefficient to me.
Isn't it possible to build all the vertical or horizontal seams at once and by iterating through them make the required resizing ? Of course you have to take into account the relative orientation of seams which has not been removed.
Or maybe i got it all wrong :s

Posted by: Gloom on September 12, 2007 07:29 PM

Gloom: there are many possibilities for optimizations and different approaches to the finding of the best seam. I would never dare to claim that I have found the best one. Feel free to improve the code that I posted - I'm looking forward to your optimizations.

The "best of n random seams" method that Hector described turned out to be really fast and efficient.

Joa and me are currently working on a tool which will be able to explore all these variants and more once we come across them. Watch out for some interesting news soon.

Posted by: Mario Klingemann on September 12, 2007 08:37 PM

http://flashnifties.com/flash_guestbook.php

Hey! is this YOUR CODE??

seems familiar!

Posted by: steve on September 17, 2007 09:00 PM

I've been doing quite a bit of optimizing to the original algorithm. The idea of random seams, seems, well, against the papers intention. The goal is to remove the seam with the LEAST amount of energy. Random seams are by no means going to guarantee such an operation, and over time will start to result in artifacts.
If you do your edge detection properly you can get it down to about 12 additions per pixel, compared with the traditional 16 multiplications and 18 additions.
See here: http://sourceforge.net/projects/c-a-i-r/
and here: http://brain.recall.googlepages.com/cair

Posted by: Brain_ReCall on October 29, 2007 01:34 AM

I am the programmer who is Korea for image re-sizing with much interest. I resembled the cord which back I whom this sentence was posted on made very much. I compared RGB for a general image, but I let an original image change by the, animation which I showed and seem to have used an upper cord for a manuscript image, but a cord for the, part is not shown. I did not appear hether I declared it with what kind of document type though there was a "energy" variable if I said by a mouthful.

It does not seem to be BITMAP document-shaped declaration simply.

Posted by: jung koo kim on October 31, 2007 12:04 PM

Mario;

Awesome work! Just wondering if you were going to post the source to the updated optimizations that you did for the version utilizing the Aviary framework? Did you code in image preserving and removing areas yet?

Kind regards!
Mike

Posted by: Mike on November 14, 2007 12:12 AM

It works fine with just the 'getPixel(something)'
instead of 'getPixel(something) & 0xFF'.

Check out my YouTube video in the URL link.

But still, your AS code works faster than mine.
Nice work!!

Posted by: Woo on January 30, 2008 08:42 AM
Post a comment
Name:


Email Address:


URL:


Comments:


Remember info?



Thank you!

Most Visited Entries
Sketches, Works & Source Code
Lectures
Contact
Backlog
In Love with
Powered by
Movable Type 2.661

© Copyright Mario Klingemann

Syndicate this site:
RSS 1.0 - RSS 2.0

Quasimondo @ flickr
Quasimondo @ LinkedIn
Quasimondo @ Twitter
Quasimondo @ Facebook
Quasimondo @ MySpace
Quasimondo is a Bright
Citizen of the TRansnational Republic
My other blog in german
Impressum


My family name is written Klingemann,
not Klingelmann, Klingeman, Klingaman, Kingemann,
Kindermann, Killingaman, Klingman, Klingmann, Klingonman
Klingemman, Cleangerman, Klingerman or Kleangerman

profile for Quasimondo at Stack Overflow, Q&A for professional and enthusiast programmers