Gaussian kernels, and the Fourier transforming of them
I've been flailing around with this for a few days, so I figure that I might as well ask here. Wikipedia and the rest of the internet have been somewhat useful, but haven't told me everything I need to know.
I'm currently trying to program a blurring filter that convolves an image with a fairly massive gaussian kernel (I'm trying to correct for intensity variations across an image, so I'm dividing the normal image by a heavily blurred version). Since normal convolution takes so long at that scale, I'm attempting to take the suggestion of a paper I've read: perform Fourier transformations on the two matrices, multiply them together, and then perform an inverse Fourier transformation on the result. I'm not a mathematician and have very little experience in image analysis, so I have very little idea what a Fourier transformation actually is. All I know is that I can pass an M by N matrix into IDL's fft() function and it'll spit back out an M by N matrix of complex numbers.
What confuses me is exactly how I'm supposed to combine the two transformed matrices. The image is M by N and the gaussian kernel is O by O, so I can't perform conventional matrix multiplication on them. Anyone know what I'm supposed to be doing with the two matrices?
Edit: Just remembered that 0 values in the kernel don't do anything to the convolution, so I can pad out the gaussian kernel with 0s to a size that can be multiplied. What kind of multiplication do I want, though? Standard matrix multiplication or pointwise?
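For what it's worth, the convolution theorem calls for pointwise (element-by-element) multiplication, not matrix multiplication: pad the kernel with zeros to the image's M by N size, transform both, multiply element-wise, and invert. A quick sanity check sketched in Python/NumPy (standing in for IDL here; the sizes and seed are arbitrary) against a directly computed circular convolution:

```python
import numpy as np
from numpy.fft import fft2, ifft2

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # M x N "image"

# Small kernel, zero-padded out to the image's size
kernel = np.zeros((64, 64))
kernel[:5, :5] = rng.random((5, 5))

# Convolution theorem: FFT both, multiply POINTWISE, inverse FFT.
# Note this gives CIRCULAR convolution (edges wrap around).
blurred = ifft2(fft2(image) * fft2(kernel)).real

# Direct circular convolution at one pixel, for comparison
y, x = 10, 10
direct = sum(image[(y - a) % 64, (x - b) % 64] * kernel[a, b]
             for a in range(64) for b in range(64))
print(np.isclose(blurred[y, x], direct))   # True
```

The one catch is the wrap-around: an FFT-based convolution treats the image as periodic, so the left edge bleeds into the right edge and so on, unlike convol's /EDGE_TRUNCATE behavior.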
Could you link the paper to me? This is an interesting problem.
Or, alternatively, just give me the citation and I'll go look it up, I've got teh hookupz.
Edit: I screw around a lot ("work" they say) with astronomical image subtraction, which involves a lot of image convolution. Is there only one image that you're trying to flatten? There are other methods used to flatten images that don't involve convolution, as well.
I don't know what program you're using to do your convolutions, but if you're using a built-in function (not one you wrote yourself), I'd be surprised if it did them without FFTs.
I only know a tiny bit about image processing, but I can give you a very brief explanation of the Fourier transform, which you could probably figure out by snooping around the internet, but oh well.
The Fourier transform is a particular way to transform a complex-valued function into another complex-valued one, and the most basic way to think about it is as a transformation from the time domain to the frequency domain. So with a function sent through the Fourier transform, what you get out is a function of the different frequencies that compose that function. For example, if you think about a ray of light as having a straightforward wave function, you can get the spectrum of that light out using the Fourier transform. Or with the sound wave of a musical instrument, you can get the combination of pure tones of different frequencies out of a Fourier transform.
The FFT is slightly different: it's an algorithmically faster way to compute the Discrete Fourier Transform. The DFT, like the continuous Fourier transform, converts from the time domain to the frequency domain, but it operates on a discrete function of finite duration (or on one period of an infinitely periodic function). Computationally speaking, you only ever have a finite number of samples of a signal, so a true continuous Fourier transform is impossible on a sampled signal; the FFT is used to approximate the conversion into the frequency domain.
Anyways, what you are getting out of your function is equivalent to some form of a Discrete Fourier transform, so that's where you should look if you are trying to figure out what it is outputting.
edit: Oh, and the inverse Fourier transform is pretty much just what it sounds like, transforming from the frequency domain to the time domain.
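To make the "frequency domain" idea concrete, here's a tiny sketch in Python/NumPy (not IDL, though IDL's fft() behaves similarly apart from normalization): sample one second of a pure 50 Hz sine wave, FFT it, and the spectrum peaks at exactly the 50 Hz bin.

```python
import numpy as np

fs = 1000                             # sampling rate, Hz
t = np.arange(fs) / fs                # one second of samples
signal = np.sin(2 * np.pi * 50 * t)   # a pure 50 Hz tone

spectrum = np.abs(np.fft.fft(signal))
freqs = np.fft.fftfreq(len(signal), d=1/fs)

# The strongest component in the positive-frequency half is at 50 Hz
peak = np.argmax(spectrum[:len(signal) // 2])
print(freqs[peak])   # 50.0
```

A real image works the same way in two dimensions: low frequencies are the slow intensity variations across the frame (exactly what a heavy Gaussian blur keeps), and high frequencies are the fine detail it throws away.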
Marty81 PMed me the relevant answer: padding it out does in fact work, so that part's good.
It seems like in order to avoid shifting the image, I have to center the kernel around [0,0], wrapping negative indices into the other end of the matrix. Am I doing something wrong, or is there a more elegant way to set it up?
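(That wrap-around is the standard trick: the DFT treats index [0,0] as the origin and everything as periodic, so the kernel's center has to sit at [0,0] with its tails wrapping into the far corners. Rather than computing wrapped indices by hand, it may be tidier to build the kernel centered in the array and then rotate it; in NumPy that rotation is np.fft.ifftshift, and IDL's SHIFT function can do the same thing. A sketch, with an arbitrary odd size and sigma:)

```python
import numpy as np

# Build a Gaussian kernel centered in the middle of an odd-sized array
size, sigma = 65, 5.0
yy, xx = np.mgrid[:size, :size]
c = size // 2                      # center pixel, here (32, 32)
kernel = np.exp(-((xx - c)**2 + (yy - c)**2) / (2 * sigma**2))
kernel /= kernel.sum()             # normalize to sum to 1

# Rotate so the peak lands at [0,0] and the tails wrap into the
# corners -- equivalent to wrapping negative indices by hand.
origin_kernel = np.fft.ifftshift(kernel)
print(np.unravel_index(np.argmax(origin_kernel), origin_kernel.shape))  # (0, 0)
```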
Edit: Eh, I'll just attach the entire code. It's written in IDL, but most of what I'm doing should be fairly easy to understand.
;; Procedure to normalize the intensities across an image
pro Correct_Intensity, event
  Widget_Control, event.top, Get_Uvalue = info, /No_Copy
  IF (n_elements(*info.image) gt 0) then begin
    *info.undo = *info.process

    ; Determine threshold to use for noise replacement
    ; Needs touching up
    threshold = 50

    ; Replace the noise with the average value of the non-noise
    noiseplaceholder = histogram(*info.process, max = threshold - 1, reverse_indices = noiseindices)
    dataplaceholder = histogram(*info.process, min = threshold, reverse_indices = dataindices)
    noiseindices = reform(noiseindices, n_elements(noiseindices))
    dataindices = reform(dataindices, n_elements(dataindices))
    datavalues = (*info.process)[dataindices]
    dataaverage = mean(datavalues)
    print, 'Average image intensity is ', dataaverage
    adjustedimage = *info.process
    adjustedimage[noiseindices] = dataaverage
    ;tvscl, adjustedimage

    ; Smooth out the adjusted image
    ; Needs touching up
    idims = size(adjustedimage, /dimensions)
    kernelwidth = 100  ;200 ;fix(.375 * idims[0])
    kernelheight = 100 ;200 ;fix(.375 * idims[1])
    ;gausskernel = dblarr(kernelwidth, kernelheight)
    gausskernel = dblarr(idims[0], idims[1])  ; zero-padded to the image's size
    standarddev = 50 ;100
    for i = 0, kernelwidth - 1 do begin
      for j = 0, kernelheight - 1 do begin
        x = abs(((kernelwidth - 1) / 2.0) - i)
        y = abs(((kernelheight - 1) / 2.0) - j)
        ; Wrap the kernel around [0,0] so the FFT product doesn't shift the image
        iloc = i - kernelwidth / 2
        if iloc lt 0 then iloc = iloc + idims[0]
        jloc = j - kernelheight / 2   ; was kernelwidth/2 -- a bug for non-square kernels
        if jloc lt 0 then jloc = jloc + idims[1]
        ;print, idims[0], idims[1], i, j, iloc, jloc  ; debug
        gausskernel[iloc, jloc] = 1 / (2.0 * !dpi * standarddev * standarddev) * exp(-(x*x + y*y) / (2.0 * standarddev * standarddev))
      endfor
    endfor
    ;print, gausskernel
    print, total(gausskernel)
    ;print, fft(gausskernel)
    gausskernel = gausskernel / total(gausskernel)

    ;smoothedimage = convol(double(adjustedimage), gausskernel, /EDGE_TRUNCATE)
    smoothedimage = double(fft(fft(adjustedimage) * fft(gausskernel), /inverse))
    ;smoothedimage = convol(adjustedimage, [[ 1, 8, 15, 8, 1], [ 8, 63,127, 63, 8], [15,127,255,127,15], [ 8, 63,127, 63, 8], [ 1, 8, 15, 8, 1]], /normalize)
    ;smoothedimage = smooth(adjustedimage, [kernelwidth, kernelheight], /EDGE_TRUNCATE)
    ;smoothedimage = smooth(adjustedimage, [100,100], /EDGE_TRUNCATE)
    ;tvscl, smoothedimage

    ; Divide the original image by the smoothed image
    resultimage = (dataaverage * *info.process) / smoothedimage
    ;resultimage = resultimage / max(resultimage) * 255
    *info.process = smoothedimage
    ;*info.process = resultimage
    ;*info.process = gausskernel
    tvscl, *info.process
  ENDIF
  Widget_Control, event.top, Set_Uvalue = info, /No_Copy
END
I'm also having an issue with the values of the blurred image being really tiny, despite the fact that the values in the gaussian filter add up to 1.0 (I divided by the total to make sure of that, since a sampled gaussian kernel only approaches a sum of 1). They seem scaled correctly relative to each other, but they're much smaller than I would expect: the normal image ranges from 0 up to highs in the 800+ range, while the blurred values are in the .001 to .002 range.
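(A likely culprit, if I remember IDL's conventions right, is fft()'s normalization: IDL applies the 1/(M*N) scale factor on the *forward* transform, so the product of two forward transforms carries an extra 1/(M*N) that the inverse never removes, and the result comes out smaller by the number of pixels. Multiplying the result by n_elements(adjustedimage) should restore the scale. NumPy's norm="forward" mode mimics that convention, so the effect can be sketched in Python; the identity kernel below just makes "blurring" return the original image:)

```python
import numpy as np
from numpy.fft import fft2, ifft2

rng = np.random.default_rng(1)
image = rng.random((32, 32))
kernel = np.zeros((32, 32))
kernel[0, 0] = 1.0   # identity kernel: convolving should return the image

# NumPy's default convention: unnormalized forward, 1/N on the inverse.
default_conv = ifft2(fft2(image) * fft2(kernel)).real

# "Forward" convention (1/N on the forward transform, as in IDL's fft):
# the product of two forward transforms carries an extra factor of 1/N.
idl_style = ifft2(fft2(image, norm="forward") * fft2(kernel, norm="forward"),
                  norm="forward").real

print(np.allclose(default_conv, image))            # True
print(np.allclose(idl_style * image.size, image))  # True: off by N pixels
```

That factor would also explain the magnitude: an image of roughly 700x700 pixels gives a ratio of about 500,000, which is in the right ballpark for 800-ish values shrinking to .001-.002.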