Let's say that I want to process a group of NxN input pixels, then use the result when writing to a group of NxN output pixels: your standard block-based processing.
Is this possible using GLSL? It seems that there is a 1:1 relation between kernel invocations and output pixels. I would really appreciate examples of doing this kind of thing.
Indeed, a GLSL fragment shader is 1:1 by definition: each invocation writes to exactly one output pixel position.
But you can source data from one or more textures and sample each one multiple times, which gives you the NxN inputs to 1 output mapping.
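As a minimal sketch of that NxN-to-1 gather (not a definitive implementation), the fragment shader below averages one input block per output pixel. It assumes GLSL 3.30, a render target that is 1/N the size of the input in each dimension, and input dimensions that are multiples of N. The names `uInput`, `BLOCK`, and `fragColor` are made up for illustration.

```glsl
#version 330 core

// Hypothetical names: uInput is the full-resolution source texture,
// BLOCK is N (the block edge length), assumed to be 4 here.
uniform sampler2D uInput;
const int BLOCK = 4;

out vec4 fragColor;

void main() {
    // gl_FragCoord.xy sits at the pixel center (x + 0.5, y + 0.5);
    // converting to ivec2 truncates to the integer output pixel (x, y).
    // That output pixel owns the input block whose corner is (x*N, y*N).
    ivec2 blockOrigin = ivec2(gl_FragCoord.xy) * BLOCK;

    // Read all N*N texels of the block and reduce them (here: a mean).
    vec4 sum = vec4(0.0);
    for (int j = 0; j < BLOCK; ++j) {
        for (int i = 0; i < BLOCK; ++i) {
            sum += texelFetch(uInput, blockOrigin + ivec2(i, j), 0);
        }
    }
    fragColor = sum / float(BLOCK * BLOCK);
}
```

`texelFetch` reads unfiltered texels by integer coordinate, so no texture-coordinate scaling is needed. On older GLSL versions the same idea works with `texture2D` and offsets of one texel size.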
With MRT (multiple render targets) you can also write to multiple color attachments, as explained in this FBO tutorial: http://www.gamedev.net/reference/programming/features/fbo2/page6.asp
This would allow a kind of NxN output, with each attachment holding one position of the output block.
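A rough sketch of the MRT side for N = 2, assuming GLSL 3.30 and an FBO with four color attachments enabled through `glDrawBuffers` on the host side. Each invocation reads one 2x2 input block, computes a per-block result (the block mean, chosen only as an example), and writes one value per block position. The names `uInput` and `out00` to `out11` are hypothetical. The attachments would need a floating-point format to keep the negative values this particular example produces.

```glsl
#version 330 core

// Hypothetical name: uInput is the full-resolution source texture.
uniform sampler2D uInput;

// One output per position inside the 2x2 block. outXY holds the pixel
// at offset (X, Y) from the block's lowest-coordinate corner.
layout(location = 0) out vec4 out00;
layout(location = 1) out vec4 out10;
layout(location = 2) out vec4 out01;
layout(location = 3) out vec4 out11;

void main() {
    // One invocation per block: the render target is half the input size.
    ivec2 origin = ivec2(gl_FragCoord.xy) * 2;

    vec4 p00 = texelFetch(uInput, origin,               0);
    vec4 p10 = texelFetch(uInput, origin + ivec2(1, 0), 0);
    vec4 p01 = texelFetch(uInput, origin + ivec2(0, 1), 0);
    vec4 p11 = texelFetch(uInput, origin + ivec2(1, 1), 0);

    // Per-block result, shared by all four outputs.
    vec4 mean = 0.25 * (p00 + p10 + p01 + p11);

    // Example block operation: remove the block mean from every pixel.
    out00 = p00 - mean;
    out10 = p10 - mean;
    out01 = p01 - mean;
    out11 = p11 - mean;
}
```

The number of simultaneous outputs is capped by the implementation (query `GL_MAX_DRAW_BUFFERS`), so this approach fits small blocks such as 2x2 better than larger ones.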
An extra pass would then be needed to expand the NxN color attachments into the actual NxN pixels on screen.
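That extra pass could look roughly like the following for N = 2, again assuming GLSL 3.30. It runs at full output resolution and assumes four block-resolution textures, one per position inside the block. The sampler names `uOut00` to `uOut11` are hypothetical, as is the convention that `uOutXY` holds the pixel at offset (X, Y) within each block.

```glsl
#version 330 core

// Hypothetical names: the four color attachments written by the block
// pass, bound here as textures.
uniform sampler2D uOut00;
uniform sampler2D uOut10;
uniform sampler2D uOut01;
uniform sampler2D uOut11;

out vec4 fragColor;

void main() {
    ivec2 pixel  = ivec2(gl_FragCoord.xy); // full-resolution output pixel
    ivec2 block  = pixel / 2;              // which block this pixel is in
    ivec2 within = pixel % 2;              // position inside the block

    // Pick the attachment that stores this position of the block.
    vec4 c;
    if (within.y == 0) {
        c = (within.x == 0) ? texelFetch(uOut00, block, 0)
                            : texelFetch(uOut10, block, 0);
    } else {
        c = (within.x == 0) ? texelFetch(uOut01, block, 0)
                            : texelFetch(uOut11, block, 0);
    }
    fragColor = c;
}
```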