Well, the last problem has already been solved. If you want to know where my question came from, I think the best way is to get an ATI card and write some stencil blitting code; then you will see the difference between them.

As for this thread, you can see here that Alfone Reinheart has found the cause of it.
Could you also help me think about whether there is an alternative way to write my code without losing efficiency?
I was asking because I actually use the blitting method described above, and I'd like to know whether it fails on AMD cards and what exactly does work. That's the main reason I was so invested in this question.

To get help replacing your code, you should describe what exactly you are trying to achieve. Are you trying to output the stencil buffer? What for, exactly?