Table of Contents

- Native Patch Extraction
- Changing Perspective
- The Mechanics of as_strided
- What about RGB?

This post is a recreation of Misha Laskin’s Twitter post about patch extraction in numpy. I wanted to provide a version of it that can be accessed without requiring a Twitter account.
Patch extraction is a common image preprocessing technique that splits an input image into a regular grid of sub-images, or patches. It is commonly used to prepare an image for input into a Vision Transformer (Dosovitskiy et al. 2021). As Misha points out in their original post, the same operation also underlies convolutions, min and max pooling, and splicing audio and text.
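To make the idea concrete before getting into the details, here is a minimal sketch of what "splitting an image into a regular grid" means for a small grayscale array. It uses a plain `reshape` followed by a `transpose`, which is one simple way to get non-overlapping patches and is not necessarily the approach the original post takes. The function name `extract_patches` and the `patch_size` parameter are made up for this illustration, and the sketch assumes the image dimensions divide evenly by the patch size.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Split a 2D array into a grid of non-overlapping square patches.

    Hypothetical helper for illustration only.
    """
    height, width = image.shape
    # Assumption: both dimensions are exact multiples of patch_size.
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # Split each axis into (grid index, offset inside the patch) ...
    grid = image.reshape(rows, patch_size, cols, patch_size)
    # ... then move both grid indices to the front.
    return grid.transpose(0, 2, 1, 3)

image = np.arange(16).reshape(4, 4)
patches = extract_patches(image, 2)
print(patches.shape)  # (2, 2, 2, 2): a 2x2 grid of 2x2 patches
print(patches[0, 1])  # top-right patch, i.e. image[0:2, 2:4]
```

The result is indexed as `patches[grid_row, grid_col, y, x]`, so `patches[0, 1]` is the patch in the first row and second column of the grid.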