From 63cce5c70aae03dc5114714a0be99499554201fa Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 10:05:01 -0700
Subject: [PATCH 1/7] created stb_resample_ideas.txt

---
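Reviewer sketch, not part of the patch: the scanline bookkeeping the notes describe for case (c), downsampling with a box filter. The name box_input_range and its parameters are made up for illustration and are not an existing stb API. It assumes output row y covers input rows [y*scale, (y+1)*scale).

```c
#include <assert.h>

/* Hypothetical helper: for output scanline out_y of a box-filter
   downsample by `scale`, report the first and last input scanlines
   that overlap the output pixel's footprint. */
static void box_input_range(int out_y, double scale, int *first, int *last)
{
    double lo, hi;
    assert(out_y >= 0 && scale >= 1.0);
    lo = out_y * scale;            /* footprint start, in input rows   */
    hi = (out_y + 1) * scale;      /* footprint end (exclusive)        */
    *first = (int) lo;             /* floor, since lo is non-negative  */
    *last  = (int) hi;
    if ((double) *last == hi)      /* footprint ends exactly on a row  */
        --*last;                   /* boundary, so that row is unused  */
}
```

With scale 2.5 this gives 0..2, 2..4, 5..7 and 7..9 for output scanlines 0 to 3, matching the example in the notes, so rows 2 and 7 are the ones shared between neighbouring outputs.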
 docs/stb_resample_ideas.txt | 62 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 docs/stb_resample_ideas.txt

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
new file mode 100644
index 0000000..70fda6a
--- /dev/null
+++ b/docs/stb_resample_ideas.txt
@@ -0,0 +1,62 @@
+Consider three cases just to suggest the spectrum
+of possiblities:
+
+a) linear upsample: each output pixel is a weighted sum
+of 4 input pixels
+
+b) cubic upsample: each output pixel is a weighted sum
+of 16 input pixels
+
+c) downsample by N with box filter: each output pixel
+is a weighted sum of NxN input pixels, N can be very large
+
+Now, suppose you want to handle 8-bit input, 16-bit
+input, and float input, and you want to do sRGB correction
+or not.
+
+Suppose you create a temporary buffer of float pixels, say
+one scanline tall. Actually two temp buffers, one for the
+input and one for the output. You decode a scanline of the
+input into the temp buffer which is always linear floats. This
+isolates the handling of 8/16/float and sRGB to one place
+(and still allows you to make optimized 8-bit-sRGB-to-float
+lookup tables). This also allows you to put wrap logic here,
+explicitly wrapping, reflecting, or replicating-from-edge
+pixels that would come from off-edge.
+
+You then do whatever the appropriate weighted sums are
+into the output buffer, and you move on to the next
+scanline of the input.
+
+The algorithm just described works directly for case (c).
+Suppose you're downsampling by 2.5; then output scanline 0
+sums from input scanlines 0, 1, and 2; output scanline 1
+sums from 2,3,4; output 2 from 5,6,7; output 3 from 7,8,9.
+Note how 2 & 7 get reused, but we don't have to recompute
+them because we can do things in a single linear pass
+through the input and output at the same time.
+
+Now, consider case (a). When upsampling, the same two input
+scanlines will get sampled-from for multiple output scanlines.
+So, to avoid recomputing the input scanlines, we need either
+multiple input or multiple output temp buffer lines. Since
+the number of output lines a given pair of input scanlines
+might touch scales with the upsample amount, it makes more
+sense to use two input scanline buffers. For cubic, you'll
+need four scanline buffers, and in general the number of
+buffers will be limited by the max filter width, which is
+presumably hardcoded.
+
+You want to avoid memory allocations (since you're passing
+in the target buffer already), so instead of using a scanline-width
+temp buffer, use some fixed-width temp buffer that's W pixels,
+and scale the image in vertical stripes that are that wide.
+Suppose you make the temp buffers 256 wide; then an upsample
+by 8 computes 256-pixel-width strips (from ~32-pixel-wide input
+strips), but a downsample by 8 computes ~32-pixel-width
+strips (from a 256-pixel width strip). Note this limits
+the max down/upsampling to be ballpark 256x along the
+horizontal axis.
+
+
+

From c27ccec43656e43965fb58e4a0ec78fe02ff2be8 Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 11:37:54 -0700
Subject: [PATCH 2/7] resampler prototypes

---
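Reviewer sketch, not part of the patch: the cubic sampling function from the "Reference" section written out as C. The name cubic_weight is hypothetical; the polynomials are the ones in the notes, just rearranged into Horner form.

```c
/* Cubic kernel from the reference section. x is the distance from the
   sample position in source pixels; a is the free parameter (the notes
   suggest trying -0.5f). */
static float cubic_weight(float x, float a)
{
    if (x < 0.0f)
        x = -x;                                   /* kernel is symmetric */
    if (x <= 1.0f)                                /* 0 <= x <= 1 */
        return ((a + 2.0f) * x - (a + 3.0f)) * x * x + 1.0f;
    if (x <= 2.0f)                                /* 1 <  x <= 2 */
        return ((a * x - 5.0f * a) * x + 8.0f * a) * x - 4.0f * a;
    return 0.0f;                                  /* otherwise */
}
```

For a separable cubic, a 2D weight would presumably be cubic_weight(dx, a) * cubic_weight(dy, a), which is where the 16 input pixels of case (b) come from.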
 docs/stb_resample_ideas.txt | 66 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
index 70fda6a..e6866d0 100644
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -60,3 +60,69 @@ horizontal axis.
 
 
 
+Function prototypes:
+
+the highest-level one could be:
+
+   stb_resample_8bit(uint8_t       *dest, int dest_width, int dest_height,
+                     uint8_t const *src , int  src_width, int  src_height,
+                     int channels,
+                     stbr_filter filter);
+
+the lowest-level one could be:
+
+   stb_resample_arbitrary(void       *dest, stbr_type dest_type, int dest_width, int dest_height, int dest_stride_in_bytes,
+                          void const *src , stbr_type  src_type, int  src_width, int  src_height, int src_stride_in_bytes,
+                          int channels,
+                          int nonpremul_alpha_channel_index,
+                          stbr_wrapmode wrap,                     // clamp, wrap, mirror
+                          stbr_filter filter,
+                          float s0, float t0, float s1, float t1, // range of source to use, 0..1 in GPU texture-coordinate style
+                          void  *tempmem, size_t tempmem_size_in_bytes);
+
+And there would be a bunch of convenience functions at in-between levels.
+
+
+Some notes:
+
+   Intermediate-level functions should be provided for each source type & same dest type
+   so that the code is typesafe; only when people fall back to stb_resample_arbitrary should
+   they be at risk for type unsafety. (One way to deal with the explosion of functions of
+   every possible type would be to define one function for each input type, and accept three
+   separate output pointers, one for each type, only one of which can be non-NULL.)
+
+   nonpremul_alpha_channel_index:
+       if this is negative, no channels are processed specially
+       if this is non-negative, then it's the index of the alpha channel,
+           and the image should be treated as non-premultiplied alpha that
+           needs to be resampled accounting for this (weight the sampling
+           by the alpha channel, i.e. premultiply, filter, unpremultiply).
+           this mechanism only allows one alpha channel and ALL channels 
+           are scaled by it; an alternative would be to find some way to
+           pass in which channels serve as alpha channels for which other
+           channels, but eh.
+
+   s0,t0,s1,t1:
+       this allows fine subpixel-positioning and subpixel-resizing in an explicit way without
+           things having to be exact pixel multiples. it allows people to pseudo-stream
+           images by computing "tiles" of images a bit at a time without forcing those
+           tiles to quantize their source data.
+
+   tempmem, tempmem_size
+       all functions will needed tempmem, but they can allocate a fixed tempmem buffer
+           on the stack. providing an API that allows overriding the amount of tempmem
+           available allows people to process arbitrarily large images. the return
+           value for the function could be 0 on success or non-0 being the size of
+           tempmem needed.
+   
+
+
+
+Reference:
+
+Cubic sampling function for separable cubic:
+   f(x) = (a+2)*x^3 - (a+3)*x^2 + 1       for 0 <= x <= 1
+   f(x) = a*x^3 - 5*a*x^2 + 8*a*x - 4*a   for 1 < x <= 2
+   f(x) = 0                               otherwise
+   "a" is configurable, try -1/2 (from http://pixinsight.com/forum/index.php?topic=556.0 )
+

From 3e8a89cad1186e074783259fbdd5d6d14e6e0832 Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 11:57:46 -0700
Subject: [PATCH 3/7] more resampler notes

---
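Reviewer sketch, not part of the patch: how a signed stride covers both top-to-bottom and bottom-to-top images, as the stride note describes. row_ptr is a hypothetical helper, not an stb function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: address of row y, counting from the row that
   first_row points at. stride_in_bytes may be negative. */
static const uint8_t *row_ptr(const void *first_row, int stride_in_bytes, int y)
{
    /* widen to ptrdiff_t before multiplying so large images don't overflow int */
    return (const uint8_t *) first_row + (ptrdiff_t) stride_in_bytes * y;
}
```

For a bottom-to-top image of height h with rows pitch bytes apart, the caller would pass a pointer to the last row in memory (base + (h-1)*pitch) and a stride of -pitch; row 0 is then the bottom-most row in memory and row h-1 lands on base.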
 docs/stb_resample_ideas.txt | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
index e6866d0..a634d91 100644
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -71,8 +71,8 @@ the highest-level one could be:
 
 the lowest-level one could be:
 
-   stb_resample_arbitrary(void       *dest, stbr_type dest_type, int dest_width, int dest_height, int dest_stride_in_bytes,
-                          void const *src , stbr_type  src_type, int  src_width, int  src_height, int src_stride_in_bytes,
+   stb_resample_arbitrary(void       *dst, stbr_type dst_type, int dst_width, int dst_height, int dst_stride_in_bytes,
+                          void const *src, stbr_type src_type, int src_width, int src_height, int src_stride_in_bytes,
                           int channels,
                           int nonpremul_alpha_channel_index,
                           stbr_wrapmode wrap,                     // clamp, wrap, mirror
@@ -80,17 +80,11 @@ the lowest-level one could be:
                           float s0, float t0, float s1, float t1, // range of source to use, 0..1 in GPU texture-coordinate style
                           void  *tempmem, size_t tempmem_size_in_bytes);
 
-And there would be a bunch of convenience functions at in-between levels.
+And there would be a bunch of convenience functions in-between those two levels.
 
 
 Some notes:
 
-   Intermediate-level functions should be provided for each source type & same dest type
-   so that the code is typesafe; only when people fall back to stb_resample_arbitrary should
-   they be at risk for type unsafety. (One way to deal with the explosion of functions of
-   every possible type would be to define one function for each input type, and accept three
-   separate output pointers, one for each type, only one of which can be non-NULL.)
-
    nonpremul_alpha_channel_index:
        if this is negative, no channels are processed specially
        if this is non-negative, then it's the index of the alpha channel,
@@ -108,13 +102,36 @@ Some notes:
            images by computing "tiles" of images a bit at a time without forcing those
            tiles to quantize their source data.
 
-   tempmem, tempmem_size
+   tempmem, tempmem_size:
        all functions will needed tempmem, but they can allocate a fixed tempmem buffer
            on the stack. providing an API that allows overriding the amount of tempmem
            available allows people to process arbitrarily large images. the return
            value for the function could be 0 on success or non-0 being the size of
            tempmem needed.
    
+   src_stride, dst_stride:
+       the stride variables are signed to allow you to describe both traditional
+           top-to-bottom images (pass in a pointer to the top-left pixel and
+           a positive stride) and bottom-to-top images (pass in a pointer to
+           the bottom-left pixel and a negative stride)
+
+   ordering of src & dest:
+       put these in whatever order you like, i just chose one arbitrarily
+
+   width & height:
+       these are ints not unsigned ints or size_ts because i personally forbid
+           unsigned variables for almost everything to avoid signed/unsigned comparison
+           issues, but this is a matter of personal taste and you can do differently
+
+   Intermediate-level functions should be provided for each source type & same dest type
+   so that the code is typesafe; only when people fall back to stb_resample_arbitrary should
+   they be at risk for type unsafety. (One way to avoid an explosion of functions of
+   every possible *combination* of types in a type-safe way would be to define one function
+   for each input type, and accept three separate output pointers, one for each type, only
+   one of which can be non-NULL. 9 functions isn't that bad, but if you want to have three
+   or four intermediate-level functions with fewer parameters, 9*4 gets silly. Could also
+   use the same trick for stb_resample_arbitrary, replacing it with three typesafe functions.)
+
 
 
 

From 9c9a68787d1d2bfa2f216f80f0b22c32439f46e1 Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 12:16:11 -0700
Subject: [PATCH 4/7] imageresampler library reference

---
 docs/stb_resample_ideas.txt | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
index a634d91..c60b0dc 100644
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -1,3 +1,13 @@
+1.
+
+Consider just porting this C++ public domain
+library back to C:
+    https://code.google.com/p/imageresampler/source/browse/#svn%2Ftrunk
+(recommended by @castano)
+
+
+2.
+
 Consider three cases just to suggest the spectrum
 of possiblities:
 

From 6f779fb67a34c3dac69c624b547c57a8b941f0fe Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 12:17:43 -0700
Subject: [PATCH 5/7] whoops imageresampler link

---
 docs/stb_resample_ideas.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
index c60b0dc..6129dc9 100644
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -8,7 +8,7 @@ library back to C:
 
 2.
 
-Consider three cases just to suggest the spectrum
+@VinoBS Another option is to just port @richgel999's C++ library to C/stb: https://code.google.com/p/imageresampler/source/browse/#svn%2FtrunkConsider three cases just to suggest the spectrum
 of possiblities:
 
 a) linear upsample: each output pixel is a weighted sum

From 92b08aa98aaadb3639030d3966bf1174aee27c8f Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 12:39:29 -0700
Subject: [PATCH 6/7] more resampling notes

---
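Reviewer sketch, not part of the patch: the horizontal half of the separable scheme from point 2, for the simplest case only (linear filter, one float channel, edge pixels replicated). It reads the 'decode' buffer and writes one row of the "input" buffer that the vertical resampler would later consume. The name resample_row_linear and the pixel-center mapping are assumptions.

```c
/* Hypothetical horizontal pass: linearly resample one decoded float
   scanline of src_w pixels into dst_w pixels. */
static void resample_row_linear(const float *decode, int src_w,
                                float *out, int dst_w)
{
    int x;
    for (x = 0; x < dst_w; ++x) {
        /* map the output pixel center into source pixel coordinates */
        float sx = (x + 0.5f) * src_w / dst_w - 0.5f;
        int x0, x1;
        float t;
        /* clamping the coordinate replicates the edge pixel for linear */
        if (sx < 0.0f) sx = 0.0f;
        if (sx > (float) (src_w - 1)) sx = (float) (src_w - 1);
        x0 = (int) sx;                        /* left tap                   */
        x1 = x0 + 1 < src_w ? x0 + 1 : x0;    /* right tap, kept in range   */
        t  = sx - (float) x0;                 /* weight of the right tap    */
        out[x] = decode[x0] * (1.0f - t) + decode[x1] * t;
    }
}
```

Each input scanline then goes through this once, no matter how many output scanlines end up sampling it, which is the saving the separable approach is after.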
 docs/stb_resample_ideas.txt | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
index 6129dc9..552cc9a 100644
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -57,7 +57,25 @@ need four scanline buffers, and in general the number of
 buffers will be limited by the max filter width, which is
 presumably hardcoded.
 
-You want to avoid memory allocations (since you're passing
+It turns out to be slightly different for two reasons:
+
+   1. when using an arbitrary filter and downsampling,
+      you actually need N output buffers and 1 input buffer
+      (vs 1 output buffer and N input buffers upsampling)
+
+   2. this approach will be very inefficient as written.
+      you want to use separable filters and actually do
+      separable computation: first decode an input scanline
+      into a 'decode' buffer, then horizontally resample it
+      into the "input" buffer (kind of a misnomer, but
+      they're the inputs to the vertical resampler)
+
+(The above approach isn't optimal for non-uniform resampling;
+optimal is to do whichever axis is smaller first, but I don't
+think we have to care about doing that right.)
+
+
+Now, you probably want to avoid memory allocations (since you're passing
 in the target buffer already), so instead of using a scanline-width
 temp buffer, use some fixed-width temp buffer that's W pixels,
 and scale the image in vertical stripes that are that wide.

From ee8e926317c034def978c6738d0c1145577f0ba4 Mon Sep 17 00:00:00 2001
From: Sean Barrett <sean2@nothings.org>
Date: Tue, 22 Jul 2014 12:45:24 -0700
Subject: [PATCH 7/7] even more resampling notes

---
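Reviewer sketch, not part of the patch: one way the "#3 with #4 as an override" arrangement and the "0 on success or non-0 being the size of tempmem needed" return convention could fit together. Everything here is hypothetical: the name sketch_resample, the 4096-byte stack size, and the `needed` parameter, which stands in for whatever the real resampler would compute from image sizes and filter width.

```c
#include <stddef.h>

enum { SKETCH_STACK_TEMP_BYTES = 4096 };   /* assumed size, not from the notes */

/* Hypothetical skeleton of the tempmem handling only. */
static size_t sketch_resample(void *tempmem, size_t tempmem_size_in_bytes,
                              size_t needed)
{
    unsigned char stack_temp[SKETCH_STACK_TEMP_BYTES];
    if (tempmem == NULL) {                 /* option #3: fixed stack buffer */
        tempmem = stack_temp;
        tempmem_size_in_bytes = sizeof stack_temp;
    }
    if (needed > tempmem_size_in_bytes)
        return needed;                     /* caller should retry with this much */
    /* ... the actual resampling would use tempmem here ... */
    (void) tempmem;
    return 0;                              /* success */
}
```

A caller would try with tempmem NULL first and, on a non-zero return, supply a buffer of at least that many bytes (option #4) and call again.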
 docs/stb_resample_ideas.txt | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/docs/stb_resample_ideas.txt b/docs/stb_resample_ideas.txt
index 552cc9a..a9842d3 100644
--- a/docs/stb_resample_ideas.txt
+++ b/docs/stb_resample_ideas.txt
@@ -8,7 +8,7 @@ library back to C:
 
 2.
 
-@VinoBS Another option is to just port @richgel999's C++ library to C/stb: https://code.google.com/p/imageresampler/source/browse/#svn%2FtrunkConsider three cases just to suggest the spectrum
+Consider three cases just to suggest the spectrum
 of possiblities:
 
 a) linear upsample: each output pixel is a weighted sum
@@ -75,8 +75,24 @@ optimal is to do whichever axis is smaller first, but I don't
 think we have to care about doing that right.)
 
 
-Now, you probably want to avoid memory allocations (since you're passing
-in the target buffer already), so instead of using a scanline-width
+Now, you can either:
+
+    1. malloc the temp memory
+    2. alloca it
+    3. allocate a fixed amount on the stack
+    4. let the user pass it in
+
+I forbid #2 in stb libraries for portability.
+
+If you're not allocating the output image, but rather requiring
+the user to pass it in, it's probably worth trying to avoid #1
+because people always want to use stb libs without any memory
+allocations for various reasons. (Note that most stb libs go
+crazy with memory allocations--you shouldn't use stb_image
+in a console game--but I've tried to avoid it more in newer
+libs.)
+
+The way #3 would work is instead of using a scanline-width
 temp buffer, use some fixed-width temp buffer that's W pixels,
 and scale the image in vertical stripes that are that wide.
 Suppose you make the temp buffers 256 wide; then an upsample
@@ -86,6 +102,9 @@ strips (from a 256-pixel width strip). Note this limits
 the max down/upsampling to be ballpark 256x along the
 horizontal axis.
 
+In the following, I do #3 and allow #4 for cases where #3 is
+too small, but it's not the only possibility:
+
 
 
 Function prototypes:
@@ -101,11 +120,11 @@ the lowest-level one could be:
 
    stb_resample_arbitrary(void       *dst, stbr_type dst_type, int dst_width, int dst_height, int dst_stride_in_bytes,
                           void const *src, stbr_type src_type, int src_width, int src_height, int src_stride_in_bytes,
+                          float s0, float t0, float s1, float t1, // range of source to use, 0..1 in GPU texture-coordinate style
                           int channels,
                           int nonpremul_alpha_channel_index,
                           stbr_wrapmode wrap,                     // clamp, wrap, mirror
                           stbr_filter filter,
-                          float s0, float t0, float s1, float t1, // range of source to use, 0..1 in GPU texture-coordinate style
                           void  *tempmem, size_t tempmem_size_in_bytes);
 
 And there would be a bunch of convenience functions in-between those two levels.
@@ -113,6 +132,12 @@ And there would be a bunch of convenience functions in-between those two levels.
 
 Some notes:
 
+   s0,t0,s1,t1:
+       this allows fine subpixel-positioning and subpixel-resizing in an explicit way without
+           things having to be exact pixel multiples. it allows people to pseudo-stream
+           images by computing "tiles" of images a bit at a time without forcing those
+           tiles to quantize their source data.
+
    nonpremul_alpha_channel_index:
        if this is negative, no channels are processed specially
        if this is non-negative, then it's the index of the alpha channel,
@@ -124,12 +149,6 @@ Some notes:
            pass in which channels serve as alpha channels for which other
            channels, but eh.
 
-   s0,t0,s1,t1:
-       this allows fine subpixel-positioning and subpixel-resizing in an explicit way without
-           things having to be exact pixel multiples. it allows people to pseudo-stream
-           images by computing "tiles" of images a bit at a time without forcing those
-           tiles to quantize their source data.
-
    tempmem, tempmem_size:
        all functions will needed tempmem, but they can allocate a fixed tempmem buffer
            on the stack. providing an API that allows overriding the amount of tempmem