Please read the general instructions on this page first, and then check the individual tasks for more details. For each task you can download a zip file that contains the code templates you can use for development.

Task | Attempts | Points | Max | Rating | Man. | Deadline for full points |
---|---|---|---|---|---|---|

MF1: CPU baseline | ||||||

Implement a simple |
||||||

– | – |
5 + 0 | ★ | M | 2024-08-31 at 23:59:59 | |

MF2: multicore parallelism | ||||||

Parallelize your solution to MF1 with the help of |
||||||

– | – |
3 + 0 | ★ | M | 2024-08-31 at 23:59:59 | |

MF9a: better algorithm | ||||||

Design a |
||||||

– | – |
5 + 0 | ★★★ | M | 2024-08-31 at 23:59:59 |

In this task, we will implement a program for doing 2-dimensional median filtering with a rectangular window.

You need to implement the following function:

void mf(int ny, int nx, int hy, int hx, const float* in, float* out)

Here `in`

and `out`

are pointers to `float`

arrays, each of them contains `ny*nx`

elements. The arrays are already allocated by whoever calls this function; you do not need to do any memory management.

The original value of pixel (x,y) is given in `in[x + nx*y]`

and its new value will be stored in `out[x + nx*y]`

.

In the output, pixel (x,y) will contain the median of all pixels with coordinates (i,j) where

`0 <= i < nx`

,`0 <= j < ny`

,`x - hx <= i <= x + hx`

,`y - hy <= j <= y + hy`

.

This is the *sliding window*. Note that the window will contain at most `(2*hx+1) * (2*hy+1)`

pixels, but near the boundaries there will be fewer pixels.

For our purposes, the median of the vector **a** = (a_{1}, a_{2}, …, a_{n}) is defined as follows:

- Let x
_{1}, x_{2}, …, x_{n}be the values of**a**sorted in a non-decreasing order. - If n = 2k + 1, then the median of
**a**is x_{k+1}. - If n = 2k, then the median of
**a**is (x_{k}+ x_{k+1})/2.

These examples show what a correct implementation will do if you apply it to each color component of a bitmap image. In these examples, for each color component, the value of each pixel is the median of all pixel values within a sliding window of dimensions (2k+1) × (2k+1). Hover the mouse on the output images to see the differences between input and output.

*k = 1*

*k = 2*

*k = 5*

*k = 10*

*Noise reduction, k = 1*

*Noise reduction, k = 2*

Make sure that you use a fast selection algorithm to find the median. Do not sort; quickselect and other good selection algorithms are much faster than sorting, and there is already a good implementation in the C++ standard library, so you do not need to reinvent it.

Here are some examples of possible techniques that you can use to solve task **MF9a**:

- Partition the image in smaller, partially overlapping
**blocks**. For example, if your windows size is 21 × 21, a reasonable block size might be 50 × 50. Solve the median filter problem separately for each block; place the blocks so that each output pixel comes from exactly one block. Be careful with the boundaries. - Within each block, sort the input data and replace the original values by their ordinals, breaking ties arbitrarily — for example, if the input data was (1.0, 1.0, 0.1, 0.9, 0.5, 0.2, 1.0), replace it with (4, 5, 0, 3, 2, 1, 6). Now you have a much simpler task: instead of finding a median of floating point numbers, you only need to find the median of small,
**distinct**integers. - You can now represent the sliding window as a
**bit vector**: for example, if your block size is 50 × 50, then your sliding window can be represented as a bit vector with 2500 bits. Do not recompute the bit vector from scratch for every window position — if you move the window from (x,y) to (x+1,y), you only need to reset 2k+1 bits and set 2k+1 bits. - Exploit
**bit-parallelism**and the special CPU instructions that help with bit manipulation. To find the median from the bit vector efficiently, you may want to have a look at e.g. compiler intrinsics __builtin_popcountll (POPCNT instruction), __builtin_ctzll (TZCNT instruction), and _pdep_u64 (PDEP instruction). Some caching of pre-computed values may be helpful, too. - Pay attention to the
**locality**of memory references. For example, does it make more sense to slide the window from left to right or from top to bottom? - You can use
**OpenMP**to process multiple blocks in parallel.