Programming Parallel Computers

HY 2023–2024

IS: image segmentation

Please read the general instructions on this page first, and then check the individual tasks for more details. For each task you can download a zip file that contains the code templates you can use for development.

Task Attempts Points Max Rating Man. Deadline for full points
IS4: fast solution

Using all resources that you have in the CPU, solve the task as fast as possible. You are encouraged to exploit instruction-level parallelism, multithreading, and vector instructions whenever possible, and also to optimize the memory access pattern. Please do all arithmetic with double-precision floating point numbers.

5 ★★ 2024-08-31 at 23:59:59
IS6a: fast CPU solution for 1-bit images

In this task, the input is always a monochromatic image: each input pixel is either entirely white with the RGB values (1,1,1) or entirely black with the RGB values (0,0,0). Make your solution to IS4 faster by exploiting this property. It is now enough to find a solution for only one color channel, and you will also have much less trouble with rounding errors. In this task, you are permitted to use single-precision floating point numbers.

5 ★★★ 2024-08-31 at 23:59:59
IS6b: fast GPU solution for 1-bit images

Port your solution to IS6a to the GPU; again, make it run as fast as possible.

5 ★★ 2024-08-31 at 23:59:59
IS9a: better algorithm

Design a more efficient algorithm that (at least in typical cases) does not need to try out all possible locations of the rectangle. Implement the algorithm efficiently on the CPU.

5 ★★★ 2024-08-31 at 23:59:59

General instructions for this exercise

Find the best way to partition the given figure in two parts: a monochromatic rectangle and a monochromatic background. The objective is to minimize the sum of squared errors.


We have already defined the following type for storing the result:

struct Result {
    int y0;
    int x0;
    int y1;
    int x1;
    float outer[3];
    float inner[3];

You need to implement the following function:

Result segment(int ny, int nx, const float* data)

Here data is a color image with ny*nx pixels, and each pixel consists of three color components, red, green, and blue. In total, there are ny*nx*3 floating point numbers in the array data.

The color components are numbered 0 <= c < 3, x coordinates are numbered 0 <= x < nx, y coordinates are numbered 0 <= y < ny, and the value of this color component is stored in data[c + 3 * x + 3 * nx * y].

Correct output

In the Result structure, the first four fields indicate the location of the rectangle. The upper left corner of the rectangle is at coordinates (x0y0), and the lower right corner is at coordinates (x1-1y1-1). That is, the width of the rectangle is x1-x0 pixels and the height is y1-y0 pixels. The coordinates have to satisfy 0 <= y0 < y1 <= ny and 0 <= x0 < x1 <= nx.

The last two fields indicate the color of the background and the rectangle. Field outer contains the three color components of the background and field inner contains the three color components of the rectangle.

Objective function

For each pixel (x,y) and color component c, we define the error error(y,x,c) as follows:

The total cost of the segmentation is the sum of squared errors, that is, the sum of error(y,x,c) * error(y,x,c) over all 0 <= c < 3 and 0 <= x < nx and 0 <= y < ny.

Your task is to find a segmentation that minimizes the total cost.


In IS4, IS6a, and IS6b tasks you are expected to use an algorithm that tries out all possible locations 0 ≤ y0 < y1 ≤ ny and 0 ≤ x0 < x1 ≤ nx for the rectangle and finds the best one. However, for each candidate location you should only perform O(1) operations to evaluate how good this position is. To achieve this, some preprocessing will be needed.

In IS9a you are expected to design a more efficient algorithm that (at least in typical cases) does not need to try out all possible locations of the rectangle. In IS9a your submission will be graded using a structured input that might resemble e.g. a real-world image in which some candidate positions are much better than others.


These examples show the segmentation produced by a correct implementation (right) for some sample images (left). Hover the mouse on the output to better see the segmentation.

Input OutputOutput
Input OutputOutput
Input OutputOutput
Input OutputOutput
Input OutputOutput

General hints

Spend some time with pen and paper first to get the math right. You need to find a very efficient way of checking which of the candidate positions are best. There is a fairly straightforward solution in which for each candidate position you will only need to do a few array lookups (to a precomputed array) and a few arithmetic operations to calculate an indicator for the quality of this position.

Remember that the average minimizes the sum of squared errors.

Use the inclusion–exclusion principle to quickly calculate the sum of values within a rectangle — if you can look up the orange and blue sums from a precomputed array, you can also calculate the sum of the gray area:

It may be helpful to organise the loops so that the outer loop tries out different sizes of the rectangle, and the inner loop tries all possible positions of the rectangle. Then in the innermost loop you only need to be able to compare candidate positions for rectangles with the same size. Precompute everything that you can outside the innermost loops.

Hints for IS9a

Here is one approach: First use a coarse grid, with e.g. 10 × 10 pixels per grid cell. Then try all possible ways to place the rectangle in this coarse grid. Each coarse location represents a set of approx. 10000 fine-grained locations (so it is much faster to try out all possible coarse locations). For each coarse location, calculate the following two estimates (efficiently, in constant time):

  • An upper bound: at most how much is the cost if I place the rectangle in some of the fine-grained locations that are represented by this coarse-grained location?
  • A lower bound: at least how much is the cost if I place the rectangle in some of the fine-grained locations that are represented by this coarse-grained location?

After these calculations, you can hopefully rule out a majority of coarse-grained locations: you know that there are other locations where the cost is at most some value s, so you can then ignore all coarse-grained locations where the cost is larger than s.

Finally, zoom in to those coarse-grained locations that you have not yet ruled out.