Please read the general instructions for this exercise first. Here are the additional instructions specific to this task:
Parallelize your solution to CP1 by exploiting instruction-level parallelism. Make sure that the performance-critical operations are pipelined efficiently. Do not use any other form of parallelism yet in this exercise. Please do all arithmetic with double-precision floating point numbers.
For this technical exercise, we have disabled auto-vectorization.
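As a rough illustration of what "pipelined efficiently" can mean, here is a minimal sketch. It assumes the performance-critical operation is a long sum of products in which every addition depends on the previous one. The helper name `dot_ilp` and the choice of four accumulators are hypothetical and not part of the course template. The right number of accumulators depends on the latency and throughput of the target CPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the course template): sum of products
// of two double arrays, written so that the additions form K independent
// dependency chains instead of one long chain.
double dot_ilp(const double* a, const double* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));

    constexpr std::size_t K = 4;  // assumed accumulator count; tune per CPU
    double acc[K] = {0.0, 0.0, 0.0, 0.0};

    std::size_t i = 0;
    // Main loop: each acc[k] depends only on its own previous value,
    // so the K multiply-add operations per iteration can overlap
    // in the CPU pipeline.
    for (; i + K <= n; i += K) {
        for (std::size_t k = 0; k < K; ++k) {
            acc[k] += a[i + k] * b[i + k];
        }
    }

    // Tail: the remaining n % K elements.
    double tail = 0.0;
    for (; i < n; ++i) {
        tail += a[i] * b[i];
    }

    // Combine the partial sums once at the end.
    return ((acc[0] + acc[1]) + (acc[2] + acc[3])) + tail;
}
```

Splitting the sum across several accumulators changes the order of the floating-point additions, so the result can differ from a strictly sequential sum in the last bits. With double precision this is usually well within the tolerance of a correctness test, but that is worth checking against the actual grading criteria.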
I will first run all kinds of tests to see that your code works correctly. You can try it out locally by running ./grading test, but please note that your code has to compile and work correctly not only on your own computer but also on our machines.
If all is fine, I will run the benchmarks. You can try it out on your own computer by running ./grading benchmark, but of course the precise running time on your own computer might be different from the performance on our grading hardware.
| Name | Parameters | Description |
|---|---|---|
| benchmarks/1 | nx = 1000, ny = 1000 | the input contains 1000 × 1000 pixels, and the output should contain 1000 × 1000 pixels |
| benchmarks/2 | nx = 1000, ny = 4000 | the input contains 4000 × 1000 pixels, and the output should contain 4000 × 4000 pixels |
In this task your submission will be graded using benchmarks/2: the input contains 4000 × 1000 pixels, and the output should contain 4000 × 4000 pixels.
The point thresholds are as follows. If you submit your solution by Wednesday, 08 October 2025, at 23:59:59 (Helsinki), your score will be:
| Running time | Points | 
|---|---|
| ≤ 7.000 sec | 1 | 
| ≤ 6.000 sec | 2 | 
| ≤ 5.000 sec | 3 | 
If you submit your solution after the deadline but before the course ends on Wednesday, 31 December 2025, at 23:59:59 (Helsinki), your score will be:
| Running time | Points | 
|---|---|
| ≤ 6.000 sec | 1 | 
| ≤ 5.000 sec | 2 |