Please read the general instructions for this exercise first. Here are the additional instructions specific to this task:
Parallelize your solution to CP1 by exploiting instruction-level parallelism. Make sure that the performance-critical operations are pipelined efficiently. Do not use any other form of parallelism yet in this exercise. Please do all arithmetic with double-precision floating point numbers.
For this technical exercise, we have disabled auto-vectorization.
I will first run all kinds of tests to see that your code works correctly. You can try it out locally by running `./grading test`, but please note that your code has to compile and work correctly not only on your own computer but also on our machines.
If all is fine, I will run the benchmarks. You can try it out on your own computer by running `./grading benchmark`, but of course the precise running time on your own computer might be different from the performance on our grading hardware.
Name | Parameters | Description |
---|---|---|
benchmarks/1 | nx = 1000, ny = 1000 | the input contains 1000 × 1000 pixels, and the output should contain 1000 × 1000 pixels |
benchmarks/2 | nx = 1000, ny = 4000 | the input contains 4000 × 1000 pixels, and the output should contain 4000 × 4000 pixels |
In this task your submission will be graded using benchmarks/2: the input contains 4000 × 1000 pixels, and the output should contain 4000 × 4000 pixels.
The point thresholds are as follows. If you submit your solution no later than Saturday, 31 August 2024, at 23:59:59 (Helsinki time), your score will be:
Running time | Points |
---|---|
≤ 7.000 sec | 1 |
≤ 6.000 sec | 2 |
≤ 5.000 sec | 3 |
For late submissions you will not get any points.