Problem 4 - ORAC

We will discuss the solution to ORAC from AIO 2025.

Solution

Observation: training earlier is better than training later

A useful observation in this problem is that training earlier is better than training later. Specifically, assume that we train for a total of x days. Then it is best to train for the first x days and relax afterwards, rather than relaxing between our x days of training. You shouldn't just trust me though, instead you should try to prove that this observation is correct. I've written my proof below, but try to prove it for yourself first!

Proof

We will use a type of proof called an exchange argument. In this proof style, we start with a solution that does not meet the criteria of our observation and show that we can change it bit-by-bit until it does meet the criteria.

Assume we train for x days but these are not the first x days (for example, suppose we train 4 times on days 1, 2, 3, and 6). Then, we can always find some day d where we trained but we did not train on day d-1 (in the previous example we could pick d = 6 since we trained on day 6 but not day 5).

What happens if we change our training schedule slightly so that we train on day d-1 instead of day d? In our example, we now train on days 1, 2, 3, 5 and so our skill levels in the afternoons are 1, 2, 3, 2, 3, 2, 1, 0, -1, ... etc. Before the change we trained on days 1, 2, 3, 6 and so our skill levels were 1, 2, 3, 2, 1, 2, 1, 0, -1, ... etc. In this example, the swap increased our skill level on day 5 from 1 to 3 while the other days were unchanged, meaning this swap was a good thing to do! This is not unique to our example: it works no matter what example we pick. Specifically, the skill level on day d-1 increases by 2 because rather than decreasing by 1 it now increases by 1. The skill levels on the other days do not change because the skill level on a specific day depends only on how many times we have trained and relaxed so far, and these quantities do not change for the other days.

If we keep finding days d where we train on day d but not d-1 and swap these, we will eventually end up with a solution where we train for the x days and relax afterwards. Each of these swaps increased our skill level on one day and did not change the other days, which proves our observation.

All the solutions presented on this page rely on the observation. However, there are other solutions for every subtask that don't rely on it.

Subtask 1: D_i = 1 for all i and N ≤ 1000

Using the observation from above, we want to train continuously before we relax. We just need to determine how many days we must train for. Let's try an example where N = 10. If we train for 5 days then our skill levels will be 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, ... etc giving us only 9 days with a skill of at least 1, which isn't enough. If we train for 6 days then we will have 11 days with a skill level of at least 1 which is enough. The answers for N = 10 and N = 11 are therefore both 6. In general, the answer is ⌊N/2⌋ + 1 (in C++ this is N/2+1 and in Python this is N//2+1).

Subtask 2: D_i ≤ 2 for all i and N ≤ 1000

If D_i = 2 for all i then using similar logic to subtask 1 the answer would be ⌊N/2⌋ + 2. This is only one higher than the answer for subtask 1, and so for any case in this subtask the answer will either be ⌊N/2⌋ + 1 or ⌊N/2⌋ + 2 depending on how many 1s and 2s there are.

Let's use N₂ to represent the number of problems with D_i = 2. If we ignore the difficulty 1 tasks, then the answer is ⌊N₂/2⌋ + 2. We must also check if this answer is big enough to solve the difficulty 1 tasks. Specifically, the answer will be the maximum of ⌊N₂/2⌋ + 2 (the answer if we only consider difficulty 2 problems) and ⌊N/2⌋ + 1 (the answer if we pretend all problems have D_i = 1).

Subtask 3: D_i ≤ 1000 for all i and N ≤ 1000

For this subtask we will extend our subtask 2 solution. We already know that the answer is at least ⌊N/2⌋ + 1. We can redefine N₂ as the number of problems with difficulty at least 2 and then we also know that the answer is at least ⌊N₂/2⌋ + 2. In general, we define N_i as the number of problems with difficulty at least i. If N_i ≥ 1, then by extending the ideas from subtask 1 and 2 we know the answer is at least ⌊N_i/2⌋ + i. This is because we must train for ⌊N_i/2⌋ + i days to solve all the tasks with difficulty at least i.

Let max represent the maximum value of ⌊N_i/2⌋ + i across all i where N_i ≥ 1. We know that the answer must be at least max. It is also possible to prove that the answer is equal to max. You shouldn't just believe me: try to prove it yourself!

Let's analyse the time complexity of this solution. Since the input is given to us in sorted order we know that D_N is the maximum D_i and our solution must compute N_i for each i from 1 to D_i. We can do this in O(N) time for each i, giving a time complexity of O(ND_N). This is fast enough to solve subtask 3 but not fast enough to solve the full problem.

Full Solution

The subtask 3 solution can be optimised to pass this subtask. There are several ways to do this, some with a time complexity of O(D_Nlog(N)), others with O(N + D_N) or even O(N). We leave it as a challenge for you to find one of these solutions!

Code

Below you can see the subtask 3 solution written in Python and C++. I suggest trying to code the solution yourself and only look at our code after you finish or if you get stuck.

Python 3
C++

# N is the number of informatics problems on ORAC.
N = 0

# D contains the difficulty scores of the problems. Note that the list starts
# from 0, and so the values are D[0] to D[N-1].
D = []

# Read the value of N and the difficulty scores.
N = int(input().strip())
D = list(map(int, input().strip().split()))

# Compute the answer
answer = 0

for i in range(1, D[-1]+1):
  # find Ni
  Ni = 0
  for d in D:
    if d >= i:
      Ni += 1
  answer = max(answer, Ni//2+i)

print(answer)

#include <cstdio>
#include <algorithm> // for std::max

/* N is the number of informatics problems on ORAC. */
int N;

/*
 * D contains the difficulty scores of the problems. Note that the array starts
 * from 0, and so the values are D[0] to D[N-1].
 */
int D[200005];

int answer;
int main(void) {
  /* Read the value of N and the difficulty scores. */
  scanf("%d", &N);
  for (int i = 0; i < N; i++) {
    scanf("%d", &D[i]);
  }

  /* Compute the answer */
  int answer = 0;

  for (int i = 1; i <= D[N-1]; i++) {
    // find Ni
    int Ni = 0;
    for (int d : D) {
      if (d >= i) {
        Ni++;
      }
    }
    answer = std::max(answer, Ni/2+i);
  }

  printf("%d\n", answer);
}

Click here to go back.