By dkl9, written 2023-222, revised 2023-222 (0 revisions)

Suppose you have to do something once every `k` days, evenly spaced.
(Nothing special about days.
It could be any sampling interval; my choice is arbitrary.)
You can also do it on other days, but there is a cost to that.

An enemy, Eve, wants to figure out that regular component of the schedule.
She suspects that you may be following a once-every-`k`-days pattern as part of your evil plots, but she isn't sure of that.
She starts with no clue what `k` is, nor how that pattern is aligned (offset `j`).
Eve receives an `n`-day sample of your behaviour, with a binary variable to indicate whether you acted on each day.
That sample will include at least ⌊`n`/`k`⌋ `true`s.

Here's some RTensor code to generate a sample like that (you'll have to put in values for the variables):

```
n = ...
k = ...
j = ...
div(u, v) = (u/v == floor(u/v))
v = map(zero(n), ((_, i) => div(i - j, k)))
```

You want to obstruct Eve's understanding of your patterns. How much noise (irregular additions to the schedule) do you need to add to confuse her?

```
# Add random noise to the sample
# (you'll have to pick a noise coefficient)
c = ...
v = map(v, (x => any((x, floor(c + random())))))
```
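For readers without RTensor, a rough Python counterpart of the two snippets above might look like the following. It is a sketch under assumptions, not a line-by-line translation: the names `regular_schedule` and `add_noise` are made up here, days are numbered from 1, and the fixed seed is only there to make runs repeatable.

```python
import random

def regular_schedule(n, k, j):
    """Return n booleans; day i (1-based) is True when i - j is a multiple of k."""
    return [(i - j) % k == 0 for i in range(1, n + 1)]

def add_noise(v, c, rng=random):
    """Keep scheduled days on, and turn each other day on with probability c."""
    return [x or (rng.random() < c) for x in v]

rng = random.Random(0)               # seeded only for reproducibility
clean = regular_schedule(50, 6, 2)   # the regular component alone
noisy = add_noise(clean, 0.4, rng)   # what Eve would get to see
print("".join("1" if x else "0" for x in noisy))
```

The printed string will not match the sample used later in this post, since that one came from a different random draw.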

Really, what follows is my strategy, and Eve would use it iff she thinks like I do. A serious enemy might do something smarter than this.

To find the pattern, try every possible value of `k` (1 to `n`, inclusive, tho values too close to either extreme are unlikely) and, for each, every possible starting point `j` (there are `k` meaningfully distinct starting points for each value of `k`).
For each pair, check if the modeled pattern exists in the sample, i.e. whether you took the action under investigation on every day with an index `j` + `a``k` (1 ≤ `j` + `a``k` ≤ `n`, `a` ∈ ℕ, including 0).

As an example to work with, say `n` = 50.
Mapping `true` to 1 and `false` to 0, I present the sample

```
01000111110001000111010011001101101011010001001001
```

Assume Eve loads the sample and sets her own values of `k` and `j`, to run thru guesses.
Here's code to check if a pattern appears to exist with spacing `k` and starting point `j`:

```
pe(k, j) = all(map(v, ((x, i) => if(div(i - j, k), x, 1))))
```

Then Eve checks every plausible value.
This is roughly O(`n`³), i.e. slow.
Try replacing `n` with a lesser number, like 15.

```
map(1 .. n, (k => filter(1 .. k, (j => pe(k, j)))))
```
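The same search can be sketched in Python against the 50-day sample above. The function names `pattern_exists` and `find_patterns` are hypothetical, days are taken as 1-based, and `k` is capped at 15 as suggested.

```python
SAMPLE = "01000111110001000111010011001101101011010001001001"
v = [ch == "1" for ch in SAMPLE]

def pattern_exists(v, k, j):
    """True if every day j, j + k, j + 2k, ... (1-based) in the sample is an action."""
    return all(v[i - 1] for i in range(j, len(v) + 1, k))

def find_patterns(v, max_k):
    """List every (k, j) with 1 <= j <= k <= max_k whose pattern appears in v."""
    return [(k, j)
            for k in range(1, max_k + 1)
            for j in range(1, k + 1)
            if pattern_exists(v, k, j)]

patterns = find_patterns(v, 15)
print(patterns)
```

Unlike `pe`, this version steps straight through the days a pattern depends on instead of scanning the whole sample, so each check costs about `n`/`k` lookups.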

Eve gets the following (`k`, `j`) (spacing, offset) pairs:

- (6, 2)
- (10, 10)
- (11, 7)
- (12, 2)
- (12, 8)
- (15, 7)
- (15, 10)
- (15, 14)

Those can't all be right. There's only one pattern to find here — even if there were multiple, it couldn't be as many as 8 — and some patterns will show up as illusions from the noise.

Whether a pattern would show up in random noise depends on how much coincidence the pattern demands.
Tighter, more frequent patterns (lower `k`) need more coincidences to show up by chance.

Recall Bayes' theorem (in odds form here): `O`(`H`|`E`) = (`P`(`E`|`H`)/`P`(`E`|¬`H`)) `O`(`H`).
Patterns (`H`) with low `k` don't show up by chance much, i.e. have low `P`(`E`|¬`H`).
For all patterns in this hypothesis-space, `P`(`E`|`H`) (seeing the pattern, given that the pattern was involved in the process underlying the sample) is approximately 1.
Eve doesn't know which pattern to favour, so the prior `O`(`H`) is about the same for each hypothesis.

Thus Eve will suspect the pattern matched to the sample with the lowest value of `k`.
Here, that's a 6-day cycle, starting with day 2.
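To put rough numbers on "more coincidences", one can count how many days each candidate pattern depends on and raise the sample's action rate to that power. This is only a sketch: it treats days as iid, uses the empirical rate of the sample (24 of 50 days) as a stand-in for the chance of an action on any one day, and the helper names are invented for illustration.

```python
N = 50
RATE = 24 / 50  # empirical action rate of the sample (assumed iid per day)

candidates = [(6, 2), (10, 10), (11, 7), (12, 2),
              (12, 8), (15, 7), (15, 10), (15, 14)]

def required_days(n, k, j):
    """Number of days j, j + k, j + 2k, ... that fall within 1..n."""
    return len(range(j, n + 1, k))

def chance_probability(n, k, j, rate):
    """Rough P(E | not H): every required day happens to be an action."""
    return rate ** required_days(n, k, j)

# Least likely to arise by chance first, i.e. most suspicious first.
ranked = sorted(candidates, key=lambda kj: chance_probability(N, kj[0], kj[1], RATE))
for k, j in ranked:
    print(k, j, required_days(N, k, j), f"{chance_probability(N, k, j, RATE):.5f}")
```

The (6, 2) pattern depends on 9 days, more than any other candidate, so it comes out as the hardest to explain away as noise.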

Alas, the true schedule was indeed such a 6-day cycle as Eve found. Clearly, you didn't add enough noise. What would be "enough" noise?

"Enough" noise is such noise as to confuse Eve (i.e. make her suspect a false pattern). Eve picks the highest-frequency schedule she sees in the sample, so you need to add enough noise such that she'll find a pattern in the sample of higher frequency than the true schedule you're hiding.

Now might be a good time to ask why you shouldn't just contrive a consistent, secondary schedule to layer on the first. That may be a good solution, but it only works if you can follow a secondary schedule well, which I'll assume you can't, as a condition of the problem. More importantly, with noise, you might get enough extra fake patterns in the records to completely confuse Eve. With two schedules, Eve sees the two patterns and just has to investigate which one is important.

What's the probability of a misleading pattern with interval `K` emerging at a given sample size `n`?

To answer that, we need the overall rate of action within the sample.
An action from the main schedule may be taken as event `A`, with probability 1/`k`.
An action from the added noise may be taken as event `B` (independent of `A`), with probability 0 < `c` < 1, the noise coefficient.

The rate of action is the probability `P`(`A` ∪ `B`) = `P`(`A`) + `P`(`B`) - `P`(`A`) `P`(`B`) = 1/`k` + `c` - `c`/`k`.

You know that the true schedule has `k` = 6.
I generated the sample earlier with `c` = 0.4.
Thus the overall rate of action would be 0.5.
(Theoretically.
The actual sample has an empirical rate of 0.48, having 24 1s.)
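A quick check of both figures in Python, as a sketch (the function name `action_rate` is made up here):

```python
SAMPLE = "01000111110001000111010011001101101011010001001001"

def action_rate(k, c):
    """P(A or B) for independent A (probability 1/k) and B (probability c)."""
    return 1 / k + c - c / k

theoretical = action_rate(6, 0.4)
empirical = SAMPLE.count("1") / len(SAMPLE)
print(round(theoretical, 3), empirical)
```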

A pattern with a given interval `K` depends on `n`/`K` specific actions, which we are modelling here as iid.
Thus, for each deceptive offset `J` (anything other than the true pair `K` = `k`, `J` = `j`), its probability is (1/`k` + `c` - `c`/`k`)^(`n`/`K`).
There are `K` opportune values of `J`, so the probability of any deceptive pattern with a given interval (but any offset) is `K` (1/`k` + `c` - `c`/`k`)^(`n`/`K`).
(This is approximate, suitable for the small probabilities we expect here.)
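As a sketch of that formula with the numbers used so far (`n` = 50, `k` = 6, `c` = 0.4), assuming the iid simplification and with hypothetical function names:

```python
def action_rate(k, c):
    """Overall chance of an action on a given day: 1/k + c - c/k."""
    return 1 / k + c - c / k

def deceptive_probability(n, k, c, K):
    """Approximate chance of any false pattern with interval K: K * rate ** (n / K)."""
    return K * action_rate(k, c) ** (n / K)

for K in range(2, 6):
    print(K, f"{deceptive_probability(50, 6, 0.4, K):.2e}")
```

Even the most favourable shorter interval, `K` = 5, comes out at 5 × 0.5^10, roughly 0.5%.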

Complication: actions in the true schedule are nonrandom in a way which may correlate with the illusory schedule under consideration, making the depended actions not really iid.
I will not address this.
You are welcome to figure it out yourself if you care enough.
I don't expect it affects results too drastically, especially if `k` is prime.

Anyway, now we need to pick a value for that probability and solve for the minimum requisite `c`.
Set `K` = `k` - 1: among intervals shorter than `k`, that corresponds to the most probable misleading patterns, as the probability grows with `K`.

Turns out, for the numbers chosen here, that `c` > 0.6 is needed for usably-probable misdirection.
That would require you to perform the action at random on more than half of all days, which is unworkable if we assume the action is costly.
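To see how the probability of misdirection moves with `c`, here is a sketch that tabulates the approximation for `n` = 50, `k` = 6, `K` = `k` - 1. What counts as "usably probable" is a judgement call, so no threshold is built in. The function name is hypothetical, and the cap at 1 is added here only because the approximation can overshoot.

```python
def misdirection_probability(n, k, c):
    """Approximate chance of a false pattern with interval K = k - 1 (capped at 1)."""
    K = k - 1
    rate = 1 / k + c - c / k
    return min(1.0, K * rate ** (n / K))

for c in (0.4, 0.5, 0.6, 0.7, 0.8):
    print(c, round(misdirection_probability(50, 6, c), 3))
```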

The numbers are much more favourable if Eve only gets a smaller sample.
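One way to see this is to invert the approximation and solve for the least `c` that reaches a chosen probability of misdirection. The sketch below assumes a target of 0.5, which is an arbitrary choice for illustration, keeps `k` = 6 and `K` = `k` - 1, and uses a hypothetical function name.

```python
def minimum_noise(n, k, target):
    """Least c for which K * rate ** (n / K) reaches target, with K = k - 1."""
    K = k - 1
    rate = (target / K) ** (K / n)        # invert K * rate ** (n / K) = target
    return (rate - 1 / k) / (1 - 1 / k)   # invert rate = 1/k + c - c/k

for n in (50, 30, 20):
    print(n, round(minimum_noise(n, 6, 0.5), 3))
```

The requisite `c` falls as `n` shrinks. For very small `n` the formula can return a value at or below 0, which would mean the regular schedule alone already produces such illusions under this approximation.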