Technical Assignment
A file contains a sequence consisting only of the symbols 0 and 1.
The sequence must be analyzed to find any patterns that allow the next symbol to be predicted better than by random guessing.
The task is not necessarily to predict every subsequent symbol. If that turns out to be impossible, it is sufficient to find recurring patterns or states of the sequence after which the probability of a particular next symbol is significantly higher than 50%.
For example, if the next symbol equals 1 in 70–80% of cases after a certain combination of symbols, that pattern is already of interest, even if it occurs infrequently.
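One possible way to search for such combinations is to count, for every context of the k preceding symbols, how often the next symbol is 0 and how often it is 1. The sketch below assumes the sequence has already been read into a Python string of '0' and '1' characters. The function names and the `min_count` and `threshold` parameters are illustrative choices, not part of the assignment.

```python
from collections import defaultdict


def context_stats(seq, k):
    """Map each length-k context to [count of next == 0, count of next == 1]."""
    counts = defaultdict(lambda: [0, 0])
    for i in range(k, len(seq)):
        context = seq[i - k:i]               # the k symbols preceding position i
        counts[context][int(seq[i])] += 1    # tally the symbol that followed
    return dict(counts)


def biased_contexts(seq, k, min_count=30, threshold=0.7):
    """List (context, occurrences, share of 1s) where one symbol dominates.

    min_count and threshold are hypothetical defaults chosen for illustration.
    """
    result = []
    for context, (zeros, ones) in context_stats(seq, k).items():
        total = zeros + ones
        if total < min_count:
            continue                         # too few observations to judge
        p_one = ones / total
        if max(p_one, 1 - p_one) >= threshold:
            result.append((context, total, p_one))
    # Most frequent contexts first
    return sorted(result, key=lambda item: -item[1])
```

When many contexts and several values of k are scanned, some will look biased purely by chance, so any candidate found this way still needs to be confirmed on data that was not used in the search.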
The main task is to find any patterns that can be used to achieve a positive expected score.
The evaluation rules are very simple:
- correct prediction — +1 point;
- incorrect prediction — −1 point.
The ultimate goal is to build a model or find a rule with a positive expected score over the long term, meaning that the number of correct predictions should exceed the number of incorrect ones.
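Under these rules, a rule that is correct with probability p earns (+1)·p + (−1)·(1 − p) = 2p − 1 points per prediction on average, which is positive exactly when p > 0.5. The sketch below shows one way the scoring could be computed. It assumes that a prediction may be skipped (represented as `None`) and that a skipped position scores 0; the assignment does not state this explicitly, and the function name is illustrative.

```python
def score(predictions, actual):
    """Return (total points, predictions made, accuracy).

    A prediction of None is treated as "no prediction" and scores 0.
    This abstention rule is an assumption, not stated in the assignment.
    """
    total = made = correct = 0
    for pred, truth in zip(predictions, actual):
        if pred is None:
            continue            # abstain: no points gained or lost
        made += 1
        if pred == truth:
            correct += 1
            total += 1          # correct prediction: +1 point
        else:
            total -= 1          # incorrect prediction: -1 point
    accuracy = correct / made if made else None
    return total, made, accuracy
```

For instance, an accuracy of 55% over 1,000 predictions corresponds to 550 − 450 = 100 points.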
The contractor is free to choose the research methods: statistical methods, searching for repeating patterns, machine learning, neural networks, or any other approach. There is no need to stay within standard methods; any idea that can help uncover patterns is welcome.
The following must be provided as the result of the work:
- a description of the found patterns (if any);
- a description of the methods used;
- results of testing the model or found patterns;
- prediction accuracy;
- the final score under the evaluation system (+1 for each correct prediction, −1 for each incorrect one);
- source code of the research.
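The testing results, accuracy, and final score listed above could be produced, for example, by fitting a rule on the first part of the sequence and scoring it on the remaining, unseen part. The sketch below is one such setup under stated assumptions: the sequence is a string of '0' and '1', the rule is a simple majority vote per k-symbol context, and the split point, `min_count`, and `threshold` are hypothetical values. The p-value is a one-sided binomial test against a fair coin.

```python
from collections import defaultdict
from math import comb


def fit_rule(train, k, min_count=30, threshold=0.6):
    """Return {context: predicted symbol} for contexts biased in the training part."""
    counts = defaultdict(lambda: [0, 0])
    for i in range(k, len(train)):
        counts[train[i - k:i]][int(train[i])] += 1
    rule = {}
    for context, (zeros, ones) in counts.items():
        total = zeros + ones
        if total >= min_count and max(zeros, ones) / total >= threshold:
            rule[context] = "1" if ones > zeros else "0"
    return rule


def evaluate(rule, test, k):
    """Apply the rule to held-out data; return (points, predictions made, correct)."""
    made = correct = 0
    for i in range(k, len(test)):
        pred = rule.get(test[i - k:i])
        if pred is None:
            continue                    # context not covered by the rule: skip
        made += 1
        correct += pred == test[i]
    # +1 per correct and -1 per incorrect prediction
    return 2 * correct - made, made, correct


def binomial_p_value(correct, made):
    """One-sided P(X >= correct) for X ~ Binomial(made, 0.5)."""
    return sum(comb(made, j) for j in range(correct, made + 1)) / 2 ** made
```

A typical use would be `rule = fit_rule(seq[:split], k)` followed by `evaluate(rule, seq[split:], k)`, where `split` is a position chosen before looking at the later data. A small p-value suggests that the held-out accuracy is unlikely to be the result of chance alone.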