The carry bit can propagate across many subsequent bit additions, so passing only a nearby context window won't always work.
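To make the carry propagation concrete, here is a minimal sketch of bit-by-bit addition in plain Python. It is not the model described in this post; the function name `add_bits`, the 8-bit `width`, and the least-significant-bit-first order are all assumptions chosen for illustration. Alongside the sum it reports the longest run of consecutive positions that produced a carry.

```python
def add_bits(a, b, width=8):
    """Add two non-negative integers one bit at a time, lowest bit first.

    Returns (result, longest_carry_run). The result is truncated to
    `width` bits, so a carry out of the top bit is dropped.
    """
    carry = 0
    result = 0
    run = 0      # length of the current streak of carries
    longest = 0  # longest streak seen so far
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total = x + y + carry          # 0, 1, 2 or 3
        result |= (total & 1) << i     # low bit is the output bit
        carry = total >> 1             # high bit carries to the next position
        run = run + 1 if carry else 0
        longest = max(longest, run)
    return result, longest


print(add_bits(127, 1))  # (128, 7): a single carry ripples through seven positions
print(add_bits(2, 1))    # (3, 0): no carry at all
```

In the first call the output at bit 7 depends on what happened at bit 0, which is why a fixed-size window over neighbouring bits cannot cover every case.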
The rules are simple enough; the difficulty is that a bit must “carry” over from the previous addition whenever there is an overflow.
For example, the training sequence for the above addition is (left to right):
We generate 1000 such random integer additions and then split them into training and test sets.
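A minimal sketch of this data-generation step might look like the following. Only the count of 1000 comes from the text; the helper name `make_dataset`, the operand range (0 to 127, so every sum fits in 8 bits), the 80/20 split, and the fixed seed are assumptions made for illustration.

```python
import random


def make_dataset(n=1000, max_operand=127, train_fraction=0.8, seed=0):
    """Build n random (a, b, a + b) examples and split them into train/test.

    max_operand, train_fraction and seed are illustrative assumptions,
    not values taken from the original experiment.
    """
    rng = random.Random(seed)  # local generator keeps the split reproducible
    examples = []
    for _ in range(n):
        a = rng.randint(0, max_operand)  # randint is inclusive on both ends
        b = rng.randint(0, max_operand)
        examples.append((a, b, a + b))
    rng.shuffle(examples)
    cut = int(n * train_fraction)
    return examples[:cut], examples[cut:]


train, test = make_dataset()
print(len(train), len(test))  # 800 200
```

Each example would still need to be encoded as a bit sequence before being fed to the model; that encoding is left out here because it depends on the sequence format used for training.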
After training, a graph execution engine is created with the highest-scoring set of parameters and executed against 8 unseen integer addition pairs.