The reading for the class on rollback, recovery, and checkpointing is "Recovery in Distributed Systems Using Optimistic Logging and Checkpointing".
You can find the paper here.