[RoBERT & ToBERT] Hierarchical Transformers for Long Document Classification