Optimized distributed large-scale analytics over decentralized data sources with imperfect communication