Microsoft’s Cosmos Service
Cosmos is “Microsoft’s internal data storage/query system for analyzing enormous amounts (as in petabytes) of data”.
No paper or technical report about Cosmos has been published yet, so I have compiled a list of the information about Cosmos that is available on the Web:
What is Microsoft’s Cosmos service? by Yaron Y. Goland.
Microsoft Cosmos: Petabytes perfectly processed perfunctorily by Seth Eliot.
Cosmos Big Data and Big Challenges by Pat Helland.
What Is COSMOS?
- Petabyte Store and Computation System
  - About 62 physical petabytes stored (~275 logical petabytes stored)
  - Tens of thousands of computers across many datacenters
- Massively parallel processing based on Dryad
  - Similar to MapReduce but can represent arbitrary DAGs of computation
  - Automatic computation placement with data
- SCOPE (Structured Computation Optimized for Parallel Execution)
  - SQL-like language with set-oriented record and column manipulation
  - Automatically compiled and optimized for execution over Dryad
- Management of hundreds of “Virtual Clusters” for computation allocation
  - Buy your machines and give them to COSMOS
  - Guaranteed that many compute resources
  - May use more when they are not in use
- Ubiquitous access to OSD’s data
  - Combining knowledge from different datasets is today’s secret sauce
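
The “Virtual Clusters” idea in the list above (each team is guaranteed as many machines as it contributed and may borrow more while others are idle) can be illustrated with a small sketch. This is not Cosmos code: the `VirtualCluster` class, the `allocate` function, the team names, and the round-robin sharing of spare capacity are all assumptions made for illustration, since the sources above do not describe the actual scheduling policy.

```python
from dataclasses import dataclass


@dataclass
class VirtualCluster:
    # Hypothetical model of one team's virtual cluster.
    name: str
    guaranteed: int  # machines the team bought and gave to the shared pool
    demand: int      # machines the team's jobs currently want


def allocate(clusters):
    """Return {name: machines granted} for one scheduling round."""
    # Step 1: every virtual cluster gets up to its guaranteed share.
    grants = {vc.name: min(vc.demand, vc.guaranteed) for vc in clusters}

    # Step 2: guaranteed machines that owners are not using form a spare pool.
    spare = sum(vc.guaranteed for vc in clusters) - sum(grants.values())

    # Step 3: lend spare machines, one at a time, to clusters that want more.
    # Round-robin is an arbitrary choice here, not a documented Cosmos policy.
    hungry = [vc for vc in clusters if vc.demand > grants[vc.name]]
    while spare > 0 and hungry:
        for vc in list(hungry):
            if spare == 0:
                break
            grants[vc.name] += 1
            spare -= 1
            if grants[vc.name] == vc.demand:
                hungry.remove(vc)
    return grants


clusters = [
    VirtualCluster("search", guaranteed=100, demand=40),
    VirtualCluster("ads", guaranteed=50, demand=90),
    VirtualCluster("maps", guaranteed=30, demand=30),
]
print(allocate(clusters))  # {'search': 40, 'ads': 90, 'maps': 30}
```

In this example the hypothetical “ads” cluster exceeds its guarantee of 50 machines only because “search” leaves 60 of its own idle. A real scheduler would also have to reclaim borrowed machines when their owner needs them back, which this sketch does not model.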