Gilberto Camara, INPE, Brazilian Ministry of Science
“The database industry has had to rethink its processes to get around the limited movement of two dimensional structures of row and tables stored in a single location.” By Telefonica
As our CTO and co-founder, Mike Stonebraker, likes to say, one size database does not fit all. SciDB was designed for managing and analyzing machine-generated data; that’s its sweet spot.
Of course, the proof is in the real-world use of SciDB. One example: Gilberto Camara, quoted above, is a scientist with INPE, the National Institute for Space Research in Brazil. INPE’s mission is to serve as a national and international resource for better understanding space and earth environments. A key part of that mission is to monitor and report on climate change in the Amazon, using satellite data from NASA and others combined with land-based sensor data. Combining huge volumes of data, all in different formats, is a function SciDB excels at.
Machine-generated data like INPE’s does not fit neatly or efficiently into tables, the data model used in relational databases. Unlike a SQL DBMS, SciDB has a native multi-dimensional array data model designed specifically to manage and analyze high-dimensional, multifaceted data. SciDB is designed to handle both dense and sparse arrays efficiently, providing dramatic storage efficiencies as the number of dimensions and attributes grows. Math operations run directly on the native data format. Partitioning data along each dimension of an array facilitates fast joins and access along any dimension, thereby speeding up clustering, array operations and population selection.
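To make the idea of a partitioned, sparse multi-dimensional array more concrete, here is a minimal Python sketch. It is an illustration only, not SciDB's implementation or query language: the class name ChunkedSparseArray, the chunk size, the (time, lat, lon) dimensions and the sample readings are all assumptions made up for this example. It shows two of the ideas described above: only non-empty cells are stored, and a selection along any dimension needs to visit only the partitions that can contain matching cells.

```python
from collections import defaultdict

# Hypothetical chunk size along each dimension (time, lat, lon).
CHUNK_SHAPE = (2, 2, 2)


def chunk_of(coord, chunk_shape=CHUNK_SHAPE):
    """Return the coordinate of the chunk that contains a cell coordinate."""
    return tuple(c // s for c, s in zip(coord, chunk_shape))


class ChunkedSparseArray:
    """Toy sparse array: only non-empty cells are stored, grouped by chunk."""

    def __init__(self, chunk_shape=CHUNK_SHAPE):
        self.chunk_shape = chunk_shape
        # chunk coordinate -> {cell coordinate: value}
        self.chunks = defaultdict(dict)

    def set(self, coord, value):
        self.chunks[chunk_of(coord, self.chunk_shape)][coord] = value

    def slice_along(self, dim, index):
        """Return cells whose coordinate on `dim` equals `index`.

        Chunks that cannot contain such cells are skipped without
        looking at the cells inside them.
        """
        wanted_chunk = index // self.chunk_shape[dim]
        result = {}
        for chunk_coord, cells in self.chunks.items():
            if chunk_coord[dim] != wanted_chunk:
                continue
            for coord, value in cells.items():
                if coord[dim] == index:
                    result[coord] = value
        return result

    def mean(self):
        """Average over the stored (non-empty) cells only."""
        values = [v for cells in self.chunks.values() for v in cells.values()]
        return sum(values) / len(values)


# Made-up sample readings indexed by (time, lat, lon).
arr = ChunkedSparseArray()
arr.set((0, 0, 0), 21.5)
arr.set((0, 3, 1), 23.0)
arr.set((5, 3, 1), 19.0)

first_time_step = arr.slice_along(0, 0)  # select along the time dimension
one_latitude = arr.slice_along(1, 3)     # select along the latitude dimension
```

The same slice_along call works for any dimension, which is the property a table keyed on a single column does not give you for free. A real array database adds much more on top of this, such as distributing chunks across machines and storing dense chunks compactly.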
As our chief scientist, Paul Brown, says, “An Earth scientist shouldn’t need to spend time learning about four different file formats, multiple interfaces, and three new programming languages before he can even begin his real work.”
For more on how SciDB is helping INPE and others, like NASA, manage and analyze machine-generated data, click here.