Massively Parallel Processing of Big Data - CEO Interview with ParStream
In the world of data, there has long been a limit on which types of analytics can be generated in real time. Stream analytics on in-memory data is available, but it has not been possible to pair real-time results with massive volumes of big data. One company is set to try to change that.
ParStream’s vision is to revolutionize the database market to enable real-time big data analytics applications. To achieve this, the company is performing fundamental research and driving innovation in database technology. The results allow users to perform real-time analytics on big data at a significantly lower cost.
In keyword monitoring, for example, ParStream is the analytics platform for Searchmetrics, a search and social analytics software company. The company monitors over 75 million domains and 100 million keywords on the world's biggest search engines.
Searchmetrics customers use the service to monitor competing domains and to optimize their keywords to drive traffic. This translates into typical imports of over seven terabytes of data and queries against more than ten billion data records. By switching to ParStream, Searchmetrics greatly reduced its infrastructure requirements and achieved faster import and query execution times.
Sam/Andre: What is your secret sauce? And would you consider ParStream to be a player in the NewSQL database category, since it ships with an SQL interface?
Michael: Yes, we are in the NewSQL database market, even though the NewSQL market is not as clearly defined as other markets. Even NoSQL is not clearly defined, and there are many definitions out there. But it is fair to say we are in the NewSQL database market, or the analytics platform category.
You asked what our secret sauce is. It’s the HPCI, or High Performance Compression Index. At ParStream, we have found a way to represent the data in an index which allows us to find the relevant records much faster by looking at much less data. We not only work with numbers or low cardinalities; we can also index strings, dates, floating point numbers and of course integers as well. The efficiency in our software comes from not having to crunch so much data. The other major impact is the significant reduction in the number of servers required to run ParStream.
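To make the idea of answering a query from an index rather than from the raw records more concrete, here is a minimal Python sketch of a plain bitmap index. This is a generic textbook technique chosen for illustration, not ParStream's HPCI, whose internals the interview does not describe. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when values[i] == value."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def matching_rows(bitmap):
    """Decode a bitmap back into an ascending list of row numbers."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical sample table with two columns, stored column-wise.
engines = ["google", "bing", "google", "yahoo", "google"]
countries = ["de", "de", "us", "us", "de"]

engine_index = build_bitmap_index(engines)
country_index = build_bitmap_index(countries)

# WHERE engine = 'google' AND country = 'de' becomes a bitwise AND of two
# bitmaps; the raw column values are not scanned again at query time.
hits = engine_index["google"] & country_index["de"]
result = matching_rows(hits)
print(result)
```

The point of the sketch is only that a filter can be evaluated on compact index structures instead of on every record. A production index would additionally compress the bitmaps and handle high-cardinality columns, which this example does not attempt.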