An introduction to and detailed explanation of support vector machines (SVM), an ML algorithm used for classification, regression, and outlier detection.
In December, the 2020 season of Major League Soccer ended after a tough year. Although I have a distant relationship with the sport nowadays, the subject comes naturally to me as a Brazilian. Since returning to live in the USA, I have become interested in soccer's growth in the country, seeing great potential in it as a sport and as a business. Reading about the finals, I realized that I knew little about the players and decided to explore a little, to learn more about the league, its teams, fans, and players.
Dremio, a startup offering tools to help streamline and curate data, today announced that it raised $135 million in series D funding at a post-money valuation of $1 billion. The company says it’ll use the funds, which come nine months after a $70 million round, to invest in cloud data lake technologies that could benefit businesses looking to connect, analyze, and process data while accelerating database queries. Specifically, Dremio plans to expand its engineering centers of excellence and grow its customer-facing organizations to keep pace with new customer acquisitions.
Due to its scalability, low cost, and simplicity of management, cloud data lake storage has become the destination of choice for storing high volumes of data. According to a recent Allied Market Research report, the global data warehousing market was valued at $18.61 billion in 2017 and is projected to grow at a compound annual growth rate of 8.2% from 2018 to 2025. However, to analyze that data, it has to be moved and copied into proprietary data warehouses, a process that can become costly, complex, and inflexible.
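To make the growth figure concrete, here is a minimal sketch of the compound-growth arithmetic behind it. It assumes the 8.2% rate compounds once per year for eight years (2018 through 2025) on top of the 2017 base; the report may use a different convention, and the function name is illustrative only.

```python
def project_market_size(base_value: float, annual_rate: float, years: int) -> float:
    """Compound base_value forward by annual_rate for the given number of years."""
    return base_value * (1 + annual_rate) ** years


# Figures quoted from the report: $18.61 billion in 2017, 8.2% CAGR for 2018-2025.
# Assumption: growth compounds annually for eight years (2018 through 2025).
BASE_2017_BILLIONS = 18.61
CAGR = 0.082
YEARS = 8

projected_2025 = project_market_size(BASE_2017_BILLIONS, CAGR, YEARS)
print(f"Implied 2025 market size: ${projected_2025:.2f} billion")
```

Under these assumptions, the quoted figures imply a market of about $35 billion by 2025.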
MapR veterans Jacques Nadeau and Tomer Shiran founded Santa Clara, California-based Dremio in 2015 to solve this challenge. CEO Billy Bosworth tells VentureBeat that Shiran, a former product manager at Microsoft who has held engineering and research roles at IBM and HP, saw the rise of public clouds like Amazon Web Services, Microsoft Azure, and Google Cloud Platform as an opportunity to reinvent big data technology and develop a cloud data lake engine, enabling companies with large storage volumes to rapidly analyze their data.
“Dremio customers are running millions of queries per day for high concurrency BI with tools like Tableau and Power BI, ad-hoc data processing, and mission-critical dashboards. This is made possible by fundamentally simplifying the workflow for data engineers who are already centralizing data from many sources into cloud stores like AWS S3 and Microsoft ADLS,” Bosworth said in an email interview with VentureBeat. “With Dremio, that data does not need to be further moved or copied into data warehouses for analytics; instead, the full data set is available directly in native cloud storage.”
Dremio offers a virtualization toolkit that bridges the gaps among relational databases, Hadoop, NoSQL, ElasticSearch, and other data stores. It presents itself to business intelligence software as if it were a primary data source that can be queried via SQL. (SQL is a domain-specific language designed for querying and managing data held in a relational database management system.) The startup’s eponymous platform maintains a catalog of sources, physical and virtual datasets, and datasets’ lineage, making it easier to search for datasets and see how data is being transformed.
Above: A few of the data sources Dremio’s platform supports.
Image Credit: Dremio
Dremio is available in an open source Community edition as well as a commercial Enterprise edition. It runs in the cloud via Kubernetes or in a Hadoop cluster, and subscription pricing scales based on the number of nodes to which Dremio is deployed.
Dremio’s native join capabilities let data lakes draw on other stores, including Oracle, SQL Server, and PostgreSQL databases. Dremio also automatically detects schemas and supports cloud data lakes in Amazon S3 and other cloud storage providers, leveraging the Apache Arrow columnar in-memory format to speed up performance by 1,000 times, the company claims.
Thanks to features like automatic failover, Dremio can automatically select new nodes in the event of node and instance cluster failures. The platform’s dynamic access, moreover, delivers programmatic security controls through integration with Kerberos, LDAP, and other centralized providers.
On the AI side of the equation, Dremio taps machine learning to recommend datasets to users and adapt catalogs in response to changes in schema and execution. It also algorithmically caches and indexes metadata as needed, in real time.
Asked whether the pandemic has affected business, Bosworth said it hadn’t, pointing to Dremio’s 60% growth in headcount since March. Other than a delayed sales cycle when the startup’s customers transitioned to working from home, Dremio weathered the storm well, growing its customer base to 100 companies — a majority of which are from the Forbes Global 2000 — with over 75,000 users.
“Data analytics has always been important to our customers. This year, it has become more imperative than ever as we navigate this pandemic,” Bosworth said. “Dremio was already a distributed company, so we did not experience any loss of productivity.”
Dremio’s series D round announced today was led by Sapphire Ventures and included participation from existing Dremio investors Insight Partners, Lightspeed Ventures, Norwest Venture Partners, Redpoint Ventures, and Cisco Investments. As of today, the company has about 160 employees — a number it expects will double by the end of 2021 — and has raised $247 million in venture capital.
The internet of medical things (IoMT) combines medical devices and internet technology to provide up-to-date, remote health care services. Learn more about IoMT and how it is revolutionizing the health care sector in this article.
An explanation of splitting data into training and testing sets and the danger of overfitting.
This isn’t your typical recruiting story. I wasn’t actively looking for a new job and Netflix was the only place I applied. I didn’t know anyone who worked there and just submitted my resume through the Jobs page 🤷🏼‍♀️. I wasn’t even entirely sure what the right role fit would be and originally applied for a different position, before being redirected to the Analytics Engineer role. So if you find yourself in a similar situation, don’t be discouraged!
Relational database startup SingleStore (previously MemSQL) closed an $80 million funding round today, bringing its total raised to $238 million. The San Francisco-based company plans to use the funds to increase its market presence; expand its engineering team in Portugal, Ukraine, and the U.S.; and grow its customer base internationally.
While it is important to measure and track the actual CLTV of the existing customer base, a company also needs to be able to estimate the CLTV for both existing and prospective customers over an extended period. There are a few things you will need to consider when predicting CLTV.
ActivTrak, an Austin, Texas-based cloud productivity monitoring software provider, today raised $50 million in a series B round from Sapphire Ventures. The fresh capital will be used to scale ActivTrak’s go-to-market activities across sales, marketing, and channels and expand the company’s capabilities using AI-driven analytics, according to CEO Rita Selvaggi.
Seattle-based Algorithmia has announced Insights, a solution for monitoring the performance of machine learning models. Algorithmia specialises in artificial intelligence operations and management. The company is backed by Google LLC and focuses on simplifying AI projects for enterprises just getting started. Diego Oppenheimer, CEO of Algorithmia, says: “Organisations have specific needs when it comes to…
For an aspiring data scientist, building interesting portfolio projects is key to showcasing your skills. When I learned coding and data science as a business student through online courses, I disliked that the datasets either consisted of fake data or had been solved many times before, like Boston House Prices or the Titanic dataset on Kaggle.
Conducting a data science or analytics project always takes time and has never been easy. A successful, comprehensive analytics project goes well beyond coding: it involves careful planning and a large amount of communication.
As we can see, a number of variables differ significantly between the Churn and Non-Churn groups, so this dataset likely holds a good deal of useful intelligence.
There’s no question most CIOs and analysts, whether focused on marketing or operations, rely heavily on data. What might be unclear is specifically how that data is being retrieved, analyzed and ultimately leveraged for better decision making.
Until 2015, even professional programmers didn’t think machine learning had real potential and benefits. However, with innovation in AI and the build-up of computing capabilities, autonomous MLOps platforms began to develop rapidly and became an integral part of computer systems development.
With the seemingly successful application of deep learning, experts are opining, with conviction, that the AI winter has finally come to an end.
It’s not a matter of choosing ETL vs. ELT, as a combination of both approaches may be right depending on a company’s needs.
The balance between ignorance and confidence. “All you need in this life is ignorance and confidence, and then success is sure.” ~ Mark Twain
Businesses today can benefit in real time from the data they continuously generate at massive scale and speed from various data sources.