For the past several years, the focus of 'Big Data' technology has revolved around using Hadoop-related technologies to achieve good performance at an affordable cost. For example, the 'Data Lake' concept emerged as a way to store, discover, query, and analyze large amounts of 'raw' data on commodity hardware. However, as large corporations adopt them, the limitations of naive Data Lake implementations are becoming increasingly evident, for a number of reasons.