Category Archives: Data Warehousing

Top News – Data, Data Warehousing, Analytics & BI

November 18th, 2013
Stephen McDaniel
Chief Data Officer Advisor at Freakalytics, LLC

Finding it hard to make time to keep up with the rapidly changing world of data, data warehousing, analytics, data science, business intelligence and visual analytics? We understand! Here’s our curated summary of relevant news that could help with your future data and analytic projects. We also add commentary on the topic, a summary of the article and the link to read the full article.

There are four articles in this update:
     Amazon wades into big data streams with Kinesis
     Top 10 Trends in Text Analytics
     Effective Customer Analytics Call for Data Integration, Culture Shifts
     Your Car Is a Data Platform, What Can It Tell About You?

Missed our last issue of Top News, November 15th? Stories included RapidMiner (free and premium data mining), big data not being a top CFO priority, the DATA Act clearing a Senate committee, SAS replacing PowerPoint and big data sources to consider at your company.

Amazon wades into big data streams with Kinesis

Amazon adds another layer to its data storage and streaming options: Kinesis. Kinesis is all about real-time data collection and aggregation in a hosted cloud, scalable from megabytes to terabytes per hour! It is a service that keeps your data for a maximum of 24 hours, by which time you have presumably used it or stored it in a data warehouse (like Amazon Redshift), a Hadoop system (like Amazon Elastic MapReduce), a NoSQL system (like Amazon DynamoDB) or a file store (like Amazon S3). Do you notice a trend here?
Continue reading

Top News – Data, Data Warehousing, Analytics & BI

November 15th, 2013
Stephen McDaniel
Chief Data Officer Advisor at Freakalytics, LLC

Finding it hard to make time to keep up with the rapidly changing world of data, data warehousing, analytics, data science, business intelligence and visual analytics?  We understand! Here’s our curated summary of relevant news that could help with your future data and analytic projects. We also add commentary on the topic, a summary of the article and the link to read the full article.

There are five articles in this update:
     Rapid-I data mining now RapidMiner, the Red Hat of data mining?
     Integrating data and mobile trumps big data for many CFOs
     Bipartisan DATA Act unanimously approved by Senate Committee
     Can SAS Visual Analytics replace PowerPoint?
     Big data sources to consider for your company

Missed our last issue of Top News, November 11th? Stories included Big Data and Society, Data Mining Blues, 2014 INFORMS Conference, Facebook's Free Big Data System for Analysts, Adaptive Data Preparation, How Trust Affects the Use of Analytics and Meeting a VAST Challenge.

Rapid-I data mining now RapidMiner, the Red Hat of data mining?

A German predictive analytics, data mining and text mining company receives $5M in funding and announces a planned move of its headquarters from Dortmund, Germany to Boston. I would liken it to the Red Hat of data mining, with a free community edition and paid corporate editions that add support, more data sources and more capabilities. With over 3 million downloads, 20,000 deployments and 400 paid customers including eBay, Intel, PepsiCo and Kraft, you may want to consider RapidMiner for your advanced analytics projects. The 2013 KDnuggets poll showed RapidMiner’s free edition ahead of every other advanced analytics choice, including R.
Continue reading

Top News – Data, Data Warehousing, Analytics & BI

November 11th, 2013
Stephen McDaniel
Chief Data Officer Advisor at Freakalytics, LLC

Finding it hard to make time to keep up with the rapidly changing world of data, data warehousing, analytics, data science, business intelligence and visual analytics? We understand! Here's our curated summary of relevant news that could help with your future data and analytic projects. We also add commentary on the topic, a summary of the article and the link to read the full article.

There are seven articles in this update:
     How Big Data Is Changing Science (and Society)
     Big data blues: The dangers of data mining
     2014 INFORMS Conference on the Business of Big Data
     Facebook System for Massive Big Data (Hadoop FS) Offered Free to World
     Paxata Launches Industry’s First Adaptive Data Preparation Platform
     C-Suite and Trust Both Affect Financial Returns on Analytics, Big Data
     Meeting a VAST challenge – Lincoln Laboratory staff create winning visualization

How Big Data Is Changing Science (and Society)

Traditional statistical approaches that long dominated scientific research are being challenged and augmented by new approaches from the fields of big data and data science.

HOW CAN YOU PREDICT something without understanding it? Simple: Continue reading

Data scientists & the data warehouse team: building success

Whether you are a CIO, data architect, or a data management professional, it is imperative to understand the different approaches, attitudes and needs of the next generation of data warehouse consumers. Traditional data warehouse users include reporting teams, BI teams (who create reports for the rest of the company), statisticians and others. In the past few years, this has been changing rapidly with the new role of the data scientist, the rise of Data Enthusiasts and the burgeoning population of Accidental Analysts. In Part 1 of this series, we focus on successful collaboration between data scientists and data warehouse teams.

Data scientists have been with us for many years. However, the moniker “data scientist” is recent. The same role existed (and still exists) under titles such as statistician, mathematician, computer scientist or systems analyst. Holding one of these titles doesn’t necessarily imply that one is a data scientist, although both groups may use a wide range of the same techniques.

Traditional training for people now in data science focused on Continue reading

A Share the Data client project plan

Here's an actual Share the Data project. This is a simplification of a plan developed with a real client, with details changed to remove any identifying information. While every project is unique, this is not an unusual engagement. Share the Data engagements range in length from 3 days to several weeks, depending on project complexity and client resource availability.

Background 

The client is approaching a key turning point in its operational history: moving beyond traditional reporting with mainframe technologies to an advanced data warehouse with web-based reporting, dashboards, self-service analytics and even advanced analytics to optimize customer recruitment, streamline operational management and improve the overall customer experience.

Each of these areas is a dynamic, iterative undertaking, and all of them are interlinked and interdependent. It is critical that the client's management team work to prioritize, fund and support each iterative phase of these projects. Likewise, it is imperative that each phase show an acceptable rate of return (or greater!) for the time and money invested in these initiatives. Better data and analytics are not an end in themselves, but rather a means to a more profitable and competitive company.

Objective
Continue reading

Data warehousing success: 100% Vision / 80% Tactical / 20% of the Work

My Background in Data Warehousing

I have been involved in the creation, development, and maintenance of seven data warehouses through the years, one of them before I even knew the term "data warehousing"! I have built them with SAS and Oracle, SAS alone, Informatica and Oracle, Oracle alone, and SQL Server.

I have also visited many companies as an adviser, consultant and user of their data warehouse. In these many visits, I have seen some successes and many failures. Often, the failures could have been prevented with some key guiding principles.

Data Warehousing or Enterprise Data Integration?

Data warehousing is now known by a new buzzword, Enterprise Data Integration (EDI). In fact, SAS recently renamed SAS ETL Studio as SAS Data Integration Studio. They also added some new EDI features, including continual data acquisition, so that near-real-time data feeds are available in the data warehouse. Another great part of SAS EDI is SAS Data Quality. Data quality should be a consideration throughout the entire process, but I won't comment on it directly in this post. Since most people still use the term data warehousing, I will keep the popular terminology rather than the one favored by analysts and even SAS.

What Does it Take to Build a Successful Data Warehouse?
Continue reading