Collecting Big Data is easy (well...). Now finding the patterns in it and translating them into real insight is easy, too. If troves of data and disparate sources were once enough to scare you, we will present easy-to-implement case studies of diverse statistical analytics and machine learning algorithms that allow practitioners to draw actionable inferences from their data, painlessly.
The open-source R and Python communities have developed powerful statistical libraries that are unprecedented in their reliability, transparency, and potential, and now in their accessibility and ease of use. This presentation will cover the following big-data and statistical learning capabilities in R and Python:
- visualizations of summary statistics,
- classification and regression problems,
- dimensionality reduction,
- how to implement the above in Hadoop/MapReduce ecosystems and with NoSQL capabilities in our case studies.
Attendees will leave with an understanding of a breadth of statistical learning capabilities and with a general impression of the ease with which these techniques can be applied to diverse domains and data types, across varied data disciplines.
Minnesota boy at home in the mountains. When he is not teaching his 5-year-old how to ski powder, Charlie is using mathematics to tell computers how to discover patterns in data. He believes anyone can do machine learning and that by sharing information on computer science, we are all better off. He thinks that if you only give him the chance, he can teach you any statistical concept, and that you'll walk away actually thinking positively about math.