Submitted by thedanindanger on Thu, 05/31/2018 - 13:32
As I discussed in a previous article, Data Science is in desperate need of DevOps. Fortunately, DevOps patterns that support Data Science development are finally emerging, and Databricks itself provides many of them.
Two concepts keep popping up in these DevOps patterns: “Continuous Integration / Continuous Deployment” and “Test Driven Development” (which is moving toward “Behavior Driven Development”, though that’s not a widely used term).
Submitted by thedanindanger on Sun, 02/18/2018 - 18:35
Over the past month I’ve taken two edX courses to brush up on Enterprise Data Integration Architecture: one on Active Directory Identity Management in Azure and the other on deploying data application interface services with C#.
What does that have to do with Data Science? Everything. At least that is my strong suspicion. This article explores industry trends towards “Enterprise” data science and how we can build our architectures to support very rapidly evolving Data Science Solutions.
Submitted by thedanindanger on Sun, 09/10/2017 - 19:39
Frequently Asked Questions
Sometimes when I get asked a question I send an extremely well thought out response, which may or may not be appreciated by that person, but I feel like there might be people who would appreciate it - maybe...
Regardless of my delusions of grandeur, I do want to start documenting these responses, both to feed my massive ego and because I am hopelessly lazy. Since the one I get asked the most is "How do I get started in Data Science?", we might as well start there. So here's my Fall 2017 go-to answer on how to get started quickly in Data Science.
Submitted by thedanindanger on Sun, 09/10/2017 - 17:47
People say data science is difficult, which it is, but even harder is explaining it to other people!
Data Science itself is to blame for this, mostly because we don’t have a concrete definition of it either, which has created a few problems. There are companies promoting ‘Data Science’ tools as ways to enable all your analysts to become “Data Scientists”. The job market is full of people who took a course on Python calling themselves “Data Scientists”. And businesses are so focused on reporting that they think all Analytics, Data Science included, is just getting data faster and prettier.
But the tools we use are just that: tools. The code we use requires specialized knowledge to apply effectively. The data pipelines we create exist to monitor the success and failure of our models; it’s an added bonus that they help with reporting. To mitigate these challenges we have to come up with some clever metaphors. Let's explore them a little more deeply.
Submitted by thedanindanger on Wed, 08/03/2016 - 16:10
A quick demo of Power BI using movie actors and actresses:
Submitted by thedanindanger on Tue, 11/24/2015 - 21:04
Submitted by thedanindanger on Tue, 11/24/2015 - 20:38
In a previous post, I showed you how to embed interactive Excel tables. I will not link to it here because, unfortunately, that feature has now been deprecated. And that was a cool feature. Why must you ruin everything I love, Microsoft!? :)
Submitted by thedanindanger on Sun, 05/24/2015 - 17:21
For the past few weeks I have been working on the Coursera course "Building Data Projects." They introduce some great tools for building a data app, one of which I have been meaning to try for quite some time - Shiny Apps.
Shiny is an application framework that allows the creation of sweet data apps using only some "basic" R scripting and a little understanding of how a user would need to input data. Here's my current working POC for a business intelligence project calculator. Hopefully I'll be able to add some information later about how I made it.
Submitted by thedanindanger on Tue, 12/23/2014 - 17:19
A fun viz using custom shapes in a scatterplot. Happy Holidays!!!
Submitted by thedanindanger on Sat, 11/22/2014 - 20:02
Lately, I've been building the Dallas Office for Syntelli Solutions. One of our core service offerings is integration: