Our podcast includes both technical & non-technical discussions on BigData, DataScience, BI, AI, DW, Business Intelligence, TDWI, SqlServer, SQL, NoSql, AWS, Azure, R, Python.
Hosts: Rajib Bahar, Shabnam Khan
Joe Sack is a Principal Program Manager in the Azure SQL Database and SQL Server product team at Microsoft, with a focus on the Query Processor. Joe is an author and speaker with over 20 years of experience in the industry, specializing in performance tuning, high availability and disaster recovery. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - Your team created Adaptive Query Processing, or Adaptive QP. It is new in SQL Server 2017 and Azure SQL Database. As we know, SQL Server uses a query plan internally to run T-SQL statements. Sometimes the plan chosen by the query optimizer is not optimal, for reasons such as incorrect cardinality estimates and various other issues. What are some other pain points Adaptive QP is meant to cure? SB - Adaptive QP's strength lies in batch mode memory grant feedback, batch mode adaptive joins, and interleaved execution... How do they work internally? RB - What are the steps to enable Adaptive QP, and what are some best practices? Can you tell us what's in the pipeline for upcoming enhancements? SB - How do we connect with you on social media? Music: www.freesfx.co.uk
Bill Inmon – the “father of data warehouse” – has written 57 books published in nine languages. Bill’s latest adventure is the building of technology known as textual disambiguation – technology that reads raw text in a narrative format and allows the text to be placed in a conventional database so that it can be analyzed by standard analytical technology, thereby creating unique business value for Big Data/unstructured data. Bill was named by ComputerWorld as one of the ten most influential people in the history of the computer profession. Bill lives in Castle Rock, Colorado. For more information about textual disambiguation refer to www.forestrimtech.com. Interviewer: Rajib Bahar, Shabnam Khan Agenda: SB - In the 1970s, you coined the term "data warehouse". Countless data gurus refer to you as the father of data warehousing. We are curious: how did your journey start? What did you envision a data warehouse to be back then, and what do you envision now? RB - Who were the earliest adopters? What were some interesting discoveries back then? How has the industry evolved? SB - In the current state of the data industry, do you think data warehousing is still relevant in this hyped-up age of Big Data and Data Science? Do these technologies simply complement existing data practices? What are your thoughts on it? RB - One of your projects in the data space is called Textual ETL... What is it about? Is it a theoretical concept? Are there any tools in the industry that meet the standard? SB - Your recent publications are on Taxonomies and Textual Analytics... Our knowledge of them is quite limited. Please enlighten us about the use cases for which they are relevant. RB - How do we connect with you on social media such as Twitter or your blog? Music: http://www.freesfx.co.uk
Varun Bhartia is the cofounder of BeeHyve.io, an online learning platform for computer science students around the world that helps students connect with each other and with the best career opportunities. He has spent his entire career working in technology, at NASA, Microsoft, Facebook, and Uber. He has an undergraduate degree from the University of Arizona and an MBA from Harvard. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - You have served some of the most interesting and awesome organizations... As a product manager, what is the most valuable lesson you have learned? SB - Recently, you left Uber to launch a startup venture called BeeHyve so that college students can use it as a portal to find exams and homework from prior years... I can see why students would love it... If they can predict the questions on the next test, it'll definitely add value to their academic careers. On the other hand, don't you think professors would hate this idea if they had to do additional homework to come up with a unique test each year? Won't that add risk to your venture? How did you go about critically analyzing all the risks and benefits? What's your vision behind it? RB - I understand you are working on a Data Vertical. How do you plan to achieve it? Is Cloud computing or Big Data involved in any way? SB - Lately, IoT has been getting the same kind of positive attention that Data Science, Cloud Computing, and Big Data are receiving. Weren't you involved in an IoT competition in Minnesota? Can you tell us your thoughts on it? RB - How do we connect with you on Twitter, LinkedIn, or your blog? Music: www.freesfx.co.uk
Michael Ludwig is a Data Solution Architect at Microsoft, where he works on Machine Learning, Big Data and Blockchain applications on the Azure platform. Prior to joining Microsoft, Michael worked at Silver Bay, designing and optimizing geographical and financial statistical analysis solutions (mostly regression analysis and clustering). Before that, he was a database architect and then the lead systems architect of a multi-tenant cloud-based Internet-of-Things application for LogicPD, in Minneapolis. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - What is the purpose of a graph database? Why do we use it? SB - Does Google Maps use graph databases in its application? RB - What are the major graph systems out there? How does Apache Gremlin fit into that? SB - SQL Server 2017 has support for graph tables. How do you implement them? RB - How is this similar or dissimilar to graph computing solutions implemented in vendor-agnostic tools such as Apache TinkerPop? SB - Who can be involved in the TinkerPop coding community? RB - How do we connect with you professionally? Music: www.freesfx.co.uk
Frank La Vigne leads the Data & Analytics practice at Wintellect and co-hosts the DataDriven podcast. He blogs regularly at FranksWorld.com and you can watch him on his YouTube channel, “Frank’s World TV” (FranksWorld.TV). Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - You have recently gone through Microsoft's Professional Certification for Data Scientists. Also, you are training others in this area. What are the 4 units of this Data Science certification program, and where do its modules overlap with Microsoft's Big Data certification program? SB - Can you tell us a little bit about the Cortana Intelligence Capstone project in the Data Science certification? What sort of time commitment and technical knowledge are required? RB - We often see questionable studies stating something like coffee is unhealthy, followed by a counter-study contradicting it. Do statistics or overfitting a data model play a role in that? SB - One of the cool things you do is co-hosting the "Data Driven" podcast with Andy Leonard. He was our guest in the past. On your Facebook page for the "Data Driven" podcast, your listeners also get to become your viewers and see live videos from Data Science, SQL Server, & other technology-related conferences. What are some insights from recent big conferences? RB - How do we connect with you on Twitter, blogs, or social media in general? Music: www.freesfx.co.uk
Curtis Seare is a co-host of the Data Crunch podcast, a Tableau and Trifacta instructor, and the Director of Analytics at Shelfbucks, a retail analytics startup in Austin, Texas. He’s worked for almost a decade in the data-science field across multiple companies and industries. He’s solved problems spanning IoT, retail, marketing, sales, competitive intelligence, nonprofit donations, and product development, among others. Bringing organizational change and innovation in analytical processes has been the center of his work. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - Please give us a little background on the Data Crunch podcast's history. SB - We have listened to your Data Crunch episodes highlighting some really interesting applications of analytics, such as preventing honey bee fallout and eradicating malaria in Zambia. Please enlighten us more on what you have discovered in your research. RB - What are some top applications of IoT that retailers find useful? SB - One of the buzzwords associated with IoT is streaming analytics. How is this different from the standard analytics that we know or understand? RB - In our lifetime, we may find ourselves in a situation where we over-analyze a problem, leading to analysis paralysis. Is there a methodology you follow to keep solutions simple in complex analytics projects? SB - How do we connect with you on Twitter, social media, or your blog? Music: www.freesfx.co.uk
Libby Duane is the Chief Customer Officer and a founding partner of Alteryx. In this role, Libby is responsible for overseeing and maximizing the complete Alteryx customer experience, from engagement to on-boarding, communications, performance, and retention. She has interacted with nearly every Alteryx customer, giving her a holistic perspective of the overall experience from implementation to adoption success. Nick Jewell is a Technology Evangelist for Alteryx. He started his career with a PhD in Data Science before Data Science was a sexy term! His background is in Chemical Information Science, where he got to work on some exciting data projects around drug design. He made the jump into ‘big finance’ and spent over a decade learning and developing BI, Big Data and Analytics solutions before the perfect opportunity presented itself to join Alteryx as part of their solutions team. Interviewer: Rajib Bahar, Shabnam Khan Agenda: SB - Alteryx is a platform for Self-Service Data Analytics. What are the mission and vision behind it? RB - In the Gartner 2017 Magic Quadrant for Data Science Platforms, Alteryx was positioned as a Challenger. Also, it's at the top of the niche players in Business Intelligence and analytics platforms. How did your organization achieve that? SB - There are studies cited on Forbes stating that the majority of a Data Scientist's time is spent on data preparation. What advantage does Alteryx give in that regard? RB - What kinds of Machine Learning or Deep Learning algorithms can Alteryx implement? Please name a few of them. Is it possible to customize them to fit a specific scenario? SB - Does the Alteryx Designer workflow output directly to dashboard applications such as Tableau and Power BI? RB - There was a major conference, #Alteryx17, recently. We would like to hear some inside scoop from there. How is it organized and what kind of learning opportunities are available? What kind of audience does it cater to: techies, business experts, or both? SB - What kind of Alteryx learning opportunities are there in Minnesota? Is there a user group? RB - How can we use your site to learn about Alteryx? Music: www.freesfx.co.uk
"Guy In a Cube" is a youtube channel for Power BI, which is a data visualization application. Currently, Patrick Leblanc and Adam Saxton are producing contents there. Adam Saxton is just a guy in a cube doing the work! He is on the Power BI team, at Microsoft, working on documentation for Power BI and Reporting Services. He is based in Texas, and started with Microsoft supporting SQL Server connectivity and Reporting Services in 2005. Adam has worked with Power BI since the beginning, on the support side, and now helps to produce content for these products. In addition to documentation, he produces weekly videos for the "Guy in a Cube" YouTube channel. Patrick LeBlanc is currently a Data Platform Solutions Architect at Microsoft and a contributing partner to Guy in a Cube. Along with his 15+ years’ experience in IT, he holds a Masters of Science degree from Louisiana State University. He is the author and co-author of five SQL Server books. Prior to joining Microsoft, he was awarded the Microsoft MVP award for his contributions to the community. Patrick is a regular speaker at many SQL Server conferences. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - Both of you make highly engaging contents touching on Power BI technology in your YouTube Channel. That channel has helped us learn concepts we weren't aware of. One thing I enjoyed was family-guy-esque randomness with video clips. Please tell us why you chose that format. SB - There is a video where Adam states the concept behind "Guy in Cube". Do you both share the same philosophy? Would you like to add more to that history? RB - What's the latest with Power BI Report Server? What was the driving factor behind it? It used to reside primarily in SQL platform. SB - Patrick showed us how to enable Cortana to connect with Power BI. That was a super amazing demo. What is the idea behind it? Is it fair to say the use case would be to broadly distribute dashboards so that they won't have to login to a specific site. 
RB - How is security different in Power BI as opposed to SSRS? Can you share a few pointers about RLS, aka "Row Level Security"? SB - Is RLS in Power BI similar to RLS in SSAS? RB - Will the Power BI team prevent the issue that causes RLS settings to reset when data is refreshed? SB - Patrick, there aren't many dashboarding applications that have native features to utilize SQL Server Availability Groups. It makes Power BI stand out. For those of us who don't know or understand the High Availability feature, please tell us what it's about, and how Power BI adds value there. RB - Adam, you recently visited Israel to meet your team of engineers there. I have been to presentations by Arina & Aviv on Excel's integration with Power BI. Have you worked with their team? What kind of exciting challenges and insights have they given you? SB - Please share any interesting stories from one of the recent conferences you both have attended. RB - Are you both going to PASS Summit or any other interesting conferences in the coming months? SB - How do we connect with you on social media? Music: www.freesfx.co.uk
Joe Boutros is the Director of Product Engineering at data.world, overseeing the product development and user research programs. He has spent the last 15 years working on early-stage consumer-facing technology problems as a software engineer, product manager, entrepreneur, and consultant. Joe focuses deeply on data-informed product development: the magic at the intersection of measurement and testing, and hands-on user research. Previously, Joe was the founder of Indeed Labs, the innovation team inside the #1 job site in the world. His team was responsible for envisioning the future of Indeed.com’s product suite via invention and rapid prototyping and was the genesis of their entrance into new product categories. Alex Zelenak is a Product and User Experience Designer, currently helping people collaborate around data at data.world. Alex has spent over a decade designing and building products on behalf of agencies, enterprises, and startups. Whether through a mobile app, analytics portal, or social platform, Alex has a passion for translating ideas into positive outcomes. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - Most data experts have at one point or another been to one of the 2700 open data portals. Data.world has made its own space. Unlike other open data portals, it is a crowd-sourced data collection site aiming to have quality data. Without it, I would not have known facts such as the 18 million open datasets in the world or the existence of 2.4 million websites at Google’s launch in 1998. We saw some interesting data collection efforts when Hurricane Harvey hit. Please tell us more about this portal. SB - The Datasets subreddit is a place where many data-hungry pros request datasets. What are its limitations? And what are the 4 common problems related to researching Open Data that you are trying to cure? RB - Tell us more about your collaborative efforts with around 200 data experts in various disciplines.
SB - Based on our understanding, a data project in Data.World is for when you're ready to share your collected data with the whole world. Why do you make a distinction between a Data Project and a Dataset? RB - How do we use a workspace in Data.World? Is that some sort of integrated development environment focused on data? SB - In a dataset, do you import data or only have a live connection to it? RB - Has anyone raised privacy or other related concerns? How do you make sure someone is not sharing sensitive data that may fall into PHI or another relevant category? SB - How do we connect with you both on Twitter, LinkedIn, or your blog? Music: www.freesfx.co.uk
Dharma Shukla is a Distinguished Engineer at Microsoft. He is the General Manager of Azure Cosmos DB. He is the founder of Azure Cosmos DB, which was launched officially in May 2017. It is "Microsoft’s globally-distributed, multi-model database service for managing data at planet-scale". One interesting fact about him is that he holds more than 60 patents in the technology industry and is a long-distance runner. Interviewer: Rajib Bahar & Shabnam Khan Agenda: SB - Why the name Cosmos? How did Cosmos DB start at Microsoft? Or why did you decide to build Cosmos DB? We heard that it is used extensively within Microsoft; is that true? RB - What is the programming language in which Cosmos DB is written? SB - What makes Cosmos DB special? Can you give us more insights into its capabilities? RB - When we heard your podcast episode on Data Skeptic, there was a mention of "Auto Index". You explained quite well at a high level how it frees developers from worries related to indexing as their applications scale up. Our follow-up question is: how does this auto-indexing work internally? Does Cosmos keep track of the most used data internally in some kind of table/tree structure to determine this? Is this based on an existing algorithm in the Computer Science realm or something proprietary? SB - Can you tell us more about the new capabilities and features your team is working on? RB - We see that Cosmos DB keeps shipping new features every few weeks. Can you tell us how you roll out new features? SB - How do we connect with you on Twitter or other professional networks or blogs? Music: www.freesfx.co.uk
Gregory Piatetsky-Shapiro, PhD, is the President of KDnuggets, a leading site for Analytics, Big Data, Data Science, and Machine Learning. Gregory is a co-founder of KDD (Knowledge Discovery and Data Mining conferences), a top research conference in the field. He is also a co-founder and past chair of ACM SIGKDD, a professional association for Data Mining and Data Science, and a well-known Data Scientist. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - According to Forbes, you're one of the top Big Data Influencers... Your KDnuggets site has well over 60 awards and mentions as a leading publication. I often find myself reading and tweeting your articles from KDnuggets. There are few places to find exciting articles on Data Science, AI, BigData... Please tell us a brief history of KDnuggets... SB - I understand you have transitioned from a researcher role to a high-level editor role. What do you enjoy about it? RB - What are your thoughts on global trends in Machine Learning, AI, & Big Data? SB - Where does automation of Data Science come into play? Is that a helpful process or a distraction from useful analysis? How do you implement it? RB - We all have bias. Does Big Data suffer from any bias, such as implicit bias? SB - Who are these so-called Citizen Data Scientists? Why are they important? How can they serve society at large? RB - How do we connect with you on Twitter & social media?
Additional Reference Materials from Gregory for our listeners.
Data Science Automation: Data Scientists Automated and Unemployed by 2025? http://www.kdnuggets.com/2015/05/data-scientists-automated-2025.html
The Current State of Automated Machine Learning http://www.kdnuggets.com/2017/01/current-state-automated-machine-learning.html
Trends: Machine Learning overtaking Big Data http://www.kdnuggets.com/2017/05/machine-learning-overtaking-big-data.html
Optimism about AI improving society is high, but drops with experience developing AI systems http://www.kdnuggets.com/2017/07/optimism-ai-impact-experience.html
Bias in Big Data http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender https://www.theguardian.com/technology/2017/apr/13/ai-programs-exhibit-racist-and-sexist-biases-research-reveals https://www.forbes.com/sites/mariyayao/2017/05/01/dangers-algorithmic-bias-homogenous-thinking-ai/#42d724d270b3
Mirage of a Citizen Data Scientist: http://www.kdnuggets.com/2016/03/mirage-citizen-data-scientist.html
Citizen Data Scientist Cartoon: http://www.kdnuggets.com/2016/03/cartoon-citizen-data-scientist.html
Overfitting: The Cardinal Sin of Data Mining and Data Science: Overfitting http://www.kdnuggets.com/2014/06/cardinal-sin-data-mining-data-science.html
Music: www.freesfx.co.uk
Jen Underwood, founder of Impact Analytix, LLC, is a recognized analytics industry expert. She has a unique blend of product management, design and over 20 years of “hands-on” development of data warehouses, reporting, visualization and advanced analytics solutions. In addition to keeping a constant pulse on industry trends, she enjoys digging into oceans of data. Jen is honored to be an IBM Analytics Insider, SAS contributor, former Tableau Zen Master, and active analytics community member. In the past, Jen has held worldwide product management roles at Microsoft and served as a technical lead for system implementation firms. She has launched new analytics products and turned around failed projects. Today she provides industry thought leadership, advisory, strategy, and market research. She also writes for InformationWeek, O’Reilly Media and other tech industry publications. Jen has a Bachelor of Business Administration – Marketing, Cum Laude from the University of Wisconsin, Milwaukee and a post-graduate certificate in Computer Science – Data Mining from the University of California, San Diego. Interviewer: Rajib Bahar, Shabnam Khan - WSJ had an article on automation analytics recently. As if we don't have enough terms to keep track of such as descriptive analytics, predictive analytics, prescriptive analytics... What is the deal with automation analytics? Are they calling automatically scheduled jobs automation analytics? Or is this concept completely different? - According to Gartner, “By 2019, natural-language generation will be a standard feature of 90% of modern BI and analytics platforms.” NLG was also cited by Forbes in 2017 as a Top 10 Hot AI technology. What is natural-language generation? How does this subfield of AI differ from Natural language processing or NLP? - Recently, you released a white-paper on "Humanizing Enterprise Application Software with Natural Language". Would you like to share the lessons you have learned? 
- What major forces are currently driving demand for Advanced NLG? - How do Basic & Advanced NLG work? - Are there any benefits of embedding NLG into applications? - Is Quill by Narrative Science the only NLG product in this area? How does it compare to the competition? Please share its pros and cons compared to other similar platforms. - How do we connect with you on Twitter or other professional networking sites? Music: www.freesfx.co.uk
Dan English is a Microsoft Data Platform MVP, author, speaker, community leader, husband, and father. He is the Group Leader for the PASS Business Analytics Virtual Group. He is also a Business Intelligence Architect and Community Leader with a strong passion for Microsoft technologies. He specializes in Business Intelligence, Microsoft BI Toolset, Analysis Services (SSAS), BI Semantic Model (BISM), Datazen, Excel, Integration Services (SSIS), OLAP, PerformancePoint (PPS), Power BI, Power BI Desktop, Power Map, Power Query, Power View, Power Pivot, ProClarity (PAS), Pyramid Analytics (BI Office), Reporting Services (SSRS), Report Builder (RB), SharePoint Server, SQL Server. Interviewer: Rajib Bahar, Shabnam Khan Agenda: RB - I used to work on your team... You would do your part in getting the team involved in the technical community. As I recall, at the time you were involved with the PASS BA Virtual Group. What was the motivation behind its origin? Why is it important? SB - Let's talk about the SQL Saturday program, which provides the tools and knowledge needed for groups and event leaders to organize and host a free day of training for SQL Server professionals... Several years ago you interviewed several experts local to Minnesota's technology community. Those interviews were later broadcast via KFAI radio. What were some of the most interesting things you learned from that experience? RB - You were involved in co-authoring and technical editing of multiple books. Most technologists we talked to found the book writing process to be a painful set of activities... What was your experience like? SB - Microsoft runs the MVP program, which recognizes notable professionals in their area of expertise. You are a Data Platform MVP. What is the MVP program about? Is it based on technical knowledge or on sharing such knowledge? SB - Any thoughts on Gartner's ranking of current Business Intelligence tools, practices, and platforms?
RB - When you're not a technologist, you're a father and often coach your kids' soccer team. Do you bring your work to your coaching experience, your coaching experience to work, or both? SB - How do we connect with you on Twitter, your blog, or LinkedIn? Music: www.freesfx.co.uk
Dave leads the technology strategy for AllSight and works with clients on their initial implementations of AllSight. He has many years of experience working on the architecture and development of mission-critical enterprise applications in both the product and custom solution spaces. Dave came from IBM where he was a Senior Technical Staff Member responsible for leading the InfoSphere MDM portfolio of products and working directly with clients around the world. Interviewer: Rajib Bahar Agenda: - In your previous role at IBM, you were an MDM Data Architect. How has that role related to your Customer Intelligence 360 initiative? - What's Entity Resolution in the Big Data scene? - I'm interested in learning about as many techniques as possible for stitching together and de-duplicating customer data on a Big Data platform. What operational and analytical use cases do they have? - As it relates to Customer Intelligence 360, what are some Machine Learning and Graph DB solutions out there? - How do we connect with you on Twitter / social media? Music: www.freesfx.co.uk
Brad Rubin has been a professor in the Graduate Programs in Software department in the School of Engineering at the University of St. Thomas for the past 13 years. He is also Director of the Center of Excellence for Big Data, and teaches a course in Big Data Architecture, along with courses in Computer Security and Scala. He co-leads the Twin Cities Spark and Hadoop User Group. Previously, he spent most of his industry career at IBM in Rochester, MN. Brad has degrees in Computer and Electrical Engineering from the University of Illinois, Urbana and a doctorate in Computer Science from the University of Wisconsin, Madison. Interviewer: Rajib Bahar Agenda: - You are one of the organizers of the TC Spark & Hadoop User Group. What motivated you to start it? When did it start? We would love to hear about its history. - In your role as Director of the "Center of Excellence for Big Data" in the Graduate Programs at the University of St. Thomas, how do you make sure your program is actually relevant to the industry? In the early 2000s, computer science curricula were generally behind industry trends. Do you find your program having a similar struggle? - What is the difference between your Data Science Certificate & Master of Data Science program? - Is your Big Data curriculum technology agnostic? I am just wondering... Does it employ only open source frameworks, or both commercial and open source ones? - How many Big Data and Data Science experts have graduated from the University of St. Thomas's program? - Now, back to the TC Spark & Hadoop User Group... What has been the most interesting presentation? - How can we get connected with you online or on social media? Music: www.freesfx.co.uk
Constantine Kokkinos currently works as a DevOps DBA with a deep focus on performance tuning T-SQL; nonetheless his first love will always be the *nix ecosystem. His primary side hustle is the dbatools project, where he works on bug fixing, git support, testing, and helping newcomers jump-start their contributions, wherever they need help. Interviewer: Rajib Bahar, Shabnam Khan Agenda: - You're presenting a precon at PASS this year. PASS stands for "Professional Association for SQL Server". It must be really exciting to present there. Is this the first time you're presenting there or have you done it before? Tell us more about the topic you're presenting, the venue, and the culture over there... - PowerShell reminds me of the Unix Bash and Korn shells from my university years. In the early years, open source and commercial platforms were diametrically opposed to each other. However, the current trend appears to be that commercial platforms are adopting more open source projects and vice versa. Is that fair to say? What has been your experience in this regard? - Can you break down the evolution of the various iterations of PowerShell in SQL Server? Which releases brought about the most significant changes? - For those of us who are unaware of the DBA role, a DBA is someone who administers one or more databases, servers, or data centers. Let's say I am a DBA who is afraid of the command prompt or PowerShell. Teach me some basics... - You are involved with a community-driven open source project called dbatools. What can you do with it? How did you get involved? Are any new updates coming out from that team? - Where do you blog or tweet? How can our listeners connect with you? Music: www.freesfx.co.uk
Michael is a Data Solution Architect at Microsoft, where he works on Machine Learning, Big Data and Blockchain applications on the Azure platform. Prior to joining Microsoft, Michael worked at Silver Bay, designing and optimizing geographical and financial statistical analysis solutions (mostly regression analysis and clustering). Before that, he was a database architect and then the lead systems architect of a multi-tenant cloud-based Internet-of-Things application for LogicPD, in Minneapolis. Interviewer: Rajib Bahar, Shabnam Khan Music: www.freesfx.co.uk
Andy Leonard is a Data Philosopher at Enterprise Data & Analytics, a Biml (Business Intelligence Markup Language) developer and BimlHero, SSIS architect, consultant, and trainer. He created and maintains the DILM (Data Integration Lifecycle Management) Suite that includes tools and utilities for managing SSIS in the enterprise. Andy is an avid blogger and has co-authored several books on SSIS, ETL, and database technology. Interviewer: Rajib Bahar Music: www.freesfx.co.uk
Robert is the Director of Business Development for Beyond Impact, a Minneapolis-based consulting company focused on Cloud, Data, and Security. Robert is also the founder of the local Power BI Meetup, which meets monthly. He resides in Eagan, MN with his wife and three young sons. Robert is an avid baseball fan and craft-beer enthusiast. Interviewer: Rajib Bahar Music: www.freesfx.co.uk
David joined AllSight in 2016 to help establish it as the leader in Customer Intelligence Management. He has 18 years of experience in enterprise software and has launched and grown successful product businesses within startups and large software vendors. David led product management at DWL and was instrumental in the formation of the Customer Data Integration (CDI) market, and later the MDM market after acquisition by IBM. Prior to joining AllSight, David led product management and marketing at IBM Big Data & Analytics, where he launched the MDM market, IBM’s big data platform, and its self-service data platform. David has an MBA from Schulich School of Business. Interviewer: Rajib Bahar Music: www.freesfx.co.uk
Pedro Alexander Medina is Founder & Chief Analytics Officer at Haystack, LLC - an Advanced Analytics Agency specializing in custom managed solutions across the data value chain. With deep expertise in information management and advanced analytics, his mission is to help organizations optimize their strategic data assets by converting complex data into intelligence; intelligence into innovation; innovation into success. Interviewer: Rajib Bahar Music: www.freesfx.co.uk
Todd is a Microsoft Technology Center Architect based out of Minneapolis, Minnesota and is primarily focused on data, analytics, and IoT scenarios. Todd’s background ranges from application development and architecture to implementation of data and analytics solutions in the field for many years, as well as work at a Microsoft ISV prior to joining Microsoft. In addition, Todd has authored several books on Microsoft technologies such as .NET and SharePoint. Interviewer: Rajib Bahar Agenda: - Bill Gates said that if he were a fresh college graduate, Artificial Intelligence would have been his top choice for a career. I am really interested in learning about what Microsoft is doing in the Artificial Intelligence world! Please give us the big picture of how Cortana Intelligence Suite, Azure ML Studio, and Cognitive Toolkit tie in. - How does Microsoft Bot Framework work? What kind of tasks can it automate? Any limitations or risks? Can it take my conversation and execute commands on a remote server? - Crowdsourcing is the practice of obtaining information or input into a task or project by enlisting the services of a large number of people, either paid or unpaid, typically via the Internet. We have seen its application in social media for fundraising, team collaboration, and organizing events. With that said, how does the data catalog in Cortana utilize that concept? - Customer churn is a common scenario in any organization, whether for-profit or non-profit. Is it easy to implement a solution in Cortana to discover the root cause? How does that approach differ from building charts in Excel or visualizations in Power BI? I'm curious to learn how you would decide when you need Cortana and when a traditional tool that already exists will do. - Tell us about the books you have written and your Cortana Intelligence Suite course in MS Virtual Academy. - How can we connect with you on Twitter or other professional networks? Music: www.freesfx.co.uk
Ruben is head of data at VSCO, a creative platform and a community of expression. His focus is on designing the data strategy, understanding user behavior, and designing metrics for the entire organization. Prior to VSCO, Ruben was the head of Content Analytics at Udemy. And before that he had a completely different career developing and troubleshooting chip manufacturing technology. Ruben’s interests span statistics, machine learning, economics, and photography. Interviewer: Rajib Bahar, Shabnam Khan Agenda: - In your role, you're in charge of Data Governance at VSCO... Would you like to elaborate on that? - According to Gregory's article in KDNuggets, the top 3 most heavily used Data Science algorithms were Regression, Clustering, and Decision Trees. In your projects, how useful were they? - What are some of your favorite online resources like KDNuggets, where you find good code samples, articles, podcasts, etc.? - Are you applying evolving branches of Artificial Intelligence such as Machine Learning, Deep Learning, or Reinforcement Learning in your organization? - Many people are aware of Instagram... How is VSCO different from that platform? - Now, on to some art-related questions... Are you a portrait, landscape, or travel photographer? Or do you experiment with different techniques? - Please tell us where we can find you on social media. Music: www.freesfx.co.uk
Matthew C. Kelly is the founder of Squarehouse Data Solutions, a business intelligence software and solutions firm based in Baltimore, MD, USA. Matt brings over 20 years of enterprise software development experience, with over 17 dedicated to data warehousing solutions. Prior to starting Squarehouse, Matt spent 15 years as the original architect of a series of packaged data warehousing solutions for Higher Education institutions at a major software firm, overseeing hundreds of data warehouse implementations worldwide. Interviewer: Rajib Bahar, Shabnam Khan Music: www.freesfx.co.uk
Hendrik Feddersen is leader of the HRIS at the European Medicines Agency (which is the European equivalent of the FDA) in London. Six years ago, he led the full SAP HCM project from conception to completion and was the main Change Manager. His current tasks are to introduce further process improvements, problem solving, data cleaning, reporting, preparing predictions and training of colleagues. For more than three years he has been connecting internationally with like-minded HR professionals interested in HR Analytics, attending conferences, studying Data Sciences and collecting and writing articles on HR Analytics. His special interests are Text Analytics, Social Network Analysis and open source software like R. Interviewer: Rajib Bahar Music: www.freesfx.co.uk
Officially, Alison Cossette is a data analyst for the University of Vermont Medical Center. Unofficially, she says proudly, "I'm the resident data nerd." Interviewer: Rajib Bahar, Shabnam Khan Agenda: - On Twitter, your motto is "Numbers are the best story tellers". Why do you say that? - Terms such as Linear Regression and Logistic Regression may sound scary... are they? How did you implement them when you faced a prediction challenge like that? - What do you appreciate about the R and Python languages (if you know only one, then talk about that one only)? - The AI field keeps evolving... how do you define AI, Machine Learning, Deep Learning, and Reinforcement Learning? - How do you build a network in the Data Science community, online or offline? - Any social media presence on Twitter or LinkedIn? Music: www.freesfx.co.uk
Daniel (@dwhitena) is a Ph.D. trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines which include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (ODSC, Spark Summit, Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science/engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects. Interviewer: Rajib Bahar Agenda: - Many of us may or may not be aware of "Jupyter Notebook", which is a web application for writing code in various languages such as R, Python, Julia, Node.js, Go, Ruby, and Scala. That application in turn creates a separate kernel process to receive output from the OS and return it to the web application. One of the coolest things you do is maintain the kernel for GoLang, aka Go. Currently, Data Scientists tend to gravitate toward either R or Python as their language. You're playing with a somewhat more modern language in data science. Why Go? How is it more useful in statistical analysis or data visualization? - How do you achieve reproducibility in data science? - Most of us have heard of Virtual Machine tools such as VMWare, Virtual PC, and Virtual Box. This is the first time I have heard of containers. What are some key benefits of them? Are there websites such as Turnkey hub where you can get some good images of various OS / software / DBMS platforms? - What are some best practices around deploying Data Science models? Do you do something similar to DBAs or Data Engineers and run a job at certain frequencies in the day or hour? - How do you use data pipelines in your project? Is that something used in an ETL-like data-wrangling process? - Please tell us where we can find you on social media. Music: www.freesfx.co.uk
Denny Cherry is the owner and principal consultant for Denny Cherry & Associates Consulting and has over a decade of experience working with platforms such as Microsoft SQL Server, Hyper-V, vSphere and Enterprise Storage solutions. Denny’s areas of technical expertise include system architecture, performance tuning, security, replication and troubleshooting. Denny currently holds several of the Microsoft Certifications related to SQL Server for versions 2000 through 2014, including the Microsoft Certified Master, and has been a Microsoft MVP for several years. Denny has written several books and dozens of technical articles on SQL Server management and how SQL Server integrates with various other technologies. Interviewer: Rajib Bahar Agenda: - What makes SQL Server exciting? Why are you so passionate about data? - What is the application of Hyper-V? - Have you worked with Big Data analytics in the Microsoft SQL Azure cloud environment? - You have been consulting for a while. In your opinion, is it worth the challenges? What difficulties did you face initially? - What has been the most important and highest-impact project you have worked on? - You're a "Microsoft Certified Master". What does it mean and how did you go about achieving it? - What would you recommend to aspiring professionals who want to move into the SQL Server or database world? - Where can people find out about you? Are you on LinkedIn? Facebook? Music: www.freesfx.co.uk
Colin Bartol has led a team that built 46% of the servers for Tricare West, which covers 2.6 million people for the military and required security for NIST, PCI, and HIPAA. He has an MBA from the Carlson School of Management, holds a CISSP, and is a SME for the CompTIA Project+ exam. Having been a consultant at 5 Fortune 100 companies in the e-commerce, financial services, and retail sectors, Colin has wide experience of what happens in information technology. He is currently employed in telecommunications at a major health insurance company. Interviewers: Rajib Bahar, Shabnam Khan Discussion: - Your background is largely in computer security infrastructure... Tell us about your Data Science related experience. We would like to hear about the Affinity project. - What is a Data Lake? How have you utilized it? - I noticed UHG is hiring Big Data experts like crazy... Are you involved in a similar project in your department? - We are interested in learning about Tricare West. - As it relates to computer security, what tips or recommendations do you have on securing data and infrastructure in general? - What is your social media presence? Music: www.freesfx.co.uk