Sunday, 11 February 2018

The different data science roles in the industry

The Data Scientist
Data scientist is probably one of the hottest job titles that you can put on your business card, and the closer you get to Silicon Valley, the more valuable this role becomes. A data scientist is as rare as a unicorn and gets to work every day with the mindset of a curious data wizard. He/she masters a whole range of skills and talents, from handling the raw data and analyzing it with statistical techniques to sharing his/her insights with peers in a compelling way. No wonder these profiles are highly wanted by companies like Google and Microsoft.
The Data Analyst
The data analyst is the Sherlock Holmes of the data science team. Languages like R, Python, SQL and C are elementary to him/her. As with the data scientist, the skills and talents needed for this role are diverse and span the entire data science process, combined with a healthy “figure-it-out” attitude. Data analysts are wanted by companies like HP and IBM (where they can be teamed up with Watson).
The Data Architect
With the rise of big data, the importance of the data architect’s job is rapidly increasing. The person in this role creates the blueprints for data management systems to integrate, centralize, protect and maintain the data sources. The data architect masters technologies like Hive, Pig and Spark, and needs to stay on top of every innovation in the industry.
The Data Engineer
The data engineer often has a background in software engineering and loves to play around with databases and large-scale processing systems. Thanks to these interests, he/she can easily master new technologies and is therefore familiar with a diverse set of languages that span both statistical programming languages and languages oriented more towards web development. Your data engineer is your jack of all trades.
The Statistician
Ah, the statistician! The historical leader of data and its insights. Although often forgotten or replaced by fancier-sounding job titles, the statistician represents what the data science field stands for: getting useful insights from data. With his/her strong background in statistical theories and methodologies, and a logical, stats-oriented mindset, he/she harvests the data and turns it into information and knowledge. Statisticians can handle all sorts of data. What’s more, thanks to their quantitative background, modern statisticians are often able to quickly master new technologies and use these to boost their intellectual capacities. A statistician brings the mathemagic to the table, and his/her insights are able to radically transform businesses.
The Database Administrator
People often say that data is the new gold. This means you need someone who exploits that valuable mine. Enter the Database Administrator. Your DBA makes sure that the database is available to all relevant users, is performing properly and is being kept safe. Thinking about how to prevent disasters comes naturally to him/her. A DBA makes sure that all backup and recovery systems are in place and that security is taken care of, and keeps track of the different technologies that are being used and how to support them.
The Business Analyst
The business analyst is often a bit different from the rest of the team. While often less technically oriented, the business analyst makes up for it with his/her deep knowledge of the different business processes. He/she masters the skill of linking data insights to actionable business insights and is able to use storytelling techniques to spread the message across the entire organization. He/she often acts as the intermediary between the business people and the techies. Companies looking for business analysts are diverse and active in very different industries. Some examples are Uber, Dell and Oracle.
Data and Analytics Manager
The cheerleader of the team. A data and analytics manager steers the direction of the data science team and makes sure the right priorities are set. This person combines strong technical skills in a diverse set of technologies (SQL, R, SAS, …) with the social skills required to manage a team. It’s a hard job, but if you feel that it’s a fit for you, make sure to have a look at the job offerings at Coursera, Slack, or Motorola.
The Salary
To end, we had a quick look at the average salaries displayed for each role. Note that these salaries can differ widely based on location, industry, etc. In general, it looks like a job as a data and analytics manager or a data scientist will give you the highest paycheck. This was to be expected, given the latter’s unicorn status and the former’s team lead responsibility.
In the middle we find the roles of data engineer and data architect. People with a software engineering background or a combination of software engineering and data analytics skills often fill these roles. The recruiting competition with jobs that are more oriented towards software development probably drives up their national average salaries.
The remaining jobs are all close together in terms of remuneration. However, it is not unlikely that their wages will go up very soon, since demand for these roles is currently increasing.
Closing remarks
While the above is just an interpretation of what we observed while looking at different data science job postings, we do shed some light on the different data science jobs that are available in today’s market. While looking for your data science dream job, however, take into account that roles can differ: that is why you should make sure to ask about details on what projects and technologies you’ll be working with to make sure there is a fit for you and your skills. In the meantime, don’t forget to keep your data science skills up to date.

Sunday, 28 January 2018

5 Must-Read Books for the Budding Data Scientist

Getting started in the exciting field of data science can be a bit overwhelming. There are so many new tools, groundbreaking applications and innovative ways to explore data that even experts in the field don’t have it all figured out. But for budding data scientists, understanding this complex field may be just a few pages away. These highly acclaimed books explain the basics of big data and beyond with predictive analysis, illuminating information, applications and even potential threats, offering a comprehensive introduction to the field. Consider this an essential reading list for the aspiring data scientist.


Big Data: A Revolution That Will Transform How We Live, Work, and Think

This book provides a highly detailed introduction to the emerging science of big data, while also uncovering some of the most pressing issues related to both its current and future applications. Exploring big data in business, health, politics and more, you’ll learn all about how big data is transforming the way we process the information around us.




Automate This: How Algorithms Came to Rule Our World

In Automate This, Christopher Steiner explains how algorithms are increasingly being used to tackle high-level tasks that were once achieved only by humans with advanced training, including medical diagnosis and foreign policy analysis.





Doing Data Science: Straight Talk from the Frontline

Doing Data Science is an ideal read for budding data scientists who are just getting started in the field. Based on Columbia University’s Introduction to Data Science class, this book will teach you to see through the popular hype around “big data,” and it will give you the knowledge and insights you need to hit the ground running in this fast-growing field. Study the book’s chapters for lectures from leading data scientists from Google, Microsoft and eBay as they share case studies and code for analysis, algorithms, modeling, visualization and more.

Privacy in the Age of Big Data: Recognizing Threats, Defending Your Rights, and Protecting Your Family

Big data can predict what you’ll buy at the grocery store, the spread of disease and even when you’ll die. The power to tell the future with seemingly incomprehensible quantities of data is truly astounding; but, is it all just a little too personal? In a world where practically every move can increasingly be predicted, it’s difficult to maintain a sense of privacy. 




Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die

Predictive Analytics offers tangible and easy-to-understand insights into the complex world of data analysis. Read this book to find out how institutions are increasingly predicting human behavior: whether you’re going to click, buy, lie, or die, as the title suggests. Predictive Analytics also shares the “why” and the “how” of behavior prediction, highlighting the many ways in which predictive analysis is able to improve healthcare, fight crime and boost sales, all through the careful analysis of big data.

Monday, 22 January 2018

5 Super exciting Data Science / Machine Learning / Artificial Intelligence based startups in India

Introduction

Data technologies have been around for some time now. But the increase in data generation and the availability of servers in the cloud have enabled an entire generation of startups working on ideas that were unthinkable a few years back. The change in landscape is aptly summarised by the quote below.

List of Companies

1. Edge Networks

Incorporated in 2012 by Arjun Pratap, Edge Networks aims to change the way the HR industry works. With an ever-increasing number of job seekers, the process of finding the right match for a particular job profile has become extremely cumbersome. With Data Science and Artificial Intelligence at its core, Edge Networks has developed its product HIREalchemy to match people with the right job. The solution facilitates talent acquisition, internal workforce optimization and talent analytics. Edge Networks was featured in Nasscom’s Emerge 50 2016 list.

2. Fluid AI

What if I told you that there is a company working to convert any screen into a gesture-controlled, AI-powered assistant? Whether in malls or banks, these screens would be able to address you when you approach a product placed next to them, just as a human staff member does. Fluid AI is one such company, and it is on the verge of a revolution in personalisation in finance, government, web and marketing. Founded in 2009 by two brothers, Abhinav Aggarwal and Raghav Aggarwal, Fluid AI is leading the virtual customer assistance market. It aims to cater to various sectors by mimicking human interaction with the customer and helping companies reduce operational costs. Fluid AI serves clients like Vodafone, Toyota, Deloitte, Emirates NBD, Barclays, Rolls Royce, Accenture and Axis Bank. This is one company that you must keep an eye on.

3. Flutura

Every now and then we see new analytics startups trying to generate insights from structured and unstructured data. But Flutura, founded by Derick Jose, Srikanth Muralidhara and Krishnan Raman, is different. Flutura believes in actions, not insights. It works on an M2M (machine-to-machine) model via its product Cerebra, which collects thousands of data points from many different machines. It then turns these data points into actionable strategies such as pre-scheduling repairs for machines and ordering spare parts. This model increases the life of machines, reduces the cost of operational losses and increases efficiency. Flutura has been recognized as one of the Top 20 Most Promising Big Data Companies globally by the California-based tech magazine CIO Review and was also recognized by TechSparks2013 as one of the Top 3 startups out of India.

4. Heckyl

Trading is an uncertain world and a prime example of the butterfly effect. Any small incident in some part of the globe can result in huge gains or losses in the trading industry. What if there was a way to keep track of all the news, people’s emotions, trending sentiments, etc. in a single place to optimize your trading strategy?
Founded in 2010 by four former Merrill Lynch executives, Abhijit Vedak, Jaison Mathews, Mukund Mudras and Som Sagar, Heckyl is revolutionizing the trading industry for brokerage firms, short-term traders, investors and fund managers. Heckyl does this through its integrated trading terminal, which also provides visuals and heat maps of sentiments and market data to help traders find the right trading opportunities.

5. Niki.ai

Founded in 2015 by Sachin Jaiswal, Nitin Babel, Shishir Modi and Keshav Prawasi, Niki.ai aims to be the one-stop solution for a customer’s order.
The startup leverages natural language processing and machine learning technologies to converse with customers over a simple chat interface, and places their orders with their partner businesses within seconds.

End Notes

AI and Machine Learning have not been around for long in India. Yet various exciting startups have been incorporated that are pushing the boundaries of technology and human comfort while solving real-world problems. Apart from the 5 startups above, there are many more startups in the analytics industry that are waiting to leave a mark. If you are aware of a company that is pushing the boundaries of AI, share it with me in the comments below.
I hope you enjoyed reading this article as much as I enjoyed writing it. I would like to know your thoughts on the above startups, so share your opinion in the comments below.

5 best data scientists and why you should follow them on Twitter

No matter what industry you work in, chances are the applications your business uses are generating vast quantities of data that are increasing exponentially. The confluence of technologies that generate enormous amounts of data, store it, and analyze it in a timely manner affords businesses an unprecedented opportunity to put that data to work. Big data, when processed with predictive analytics algorithms, lets you find new patterns in data, and make increasingly accurate predictions about future business trends and opportunities. It's also playing a role in the software development life cycle, according to the 2015 World Quality Report survey.
The role of predictive analytics has become such a disruptive force in business that the Harvard Business Review recently suggested that data scientists might have the sexiest jobs of the 21st century.
Who are the people leading the charge? To help you keep up with the current trends and opportunities in data science and predictive analytics, we pulled together this list of influential data scientists worth following. Here, in alphabetical order, are 5 data scientists you should be following.

1. Dean Abbott

Dean Abbott is co-founder and chief data scientist at SmarterHQ, and founder and president of Abbott Analytics. He is a co-author of the IBM SPSS Modeler Cookbook, and the author of Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst. Follow his blog at http://abbottanalytics.blogspot.com.

2. Kenneth Cukier

Kenneth Cukier is the data editor for The Economist. He's a co-author of the book Big Data: A Revolution That Will Transform How We Live, Work, and Think, and is a popular speaker. Watch him give a fascinating TED talk, "Big data is better data."

3. John Elder

John Elder is the founder of data mining consultancy Elder Research, Inc. He is a frequent keynote speaker, and served for five years on a presidential panel tasked with guiding technology for national security. He is the author of several books, including the Handbook of Statistical Analysis and Data Mining Applications, Ensemble Methods in Data Mining, and Practical Text Mining, and is an adjunct professor at the University of Virginia, where he teaches the optimization of data mining. You can watch many of his presentations on YouTube.

4. Bernard Marr

Bernard Marr, founder and CEO of the Advanced Performance Institute, regularly advises businesses and government organizations on how to gain better insights from their data. He is a contributor to the World Economic Forum, and is recognized by LinkedIn as one of the world's top 50 business influencers. He is a sought-after keynote speaker, and is the author of many articles and books, including Big Data: Using SMART Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance.

5. Hilary Mason

Hilary Mason, founder of Fast Forward Labs, has also served as chief scientist at Bitly, Inc., co-founded HackNY, and is a member of NYCResistor. Mason was cited by Fortune in its 40 Under 40 list, and is a popular influencer on LinkedIn, where she has a large following. She enjoys speaking, and you can find many of her presentations on YouTube.


10 Ultimate Data Science Projects To Boost Your Knowledge and Skills

Introduction

Data science projects offer you a promising way to kick-start your analytics career. Not only do you get to learn data science by applying it, you also get projects to showcase on your CV. Nowadays, recruiters evaluate a candidate’s potential by his/her work rather than by certificates and resumes. It doesn’t matter how much you tell them you know if you have nothing to show them! That’s where most people struggle and miss out!

Useful Information

To help you decide where to start, I’ve divided the data sets into 3 levels, namely:

1. Beginner Level: This level comprises data sets which are fairly easy to work with and don’t require complex data science techniques. You can solve them using basic regression / classification algorithms.

2. Intermediate Level: This level comprises data sets which are challenging. It consists of mid-sized and large data sets which require some serious pattern recognition skills.

3. Advanced Level: This level is best suited for people who understand advanced topics like neural networks, deep learning, recommender systems, etc.
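
To give a feel for what a beginner-level classification problem looks like in practice, here is a minimal sketch in plain Python. Treat it as an illustration only: the six rows are typed in by hand in the format of the Iris data set, the function names fit_centroids and predict are made up for this example, and nearest-centroid is just one simple classifier, not necessarily the one you would choose for the full problem.

```python
from math import dist
from statistics import mean

# A few hand-typed rows in the format of the Iris data set (an assumption
# for illustration, not the full data set):
# (sepal length, sepal width, petal length, petal width) in cm, plus the class.
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def fit_centroids(rows):
    """Average the feature vectors of each class (nearest-centroid training)."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def predict(centroids, features):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAIN)
# Classify one new, hypothetical flower measurement.
print(predict(centroids, (5.0, 3.4, 1.5, 0.2)))
```

On the real data set you would load all the rows, hold some back for testing, and compare a few basic algorithms before settling on one.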


Table of Contents

Beginner Level

  • Iris Data. Problem: Predict the flower class based on available attributes.
  • Titanic Data. Problem: Predict the survival of passengers on the Titanic.
  • Loan Prediction Data. Problem: Predict if a loan will get approved or not.

Intermediate Level

  • Human Activity Recognition Data. Problem: Predict the activity category of a human.
  • Black Friday Data. Problem: Predict purchase amount.
  • Siam Competition Data1. Problem: Classify the documents according to their labels.
  • Trip History Data. Problem: Predict the class of user.

Advanced Level

  • Identify your Digits. Problem: Identify digits from an image.
  • Yelp Data. Problem: Find insights from images.
  • ImageNet Data. Problem: The problem to solve depends on the image type you download.
