Data science platforms need to adapt to trends in enterprise IT
From simple technologies like email to sophisticated ones like ERP platforms, the transformation in enterprise IT is undeniable. Several trends within enterprise IT directly affect data scientists. Because these changes are fairly consistent across industries, data scientists should closely examine whether their data science platform of choice can keep up.
Here are some of the key IT trends to watch out for:
- Adoption of Big Data Technologies: Most companies we have spoken to have either adopted or plan to adopt Big Data technologies. More specifically, they use either Hortonworks or Cloudera to store their data. As a result, the data science platform now needs to access data stored in HDFS and Hive.
- Use of new data sources for decision making: Enterprise applications, both on-premises and in the cloud, hold data vital for building the models that inform business strategy. Some banks are leveraging social data (Facebook, LinkedIn, etc.) along with demographic information to evaluate credit risk for applicants. Your data science platform needs to be able to connect to and integrate with these data sources.
- Adoption of Open Standards and Libraries: Along with Python and R, organizations are adopting open source machine learning libraries like SparkML and TensorFlow. The data science platform should provide tight integration with these open libraries in addition to support for Python and R.
- Integrations with enterprise applications and tools: With the mainstream adoption of advanced analytics, data science platforms need to integrate with enterprise applications in real time. For example, organizations now embed models within e-commerce websites, enterprise applications, customer support portals, Salesforce, etc.
- IT Centralization: Many of the customers we have spoken to have already started moving towards centralized IT in order to reduce hardware and storage overheads. However, this means that data is shared between departments, with no ability to subset or copy it. Centralization improves IT efficiency and lowers cost, but it requires the data science platform to support a multi-tenant, shared environment.
- Security and Governance: With IT centralization, governance becomes more important for both the models and the data.
Angoss’ data science platform has started the journey to enable data scientists to address these trends. Read more about the Angoss Big Data Platform here.