About the Job:
The selected candidate's day-to-day responsibilities include:
- Build scalable, available, and supportable processes to collect, manipulate, present, and analyze large datasets in a production environment, meeting functional and non-functional business requirements.
- Articulate problem definitions and work on all aspects of data, including acquisition, exploration/visualization, feature engineering, experimentation with machine learning algorithms, and model deployment.
- Mine and analyze data from company databases to drive optimization and improvement of product development, marketing techniques, and business strategies.
- Develop working prototypes of algorithms, and evaluate and compare metrics based on real-world data.
- Create and maintain data pipeline architecture for optimal extraction, transformation, and loading of data from a variety of sources using SQL and AWS big data technologies.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Provide design input specifications, requirements, and guidance to software engineers for algorithm implementation in solution/product development.
- Mentor, collaborate with, and guide a team of talented software engineers to implement methodologies from statistics, machine learning, and computational science.
- Improve and advance the company's products and technologies through methodical research.
- Use large proprietary datasets to construct training sets and conduct correlation analyses.
- Apply the latest available techniques to datasets for yield and operational improvement.
- Work with stakeholders, including the Product, Process, Design, Quality, and Reliability teams, to assist with data-related technical issues and identify opportunities for leveraging company data to drive business outcomes.
- Predict outcomes based on rigorous experimental design and statistical methods.
- Create repeatable solutions through written project documentation, process flowcharts, layouts, diagrams, charts, code comments, and clear code.
Requirements:
- Good knowledge of computer science, math, and statistics fundamentals (algorithms and data structures, meshing, sampling theory, linear algebra, etc.).
- Strong programming experience in a modern object-oriented language.
- Experience performing root cause analysis on internal and external data processes to answer specific
business questions and identify opportunities for improvement.
- Experience with data warehousing technologies, including data modeling, ETL, and reporting, as well as end-to-end data warehouse and ODS implementations.
- Familiarity with Linux/Unix environments.
- Knowledge of Big Data and advanced predictive analytics, including exposure to areas such as Artificial Intelligence, Deep Learning, intelligent bots, Reinforcement Learning, and Neural Networks.
- Statistical modeling (Least Squares and Logistic Regression, GLMs, segmentation, clustering, Dynamic Bayesian Networks, etc.).
- Experience with big data tools: Hadoop, Hive, HBase, Apache Spark, Kafka, Pig, Oozie, etc.
- Experience with data pipeline and cloud implementation platforms: StreamSets, AWS, Cloudera.
- Working knowledge of distributed data/computing tools: MapReduce, Spark, MySQL, etc.
- Knowledge of data science technologies such as R, Octave, Python, pandas, Spark MLlib, PySpark, SparkR, SciPy, NumPy, Matplotlib, etc.
- Experience with data science visualization tools (R Shiny, Tableau, QlikView, etc.). Experience with graph databases is a plus.
- Advanced working knowledge of relational (SQL) and non-relational databases (Hadoop, MongoDB, Apache Spark, Cassandra, etc.).
Education & Experience:
- Bachelor's or Master's degree from a recognized institute in Computer Science, Mathematics, Statistics, Engineering, Bioinformatics, or Physics.
- Minimum 3 years of relevant work experience with a proven track record of designing, developing, and deploying advanced analytics solutions in a commercial setting.
Salary: INR 4,00,000 - 8,00,000 P.A.
Key Skills:
- Computer Vision, Industrial Automation, Internet of Things, Machine Learning, Rapid Prototyping, and Simulation.
- Image Processing Algorithms, Deep Learning (TensorFlow, Theano, Keras, and Caffe), MATLAB (code generation and prototyping), GPU processing, Simulink, OpenCV, IPP, Neural Networks, Machine Learning, Java, C, C++, C#, Python, ESP8266, Arduino, Raspberry Pi, Visual Studio, Eclipse, socket programming.