Job Responsibilities:

  • Identifying key problems in the system and proposing solutions

  • Data mining using state-of-the-art methods

  • Identifying third-party data that can be used to enhance the information gained

  • Processing, cleansing, and verifying the integrity of data used for analysis

Mandatory Skills:

• Strong Python coding skills

• Experience in advanced SQL

• Experience with AWS cloud services such as Redshift, S3, EC2, RDS, and VPC

• Experience with GCP cloud services such as Compute Engine, App Engine, Cloud SQL, and Cloud Storage

• Hands-on experience with the Talend ETL tool

• Strong hands-on experience in Linux and Windows environments for troubleshooting existing pipeline issues

• Experience with AWS EMR clusters or Google Dataproc clusters (an advantage)

Other Skills & Qualifications:

  • Proficiency in SQL is a must

  • Experience with, and knowledge of, Big Data architectures

  • Experience designing and constructing highly scalable data management systems

  • Experience with common data science toolkits, such as R, Python, and Java

  • Hands-on experience building real-time or batch-based data pipelines

  • Experience in building software components and analytics applications

  • Experience with NoSQL databases, such as MongoDB, Cassandra, HBase, or Redis (an added bonus)

  • Experience in data mining for business insights

  • 4+ years of experience in Data Engineering and Data Warehousing

  • Experience with Big Data, relational data, and unstructured databases

  • Data-oriented mindset with the ability to logically break down data problems and find solutions

  • Highly curious self-starter who can work with minimal supervision and guidance

  • Bachelor's or Master's degree in a computing-related field

  • Great communication skills