A challenging opportunity awaits a Big Data Engineer who wants to handle data ranging from 100 GB to petabyte scale on an AI platform that helps its clients outpredict their competitors and improve overall customer satisfaction and efficiency.
This role is based in Sandton and pays between R750 000 and R850 000 per annum.
Having made great strides in the field of AI, this South African-based company has created a revolutionary AI platform that delivers industry-leading automated consumer behaviour predictions for companies from their raw, unstructured data. Using deep learning algorithms, the platform has given clients better business insights, improved customer satisfaction and greater overall efficiency by matching people with products, inventory with business opportunities, and spending with propensity.
With $2 million in venture backing, the platform has exceeded expectations, and the company is expanding its team to meet the high demand for the platform and produce even better results, which usually takes only two weeks.
As part of a team of extraordinary engineers, you will be expected to deliver automated consumer behaviour prediction platforms that produce results many times better than traditional statistical methods. The company automates and commoditizes cutting-edge AI results derived directly from client data, so you will need to help the platform handle data at the massive scale required.
- Selecting and integrating the Big Data tools and frameworks required to provide the needed capabilities.
- Implementing ETL processes.
- As part of the ETL work, analysing and understanding the data well enough to integrate it into the API.
- Proposing, designing and implementing Big Data architecture, including infrastructure.
- Monitoring performance and advising on any necessary infrastructure changes.
- Defining data retention policies.
- 3 years’ experience with Hadoop v2 and MapReduce.
- Proficiency in managing Hadoop clusters and accompanying services, including Hive, Spark, Kafka, Sqoop and Oozie.
- Proficiency with Presto.
- 3 years’ experience with NoSQL databases; Cassandra preferred.
- Experience building stream-processing systems using solutions such as Storm or Spark Streaming.
- Experience with integration of data from multiple data sources.
- Knowledge of various ETL techniques.
- 3 years’ experience building Lambda Architectures, along with knowledge of their advantages and drawbacks.
- Experience with Cloudera, MapR or Hortonworks distributions.
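For context on the Lambda Architecture requirement above: it combines a batch layer (precomputed views over all historical data) with a speed layer (incremental views over recent data), merged at query time by a serving layer. A minimal toy sketch in plain Python, using hypothetical event counts rather than anything from the actual platform:

```python
from collections import Counter

# Batch layer: counts precomputed over all data up to the last batch run.
batch_view = Counter({"product_a": 100, "product_b": 40})

# Speed layer: counts for events that arrived after the last batch run.
speed_view = Counter({"product_a": 3, "product_c": 1})

def query(product: str) -> int:
    """Serving layer: merge the batch and real-time views at query time."""
    return batch_view[product] + speed_view[product]

print(query("product_a"))  # 103
print(query("product_c"))  # 1
```

The well-known trade-off the advert alludes to: the batch layer gives accurate, recomputable results, while the speed layer adds low-latency freshness at the cost of maintaining the same logic in two places.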