Big Data Hadoop Administrator - LT Contract
Provide technical leadership for development and support of enterprise big data applications within the Hadoop ecosystem.
Work with the infrastructure team to migrate the existing Hadoop platform from Cloudera Data Platform.
• Responsible for implementation and ongoing administration of Hadoop infrastructure
• Ensure data security, data quality, and governance of data within the Hadoop Ecosystem
• Responsible for cluster security, data governance, maintenance, and creation and removal of nodes using tools like Apache Ranger, Apache Atlas, Hortonworks Data Platform, and Cloudera Manager
• Develop and maintain best practices for developing against Hadoop clusters
• Work with stakeholder teams to set up new Hadoop and Linux users and configure access for them
• Collaborate with administrators and application teams to ensure that business applications are highly available and performing within agreed-upon service levels
• 2+ years of working experience as a Hadoop Administrator and with Cloudera Data Platform
• Manage Production/Development databases in areas like Capacity Planning, Performance Monitoring & Tuning, Backup/Recovery Techniques, and Space/User/Security Management
• Performance tuning of Hadoop clusters and Hadoop MapReduce routines
• Work with systems engineering team to deploy new hardware and software environments for Hadoop and expand existing environments.
• Proficiency with programming and query languages such as Java, Python, and SQL
• Manage and review Hadoop log files; manage and monitor the file system
• Team with infrastructure teams to guarantee high data quality and availability
• Monitor Hadoop cluster job performance, connectivity, and security, and perform capacity planning
• Work with application teams to install operating system and Hadoop updates, patches, upgrades
• Good knowledge of Agile/Scrum
• General expertise, including good troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networks.
• Ability to deploy a Hadoop cluster, add and remove nodes, keep track of jobs, monitor critical parts of the cluster, configure NameNode high availability, and schedule, configure, and take backups.
• Good knowledge of Linux, since Hadoop runs on Linux.
• Good interpersonal and communication skills
• Bachelor's Degree in Computer Science, Information Systems, or related field
Additional nice-to-have skills
• Knowledge of troubleshooting Core Java applications
• Databricks, AWS analytics tools, GCP analytics tools
• Apache Impala, Ansible, Docker, Kubernetes, Apache Airflow, Terraform, Bash
• Good understanding of Linux administration