1. Responsible for research and development of the machine learning platform, including requirements gathering, feature design, and development.
2. Optimize and adjust the infrastructure service architecture for high availability and high scalability.
1. Proficient in Java; familiar with shell, Python, and other scripting languages;
2. Proficient in mainstream Java frameworks, including Spring, Netty, Hibernate, and MyBatis;
3. Capable of system architecture design, and familiar with distributed fault tolerance, distributed caching, high concurrency, and other mainstream technologies;
4. Experience in B2B (enterprise-facing) delivery is preferred.
1. Algorithm modeling and hyperparameter tuning;
2. Deep learning compiler optimization;
3. Optimizing resource reuse in large-scale GPU clusters.
Job requirements (internships accepted):
(1) Model optimization direction
1. Be familiar with deep learning theory and have deep learning model training experience;
2. Follow, and stay curious about, the latest model developments in industry and academia, and understand the basic structure and principles of the latest models;
3. Good engineering skills; proficient in Python and familiar with the TensorFlow / PyTorch frameworks;
The following are a plus:
1. Be familiar with the frameworks and latest research progress in model compression for inference and training, including but not limited to quantization, pruning, tensor decomposition, and knowledge distillation (KD), with a clear understanding of the principles behind each method;
2. Be familiar with the basic principles of the optimizers used in model training, and understand the basic methods and frameworks of distributed training; have some knowledge of the latest training acceleration methods, such as mixed-precision training, low-bit training, and distributed gradient compression;
3. Have some understanding of AutoML, and be familiar with reinforcement learning or Bayesian optimization;
Have some understanding of neural architecture search, automatic compression, and related fields; hands-on experience is ideal.
(2) System optimization direction
Have solid C++ development experience or be proficient in Python; have good programming habits; be familiar with multithreaded programming, memory management, design patterns, and the Linux / Unix development environment; have a clear understanding of C++ language features; and have good intuition for, and experience with, developing highly concurrent programs.
The following are a plus:
1. Experience with distributed systems projects; able to design and debug complex system software; familiar with the trade-offs among common design patterns and architectures;
2. Proficient in big data computing frameworks, familiar with distributed computing frameworks such as Hadoop / Hive / MPI / Spark / TensorFlow, and understand common deep learning framework engines;
3. Experience in machine learning, deep learning, large-scale distributed machine learning, search, advertising, recommendation, machine translation, or related fields is preferred;
4. Familiar with GPU hardware architecture, CUDA, and cuDNN; extensive experience optimizing deep learning computing frameworks is preferred;
5. Compiler development experience is preferred, as is familiarity with XLA / TVM / MLIR.